## QUESTION 1

Which example demonstrates text summarization, and how could you adapt it for summarizing long academic papers in your field?
- The text summarization example shows how an AI condenses long text into key ideas. For academic papers in Data Science, it can summarize research findings, methods, and results concisely to help students grasp the core content quickly. Summarization is a fundamental NLP task. Adjusting the prompt to include “summarize academic research for Bachelor’s level” steers the model toward clarity and away from complex jargon. The temperature can be kept low (e.g., 0.4–0.6) to favor factual accuracy. Example: a student at Uganda Christian University could supply the text of a long paper on machine learning and crop yield prediction and receive a simplified, student-friendly summary.
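A summary can only be as faithful as the text the model is given, so the paper itself needs to be part of the prompt. The following is a minimal sketch of a prompt builder that embeds the paper text and states the audience. The function name `build_summary_prompt`, its parameters, and the `max_chars` cut-off are hypothetical choices for illustration; simple truncation stands in for a proper chunking strategy.

```python
def build_summary_prompt(paper_text,
                         audience="Bachelor's level Data Science student",
                         focus=("methodology", "findings", "conclusion"),
                         max_chars=20000):
    """Build a summarization prompt that embeds the paper text itself.

    All names and defaults here are illustrative assumptions.
    """
    text = paper_text.strip()
    if not text:
        raise ValueError("paper_text is empty; there is nothing to summarize")

    # Naive length guard: keep only the first max_chars characters.
    truncated = len(text) > max_chars
    if truncated:
        text = text[:max_chars]
    note = "(The paper was truncated to fit the prompt.)" if truncated else ""

    return (
        f"Summarize the following academic paper for a {audience}.\n"
        f"Focus on {', '.join(focus)}.\n"
        "Use only information that appears in the text below.\n"
        f"{note}\n"
        "--- PAPER START ---\n"
        f"{text}\n"
        "--- PAPER END ---"
    )
```

The returned string can be passed to `model.generate_content(...)` in place of a title-only prompt.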

In [32]:
import warnings
warnings.filterwarnings('ignore')

import google.generativeai as genai
import os
from tqdm.autonotebook import tqdm as notebook_tqdm

GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
genai.configure(api_key=GOOGLE_API_KEY)

model = genai.GenerativeModel("gemini-2.0-flash")

prompt = """
Summarize the following academic paper for a Bachelor's level Data Science student:
Title: Machine Learning and Crop Yield Prediction in Uganda.
Focus on methodology, findings, and conclusion.
"""

try:
    response = model.generate_content(prompt)
    print("Summary:\n", response.text.strip())
except Exception as e:
    print("Error:", e)


Summary:
 Okay, here's a summary of a hypothetical paper titled "Machine Learning and Crop Yield Prediction in Uganda," geared towards a Bachelor's level Data Science student, focusing on the requested aspects.  Since I don't have the *actual* paper, I'll create a plausible and common structure and content based on similar research in the field.

**Paper Title:** Machine Learning and Crop Yield Prediction in Uganda

**Summary for a Bachelor's Level Data Science Student:**

This paper investigates the application of machine learning (ML) techniques to predict crop yields in Uganda, aiming to improve agricultural planning and food security.  The study addresses the challenges of traditional yield estimation methods, which are often inaccurate and resource-intensive.

**Methodology:**

*   **Data Sources:** The researchers likely used a combination of datasets, including:
    *   **Historical Yield Data:**  Collected from agricultural surveys, government records (e.g., Ministry of Agricul

## QUESTION 2

In the code generation example, what risks do you see in relying on AI to write code? How would you mitigate them?

- AI-generated code may contain logical errors, security vulnerabilities, or inefficient structures. Relying solely on AI can also limit human understanding of the underlying logic. AI models lack contextual judgment and may produce syntactically correct but semantically wrong code. To mitigate these risks, always review, test, and debug AI-generated scripts manually, and use unit tests for verification. Example: in Uganda, a Data Science student using AI to automate rainfall data processing should test the outputs before integrating them into a national database.
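The unit-test mitigation can be sketched as follows: a small averaging function is checked against a hand-computed answer on a tiny sample before it touches real data. The function name, the `District` and `Rainfall` column names, and the sample rows are assumptions made up for this illustration.

```python
import csv
import io
from collections import defaultdict


def average_rainfall_by_district(rows):
    """Return {district: mean rainfall}, skipping rows with a missing
    district or a rainfall value that cannot be parsed as a number."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        district = (row.get("District") or "").strip()
        try:
            rainfall = float(row.get("Rainfall", ""))
        except (TypeError, ValueError):
            continue  # non-numeric or missing rainfall
        if not district:
            continue  # no district to attribute the value to
        totals[district] += rainfall
        counts[district] += 1
    return {d: totals[d] / counts[d] for d in totals}


# Made-up sample with two bad rows that should be ignored.
sample = io.StringIO(
    "District,Rainfall\n"
    "Kampala,100\n"
    "Kampala,140\n"
    "Gulu,80\n"
    "Gulu,not recorded\n"
    ",55\n"
)
result = average_rainfall_by_district(csv.DictReader(sample))
assert result == {"Kampala": 120.0, "Gulu": 80.0}
```

If the assertion fails, the generated function is wrong for at least one known case and should not be integrated.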

In [33]:
prompt = """
Generate a Python function that reads a CSV file of rainfall data
and calculates the average rainfall per district in Uganda.
Ensure code readability and data validation.
"""

try:
    response = model.generate_content(prompt)
    print("Generated Code:\n", response.text.strip())
except Exception as e:
    print("Error:", e)


Generated Code:
 ```python
import csv

def calculate_average_rainfall(csv_file_path):
    """
    Calculates the average rainfall per district in Uganda from a CSV file.

    Args:
        csv_file_path (str): The path to the CSV file containing rainfall data.
            The CSV file should have at least two columns:
            - 'District': The name of the district.
            - 'Rainfall': The rainfall amount (numeric).
            Other columns are ignored.

    Returns:
        dict: A dictionary where keys are district names and values are their average rainfall.
              Returns an empty dictionary if the input file is invalid or data is missing.
              Returns None if there's a file reading error.

    Raises:
        FileNotFoundError: If the specified CSV file does not exist.
        ValueError: If rainfall data is not numeric.
    """

    try:
        district_rainfall = {}
        district_counts = {}

        with open(csv_file_path, 'r', newline='') as csvf

## QUESTION 3

The classification example shows labeling inputs. How would you adjust the prompt to classify financial documents in Uganda’s banking sector?
- You would tailor the prompt to reflect Ugandan banking categories, such as “loan documents,” “account statements,” or “KYC forms.” Specifying contextual categories tells the model where the classification boundaries lie. Adding examples (few-shot prompting) increases precision and reduces ambiguity. Example: a Data Analyst at Stanbic Bank could use this to sort scanned financial documents automatically.


In [34]:
prompt = """
Classify each of the following financial documents as one of:
["Loan Document", "Account Statement", "KYC Form", "Transaction Receipt"]

1. Customer ID form with photo and signature.
2. Monthly statement showing credits and debits.
3. Loan repayment schedule for 2025.
"""

try:
    response = model.generate_content(prompt)
    print("Classifications:\n", response.text.strip())
except Exception as e:
    print("Error:", e)


Classifications:
 Here's the classification of each document:

1.  **Customer ID form with photo and signature:** KYC Form
2.  **Monthly statement showing credits and debits:** Account Statement
3.  **Loan repayment schedule for 2025:** Loan Document
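Because the model replies in free text, it helps to check that every returned label is one of the allowed categories before using it downstream. The sketch below parses numbered lines in the style of the output above; the helper name `parse_classifications` and the parsing rule (take the text after the last colon) are assumptions that fit this output format and may need adjusting for others.

```python
import re

ALLOWED_LABELS = ["Loan Document", "Account Statement", "KYC Form", "Transaction Receipt"]


def parse_classifications(response_text, allowed=ALLOWED_LABELS):
    """Map item number -> label for numbered lines.

    Labels outside the allowed list are returned as None so they can be
    flagged for manual review.
    """
    results = {}
    for line in response_text.splitlines():
        match = re.match(r"\s*(\d+)\.\s+(.*)", line)
        if not match:
            continue  # skip introductory or blank lines
        number, rest = int(match.group(1)), match.group(2)
        # Take the text after the last colon and strip markdown emphasis.
        candidate = rest.rsplit(":", 1)[-1].replace("*", "").strip()
        results[number] = candidate if candidate in allowed else None
    return results


sample_output = (
    "Here's the classification of each document:\n"
    "\n"
    "1.  **Customer ID form with photo and signature:** KYC Form\n"
    "2.  **Monthly statement showing credits and debits:** Account Statement\n"
    "3.  **Loan repayment schedule for 2025:** Loan Document\n"
)
labels = parse_classifications(sample_output)
```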


## QUESTION 4

In the conversation example, how does the prompt structure influence whether the model gives concise vs. elaborate responses?
- The structure, especially system role instructions, determines response length and tone. Telling the model to “respond briefly” yields concise answers, while “explain in detail” triggers elaborate reasoning. Prompt constraints act like communication rules. Models tend to follow context instructions closely, especially when role or style cues are provided early. Example: in an AI chatbot for Data Science tutoring, concise answers are better for quizzes, while detailed ones help explain code concepts.

In [35]:
prompt = """
You are a Data Science tutor.
Answer the following question concisely:
What is overfitting in machine learning?
"""

try:
    response = model.generate_content(prompt)
    print("Concise Answer:\n", response.text.strip())
except Exception as e:
    print("Error:", e)


Concise Answer:
 Overfitting is when a machine learning model learns the training data *too well*, capturing noise and specific details instead of the underlying patterns. This leads to excellent performance on the training data but poor performance on new, unseen data (generalization).


## QUESTION 5
Which example best illustrates the use of few-shot prompting? What changes if you reduce the number of examples provided?
- The few-shot example includes several input-output pairs to guide the model’s pattern learning. Reducing the number of examples may lower accuracy and consistency. Few-shot prompting improves results by showing the model how to respond through examples rather than explicit instructions. With fewer examples, the model relies more on its pretraining, which can reduce domain-specific relevance. Example: to classify crop issues from text, providing two labeled examples (a disease such as coffee wilt and a pest such as a locust invasion) helps the model learn the expected labels and output style.

In [36]:
prompt = """
Classify these crop issues as 'Disease' or 'Pest':

Example 1:
Input: Coffee wilt disease affects yield.
Output: Disease

Example 2:
Input: Locust invasion damaged crops.
Output: Pest

Now classify:
Input: Maize streak virus reduces plant growth.
"""

try:
    response = model.generate_content(prompt)
    print("Classification:\n", response.text.strip())
except Exception as e:
    print("Error:", e)


Classification:
 Output: Disease
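To see what changes when examples are removed, the prompt can be generated with a variable number of shots and the outputs compared. The sketch below rebuilds the prompt above from a list of examples; `build_few_shot_prompt` and its `k` parameter are hypothetical names introduced for this illustration.

```python
EXAMPLES = [
    ("Coffee wilt disease affects yield.", "Disease"),
    ("Locust invasion damaged crops.", "Pest"),
]


def build_few_shot_prompt(query, examples, k=None):
    """Build a classification prompt using the first k examples (all if k is None)."""
    shots = examples if k is None else examples[:k]
    parts = ["Classify these crop issues as 'Disease' or 'Pest':", ""]
    for i, (text, label) in enumerate(shots, start=1):
        parts += [f"Example {i}:", f"Input: {text}", f"Output: {label}", ""]
    # End with an open "Output:" so the model completes only the label.
    parts += ["Now classify:", f"Input: {query}", "Output:"]
    return "\n".join(parts)


query = "Maize streak virus reduces plant growth."
two_shot = build_few_shot_prompt(query, EXAMPLES)
one_shot = build_few_shot_prompt(query, EXAMPLES, k=1)
zero_shot = build_few_shot_prompt(query, EXAMPLES, k=0)
```

Sending `two_shot`, `one_shot`, and `zero_shot` to the same model makes it possible to compare label accuracy and output format side by side.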


## QUESTION 6
Which example best illustrates the use of few-shot prompting? What changes if you reduce the number of examples provided?

- You need to explicitly instruct the model to translate into Luganda and include language context to guide tone and formality. Language prompts benefit from a clear target-language specification. Adding sample translations improves fluency. Example: translating agricultural training materials from English to Luganda for rural farmers in Uganda.


In [37]:
prompt = "Translate to Luganda: Good morning, how are your crops today?"

try:
    response = model.generate_content(prompt)
    print("Luganda Translation:", response.text.strip())
except Exception as e:
    print("Error:", e)


Luganda Translation: The best translation of "Good morning, how are your crops today?" into Luganda is:

**"Olwale olutyo, emmere yo eri etya leero?"**

Here's a breakdown:

*   **Olwale olutyo:** Good morning. It literally means "The morning is beautiful"
*   **emmere yo:** Your crops. "emmere" means "food" or "crops", and "yo" means "your".
*   **eri etya:** Is how? How are...?
*   **leero:** Today.

So, the whole sentence literally means: "Good morning, your crops, how are they today?"

Alternatively, you could also say:

**"Wasuze otya, emmere yo eri etya leero?"**

*   **Wasuze otya?:** How was your night? (A common Luganda greeting)
*   **emmere yo eri etya leero?:** Your crops, how are they today? (same as above)

Both are perfectly acceptable and polite ways to ask about someone's crops. Choosing between them depends on the context of the relationship.


## QUESTION 7
Take the summarization example and experiment with different temperature values. How does the output change in style and precision?
- A low temperature (0.2) yields factual, consistent summaries, while a high temperature (0.8–1.0) produces creative, varied interpretations. Temperature controls randomness in the output. Lower values make the model more deterministic and concise; higher values generate more exploratory text. Example: for summarizing academic reports, use 0.4; for brainstorming new research titles, use 0.9.

In [38]:
prompt = "Summarize: The study explores AI applications in smart farming for East Africa."

# This loop re-runs the same prompt at the default temperature to show run-to-run variation.
# Temperature itself can be set through the generation_config argument of generate_content.
for i in range(3):
    print(f"\n=== Attempt {i+1} ===")
    response = model.generate_content(prompt)
    print(response.text.strip())



=== Attempt 1 ===
The study investigates how Artificial Intelligence (AI) can be used to improve farming practices in East Africa, focusing on its potential to enhance agricultural productivity and sustainability.

=== Attempt 2 ===
This study investigates how artificial intelligence (AI) can be used in smart farming practices to improve agriculture in East Africa.

=== Attempt 3 ===
The study focuses on how artificial intelligence (AI) can be used to improve agricultural practices and outcomes in the context of smart farming in East Africa.
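The three attempts above vary only through sampling at a single temperature. A minimal sketch of an actual temperature sweep follows. It takes the generation call as a parameter so the loop can be run and checked without an API key; `sweep_temperatures` and `fake_generate` are hypothetical helpers, and the temperature values are arbitrary picks.

```python
def sweep_temperatures(generate, prompt, temperatures=(0.2, 0.6, 1.0)):
    """Call generate(prompt, generation_config) once per temperature.

    `generate` is any callable with that shape; returns {temperature: text}.
    """
    outputs = {}
    for temperature in temperatures:
        config = {"temperature": temperature}
        outputs[temperature] = generate(prompt, config)
    return outputs


# Stand-in generator so the sketch runs offline; it only echoes its inputs.
def fake_generate(prompt, generation_config):
    return f"[T={generation_config['temperature']}] {prompt}"


prompt = "Summarize: The study explores AI applications in smart farming for East Africa."
demo = sweep_temperatures(fake_generate, prompt)
for temperature, text in demo.items():
    print(temperature, "->", text)
```

With the notebook's `model`, the stand-in could be replaced by `lambda p, cfg: model.generate_content(p, generation_config=cfg).text.strip()`, assuming the installed `google-generativeai` version accepts a plain dict for `generation_config`.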


## QUESTION 8

In the Q&A example, add a constraint (e.g., answer in under 20 words). What happens when the model cannot fit the answer within the constraint?
- The model attempts to comply, sometimes oversimplifying or omitting detail. If constraints conflict with completeness, brevity usually takes precedence. Constraints limit length or style. Well-defined constraints enhance focus but can cause information loss if they are too strict. Example: for chatbot assessments, students might be required to answer concisely in under 20 words.

In [39]:
prompt = "Answer in under 20 words: What is artificial intelligence?"

try:
    response = model.generate_content(prompt)
    print("Constrained Answer:\n", response.text.strip())
except Exception as e:
    print("Error:", e)


Constrained Answer:
 Artificial intelligence is the simulation of human intelligence processes by computer systems.
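Whether the model actually respected the limit can be checked programmatically rather than by eye. The sketch below counts whitespace-separated words; the helper name `within_word_limit` is hypothetical, and "under 20 words" is read as strictly fewer than 20.

```python
def within_word_limit(answer, max_words=20):
    """Return (ok, word_count), where ok means strictly fewer than max_words."""
    count = len(answer.split())
    return count < max_words, count


answer = (
    "Artificial intelligence is the simulation of human intelligence "
    "processes by computer systems."
)
ok, count = within_word_limit(answer)
print(ok, count)
```

A response that fails the check can be re-requested or flagged, which also reveals the cases where the model could not fit the answer into the constraint.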


## QUESTION 9

Use the text-to-SQL example but adapt it for a dataset of students’ exam records.
- You can prompt the model to generate SQL queries that analyze student performance by subject or by average score. Text-to-SQL leverages natural language understanding to build structured database queries. Contextual prompts that state the table and column names improve column mapping accuracy. Example: a Data Science system at UCU can automatically query student averages or identify students failing two or more subjects.
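Generated SQL should be run against a small table with known answers before it is used on real records. The sketch below does this with an in-memory SQLite database. The `student_records` schema follows the one assumed in this section's prompt; the rows and the helper name `run_on_sample` are made up for illustration, and SQLite may differ in dialect from the production database.

```python
import sqlite3

# Query of the kind a text-to-SQL prompt is expected to return.
QUERY = """
SELECT student_id, name
FROM student_records
WHERE course = 'Data Science'
GROUP BY student_id, name
HAVING AVG(marks) > 80;
"""


def run_on_sample(query):
    """Run the query on a tiny in-memory table and return the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student_records "
        "(student_id INTEGER, name TEXT, course TEXT, marks REAL)"
    )
    conn.executemany(
        "INSERT INTO student_records VALUES (?, ?, ?, ?)",
        [
            (1, "Student A", "Data Science", 90),  # average 87: should match
            (1, "Student A", "Data Science", 84),
            (2, "Student B", "Data Science", 70),  # average 79: should not match
            (2, "Student B", "Data Science", 88),
            (3, "Student C", "Statistics", 95),    # other course: should not match
        ],
    )
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


rows = run_on_sample(QUERY)
assert rows == [(1, "Student A")]
```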

In [40]:
prompt = """
Convert this question into an SQL query:
"Find all students with average marks above 80 in the 'Data Science' course."
Assume table name = student_records(columns: student_id, name, course, marks)
"""

try:
    response = model.generate_content(prompt)
    print("Generated SQL:\n", response.text.strip())
except Exception as e:
    print("Error:", e)


Generated SQL:
 ```sql
SELECT student_id, name
FROM student_records
WHERE course = 'Data Science'
GROUP BY student_id, name
HAVING AVG(marks) > 80;
```

**Explanation:**

1. **`SELECT student_id, name`**: This selects the student ID and name that satisfy the condition.
2. **`FROM student_records`**:  Specifies the table to query.
3. **`WHERE course = 'Data Science'`**: This filters the records to only include those where the course is 'Data Science'.
4. **`GROUP BY student_id, name`**:  This groups the records by student ID and name. This is crucial because we want to calculate the *average* marks *per student*.  Without this, `AVG(marks)` would calculate the average across *all* students in the 'Data Science' course, which is not what we want.
5. **`HAVING AVG(marks) > 80`**:  This filters the *grouped* results.  It only keeps the groups (i.e., the students) whose average marks are greater than 80.  `HAVING` is used for filtering *after* aggregation (like `AVG`, `SUM`, `COUNT`), while

## QUESTION 10

Compare the outputs of creative writing and code generation prompts using the same seed idea (e.g., “design an app for Harba Store”). What differences do you observe?
- Creative writing focuses on imagination, narrative, and user experience, while code generation emphasizes structure, logic, and implementation. Different prompt types draw out different kinds of output from the model: creative prompts yield descriptive text, while code prompts produce structured, syntax-bound results. Example: a Data Science student designing a store inventory app will get an app concept from the creative prompt and an actual Python script from the code prompt.


In [41]:
# Creative version
creative_prompt = "Write a short concept for a smart inventory app for Harba Store in Uganda."
creative_response = model.generate_content(creative_prompt)
print("Creative Idea:\n", creative_response.text.strip())

# Code generation version
code_prompt = "Write a Python Flask API that manages product inventory for Harba Store."
code_response = model.generate_content(code_prompt)
print("\nGenerated Code:\n", code_response.text.strip())


Creative Idea:
 ## Harba Store Smart Inventory App: "HarbaStock"

**Concept:** HarbaStock is a mobile-first smart inventory management application designed specifically for Harba Store in Uganda, addressing the unique challenges of fluctuating demand, unreliable infrastructure, and limited connectivity.

**Core Functionality:**

*   **Real-time Stock Tracking:**  Allows staff to easily track stock levels for each item across all store locations (if applicable).  Utilizes barcode scanning and manual entry for efficient updates.
*   **Low Stock Alerts:**  Automatically notifies staff via SMS and in-app notifications when stock levels reach pre-defined thresholds, preventing stockouts and maximizing sales.
*   **Order Management:**  Simplifies the ordering process by generating purchase orders based on low stock alerts and sales trends. Tracks order status and manages supplier information.
*   **Sales Data Analysis:**  Provides insightful reports on sales trends, popular items, and slow-m