# Hackathon AI Indus Week
## Data Science and AI - Batch 3
Note: Select any sales dataset of your choice. Do not change the provided code in the notebook; otherwise the autograder will not grade your task correctly.

# Section 1: Python Problem Solving

## Problem 1: String Compression
Task: Write a function compress_string(s) that compresses a string by replacing each run of consecutive repeated characters with the character followed by its count. Example: "aaabbc" → "a3b2c1"

In [2]:
def compress_string(s: str) -> str:
    """
    Compress the string by counting consecutive characters.
    Example: 'aaabbc' -> 'a3b2c1'
    """
    if not s:
        return ""

    compressed = ""
    count = 1

    for i in range(1, len(s)):
        if s[i] == s[i - 1]:
            count += 1
        else:
            compressed += s[i - 1] + str(count)
            count = 1

    compressed += s[-1] + str(count)
    return compressed
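
The loop above tracks each run by hand. As an illustrative alternative (not part of the graded solution), the same run-length encoding can be sketched with `itertools.groupby` from the standard library; the name `compress_string_groupby` is hypothetical.

```python
from itertools import groupby


def compress_string_groupby(s: str) -> str:
    """Run-length encode s with itertools.groupby (illustrative alternative)."""
    # groupby yields one (character, run-iterator) pair per run of equal characters;
    # counting the items in each run gives the number to append.
    return "".join(f"{char}{sum(1 for _ in run)}" for char, run in groupby(s))


print(compress_string_groupby("aaabbc"))  # a3b2c1
```

An empty string produces no runs, so the result is an empty string without a special case.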





## Problem 2: Reverse a String
Task: Write reverse_string(s) that returns the reversed version of the input string.

In [3]:
def reverse_string(s: str) -> str:
    """
    Return the reversed string.
    Example: 'hello' -> 'olleh'
    """
    return s[::-1]


## Problem 3: Sum of List
Task: Write sum_list(numbers) that returns the sum of all numbers in a list.

In [4]:
def sum_list(numbers: list[int]) -> int:
    """
    Return the sum of all numbers in the list.
    Example: [1,2,3] -> 6
    """
    return sum(numbers)


## Problem 4: Palindrome Check
Task: Write is_palindrome(s) that checks if a string reads the same forward and backward.

In [5]:
def is_palindrome(s: str) -> bool:
    """
    Check if the string is a palindrome.
    Example: 'madam' -> True
    """
    s = s.lower()          # ignore case
    s = s.replace(" ", "") # ignore spaces
    return s == s[::-1]
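
The version above ignores case and spaces only, so a phrase such as "Madam, I'm Adam" is rejected because of its punctuation. If the grader does not require that behaviour, a stricter normalisation could look like the sketch below; `is_palindrome_alnum` is a hypothetical name and the choice to drop every non-alphanumeric character is an assumption.

```python
def is_palindrome_alnum(s: str) -> bool:
    """Palindrome check that keeps only letters and digits (hypothetical variant)."""
    # Lowercase every alphanumeric character and discard everything else
    cleaned = [ch.lower() for ch in s if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(is_palindrome_alnum("A man, a plan, a canal: Panama"))  # True
```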


## Problem 5: Check Balanced Parentheses
Task: Write is_balanced(s) that checks if a string of parentheses is balanced.
Example: "(()())" → True, "(()" → False

In [6]:
# Starter Code
def is_balanced(s: str) -> bool:
    """
    Check if parentheses are balanced.
    Example: '(()())' -> True, '(()' -> False
    """
    stack = []

    for char in s:
        if char == '(':
            stack.append(char)
        elif char == ')':
            if not stack:
                return False
            stack.pop()

    return len(stack) == 0
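
Because the task involves a single bracket type, the stack only ever holds identical items, so its length is all that matters. A minimal sketch of the same check with an integer counter follows; `is_balanced_counter` is a hypothetical name, and a real stack would still be needed if several bracket types had to be matched.

```python
def is_balanced_counter(s: str) -> bool:
    """Check balanced parentheses with a depth counter instead of a stack."""
    depth = 0
    for char in s:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            # A negative depth means a ')' appeared before its matching '('
            if depth < 0:
                return False
    # Every '(' must have been closed by the end of the string
    return depth == 0


print(is_balanced_counter("(()())"))  # True
print(is_balanced_counter("(()"))     # False
```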


# Section 2: Data Analysis & Visualization
You can select any sales dataset from Kaggle.

## Problem 1: Average Sales by Region

In [None]:
import pandas as pd

# Replace 'sales_data.csv' with the path to your own dataset
df = pd.read_csv("sales_data.csv")

# Check first 5 rows
df.head()


In [7]:
import pandas as pd

def average_sales_by_region(df: pd.DataFrame) -> pd.DataFrame:
    """
    Calculate the average sales per region.
    """
    result = df.groupby('Region', as_index=False)['Sales'].mean()
    result.rename(columns={'Sales': 'Average_Sales'}, inplace=True)

    return result
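
To make the group-by step concrete, the same per-region mean can be sketched in plain Python. The rows below are made-up values standing in for a dataset with 'Region' and 'Sales' columns, and `average_by_region` is a hypothetical helper, not part of the notebook.

```python
from collections import defaultdict

# Hypothetical rows; a real dataset would come from the CSV loaded earlier
rows = [
    {"Region": "East", "Sales": 100.0},
    {"Region": "East", "Sales": 300.0},
    {"Region": "West", "Sales": 50.0},
]


def average_by_region(rows):
    """Return {region: mean of Sales} for a list of row dictionaries."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        # Accumulate the sum and the number of rows for each region
        totals[row["Region"]] += row["Sales"]
        counts[row["Region"]] += 1
    return {region: totals[region] / counts[region] for region in totals}


print(average_by_region(rows))  # {'East': 200.0, 'West': 50.0}
```

The pandas version performs the same sum-and-count per group, then returns the result as a DataFrame with the renamed 'Average_Sales' column.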


## Problem 2: Monthly Sales Trend

In [1]:
import matplotlib.pyplot as plt
import pandas as pd

def plot_sales_trend(df: pd.DataFrame):
    """
    Plot line chart of sales over months.
    Assumes df has 'Month' and 'Sales' columns.
    """
    fig, ax = plt.subplots(figsize=(10,6))  # create figure and axis
    ax.plot(df['Month'], df['Sales'], marker='o', linestyle='-')
    ax.set_title('Monthly Sales Trend')
    ax.set_xlabel('Month')
    ax.set_ylabel('Sales')
    ax.grid(True)

    plt.xticks(rotation=45)  # rotate month labels if needed
    plt.tight_layout()
    plt.show()

    return ax



## Problem 3: Customer Retention Rate Analysis
Scenario:  
You are given a DataFrame df with columns:

- CustomerID
- Month (e.g., "2025-01", "2025-02")
- PurchaseAmount

Each row represents a purchase made by a customer in a given month.

Task:  
Write a function customer_retention(df) that calculates the monthly retention rate:

Retention rate = (Number of customers who purchased in both current and previous month) ÷ (Number of customers who purchased in the previous month).

Return a DataFrame with columns: Month and RetentionRate.

This requires students to think about set intersections across months, not just simple grouping.
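
The formula can be checked by hand on a tiny example before involving pandas. The sketch below uses made-up customer IDs and a hypothetical helper `retention_rates`; it assumes months are "YYYY-MM" strings, so sorting them as text puts them in calendar order.

```python
# Hypothetical customer sets per month (IDs invented for illustration)
customers_by_month = {
    "2025-01": {"C1", "C2", "C3", "C4"},
    "2025-02": {"C2", "C3", "C5"},
    "2025-03": {"C3", "C6"},
}


def retention_rates(customers_by_month):
    """Return {month: retention rate relative to the preceding month in the data}."""
    months = sorted(customers_by_month)
    rates = {}
    # Pair each month with the one before it; the first month has no predecessor
    for prev, curr in zip(months, months[1:]):
        prev_set = customers_by_month[prev]
        retained = prev_set & customers_by_month[curr]  # customers in BOTH months
        rates[curr] = len(retained) / len(prev_set) if prev_set else 0.0
    return rates


rates = retention_rates(customers_by_month)
print(rates["2025-02"])  # 0.5
```

In February, two of January's four customers (C2 and C3) return, giving 2 ÷ 4 = 0.5; in March, one of February's three customers returns, giving 1 ÷ 3. Note that this compares adjacent months present in the data, so a missing calendar month would be skipped rather than treated as zero customers.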

In [10]:
import pandas as pd

def customer_retention(df: pd.DataFrame) -> pd.DataFrame:
    # 1. Get unique customers for each month
    # We use a set for each month to make intersections easy later
    monthly_customers = df.groupby('Month')['CustomerID'].apply(set)

    # 2. Sort index to ensure we are comparing consecutive months correctly
    monthly_customers = monthly_customers.sort_index()

    retention_results = []
    months = monthly_customers.index

    # 3. Iterate starting from the second month (index 1)
    # because the first month has no "previous" month to compare to
    for i in range(1, len(months)):
        prev_month = months[i-1]
        curr_month = months[i]

        prev_set = monthly_customers[prev_month]
        curr_set = monthly_customers[curr_month]

        # Intersection: customers in BOTH months
        retained_customers = curr_set.intersection(prev_set)

        # Calculation: (Both) / (Previous)
        # We check if prev_set is empty to avoid division by zero
        rate = len(retained_customers) / len(prev_set) if len(prev_set) > 0 else 0.0

        retention_results.append({
            'Month': curr_month,
            'RetentionRate': rate
        })

    return pd.DataFrame(retention_results)

# Section 3: Generative AI Chatbot
Create a Study Aid chatbot that provides detailed notes with a visualization on the topic requested by the user. Use both text and image models, and make sure they are free, open-source models. List your chatbot's features. Create the UI with Gradio.

In [16]:
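import os

import gradio as gr
from huggingface_hub import InferenceClient

# Read the Hugging Face token from the HF_TOKEN environment variable
# (set it before running this cell); never hardcode a secret token in the notebook
client = InferenceClient(token=os.environ.get("HF_TOKEN"))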

def study_bot(topic):
    try:
        # 1. Request the study notes from the text model
        text_res = client.chat_completion(
            model="mistralai/Mistral-7B-Instruct-v0.2",
            messages=[{"role": "user", "content": f"Write short study notes on {topic}"}],
            max_tokens=300
        )
        notes = text_res.choices[0].message.content

        # 2. Request the diagram from the image model
        image = client.text_to_image(f"Educational diagram of {topic}", model="stabilityai/stable-diffusion-xl-base-1.0")

        return notes, image
    except Exception as e:
        return f"An error occurred: {e}", None

# Gradio Interface
with gr.Blocks() as demo:
    gr.Markdown("# 🎓 My Study Chatbot")
    input_box = gr.Textbox(label="Enter a topic (e.g. Photosynthesis)")
    btn = gr.Button("Generate")

    with gr.Row():
        txt_out = gr.Textbox(label="Notes")
        img_out = gr.Image(label="Diagram")

    btn.click(study_bot, inputs=input_box, outputs=[txt_out, img_out])

demo.launch()
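
As written, study_bot forwards whatever is typed, including an empty string, to both models. One possible guard is sketched below; `clean_topic` is a hypothetical helper, and the 200-character limit is an arbitrary assumption rather than a limit imposed by either model.

```python
def clean_topic(topic: str, max_length: int = 200) -> str:
    """Hypothetical helper: normalise a topic string before it is sent to the models."""
    # Collapse runs of whitespace and trim both ends
    cleaned = " ".join(topic.split())
    if not cleaned:
        raise ValueError("Please enter a topic.")
    # Truncate very long input (max_length is an assumed, arbitrary limit)
    return cleaned[:max_length]


print(clean_topic("  Photo   synthesis "))  # Photo synthesis
```

If this were called at the start of the try block in study_bot, the ValueError would be caught by the existing except clause and shown in the Notes box.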



