# 🧪 Lab 2: Building a Streamlit App with GenAI, Data Handling, and Visualization

📌 **Important:** This code needs to be copied into a Streamlit app. It will not run properly in a Jupyter Notebook.  

Save the file as `streamlit_app.py`. You can then run it from the command line as:

```bash
streamlit run streamlit_app.py
```
This will launch your app in the browser.

In this lab, you'll build a simple prototype using Streamlit and OpenAI's GPT model to classify product reviews and visualize the results.

## Starting Code
In a previous lesson, you built a working GenAI prototype that loads real data, applies a model, and visualizes the output interactively with filtering by product and date.

Here is the code: 

In [None]:
import pandas as pd
import streamlit as st

df = pd.read_csv("customer_reviews.csv")
df.head()

df.dropna(inplace=True)
df["SUMMARY"] = df["SUMMARY"].astype(str)

@st.cache_data
def get_sentiment(text):
    import os
    from dotenv import load_dotenv
    from openai import OpenAI

    # Read OPENAI_API_KEY from a local .env file
    load_dotenv()
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

    # openai>=1.0 removed openai.ChatCompletion.create; use the client object instead
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"What’s the sentiment score of this review? {text}"}]
    )
    return response.choices[0].message.content

# if the button is clicked, do sentiment analysis
if st.button("Analyze Reviews"):
    df["Sentiment"] = df["SUMMARY"].apply(get_sentiment)
    df

    # show the results as a bar chart; this stays inside the if block
    # because the "Sentiment" column only exists after the analysis has run
    st.bar_chart(df["Sentiment"].value_counts())

# add a drop-down menu to select products
product = st.selectbox("Choose a product", df["Product"].unique())
filtered = df[df["Product"] == product]
filtered

## Lab 2 Instructions: 
Take the prototype you just built and customize it independently to analyze a different kind of user input using the same GenAI integration.

Instead of analyzing customer sentiment, imagine you’re building a prototype for a support ticket triage system. 

The goal is to classify support requests as either: 
<ul>
	<li>"bug" </li>
	<li>"feature request"</li> 
	<li>"question" </li>	
</ul>
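
A chat model may answer with extra wording or different capitalization (for example "This is a Bug."), which would split one category into several bars in a chart. The sketch below shows one possible way to map a free-form reply onto the three labels. The helper name `normalize_category` and the `"unknown"` fallback are assumptions made for illustration; they are not part of the lab's starter code.

```python
# Hypothetical helper (not part of the lab code): map a free-form model reply
# onto one of the three ticket categories.
CATEGORIES = ("bug", "feature request", "question")

def normalize_category(reply: str) -> str:
    """Return the known category mentioned earliest in the reply, else "unknown"."""
    text = reply.strip().lower()
    # Record where each known category appears in the reply, if at all
    hits = [(text.find(c), c) for c in CATEGORIES if c in text]
    return min(hits)[1] if hits else "unknown"

print(normalize_category("This looks like a Feature Request."))
```

You could apply a helper like this to the model's reply before storing it in the DataFrame.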
 
✅ What You’ll Do:
<ol>
    <li>Replace the dataset with a file called support_tickets.csv, located in the M1/Lab2 folder.</li>
    <li>Update the GenAI prompt in your get_sentiment() function to instead ask: "What type of support request is this: bug, feature request, or question?"</li>
    <li>Rename the function to classify_ticket() for clarity.</li>
    <li>Change the output column name from "Sentiment" to "Category".</li>
    <li>Update your charts to show counts of each ticket category instead of sentiment.</li>
    <li>[Optional] Add a product filter if your data includes a product field.</li>
</ol>
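
The prompt change and the rename can be sketched without calling the API at all. In the sketch below, `build_prompt` and the `ask_model` parameter are hypothetical names introduced only for illustration: `ask_model` stands in for the OpenAI call inside `get_sentiment()`, so the structure can be tried with a fake model and no API key. Treat it as one possible shape, not the lab's reference solution.

```python
from typing import Callable

# The question the lab asks you to use in the prompt
PROMPT = "What type of support request is this: bug, feature request, or question?"

def build_prompt(ticket_text: str) -> str:
    """Combine the lab's question with the text of one ticket."""
    return f"{PROMPT} {ticket_text}"

def classify_ticket(ticket_text: str, ask_model: Callable[[str], str]) -> str:
    """Send the prompt through `ask_model` and return its reply, trimmed.

    `ask_model` is a hypothetical stand-in for the OpenAI call; any function
    that takes a prompt string and returns text works here.
    """
    return ask_model(build_prompt(ticket_text)).strip()

# Try it with a fake model, so no API key is needed
fake_model = lambda prompt: " bug "
print(classify_ticket("The app crashes when I upload a file.", fake_model))
```

In the Streamlit app, the real model call would take the place of `fake_model`, and the result would be stored in `df["Category"]`.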

You can add a date range selector using the following code. Because it reads `df["DATE"]`, it must come after the DataFrame is loaded and before any code that uses the selected dates: 

In [None]:
# Ensure your DATE column is in datetime format
df["DATE"] = pd.to_datetime(df["DATE"])

# Create a date-range picker, limited to the dates covered by the dataset
date_range = st.date_input(
    "Select dates",
    value=(df["DATE"].min().date(), df["DATE"].median().date()),  # default: earliest to median date
    min_value=df["DATE"].min().date(),
    max_value=df["DATE"].max().date(),
)

Explanation:  
	•	This creates a date picker with a start date and an end date.  
	•	The default selection spans from the earliest date in your dataset to the median date.  
	•	The widget limits selectable dates to your dataset’s min and max range, so the user can’t choose a date outside the period the data covers.  

🗣️ “We’re using the st.date_input() widget to create a user-friendly way to select a custom date range. The dataset’s minimum and median dates serve as our default range.”
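
To see which values end up as the default range and the limits, it can help to compute them outside Streamlit first. The sketch below uses three made-up dates purely for illustration; in the lab, the values come from the DATE column of your CSV file.

```python
import pandas as pd

# Made-up dates, only to illustrate; the lab's CSV supplies the real ones
df = pd.DataFrame({"DATE": ["2024-01-01", "2024-01-03", "2024-01-10"]})
df["DATE"] = pd.to_datetime(df["DATE"])

earliest = df["DATE"].min().date()           # lower limit and default start
latest = df["DATE"].max().date()             # upper limit
default_end = df["DATE"].median().date()     # default end of the selection

print(earliest, default_end, latest)
```

Note that the median is the middle date of the data, not the midpoint of the calendar range, so the default selection covers roughly half of the rows.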

In [None]:
if len(date_range) == 2:
    start_date = pd.to_datetime(date_range[0])
    end_date = pd.to_datetime(date_range[1])

    st.write("Selected dates:", start_date.date(), "to", end_date.date())

    df_filtered = df[(df["DATE"] >= start_date) & (df["DATE"] <= end_date)]
    df_filtered

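Explanation:  
	•	We check that the user actually picked two dates (start and end). While a selection is still in progress, the widget returns only the start date, so this check avoids an error.  
	•	Then we convert those to datetime format using Pandas.  
	•	Finally, we filter the DataFrame and display the result.  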

🗣️ “Once a range is selected, we filter the DataFrame to only include rows where the ‘DATE’ column falls within that range. This helps users zoom in on specific time periods—like analyzing just the last 30 days, or comparing different quarters.”
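
The same filter can be wrapped in a small function and tried on sample data, for example to keep only the last 30 days. The function name `filter_by_date`, the sample tickets, and the choice to measure "last 30 days" back from the newest date in the data are all assumptions made for this sketch.

```python
import pandas as pd

def filter_by_date(df, start, end, column="DATE"):
    """Keep rows whose date falls within [start, end], both ends inclusive."""
    start, end = pd.to_datetime(start), pd.to_datetime(end)
    return df[(df[column] >= start) & (df[column] <= end)]

# Made-up tickets, only for illustration
tickets = pd.DataFrame({
    "DATE": pd.to_datetime(["2024-01-01", "2024-02-15", "2024-03-01"]),
    "Category": ["bug", "question", "bug"],
})

# "Last 30 days", measured back from the newest date in the data
end = tickets["DATE"].max()
start = end - pd.Timedelta(days=30)
recent = filter_by_date(tickets, start, end)
print(recent)
```

In the app, `start` and `end` would instead come from the two values returned by the date picker.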