# YouTube Link
https://youtu.be/SMxRzBQ6eHA

![image.png](attachment:9525bef1-5e2a-4c94-970f-aa18e07ccaa3.png)

Generative AI is a subset of artificial intelligence that produces new, synthetic data and content, which makes it significant for data science. This section explains why generative AI is both compelling and relevant to data science, and outlines its theoretical underpinnings.

**1. Why Generative AI is Interesting and Relevant in Data Science:**

**Data Augmentation:** Generative AI techniques offer another approach to data augmentation, which is crucial in data science. By generating synthetic data samples, these models can help address data scarcity and class imbalance, which can improve the robustness and generalization of machine learning models. Augmentation of this kind increases the diversity of a dataset, although the benefit depends on how faithfully the synthetic samples reflect the real data.

**Content Generation:** Generative AI models excel at producing human-like text, images, and other forms of content based on given prompts. In data science, this capability is invaluable for various applications such as natural language processing, text summarization, and dialogue systems. For instance, in text-based analysis, generative models can aid in content creation, automated report generation, and writing assistance tasks, thereby streamlining workflows and increasing productivity.

**Personalization:** Personalization is a key aspect of many data-driven applications, including recommendation systems, personalized marketing, and customer service. Generative AI models enable the creation of personalized content tailored to individual preferences and requirements. By understanding user input and generating contextually relevant responses, these models enhance user engagement and satisfaction, leading to more effective and targeted interactions.

**Creative Applications:** Beyond traditional data analysis tasks, generative AI fosters creativity and innovation in data science. These models empower users to explore new avenues of expression through art generation, music composition, and storytelling. By generating novel and diverse outputs, generative AI expands the possibilities of what can be achieved with AI, inspiring new forms of expression and exploration.

**2. Theoretical Foundations behind Generative AI:**

**Self-Attention Mechanism:** Generative AI models, particularly transformer-based architectures like GPT (Generative Pre-trained Transformer), leverage self-attention mechanisms to capture dependencies between different parts of the input sequence. This mechanism enables the model to weigh the importance of each input token based on its context within the sequence, facilitating effective long-range dependency modeling.
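
The weighting described above can be sketched in a few lines of plain Python. This is a minimal illustration of scaled dot-product attention for a single head, assuming made-up two-dimensional token vectors; it leaves out the learned query/key/value projections, multiple heads, and the causal mask that GPT-style models use, so it is not how any particular model is implemented.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention for one head, without learned projections.

    Each argument is a list of token vectors (lists of floats).
    Returns (outputs, weights); weights[i][j] is how much token i attends to token j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        row = softmax(scores)
        weights.append(row)
        # The output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[c] for w, v in zip(row, values)) for c in range(len(values[0]))]
        )
    return outputs, weights

# Hypothetical embeddings for three tokens (illustrative numbers only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1, so every output vector is a weighted average of all token vectors in the sequence.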

**Language Modeling:** Generative AI models are typically trained using unsupervised learning objectives such as language modeling. Language models learn the probability distribution of sequences in a given language by predicting the next token in a sequence given the previous tokens. This process enables the model to generate coherent and contextually relevant text based on the input prompt.
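
To make next-token prediction concrete, here is a deliberately tiny sketch: a bigram model that estimates the probability of the next token from counts, conditioning on only the single previous token. The corpus is invented for illustration. GPT-style models condition on all previous tokens and use a neural network rather than counts, so this shows the training objective in miniature, not the actual architecture.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Estimate P(next_token | previous_token) from whitespace-tokenized sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of each sentence.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    # Normalize the counts for each previous token into probabilities.
    return {
        prev: {tok: n / sum(nxt_counts.values()) for tok, n in nxt_counts.items()}
        for prev, nxt_counts in counts.items()
    }

# Hypothetical toy corpus.
corpus = [
    "save money every month",
    "save money for travel",
    "spend money on travel",
]
model = train_bigram_model(corpus)
print(model["money"])  # distribution over the token that follows "money"
```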

**Probabilistic Sampling:** Generative AI models learn probability distributions over the output space, allowing them to generate diverse and realistic samples. Sampling from these distributions produces novel outputs that resemble human-generated data. By incorporating probabilistic sampling techniques, generative AI models can produce outputs with varying degrees of creativity and diversity, catering to different application requirements and preferences.
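
One common way to control the "degree of creativity" mentioned above is temperature sampling: the model's scores (logits) are divided by a temperature before being turned into probabilities. The sketch below assumes a hypothetical three-word vocabulary and hand-picked logits; real models produce logits over tens of thousands of tokens.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(vocab, logits, temperature=1.0, rng=None):
    """Draw one token from the temperature-adjusted distribution."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical vocabulary and logits for illustration.
vocab = ["save", "spend", "invest"]
logits = [2.0, 1.0, 0.1]
rng = random.Random(0)
cool = [sample_token(vocab, logits, temperature=0.2, rng=rng) for _ in range(10)]
warm = [sample_token(vocab, logits, temperature=2.0, rng=rng) for _ in range(10)]
print("temperature 0.2:", cool)
print("temperature 2.0:", warm)
```

Low temperatures concentrate probability on the highest-scoring token, while high temperatures flatten the distribution and produce more varied output.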

In summary, generative AI offers a wealth of opportunities and advancements in data science, from enhancing data augmentation and content generation to fostering creativity and personalization. By leveraging theoretical foundations such as self-attention mechanisms, language modeling, and probabilistic sampling, generative AI models pave the way for innovative solutions and applications in the field of data science.

This code creates a Budgeting Assistant using Streamlit and OpenAI's GPT-3.5 model to generate personalized budget plans based on user input.

### 1. Setting up the Environment:
- The code imports necessary libraries such as Streamlit, OpenAI, and `dotenv` for loading environment variables.
- `load_dotenv()` loads environment variables from a `.env` file.
- The OpenAI API key is read from the `OPEN_AI_KEY` environment variable with `os.getenv()`, which keeps the key out of the source code.



In [None]:
import streamlit as st
import openai
import os
from dotenv import load_dotenv


In [None]:
# Load environment variables from the .env file
load_dotenv()
# Set your OpenAI API key
openai.api_key = os.getenv("OPEN_AI_KEY")
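
`os.getenv()` returns `None` when the variable is missing, and that would otherwise only surface later as an authentication error. A small guard like the one below can make the failure explicit. The helper name `require_env` is an assumption for this sketch and is not part of the original code.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise a clear error if unset."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; add it to your .env file."
        )
    return value

# Possible usage in this notebook (the name must match the entry in .env):
# openai.api_key = require_env("OPEN_AI_KEY")
```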



### 2. Defining the `generate_budget_plan` Function:

- This function takes various parameters representing the user's financial details such as monthly income, expenses, savings goals, etc.
  
- It constructs a prompt string based on these input parameters, specifying the user's financial situation and requirements.
  
- The constructed prompt is then sent to the GPT-3.5 model through the `openai.ChatCompletion.create()` method (the interface provided by `openai` package versions before 1.0), which returns a personalized budget plan based on the provided information.

In [None]:
def generate_budget_plan(monthly_income, expenses, savings_goals, time_to_achieve_goal, budget_to_achieve_goal, spending_categories):
    # Generate a personalized budget plan using GPT-3.5 (gpt-3.5-turbo)
    initial_prompt = "You are a budget planning assistant with extensive knowledge in financial planning."
    question1 = f"Generate a monthly budget plan for a user with the following financial details:\n\
    - Monthly Income: ${monthly_income}\n\
    - Monthly Expenses: {expenses}\n\
    - Savings Goals: {savings_goals}\n\
    - Timeframe to Achieve Goals: {time_to_achieve_goal} months\n\
    - Budget Allocation for Goals: {budget_to_achieve_goal}\n\
    - Spending Categories: {spending_categories}\n\
    Ensure the budget plan includes allocations for each spending category and savings goals, and provides recommendations on how to distribute the available income effectively."

    response1 = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": initial_prompt},
            {"role": "user", "content": question1},
        ]
    )

    return response1['choices'][0]['message']['content']
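
The call above uses `openai.ChatCompletion.create()`, which exists only in `openai` package versions before 1.0. In version 1.0 and later, the same request goes through a client object. The sketch below shows that shape under stated assumptions: the function name `request_budget_plan` is hypothetical, and the client is passed in as a parameter so the example can run offline against a stand-in object instead of the real API.

```python
from types import SimpleNamespace

def request_budget_plan(client, system_prompt, user_prompt, model="gpt-3.5-turbo"):
    """Send one system + user message pair and return the assistant's reply text.

    `client` is expected to expose `chat.completions.create(...)`, as the
    `openai.OpenAI` client does in openai>=1.0.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content

# Offline stand-in for a real client, so the sketch runs without network access.
def _fake_create(model, messages):
    reply = f"[{model}] received {len(messages)} messages"
    return SimpleNamespace(
        choices=[SimpleNamespace(message=SimpleNamespace(content=reply))]
    )

fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)
demo_reply = request_budget_plan(
    fake_client, "You are a budget planning assistant.", "Plan my budget."
)
print(demo_reply)
```

With the real library installed, the client would be created with `from openai import OpenAI` followed by `client = OpenAI(api_key=...)`, and then passed to the function in place of `fake_client`.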




### 3. Main Streamlit Application:

- **Title and Sidebar Inputs**: The `main()` function sets up the Streamlit application, displaying a title and a sidebar for input parameters.
  
- **Input Parameters**: Users can input their monthly income, expenses, savings goals, timeframe to achieve goals, budget for goals, and spending categories.
  
- **Generate Budget Plan Button**: When the user clicks the "Generate Budget Plan" button, the input parameters are passed to the `generate_budget_plan` function to generate a personalized budget plan.
  
- **Display of Budget Plan**: The generated budget plan is displayed in the main area of the Streamlit application interface, allowing users to view their personalized budget recommendations.

In [None]:
def main():
    st.title("Budgeting Assistant")

    # Input fields on the left side panel
    st.sidebar.title("Input Parameters")
    monthly_income = st.sidebar.number_input("Enter your monthly income:", value=0)
    expenses = st.sidebar.text_input("Enter your monthly expenses:", "")
    savings_goals = st.sidebar.text_input("Enter your savings goals:", "")
    time_to_achieve_goal = st.sidebar.text_input("Enter time you want to achieve goals in:", "")
    budget_to_achieve_goal = st.sidebar.text_input("Enter budget to achieve goals:", "")
    spending_categories = st.sidebar.text_input("Enter your spending categories separated by commas:", "")

    if st.sidebar.button("Generate Budget Plan"):
        # Output displayed in the middle area
        st.subheader("Your Personalized Budget Plan:")
        budget_plan = generate_budget_plan(monthly_income, expenses, savings_goals, time_to_achieve_goal, budget_to_achieve_goal, spending_categories)
        st.write(budget_plan)
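
Most sidebar fields are free-text inputs, so their values reach the prompt exactly as typed. A light cleaning step before calling `generate_budget_plan` could look like the sketch below. Both helper names are hypothetical and not part of the original application.

```python
def parse_categories(raw):
    """Split a comma-separated string into clean, de-duplicated category names."""
    seen = set()
    categories = []
    for part in raw.split(","):
        name = part.strip()
        # Skip empty entries and case-insensitive repeats, keeping the first spelling.
        if name and name.lower() not in seen:
            seen.add(name.lower())
            categories.append(name)
    return categories

def parse_months(raw):
    """Parse the timeframe text box; return None unless it is a positive whole number."""
    text = raw.strip()
    if text.isdecimal() and int(text) > 0:
        return int(text)
    return None

print(parse_categories(" Rent, food ,,Travel, rent "))
print(parse_months("12"), parse_months("a year"))
```

In the app, a `None` result from `parse_months` could be used to show a message asking the user for a number of months instead of sending an ambiguous prompt.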

### 4. Running the Application:
- The `if __name__ == "__main__":` block ensures that the `main()` function is executed when the script is run as the main program.



In [None]:
if __name__ == "__main__":
    main()


### Summary:

This code demonstrates how to build a Budgeting Assistant using Streamlit and OpenAI's GPT-3.5 model. Users can input their financial details, and the system generates a personalized budget plan tailored to their income, expenses, savings goals, and spending categories. This illustrates the practical application of Generative AI techniques in providing personalized financial recommendations and assisting users in managing their finances effectively.

![Screenshot 2024-04-05 at 12.24.06 AM.png](attachment:eaf45a1a-cf76-4bb6-9bad-534859659808.png)

### Validation 

In [1]:
pip install rouge-score



In [2]:
# Import necessary libraries
from rouge_score import rouge_scorer

In [3]:
# Example of reference and generated summaries
reference_summary = """Your Personalized Budget Plan:
Based on the financial details provided, here is a monthly budget plan:

**Monthly Income:** $10,000  
**Monthly Expenses:** $4,000  

**Budget Allocation:**

- **Savings Goals (Buying new car):** $3,000 (30%)
- **Luxury:** $1,000 (10%)
- **Travel:** $1,000 (10%)
- **House EMI:** $1,000 (10%)
- **Remaining for regular spending:** $4,000 (40%)

**Recommendations:**

1. **Savings Goals (Buying new car):**
   - Allocate 30% of your monthly income towards your savings goal of buying a new car. This will help you reach your goal in 1 year.

2. **Luxury, Travel, and House EMI:**
   - Allocate 10% each towards luxury expenses, travel expenses, and house EMI to ensure you are saving for your goals while also enjoying some luxuries and travels.

3. **Regular Spending:**
   - Allocate the remaining 40% towards regular spending categories like groceries, utilities, entertainment, and other daily expenses.

With this budget plan, you can effectively distribute your income to cover your expenses, save for your goals, and still have room for discretionary spending. Adjust the allocations based on your priorities and financial goals to ensure a balanced approach to budgeting. """

In [4]:
generated_summary ="""Your Personalized Budget Plan:
Based on the financial details provided, here is a monthly budget plan:

Monthly Income: $10000
Monthly Expenses: $4000
Budget Allocation:

Savings Goals (Buying new car): $3000 (30%)
Luxury: $1000 (10%)
Travel: $1000 (10%)
House EMI: $1000 (10%)
Remaining for regular spending: $4000 (40%)
Recommendations:

Allocate 30% of your monthly income towards your savings goal of buying a new car. This will help you reach your goal in 1 year.
Allocate 10% each towards luxury expenses, travel expenses, and house EMI to ensure you are saving for your goals while also enjoying some luxuries and travels.
Allocate the remaining 40% towards regular spending categories like groceries, utilities, entertainment, and other daily expenses.
With this budget plan, you can effectively distribute your income to cover your expenses, save for your goals, and still have room for discretionary spending. Adjust the allocations based on your priorities and financial goals to ensure a balanced approach to budgeting. """

In [5]:
# Initialize the ROUGE scorer
scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeLsum'], use_stemmer=True)

# Calculate ROUGE scores
rouge_scores = scorer.score(reference_summary, generated_summary)

# Print the ROUGE scores
print("ROUGE-1 F1 Score:", rouge_scores['rouge1'].fmeasure)
print("ROUGE-2 F1 Score:", rouge_scores['rouge2'].fmeasure)
print("ROUGE-L F1 Score:", rouge_scores['rougeLsum'].fmeasure)

ROUGE-1 F1 Score: 0.8928571428571428
ROUGE-2 F1 Score: 0.8323353293413175
ROUGE-L F1 Score: 0.8928571428571428


ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics that evaluate machine-generated text by measuring its overlap with one or more human-written reference texts. These metrics are commonly used in natural language processing, particularly for text summarization and machine translation.

Let's break down the interpretation of the ROUGE scores provided:

1. **ROUGE-1 F1 Score:** 0.8928571428571428
   - ROUGE-1 measures overlap of unigrams (single words) between the generated text and the reference summary.
   - The F1 score is the harmonic mean of precision and recall. Here, precision is the fraction of unigrams in the generated text that also appear in the reference, and recall is the fraction of unigrams in the reference that also appear in the generated text.
   - The ROUGE-1 F1 score of approximately 0.89 indicates high unigram overlap in both directions. It is a combined score, not the literal percentage of matching unigrams.

2. **ROUGE-2 F1 Score:** 0.8323353293413175
   - ROUGE-2 measures overlap of bigrams (sequences of two adjacent words) between the generated text and the reference summary.
   - As with ROUGE-1, the F1 score balances precision and recall, this time computed over bigrams.
   - The ROUGE-2 F1 score of around 0.83 shows that most bigrams are shared. It is lower than ROUGE-1, as is typical, because matching two consecutive words is a stricter requirement than matching single words.

3. **ROUGE-L F1 Score:** 0.8928571428571428
   - ROUGE-L is based on the longest common subsequence (LCS) between the generated text and the reference summary. Matching words must appear in the same order but do not need to be adjacent, which makes it more flexible than fixed-length n-gram matching.
   - The value reported here comes from the `rougeLsum` variant requested in the scorer, which splits the text on newlines and computes the LCS at the summary level.
   - The F1 score of approximately 0.89 equals the ROUGE-1 score, suggesting that the shared words also appear in largely the same order in both texts.
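
The precision, recall, and F1 definitions above can be checked by hand on a short example. The sketch below re-implements ROUGE-N and ROUGE-L with plain whitespace tokenization on two invented sentences. It is a simplified illustration, not the `rouge_score` implementation, which applies its own tokenizer and optional stemming, so its numbers can differ from this version.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def rouge_n(reference, candidate, n=1):
    """Return (precision, recall, F1) for n-gram overlap."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # matches, clipped by count
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    return precision, recall, f1(precision, recall)

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l(reference, candidate):
    """Return (precision, recall, F1) based on the longest common subsequence."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    lcs = lcs_length(ref, cand)
    precision = lcs / max(len(cand), 1)
    recall = lcs / max(len(ref), 1)
    return precision, recall, f1(precision, recall)

# Hypothetical sentences for illustration.
reference = "allocate 30% of income to savings"
candidate = "allocate 30% of your income to savings goals"
print("ROUGE-1:", rouge_n(reference, candidate, 1))
print("ROUGE-2:", rouge_n(reference, candidate, 2))
print("ROUGE-L:", rouge_l(reference, candidate))
```

In this example every reference word appears in the candidate, so recall is 1.0, while the two extra candidate words pull precision down to 0.75. F1 sits between the two, which is why it should not be read as a simple percentage of matching words.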

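In summary, these ROUGE scores show that the generated text overlaps heavily with the reference summary in terms of unigrams, bigrams, and longest common subsequences. In this example the two texts contain the same budget plan and differ mainly in Markdown formatting and list numbering, so high scores are expected. ROUGE measures word overlap with the reference; it does not assess whether the budget advice itself is sound.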

This project demonstrates the practical application of generative AI techniques in data science, showcasing how advanced language models like GPT-3.5 can be leveraged to generate personalized content and enhance user experiences.

### License
MIT License

Copyright (c) 2024 Rutuja Kute

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.