# YouTube Link

https://youtu.be/SMxRzBQ6eHA

# Step 1: Theoretical Foundations of Generative AI

![image.png](attachment:9525bef1-5e2a-4c94-970f-aa18e07ccaa3.png)

Generative AI is a subset of artificial intelligence that produces new, synthetic data and content, which makes it directly useful in data science. This section explains why generative AI is relevant to data science and outlines its theoretical underpinnings.

**1. Why Generative AI is Interesting and Relevant in Data Science:**

**Data Augmentation:** Generative AI techniques provide a novel approach to data augmentation, which is crucial in data science. By generating synthetic data samples, these models can address issues of data scarcity and imbalance, thereby improving the robustness and generalization of machine learning models. This augmentation process enhances the diversity and completeness of datasets, leading to more accurate and reliable model predictions.

**Content Generation:** Generative AI models excel at producing human-like text, images, and other forms of content based on given prompts. In data science, this capability is invaluable for various applications such as natural language processing, text summarization, and dialogue systems. For instance, in text-based analysis, generative models can aid in content creation, automated report generation, and writing assistance tasks, thereby streamlining workflows and increasing productivity.

**Personalization:** Personalization is a key aspect of many data-driven applications, including recommendation systems, personalized marketing, and customer service. Generative AI models enable the creation of personalized content tailored to individual preferences and requirements. By understanding user input and generating contextually relevant responses, these models enhance user engagement and satisfaction, leading to more effective and targeted interactions.

**Creative Applications:** Beyond traditional data analysis tasks, generative AI fosters creativity and innovation in data science. These models empower users to explore new avenues of expression through art generation, music composition, and storytelling. By generating novel and diverse outputs, generative AI expands the possibilities of what can be achieved with AI, inspiring new forms of expression and exploration.

**2. Theoretical Foundations behind Generative AI:**

**Self-Attention Mechanism:** Generative AI models, particularly transformer-based architectures like GPT (Generative Pre-trained Transformer), leverage self-attention mechanisms to capture dependencies between different parts of the input sequence. This mechanism enables the model to weigh the importance of each input token based on its context within the sequence, facilitating effective long-range dependency modeling.
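The weighting described above can be illustrated with a minimal sketch of scaled dot-product self-attention, written in plain Python for readability. It assumes a single head, no masking, and toy 2-dimensional embeddings used directly as queries, keys, and values; a real transformer such as GPT derives these from learned projections and uses many heads. The function names are illustrative, not taken from any library.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention for one sequence (one head, no masking)."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this token's query to every token's key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        attn = softmax(scores)
        # Context vector: attention-weighted average of the value vectors.
        context = [sum(w * v[j] for w, v in zip(attn, values))
                   for j in range(len(values[0]))]
        weights.append(attn)
        outputs.append(context)
    return outputs, weights

# Toy embeddings for a 3-token sequence (assumed values, for illustration only).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x, x, x)
print(weights[0])  # how strongly token 0 attends to tokens 0, 1 and 2
```

Each row of `weights` sums to 1, so every output vector is a weighted average of the value vectors of all tokens in the sequence.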

**Language Modeling:** Generative AI models are typically trained with self-supervised objectives (often described as unsupervised) such as language modeling. A language model learns the probability distribution of sequences in a given language by predicting the next token in a sequence given the previous tokens. This process enables the model to generate coherent and contextually relevant text based on the input prompt.
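As a deliberately simplified illustration of next-token prediction, the sketch below estimates next-token probabilities from bigram counts. This is an assumption made for brevity: GPT-style models condition on the whole preceding context with a neural network rather than on a single previous token, and the toy corpus is invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram_model(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Estimate P(next token | previous token) from the counts."""
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}

# Toy corpus (invented for illustration).
corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram_model(corpus)
probs = next_token_probs(model, "the")
print(probs)
```

In this corpus "the" is followed by "cat" twice and by "mat" once, so the model assigns "cat" twice the probability of "mat".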

**Probabilistic Sampling:** Generative AI models learn probability distributions over the output space, allowing them to generate diverse and realistic samples. Sampling from these distributions produces novel outputs that resemble human-generated data. By incorporating probabilistic sampling techniques, generative AI models can produce outputs with varying degrees of creativity and diversity, catering to different application requirements and preferences.
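One common way to control how diverse the sampled outputs are is temperature scaling. The sketch below is a minimal illustration under assumed inputs: `logits` stands for the unnormalized scores a model assigns to candidate tokens, and the function name and example values are hypothetical.

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Sample an index from softmax(logits / temperature).

    Lower temperatures concentrate probability on the top-scoring option;
    higher temperatures flatten the distribution and increase diversity.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs

logits = [2.0, 1.0, 0.1]  # assumed scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    _, probs = sample_with_temperature(logits, temperature=t)
    print(t, [round(p, 3) for p in probs])
```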

In summary, generative AI offers a wealth of opportunities and advancements in data science, from enhancing data augmentation and content generation to fostering creativity and personalization. By leveraging theoretical foundations such as self-attention mechanisms, language modeling, and probabilistic sampling, generative AI models pave the way for innovative solutions and applications in the field of data science.

# Step 2: Introduction to Data Generation

Introduction:
Data generation using generative AI techniques involves creating new data samples that resemble a given dataset. This process is crucial in various fields such as image synthesis, text generation, and music composition. In the context of data science, data generation plays a vital role in tasks such as data augmentation, synthetic data creation, and anomaly detection. Generative AI techniques enable the generation of diverse and realistic data samples, which can enhance the performance and robustness of machine learning models.

Significance:
The significance of data generation using generative AI lies in its ability to address challenges related to data scarcity, diversity, and augmentation. In many real-world scenarios, obtaining large and diverse datasets for training machine learning models can be expensive, time-consuming, or even infeasible. Generative AI techniques offer a solution by synthesizing new data samples that capture the underlying patterns and characteristics of the original dataset. This enables researchers and practitioners to create larger and more diverse datasets, improving the generalization and effectiveness of machine learning models.

Principles behind Data Generation:
The chosen generative AI technique for data generation plays a crucial role in determining the effectiveness and quality of the generated data. One common technique is Generative Adversarial Networks (GANs). GANs consist of two neural networks – a generator and a discriminator – trained simultaneously through a competitive process. The generator learns to produce synthetic data samples, while the discriminator learns to distinguish between real and generated samples. Through adversarial training, the generator improves its ability to create realistic data samples that are indistinguishable from real data, while the discriminator improves its ability to accurately classify real and generated samples.

Data Generation Technique: Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs) have emerged as a powerful technique for data generation across various domains. In the context of data science, GANs are used to generate synthetic data that closely resembles real data. The key components of GANs are:

1. Generator: The generator network takes random noise as input and generates synthetic data samples. It learns to map the input noise to output data samples, gradually improving its ability to generate realistic data through training.

2. Discriminator: The discriminator network takes both real and generated data samples as input and aims to distinguish between them. It learns to classify input samples as either real or generated, providing feedback to the generator on how to improve its data generation process.

The training process of GANs involves a competitive game between the generator and the discriminator. The generator aims to produce data samples that are indistinguishable from real data, while the discriminator aims to accurately classify real and generated samples. Through adversarial training, both networks improve their performance iteratively, leading to the generation of high-quality synthetic data.
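The competitive game can be made concrete with the value function from the original GAN formulation, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximize and the generator tries to minimize. The sketch below only evaluates this quantity from assumed discriminator outputs; it does not train any network, and the probabilities are invented for illustration.

```python
import math

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(D, G).

    d_real: discriminator outputs D(x) on real samples, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that cannot tell real from generated outputs 0.5 everywhere.
confused = gan_value([0.5, 0.5], [0.5, 0.5])
# A discriminator that separates them well scores real high and generated low.
confident = gan_value([0.99, 0.98], [0.01, 0.02])
print(confused, confident)
```

A discriminator at chance level yields a value of -2 log 2, while a confident and correct discriminator pushes the value toward 0; the generator improves by moving the value back down.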

The purpose of using GANs for data generation is to create new data samples that capture the underlying distribution and characteristics of the original dataset. This enables researchers and practitioners to address data scarcity, diversity, and augmentation challenges in data science tasks, ultimately improving the performance and generalization of machine learning models.

# Step 3: Analyzing the Generated Data

Examine the data generated by the chosen technique:

Data Characteristics:
The nature and properties of the data generated by the chosen generative AI technique, such as Generative Adversarial Networks (GANs), depend on various factors including the training data, model architecture, and training parameters. Generally, the generated data exhibits characteristics similar to the real data it was trained on, but may also contain variations and nuances introduced by the generative process.

1. Similarity to Real Data: The generated data typically resembles the distribution and patterns of the real data used for training the generative AI model. This includes statistical properties, structural patterns, and semantic features.

2. Variation and Novelty: Despite resembling real data, the generated samples may exhibit variations and novel combinations not present in the original dataset. This can result in the creation of new and diverse data samples that capture different aspects of the underlying data distribution.

3. Quality and Fidelity: The quality of the generated data can vary depending on the effectiveness of the generative AI model. High-quality generated data closely matches the characteristics of real data, while lower-quality data may exhibit artifacts, distortions, or inconsistencies.

Application Areas:
The generated data can be applied in various domains and applications, leveraging its similarity to real data while also offering opportunities for exploration and innovation.

1. Data Augmentation: Generated data can be used to augment existing datasets, increasing their size and diversity. This is particularly useful in machine learning tasks where larger and more varied datasets lead to improved model performance and generalization.

2. Synthetic Data Generation: In scenarios where real data is limited, sensitive, or difficult to obtain, generated data can serve as a substitute. Synthetic data can be used for training machine learning models, conducting simulations, and performing analysis without relying on real-world data.

3. Anomaly Detection and Outlier Analysis: The generated data can be used to train anomaly detection models or to identify outliers within a dataset. Samples that deviate strongly from the generated data distribution can be flagged as anomalies, which helps surface potential issues in real-world data.

Analytical Insights:
Analyzing the generated data can provide valuable insights into the underlying data distribution, model performance, and potential applications. Some potential insights include:

1. Distribution Analysis: Comparing the distribution of generated data with real data can reveal similarities, differences, and potential biases in the generative AI model. Understanding the distribution of generated data is crucial for assessing its suitability for specific tasks and applications.

2. Novelty Detection: Identifying novel patterns, combinations, or outliers within the generated data can lead to insights into emerging trends, hidden relationships, or previously unrecognized phenomena within the dataset.

3. Model Evaluation: Analyzing the quality, diversity, and fidelity of generated data provides insights into the effectiveness of the generative AI technique. This includes assessing the model's ability to capture complex data patterns, generate realistic samples, and generalize to unseen data.
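For a single numeric feature, the distribution comparison described above can be sketched with the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical cumulative distribution functions. This is one possible measure among many, chosen here as an assumption; the sample values are invented.

```python
from bisect import bisect_right

def ks_statistic(real, generated):
    """Largest absolute difference between the two empirical CDFs.

    Returns 0.0 when the samples have identical empirical distributions
    and 1.0 when they do not overlap at all.
    """
    real_sorted = sorted(real)
    gen_sorted = sorted(generated)
    largest_gap = 0.0
    # The gap can only change at observed sample values, so checking those suffices.
    for x in real_sorted + gen_sorted:
        cdf_real = bisect_right(real_sorted, x) / len(real_sorted)
        cdf_gen = bisect_right(gen_sorted, x) / len(gen_sorted)
        largest_gap = max(largest_gap, abs(cdf_real - cdf_gen))
    return largest_gap

real = [1.0, 2.0, 3.0, 4.0]        # assumed real feature values
generated = [1.5, 2.5, 3.5, 4.5]   # assumed generated feature values
print(ks_statistic(real, generated))
```

A small statistic suggests the generated feature follows the real one closely; a value near 1 indicates the generator is producing values from a different range.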

Overall, analyzing the generated data enables researchers and practitioners to gain deeper insights into the capabilities of the generative AI technique, explore potential applications, and make informed decisions regarding its use in data science tasks and applications.

# Step 4: Engaging with Generative AI for Data Generation

Engaging with the generative AI technique involves exploring its capabilities, understanding its data generation process, and validating the quality and diversity of the generated data. Rather than only instructing the model to create data, the practitioner also uses it as a tool to guide and validate the data generation process.

1. Querying the generative AI for insights into its data generation process:
   - To understand the data generation process, queries can be made to the generative AI model regarding its architecture, training data, optimization objectives, and generation mechanisms.
   - Questions can be asked to gain insights into how the generative AI model learns to capture underlying patterns and distributions in the data, how it balances exploration and exploitation during training, and how it handles various challenges such as mode collapse or overfitting.
   - Through interactions with the generative AI model, researchers can gain a deeper understanding of its inner workings, limitations, and areas for improvement.

2. Exploring various data generation scenarios using the technique:
   - Experimentation with different input parameters, model architectures, and training strategies can help explore various data generation scenarios.
   - By adjusting parameters such as noise dimensions, learning rates, and network architectures, researchers can observe how these changes affect the diversity, quality, and realism of the generated data.
   - Exploring different data generation scenarios enables researchers to understand the robustness and flexibility of the generative AI technique across different domains and applications.

3. Validating the quality and diversity of generated data:
   - Validation of generated data involves assessing its quality, diversity, and fidelity compared to real data.
   - Quality evaluation measures such as perceptual quality, reconstruction error, or likelihood scores can be used to assess how well the generated data matches the characteristics of real data.
   - Diversity evaluation involves analyzing the variety and richness of the generated data, ensuring that it covers a wide range of data patterns and samples.
   - Validation techniques such as cross-validation, quantitative metrics, and qualitative analysis can be employed to validate the quality and diversity of the generated data.
   - Additionally, human evaluation through expert judgment or user studies can provide valuable feedback on the subjective aspects of generated data, such as relevance, coherence, and usefulness.
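For generated text, one simple quantitative diversity check is the distinct-n ratio: the number of unique n-grams divided by the total number of n-grams across a set of outputs. The sketch below assumes whitespace tokenization and uses invented example strings; it is one possible metric, not the only one.

```python
def distinct_n(texts, n=2):
    """Fraction of n-grams across all texts that are unique (0.0 to 1.0)."""
    total = 0
    unique = set()
    for text in texts:
        tokens = text.lower().split()
        # Slide a window of length n over the token list.
        ngrams = list(zip(*(tokens[i:] for i in range(n))))
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

repetitive = ["visit the park", "visit the park"]
varied = ["visit the park", "tour a museum"]
print(distinct_n(repetitive), distinct_n(varied))
```

Values close to 1 indicate that the outputs rarely repeat phrases, while low values point to repetitive generations.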

By engaging with the generative AI technique in this manner, researchers can gain insights into its capabilities, limitations, and potential applications. This iterative process of exploration, interrogation, and validation helps in refining the data generation process, improving the quality of generated data, and guiding its application in various data science tasks and domains.

# Step 5: Crafting Your Generated Data

Crafting generated data involves defining the specific data generation task, specifying the format of the generated data, providing illustrative examples, and establishing constraints to ensure the generated data meets the desired criteria. This step requires a clear understanding of the generative AI technique and its capabilities.

1. Define the specific data generation task:
   - The first step is to clearly define the objective of the data generation task. This involves identifying the target domain, characteristics of the data to be generated, and the purpose of the generated data.
   - For example, in the context of image generation, the task might be to generate realistic images of landscapes, animals, or human faces. In text generation, the task could involve creating coherent paragraphs of text or generating poetry; in audio generation, it could involve composing music.
   - The data generation task should be well-defined and aligned with the objectives of the project or application.

2. Specify the format of the generated data:
   - Once the data generation task is defined, the next step is to specify the format of the generated data. This includes determining the data structure, attributes, and any other relevant metadata.
   - For example, if the task involves generating images, the format specification may include image resolution, color depth, and file format. For text generation, it may include the length of the text, language style, and formatting guidelines.
   - Specifying the format of the generated data ensures consistency and compatibility with downstream applications or analysis.

3. Provide illustrative examples of the generated data:
   - Illustrative examples of the generated data help visualize the output of the generative AI model and demonstrate its capabilities.
   - Generated data samples can be presented in various formats such as images, text snippets, or audio clips, depending on the nature of the data generation task.
   - These examples showcase the diversity, quality, and realism of the generated data, providing insights into its potential applications and use cases.

4. Establish constraints to ensure the generated data meets the desired criteria:
   - Constraints are rules or conditions imposed on the data generation process to ensure that the generated data meets specific criteria or requirements.
   - Constraints can be related to various aspects of the data, including quality, diversity, relevance, and usability.
   - Examples of constraints include ensuring that generated images are within a certain resolution range, text is grammatically correct and coherent, or generated music adheres to specified genre or style guidelines.
   - Establishing constraints helps maintain the quality and relevance of the generated data, ensuring that it meets the needs of the intended application or analysis.
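Constraints on generated text can be checked automatically before the output is used. The sketch below is a hypothetical example: the `TextConstraints` class, its fields, and the sample itinerary are all invented for illustration and would need to be adapted to the actual task.

```python
from dataclasses import dataclass, field

@dataclass
class TextConstraints:
    """Hypothetical constraints for a generated text sample."""
    max_words: int = 500
    required_phrases: list = field(default_factory=list)

def check_constraints(text, constraints):
    """Return a list of human-readable violations (empty if the text passes)."""
    violations = []
    word_count = len(text.split())
    if word_count > constraints.max_words:
        violations.append(f"too long: {word_count} words > {constraints.max_words}")
    lowered = text.lower()
    for phrase in constraints.required_phrases:
        if phrase.lower() not in lowered:
            violations.append(f"missing required phrase: {phrase!r}")
    return violations

rules = TextConstraints(max_words=50, required_phrases=["Day 1", "Budget"])
sample = "Day 1: Arrival in Chicago. Budget: Allocate $150 for food and transport."
print(check_constraints(sample, rules))
```

An empty list means every constraint was satisfied; otherwise each entry names the rule that failed, which makes it easy to reject or regenerate the sample.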

By following these steps, researchers and practitioners can effectively craft generated data that aligns with the objectives of the data generation task, adheres to specified format requirements, provides illustrative examples of the generated data, and meets predefined constraints. This ensures that the generated data is of high quality, relevant, and suitable for use in various data science tasks and applications.

# Step 6: Demonstrating Data Generation

The code below creates a Travel Itinerary Planner using Streamlit and OpenAI's GPT-3.5 model (`gpt-3.5-turbo`) to generate personalized travel itineraries based on user input.

### 1. Setting up the Environment:
- The first cell imports the required libraries: Streamlit, the `openai` SDK, `os`, and `load_dotenv` from `python-dotenv`.
- `load_dotenv()`, called in the following cell, loads environment variables from a `.env` file.
- The OpenAI API key is then read with `os.getenv("OPEN_AI_KEY")`, which keeps the key out of the source code.



In [None]:
import streamlit as st
import openai
import os
from dotenv import load_dotenv

### 2. Defining the `generate_itinerary` Function:
- This function takes various parameters representing user input such as destination, dates, budget, interests, etc.
- It constructs a prompt string based on the input parameters.
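- The constructed prompt is sent to OpenAI's GPT-3.5 model through `openai.ChatCompletion.create()` to generate the travel itinerary. This interface belongs to versions of the `openai` Python SDK before 1.0 and was removed in 1.0, so the code requires an older SDK version.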
- The generated itinerary is returned as a response.



In [None]:
# Load environment variables
load_dotenv()

# Set your OpenAI API key
openai.api_key = os.getenv("OPEN_AI_KEY")

In [None]:
def generate_itinerary(destination, dates, budget, interests, accommodations, travel_style, transportation, group_composition, accessibility, language, weather, cultural_interests, budget_breakdown, previous_experiences, special_occasion):
    # Generate a personalized travel itinerary using GPT-3.5 (gpt-3.5-turbo)
    initial_prompt = "You are a travel itinerary planning assistant with extensive knowledge in travel itinerary planning."
    question1 = f"""Generate a personalized travel itinerary for a trip to {destination} from 
    {dates[0]} to {dates[1]}. The budget is {budget}. Interests include {', '.join(interests)}. 
    Preferred accommodations: {accommodations}. Travel style: {travel_style}. Preferred transportation: {transportation}. 
    Group composition: {group_composition}. Accessibility needs: {accessibility}. Language preference: {language}. 
    Weather preference: {weather}. Cultural interests: {cultural_interests}. Budget breakdown: {budget_breakdown}. 
    Previous travel experiences: {previous_experiences}. Special occasion: {special_occasion}."""

    response1 = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": initial_prompt},
            {"role": "user", "content": question1},
        ]
    )

    return response1['choices'][0]['message']['content']
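The function above calls `openai.ChatCompletion.create`, which exists only in `openai` SDK versions before 1.0. As a hedged sketch, the same request against the 1.x SDK could look like the following. The helper name `generate_itinerary_v1` is hypothetical, and the client is passed in as a parameter so the function can be exercised without network access.

```python
def generate_itinerary_v1(client, system_prompt, user_prompt, model="gpt-3.5-turbo"):
    """Send one system + user message pair and return the reply text.

    `client` is expected to be an `openai.OpenAI` instance (SDK 1.x).
    """
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    # In the 1.x SDK the response is an object, so attributes replace dict keys.
    return response.choices[0].message.content

# Usage with the real SDK (assumes `pip install "openai>=1.0"` and a valid key):
#   from openai import OpenAI
#   client = OpenAI(api_key=os.getenv("OPEN_AI_KEY"))
#   itinerary = generate_itinerary_v1(client, initial_prompt, question1)
```

Passing the client explicitly also makes the function easy to test with a stub object in place of the real API.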




### 3. Main Streamlit Application:
- The `main()` function sets up the Streamlit application.
- It creates a title and welcome message for the Travel Itinerary Planner.
- Input parameters such as destination, dates, budget, etc., are presented using Streamlit sidebar widgets.
- When the user clicks the "Generate Itinerary" button, the `generate_itinerary` function is called with the provided input parameters.
- The generated itinerary is then displayed in a subheader and written to the Streamlit app interface.



In [None]:
def main():
    st.title("Travel Itinerary Planner")
    st.write("Welcome to the Travel Itinerary Planner! Please fill out the following details to generate your personalized travel itinerary.")
    
    st.sidebar.title("Input Parameters")
    destination = st.sidebar.text_input("Destination:")
    dates = st.sidebar.date_input("Travel Dates:", [])
    budget = st.sidebar.number_input("Budget (in USD):", value=1000)
    interests = st.sidebar.multiselect("Interests:", ["Sightseeing", "Outdoor Activities", "Food"])
    accommodations = st.sidebar.selectbox("Preferred Accommodations:", ["Hotel", "Airbnb", "Hostel"])
    travel_style = st.sidebar.selectbox("Travel Style:", ["Adventurous", "Relaxed", "Cultural", "Luxury"])
    transportation = st.sidebar.multiselect("Preferred Transportation:", ["Walking", "Public Transit", "Rental Car", "Organized Tours"])
    group_composition = st.sidebar.selectbox("Group Composition:", ["Solo", "Family", "Friends", "Couple"])
    accessibility = st.sidebar.checkbox("Accessibility Needs")
    language = st.sidebar.selectbox("Language Preference:", ["English", "Spanish", "French", "Other"])
    weather = st.sidebar.selectbox("Weather Preference:", ["Sunny", "Mild", "Rainy", "Snowy"])
    cultural_interests = st.sidebar.multiselect("Cultural Interests:", ["History", "Art", "Music", "Festivals"])
    budget_breakdown = st.sidebar.slider("Budget Breakdown:", min_value=0, max_value=budget, step=50)
    previous_experiences = st.sidebar.text_area("Previous Travel Experiences:")
    special_occasion = st.sidebar.checkbox("Special Occasion")

    if st.sidebar.button("Generate Itinerary"):
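        # date_input returns a tuple with fewer than two dates until a full range is picked,
        # so require both a start and an end date before indexing dates[0] and dates[1].
        if destination and len(dates) == 2 and budget and interests and accommodations: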
            itinerary = generate_itinerary(destination, dates, budget, interests, accommodations, travel_style, transportation, group_composition, accessibility, language, weather, cultural_interests, budget_breakdown, previous_experiences, special_occasion)
            st.subheader("Generated Itinerary:")
            st.write(itinerary)
        else:
            st.error("Please fill out all the details.")

### 4. Running the Application:
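- The `if __name__ == "__main__":` block ensures that `main()` is executed when the script is run as the main program. Streamlit apps are normally launched from the command line with `streamlit run <script>.py` (the script name is a placeholder) rather than executed inside a notebook cell.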



In [None]:
if __name__ == "__main__":
    main()


### Summary:
This code creates a user-friendly interface for generating personalized travel itineraries. It utilizes OpenAI's GPT-3.5 model to generate detailed itineraries based on user preferences and constraints. Users can input various parameters such as destination, budget, interests, etc., and the system generates a custom itinerary accordingly. This demonstrates the practical application of Generative AI techniques in creating valuable and personalized experiences for users.

![Screenshot 2024-04-05 at 5.34.27 PM.png](attachment:97f1c274-9639-419b-92e2-46580587dce0.png)

![Screenshot 2024-04-05 at 5.35.12 PM.png](attachment:e581a744-3844-460a-832a-c738c12a8bb0.png)


![Screenshot 2024-04-05 at 5.35.23 PM.png](attachment:168b4b6b-f74e-4307-a243-a688137c7b76.png)


![Screenshot 2024-04-05 at 5.35.33 PM.png](attachment:4bf7b52d-a0a9-4e66-83b9-dfc052dbca65.png)

![Screenshot 2024-04-05 at 5.35.39 PM.png](attachment:98dc3ee7-2781-4054-8038-23188358b36f.png)

# Step 7: Evaluation and Justification

### Validation 

In [4]:
pip install rouge-score



In [5]:

# Import necessary libraries
from rouge_score import rouge_scorer



In [9]:
# Example of reference and generated summaries
reference_summary = """Sure, here's the travel itinerary in a proper format:

---

**Generated Itinerary:**

Based on your preferences, here is a personalized travel itinerary for your trip to Chicago from April 19th to April 25th, 2024:

**Day 1: Arrival in Chicago**
- Accommodation: Check into a budget-friendly hotel in the downtown area.
- Sightseeing: Explore Millennium Park, Cloud Gate (The Bean), and Grant Park.
- Food: Enjoy a deep-dish pizza at one of Chicago's famous pizzerias.
- Transportation: Pick up your rental car for convenient travel around the city.
- Accessibility: Ensure all venues are accessible.
- Weather: Enjoy the sunny weather while sightseeing.
- Budget: Allocate $150 for accommodations, food, and transportation.

**Day 2: Cultural Exploration**
- Sightseeing: Visit the Art Institute of Chicago and Museum of Contemporary Art.
- Food: Have lunch at a local food festival featuring diverse cuisines.
- Cultural Interests: Experience live music performances at local venues.
- Budget: Allocate $150 for admissions, food, and souvenirs.

**Day 3: Neighborhood Tour**
- Sightseeing: Explore the vibrant neighborhoods of Wicker Park and Bucktown.
- Food: Indulge in a food tour to sample Chicago's culinary delights.
- Cultural Interests: Attend a music festival or concert in the evening.
- Budget: Allocate $100 for neighborhood tours and food experiences.

**Day 4: Lakefront Adventure**
- Sightseeing: Stroll along the Lakefront Trail and Navy Pier.
- Food: Try Chicago-style hot dogs and enjoy lakefront dining.
- Activities: Consider a boat tour on Lake Michigan for a different perspective of the city.
- Budget: Allocate $150 for activities, food, and tours.

**Day 5: Historic Chicago**
- Sightseeing: Visit the historic sites of Chicago such as The Chicago Theatre and Historic Water Tower.
- Food: Have brunch at a local diner known for its classic American breakfast.
- Shopping: Explore the boutiques on the Magnificent Mile for souvenirs.
- Budget: Allocate $150 for historical visits, meals, and shopping.

**Day 6: Local Markets and Parks**
- Sightseeing: Visit local markets like the Green City Market or Maxwell Street Market.
- Food: Enjoy a picnic in one of Chicago's beautiful parks like Lincoln Park.
- Activities: Participate in outdoor events or workshops at the park.
- Budget: Allocate $100 for market purchases and outdoor activities.

**Day 7: Departure**
- Morning: Enjoy a leisurely breakfast at a local cafe.
- Souvenirs: Pick up any last-minute souvenirs.
- Check-out: Return the rental car and head to the airport for departure.

This itinerary offers a mix of sightseeing, food experiences, and cultural activities in Chicago while staying within your budget of $800. Feel free to adjust the schedule based on your interests and pace. Enjoy your adventurous trip to Chicago!

--- """

In [10]:
generated_summary ="""Generated Itinerary:
Based on your preferences, here is a personalized travel itinerary for your trip to Chicago from April 19th to April 25th, 2024:

Day 1: Arrival in Chicago
Accommodation: Check into a budget-friendly hotel in the downtown area.
Sightseeing: Explore Millennium Park, Cloud Gate (The Bean), and Grant Park.
Food: Enjoy a deep-dish pizza at one of Chicago's famous pizzerias.
Transportation: Pick up your rental car for convenient travel around the city.
Accessibility: Ensure all venues are accessible.
Weather: Enjoy the sunny weather while sightseeing.
Budget: Allocate $150 for accommodations, food, and transportation.
Day 2: Cultural Exploration
Sightseeing: Visit the Art Institute of Chicago and Museum of Contemporary Art.
Food: Have lunch at a local food festival featuring diverse cuisines.
Cultural Interests: Experience live music performances at local venues.
Budget: Allocate $150 for admissions, food, and souvenirs.
Day 3: Neighborhood Tour
Sightseeing: Explore the vibrant neighborhoods of Wicker Park and Bucktown.
Food: Indulge in a food tour to sample Chicago's culinary delights.
Cultural Interests: Attend a music festival or concert in the evening.
Budget: Allocate $100 for neighborhood tours and food experiences.
Day 4: Lakefront Adventure
Sightseeing: Stroll along the Lakefront Trail and Navy Pier.
Food: Try Chicago-style hot dogs and enjoy lakefront dining.
Activities: Consider a boat tour on Lake Michigan for a different perspective of the city.
Budget: Allocate $150 for activities, food, and tours.
Day 5: Historic Chicago
Sightseeing: Visit the historic sites of Chicago such as The Chicago Theatre and Historic Water Tower.
Food: Have brunch at a local diner known for its classic American breakfast.
Shopping: Explore the boutiques on the Magnificent Mile for souvenirs.
Budget: Allocate $150 for historical visits, meals, and shopping.
Day 6: Local Markets and Parks
Sightseeing: Visit local markets like the Green City Market or Maxwell Street Market.
Food: Enjoy a picnic in one of Chicago's beautiful parks like Lincoln Park.
Activities: Participate in outdoor events or workshops at the park.
Budget: Allocate $100 for market purchases and outdoor activities.
Day 7: Departure
Morning: Enjoy a leisurely breakfast at a local cafe.
Souvenirs: Pick up any last-minute souvenirs.
Check-out: Return the rental car and head to the airport for departure.
This itinerary offers a mix of sightseeing, food experiences, and cultural activities in Chicago while staying within your budget of $800. Feel free to adjust the schedule based on your interests and pace. Enjoy your adventurous trip to Chicago! """

In [11]:
# Initialize the ROUGE scorer
scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeLsum'], use_stemmer=True)

# Calculate ROUGE scores
rouge_scores = scorer.score(reference_summary, generated_summary)

# Print the ROUGE scores
print("ROUGE-1 F1 Score:", rouge_scores['rouge1'].fmeasure)
print("ROUGE-2 F1 Score:", rouge_scores['rouge2'].fmeasure)
print("ROUGE-L F1 Score:", rouge_scores['rougeLsum'].fmeasure)

ROUGE-1 F1 Score: 0.9878934624697336
ROUGE-2 F1 Score: 0.9878640776699029
ROUGE-L F1 Score: 0.9878934624697336


ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics used to evaluate the quality of automatic summarization and machine translation outputs by comparing them to reference summaries or translations. The ROUGE scores indicate how well the generated summary or translation captures the content and meaning of the reference text.
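To show what the unigram variant measures, here is a minimal from-scratch sketch of ROUGE-1 F1. It is a simplification under stated assumptions: tokens are lowercased alphanumeric runs and no stemming is applied, whereas the `rouge_score` scorer used in this notebook is configured with `use_stemmer=True`, so its values can differ slightly. The example sentences are invented.

```python
import re
from collections import Counter

def rouge1_f1(reference, candidate):
    """F1 of clipped unigram overlap between a reference and a candidate text."""
    ref_tokens = re.findall(r"[a-z0-9]+", reference.lower())
    cand_tokens = re.findall(r"[a-z0-9]+", candidate.lower())
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Counter intersection clips each word's count to the smaller of the two texts.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat on the mat", "the cat sat"))
```

Here all three candidate words appear in the reference (precision 1.0), but only three of the six reference words are covered (recall 0.5), which gives an F1 of about 0.667.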

The ROUGE scores obtained above can be interpreted as follows:

1. **ROUGE-1 F1 Score: 0.9878934624697336**
   - This is the F1 score for unigram (single-word) overlap between the generated text and the reference text.
   - A value of about 0.988 means that both precision and recall are high: nearly every word of the reference appears in the generated text, and the generated text contains few words that are absent from the reference.

2. **ROUGE-2 F1 Score: 0.9878640776699029**
   - This is the F1 score for bigram (two-word sequence) overlap between the generated text and the reference text.
   - A value of about 0.988 means that nearly all two-word sequences of the reference also appear in the generated text, and vice versa.

3. **ROUGE-L F1 Score: 0.9878934624697336**
   - This is the F1 score based on the longest common subsequence (LCS) between the generated text and the reference text. The code computes the `rougeLsum` variant, which splits each text into lines and combines the line-level LCS matches.
   - A value of about 0.988 means that the generated text reproduces most of the reference in the same order, matching its overall structure and content.

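Overall, these high ROUGE scores show that the generated text closely matches the reference in unigrams, bigrams, and word order. Note, however, that the reference used here is the same itinerary with Markdown formatting and a one-line preamble added, so the scores mainly confirm that the two strings are nearly identical. ROUGE measures lexical overlap only; it does not by itself establish that the itinerary is accurate, feasible, or well matched to the user's preferences, which would require an independently written reference or human evaluation.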


This project demonstrates the practical application of generative AI techniques in data science, showcasing how advanced language models like GPT-3.5 can be leveraged to generate personalized content and enhance user experiences.

### License
MIT License

Copyright (c) 2024 Rutuja Kute

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.