# LLM Evaluation - ROUGE

## Libraries

In [1]:
import nbformat
from nbformat import NO_CONVERT
import io
import os
import sys

# Rouge
from rouge_score import rouge_scorer
import pandas as pd

## Import Notebook Content

In [2]:
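def run_notebook(path):
    """Execute every code cell of the notebook at `path` in this notebook's globals."""
    with open(path, encoding='utf-8') as f:
        notebook_content = nbformat.read(f, as_version=NO_CONVERT)
    # Join the source of all code cells into a single script.
    code_cells = [cell['source'] for cell in notebook_content.cells if cell.cell_type == 'code']
    code = '\n'.join(code_cells)

    # Suppress stdout and stderr while the notebook's code runs.
    old_stdout = sys.stdout
    old_stderr = sys.stderr
    sys.stdout = io.StringIO()
    sys.stderr = io.StringIO()

    try:
        # Names defined by the executed notebook (e.g. question1_FLASH) become globals here.
        exec(code, globals())
    finally:
        # Always restore the original streams, even if the notebook raises.
        sys.stdout = old_stdout
        sys.stderr = old_stderr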
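
The manual swapping of `sys.stdout` and `sys.stderr` above can also be written with the standard library's `contextlib.redirect_stdout` and `contextlib.redirect_stderr`. The following is a minimal sketch of that alternative, not the notebook's own code; `run_quietly` and its `namespace` parameter are hypothetical names introduced only for illustration.

```python
import contextlib
import io


def run_quietly(code, namespace):
    """Execute `code` in `namespace` and capture anything it prints.

    Hypothetical helper, not part of the notebook. Returns the captured
    (stdout, stderr) text so it can still be inspected when debugging.
    """
    out, err = io.StringIO(), io.StringIO()
    # Both context managers restore the original streams on exit,
    # even if exec() raises.
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        exec(code, namespace)
    return out.getvalue(), err.getvalue()


namespace = {}
captured_out, captured_err = run_quietly("print('hidden')\nanswer = 6 * 7", namespace)
print(namespace["answer"])  # the variable survives, the print output was captured
```

Passing an explicit dictionary instead of `globals()` keeps the executed notebook's variables separate from the evaluation notebook's own names, at the cost of having to look them up in that dictionary.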

## ROUGE Scorer

In [3]:
scorer = rouge_scorer.RougeScorer(
    ['rouge1', 'rouge2', 'rougeL'], 
    use_stemmer=True,
)

## Gemini 1.5 Flash

In [4]:
# Gemini 1.5 Flash
notebook_path = '/home/jupyter/2. Capstone 2/2. New Features for Testing/Model Evaluation/CC_Final_Agent_Flash.ipynb'
run_notebook(notebook_path)

Enter new database host (or press Enter to keep current):  
Enter new database user (or press Enter to keep current):  
Enter new database password (input hidden, or press Enter to keep current):  ········
Enter new database name (or press Enter to keep current):  




#### Question 1

In [5]:
print("Question 1:\n" + question1_FLASH)

Question 1:
Do I have anything due on July 21st?


In [6]:
print("Ed's Response for Question 1:")
print(response1_FLASH)

Ed's Response for Question 1:
On July 21st, the student has a **Statistical Analysis Quiz 2** due at 23:59:00.


In [7]:
reference_answer_q1 = "On July 21st, you have Quiz 2 in Statistical Analysis due at 11:59."
print("Reference Answer for Question 1:\n" + reference_answer_q1) 

Reference Answer for Question 1:
On July 21st, you have Quiz 2 in Statistical Analysis due at 11:59.


In [8]:
# RougeScorer.score expects (target, prediction), so the reference answer comes first.
scores = scorer.score(reference_answer_q1, response1_FLASH)
pd.DataFrame.from_dict(scores, orient="index")

metric,precision,recall,fmeasure
rouge1,0.625,0.714286,0.666667
rouge2,0.333333,0.384615,0.357143
rougeL,0.5,0.571429,0.533333
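
The ROUGE-1 F-measure of 0.666667 in the table above can be reproduced by hand. The sketch below is a simplified re-implementation for illustration, not the `rouge_score` code: it assumes a plain lowercase alphanumeric tokenizer and applies no stemming, so it will not always agree with the library. It treats the model response as the prediction, so precision is the overlap divided by the response length and recall is the overlap divided by the reference length.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphanumeric runs only; no stemming is applied here.
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1(reference, prediction):
    ref_counts = Counter(tokenize(reference))
    pred_counts = Counter(tokenize(prediction))
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & pred_counts).values())
    precision = overlap / max(sum(pred_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f


reference = "On July 21st, you have Quiz 2 in Statistical Analysis due at 11:59."
prediction = "On July 21st, the student has a **Statistical Analysis Quiz 2** due at 23:59:00."
precision, recall, f = rouge1(reference, prediction)
print(f"precision={precision:.6f} recall={recall:.6f} fmeasure={f:.6f}")
```

Ten unigrams are shared between the 16-token response and the 14-token reference, which gives an F-measure of 20/30.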


#### Question 2

In [9]:
print("Question 2:\n" + question2_FLASH)

Question 2:
What is the course description for Marketing Analytics?


In [10]:
print("Ed's Response for Question 2:")
print(response2_FLASH)

Ed's Response for Question 2:
## Marketing Analytics Syllabus

This syllabus outlines the course "Marketing Analytics" (MKTG 450), taught by [Instructor Name]. 

**Course Description:**

This course introduces the principles and practices of marketing analytics. Students will learn how to collect, analyze, and interpret data to make informed marketing decisions. The course covers various analytical techniques and tools used to measure marketing performance, understand customer behavior, and optimize marketing strategies.

**Learning Objectives:**

By the end of this course, students will be able to:

* Understand the role of analytics in marketing decision-making.
* Collect and process marketing data from various sources.
* Apply statistical and machine learning techniques to analyze marketing data.
* Use data visualization tools to communicate marketing insights effectively.
* Develop data-driven marketing strategies to improve business outcomes.

**Required Materials:**

* Textbook: 

In [11]:
reference_answer_q2 = "This course provides an introduction to the principles and practices of marketing analytics. Students will learn how to collect, analyze, and interpret data to make informed marketing decisions.  The course covers various analytical techniques and tools used to measure marketing performance, understand customer behavior, and optimize marketing strategies."
print("Reference Answer for Question 2:\n" + reference_answer_q2) 

Reference Answer for Question 2:
This course provides an introduction to the principles and practices of marketing analytics. Students will learn how to collect, analyze, and interpret data to make informed marketing decisions.  The course covers various analytical techniques and tools used to measure marketing performance, understand customer behavior, and optimize marketing strategies.


In [12]:
# RougeScorer.score expects (target, prediction), so the reference answer comes first.
scores = scorer.score(reference_answer_q2, response2_FLASH)
pd.DataFrame.from_dict(scores, orient="index")

metric,precision,recall,fmeasure
rouge1,0.112195,0.958333,0.200873
rouge2,0.107579,0.93617,0.192982
rougeL,0.107317,0.916667,0.19214


#### Question 3

In [13]:
print("Question 3:\n" + question3_FLASH)

Question 3:
What assignments are due on July 25th and what do I have to do for them?


In [14]:
print("Ed's Response for Question 3:")
print(response3_FLASH)

Ed's Response for Question 3:
The only assignment due on July 25th is **Marketing Analytics Assignment 4**. 


## Marketing Analytics Assignment 4 Instructions:

This assignment requires you to design and execute a comprehensive marketing analytics project. Here's a breakdown of the steps:

**1. Choose a Marketing Problem:**

* Select a marketing problem or question that interests you. This could be related to customer behavior, market trends, campaign effectiveness, etc.

**2. Collect and Prepare Data:**

* Gather relevant data from available sources (company databases, online datasets, surveys).
* Ensure your data is clean, complete, and properly formatted.

**3. Conduct Analyses:**

* **Data Cleaning and Preparation:** Ensure the data is clean, complete, and formatted correctly.
* **Exploratory Data Analysis (EDA):** Explore the data to understand its structure and identify key patterns.
* **Hypothesis Testing:** Formulate and test hypotheses related to your marketing problem.
* **P

In [15]:
reference_answer_q3 = """
On July 25th, you have Assignment 4 in Marketing Analytics due at 11:59. For this assignment you have to:

1. Choose a marketing problem or question that interests you. This could be related to customer behavior, market trends, campaign effectiveness, etc.    

2. Collect relevant data from available sources (e.g., company databases, online datasets, surveys). Ensure that your data is clean and properly formatted.  

3. Use appropriate software tools (e.g., R, Python, Tableau) to conduct the following analyses:   
• a. Data Cleaning and Preparation: Ensure the data is clean, complete, and formatted correctly.  
• b. Exploratory Data Analysis (EDA): Explore the data to understand its structure and identify key patterns.  
• c. Hypothesis Testing: Formulate and test hypotheses related to your marketing problem.    
• d. Predictive Modeling: Build models to predict future outcomes related to your marketing problem.      
• e. Optimization Analysis: Identify the optimal marketing strategies based on your analysis.   

4. Create visualizations to support your analysis and findings.

5. Write a comprehensive report that includes the following sections:   
• a. Introduction 
• b. Data Collection and Preparation      
• c. Exploratory Data Analysis      
• d. Hypothesis Testing 
• e. Predictive Modeling      
• f. Optimization Analysis    
• g. Strategic Recommendations      
• h. Conclusion   

6. Prepare a presentation that summarizes your project, key findings, and recommendations. Your presentation should be clear, concise, and supported by visuals.    

7. Submit your report, presentation, and any supporting files (e.g., code, data, visualizations) through the course's online submission system.
"""
print("Reference Answer for Question 3:\n" + reference_answer_q3)

Reference Answer for Question 3:

On July 25th, you have Assignment 4 in Marketing Analytics due at 11:59. For this assignment you have to:

1. Choose a marketing problem or question that interests you. This could be related to customer behavior, market trends, campaign effectiveness, etc.    

2. Collect relevant data from available sources (e.g., company databases, online datasets, surveys). Ensure that your data is clean and properly formatted.  

3. Use appropriate software tools (e.g., R, Python, Tableau) to conduct the following analyses:   
• a. Data Cleaning and Preparation: Ensure the data is clean, complete, and formatted correctly.  
• b. Exploratory Data Analysis (EDA): Explore the data to understand its structure and identify key patterns.  
• c. Hypothesis Testing: Formulate and test hypotheses related to your marketing problem.    
• d. Predictive Modeling: Build models to predict future outcomes related to your marketing problem.      
• e. Optimization Analysis: Identi

In [16]:
# RougeScorer.score expects (target, prediction), so the reference answer comes first.
scores = scorer.score(reference_answer_q3, response3_FLASH)
pd.DataFrame.from_dict(scores, orient="index")

metric,precision,recall,fmeasure
rouge1,0.526455,0.846809,0.649266
rouge2,0.411141,0.662393,0.507365
rougeL,0.486772,0.782979,0.600326


## Gemini 1.5 Pro

In [17]:
# Gemini 1.5 Pro
notebook_path = '/home/jupyter/2. Capstone 2/2. New Features for Testing/Model Evaluation/CC_Final_Agent_Pro.ipynb'
run_notebook(notebook_path)

Enter new database host (or press Enter to keep current):  
Enter new database user (or press Enter to keep current):  
Enter new database password (input hidden, or press Enter to keep current):  ········
Enter new database name (or press Enter to keep current):  


#### Question 1

In [18]:
print("Question 1:\n" + question1_PRO)

Question 1:
Do I have anything due on July 21st?


In [19]:
print("Ed's Response for Question 1:")
print(response1_PRO)

Ed's Response for Question 1:
The student has the Statistical Analysis Quiz 2 due on July 21st.


In [20]:
reference_answer_q1 = "On July 21st, you have Quiz 2 in Statistical Analysis due at 11:59."
print("Reference Answer for Question 1:\n" + reference_answer_q1) 

Reference Answer for Question 1:
On July 21st, you have Quiz 2 in Statistical Analysis due at 11:59.


In [21]:
# RougeScorer.score expects (target, prediction), so the reference answer comes first.
scores = scorer.score(reference_answer_q1, response1_PRO)
pd.DataFrame.from_dict(scores, orient="index")

metric,precision,recall,fmeasure
rouge1,0.666667,0.571429,0.615385
rouge2,0.363636,0.307692,0.333333
rougeL,0.25,0.214286,0.230769
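
The ROUGE-L F-measure of 0.230769 above is much lower than the ROUGE-1 value because ROUGE-L rewards only words that appear in the same order in both texts: the response puts the date at the end, the reference puts it at the start. The sketch below illustrates this with a longest-common-subsequence (LCS) computation. It is a simplified illustration, not the `rouge_score` implementation, and it assumes a plain lowercase alphanumeric tokenizer without stemming.

```python
import re


def tokenize(text):
    # Lowercase and keep alphanumeric runs only; no stemming is applied here.
    return re.findall(r"[a-z0-9]+", text.lower())


def lcs_length(a, b):
    # Classic dynamic programme over two token lists, keeping only the previous row.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, prediction):
    ref, pred = tokenize(reference), tokenize(prediction)
    lcs = lcs_length(ref, pred)
    precision = lcs / len(pred) if pred else 0.0
    recall = lcs / len(ref) if ref else 0.0
    f = 2 * precision * recall / (precision + recall) if lcs else 0.0
    return lcs, precision, recall, f


reference = "On July 21st, you have Quiz 2 in Statistical Analysis due at 11:59."
prediction = "The student has the Statistical Analysis Quiz 2 due on July 21st."
lcs, precision, recall, f = rouge_l(reference, prediction)
print(f"lcs={lcs} fmeasure={f:.6f}")
```

Eight unigrams overlap, but the longest in-order match is only three tokens long (for example "quiz 2 due"), so the F-measure drops to 6/26.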


#### Question 2

In [22]:
print("Question 2:\n" + question2_PRO)

Question 2:
What is the course description for Marketing Analytics?


In [23]:
print("Ed's Response for Question 2:")
print(response2_PRO)

Ed's Response for Question 2:
This course provides an introduction to the principles and practices of marketing analytics. Students will learn how to collect, analyze, and interpret data to make informed marketing decisions. The course covers various analytical techniques and tools used to measure marketing performance, understand customer behavior, and optimize marketing strategies.


In [24]:
reference_answer_q2 = "This course provides an introduction to the principles and practices of marketing analytics. Students will learn how to collect, analyze, and interpret data to make informed marketing decisions.  The course covers various analytical techniques and tools used to measure marketing performance, understand customer behavior, and optimize marketing strategies."
print("Reference Answer for Question 2:\n" + reference_answer_q2) 

Reference Answer for Question 2:
This course provides an introduction to the principles and practices of marketing analytics. Students will learn how to collect, analyze, and interpret data to make informed marketing decisions.  The course covers various analytical techniques and tools used to measure marketing performance, understand customer behavior, and optimize marketing strategies.


In [25]:
# RougeScorer.score expects (target, prediction), so the reference answer comes first.
scores = scorer.score(reference_answer_q2, response2_PRO)
pd.DataFrame.from_dict(scores, orient="index")

metric,precision,recall,fmeasure
rouge1,1.0,1.0,1.0
rouge2,1.0,1.0,1.0
rougeL,1.0,1.0,1.0


#### Question 3

In [26]:
print("Question 3:\n" + question3_PRO)

Question 3:
What assignments are due on July 25th and what do I have to do for them?


In [27]:
print("Ed's Response for Question 3:")
print(response3_PRO)

Ed's Response for Question 3:
The student has **Marketing Analytics Assignment 4** due on July 25th. 


Here are the instructions for the Marketing Analytics Assignment 4, broken down step-by-step:

**Project Goal:** Design and execute a complete marketing analytics project.

**Steps:**

1. **Choose a Marketing Problem:** Select a marketing-related problem or question you find interesting. Examples: 
    * Understanding customer behavior 
    * Analyzing market trends
    * Evaluating campaign effectiveness

2. **Data Collection and Preparation:**
    * Gather relevant data from sources like company databases, online datasets, or surveys.
    * Clean and format the data to ensure accuracy and usability.

3. **Data Analysis:** Utilize software like R, Python, or Tableau to perform these analyses:
    * **Data Cleaning and Preparation:** Double-check data for completeness and correct formatting.
    * **Exploratory Data Analysis (EDA):** Uncover patterns and trends in your data.
    * **

In [28]:
reference_answer_q3 = """
On July 25th, you have Assignment 4 in Marketing Analytics due at 11:59. For this assignment you have to:

1. Choose a marketing problem or question that interests you. This could be related to customer behavior, market trends, campaign effectiveness, etc.    

2. Collect relevant data from available sources (e.g., company databases, online datasets, surveys). Ensure that your data is clean and properly formatted.  

3. Use appropriate software tools (e.g., R, Python, Tableau) to conduct the following analyses:   
• a. Data Cleaning and Preparation: Ensure the data is clean, complete, and formatted correctly.  
• b. Exploratory Data Analysis (EDA): Explore the data to understand its structure and identify key patterns.  
• c. Hypothesis Testing: Formulate and test hypotheses related to your marketing problem.    
• d. Predictive Modeling: Build models to predict future outcomes related to your marketing problem.      
• e. Optimization Analysis: Identify the optimal marketing strategies based on your analysis.   

4. Create visualizations to support your analysis and findings.

5. Write a comprehensive report that includes the following sections:   
• a. Introduction 
• b. Data Collection and Preparation      
• c. Exploratory Data Analysis      
• d. Hypothesis Testing 
• e. Predictive Modeling      
• f. Optimization Analysis    
• g. Strategic Recommendations      
• h. Conclusion   

6. Prepare a presentation that summarizes your project, key findings, and recommendations. Your presentation should be clear, concise, and supported by visuals.    

7. Submit your report, presentation, and any supporting files (e.g., code, data, visualizations) through the course's online submission system.
"""
print("Reference Answer for Question 3:\n" + reference_answer_q3)

Reference Answer for Question 3:

On July 25th, you have Assignment 4 in Marketing Analytics due at 11:59. For this assignment you have to:

1. Choose a marketing problem or question that interests you. This could be related to customer behavior, market trends, campaign effectiveness, etc.    

2. Collect relevant data from available sources (e.g., company databases, online datasets, surveys). Ensure that your data is clean and properly formatted.  

3. Use appropriate software tools (e.g., R, Python, Tableau) to conduct the following analyses:   
• a. Data Cleaning and Preparation: Ensure the data is clean, complete, and formatted correctly.  
• b. Exploratory Data Analysis (EDA): Explore the data to understand its structure and identify key patterns.  
• c. Hypothesis Testing: Formulate and test hypotheses related to your marketing problem.    
• d. Predictive Modeling: Build models to predict future outcomes related to your marketing problem.      
• e. Optimization Analysis: Identi

In [29]:
# RougeScorer.score expects (target, prediction), so the reference answer comes first.
scores = scorer.score(reference_answer_q3, response3_PRO)
pd.DataFrame.from_dict(scores, orient="index")

metric,precision,recall,fmeasure
rouge1,0.449275,0.791489,0.57319
rouge2,0.215496,0.380342,0.275116
rougeL,0.34058,0.6,0.434515
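
To compare the two models at a glance, the F-measures from the six tables above can be averaged per model and metric. The sketch below copies those F-measure values by hand; the unweighted mean over three questions is an assumption made here for illustration and is only a rough summary, not a conclusion drawn by the notebook.

```python
# F-measures copied from the tables above, in question order (Q1, Q2, Q3).
f_measures = {
    "Gemini 1.5 Flash": {
        "rouge1": [0.666667, 0.200873, 0.649266],
        "rouge2": [0.357143, 0.192982, 0.507365],
        "rougeL": [0.533333, 0.19214, 0.600326],
    },
    "Gemini 1.5 Pro": {
        "rouge1": [0.615385, 1.0, 0.57319],
        "rouge2": [0.333333, 1.0, 0.275116],
        "rougeL": [0.230769, 1.0, 0.434515],
    },
}

# Unweighted mean over the three questions for each model and metric.
mean_f = {
    model: {metric: sum(values) / len(values) for metric, values in metrics.items()}
    for model, metrics in f_measures.items()
}

for model, metrics in mean_f.items():
    summary = ", ".join(f"{metric}={value:.3f}" for metric, value in metrics.items())
    print(f"{model}: {summary}")
```

The Pro averages are lifted by the exact match on Question 2, so per-question scores remain the more informative view with a sample this small.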
