# **Using AI with Gemini in Colab: Visualising Student Performance and Economic Indicators (OECD PISA Scores)**

This notebook analyzes the relationship between student performance (measured by PISA test scores) and various economic indicators, using visualizations generated by prompting Gemini in Colab.

## Datasets
1. **PISA Dataset**: Contains information on PISA test scores across various countries, years, and subjects.
   - Columns: **LOCATION, TIME, SUBJECT, GENDER, Value**

2. **Economics Dataset**: Includes various economic indicators for the same countries and years.
   - Columns: **country, time, sex, expenditure_on _education_pct_gdp, mortality_rate_infant, gini_index, gdp_per_capita_ppp, inflation_consumer_prices, intentional_homicides, unemployment, gross_fixed_capital_formation, population_density, suicide_mortality_rate, tax_revenue, taxes_on_income_profits_capital, alcohol_consumption_per_capita, government_health_expenditure_pct_gdp, urban_population_pct_total, rating**

### Reading the Datasets
Prompt: "Read the PISA and Economics datasets from the provided URLs."

```python
import pandas as pd

# Read the datasets from the provided URLs
pisa_url = 'https://raw.githubusercontent.com/eduhubai/YouTube-Gemini-Colab-OECD-PISA-Data-Analysis/main/OECD_PISA_data.csv'
economics_url = 'https://raw.githubusercontent.com/eduhubai/YouTube-Gemini-Colab-OECD-PISA-Data-Analysis/main/economics_and_education_dataset_CSV.csv'

pisa = pd.read_csv(pisa_url)
economics = pd.read_csv(economics_url)

# Display the first few rows of the PISA dataset
pisa.head()
```


### Renaming Columns to Avoid Conflicts
Prompt: "Next, we will rename columns in the PISA dataset: 'LOCATION' to 'COUNTRY', 'TIME' to 'YEAR', and 'Value' to 'Test_Score'. In the Economics dataset: 'country' to 'COUNTRY', 'time' to 'YEAR', 'sex' to 'GENDER', and 'Value' to 'Economics_Value'."
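
A minimal sketch of the renaming step, assuming the two frames are the `pisa` and `economics` objects loaded earlier. `rename_columns` is a hypothetical helper name, not something Gemini necessarily produces. The Economics column list above does not include `Value`, so that mapping is kept only for fidelity to the prompt; pandas skips rename keys that are absent.

```python
import pandas as pd

# Column mappings copied from the prompt above.
PISA_RENAMES = {"LOCATION": "COUNTRY", "TIME": "YEAR", "Value": "Test_Score"}
ECONOMICS_RENAMES = {
    "country": "COUNTRY",
    "time": "YEAR",
    "sex": "GENDER",
    "Value": "Economics_Value",  # not in the listed columns; skipped if absent
}


def rename_columns(pisa: pd.DataFrame, economics: pd.DataFrame):
    """Return renamed copies of both frames (hypothetical helper)."""
    return (
        pisa.rename(columns=PISA_RENAMES),
        economics.rename(columns=ECONOMICS_RENAMES),
    )
```

Typical use would be `pisa, economics = rename_columns(pisa, economics)`.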

### Merging the Datasets
Prompt: "Now, we'll merge the two datasets on the common columns: 'COUNTRY', 'YEAR', and 'GENDER'. This will combine the test scores with the economic indicators for the same country, year, and gender."
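
The merge could look roughly like the sketch below. It assumes both frames already carry the renamed key columns; `merge_datasets` is a hypothetical name, and the inner join is an assumption, since the prompt does not say which join type to use.

```python
import pandas as pd

MERGE_KEYS = ["COUNTRY", "YEAR", "GENDER"]


def merge_datasets(pisa: pd.DataFrame, economics: pd.DataFrame, how: str = "inner") -> pd.DataFrame:
    """Join test scores with economic indicators on the shared keys.

    how="inner" (an assumption) keeps only country/year/gender
    combinations present in both frames.
    """
    return pisa.merge(economics, on=MERGE_KEYS, how=how)
```

If the result has far fewer rows than expected, the key columns may use different codes or dtypes in the two files, which is worth checking before plotting.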

### Checking Data Types
Prompt: "Let's check the data types of the merged dataset to ensure that all columns are in the correct format. This will help us identify if any changes are needed."
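
One way to inspect the result is a small per-column summary. This is a sketch only: `summarize_dtypes` is a hypothetical helper, and plain `df.dtypes` or `df.info()` would serve the same purpose.

```python
import pandas as pd


def summarize_dtypes(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype and missing-value count."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "missing": df.isna().sum(),
        }
    )
```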

### Changing Data Types and Handling Missing Values
Prompt: "To ensure our data is consistent, we'll change data types if necessary and handle any missing values in the 'Test_Score' column."
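
The prompt leaves the handling strategy open. The sketch below assumes that `YEAR` and `Test_Score` should be numeric and that rows without a test score are dropped rather than imputed; both choices are assumptions, and `clean_merged` is a hypothetical name.

```python
import pandas as pd


def clean_merged(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce key columns to numeric and drop rows lacking a test score."""
    out = df.copy()
    # Values that cannot be parsed become NaN instead of raising.
    out["YEAR"] = pd.to_numeric(out["YEAR"], errors="coerce")
    out["Test_Score"] = pd.to_numeric(out["Test_Score"], errors="coerce")
    # Assumption: drop missing scores instead of filling them in.
    return out.dropna(subset=["Test_Score"]).reset_index(drop=True)
```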

### Visualizations
#### 1. Distribution of Test Scores
Prompt: "Let's move on to our analysis with various visualizations. We'll start with a distribution of test scores."

#### 2. Average Test Scores by Country
Prompt: "Next, we'll compare the average test scores by country."
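
A sketch of the comparison as a sorted bar chart, assuming `COUNTRY` and `Test_Score` columns. Sorting from highest to lowest mean is a presentation choice, not part of the prompt.

```python
import matplotlib.pyplot as plt


def plot_mean_score_by_country(df):
    """Plot the mean test score per country, highest first."""
    means = (
        df.groupby("COUNTRY")["Test_Score"]
        .mean()
        .sort_values(ascending=False)
    )
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.bar(means.index, means.values)
    ax.set_title("Average Test Score by Country")
    ax.set_xlabel("Country")
    ax.set_ylabel("Average test score")
    ax.tick_params(axis="x", rotation=90)  # country codes overlap otherwise
    return means, ax
```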

#### 3. Test Scores by Gender
Prompt: "Let's compare the distribution of test scores between genders."
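
A box plot is one way to compare the two distributions side by side. The sketch assumes `GENDER` and `Test_Score` columns and makes no assumption about how gender is coded in the data.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def plot_scores_by_gender(df):
    """Draw one box of test scores per gender category."""
    fig, ax = plt.subplots(figsize=(7, 5))
    sns.boxplot(data=df, x="GENDER", y="Test_Score", ax=ax)
    ax.set_title("Test Scores by Gender")
    ax.set_xlabel("Gender")
    ax.set_ylabel("Test score")
    return ax
```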

#### 4. GDP per Capita vs. Test Scores
Prompt: "Visualize the relationship between GDP per capita and test scores. Add country as hue."
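
A possible scatter plot for this prompt, assuming the merged frame has `gdp_per_capita_ppp`, `Test_Score`, and `COUNTRY` columns. Moving the legend outside the axes is an added convenience, since one colour per country produces a long legend.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def plot_gdp_vs_score(df):
    """Scatter GDP per capita against test score, coloured by country."""
    fig, ax = plt.subplots(figsize=(10, 6))
    sns.scatterplot(
        data=df, x="gdp_per_capita_ppp", y="Test_Score", hue="COUNTRY", ax=ax
    )
    ax.set_title("GDP per Capita vs. Test Scores")
    ax.set_xlabel("GDP per capita (PPP)")
    ax.set_ylabel("Test score")
    # Place the legend to the right of the plot area.
    sns.move_legend(ax, "upper left", bbox_to_anchor=(1.02, 1), title="Country")
    return ax
```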

#### 5. Education Expenditure vs. Test Scores
Prompt: "We'll also explore the relationship between education expenditure (as a percentage of GDP) and test scores."
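
The sketch below uses a scatter plot with a fitted linear trend line, which goes slightly beyond the prompt and is an assumption about what is useful here. The column name is spelled exactly as in the column list above, including the space.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Spelled as in the dataset's column list, including the space.
EDU_COL = "expenditure_on _education_pct_gdp"


def plot_education_spend_vs_score(df):
    """Scatter education spending against test score with a linear fit."""
    fig, ax = plt.subplots(figsize=(9, 6))
    sns.regplot(data=df, x=EDU_COL, y="Test_Score", scatter_kws={"alpha": 0.5}, ax=ax)
    ax.set_title("Education Expenditure vs. Test Scores")
    ax.set_xlabel("Education expenditure (% of GDP)")
    ax.set_ylabel("Test score")
    return ax
```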

#### 6. Pair Plot for Selected Economic Indicators and Test Scores
Prompt: "Next, we'll create a pair plot for selected economic indicators and test scores using the columns ['Test_Score', 'gdp_per_capita_ppp', 'expenditure_on _education_pct_gdp', 'unemployment', 'inflation_consumer_prices']."

#### 7. Heatmap of Correlations
Prompt: "We'll generate a heatmap to show the correlations between the following columns: ['Test_Score', 'gdp_per_capita_ppp', 'expenditure_on _education_pct_gdp', 'unemployment', 'inflation_consumer_prices']."
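
The heatmap could be sketched as follows, using pandas' default Pearson correlation. Fixing the colour scale to the range -1 to 1 is a presentation choice so that zero correlation sits at the midpoint of the colour map.

```python
import matplotlib.pyplot as plt
import seaborn as sns

CORR_COLS = [
    "Test_Score",
    "gdp_per_capita_ppp",
    "expenditure_on _education_pct_gdp",  # spelled as in the column list
    "unemployment",
    "inflation_consumer_prices",
]


def plot_correlation_heatmap(df):
    """Compute pairwise Pearson correlations and draw them as a heatmap."""
    corr = df[CORR_COLS].corr()
    fig, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
    ax.set_title("Correlation Heatmap")
    return corr, ax
```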

#### 8. Test Scores by Subject
Prompt: "Let's compare the distribution of test scores across different subjects."
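
A violin plot is used in this sketch to show the shape of each subject's distribution; a box plot would work equally well. It assumes `SUBJECT` and `Test_Score` columns and does not depend on the specific subject codes in the data.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def plot_scores_by_subject(df):
    """Draw one violin of test scores per subject."""
    fig, ax = plt.subplots(figsize=(8, 5))
    sns.violinplot(data=df, x="SUBJECT", y="Test_Score", ax=ax)
    ax.set_title("Test Scores by Subject")
    ax.set_xlabel("Subject")
    ax.set_ylabel("Test score")
    return ax
```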

#### 9. Trend Analysis Over Years
Prompt: "Create a line plot that puts 'YEAR' on the x-axis and 'Test_Score' on the y-axis, and uses 'COUNTRY' to differentiate the lines by country. Include markers for data points, titles for the plot and axes, and a legend positioned outside the plot area."

#### 10. Boxplots for Economic Indicators
Prompt: "Create boxplots for a list of economic indicators across different countries. First, define a list of economic indicators to visualize. Then, for each indicator in the list, generate a boxplot with 'COUNTRY' on the x-axis and the respective economic indicator on the y-axis. Set appropriate figsize, titles and labels for each plot. The title should include the indicator."
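
The loop described in the prompt might be sketched like this. The three indicators in the default list are an illustrative selection from the Economics columns, not the list the notebook actually used.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Illustrative selection; any numeric Economics column could be listed here.
INDICATORS = ["gdp_per_capita_ppp", "unemployment", "inflation_consumer_prices"]


def plot_indicator_boxplots(df, indicators=INDICATORS):
    """Draw one figure per indicator with a box per country."""
    axes = []
    for indicator in indicators:
        fig, ax = plt.subplots(figsize=(14, 6))
        sns.boxplot(data=df, x="COUNTRY", y=indicator, ax=ax)
        ax.set_title(f"{indicator} by Country")
        ax.set_xlabel("Country")
        ax.set_ylabel(indicator)
        ax.tick_params(axis="x", rotation=90)
        axes.append(ax)
    return axes
```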

#### 11. Unemployment Rate vs. Test Scores
Prompt: "Create a scatter plot to analyze the relationship between unemployment rate and test scores."
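
A sketch of the scatter plot, assuming `unemployment` and `Test_Score` columns. Returning the Pearson correlation alongside the plot is an addition to the prompt, included to give the visual impression a number.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def plot_unemployment_vs_score(df):
    """Scatter unemployment against test score; also return Pearson's r."""
    r = df["unemployment"].corr(df["Test_Score"])
    fig, ax = plt.subplots(figsize=(9, 6))
    sns.scatterplot(data=df, x="unemployment", y="Test_Score", alpha=0.6, ax=ax)
    ax.set_title("Unemployment Rate vs. Test Scores")
    ax.set_xlabel("Unemployment rate")
    ax.set_ylabel("Test score")
    return r, ax
```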

#### 12. Facet Grids for Test Scores by Gender and Subject
Prompt: "Create facet grids to visualize test scores by gender and subject. Set up a FacetGrid with columns representing 'SUBJECT' and rows representing 'GENDER', including margin titles. Within each facet, plot a histogram of 'Test_Score' with a kernel density estimate. Add a legend to the plot, and display the final visualization."