# Interactive Dashboards with Tableau

This notebook demonstrates how to create and share interactive data visualizations with Tableau. It covers data preprocessing, visualization creation, and dashboard integration.

## Dataset

The datasets used in this project are:
1. **COVID-19 World Vaccination Progress**: Contains information on vaccination progress across various countries.
2. **COVID-19 World Vaccination Progress by Manufacturer**: Contains information on vaccination progress by different manufacturers.
3. **COVID-19 Worldometer Data**: Contains daily information on COVID-19 cases and deaths across various countries.
4. **World Population Data**: Contains information on the total population by country.

### Citation of the Datasets

- Preda, Gabriel. (2021). COVID-19 World Vaccination Progress. Kaggle. https://www.kaggle.com/datasets/gpreda/covid-world-vaccination-progress.
- Preda, Gabriel. (2021). COVID-19 World Vaccination Progress by Manufacturer. Kaggle. https://www.kaggle.com/datasets/gpreda/covid-world-vaccination-progress.
- Assaker, Joseph. (2021). COVID-19 Global Dataset. Kaggle. https://www.kaggle.com/datasets/josephassaker/covid19-global-dataset.
- Banerjee, Sourav. (2021). World Population Dataset. Kaggle. https://www.kaggle.com/datasets/iamsouravbanerjee/world-population-dataset.


## Data Preprocessing

### Step 1: Import Libraries and Load the Datasets

In [8]:
import pandas as pd
# Load the datasets
vaccinations = pd.read_csv('../data/country_vaccinations.csv')
vaccinations_by_manufacturer = pd.read_csv('../data/country_vaccinations_by_manufacturer.csv')
worldometer = pd.read_csv('../data/worldometer_coronavirus_daily_data.csv')
population = pd.read_csv('../data/world_population.csv')

# Display the column names to understand the structure of each dataset
print("\nVaccinations Columns:", vaccinations.columns)
print("\nVaccinations by Manufacturer Columns:", vaccinations_by_manufacturer.columns)
print("\nWorldometer Columns:", worldometer.columns)
print("\nPopulation Columns:", population.columns)


Vaccinations Columns: Index(['country', 'iso_code', 'date', 'total_vaccinations',
       'people_vaccinated', 'people_fully_vaccinated',
       'daily_vaccinations_raw', 'daily_vaccinations',
       'total_vaccinations_per_hundred', 'people_vaccinated_per_hundred',
       'people_fully_vaccinated_per_hundred', 'daily_vaccinations_per_million',
       'vaccines', 'source_name', 'source_website'],
      dtype='object')

Vaccinations by Manufacturer Columns: Index(['location', 'date', 'vaccine', 'total_vaccinations'], dtype='object')

Worldometer Columns: Index(['date', 'country', 'cumulative_total_cases', 'daily_new_cases',
       'active_cases', 'cumulative_total_deaths', 'daily_new_deaths'],
      dtype='object')

Population Columns: Index(['Rank', 'CCA3', 'Country/Territory', 'Capital', 'Continent',
       '2022 Population', '2020 Population', '2015 Population',
       '2010 Population', '2000 Population', '1990 Population',
       '1980 Population', '1970 Population', 'Area (km²)', 
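
Column names alone do not show data types or gaps in the data. The following is a minimal sketch of a slightly fuller check. It uses a small invented frame in place of the real CSV files, and the helper name `summarize` is hypothetical.

```python
import pandas as pd

def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype and share of missing values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing_share": df.isna().mean(),
    })

# Toy stand-in for one of the loaded datasets (values are invented)
sample = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "date": ["2021-01-01", "2021-01-02", None, "2021-01-04"],
    "total_vaccinations": [100.0, None, None, 250.0],
})

summary = summarize(sample)
print(summary)
```

Calling `summarize(vaccinations)` on the real data would show which columns are sparse before they reach Tableau.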

### Step 2: Convert Date Columns and Standardize Country Names

In [9]:
# Ensure date formats are consistent
vaccinations['date'] = pd.to_datetime(vaccinations['date'])
vaccinations_by_manufacturer['date'] = pd.to_datetime(vaccinations_by_manufacturer['date'])
worldometer['date'] = pd.to_datetime(worldometer['date'])

# Standardize the country names to uppercase
vaccinations['country'] = vaccinations['country'].str.upper()
vaccinations_by_manufacturer['location'] = vaccinations_by_manufacturer['location'].str.upper()
worldometer['country'] = worldometer['country'].str.upper()
# Rename the population columns to match the other datasets, then uppercase the country names
population = population.rename(columns={'Country/Territory': 'country', '2022 Population': 'population'})
population['country'] = population['country'].str.upper()

# Keep only the country and 2022 population columns in the population data
population = population[['country', 'population']]
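
Uppercasing removes differences in letter case, but it does not reconcile different spellings of the same country across sources. The sketch below shows one way to list names that appear in only one dataset before merging. The helper `unmatched_countries` is hypothetical and the sample names are invented for illustration; they are not taken from the actual files.

```python
import pandas as pd

def unmatched_countries(left: pd.Series, right: pd.Series):
    """Return (names only in left, names only in right) after trimming and uppercasing."""
    left_names = set(left.dropna().str.strip().str.upper())
    right_names = set(right.dropna().str.strip().str.upper())
    return left_names - right_names, right_names - left_names

# Invented examples of the kind of mismatch that can occur between sources
vacc_names = pd.Series(["United States", "France", "Czechia"])
world_names = pd.Series(["USA", "France", "Czech Republic"])

only_vacc, only_world = unmatched_countries(vacc_names, world_names)
print("Only in first source:", sorted(only_vacc))
print("Only in second source:", sorted(only_world))
```

On the real data, a call such as `unmatched_countries(vaccinations['country'], worldometer['country'])` would show which names need a manual mapping before the merge in the next step.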


### Step 3: Merge Datasets

In [10]:
# Merge vaccinations and worldometer datasets on country and date
merged_data = pd.merge(vaccinations, worldometer, on=['country', 'date'], how='outer')

# Merge the result with population data
merged_data = pd.merge(merged_data, population, on='country', how='left')

# Aggregate the manufacturer data per country and vaccine. total_vaccinations is cumulative,
# so take the latest (maximum) value instead of summing across dates.
vaccinations_by_manufacturer_agg = vaccinations_by_manufacturer.groupby(['location', 'vaccine']).agg({'total_vaccinations': 'max'}).reset_index()
vaccinations_by_manufacturer_agg = vaccinations_by_manufacturer_agg.rename(columns={'location': 'country'})

# Merge the aggregated manufacturer data with the merged dataset
final_data = pd.merge(merged_data, vaccinations_by_manufacturer_agg, on='country', how='outer')
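
Joining a per-country, per-vaccine table onto a per-country, per-date table on `country` alone repeats every daily row once per vaccine. The sketch below illustrates that row multiplication on invented toy frames and shows how the `indicator` and `validate` parameters of `pd.merge` can surface it. The frame names and values are assumptions made for the example.

```python
import pandas as pd

# Toy daily table: one row per country and date (values are invented)
daily = pd.DataFrame({
    "country": ["A", "A", "B"],
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-01"]),
    "daily_new_cases": [10, 12, 5],
})

# Toy manufacturer table: one row per country and vaccine
by_vaccine = pd.DataFrame({
    "country": ["A", "A", "C"],
    "vaccine": ["Pfizer/BioNTech", "Moderna", "Moderna"],
    "total_vaccinations": [100, 40, 7],
})

# indicator=True adds a _merge column showing where each row came from
combined = pd.merge(daily, by_vaccine, on="country", how="outer", indicator=True)
print(len(daily), "daily rows ->", len(combined), "merged rows")
print(combined["_merge"].value_counts())

# validate raises MergeError when the right-hand keys are not unique
try:
    pd.merge(daily, by_vaccine, on="country", how="outer", validate="many_to_one")
    fan_out_detected = False
except pd.errors.MergeError:
    fan_out_detected = True
print("Duplicate keys on the right side:", fan_out_detected)
```

If the repeated rows are intended, as when Tableau filters on `vaccine`, measures such as daily cases should be aggregated with care so they are not counted once per vaccine.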

### Step 4: Clean the Final Dataset

In [11]:
# Drop the CCA3 column if it exists
if 'CCA3' in final_data.columns:
    final_data = final_data.drop(columns=['CCA3'])

# Save the final merged dataset (uncomment the next line to write the CSV)
# final_data.to_csv('../data/cleaned_covid_combined_data.csv', index=False)

# Display the columns of the final merged dataset
print("\nMerged dataset Columns:", final_data.columns)


Merged dataset Columns: Index(['country', 'iso_code', 'date', 'total_vaccinations_x',
       'people_vaccinated', 'people_fully_vaccinated',
       'daily_vaccinations_raw', 'daily_vaccinations',
       'total_vaccinations_per_hundred', 'people_vaccinated_per_hundred',
       'people_fully_vaccinated_per_hundred', 'daily_vaccinations_per_million',
       'vaccines', 'source_name', 'source_website', 'cumulative_total_cases',
       'daily_new_cases', 'active_cases', 'cumulative_total_deaths',
       'daily_new_deaths', 'population', 'vaccine', 'total_vaccinations_y'],
      dtype='object')
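
The `total_vaccinations_x` and `total_vaccinations_y` columns are the default names pandas gives to overlapping columns in a merge. As a minimal sketch, the `suffixes` parameter of `pd.merge` can assign more descriptive names. The toy frames and the suffix strings below are illustrative assumptions.

```python
import pandas as pd

# Toy frames that share a total_vaccinations column (values are invented)
per_country = pd.DataFrame({"country": ["A"], "total_vaccinations": [100]})
per_vaccine = pd.DataFrame({
    "country": ["A"],
    "vaccine": ["Moderna"],
    "total_vaccinations": [40],
})

labelled = pd.merge(
    per_country,
    per_vaccine,
    on="country",
    how="outer",
    suffixes=("_country", "_manufacturer"),  # replaces the default _x / _y
)
print(list(labelled.columns))
```

Descriptive names make the two measures easier to tell apart when they appear as fields in Tableau.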


## Visualization

The visualizations provide insights into:

- The geographical distribution of COVID-19 vaccinations.
- Comparative analysis of vaccination and death progress among countries.
- Comparative analysis of vaccines by manufacturer.

The project explores several interactive visualizations:

- World Map: Total vaccinations by country.
- Time Series Chart: Vaccinations and deaths by country.
- Pie Chart: Number of vaccines by manufacturer.

The visualizations are built in Tableau and published on Tableau Public.


In [18]:
from IPython.display import IFrame

# Embed the Tableau Public dashboard; replace the URL below with your own share link if needed
IFrame(src="https://public.tableau.com/shared/3J6YJQFY9?:display_count=n&:origin=viz_share_link", width=1300, height=900)

## Problems Viewing the Dashboard
If the embedded Tableau Public dashboard does not load, your browser or its security settings may be blocking the iframe (for example, through the `X-Frame-Options` or `Content-Security-Policy` response headers).

You can still view the interactive dashboard directly at: https://public.tableau.com/shared/3J6YJQFY9?:display_count=n&:origin=viz_share_link
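
When the iframe is blocked, a clickable link in the notebook output can serve as a fallback. The sketch below builds such a link with the standard library and displays it with IPython when available. The helper name `dashboard_link` and the link label are hypothetical.

```python
from html import escape

DASHBOARD_URL = "https://public.tableau.com/shared/3J6YJQFY9?:display_count=n&:origin=viz_share_link"

def dashboard_link(url: str, label: str = "Open the dashboard on Tableau Public") -> str:
    """Build an HTML anchor that opens the dashboard in a new browser tab."""
    return f'<a href="{escape(url, quote=True)}" target="_blank" rel="noopener">{escape(label)}</a>'

link_html = dashboard_link(DASHBOARD_URL)

try:
    # Inside a notebook this renders a clickable link
    from IPython.display import HTML, display
    display(HTML(link_html))
except ImportError:
    # Outside a notebook, fall back to printing the markup
    print(link_html)
```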