# Bad habits vs Education
This guided project explores whether there is a connection between the student population and "bad" habits or behaviors that may negatively impact health. The habits under study are:
- Coffee addiction
- Smoking
- Video game addiction

It is important to note that, due to a lack of recent research and available data on these topics, the data sources used in this project range from 2019 to 2020.

#### Slideshow controls:
*Press the `'↓'` key* when it is available on the bottom-right side of your screen.

*Press the `'→'` key* when `'↓'` is not available.

In [10]:
import pandas as pd  # For reading and handling CSV files.
import numpy as np  # For numerical operations and handling missing values.
import matplotlib.pyplot as plt  # For creating basic visualizations.
import seaborn as sns  # For advanced data visualization used in the infographic.
from IPython.display import Image, display, clear_output, HTML  # For displaying images and HTML not generated in this notebook.
from tabulate import tabulate  # For printing neatly formatted tables as text output.
import ipywidgets as widgets  # For adding interactive controls to the notebook for quick displays and customizations.

## Content:
1. Introduction.
2. Data Description.
3. Visualization Plan.
4. Data Loading and Cleaning.

Feel free to visit **https://github.com/DefoNotGus/DV_assesment** to find the project's notebook.

### Is there a relationship between habits and education? 

By examining the demographic factors associated with these habits and their prevalence in different cultures, the project aims to uncover potential links between these behaviors and the student population. Such insights could help society understand whether these harmful or misused habits are directly related to the stress faced by higher education students, highlighting the need to restructure education programs in regions where these addictions are more prevalent.

<div style="text-align: center;">
  <img src="img/img1.png" alt="Image" style="width:300px;">
</div>

# Data Description.

## The datasets used to study the habits are:

1. **Coffee Consumption Dataset**: Lists coffee consumption by country with extensive coverage.
- **Source**: [Kaggle](https://www.kaggle.com/datasets/nurielreuven/coffee-consumption-by-country-2022/data) 
- **Size**: 182 rows and 3 columns
- **Domain**: Healthcare and Marketing
2. **Smoking Rates Dataset**: Provides a chronological overview of smoking rates in many countries, sourced via a Google search.
- **Source**: [World Population Review](https://worldpopulationreview.com/country-rankings/smoking-rates-by-country)  
- **Size**: 164 rows and 10 columns
- **Domain**: Healthcare
3. **Gamers Market Dataset**: Compiles a 2019 overview of the gaming market in many countries, which aligns chronologically with the other datasets. The data was collected with web scraping techniques.
- **Source**: [Allcorrect Games](https://allcorrectgames.com/insights/a-global-research-of-2019-games-market/)  
- **Size**: 29 rows and 8 columns
- **Domain**: Marketing and Videogames

## The datasets used to analyze students and enrollment data are:  

1. **Education Statistics Dataset**: A massive dataset with student enrollment data by region, sourced using the World Bank DataBank tool.  
- **Source**: [World Bank](https://databank.worldbank.org/indicator/)  
- **Size**: 197235 rows and 8 columns
- **Domain**: Education
2. **Students Dataset**: Provides country-specific enrollment data filtered to align with the habits datasets, sourced from the OECD Data Explorer.  
- **Source**: [OECD Data Explorer](https://data-explorer.oecd.org/)  
- **Size**: 56 rows and 26 columns
- **Domain**: Education

# Visualization Plan.

This project aims to visualize the correlation between bad habits and education and to analyze factors that contribute to certain addictions, using Python for data handling. The approach involves exploring and processing the data, gathering the relevant data for each visualization technique, and creating an infographic that represents addiction patterns in relation to gender, geographic, and academic factors. Canva is the tool chosen to design both the infographic and its draft.

## Key Variables:
- Country: Countries where the data was gathered. This will be the index for certain datasets.
- Coffee consumption per capita (2020): The amount of coffee consumed divided by the population, that is, the amount of coffee "per head" in kilograms.
- Smoking Rates (2020): The percentage of the population that smokes in each country listed.
- Smoking Rates (2020) Male: The percentage of the male population that smokes in each country listed.
- Smoking Rates (2020) Female: The percentage of the female population that smokes in each country listed.
- Gamers (2019): The number of gamers per country in millions.
- Students per region (2020): Number of enrolled students per region.
- Students per country (2020): Number of enrolled students per country.
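
The per-capita definition above is a plain division of total consumption by population. A minimal sketch with made-up numbers follows; the country labels, column names, and values are hypothetical placeholders and are not taken from the project's datasets.

```python
import pandas as pd

# Hypothetical totals; not values from the project's datasets.
demo = pd.DataFrame(
    {
        "Country": ["A", "B"],
        "coffee_kg_total": [1_000_000.0, 450_000.0],
        "population": [200_000, 150_000],
    }
).set_index("Country")

# Per-capita consumption in kilograms: total divided by population.
demo["coffee_kg_per_capita"] = demo["coffee_kg_total"] / demo["population"]
print(demo["coffee_kg_per_capita"])
```

Expressing each habit relative to population in this way is what makes countries of very different sizes comparable.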

## Visualizations: 
Since the data to visualize is mainly comparative, the planned plots are:
- Bar charts
- Line graphs
- Column charts
- Heatmaps
- Pie chart
- Stacked bar chart
- Area chart

## Accessibility, Accuracy and Presentation:

This project follows these considerations to provide accessibility and avoid misleading data:
- Use readable fonts and sufficient text size.
- Include legends and labels for clarity and provide alternative text for charts.
- Ensure colorblind-friendly palettes (only one graph uses a red-green contrast, and its legend is there as a guide).
- Normalize data for fair comparisons.
- Avoid cherry-picking data and present outliers and anomalies transparently.
- Clarify causation vs. correlation to avoid misinterpretation.
- Make the data and the code open source to enable users to make their own visualizations and research.
- Present a neat and clear format for the infographic, like the one shown below (click for fullscreen):
 

<div style="text-align: center;">
  <img src="img/Draft.png" alt="Image" style="width:300px; cursor: pointer;" onclick="openFullscreen(this)">
</div>

<script>
  function openFullscreen(img) {
    const fullscreen = document.createElement('div');
    fullscreen.style.position = 'fixed';
    fullscreen.style.top = '0';
    fullscreen.style.left = '0';
    fullscreen.style.width = '100%';
    fullscreen.style.height = '100%';
    fullscreen.style.background = 'rgba(0, 0, 0, 0.8)';
    fullscreen.style.display = 'flex';
    fullscreen.style.alignItems = 'center';
    fullscreen.style.justifyContent = 'center';
    fullscreen.style.zIndex = '9999';
    fullscreen.style.cursor = 'pointer';

    const fullscreenImg = document.createElement('img');
    fullscreenImg.src = img.src;
    fullscreenImg.style.maxWidth = '90%';
    fullscreenImg.style.maxHeight = '120%';
    fullscreenImg.style.border = '5px solid white';
    fullscreen.appendChild(fullscreenImg);

    fullscreen.onclick = () => document.body.removeChild(fullscreen);
    document.body.appendChild(fullscreen);
  }
</script>


# Data Loading and Cleaning.

> The HTML slideshow version only includes snippets of code. To see the full implementation, check the [notebook version](https://github.com/DefoNotGus/DV_assesment/blob/main/DV_assesment.slides.html).

### Content:
- Missing data handling
- One change or creation of a new feature
- One technique to handle/explore rows for at least one of the data files
- At least one merging between a minimum of two data files
- One data aggregation OR melt/pivot of a data file before/after the merge

Datasets are loaded into variables using `pd.read_csv()` from the Pandas library.

In [11]:
# Importing the CSV files
coffee_df = pd.read_csv('Datasets/coffee.csv')
smoking_df = pd.read_csv('Datasets/smoking.csv')
gamers_df = pd.read_csv('Datasets/gamers.csv')
edstats_df = pd.read_csv('Datasets/edstats.csv')
students_df = pd.read_csv('Datasets/students.csv')

## Missing Data Handling:
Missing data was handled using `.dropna()` to remove any rows containing NaN values. This is an essential step in ensuring that the dataset is clean before further processing. Any remaining missing or empty values were then counted using `.isin(['', None])`.

<div style="text-align: center;">
  <img src="img/img2.png" alt="Image" style="width:600px; cursor: pointer;" onclick="openFullscreen(this)">
</div>
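
The two steps described above can be sketched on a small stand-in table. This is a minimal illustration under assumptions: the DataFrame, its column names, and its values are hypothetical and do not come from the project's CSV files.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; column names and values are placeholders.
sample = pd.DataFrame(
    {
        "Country": ["A", "B", "C", "D"],
        "Rate": [12.5, np.nan, 30.1, 8.0],
        "Region": ["X", "Y", "", "Z"],
    }
)

# Drop every row that contains at least one NaN.
cleaned = sample.dropna()

# Count values that are still empty strings or None, per column.
remaining_empty = cleaned.isin(["", None]).sum()
print(remaining_empty)
```

Note that `.dropna()` does not treat an empty string as missing, which is why the second check is still useful after the rows with NaN are gone.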



## Creating or Changing a Feature
Example 1: We created a new feature called Rank using the `.assign()` method.

<div style="text-align: center;">
  <img src="img/img3.png" alt="Image" style="width:600px; cursor: pointer;" onclick="openFullscreen(this)">
</div>
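
A minimal sketch of adding a `Rank` feature with `.assign()` is given here. The ranking rule is an assumption (rank 1 for the highest value), since the original rule appears only in the screenshot; the DataFrame and the `kg_per_capita` column name are hypothetical.

```python
import pandas as pd

# Hypothetical values; the column name is a placeholder.
coffee_demo = pd.DataFrame(
    {"Country": ["A", "B", "C"], "kg_per_capita": [4.2, 9.6, 6.1]}
)

# Assumption: Rank 1 is the highest value. This is one plausible rule,
# not necessarily the one used in the project.
coffee_demo = coffee_demo.assign(
    Rank=lambda df: df["kg_per_capita"].rank(ascending=False).astype(int)
)
print(coffee_demo)
```

`.assign()` returns a new DataFrame instead of modifying the original in place, so the result has to be bound to a name.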



Example 2: We used `.rename()` to rename a feature.

<div style="text-align: center;">
  <img src="img/img7.png" alt="Image" style="width:600px; cursor: pointer;" onclick="openFullscreen(this)">
</div>
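
Renaming a column with `.rename()` might look like the sketch below. Both the old and the new column names are hypothetical placeholders; the names actually changed in the project are visible only in the screenshot.

```python
import pandas as pd

# Hypothetical table; "rate_2020" is a placeholder column name.
smoking_demo = pd.DataFrame({"Country": ["A", "B"], "rate_2020": [21.4, 17.9]})

# Map old column names to new ones; unlisted columns are left untouched.
renamed = smoking_demo.rename(columns={"rate_2020": "Smoking Rate (2020)"})
print(renamed.columns.tolist())
```

Like `.assign()`, `.rename()` returns a new DataFrame by default and leaves `smoking_demo` unchanged.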



## Handle and explore rows
To explore the data we used the `ipywidgets` library, `tabulate`, and the `.head()` and `.tail()` methods. The output has been converted into HTML for an improved interface.

<div style="text-align: center;">
  <img src="img/img4.png" alt="Image" style="width:600px; cursor: pointer;" onclick="openFullscreen(this)">
</div>
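
A minimal sketch of the `.head()` and `.tail()` part of this exploration follows. The widget and `tabulate` layer shown in the screenshot is deliberately left out, and the table used here is a hypothetical placeholder.

```python
import pandas as pd

# Hypothetical table with six rows.
explore_demo = pd.DataFrame(
    {"Country": list("ABCDEF"), "Value": [6, 5, 4, 3, 2, 1]}
)

# First and last two rows; both methods default to 5 rows when n is omitted.
first_rows = explore_demo.head(2)
last_rows = explore_demo.tail(2)

print(first_rows.to_string(index=False))
print(last_rows.to_string(index=False))
```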



## Merging DataFrames
We merged them using `.merge()` as shown below.
- `how='inner'` specifies that only the rows with matching index values in both DataFrames are merged.
- A suffix is added to overlapping column names in the merged DataFrame to avoid naming conflicts between columns.

<div style="text-align: center;">
  <img src="img/img5.png" alt="Image" style="width:600px; cursor: pointer;" onclick="openFullscreen(this)">
</div>
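
The merge described above can be sketched as follows. This is an illustration under assumptions: the two tables, the shared `Value` column, and the suffix strings are hypothetical, and using `left_index`, `right_index`, and `suffixes` is a guess at what the screenshot shows.

```python
import pandas as pd

# Two hypothetical tables indexed by country, sharing a column name.
left = pd.DataFrame(
    {"Value": [1.0, 2.0, 3.0]},
    index=pd.Index(["A", "B", "C"], name="Country"),
)
right = pd.DataFrame(
    {"Value": [10.0, 30.0, 40.0]},
    index=pd.Index(["A", "C", "D"], name="Country"),
)

# Inner merge on the index: only countries present in both tables survive.
# The suffixes disambiguate the overlapping "Value" column.
merged = left.merge(
    right,
    how="inner",
    left_index=True,
    right_index=True,
    suffixes=("_coffee", "_smoking"),
)
print(merged)
```

Countries B and D are dropped because each appears in only one of the two tables, which is the main trade-off of an inner merge.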



## Data Aggregation 
We used `.assign()` for a simple data aggregation: the average of three features is calculated with `.mean()` using `axis=1`, so the mean is taken across each row, and the result is stored in a new feature called 'Ranks'.

<div style="text-align: center;">
  <img src="img/img6.png" alt="Image" style="width:600px; cursor: pointer;" onclick="openFullscreen(this)">
</div>
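
A minimal sketch of this row-wise average is shown here. The three input column names and their values are hypothetical placeholders; only `.assign()`, `.mean()`, `axis=1`, and the `Ranks` name come from the description above.

```python
import pandas as pd

# Hypothetical per-habit ranks for three countries.
ranks_demo = pd.DataFrame(
    {
        "coffee_rank": [1, 2, 3],
        "smoking_rank": [3, 2, 1],
        "gamers_rank": [2, 5, 2],
    },
    index=pd.Index(["A", "B", "C"], name="Country"),
)

rank_cols = ["coffee_rank", "smoking_rank", "gamers_rank"]

# axis=1 averages across the columns of each row, one value per country.
ranks_demo = ranks_demo.assign(Ranks=lambda df: df[rank_cols].mean(axis=1))
print(ranks_demo)
```

With the default `axis=0`, `.mean()` would instead return one average per column, which is not what a per-country score needs.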



# Thank you for watching
"Being healthier does not mean being wise, but being wise means being healthier."
> [Infographic](https://www.canva.com/design/DAGXrjer7kU/JBxo4YxgOL8f5c8Tm-RwwQ/view?utm_content=DAGXrjer7kU&utm_campaign=designshare&utm_medium=link&utm_source=editor) 

<div style="text-align: center;">
  <img src="img/ty.png" alt="Image" style="width:300px; cursor: pointer;" onclick="openFullscreen(this)">
  <img src="img/ty2.png" alt="Image" style="width:300px; cursor: pointer;" onclick="openFullscreen(this)">
</div>

