# #5 | The Resolving Python Framework for Data Visualization Libraries

## Possibilities

Look at the following example as an aspiration you can achieve if you fully understand and replicate this whole tutorial with your own data.

Let's load a dataset that contains information about countries (rows) described by sociodemographic and economic variables (columns).

In [None]:
import plotly.express as px

df_countries = px.data.gapminder()
df_countries

Python has three main libraries for Data Visualization:

1. **Matplotlib** (Mathematical Plotting)
2. **Seaborn** (High-Level based on Matplotlib)
3. **Plotly** (Interactive and Animated Plots)

I personally love `plotly` because its visualizations are interactive; you can hover the mouse over the points to get information about them:

In [None]:
df_countries_2007 = df_countries.query('year == 2007')

px.scatter(data_frame=df_countries_2007, x='gdpPercap', y='lifeExp',
           color='continent', hover_name='country', size='pop')

You can even animate the plots with a simple parameter. Click on play ↓

PS: The following example is taken from the [official plotly library website](https://plotly.com/python/animations/):

In [None]:
px.scatter(df_countries, x="gdpPercap", y="lifeExp", animation_frame="year", animation_group="country",
           size="pop", color="continent", hover_name="country",
           log_x=True, size_max=55, range_x=[100,100000], range_y=[25,90])

In this article, we'll dig into the details of Data Visualization in Python to solidify the knowledge required to come up with awesome visualizations like the ones we saw above.

## Matplotlib

Matplotlib is a library used for Data Visualization.

We use the `pyplot` module (a **sublibrary**) of the `matplotlib` library to access the plotting functions.

In [None]:
import matplotlib.pyplot as plt

Let's make a bar plot:

In [None]:
plt.bar(x=['Real Madrid', 'Barcelona', 'Bayern Munich'],
        height=[14, 5, 6]);

We could have also drawn a point plot:

In [None]:
plt.scatter(x=['Real Madrid', 'Barcelona', 'Bayern Munich'],
            y=[14, 5, 6]);

But it doesn't make sense for this data: the x-axis holds categories (team names), so bars communicate the values better than points.

## Visualize DataFrame

Let's create a DataFrame:

In [None]:
import pandas as pd

teams = ['Real Madrid', 'Barcelona', 'Bayern Munich']
uefa_champions = [14, 5, 6]

df_champions = pd.DataFrame(data={'Team': teams,
                                  'UEFA Champions': uefa_champions})
df_champions

And visualize it using...

### Matplotlib functions

In [None]:
plt.bar(x=df_champions['Team'],
        height=df_champions['UEFA Champions']);

### DataFrame functions

In [None]:
df_champions.plot.bar(x='Team', y='UEFA Champions');
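
Both cells draw the same chart because, with the default plotting backend, the pandas plotting methods rely on Matplotlib. Here is a minimal sketch that rebuilds the DataFrame from above, checks the type of the returned object, and then refines it with Matplotlib methods. The title and label text are illustrative choices, not part of the original tutorial:

```python
import matplotlib.axes
import pandas as pd

df_champions = pd.DataFrame(data={
    'Team': ['Real Madrid', 'Barcelona', 'Bayern Munich'],
    'UEFA Champions': [14, 5, 6],
})

# With the default plotting backend, DataFrame.plot.bar returns a Matplotlib Axes
ax = df_champions.plot.bar(x='Team', y='UEFA Champions')
returns_matplotlib_axes = isinstance(ax, matplotlib.axes.Axes)

# ...so any Axes method can be used to refine the figure (illustrative text)
ax.set_ylabel('Titles')
ax.set_title('UEFA Champions League titles')
```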

## Seaborn

Let's read another dataset: the Premier League standings for the 2021/2022 season.

In [None]:
df_premier = pd.read_excel(io='premier_league.xlsx')
df_premier

We will draw a point plot, from now on called a **scatter plot**, to check whether there is a relationship between the number of goals scored (`F`) and the points (`Pts`).

In [None]:
import seaborn as sns

sns.scatterplot(x='F', y='Pts', data=df_premier);

Can we draw the same plot with the `matplotlib` library (`plt`)?

In [None]:
plt.scatter(x='F', y='Pts', data=df_premier);

What are the differences between them?

1. The points: `matplotlib` draws bigger points than `seaborn`
2. The axis labels: `matplotlib` axis labels are non-existent, whereas `seaborn` places the names of the columns
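
Both differences can be closed by hand in Matplotlib. The sketch below assumes a small stand-in DataFrame, because `premier_league.xlsx` is not available here. The name `df_example`, the placeholder team names, and the numbers are invented for illustration and are not real standings:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for df_premier: invented values, not real standings
df_example = pd.DataFrame({
    'Team': ['Team A', 'Team B', 'Team C', 'Team D'],
    'F': [90, 75, 60, 40],
    'Pts': [92, 80, 65, 38],
})

fig, ax = plt.subplots()

# With `data=`, the strings passed to x and y are looked up as column names.
# `s` is the marker area in points squared; a small value gives smaller points.
points = ax.scatter(x='F', y='Pts', data=df_example, s=20)

# Matplotlib does not label the axes for you, so set the labels explicitly
ax.set_xlabel('F')
ax.set_ylabel('Pts')
```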

Which library do the objects returned by the previous functions come from?

In [None]:
seaborn_plot = sns.scatterplot(x='F', y='Pts', data=df_premier);

In [None]:
matplotlib_plot = plt.scatter(x='F', y='Pts', data=df_premier);

In [None]:
type(seaborn_plot)

In [None]:
type(matplotlib_plot)

Why does `seaborn` return a `matplotlib` object?

Quoted from the [seaborn](https://seaborn.pydata.org/) official website:

> Seaborn is a Python data visualization library **based on matplotlib**. It provides a **high-level\* interface** for drawing attractive and informative statistical graphics.

\*High-level means the communication between humans and the computer is easier to understand as compared to low-level where the communication goes through 0s and 1s.
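
A practical consequence is that a seaborn plot can be customized with plain Matplotlib methods. Here is a minimal sketch, again assuming a hypothetical `df_example` with invented values in place of `df_premier`. The title text is an illustrative choice:

```python
import matplotlib.axes
import matplotlib.collections
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for df_premier: invented values, not real standings
df_example = pd.DataFrame({
    'Team': ['Team A', 'Team B', 'Team C', 'Team D'],
    'F': [90, 75, 60, 40],
    'Pts': [92, 80, 65, 38],
})

fig, (ax_left, ax_right) = plt.subplots(ncols=2)

seaborn_plot = sns.scatterplot(x='F', y='Pts', data=df_example, ax=ax_left)
matplotlib_plot = ax_right.scatter(x='F', y='Pts', data=df_example)

# seaborn hands back the Matplotlib Axes it drew on
seaborn_is_axes = isinstance(seaborn_plot, matplotlib.axes.Axes)
# Matplotlib's scatter hands back the collection of drawn points
matplotlib_is_points = isinstance(
    matplotlib_plot, matplotlib.collections.PathCollection)

# Because the seaborn result is an Axes, Matplotlib methods work on it
seaborn_plot.set_title('Goals scored vs. points')
```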

Could you place the names of the teams next to the points?

In [None]:
plt.scatter(x='F', y='Pts', data=df_premier)

for idx, data in df_premier.iterrows():
    plt.text(x=data['F'], y=data['Pts'], s=data['Team'])

It's very complicated.

Is there an easier way?

Yes, you may use an interactive plot with the `plotly` library and display the name of the team as you hover the mouse over a point.

## Plotly

We use the `express` module within the `plotly` library to access the plotting functions:

In [None]:
import plotly.express as px

px.scatter(data_frame=df_premier, x='F', y='Pts', hover_name='Team')


## Practice

You'll practice the [Resolving Python Framework](https://www.craft.do/s/G80r1dqrQKrjTb) we have covered in this course by applying functions to the following `DataFrame`:

In [None]:
import pandas

df_football = pandas.read_csv('russia-world-cup.csv') #!
df_football = df_football[['Team', 'Goals Scored']].rename({'Goals Scored': 'goals', 'Team': 'team'}, axis=1).copy()
df_football

We have three main libraries in Python for Data Visualization:
    
- Matplotlib
- Seaborn
- Plotly

Therefore, you'll create each of the following types of figure with each library:

1. Scatter plot
2. Bar plot
3. Pie plot
4. Line plot

The main goal of this exercise is to strengthen your mental model. Therefore, you should use the following shortcuts:

1. `library.↹` [TAB] to get the list of functions available.
2. `function(⇧ + ↹)` [SHIFT] + [TAB] to read the documentation to know the parameters you need to pass.

> The function that produces a given figure may have a different name in each library. If you start typing the name of the figure, you'll discover it.
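
If the keyboard shortcuts are not available in your editor, the same discovery can be done in code. This is a minimal sketch: `find_functions` is a hypothetical helper written for this example, not part of any library, while `dir` and `inspect.signature` come from Python itself:

```python
import inspect

import matplotlib.pyplot as plt


def find_functions(module, keyword):
    """Return the public callables in `module` whose name contains `keyword`."""
    return [
        name for name in dir(module)
        if keyword in name.lower()
        and not name.startswith('_')
        and callable(getattr(module, name))
    ]


# Equivalent of typing `plt.` + [TAB] and filtering by the figure name
scatter_functions = find_functions(plt, 'scatter')

# Equivalent of [SHIFT] + [TAB]: list the parameters a function accepts
bar_parameters = list(inspect.signature(plt.bar).parameters)
```

The same helper can be applied to `sns` and `px` to compare the names each library uses.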

### Scatter plot

#### Matplotlib

In [None]:
import matplotlib.pyplot as plt #!

In [None]:
plt.

#### Seaborn

In [None]:
import seaborn as sns #!

In [None]:
sns.

#### Plotly

In [None]:
import plotly.express as px

In [None]:
px.

### Bar plot

#### Matplotlib

In [None]:
plt.

#### Seaborn

In [None]:
sns.

#### Plotly

In [None]:
px.

### Pie plot

#### Matplotlib


#### Seaborn

#### Plotly

### Line plot

#### Matplotlib

> Search on Google if you don't find the solution by applying the Deduction Method of the Resolving Python Framework

#### Seaborn

#### Plotly

# Reflect on what you have learnt ✍️

*Double-click this cell to write your reflections*

Which mistakes have you made?

- [ ] [Write here]

How will you solve them now?

- [ ] [Write here]

What have you learnt that you didn't know before?

- [ ] [Write here]

What do you value the most from this course?

- [ ] [Write here]