# 03 - Reshaping Data
The [previous notebook](02-ABitMore.ipynb) showed a very awkward way to select some data from a long table. Now we work out a better way: reshaping the table so that each country gets its own column.

## Scope
- Reading a `csv` file (with data for all countries) into a pandas data frame.
- Reshaping the original `DataFrame` to fit our needs, i.e. creating one column per country.
- Selecting countries and plotting data.

## Sources
- [10 minutes to pandas — pandas documentation](https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html)
- [pandas.Index — pandas documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Index.html)
- [pandas.IndexSlice — pandas documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.IndexSlice.html)

In [None]:
# Preparations for using pandas library and this notebook
import pandas as pd
%matplotlib inline

In [None]:
# Reading data into a pandas data frame, some cleaning up
data_source = 'ourworldindata.org'
file_id = 'life-expectancy.csv'
life_expectancy = pd.read_csv(file_id)
life_expectancy.drop(columns=['Code'], inplace=True)  # Unused column
life_expectancy.rename(columns={'Entity': 'Country'}, inplace=True)
life_expectancy.head()

In [None]:
# Reshaping data
life_expectancy_pivot = life_expectancy.pivot(index='Year', columns='Country', values='Life expectancy (years)')
life_expectancy_pivot
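
To see what `pivot` does without loading the full file, here is a minimal sketch on a tiny, made-up long table. The countries `A` and `B`, the values, and the names `long_table` and `wide_table` are assumptions for illustration only; just the column names follow the cell above.

```python
import pandas as pd

# Hypothetical toy data in long format: one row per (country, year) pair
long_table = pd.DataFrame({
    'Country': ['A', 'A', 'B', 'B'],
    'Year': [1950, 1951, 1950, 1951],
    'Life expectancy (years)': [60.0, 60.5, 70.0, 70.4],
})

# Wide format: one row per year, one column per country
wide_table = long_table.pivot(index='Year', columns='Country',
                              values='Life expectancy (years)')
print(wide_table)
```

Each distinct value of `Country` becomes a column and each distinct `Year` becomes a row label, which is the same layout `life_expectancy_pivot` has.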

### Questions
- What is `NaN`? Have a look in the `pandas` and `numpy` documentation!

In [None]:
all_countries = life_expectancy_pivot.columns
all_countries

In [None]:
country_list = ['Africa', 'World', 'Germany']

In [None]:
some_countries = life_expectancy_pivot[country_list]
some_countries.tail()
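
Selecting columns with a list raises a `KeyError` if one of the names is not a column. The following sketch shows one possible way to check the names first. The toy table, the name `Atlantis`, and the variables `wanted`, `available` and `missing` are assumptions for illustration, not part of the data set.

```python
import pandas as pd

# Hypothetical wide table with one column per country
wide_table = pd.DataFrame(
    {'Africa': [50.0, 51.0], 'Germany': [70.0, 71.0], 'World': [60.0, 61.0]},
    index=pd.Index([2000, 2001], name='Year'),
)
wanted = ['Africa', 'World', 'Atlantis']   # 'Atlantis' is not a column

# Split the requested names into existing and non-existing columns
available = [name for name in wanted if name in wide_table.columns]
missing = [name for name in wanted if name not in wide_table.columns]

selection = wide_table[available]
print(selection)
print('Not found:', missing)
```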

In [None]:
country_string = ', '.join(country_list)    # List in nicer form
plot_title = f'Countries: {country_string}. Source: {file_id} from {data_source}'
some_countries.plot(title=plot_title, grid=True, figsize=(16, 9));

### Questions
- How to get rid of the empty part of the `some_countries` plot ([`NaN`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html))?
- How to find a certain country [without knowing the exact name](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html)?
- What does the `;` at the end of a cell do?
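
As a starting point for the first two questions, here is a hedged sketch on made-up data. It uses `dropna` and `str.contains` from the linked documentation pages; the toy values and the names `wide_table`, `trimmed` and `matches` are assumptions, and other approaches may fit the real data better.

```python
import numpy as np
import pandas as pd

# Hypothetical wide table: the first year has no data for any country
wide_table = pd.DataFrame(
    {'Germany': [np.nan, 70.0, 71.0], 'Greenland': [np.nan, np.nan, 68.0]},
    index=pd.Index([1900, 2000, 2001], name='Year'),
)

# Drop only the years in which every column is NaN
trimmed = wide_table.dropna(how='all')

# Find column names containing a fragment, ignoring upper/lower case
matches = wide_table.columns[wide_table.columns.str.contains('green', case=False)]
print(trimmed)
print(list(matches))
```

With `how='all'`, a year is kept as soon as at least one country has a value, so partially filled rows remain in the plot.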

The next step deals with [groups of data](04-GroupingData.ipynb).