# <center> Working with Python Native Data Structures </center>

- [What are Python Native Data Structures](#section_1)
- [Pandas DataFrame() Function](#section_2)
- [Pandas DataFrame.from_dict() Function](#section_3)

<hr>

### What are Python Native Data Structures <a class="anchor" id="section_1"></a>

The Python programming language has a variety of built-in data structures such as `lists`, `tuples`, `dictionaries`, `strings`, and `sets`.

Python developers use these data structures to hold data while a program runs.

<br>

First, let's look at a dictionary example.

We create a typical Python dictionary to store information about a country. This dictionary has five key:value pairs.

When converted into a Pandas DataFrame, the dictionary keys become the labels, while the dictionary values become the corresponding data.

In [13]:
# Create a Python dictionary. Refer to lesson video for details
country_info = {'country_name':'New Zealand',
               'capital_city':'Wellington',
               'country_code':'NZ',
               'population':4783063,
               'area_km2':270467}

# Display dataset
country_info

{'country_name': 'New Zealand',
 'capital_city': 'Wellington',
 'country_code': 'NZ',
 'population': 4783063,
 'area_km2': 270467}
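
As a minimal sketch of that conversion (assuming Pandas is installed), a dictionary of single values can be wrapped in a list so that it becomes a one-row DataFrame. The variable name `df_country` is introduced here only for illustration.

```python
import pandas as pd

country_info = {'country_name': 'New Zealand',
                'capital_city': 'Wellington',
                'country_code': 'NZ',
                'population': 4783063,
                'area_km2': 270467}

# Wrapping the dictionary in a list produces a single row;
# the dictionary keys become the column labels.
df_country = pd.DataFrame([country_info])

print(df_country.columns.tolist())
print(df_country.shape)
```

Passing the dictionary directly, as in `pd.DataFrame(country_info)`, raises a `ValueError` because every value is a scalar and no index is supplied.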

<br>

Another commonly used Python data structure is the `list`, which stores multiple values in a single variable.

Let's have a look at another example below:

In [1]:
# Create a Python list
list_of_country_codes = ['CN','NZ','ZA','GB','US']
list_of_country_codes

['CN', 'NZ', 'ZA', 'GB', 'US']

<br>

Imagine we have a large number of dictionaries, each holding information about a different country.

The Pandas library can convert this list of similar dictionaries into a DataFrame object. 

Let’s look at this example where we create a variable called `list_of_countries` as a Python list where each item is a Python dictionary with information about a different country.

In [2]:
# Create a list of dictionaries. Refer to lesson video for details
list_of_countries = [
    {'country_name':'China', 'capital_city':'Beijing','population':1433783686, 'area_km2':9596961},
    {'country_name':'New Zealand','capital_city':'Wellington', 'population':4783063, 'area_km2':270467},
    {'country_name':'South Africa','capital_city':'Pretoria', 'population':58558270, 'area_km2':1221037},
    {'country_name':'United Kingdom','capital_city':'London', 'population':67530172, 'area_km2':242495},
    {'country_name':'United States','capital_city':'Washington DC', 'population':329064917, 'area_km2':9525067}
]

# Display the list of dictionaries
list_of_countries

[{'country_name': 'China',
  'capital_city': 'Beijing',
  'population': 1433783686,
  'area_km2': 9596961},
 {'country_name': 'New Zealand',
  'capital_city': 'Wellington',
  'population': 4783063,
  'area_km2': 270467},
 {'country_name': 'South Africa',
  'capital_city': 'Pretoria',
  'population': 58558270,
  'area_km2': 1221037},
 {'country_name': 'United Kingdom',
  'capital_city': 'London',
  'population': 67530172,
  'area_km2': 242495},
 {'country_name': 'United States',
  'capital_city': 'Washington DC',
  'population': 329064917,
  'area_km2': 9525067}]

However, a plain list of dictionaries is not well suited to analytical tasks such as exploratory analysis or data visualization.
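
To make the limitation concrete, here is a small sketch in plain Python (no Pandas) that computes population density from the same list. The names `densities`, `most_dense`, and `total_population` are illustrative and not part of the lesson.

```python
list_of_countries = [
    {'country_name': 'China', 'capital_city': 'Beijing', 'population': 1433783686, 'area_km2': 9596961},
    {'country_name': 'New Zealand', 'capital_city': 'Wellington', 'population': 4783063, 'area_km2': 270467},
    {'country_name': 'South Africa', 'capital_city': 'Pretoria', 'population': 58558270, 'area_km2': 1221037},
    {'country_name': 'United Kingdom', 'capital_city': 'London', 'population': 67530172, 'area_km2': 242495},
    {'country_name': 'United States', 'capital_city': 'Washington DC', 'population': 329064917, 'area_km2': 9525067},
]

# Every derived value needs an explicit loop over the dictionaries.
densities = {}
for country in list_of_countries:
    densities[country['country_name']] = country['population'] / country['area_km2']

# Finding the country with the highest density is another manual step.
most_dense = max(densities, key=densities.get)

# So is a simple total.
total_population = sum(country['population'] for country in list_of_countries)

print(most_dense)
print(total_population)
```

Each new question about the data means writing another loop by hand, which becomes tedious as the number of countries and questions grows.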
<br>
<br>
<br>

### Pandas DataFrame() Function <a class="anchor" id="section_2"></a>

Luckily, the Pandas library can convert Python data structures into DataFrame objects, which make data manipulation and analysis much easier. This is done with the Pandas [DataFrame()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) constructor.

In [16]:
# Import Pandas library
import pandas as pd

In [None]:
# Create a Pandas DataFrame from a list of dictionaries
df_countries = pd.DataFrame(list_of_countries)

# Display the DataFrame
df_countries

| | country_name | capital_city | population | area_km2 |
|---|---|---|---|---|
| 0 | China | Beijing | 1433783686 | 9596961 |
| 1 | New Zealand | Wellington | 4783063 | 270467 |
| 2 | South Africa | Pretoria | 58558270 | 1221037 |
| 3 | United Kingdom | London | 67530172 | 242495 |
| 4 | United States | Washington DC | 329064917 | 9525067 |


<br>

In the following example, we call the same function but also use one of its optional parameters, `index`.

For this parameter, we pass our list variable `list_of_country_codes` to be assigned as the DataFrame index. 

In [20]:
# Create a Pandas DataFrame from a list of dictionaries, using the country codes as the index
df_countries = pd.DataFrame(list_of_countries,
                           index = list_of_country_codes)

# Display the DataFrame
df_countries

| | country_name | capital_city | population | area_km2 |
|---|---|---|---|---|
| CN | China | Beijing | 1433783686 | 9596961 |
| NZ | New Zealand | Wellington | 4783063 | 270467 |
| ZA | South Africa | Pretoria | 58558270 | 1221037 |
| GB | United Kingdom | London | 67530172 | 242495 |
| US | United States | Washington DC | 329064917 | 9525067 |


Comparing the two examples above, both DataFrames hold the same data for five countries.

However, in the second example we set the DataFrame index by using the `index` parameter.

This is an optional parameter. If we omit it, the function automatically creates a default integer index starting at 0.
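
The difference can be checked directly on the index of each DataFrame. The sketch below uses a shortened, two-country version of the data; the names `df_default` and `df_coded` are illustrative only.

```python
import pandas as pd

# Shortened version of the lesson data, for illustration
list_of_countries = [
    {'country_name': 'China', 'capital_city': 'Beijing'},
    {'country_name': 'New Zealand', 'capital_city': 'Wellington'},
]
list_of_country_codes = ['CN', 'NZ']

# Without `index`, Pandas numbers the rows 0, 1, ...
df_default = pd.DataFrame(list_of_countries)

# With `index`, each row is labelled by its country code
df_coded = pd.DataFrame(list_of_countries, index=list_of_country_codes)

print(df_default.index.tolist())            # [0, 1]
print(df_coded.index.tolist())              # ['CN', 'NZ']

# A labelled index allows lookups by country code
print(df_coded.loc['NZ', 'capital_city'])   # Wellington
```

Note that the list passed to `index` must contain exactly one label per row; otherwise Pandas raises an error.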

<br>

### Pandas DataFrame.from_dict() Function <a class="anchor" id="section_3"></a>

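The Pandas [from_dict()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.from_dict.html) function can be used to convert a single dictionary into a DataFrame object. By default, each dictionary key becomes a column label and each list of values becomes the data in that column.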

In [23]:
# Create a Python dictionary with multiple values. Refer to lesson video for details
dictionary_of_countries = {
    'country_name':['China', 'New Zealand', 'South Africa', 'United Kingdom', 'United States'],
    'country_code':['CN','NZ','ZA','GB','US'],
    'capital_city':['Beijing','Wellington','Pretoria','London','Washington DC'],
    'population':[1433783686, 4783063, 58558270, 67530172, 329064917],
    'area_km2':[9596961, 270467, 1221037, 242495, 9525067]}

# Display the dictionary
dictionary_of_countries

{'country_name': ['China',
  'New Zealand',
  'South Africa',
  'United Kingdom',
  'United States'],
 'country_code': ['CN', 'NZ', 'ZA', 'GB', 'US'],
 'capital_city': ['Beijing',
  'Wellington',
  'Pretoria',
  'London',
  'Washington DC'],
 'population': [1433783686, 4783063, 58558270, 67530172, 329064917],
 'area_km2': [9596961, 270467, 1221037, 242495, 9525067]}

In [24]:
# Convert a dictionary into a DataFrame using from_dict() function
df_countries = pd.DataFrame.from_dict(dictionary_of_countries)

# Display the DataFrame
df_countries

| | country_name | country_code | capital_city | population | area_km2 |
|---|---|---|---|---|---|
| 0 | China | CN | Beijing | 1433783686 | 9596961 |
| 1 | New Zealand | NZ | Wellington | 4783063 | 270467 |
| 2 | South Africa | ZA | Pretoria | 58558270 | 1221037 |
| 3 | United Kingdom | GB | London | 67530172 | 242495 |
| 4 | United States | US | Washington DC | 329064917 | 9525067 |
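
`from_dict()` also accepts an `orient` parameter. As a hedged sketch, assuming the data is instead organised as a dictionary keyed by country code, `orient='index'` turns the outer keys into row labels rather than column labels. The names `countries_by_code` and `df_by_code` are illustrative and not part of the lesson.

```python
import pandas as pd

# Hypothetical layout: outer keys are country codes, inner dictionaries are rows
countries_by_code = {
    'NZ': {'country_name': 'New Zealand', 'capital_city': 'Wellington'},
    'ZA': {'country_name': 'South Africa', 'capital_city': 'Pretoria'},
}

# orient='index' treats each outer key as a row label
df_by_code = pd.DataFrame.from_dict(countries_by_code, orient='index')

print(df_by_code.index.tolist())
print(df_by_code.loc['ZA', 'capital_city'])
```

With the default `orient='columns'`, the same outer keys would be used as column labels instead.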


From these examples, we see that the Pandas library gives us multiple ways to convert Python native data structures into DataFrame objects.