# Zipping and Unpacking Lab

### Introduction

In this lesson, we'll get more practice with zipping and unpacking our data by working with some data describing the density of cities.  

Let's get started.

### Loading our data

We can begin by loading our data.

In [1]:
import pandas as pd
url = "https://raw.githubusercontent.com/python-fundamentals-jigsaw/review-datatypes/main/cities_dens.csv"
cities_df = pd.read_csv(url, index_col = 0)
cities = cities_df.to_dict('records')

In [2]:
cities[:2]

[{'city': 'Malé',
  'population': '153,904[1]',
  'area_mi': '1.956[1]',
  'density_mi': 203846.0,
  'country': 'Maldives'},
 {'city': 'Manila',
  'population': '1,660,714[2]',
  'area_mi': '38.55[3]',
  'density_mi': 111576.0,
  'country': 'Philippines'}]

And, identifying the grain of the data, we can see that each dictionary represents a different city.

### Selecting our data

Now let's say that we only need the name of each city and the number of people per square mile (indicated by `density_mi`).  

First, select the city names.

In [4]:
city_names = [city['city'] for city in cities]
city_names[:3]

# ['Malé', 'Manila', 'Bogor']

['Malé', 'Manila', 'Bogor']

And then let's select the density per city.

In [5]:
densities = [city['density_mi'] for city in cities]
densities[:3]

# [203846.0, 111576.0, 104037.0]

[203846.0, 111576.0, 104037.0]

Next, create a list of tuples, where each tuple's first element is the city name and the second element is the density.

In [7]:
city_densities = list(zip(city_names, densities))

city_densities[:3]

# [('Malé', 203846.0), ('Manila', 111576.0), ('Bogor', 104037.0)]

[('Malé', 203846.0), ('Manila', 111576.0), ('Bogor', 104037.0)]
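
As a minimal sketch of what `zip` is doing here, the block below hardcodes the three names and densities shown above instead of loading the CSV. The names `pairs`, `names_again`, `densities_again`, and `short` are only for illustration. It also shows two general properties of `zip`: unpacking the pairs with `*` reverses the operation, and `zip` stops at the shorter input.

```python
# Sample values copied from the first three records shown above.
city_names = ['Malé', 'Manila', 'Bogor']
densities = [203846.0, 111576.0, 104037.0]

# zip pairs items by position; list() materializes the lazy zip object.
pairs = list(zip(city_names, densities))

# Unpacking the pairs with * and zipping again separates them back out.
names_again, densities_again = zip(*pairs)

# zip stops at the shorter input, so the extra names are silently dropped.
short = list(zip(city_names, [203846.0]))

print(pairs)
print(names_again)
print(short)
```

Because of that last behavior, it is worth checking that both lists have the same length before zipping them.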

Ok, and from here, we'll print out the name and density for the first three tuples.

In [9]:
three_city_densities = city_densities[:3]

for city, density in three_city_densities:
    print('City:', city, 'Density', density)
    print(f'City: {city} Density: {density}')

City: Malé Density 203846.0
City: Malé Density: 203846.0
City: Manila Density 111576.0
City: Manila Density: 111576.0
City: Bogor Density 104037.0
City: Bogor Density: 104037.0
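
The `for city, density in ...` line relies on tuple unpacking. As a rough sketch, the block below compares it with indexing into each tuple by position, using the three tuples shown above as hardcoded sample data. The list names `indexed_lines` and `unpacked_lines` are only for illustration.

```python
three_city_densities = [('Malé', 203846.0), ('Manila', 111576.0), ('Bogor', 104037.0)]

# Without unpacking: index into each tuple by position.
indexed_lines = []
for pair in three_city_densities:
    indexed_lines.append(f'City: {pair[0]} Density: {pair[1]}')

# With unpacking: each tuple is split into two named variables.
unpacked_lines = []
for city, density in three_city_densities:
    unpacked_lines.append(f'City: {city} Density: {density}')

print(unpacked_lines[0])
print(indexed_lines == unpacked_lines)
```

Both loops build the same strings; the unpacked version simply gives each position a readable name.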


Even better would be to create a list of dictionaries where the first attribute is the city and the second attribute is the density.  Do this for each of our `city_densities` and store the dictionaries in a list called `densities`.

In [10]:
densities = [{'city': city, 'density': density} for city, density in city_densities]

densities[:3]

# [{'city': 'Malé', 'density': 203846.0},
#  {'city': 'Manila', 'density': 111576.0},
#  {'city': 'Bogor', 'density': 104037.0}]

[{'city': 'Malé', 'density': 203846.0},
 {'city': 'Manila', 'density': 111576.0},
 {'city': 'Bogor', 'density': 104037.0}]

We can also build each dictionary by zipping a list of keys together with each tuple of values.

In [11]:
keys = ['city', 'density']
densities = [dict(zip(keys, city)) for city in city_densities]

densities[:3]

[{'city': 'Malé', 'density': 203846.0},
 {'city': 'Manila', 'density': 111576.0},
 {'city': 'Bogor', 'density': 104037.0}]
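
To see why `dict(zip(keys, city))` works, it can help to look at a single tuple. The following is a small sketch with one hardcoded tuple; `key_value_pairs` and `record` are illustrative names, not part of the lab.

```python
keys = ['city', 'density']
city = ('Malé', 203846.0)

# zip lines each key up with the value in the same position.
key_value_pairs = list(zip(keys, city))

# dict() turns each (key, value) pair into one entry.
record = dict(key_value_pairs)

print(key_value_pairs)
print(record)
```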

### Working with dictionaries

Ok, now let's go back to our original list of dictionaries, and look at the first city.

In [12]:
first_city = cities[0]

first_city

{'city': 'Malé',
 'population': '153,904[1]',
 'area_mi': '1.956[1]',
 'density_mi': 203846.0,
 'country': 'Maldives'}

Iterate through the key-value pairs of `first_city`, and create a new list of tuples with the information for the `city` and `country` keys.  Store this in the `selected_attrs` list.

In [19]:
selected_attrs = []

for k, v in first_city.items():
    if k == 'city' or k == 'country':
        selected_attrs.append((k, v))

selected_attrs

# [('city', 'Malé'), ('country', 'Maldives')]

[('city', 'Malé'), ('country', 'Maldives')]

Ok, and now coerce that list of tuples into a dictionary.

In [22]:
dict(selected_attrs)

{'city': 'Malé', 'country': 'Maldives'}
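
`dict()` accepts any iterable of two-item pairs and treats the first item of each pair as the key. The sketch below uses hardcoded sample values and illustrative names (`nested`, `surprising`, `per_city`) to show one thing to watch for: if we pass a list of lists of pairs, each inner list is itself read as a single pair, so the tuples end up as keys and values.

```python
pairs = [('city', 'Malé'), ('country', 'Maldives')]

# A flat list of (key, value) pairs becomes one dictionary.
as_dict = dict(pairs)

# A list of lists: each inner list has two items, so dict() reads
# the first tuple as the key and the second tuple as the value.
nested = [[('city', 'Malé'), ('country', 'Maldives')],
          [('city', 'Manila'), ('country', 'Philippines')]]
surprising = dict(nested)

# Calling dict() on each inner list gives one dictionary per city.
per_city = [dict(inner) for inner in nested]

print(as_dict)
print(surprising)
print(per_city)
```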

* Do for all

Now if we want, we can loop through all of our cities, and for each city loop through the list of items.

In [23]:
selected_cities = []
for city in cities:
    selected_k_vs = []
    for k, v in city.items():
        if k in ['city', 'country']:
            selected_k_vs.append((k, v))
    selected_cities.append(dict(selected_k_vs))

selected_cities[:3]
# [{'city': 'Malé', 'country': 'Maldives'},
#  {'city': 'Manila', 'country': 'Philippines'},
#  {'city': 'Bogor', 'country': 'Indonesia'}]

[{'city': 'Malé', 'country': 'Maldives'},
 {'city': 'Manila', 'country': 'Philippines'},
 {'city': 'Bogor', 'country': 'Indonesia'}]

But generally, we should avoid nested loops unless we absolutely need them -- it's just too difficult to keep track of all of those variables floating around.

So instead, we can use our earlier approach.  Use list comprehensions to create a list of city names and a list of countries.  Then we can zip them together, and turn each resulting tuple into a dictionary.

In [24]:
city_names = [city['city'] for city in cities]
country_names = [city['country'] for city in cities]

city_countries = list(zip(city_names, country_names))
city_countries[:2]

[('Malé', 'Maldives'), ('Manila', 'Philippines')]

In [25]:
abbrev_cities = []
for city, country in city_countries:
    abbrev_cities.append({'city': city, 'country': country})

abbrev_cities[:3]

[{'city': 'Malé', 'country': 'Maldives'},
 {'city': 'Manila', 'country': 'Philippines'},
 {'city': 'Bogor', 'country': 'Indonesia'}]

Doesn't that look easier?  Instead of looping through our key value pairs, we just select the relevant list of values, zip them together, and then iterate through the list of tuples.

And we could even create the dictionary by zipping together the keys and values.

In [26]:
abbrev_cities = []
keys = ['city', 'country']
for vals in city_countries:
    abbrev_cities.append(dict(zip(keys, vals)))

abbrev_cities[:3]

[{'city': 'Malé', 'country': 'Maldives'},
 {'city': 'Manila', 'country': 'Philippines'},
 {'city': 'Bogor', 'country': 'Indonesia'}]
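
As an alternative that this lab does not use, a dictionary comprehension can pick the wanted keys directly from each record. This is only a sketch: the two records below are a trimmed, hardcoded stand-in for `cities`, and it assumes every record contains every key in `keys` (a missing key would raise a `KeyError`).

```python
# Hardcoded stand-in for the first two records of `cities`.
cities = [
    {'city': 'Malé', 'density_mi': 203846.0, 'country': 'Maldives'},
    {'city': 'Manila', 'density_mi': 111576.0, 'country': 'Philippines'},
]
keys = ['city', 'country']

# For each record, keep only the key-value pairs named in `keys`.
abbrev_cities = [{k: city[k] for k in keys} for city in cities]

print(abbrev_cities)
```

This is still a loop inside a loop, just written compactly, so the zip-based pattern above may remain easier to read when only two attributes are involved.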

### Summary

In this lesson, we worked with zipping and unpacking, and saw how we can use this to create dictionaries with fewer attributes than our original dictionaries.  Our pattern for doing this was to select the relevant attributes into two separate lists, zip them together, and then iterate through the resulting tuples.

In [None]:
city_names = [city['city'] for city in cities]
country_names = [city['country'] for city in cities]

city_countries = list(zip(city_names, country_names))

abbrev_cities = []
for city, country in city_countries:
    abbrev_cities.append({'city': city, 'country': country})

We also saw how we can -- if we need to -- iterate through the key value pairs of a single dictionary.  We do that with something like the following:

In [None]:
first_city = cities[0]

selected_attrs = []
for k, v in first_city.items():
    if k == 'city' or k == 'country':
        selected_attrs.append((k, v))

selected_attrs

[('city', 'Malé'), ('country', 'Maldives')]

In [None]:
dict(selected_attrs)

{'city': 'Malé', 'country': 'Maldives'}