# Review:

## Data Structures

Let’s review *all* the built-in Python **data structures** that we have learned, and yes, you now know them all! Way to go!!

Data structures store a collection of data and they are iterable, meaning you can loop through each element. The types of data structures give programmers a variety of options when dealing with a collection of data.
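
As a quick illustration of what "iterable" means, here is a minimal sketch in which the same `for` loop walks through one small example of each built-in structure. The values are made up for the example.

```python
# One small example of each built-in data structure (made-up values).
my_list = ["a", "b", "c"]
my_tuple = ("a", "b", "c")
my_set = set(["a", "b", "c"])
my_dictionary = {"a": 1, "b": 2, "c": 3}

# The same for loop works on every one of them.
for structure in (my_list, my_tuple, my_set, my_dictionary):
    for element in structure:
        print(element)  # looping over a dictionary gives its keys
```

The set may print its items in a different order than you typed them, because sets are unordered.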

## Lists

A ***List*** is a collection of items in a specific order that is mutable, or changeable.  Square brackets `[ ]` indicate a list, and empty brackets can define an empty list. `my_list = []`
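
A short sketch of what "mutable" means in practice: the list below starts empty, grows, and then has one item replaced in place. The numbers are only sample values.

```python
my_list = []              # start with an empty list
my_list.append(34.05)     # add an item to the end
my_list.append(-118.33)   # add another one
my_list[0] = 34.06        # replace the first item in place
print(my_list)            # [34.06, -118.33]
```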


## Tuples

A ***Tuple*** is a collection of items in a specific order that is immutable, or unchangeable.  Parentheses `( )` usually indicate a Tuple, and empty parentheses can define an empty tuple. `my_tuple = ()`
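
The sketch below shows immutability in action, and why parentheses only *usually* indicate a tuple: it is the comma that makes a one-item tuple. The variable names `one_item` and `not_a_tuple` are just for illustration.

```python
my_tuple = (34.05544, -118.33132)   # a latitude and longitude pair
print(my_tuple[0])                  # reading an item works just like a list

try:
    my_tuple[0] = 0.0               # tuples do not allow item assignment
except TypeError as error:
    print("Cannot change a tuple:", error)

one_item = (5,)     # a one-item tuple needs the trailing comma
not_a_tuple = (5)   # without the comma this is just the number 5
print(type(one_item), type(not_a_tuple))
```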

## Sets

A ***Set*** is a collection of items that is unordered and mutable, or changeable, but has no duplicates. Another way to think of a set is as the mathematical notion of a set.  One way to create a set is to apply the `set()` function to a list, either a list written out in square brackets `[ ]` inside the parentheses `( )` or a list variable: `my_set = set([1,2,3,4,5,6])` or `my_set = set(my_list)`. Braces with items inside, such as `{1, 2, 3}`, also create a set, but empty braces `{}` create a dictionary, so an empty set is written `set()`.
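
Here is a small sketch of the "no duplicates" rule, using a made-up list with repeated numbers:

```python
my_list = [1, 2, 2, 3, 3, 3]
my_set = set(my_list)               # duplicates are dropped

print(len(my_list), len(my_set))    # 6 3

my_set.add(4)    # sets are mutable, so we can add a new item
my_set.add(1)    # adding an item that is already there changes nothing
print(sorted(my_set))               # [1, 2, 3, 4]
```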

## Dictionaries

A ***Dictionary*** is a group of items; each item consists of a *key* and a *value*. Braces `{ }` indicate a dictionary, and empty braces can define an empty dictionary: `my_dictionary = {}`
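
A minimal sketch of keys and values, using the coordinates and zip code of our map marker. The dictionary name `school` and its key names are made up for this example.

```python
school = {"latitude": 34.05544, "longitude": -118.33132}

school["zip_code"] = "90019"    # add a new key and value
print(school["latitude"])       # look up a value by its key

# Loop through each key and value together.
for key, value in school.items():
    print(key, value)
```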

----

# Other Data Structures (they are not built in and need to be imported)

## Pandas

We have used pandas, though I have only touched on it, as it is quite complicated yet unavoidable for the data required for our project.  We will not go that deep into pandas, as we could spend a full semester going through just some of the details.  A general note about pandas data structures: the main one is called a DataFrame, and it is tabular, meaning it is a table. These tables can import data from many sources and formats including, but in no way limited to, SQL databases, JSON files, CSV, Excel, HDF5, SAS, Pickle, Stata, Google BigQuery, etc.

You can also convert lists, tuples, sets, and dictionaries to data frames.  I have created the table below to show you the various properties of all the data structures we have talked about:

In [61]:
import pandas as pd

my_pandas = {'Data Structure' : ['Lists', 'Tuples', 'Sets', 'Dictionaries', 'Pandas'], 
             'Mutability' : ['mutable', 'immutable', 'mutable', 'mutable', 'mutable'],
             'ordered' : ['ordered', 'ordered', 'unordered', 'unordered-ish', 'ordered'],
             'duplicates' : ['duplicates', 'duplicates', 'no duplicates', 'duplicates', 'duplicates'],
             'defined with' : ['[ ]','( )','set(list) or set([ ])','{ }','pandas package']}

my_pandas_df = pd.DataFrame.from_dict(my_pandas).set_index(keys='Data Structure')
my_pandas_df

| Data Structure | Mutability | ordered | duplicates | defined with |
|---|---|---|---|---|
| Lists | mutable | ordered | duplicates | [ ] |
| Tuples | immutable | ordered | duplicates | ( ) |
| Sets | mutable | unordered | no duplicates | set(list) or set([ ]) |
| Dictionaries | mutable | unordered-ish | duplicates | { } |
| Pandas | mutable | ordered | duplicates | pandas package |


## Adding Shapes to a Map

Last week we used a list of tuples to draw a line on our map: 

In [55]:
import folium
from folium import plugins

GALA_Address = '1067 West Blvd, Los Angeles, CA 90019'
gala_lat = 34.05544
gala_long = -118.33132

gala_map = folium.Map(location=[gala_lat, gala_long],
                        zoom_start=14,
                        tiles="CartoDB positron")

folium.Marker(location=[gala_lat, gala_long],
              icon=folium.Icon(color='black', icon='graduation-cap', prefix='fa'),
              popup=GALA_Address).add_to(gala_map)

line = [(34.0677263, -118.35418),   # each point is (latitude, longitude)
        (34.0677263, -118.34418),
        (34.06242, -118.34418),
        (34.07242, -118.34418),
        (gala_lat, gala_long),
        (34.0677263, -118.35418)]

folium.PolyLine(line, color="purple", weight=2.5, opacity=1).add_to(gala_map)

gala_map

# Let's Add a Zip Code Boundary 

In [39]:
my_zip_90036 = [(-118.361375, 34.062967), (-118.361452, 34.080191), 
                (-118.361417, 34.083811), (-118.347435, 34.083505), 
                (-118.346216, 34.081852), (-118.345149, 34.083455), 
                (-118.34405, 34.083431), (-118.338583, 34.083457), 
                (-118.338496, 34.080292), (-118.337995, 34.08029399999999), 
                (-118.337734, 34.068981), (-118.337348, 34.068911), 
                (-118.337763, 34.066905), (-118.337149, 34.064928), 
                (-118.338469, 34.064925), (-118.338464, 34.062115), 
                (-118.336545, 34.062077), (-118.337054, 34.060564), 
                (-118.337616, 34.060423), (-118.338167, 34.058981), 
                (-118.339393, 34.057063), (-118.353539, 34.057385), 
                (-118.354642, 34.054199), (-118.355409, 34.054467), 
                (-118.354739, 34.05741), (-118.359856, 34.057519), 
                (-118.360966, 34.056668), (-118.363818, 34.058183), 
                (-118.361375, 34.062967)]

# Houston, we have a problem!

Can anyone see the problem?

### Thankfully, data structures are iterable, including tuples inside lists (or lists of tuples):

For example, let's print each tuple by looping through the list:

In [56]:
for cord in my_zip_90036:
    print(cord)

(-118.361375, 34.062967)
(-118.361452, 34.080191)
(-118.361417, 34.083811)
(-118.347435, 34.083505)
(-118.346216, 34.081852)
(-118.345149, 34.083455)
(-118.34405, 34.083431)
(-118.338583, 34.083457)
(-118.338496, 34.080292)
(-118.337995, 34.08029399999999)
(-118.337734, 34.068981)
(-118.337348, 34.068911)
(-118.337763, 34.066905)
(-118.337149, 34.064928)
(-118.338469, 34.064925)
(-118.338464, 34.062115)
(-118.336545, 34.062077)
(-118.337054, 34.060564)
(-118.337616, 34.060423)
(-118.338167, 34.058981)
(-118.339393, 34.057063)
(-118.353539, 34.057385)
(-118.354642, 34.054199)
(-118.355409, 34.054467)
(-118.354739, 34.05741)
(-118.359856, 34.057519)
(-118.360966, 34.056668)
(-118.363818, 34.058183)
(-118.361375, 34.062967)


## `sorted()` function to the rescue:

In [57]:
my_sorted_90036 = []

for cord in my_zip_90036:
    my_sorted_90036.append(sorted(cord))
    
my_sorted_90036

[[-118.361375, 34.062967],
 [-118.361452, 34.080191],
 [-118.361417, 34.083811],
 [-118.347435, 34.083505],
 [-118.346216, 34.081852],
 [-118.345149, 34.083455],
 [-118.34405, 34.083431],
 [-118.338583, 34.083457],
 [-118.338496, 34.080292],
 [-118.337995, 34.08029399999999],
 [-118.337734, 34.068981],
 [-118.337348, 34.068911],
 [-118.337763, 34.066905],
 [-118.337149, 34.064928],
 [-118.338469, 34.064925],
 [-118.338464, 34.062115],
 [-118.336545, 34.062077],
 [-118.337054, 34.060564],
 [-118.337616, 34.060423],
 [-118.338167, 34.058981],
 [-118.339393, 34.057063],
 [-118.353539, 34.057385],
 [-118.354642, 34.054199],
 [-118.355409, 34.054467],
 [-118.354739, 34.05741],
 [-118.359856, 34.057519],
 [-118.360966, 34.056668],
 [-118.363818, 34.058183],
 [-118.361375, 34.062967]]

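# Wait, what happened?

`sorted()` returns a new list in ascending order by default, so each tuple became a list but the negative longitude still comes first in every pair. Passing `reverse=True` sorts in descending order instead, which puts the positive latitude first: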

In [59]:
my_sorted_90036 = []

for cord in my_zip_90036:
    my_sorted_90036.append(sorted(cord, reverse=True))

my_sorted_90036

[[34.062967, -118.361375],
 [34.080191, -118.361452],
 [34.083811, -118.361417],
 [34.083505, -118.347435],
 [34.081852, -118.346216],
 [34.083455, -118.345149],
 [34.083431, -118.34405],
 [34.083457, -118.338583],
 [34.080292, -118.338496],
 [34.08029399999999, -118.337995],
 [34.068981, -118.337734],
 [34.068911, -118.337348],
 [34.066905, -118.337763],
 [34.064928, -118.337149],
 [34.064925, -118.338469],
 [34.062115, -118.338464],
 [34.062077, -118.336545],
 [34.060564, -118.337054],
 [34.060423, -118.337616],
 [34.058981, -118.338167],
 [34.057063, -118.339393],
 [34.057385, -118.353539],
 [34.054199, -118.354642],
 [34.054467, -118.355409],
 [34.05741, -118.354739],
 [34.057519, -118.359856],
 [34.056668, -118.360966],
 [34.058183, -118.363818],
 [34.062967, -118.361375]]

In [60]:
gala_map = folium.Map(location=[gala_lat, gala_long],
                        zoom_start=14,
                        tiles="CartoDB positron")

folium.Marker(location=[gala_lat, gala_long],
              icon=folium.Icon(color='black', icon='graduation-cap', prefix='fa'),
              popup=GALA_Address).add_to(gala_map)

folium.PolyLine(my_sorted_90036, color="purple", weight=2.5, opacity=1).add_to(gala_map)

gala_map
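
One caution about the sorting trick: it works for this zip code only because every latitude here is positive and every longitude is negative. A sketch of a more general approach is to swap the two numbers by position instead of by size. The short `lon_lat` list below holds just the first two points of `my_zip_90036`, and the Tokyo point is a rounded, illustrative value.

```python
# First two points of the boundary, stored as (longitude, latitude).
lon_lat = [(-118.361375, 34.062967), (-118.361452, 34.080191)]

# Unpack each tuple and rebuild it as (latitude, longitude).
lat_lon = [(lat, lon) for lon, lat in lon_lat]
print(lat_lon)

# A point near Tokyo (rounded), where longitude is the larger number.
tokyo_lon_lat = (139.69, 35.68)
print(sorted(tokyo_lon_lat, reverse=True))   # longitude is still first
print(tokyo_lon_lat[::-1])                   # swapping by position works
```

Swapping by position keeps each point as a tuple and does not depend on which hemisphere the coordinates are in.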

# Shape Files `.shp`

For almost any boundary or border in the world, there is probably a coordinates file, similar to our list of tuples, in the form of a **shape file**, somewhere on the internet. A common format of these files is the `.shp` file extension.  

In the case of the United States, and most Western countries, there are usually government websites and APIs that store and share the precise data.  Often, as in the United States, many agencies both use the data and provide it to the public.  For zip codes, I first downloaded the data from the United States Census, which keeps an accurate record via ZCTAs, or ZIP Code Tabulation Areas:

https://www.census.gov/geo/reference/webatlas/zctas.html

However, a great repository for data provided by the US government is DATA.GOV:

https://catalog.data.gov/dataset/zip-codes-zipcodes

In addition to zip codes, you can get almost all of the US data for coding projects through the site or other government sites that DATA.GOV links to. Pretty cool!

All of the states have their own state data sites and APIs.  BTW, API stands for **Application Programming Interface**, which is a set of clearly defined methods of communication, in the form of functions and procedures, that allow programs to exchange data with an operating system, application, or other computer service.

Or, to put it another way, APIs are a way for computers to talk to each other without the exchange needing to be human readable.
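
Many web APIs send their data back as JSON text, which Python can turn into the dictionaries and lists we just reviewed. The sketch below parses a made-up response string. The field names `zip_code` and `boundary` are hypothetical and are not taken from any real government API; the two coordinate pairs are borrowed from our 90036 list.

```python
import json

# Hypothetical JSON text, standing in for what a web API might send back.
response_text = (
    '{"zip_code": "90036", '
    '"boundary": [[-118.361375, 34.062967], [-118.361452, 34.080191]]}'
)

data = json.loads(response_text)    # JSON object becomes a Python dictionary
print(type(data))
print(data["zip_code"])

# The boundary is a list of [longitude, latitude] lists we can loop through.
for lon, lat in data["boundary"]:
    print(lat, lon)
```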
