# 1: Ordered Collections

## Lists

A Python list is an ordered collection of items.

```python
[item1, item2, ..., itemN]
```


Each item can be of any type.

Example: 

In [247]:
[2.0,9.1,12.5]

[2.0, 9.1, 12.5]

In [248]:
x=[2.0,9.1,12.5]
print("x has type", type(x))
x

x has type <class 'list'>


[2.0, 9.1, 12.5]

### What can we do with lists?

We can access items in a list called `mylist` using `mylist[N]` where `N` is an integer.

Note: Python starts counting at zero, so the first item is `mylist[0]`.

In [249]:
x[1]

9.1

In [250]:
x[0]

2.0

In [251]:
len(x)  # number of items in x

3

In [252]:
# x[4]  # an index equal to or greater than len(x) is out of range
# and raises an IndexError
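
A minimal sketch of what the out-of-range access above does, assuming the same three-element list. The `try`/`except` wrapper and the name `caught` are only there for illustration, so the error can be observed without stopping the notebook.

```python
x = [2.0, 9.1, 12.5]

# Valid indices run from 0 to len(x) - 1, so the last valid index here is 2.
last_item = x[len(x) - 1]
print(last_item)

# Index 4 is past the end of the list, so Python raises an IndexError.
try:
    x[4]
    caught = False
except IndexError as err:
    caught = True
    print("IndexError:", err)
```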

In [253]:
2.0 in x

True

In [254]:
1.5 in x  # check whether the element is in the list

False

In [255]:
x.reverse()  # reverse the order of the list in place
x

[12.5, 9.1, 2.0]

In [256]:
number_list=[10,25,42,1.0]
print(number_list)
number_list.sort()  # sort in place; numbers go from smallest to largest
print(number_list)

[10, 25, 42, 1.0]
[1.0, 10, 25, 42]


In [257]:
str_list=["NY","AZ","TX"]
print(str_list)
str_list.sort()
print(str_list)

['NY', 'AZ', 'TX']
['AZ', 'NY', 'TX']


`append` adds a single element to the end of an existing list.

In [258]:
num_list=[10,25,42,8]
print(num_list)
num_list.append(10)
print(num_list)

[10, 25, 42, 8]
[10, 25, 42, 8, 10]


If we pass a list to `append`, the list itself is added as a single element, rather than the numbers it contains.

In [259]:
num_list=[10,25,42,8]
print(num_list)
num_list.append([20,4])
print(num_list)

[10, 25, 42, 8]
[10, 25, 42, 8, [20, 4]]


If we want to combine two lists, we can use `extend`.

In [260]:
num_list=[10,25,41,8]
print(num_list)
num_list.extend([20,4])
print(num_list)

[10, 25, 41, 8]
[10, 25, 41, 8, 20, 4]


## Lists of Different Types

Let's make a small change to the list above: replace `2.0` with `2`, which changes that element from a float to an integer.

In [261]:
x=[2,9.1,12.5]

In [262]:
import numpy as np
x=[2,9.1,12.5]
np.mean(x)  # NumPy provides a function to calculate the mean
np.mean(x)==sum(x)/len(x)

True

In [263]:
# The type of the list itself is still `list`, whatever its elements are
x=[2,"Hello",3.0]
print("x has type",type(x))
x

x has type <class 'list'>


[2, 'Hello', 3.0]

In [264]:
# To see the types of individual elements in the list
print(f"type(x[0])={type(x[0])},type(x[1])={type(x[1])},type(x[2])={type(x[2])}")

type(x[0])=<class 'int'>,type(x[1])=<class 'str'>,type(x[2])=<class 'float'>


We can't sort a list whose elements have types that can't be compared with each other, such as `int` and `str`. Mixing `int` and `float` is fine.

In [265]:
# x=[2,"hello",3.0]
# x.sort()  # raises a TypeError: int and str can't be compared
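
As a rough illustration of the rule above, the sketch below tries to sort a list mixing `int`, `str`, and `float`, and then a list mixing only `int` and `float`. The names `mixed`, `numbers`, and `sort_failed` are made up for this example.

```python
mixed = [2, "hello", 3.0]

# Comparing an int with a str is not supported, so sort() raises a TypeError.
try:
    mixed.sort()
    sort_failed = False
except TypeError as err:
    sort_failed = True
    print("TypeError:", err)

# int and float can be compared with each other, so this list sorts fine.
numbers = [10, 25, 42, 1.0]
numbers.sort()
print(numbers)
```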

## The `range` Function

The `range` function has three versions:

1. `range(N)`: goes from 0 to N-1  
1. `range(a, N)`: goes from a to N-1  
1. `range(a, N, d)`: goes from a up to (but not including) N, counting by d  
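
The three versions can be sketched as follows. The specific arguments are arbitrary choices for illustration, and each `range` is wrapped in `list` only so that its contents can be printed.

```python
# range(N): 0 up to N-1
first = list(range(5))
print(first)   # [0, 1, 2, 3, 4]

# range(a, N): a up to N-1
second = list(range(2, 5))
print(second)  # [2, 3, 4]

# range(a, N, d): start at a, step by d, stop before reaching N
third = list(range(1, 10, 3))
print(third)   # [1, 4, 7]
```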


When we call the `range` function, we get back something that has type `range`:

In [266]:
r=range(5)
print(type(r))

<class 'range'>


In [267]:
list(r)  # turn the `range` into a list

[0, 1, 2, 3, 4]

## Tuples

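Tuples are very similar to lists, with a few key differences:

1: They are written with parentheses `()` instead of square brackets `[]`

2: They are immutable: they can't be changed or altered after they are created

3: Tuples and multiple return values from functions are tightly connected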

In [268]:
t=(1,"hello",3.0)
print("t is a", type(t))
t

t is a <class 'tuple'>


(1, 'hello', 3.0)

In [269]:
# Convert list to tuple
print("x is a",type(x))
print("tuple(x) is a", type(tuple(x)))
tuple(x)

x is a <class 'list'>
tuple(x) is a <class 'tuple'>


(2, 'Hello', 3.0)

In [270]:
# Convert a tuple to a list
list(t)

[1, 'hello', 3.0]

In [271]:
t[0]

1

In [272]:
t[2]

3.0
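
Reading items from a tuple works as with lists, but assigning to an item does not. The sketch below illustrates immutability on the same tuple. The names `assignment_failed` and `t_new` are hypothetical helpers for the example.

```python
t = (1, "hello", 3.0)

# Tuples are immutable, so item assignment raises a TypeError.
try:
    t[0] = 5
    assignment_failed = False
except TypeError as err:
    assignment_failed = True
    print("TypeError:", err)

# To get a "modified" tuple, build a new one from pieces of the old one.
t_new = (5,) + t[1:]
print(t_new)  # (5, 'hello', 3.0)
```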

Tuples (and lists) can be unpacked directly into variables

In [273]:
x,y=(1,"test")
print(f"x={x},y={y}")

x=1,y=test
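
Unpacking is also how tuples connect to multiple return values: a function that returns several values separated by commas actually returns one tuple. A minimal sketch, where `min_and_max` is a hypothetical function written only for this example:

```python
def min_and_max(values):
    """Return the smallest and largest element of `values`."""
    # The comma builds a tuple, so the caller receives a single tuple object.
    return min(values), max(values)

result = min_and_max([9.607, 10.48, 11.06])
print(type(result))

# The returned tuple can be unpacked directly into two variables.
low, high = result
print(f"low={low}, high={high}")
```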


### List vs Tuple: Which to use?

This depends on what you are storing, whether you might need to
reorder the elements, and whether you would add new elements
without a complete reinterpretation of the underlying data.

Example:

In [274]:
china_data_2015=("China",2015,11.06,1.371)

print(china_data_2015)

('China', 2015, 11.06, 1.371)


Reasons for using a tuple here:

1: reordering the elements would be meaningless, since each position has a fixed meaning

2: adding more data would require a reinterpretation of the whole data structure

In [275]:
gdp_data=[9.607,10.48,11.06]
print(gdp_data)

[9.607, 10.48, 11.06]


In this case, we use a list, since appending a new element for GDP in 2016 to the end of the list would make complete sense.

In [276]:
china_data = [(2015, 11.06, 1.371), (2014, 10.48, 1.364), (2013, 9.607, 1.357)]
print(china_data)

[(2015, 11.06, 1.371), (2014, 10.48, 1.364), (2013, 9.607, 1.357)]


General rule: use a list unless you need to use a tuple.

The key criteria for using a tuple are when you want to:

- ensure the order of elements can't change
- ensure the actual values of the elements can't change
- use the collection as a key in a dict


## `zip` and `enumerate`

`zip` example:

In [277]:
gdp_data=[9.607,10.48,11.06]
years=[2013,2014,2015]
z=zip(years,gdp_data)
print("type(z)",type(z))

type(z) <class 'zip'>


In [278]:
list(z)

[(2013, 9.607), (2014, 10.48), (2015, 11.06)]

We unpack the resulting tuple directly into variables

In [279]:
l=list(zip(years,gdp_data))
x,y=l[0]
print(f"year={x},GDP={y}")

year=2013,GDP=9.607


`enumerate` example:

In [280]:
e=enumerate(["a","b","c"])
print("type(e)",type(e))
e

type(e) <class 'enumerate'>


<enumerate at 0x11066d750>

In [281]:
list(e)  # enumerate pairs each element with its index

[(0, 'a'), (1, 'b'), (2, 'c')]
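
In practice, `zip` and `enumerate` are most often used directly in a `for` loop rather than converted to a list. A small sketch, assuming the same `years` and `gdp_data` lists as above (the `lines` list is only there to collect the output):

```python
gdp_data = [9.607, 10.48, 11.06]
years = [2013, 2014, 2015]

lines = []
# enumerate supplies the index i; zip supplies the (year, gdp) pair,
# which is unpacked in the loop header.
for i, (year, gdp) in enumerate(zip(years, gdp_data)):
    lines.append(f"{i}: year={year}, GDP={gdp}")

print("\n".join(lines))
```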

Note:

`zip` and `enumerate` produce what is called an iterator. An iterator will only produce each of its elements a single time, so if you call `list` on the same iterator a second time, it will not have any elements left to iterate over. A `range` is different: it is a reusable sequence and can be iterated over as many times as needed.

In [282]:
gdp_data = [9.607, 10.48, 11.06]
years = [2013, 2014, 2015]
z = zip(years, gdp_data)
l = list(z)
print(l)
m = list(z)
print(m)

[(2013, 9.607), (2014, 10.48), (2015, 11.06)]
[]
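
It is worth checking which of these objects are actually single-use. The sketch below, run in a fresh session with made-up variable names, consumes a `range` and an `enumerate` object twice each: the `range` can be reused, while the `enumerate` object is exhausted after the first pass.

```python
# A range is a reusable sequence: both passes give the same elements.
r = range(3)
first_pass = list(r)
second_pass = list(r)
print(first_pass, second_pass)

# An enumerate object is a one-shot iterator: the second pass is empty.
e = enumerate(["a", "b"])
e_first = list(e)
e_second = list(e)
print(e_first, e_second)
```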


# 2: Associative Collections

## Dictionaries

A dictionary associates `key`s with `value`s.

It is similar to a dictionary for words, where the keys are words and the values are the associated definitions.

The most common way to create a `dict` is to use curly braces - `{` and `}` - like this:

```python
{"key1": value1, "key2": value2, ..., "keyN": valueN}
```


where the `...` indicates that we can have any number of additional terms.

Each key-value pair is written `key: value`, and the pairs are separated by commas (`,`).

Example:

In [283]:
china_data = {"country": "China", "year": 2015, "GDP" : 11.06, "population": 1.371}
print(china_data)

{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371}


In [284]:
china_data = {
    "country": "China",
    "year": 2015,
    "GDP" : 11.06,
    "population": 1.371
} # more readable

Most often, the keys (e.g. “country”, “year”, “GDP”, and “population”)
will be strings, but we could also use numbers (`int`, or
`float`) or even tuples (or, rarely, a combination of types).

The values can be **any** type and different from each other.


A value can even be another dictionary:

In [285]:
companies = {"AAPL": {"bid": 175.96, "ask": 175.98},
             "GE": {"bid": 1047.03, "ask": 1048.40},
             "TVIX": {"bid": 8.38, "ask": 8.40}}
print(companies)

{'AAPL': {'bid': 175.96, 'ask': 175.98}, 'GE': {'bid': 1047.03, 'ask': 1048.4}, 'TVIX': {'bid': 8.38, 'ask': 8.4}}
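
To read a value from a nested dictionary like this one, the lookups can be chained: the first key selects the inner dictionary and the second key selects a value inside it. A minimal sketch using the same data (the names `aapl_bid` and `aapl_spread` are introduced only for this example):

```python
companies = {"AAPL": {"bid": 175.96, "ask": 175.98},
             "GE": {"bid": 1047.03, "ask": 1048.40},
             "TVIX": {"bid": 8.38, "ask": 8.40}}

# companies["AAPL"] is itself a dict, so we can index into it again.
aapl_bid = companies["AAPL"]["bid"]
print(aapl_bid)

# Combine two nested values; rounding hides floating-point noise.
aapl_spread = round(companies["AAPL"]["ask"] - companies["AAPL"]["bid"], 2)
print(aapl_spread)
```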


### Getting, Setting, and Updating dict Items

We can ask for the value associated with a key using the syntax `d[k]`, where `d` is our `dict` and `k` is the key.


In [286]:
print(china_data["year"])
print(f"country = {china_data['country']}, population = {china_data['population']}")

2015
country = China, population = 1.371


If we ask for the value of a key that is not in the dict, we will get a `KeyError`.

In [287]:
# china_data["inflation"]
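
A minimal sketch of the failed lookup, assuming the `china_data` dict from above. The `try`/`except` wrapper and the names `lookup_failed`, `has_inflation`, and `has_year` are only for illustration. The `in` operator offers a way to test for a key before asking for its value.

```python
china_data = {"country": "China", "year": 2015, "GDP": 11.06, "population": 1.371}

# Asking for a key that does not exist raises a KeyError.
try:
    china_data["inflation"]
    lookup_failed = False
except KeyError as err:
    lookup_failed = True
    print("KeyError:", err)

# `in` checks whether a key is present without raising an error.
has_inflation = "inflation" in china_data
has_year = "year" in china_data
print(has_inflation, has_year)
```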

Adding new items to a dict using `d[new_key]=new_value`

In [288]:
print(china_data)
china_data["unemployment"] = "4.05%"
print(china_data)

{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371}
{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371, 'unemployment': '4.05%'}


To update the value, we use assignment in the same way (which will create the key and value as required)

In [289]:
print(china_data)
china_data["unemployment"]="4.051%"
print(china_data)

{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371, 'unemployment': '4.05%'}
{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371, 'unemployment': '4.051%'}


Or we could change the type

In [290]:
china_data["unemployment"]=4.051
print(china_data)

{'country': 'China', 'year': 2015, 'GDP': 11.06, 'population': 1.371, 'unemployment': 4.051}


### Common `dict` Functionality

In [291]:
# number of key-value pairs in a dict
len(china_data)

5

In [292]:
# get a list of all the keys
list(china_data.keys())

['country', 'year', 'GDP', 'population', 'unemployment']

In [293]:
# get a list of all the values
list(china_data.values())

['China', 2015, 11.06, 1.371, 4.051]

In [294]:
more_china_data = {"irrigated_land": 690_070, "top_religions": {"buddhist": 18.2, "christian" : 5.1, "muslim": 1.8}}

# Add all key-value pairs in more_china_data to china_data.
# If a key already appears in china_data, overwrite its
# value with the value in more_china_data
china_data.update(more_china_data)
china_data

{'country': 'China',
 'year': 2015,
 'GDP': 11.06,
 'population': 1.371,
 'unemployment': 4.051,
 'irrigated_land': 690070,
 'top_religions': {'buddhist': 18.2, 'christian': 5.1, 'muslim': 1.8}}

In [295]:
# Get the value associated with a key or return a default value.
# Use this to avoid the KeyError we saw above if you have a reasonable
# default value
china_data.get("irrigated_land", "Data Not Available")

690070

In [296]:
china_data.get("death_rate", "Data Not Available")

'Data Not Available'
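
Alongside `keys` and `values`, a dict also has an `items` method that yields `(key, value)` tuples, which is convenient for looping over both at once. A short sketch on a smaller copy of the data (the name `pairs` is made up for the example):

```python
china_data = {"country": "China", "year": 2015, "GDP": 11.06}

# items() yields (key, value) tuples in insertion order.
pairs = list(china_data.items())
print(pairs)

# Each tuple can be unpacked directly in the loop header.
for key, value in china_data.items():
    print(f"{key}: {value}")
```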

## Sets

Sets are like the mathematical concept of sets.

Definition: A set is an unordered collection of unique elements

In [297]:
s= {1,"hello",3.0}
print("s has type",type(s))
s

s has type <class 'set'>


{1, 3.0, 'hello'}

We can check the length of a set and whether an element is in it:

In [298]:
print("len(s)=",len(s))
"hello" in s

len(s)= 3


True

Unlike lists and tuples, we can’t extract elements of a set `s` using
`s[N]` where `N` is a number, since a set is unordered.

We can add elements to a set using `s.add`:

In [299]:
s.add(100)
s

{1, 100, 3.0, 'hello'}

In [300]:
s.add("Hello")
s  # "Hello" is added because it differs from "hello"; adding an element already in the set would change nothing

{1, 100, 3.0, 'Hello', 'hello'}

We can also do set operations.

Consider the set `s` from above and the set
`s2 = {"hello", "world"}`.

- `s.union(s2)`: returns a set with all elements in either `s` or
  `s2`  
- `s.intersection(s2)`: returns a set with all elements in both `s`
  and `s2`  
- `s.difference(s2)`: returns a set with all elements in `s` that
  aren’t in `s2`  
- `s.symmetric_difference(s2)`: returns a set with all elements in
  only one of `s` and `s2`  
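
The four operations can be sketched on the sets described above, assuming `s` holds the five elements it has after the two `add` calls. Sets are unordered, so the printed order of elements may differ from run to run.

```python
s = {1, "hello", 3.0, 100, "Hello"}
s2 = {"hello", "world"}

union = s.union(s2)                    # elements in either set
intersection = s.intersection(s2)      # elements in both sets
difference = s.difference(s2)          # elements in s but not in s2
sym_diff = s.symmetric_difference(s2)  # elements in exactly one of the sets

print(union)
print(intersection)
print(difference)
print(sym_diff)
```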

`set` converts other collections, such as lists and tuples, to a set, dropping duplicates. `list` and `tuple` convert a set back:

In [301]:
x = [1, 2, 3, 1]
set(x)

{1, 2, 3}

In [302]:
t = (1, 2, 3, 1)
set(t)

{1, 2, 3}

In [303]:
list(s)

[1, 'Hello', 3.0, 100, 'hello']

In [304]:
tuple(s)

(1, 'Hello', 3.0, 100, 'hello')