Importing the packages you need in the top cell lets other people know which libraries they need to install before they can run your code.

In [None]:
import pandas as pd
import operator
import time
import seaborn as sns

### Syntax  - when do we use these?
 - `()` - when calling a function or method; to create a tuple
 - `{}` - when creating a dictionary or set 
 - `[]` - when indexing or slicing a sequence; to create a list
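
A quick sketch of each bracket type in action. The variable names and values here are made up for illustration and are not used elsewhere in this notebook.

```python
# () - call a function, or create a tuple
length = len("iris")              # calling the built-in len() function
point = (3, 4)                    # a tuple with two items

# {} - create a dictionary (key: value pairs) or a set (values only)
ages = {"Ari": 30, "Pam": 41}     # dictionary (hypothetical values)
primes = {2, 3, 5}                # set
empty = {}                        # careful: empty braces make a dict, not a set

# [] - create a list, or index / slice a sequence
colors = ["red", "yellow", "blue"]
first = colors[0]                 # indexing: one item
first_two = colors[:2]            # slicing: a new list with the first two items

print(type(point), type(ages), type(primes), type(empty), type(colors))
```

Note that `{}` on its own is an empty dictionary; an empty set has to be written `set()`.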

### call the `ctime()` function from the `time` module


In [None]:
time.ctime()

### [Python Collections](https://www.w3schools.com/python/python_lists.asp)

#### a _tuple_ is an ordered, immutable sequence (written with or without parentheses)


In [None]:
tuple1 = (90, 180, 270, 360)
tuple2 = 'Mahesh', 'Michael'
tuple4 = tuple()

You can use the `type()` function to find out what type a variable is in Python

In [None]:
type(tuple1)

In [None]:
print('tuple1 is a ', type(tuple1), 'Ben!')
print('tuple2 is a ', type(tuple2))

In [None]:
tuple2

tuples can be concatenated using the `+` operator

In [None]:
tuple3 = tuple1 + tuple1

In [None]:
tuple3

You can index a tuple with `[]` to see what is at a particular position. Remember that Python is zero-indexed, so the first item is at position 0!

In [None]:
tuple3[1]

Below is one way to print an object's documentation in Python

In [None]:
print(tuple.__doc__)

### your turn
- type `tuple3.` and press `tab` to see the tuple methods
- try each method in turn in the cell below to see it at work
    - `count()` will count how many times a given value appears in the tuple - try it with 180
    - try using the `index()` method, passing it 180 as the input or argument
- try typing `tuple3.index?` for help understanding how to use the `index()` method

In [None]:
tuple3.index?

#### immutability - what happens when we try to assign a new value to the first item in `tuple3` ?

In [None]:
tuple3[0] = 360
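
Assigning to a position in a tuple raises a `TypeError`. Here is a small sketch that catches the error so the cell keeps running, then builds a new tuple instead. The `angles` tuple is a stand-in created just for this example.

```python
angles = (90, 180, 270, 360)   # stand-in tuple for this example

try:
    angles[0] = 360            # tuples do not allow item assignment
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

# to "change" a tuple, build a new one from pieces of the old one
updated = (360,) + angles[1:]
print(updated)
print(angles)                  # the original tuple is unchanged
```

The trailing comma in `(360,)` is what makes it a one-item tuple rather than just a number in parentheses.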

#### a _list_ is ordered and mutable

In [None]:
list1 = ['red', 'yellow', 'blue', 'green', 'red', 'green']
list2 = ['Amanda', 'Media', 'Ari', 'Jacob', 'Pam', 'Cristina']

In [None]:
print('list1 is a ', type(list1))
print('list2 is a ', type(list2))

lists can also be concatenated with the `+` operator

In [None]:
list3 = list1 + list2

In [None]:
list3

use indexing -- `[]` -- to find the first value of `list3`

In [None]:
list3[0]

#### mutability - can we assign a new value to the first item in the list?

In [None]:
list3[0] = 'purple'

In [None]:
list3

In [None]:
# type list3. and press tab to see list methods
# (delete the trailing dot before running the cell; `list3.` on its own is a syntax error)
list3

#### `pop()` removes and returns the last item from a list, *or* you can pass the index of the item to remove: `pop(2)` removes the third element

In [None]:
list3.pop()
list3
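
A small sketch of both ways to call `pop()`, using a throwaway list (`names`) so that `list3` is left alone:

```python
names = ['Amanda', 'Media', 'Ari', 'Jacob']   # throwaway list for this example

last = names.pop()     # no argument: removes and returns the last item
third = names.pop(2)   # index 2: removes and returns the third item

print(last)
print(third)
print(names)           # the list itself has been changed in place
```

Because `pop()` changes the list in place, running the same cell twice removes two items.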

### your turn
- remove the second `green` from list3

In [None]:
list3.pop(5)

In [None]:
list3.append('Amanda')
list3

In [None]:
list3.reverse()
list3

#### You can store the item you removed in a variable

In [None]:
last_one = list3.pop()
last_one

In [None]:
list3

#### lists are iterable 
- use a for-loop to iterate through `list3`, add an `s` to the end of each item, and collect the results in a new list


In [None]:
plurals = []

for thing in list3:
    #print(thing)
    plurals.append(thing + 's')
    #print(plurals)
    
plurals

#### list comprehensions are a compact way to iterate through a list
 - `[` `what to return in the new list` `for` `iterator` `in` `list` `]`

In [None]:
plurals2 = [item + 's' for item in list3]
plurals2
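
To see that a list comprehension is just a compact for-loop, here is a sketch that does the same job both ways and compares the results. The `colors` list is made up for this example.

```python
colors = ['red', 'yellow', 'blue']   # example list

# the long way: start with an empty list and append inside a loop
looped = []
for color in colors:
    looped.append(color.upper())

# the compact way: same pieces, rearranged onto one line
compact = [color.upper() for color in colors]

print(looped)
print(compact)
print(looped == compact)
```

Either form is fine; the comprehension is usually easier to read when the loop body is a single expression.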

### your turn
- write a list comprehension that adds a `?` at the end of every item in `list3`
- save the results in a variable called `questions`

In [None]:
questions = [item + '?' for item in list3]
questions

#### what do you think the next cell does?

In [None]:
favorite_colors = {}

In [None]:
type(favorite_colors)

#### you can create a dictionary from two lists by zipping them together and wrapping them with a `dict( )` constructor
- the first list that you pass to the constructor becomes the dictionary keys
- the second list you pass to `zip()` becomes the dictionary values, matched to the keys in the same order

In [None]:
favorite_colors = dict(zip(list2, list1))
favorite_colors
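
It can help to look at what `zip()` produces before it goes into `dict()`. This is a sketch with two short made-up lists (`names` and `colors`), not the lists used above.

```python
names = ['Ari', 'Pam', 'Jacob']      # made-up keys
colors = ['blue', 'red', 'green']    # made-up values

# zip pairs items up by position; list() lets us look at the pairs
pairs = list(zip(names, colors))
print(pairs)

# dict() turns each (key, value) pair into a dictionary entry
lookup = dict(pairs)
print(lookup['Pam'])

# zip stops at the end of the shorter input, so extra keys are silently dropped
short = dict(zip(names, ['blue']))
print(short)
```

If the two lists have different lengths, no error is raised, so it is worth checking `len()` on both first.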

#### you can use a `key` to look up a `value`

In [None]:
favorite_colors['Cristina']

### your turn
- use a key with the `favorite_colors` dict to find Ari's favorite color

In [None]:
favorite_colors['Ari']

## Sometimes we want to turn a pandas series into a list

### Let's get a _list_ of unique iris species

In [None]:
# load the iris dataset from seaborn

iris_df = sns.load_dataset('iris')
print(iris_df.shape)
iris_df.head()

#### what datatype is iris_df.species?

In [None]:
type(iris_df.species)

#type(iris_df['species'])

#### you can call `.head( )` on a series just like you can on a data frame

In [None]:
iris_df.species.head()

#### you can even assign the series to a new variable

In [None]:
species = iris_df.species
type(species)

#### to convert it to a list, _wrap it_ with the `list( )` constructor

In [None]:
species_list = list(iris_df.species)

print(type(species_list))
print(species_list)

#### passing a list to a set constructor is a handy way to get unique list values

In [None]:
unique_species = set(species_list)
print(type(unique_species))
unique_species

#### notice that a set is enclosed with curly braces (`{ }`)
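
Sets drop duplicates but do not keep any particular order. If you need a list back in a predictable order, one option is `sorted()`. A sketch with a short made-up list:

```python
species = ['setosa', 'virginica', 'setosa', 'versicolor', 'virginica']  # made-up sample

unique = set(species)      # duplicates are dropped; order is not guaranteed
ordered = sorted(unique)   # sorted() returns a new, alphabetized list

print(len(species), len(unique))
print(ordered)
```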


### your turn!
- print tuple3
- find the unique values in `tuple3`

In [None]:
print(tuple3)
set(tuple3)

### Next, we'll use a mask to *filter* the iris DataFrame
- a mask tells us row by row if the condition is `True` or `False`
- use a mask to subset the data to get only observations with a `sepal_length` greater than five


In [None]:
iris_df['sepal_length'] > 5

#### we can save the mask and then use it to subset our data

In [None]:
big_sepal_mask = iris_df['sepal_length'] > 5

#### use the mask to subset and look at the first two rows

In [None]:
big_sepals = iris_df[big_sepal_mask]
print(big_sepals.head(2))
print("The biggest big sepal is ", big_sepals.sepal_length.max(), 
      "cm long!")

#### you can also use the mask without saving it to a variable
- here we want to find sepals between five and seven centimeters long
- each condition is wrapped with parentheses and joined by `&`

In [None]:
biggish_sepals = iris_df[(iris_df.sepal_length > 5) & (iris_df.sepal_length < 7)]

print(biggish_sepals.head(2))
print("The biggest biggish sepal is ", 
      biggish_sepals.sepal_length.max(), "cm long!")
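
To see what is happening row by row, here is a sketch on a tiny hand-made DataFrame (`flowers`, with invented values) rather than the full iris data. It assumes pandas is installed, as in the import cell at the top.

```python
import pandas as pd

# a tiny hand-made DataFrame; the values are invented for illustration
flowers = pd.DataFrame({'sepal_length': [4.9, 5.1, 6.3, 7.0, 7.7]})

# each comparison gives one True/False per row
over_five = flowers['sepal_length'] > 5
under_seven = flowers['sepal_length'] < 7
print(over_five.tolist())

# & keeps only the rows where both masks are True
between = flowers[over_five & under_seven]
print(between['sepal_length'].tolist())

# True counts as 1, so summing a mask counts the matching rows
print(over_five.sum())
```

When the conditions are written inline, each one needs its own parentheses because `&` binds more tightly than `>` and `<`.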

### your turn
- use a mask to find observations in the iris data frame where `petal_length` is over 4 cm
- how many are there?


In [None]:
big_petals_mask = iris_df['petal_length'] > 4
big_petals_df = iris_df[big_petals_mask]

# how many? the first number in the shape tuple is the row count
big_petals_df.shape