# Python cheatsheets and tutorials

1. [Python basics](https://www.pythoncheatsheet.org/)
2. [numpy](http://datacamp-community-prod.s3.amazonaws.com/da466534-51fe-4c6d-b0cb-154f4782eb54)
3. [pandas](https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf)
4. [Basics, numpy and pandas](https://www.kaggle.com/lavanyashukla01/pandas-numpy-python-cheatsheet)
5. [matplotlib for plotting](https://github.com/matplotlib/cheatsheets#cheatsheets) 
6. [seaborn for plotting](https://seaborn.pydata.org/tutorial.html) 

# Python basics part 1

### Use Python as a calculator

**[Arithmetic operators](https://www.programiz.com/python-programming/operators) are used to perform mathematical operations like addition, subtraction, multiplication, etc.**

<img align="left" src="img/arithmetic_operators.png" width="70%"/>

The order of operations is important! From highest to lowest precedence:

1. Exponentiation and root extraction: `**`
2. Multiplication and division: `*`, `/`, `%`, `//`
3. Addition and subtraction: `+` and `-`
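
As a quick check of these rules, the sketch below evaluates one expression with and without parentheses. The variable names are illustrative and not part of the original notebook.

```python
# Exponentiation is applied first, then multiplication, then addition.
result_default = 2 + 3 * 4 ** 2     # same as 2 + (3 * (4 ** 2))
result_grouped = (2 + 3) * 4 ** 2   # parentheses force the addition first

print(result_default)  # 50
print(result_grouped)  # 80
```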

Examples of expressions in an interactive way:

In [1]:
5 + 5

10

In [2]:
5 - 5

0

In [3]:
3 * 5

15

In [4]:
4 * 5

20

**Let's get the division of 9/2**

In [5]:
9 / 2

4.5

**Let's get the quotient and remainder of 9/2**

$9 = 2 \times 4 + 1$

In [6]:
9 // 2 # use the floor division operator to get the quotient

4

In [7]:
9 % 2 # use the modulus operator to get the remainder

1
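
The built-in `divmod` function returns the quotient and the remainder together. A minimal sketch (the names `quotient` and `remainder` are illustrative):

```python
quotient, remainder = divmod(9, 2)

print(quotient, remainder)             # 4 1
print(2 * quotient + remainder == 9)   # True
```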

**Let's calculate the square of 4**
($4^2$)

In [8]:
4**2

16

**Let's calculate the square root of 4**
($\sqrt[2]{4}$)

In [9]:
4**(1/2)

2.0

In [10]:
4**0.5

2.0

In [11]:
# use parentheses to change the order of operations
(2 + 3) * 3

15

**import the math module for more calculations**

In [12]:
import math

In [13]:
math.sqrt(4)

2.0

In [14]:
# math.sqrt(4)

In [15]:
math.log10(10)

1.0

In [16]:
math.log(10)
# natural log

2.302585092994046

In [17]:
math.exp(2)

7.38905609893065

$e^2$

In [18]:
math.pi

3.141592653589793

---
### *Exercise*

> Calculate the value of $(2+3)^5 + 4\times5$
---

In [19]:
(2+3)**5 + 4 * 5

3145

## Packages and modules

There is a very wide range of packages/modules that you can import that extend the abilities of core Python. There are packages/modules that deal with file input and output, internet communication, numerical processing, etc. One of the nice features about Python is that you only import the packages/modules you need, so that the memory footprint of your code remains lean. Also, there are ways to import code that keep your **'namespace'** organized.

In the same way directories keep your files organized on your computer, **namespaces** organize your Python environment. There are a number of ways to import packages, for example:

In [20]:
# This imports the math module. Here 'math' is like a subdirectory in your namespace 
# that holds all of the math functions
import math

In [21]:
math.e

2.718281828459045

In [22]:
e = 12

In [23]:
e

12

In [24]:
math.e

2.718281828459045
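
The cells above show that `e` and `math.e` are separate names. By contrast, importing a name directly with `from math import e` places it in your own namespace, where a later assignment replaces it. A minimal sketch:

```python
import math
from math import e   # puts the name `e` directly into the current namespace

print(e == math.e)   # True: both names refer to the same value

e = 12               # rebinding `e` replaces the imported value here...
print(e)             # 12
print(math.e)        # ...but math.e is untouched
```

This is one reason `import math` followed by `math.e` is often the safer style for beginners.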

---
### *Exercise*

> After importing the math module, type `math.` and hit `TAB` to see all the possible completions. These are the functions available in the math module.
---

In [25]:
math.hypot(4)


4.0

There are a number of other ways to import things from the math module. Experiment with these commands:

    from math import log   # Import just the `log` function. Called as `log(x)`
    import math as m       # Import the math package, but rename it to `m`. Functions called like `m.sin(x)`


In [26]:
from math import log

In [27]:
log(10)

2.302585092994046

In [28]:
import math as m

In [29]:
m.log(10)

2.302585092994046

---
### *Exercise*

> Calculate $e^2$
---

In [30]:
(m.e)**2

7.3890560989306495
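
Note that `(m.e)**2` and `math.exp(2)` agree only up to floating-point rounding, as the two outputs shown in this notebook suggest. When comparing floats, a tolerance-based check such as `math.isclose` is usually safer than `==`. A minimal sketch, with illustrative variable names:

```python
import math

via_power = math.e ** 2
via_exp = math.exp(2)

# isclose compares within a relative tolerance instead of requiring identical values
print(math.isclose(via_power, via_exp))
```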

## Variables

### What are variables?

Variables are used to store values.

The syntax we'll use most often to create a variable will be the following:

<code>variable_name = some_value</code>

You can name a variable anything as long as it obeys the following rules:

1. It should be only one word.
2. It should use only letters, numbers, and the underscore `_` character.
3. It can’t begin with a number.
4. It should not use the reserved keywords (the Python language reserves a small set of keywords that designate special language functionality).
5. Names are **case-sensitive**: `a` is not the same as `A`.

<img align="left" src="img/keywords.png" width="90%">

Variable names can be a single letter long (as shown below) or they can be a short phrase (to make it clear what that code does). But keep them reasonable. Remember you'll need to type them in and if they're too long it will make your code look terrible. 
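
One way to check a candidate name against the naming rules is sketched below, using the string method `isidentifier` and the standard-library `keyword` module. The candidate names are made up for illustration.

```python
import keyword

candidates = ["daily_temp", "2nd_value", "my var", "class", "Total"]

results = {}
for name in candidates:
    # valid = follows the identifier rules AND is not a reserved keyword
    results[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(name, results[name])
```

Only `daily_temp` and `Total` pass: the others start with a digit, contain a space, or are a reserved keyword.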

In [31]:
a = 1         # Defining the variable 'a' and setting it equal to an integer with a value of 1
b = 'Hello'   # Defining a second variable and setting it equal to a string of text
c = '1234'    # A third variable with a string of text but this time the text is numerical
this_is_a_longer_variable_name = 1  # A longer variable name that is perfectly valid, but kind of cumbersome

#### What's the hash tag `#` along with the following text? <br>
These are comments in Python. They are there to help you state what is going on, what a variable should be/represent, and to help anyone else who might read your code. And more often than not, they help **YOU** remember what the code should do when you go back and read it in a month, 2 months, a year.... **Use comments** whenever you can! Use them in your **homework** and use them in your **final project**.

> Do Something Today, That You'll Thank Yourself For Tomorrow.

The comments shown above all follow code on the same line. You can also write a comment on its own line, or spread a comment over several lines.

In [32]:
# this is a variable defined now
a = 3

In [33]:
# this is a multi-line
# comment
# that I will use

a = 3

**Trick**: use `Ctrl` + `/` (`Cmd` + `/` on Mac) to toggle comments on the selected lines

Recap: Just as at the Python interpreter, a variable must be defined before you can use it. What will happen when you run the following cell? Try it and see:

In [34]:
x

NameError: name 'x' is not defined

In [None]:
x = 1

Running the cell below should work, because x has now been defined.

In [None]:
x

### Why variables?
You may wonder why we need variables. Why not just type the number 1 in instead of going through the process of creating a variable?

It all comes down to **re-use** of values. Imagine if I needed to use the value of 1 in many different places (e.g. 100 different places) throughout my code. We could just as easily enter the number '1' in there instead of the variable 'a'. Now imagine you wanted to change the value to be '2'..... You would have to go through all 100 entries and change them to be a '2'. *OR* you could just change the value of 'a' to be equal to '2'.


### Basic variables

**OK**, now I hope we all know why we need variables. Variables are used to store values. So what are the basic `types of values` (aka `Basic Data Types`) in Python?

#### Table 1: Basic Data Types

| Data Type              | Examples                | Symbols in Python |
| ---------------------- | ------------------------| ------------------|
| Integers               | `-2, -1, 0, 1`          | `int`             |
| Floating-point numbers | `-1.25, -1.0, 0.0, 2.3` | `float`           |
| Strings                | `'a', 'aa', '11 cats'`  | `str`             |
| Boolean                | `True, False`           | `bool`            |

In [None]:
# Define variables with different data types

i = 5                      # An integer
f = 3.1415928              # A floating point number
s = 'Ice cream'            # A string
b = True                   # A boolean value

We defined several variables above. Now that they're defined in the code, once we run that cell above, they will be created and stored in memory for our current session. If we close Jupyter notebook, we'll have to re-run the cell in order to recreate those variables. <br>

You can see what type a variable is by using the `type` function, like

In [None]:
type(i)

---
### *Exercise*

> Use `type` to see the types of the other variables (f, s, b) above

In [None]:
type(f)

In [None]:
type(s)

In [None]:
type(b)


You can test to see if a variable is a particular type by using the `isinstance(var, type)` function. Refer to the **Table 1** to find the right data type symbol. Note, `int`, `str` are types, while `"int"`, `"str"` are strings.

In [None]:
isinstance(i, int)  # is i an integer?

In [None]:
isinstance(i, str) # is i a string?

---
### *Exercise*

> Use `isinstance` to test the types of the other variables (f, s, b) above

In [None]:
isinstance(f,int)

In [None]:
isinstance(s, bool)

### A. Numeric variables

In [None]:
a = 5
b = 3.1415
c = a + b
print(a, b, c)

---
### *Exercise*
> Check the data type of `a`, `b` and `c`


In [None]:
type(a)

In [None]:
type(c)

**Note**: we can combine different numeric data types in one operation (`a` is an integer, `b` is a float, and we add them together to get a variable `c`). `c` will be a float. When an operation mixes a narrower data type (integer) with a wider data type (float), Python converts the narrower value to the wider type before computing, to avoid losing the fractional part, and the result has the wider type. This is called `implicit type conversion`, also known as `upcasting`.
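
The sketch below checks the result type for a few combinations. It reuses the values of `a` and `b` from the cell above; the third line is an extra illustration that is not in the original notebook.

```python
a = 5        # int
b = 3.1415   # float

print(type(a + a).__name__)   # int: both operands are integers
print(type(a + b).__name__)   # float: the int is converted before adding
print(type(a / 1).__name__)   # float: true division always returns a float
```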

### B. String variables
Strings are made using various kinds of (matching) quotes. Examples:

In [None]:
s1 = 'hello' # single quotes

In [None]:
s2 = "World" # double quotes

In [None]:
# use '''       ''' to create a multi-line string
s3 = '''strings can 
also go "over"
multiple lines.
'''

In [None]:
s1 # single quote shown in the interactive jupyter notebook

In [None]:
print(s1)

In [None]:
s2 # single quote shown in the interactive jupyter notebook

In [None]:
print(s2)

Do you notice the difference between using interactive output and using the print function?

In [None]:
s3 # single quote shown in the interactive jupyter notebook

Notice that the multiple line string (`s3`) is converted to a single line string with the newlines 'escaped' out with `\n` when you check the `s3` variable interactively. Use the function `print` to show it nicely.

In [None]:
print(s3)

We can also create a multi-line string without the paired `'''`. Just include special characters directly in the string when creating it. For example, `\n` gives a newline and `\t` gives a tab.

In [None]:
s3 = 'strings can \nalso go "over" \nmultiple lines.'

In [None]:
print(s3)

You can concatenate strings using the `+` sign.

In [None]:
space_x = ' '

In [None]:
s1 + space_x + s2  # note: without the space we would get 'helloWorld'

or you can directly use `' '` with the `+` sign

In [None]:
s1 + ' ' + s2

This turns out to be very handy when creating filenames or full file paths pointing to your data file(s).

In [None]:
main_directory = "D:/data"
day_directory = '25May2021'
data_file = 'my_awesome_dataset.csv'
print(main_directory + '/' + day_directory + '/' + data_file)
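
As an alternative to joining paths by hand with `+`, the standard-library `pathlib` module can insert the separators for you. The sketch below uses `PurePosixPath`, which always joins with `/` regardless of operating system; that choice is an assumption made here to match the forward slashes above.

```python
from pathlib import PurePosixPath

main_directory = "D:/data"
day_directory = '25May2021'
data_file = 'my_awesome_dataset.csv'

# the / operator joins path components with the separator
full_path = PurePosixPath(main_directory) / day_directory / data_file
print(full_path)  # D:/data/25May2021/my_awesome_dataset.csv
```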

**Tip**: The `+` sign takes on different meanings depending on the data types of the values you use it on. Recall that when you use the `+` sign with numbers, it adds the numbers together. This is called `operator overloading`.

In [None]:
1 + 1
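
To make the overloading concrete, the sketch below applies `+` to numbers and to strings, and shows that `*` is overloaded in a similar way. The variable names are illustrative.

```python
number_sum = 1 + 1        # numeric addition
string_sum = '1' + '1'    # string concatenation
repeated = 'ab' * 3       # * repeats a string

print(number_sum)   # 2
print(string_sum)   # 11
print(repeated)     # ababab
```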

#### The `str`, `int`, and `float` functions

Strings are usually text, but a string <b>doesn't</b> have to contain letters. Strings can also contain digits; however, they won't be treated as actual numbers unless converted to a numeric type.

In [None]:
i4 = 5 # this is an integer
s4 = '5' # this is a string even though the string itself is a number

In [None]:
i4 == s4 # string and integer are different data types

In [None]:
type(s4)

In [None]:
type(i4)

What will happen if we add `i4` and `s4` together?

In [35]:
i4 + s4

TypeError: unsupported operand type(s) for +: 'int' and 'str'

How can we make it work?

In [None]:
i4 + int(s4)

When adding numbers and strings, if we want the string to behave like a number, we need `explicit type conversion`!

In [None]:
# let's create two strings that look like numbers
s4 = '69'
s5 = '420'

What will happen if we add these strings together?

In [None]:
s4 + s5

In the above example, it combined the two strings together into a new string. However, if we wanted to add the actual numbers together, we would need to **explicitly convert** them to numbers first. We can do this using the built-in `int` or `float` functions. 

In [None]:
int(s4) + int(s5)

Read more about **implicit conversion** and **explicit conversion** [here](https://www.programiz.com/python-programming/type-conversion-and-casting).

---
### *Exercise*

> First concatenate the strings s4 and s5, and then convert the outcome to an integer

In [None]:
int(s4 + s5)

Other examples for converting between numerical variable and string variable:

In [None]:
str(12) # convert number to string

In [None]:
str(12.3) # convert number to string

In [None]:
int("12") # convert string to number

---
### *Exercise*

> Try to use `int` to convert the string '12.7' to an integer. What will happen? <br>
> Try to use `int` to convert the float 12.7 to an integer. What will happen? <br>
> How can you convert the string '12.7' to a number and then round the number to 13?

In [None]:
float('12.7')


In [None]:
int(12.7)

In [None]:
round(12.7)
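
The three cases in this exercise behave differently, as the sketch below shows. It uses `try`/`except`, which is introduced later in this notebook, only so that the failing call does not stop the cell. The variable names are illustrative.

```python
text = '12.7'

try:
    int(text)                    # int() cannot parse a string with a decimal point
except ValueError as err:
    print("ValueError:", err)

truncated = int(12.7)            # a float is truncated toward zero
rounded = round(float(text))     # convert to float first, then round

print(truncated)  # 12
print(rounded)    # 13
```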

#### String slicing

In [36]:
s1

'hello'

In [None]:
# Get the number of characters in a string
len(s1)

In [None]:
# get the first letter
s1[0]

In [None]:
# get the fifth letter (last letter)
s1[4]

In [None]:
# get the last letter
s1[-1]

Pay attention to the index. **Forward indexes start at 0 for the first element; reverse indexes start at -1 for the last element**. <br>
This is what's going on above:
```Python
s1 = 'hello'

      'h  e  l  l  o'
       0  1  2  3  4    #forward index
      -5 -4 -3 -2 -1    #reverse index
```

**Extracting multiple elements:**

**We can get a sub-string from the string by giving a range of the index to extract**. This is done by using the following format

    start:stop:step

1. `s[start:stop:step]`: This extracts the elements from the start index up to but not including the stop index, with a step size of step.

1. `s[start:stop]`:  This extracts the elements from the start index up to but not including the stop index, with a default step size of 1.

1. `s[start:]`: This extracts the elements from the start index to the last one, with a default step size of 1.

1. `s[:stop]`: This extracts the elements from the beginning up to but not including the stop index, with a default step size of 1.

1. `s[::step]`: This extracts the elements from the beginning up to the last one (or from the end to the first one, if step is negative), with a step size of step.

1. `s[:]` or `s[::]`: This extracts all elements.

**Note**: The `up to but not including` part is confusing to first time Python users, but makes sense given the zero-based indexing. For example, `s[:10]` gives the first ten elements of a string.
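
A practical consequence of the "up to but not including" convention is that `s[:n]` and `s[n:]` split a string into two pieces with no overlap and no gap. A minimal sketch, with illustrative names:

```python
word = 'hello'
n = 2

head = word[:n]    # the first n characters
tail = word[n:]    # everything from index n onward

print(head, tail)            # he llo
print(head + tail == word)   # True
print(len(head) == n)        # True
```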

In [181]:
s1

'hello'

In [182]:
# extract the elements from index 1 to index 4 with a step size of 1
s1[1:5:1]

'ello'

In [183]:
# extract the elements from index 1 to index 4 with a step size of 2
# specifically, it will extract index 1, index 3, etc... 
# since index 4 does not meet the criterion, it will not be extracted
s1[1:5:2]

'el'

In [184]:
# extract the elements from index 1 to the last one (including the last one)
s1[1:]

'ello'

In [185]:
# extract the elements from the beginning up to but not including the last one
s1[:-1]

'hell'

In [186]:
# extract the elements from the beginning to the last one with a step size of 2
# specifically, it will extract index 0, index 2, index 4, index 6, etc.
s1[::2]

'hlo'

In [187]:
# extract the elements from the end to the first one with a step size of 2
s1[::-2]

'olh'

In [188]:
# extract the elements from index -3 (third to last) up to but not including the last one
s1[-3:-1]

'll'

In [189]:
# get all elements out
s1[:]

'hello'

In [None]:
# get all elements out
s1[::]

In [None]:
# or simply
s1

#### String methods

Python is an object-oriented language in which every value is an object.

**Python objects have 'methods' and 'attributes'.**

* A **function** stored in an object is called a **method**.
* A **variable** stored in an object is called an **attribute**.

**We will focus on methods for now**. You can call a method by putting a **dot** after the variable name and then the **method name with parentheses** (and any arguments to the method within the parentheses). A method call always needs parentheses, even if they are empty.

In [39]:
s1 = "hello"

# convert to upper case
s1.upper()

'HELLO'

In [40]:
# s1 still remains the same
s1

'hello'

In this case `s1` is a string object. `upper()` is the method that acts on this string object.

In [190]:
# convert to lower case
s1.lower()

'hello'

In [191]:
s1.capitalize()

'Hello'

---
### *Exercise*

> What other methods are there? Type `s1.` and then `<TAB>`. This will show the possible completions, which in this case is a list of the methods and attributes. You can get help on a method by typing, for example, `s1.upper?`.  The text in the help file is called a `docstring`; as we will see below, you can write these for your own functions.

> See if you can use these methods of the str instance `s1`:

            
            1. islower
            2. isupper
            3. find

In [192]:
s1.upper()

'HELLO'
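
A minimal sketch of the three methods named in the exercise, assuming `s1` is still `"hello"`:

```python
s1 = "hello"

is_lower = s1.islower()     # True: every cased character is lowercase
is_upper = s1.isupper()     # False
first_l = s1.find('l')      # 2: index of the first match
missing = s1.find('z')      # -1: find returns -1 when there is no match

print(is_lower, is_upper, first_l, missing)
```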

#### Recap on the print function: different print styles

In [198]:
a = 2

In [199]:
# one way to print
print("The number I had is", a)

The number I had is 2


In [200]:
# the above is equivalent to 
print("The number I had is", a, sep = " ")

The number I had is 2


In [201]:
# a different way to print 
# will cause error because 'a' is not string!
# this error occurs frequently!
# print("The number I had is" + a)

In [202]:
# a different way to print (no space)
print("The number I had is" + str(a))

The number I had is2


In [203]:
# a different way to print (add space)
print("The number I had is " + str(a))

The number I had is 2


**Can we print the string and `a` together without explicitly converting `a` to string?**

**Yes!** Use the `format()` method on the string, and then print!

In [204]:
'the number I had is {0}'.format(a)

'the number I had is 2'

In [205]:
# insert number in the print
# {0} corresponds to a
a = 2.348
print('the number I had is {0}'.format(a))

the number I had is 2.348


In [206]:
# limit the number of digits when printing out
# {0} corresponds to a and .2f indicates 2 digits after decimal
print('the number I had is {0:.2f}'.format(a))

the number I had is 2.35


In [207]:
# can also insert string in the print
# {0} corresponds to s_name
s_name = "I"
print('the number {0} had is 2.35'.format(s_name))

the number I had is 2.35


In [208]:
# insert both string and number in the print
# 0 corresponds to a and 1 corresponds to b

a = 'Tom'
b = 2.348

print('the number {0} had is {1:.2f}'.format(a, b))

the number Tom had is 2.35
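
In Python 3.6 and later, formatted string literals (f-strings) offer a more compact alternative to `format()`; they are not used elsewhere in this notebook. A minimal sketch with the same values:

```python
a = 'Tom'
b = 2.348

# expressions inside {} are evaluated; the part after : is the format spec
message = f'the number {a} had is {b:.2f}'
print(message)  # the number Tom had is 2.35
```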


---

### *Exercise*

> Use `a` and `b`, print "the number Tom had is 2.35" in a different way (hint: use our traditional string concatenation)


In [212]:
print('the number ' + a + ' had is ' + str(round(b, 2)))

the number Tom had is 2.35


### C. Tests for equality and inequality

We can compare the values of variables using different operators. Shown in the following figure are different comparison operators.<br><br>
These compare operations return a `Boolean` value. Either `True` or `False`.

**Note that assignment operator `=` is different than the test of equality operator `==`. <br><br>**

<img align="left" src="img/comparison.png" width="70%">

In [213]:
# assign 5 to variable a
a = 5

In [214]:
a <= 99

True

In [215]:
a > 99

False

In [216]:
a == 5

True

In [217]:
a == 5.0 # note that implicit conversion also works here

True

These statements have returned "booleans", which are `True` and `False` only. These are commonly used to check for conditions within a script or function to determine the next course of action.

**Note:** booleans are NOT equivalent to a string that says "True" or "False". We can test this:

In [218]:
True == 'True'  # not equivalent

False

In [219]:
True == True   # equivalent

True

In [220]:
False == 'False'

False

In [221]:
False == False

True

There are other things that can be tested, not just mathematical equalities. For example, to test if an element is inside of a string (**or inside any sequence, like List, as you will see soon**), use the `membership operators` (`in` and `not in`):

In [222]:
'this' in 'What is this?'

True

In [64]:
'that' in 'What is this?'

False

In [65]:
'that' not in 'What is this?'

True

---
### *Exercise*

> Test if 1 is equal to "1" <br>
> test if 'ab' is inside of 'abcd' <br>
> test if 'ab' is inside of 'ABcd' <br>
> test if 'ab' is inside of 'acbd' <br>

In [223]:
1 == '1'

False

In [224]:
'ab' in 'abcd'

True

In [225]:
'ab' in 'ABcd'

False

In [227]:
'ab' in 'ABcd'.lower()

True

In [228]:
'ab' in 'acbd'

False

## Quick intro to functions

We will discuss functions in more detail later in this notebook, but here is a quick view to help with the homework.

Functions allow us to write code that we can use in the future. When we take a series of code statements and put them in a function, we can reuse that code to take in inputs, perform calculations or other manipulations, and return outputs, just like a function in math.

**A large proportion of the code you submit in your homework will be within functions so that I can test the functionality of your code.**

In Python, a function is defined using the `def` keyword, followed by the **function name** and the **parentheses** that contain the **arguments** passed to the function.

Here we have a function called `capitalize_string` which takes in a string, and then returns the same string but with it capitalized. In most functions, you will use the final **`return`** keyword to return your output.

In [229]:
def capitalize_string(input_str):
    '''Documentation for this function, which can span
    multiple lines since triple quotes are used for this.
    
    Takes in a string, and then returns the same string but with it capitalized.'''
    
    new_string = input_str.capitalize()  # use built-in method for a string to capitalize it
    
    return new_string

In [230]:
?capitalize_string


This is analogous to the relationship between a variable and a function in math. The variable is $x$, and the function is $f(x)$, which changes the input $x$ in some way, then returns a new value. To access that returned value, you have to **use the function** -- **not just define the function**.

In [67]:
# input variable, 'x'

x = 'hi'

output_string = capitalize_string(x)

# function f(x) is 'capitalize_string(x)'
# Internal to the function itself, 'x' is passed to 'input_str'
# the function operates on 'input_str' and returns the variable 'new_string', 
# which is then assigned to the 'output_string' variable in the line above

In [68]:
output_string

'Hi'

#### Note: Indentation is very important

* Indentation in python is typically *4 spaces*. Most programming text editors will be smart about indentation, and will also convert TABs to four spaces. Jupyter notebooks are smart about indentation, and will do the right thing, i.e., autoindent a line below a line with a trailing colon, and convert TABs to spaces.

* Indentation is commonly seen in functions, try-except code blocks, conditionals, `for` and `while` loops, etc. In Jupyter notebook, just press `Enter` or `Return` following the keyword (e.g., `def`, `if`, `for`, `while`) line and it will create the indentation in the following line.

In [69]:
def capitalize_string(input_str):

    new_string = input_str.capitalize()  # use built-in method for a string to capitalize it
    
    return new_string

**Note: When indentation is not needed, don't indent**

In [70]:
a = 2
# b = 2

Equality checks are commonly used to test the outcome of a function to make sure it is performing as expected. We can test the function we wrote before to see if it works the way we want it to.

Here are two ways to test the outcome of the same input/output pair. **You will see them a lot in your homework!**

In [71]:
# Method 1: Use default assert keyword
out_string = capitalize_string('banana')

# if the condition is True, no error is raised.
assert out_string == 'Banana'

In [72]:
out_string

'Banana'

In [73]:
# If condition returns False, AssertionError is raised
# Let's see how they respond when the condition is False
# assert out_string == 'banana'

In [74]:
# Method 2: Use the assert_equal function from the nose.tools module
from nose.tools import assert_equal

In [75]:
# if the condition is True, no error is raised.
assert_equal(out_string, "Banana")

**Note: `assert_equal` generates a much clearer message than the default `assert` when the condition is False.**

In [76]:
# Let's see how they respond when the condition is False
# assert_equal(out_string, "BANANA") # Commented so that all cells run
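
If `nose` is not installed in your environment (an assumption; the package is no longer actively maintained), the standard-library `unittest` module provides a comparable `assertEqual` check. This is a sketch of an alternative, not the course's required approach; the name `checker` is illustrative.

```python
import unittest

checker = unittest.TestCase()   # gives access to the assert* helper methods

out_string = 'banana'.capitalize()
checker.assertEqual(out_string, 'Banana')   # passes silently

failed = False
try:
    checker.assertEqual(out_string, 'BANANA')
except AssertionError as err:
    failed = True
    print(err)   # the message shows both values being compared
```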

---

### *Exercise*

Write your own function that does the following (**pay attention to the indentation**): <br>
> Take in a number and return that number plus 10. Use assert or assert_equal to test the output.

In [238]:
def calc_plus_ten(x):
    ''' 
    This function is to
    add 10
    '''
    return x + 10

In [239]:
calc_plus_ten(69)

79

In [240]:
def add_10(t):
    answer = t + 10
    return answer

In [241]:
assert_equal(add_10(5), int(15))

In [243]:
assert add_10(5) == 15

In [244]:
assert_equal(calc_plus_ten(69), int(79))

## Python try-except

* The try block lets you test a block of code for errors.

* The except block lets you handle the error.

In [246]:
# print(y) # throw error when you try to print the undefined y

In [247]:
# with try and except, the error is caught instead of stopping the program
try:
    print(y)
except NameError:
    print("No y defined yet!")

No y defined yet!


**Why do we need try-except in the first place?**

In [249]:
# one reason is that it allows your code to continue running after an error
# for example, we want to print both y and x

x = "Hello, world!"

try:
    print(y)
except NameError:
    print("No y defined yet!")

print(x) # x is still printed because the error from print(y) was handled

No y defined yet!
Hello, world!


In [80]:
x = "Hello, world"

try:
    print(y)
except NameError:
    print("No y defined yet!")

print(x) # x will be printed

No y defined yet!
Hello, world


`try-except` is also very useful in functions. 

In [250]:
# For example, define a function that converts a string to an integer.
def string_to_integer_1(x):
    try:
        return int(x)
    except ValueError:
        print("String not in integer format!")
        return None

In [251]:
# this will run as string '12' can be converted to integer using int()
string_to_integer_1('12')

12

In [252]:
# '12.7' cannot be converted directly by int(), so the function prints a message and returns None

string_to_integer_1('12.7')

String not in integer format!


In [253]:
# try another function example without try-except
def string_to_integer_2(x):
    return int(x)

In [85]:
# string_to_integer_2("ss")
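
Naming the exception type after `except` keeps unrelated errors from being hidden, and `as` gives access to the error message. A minimal sketch; `parse_int` is a hypothetical helper, not part of the original notebook:

```python
def parse_int(text):
    '''Return text converted to an integer, or None if it is not a valid integer string.'''
    try:
        return int(text)
    except ValueError as err:      # catch only the failed-conversion error
        print("Could not convert:", err)
        return None

print(parse_int('12'))   # 12
print(parse_int('ss'))   # prints the error message, then None
```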

---

### *Exercise*

Write your own try-except block: <br>
> Define a = 1, and b = "s", convert b to integer and assign the output to c, and then print a

## Containers

Often you need a container for multiple values (e.g., temperatures on sequential days). There are four basic containers in the core Python language: **List, Dict, Tuple, and Set**. <br>

We will focus on `List`, `Dict`, and `Tuple` in this class. <br>

There are a few more specialized containers (e.g., numpy arrays and pandas dataframes) for use in scientific computing that we will learn much more about later.


Definition of different containers.

* **list: ordered, mutable sequence of objects**
* **dict: mutable collection of key-value pairs, with unique keys (insertion order is preserved since Python 3.7)**
* **tuple: ordered, immutable sequence of objects**

Mutable means "changeable", i.e. can be altered in-place. For example, the list has the `.append()` method which adds an item to the list.  Tuples are immutable, which means once they are assigned, they cannot be changed.
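
The difference can be seen in a minimal sketch (the names `temperatures` and `point` are illustrative):

```python
temperatures = [76, 77]       # list: mutable
temperatures.append(75)       # adds an item in place
temperatures[0] = 70          # replaces an item in place
print(temperatures)           # [70, 77, 75]

point = (1, 2)                # tuple: immutable
try:
    point[0] = 10             # item assignment is not allowed on a tuple
except TypeError as err:
    print("TypeError:", err)
```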

### A. Lists

Lists are perhaps the most common container type. They are used for ordered sequential data. This does <b>not</b> mean that they will automatically be in increasing/decreasing order. It means that the order in which the values are placed into the container will be maintained.

#### Initialize a list

In [254]:
foo = [] # initialize an empty list

In [255]:
foo

[]

In [256]:
type(foo)

list

In [257]:
foo = [1.0, 2.0, 3, 'four', 'five', 'nine'] # initialize a list with values

In [258]:
foo

[1.0, 2.0, 3, 'four', 'five', 'nine']

In [259]:
type(foo)

list

In [260]:
print(foo)

[1.0, 2.0, 3, 'four', 'five', 'nine']


**Note:** lists (**unlike** arrays, as we will learn later) can be heterogeneous. That is, the elements in the list don't have to have the same kind of data type.

In [261]:
# Here we have a list with ints and with lists inside of a list (one of ints, one of strings)!
var1 = [1, 2, 3, [4, 5], ["hello", "world"]]

In [262]:
var1

[1, 2, 3, [4, 5], ['hello', 'world']]

#### Get the length of a list

In [95]:
len(foo)

6

#### Test for membership in a list (similar to test if certain characters are contained in a string)

In [96]:
foo

[1.0, 2.0, 3, 'four', 'five', 'nine']

In [97]:
"five" in foo

True

In [98]:
"number" in foo

False

#### Slice a List

We can retrieve the individual elements of a list by 'indexing' the list. We do this with square brackets, using zero-based indexes – that is `0` is the first element (similar to string slicing)

```
foo            -->    [1.0, 2.0, 3, 'four', 'five', 'nine']
index forward  -->      0    1   2    3       4       5
index backward -->     -6   -5  -4   -3      -2      -1
```

In [99]:
foo

[1.0, 2.0, 3, 'four', 'five', 'nine']

In [100]:
foo[0]

1.0

In [101]:
foo[3]

'four'

In [102]:
# access the character inside of one string in the list
foo[3][1]

'o'

This is what's going on above:
```Python
foo[3] = 'four'

      'f  o  u  r'
       0  1  2  3     #forward index
      -4 -3 -2 -1     #reverse index
```


You can also index from the reverse direction, similar to string slicing.

In [103]:
foo[-1]   # This is the way to access the last element.

'nine'

In [104]:
foo[-2]    # ...and the second to last element

'five'

Why might you want to do something like that? Perhaps you add a new value to your list each day, such as the high temperature for the day. If you want to know what the high temperature was for the most recent day, you can use the index value for the last element. But how do you know what it is? <br><br>
You could do this:

```python
    daily_temperature = [76,77,75,78,84,87,83,91,88]   # a list of daily temperatures
    number_of_temperatures = len(daily_temperature)   #find the length of the daily_temperature and store it
    most_recent_temperature = daily_temperature[number_of_temperatures - 1]
    
```

<b>Question: Why did I have to subtract 1 from the number_of_temperatures variable in that above block? </b>

<br>Or you could simply do this:

```python
    daily_temperature = [76,77,75,78,84,87,83,91,88]   # a list of daily temperatures
    most_recent_temperature = daily_temperature[-1] # grab the last value
```

What if we wanted to know the temperature 1 week ago (i.e. 7 days ago)? You might have guessed:
```python
    temperature_last_week = daily_temperature[-7] # this should be equal to 75
```
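
The snippets above can be combined into a runnable check. It also shows why the subtraction is needed: with zero-based indexing, the last valid index is `len(...) - 1`. The name `last_index` is illustrative.

```python
daily_temperature = [76, 77, 75, 78, 84, 87, 83, 91, 88]

last_index = len(daily_temperature) - 1    # 9 items, so indexes run from 0 to 8
print(daily_temperature[last_index])       # 88
print(daily_temperature[-1])               # 88
print(daily_temperature[-7])               # 75

try:
    daily_temperature[len(daily_temperature)]   # index 9 does not exist
except IndexError as err:
    print("IndexError:", err)
```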

**We can get a sub-sequence from the list by giving a range of the index to extract**. This is done by using the format

    start:stop:step

1. `foo[start:stop:step]`: This extracts the elements from the start index up to but not including the stop index, with a step size of step.

1. `foo[start:stop]`:  This extracts the elements from the start index up to but not including the stop index, with a default step size of 1.

1. `foo[start:]`: This extracts the elements from the start index to the last one, with a default step size of 1.

1. `foo[:stop]`: This extracts the elements from the beginning up to but not including the stop index, with a default step size of 1.

1. `foo[::step]`: This extracts the elements from the beginning to the last one (or from the end to the first one, if step is negative), with a step size of step.

1. `foo[:]` or `foo[::]`: This extracts all elements.

In [105]:
# create a sequence of 10 elements, starting with zero, up to but not including 10.
bar = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [106]:
# extract the elements from index 1 to index 4 with a step size of 1
bar[1:5:1] 

[1, 2, 3, 4]

In [107]:
# extract the elements from index 1 to index 4 with a step size of 2
# specifically, it will extract index 1, index 3, etc... 
# since index 4 does not meet the criterion, it will not be extracted
bar[1:5:2]

[1, 3]

In [108]:
# extract the elements from index 1 to the last one (including the last one)
bar[1:]

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [109]:
# extract the elements from the beginning up to but not including the last one
bar[:-1]

[0, 1, 2, 3, 4, 5, 6, 7, 8]

In [110]:
bar

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [111]:
# extract the elements from the beginning to the last one with a step size of 2
# specifically, it will extract index 0, index 2, index 4, index 6, etc.
bar[::2]

[0, 2, 4, 6, 8]

In [112]:
# extract the elements from the end toward the beginning with a step size of 2
# specifically, it will extract index 9, index 7, index 5, etc.
bar[::-2]

[9, 7, 5, 3, 1]

In [113]:
# extract the elements from index -3 (third to last) up to but not including the last one
bar[-3:-1]

[7, 8]

In [114]:
# get all elements out
bar[:]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [115]:
# get all elements out
bar[::]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [116]:
# or simply
bar

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
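
One subtlety is worth noting here: `bar[:]` builds a new list containing the same elements, whereas plain `bar` is just the original list. A minimal sketch of the difference follows; the names `original`, `alias`, and `shallow_copy` are made up for this illustration, and it uses item assignment on a list.

```python
original = [0, 1, 2, 3, 4]

alias = original            # another name for the same list
shallow_copy = original[:]  # a new list with the same elements

alias[0] = -1               # changes the list that both names refer to
shallow_copy[1] = -2        # changes only the copy

print(original)      # [-1, 1, 2, 3, 4]
print(shallow_copy)  # [0, -2, 2, 3, 4]
```

So a full slice is a simple way to get a copy that can be changed without touching the original list.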

---
###  *Exercise*

> Use the list

    bar = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    
> use indexing to get the following sequences:
    
    
    [3, 4, 5]
    
    [9]        # note this is different than just the last element. 
               # It is a sequence with only one element, but still a sequence
    
    [2, 5, 8]

#### Modify the list values

You can assign values to list elements by putting the indexed list on the left side of the assignment:

In [117]:
bar[5] = -99
bar

[0, 1, 2, 3, 4, -99, 6, 7, 8, 9]

This works for slices as well:

In [118]:
bar[2:7] = [1, 1, 1, 1, 1]
bar

[0, 1, 1, 1, 1, 1, 1, 7, 8, 9]
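
The replacement does not have to be the same length as the slice it replaces; the list grows or shrinks to fit. A small sketch, using a hypothetical list `nums`:

```python
nums = [0, 1, 2, 3, 4, 5]

nums[1:4] = [10, 20]      # replace three elements with two
print(nums)               # [0, 10, 20, 4, 5]

nums[1:3] = [7, 8, 9, 9]  # replace two elements with four
print(nums)               # [0, 7, 8, 9, 9, 4, 5]
```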

#### Concatenate lists with +

In [119]:
var1 = [1, 2, "ss"]
var2 = [2, 3, "dd"]

In [120]:
var1 + var2

[1, 2, 'ss', 2, 3, 'dd']

#### Append values to a list

Lists are also 'objects'; they also have 'methods'.

In [121]:
bar.append(4)

In [122]:
bar

[0, 1, 1, 1, 1, 1, 1, 7, 8, 9, 4]

**Note:** `append` changes the list itself. This is called an **in-place** change.
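
To contrast an in-place change with an operation that builds a new list, here is a short sketch. The variable names are illustrative, and `extend` is a standard list method that is not otherwise covered in this section.

```python
temps = [76, 77]

temps.append(75)             # in place: temps itself grows, append returns None
print(temps)                 # [76, 77, 75]

combined = temps + [78, 84]  # + builds a new list; temps is unchanged
print(temps)                 # [76, 77, 75]
print(combined)              # [76, 77, 75, 78, 84]

temps.extend([78, 84])       # in place: adds each element of the other list
print(temps)                 # [76, 77, 75, 78, 84]
```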

#### Sort the values in the list

In [123]:
bar = [4, 5, 6, 7, 3, 6, 7, 3, 5, 7, 9]
sorted(bar)

[3, 3, 4, 5, 5, 6, 6, 7, 7, 7, 9]

In [124]:
bar # note that the sorted function will not change bar itself!

[4, 5, 6, 7, 3, 6, 7, 3, 5, 7, 9]

In [125]:
bar = sorted(bar)
bar

[3, 3, 4, 5, 5, 6, 6, 7, 7, 7, 9]

In [126]:
bar = sorted(bar, reverse=True) # reverse sorting
bar

[9, 7, 7, 7, 6, 6, 5, 5, 4, 3, 3]
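
Lists also have a `sort` method that, unlike `sorted`, changes the list in place and returns `None`. A brief sketch with a hypothetical list `scores`:

```python
scores = [4, 5, 6, 7, 3]

result = scores.sort()     # sorts scores in place
print(scores)              # [3, 4, 5, 6, 7]
print(result)              # None

scores.sort(reverse=True)  # in-place reverse sort
print(scores)              # [7, 6, 5, 4, 3]
```

A common mistake is writing `scores = scores.sort()`, which replaces the list with `None`.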

#### String-List interaction

One of the most useful string methods is `split`, which returns a **list of the words** in the original string, with all of the whitespace (actual spaces, tabs, and newlines) **removed**.

In [127]:
ss = 'strings can \nalso go over \nmultiple lines.'
ss.split()

['strings', 'can', 'also', 'go', 'over', 'multiple', 'lines.']

In [128]:
ss

'strings can \nalso go over \nmultiple lines.'

Another commonly used string method is `join`. It joins a sequence of strings into a single string, placing a common **conjunction** (the separator string) between them.

In [129]:
words = ss.split()

In [130]:
words

['strings', 'can', 'also', 'go', 'over', 'multiple', 'lines.']

In [131]:
# Here, we are using a method directly on the string '_' itself, and '_' will serve as our conjunction.

'_'.join(words)

'strings_can_also_go_over_multiple_lines.'
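
`split` also accepts an explicit separator, which makes it roughly the inverse of `join`. A minimal sketch; the string `csv_line` is made up for the example:

```python
csv_line = 'alpha,beta,gamma'

fields = csv_line.split(',')   # split on commas instead of whitespace
print(fields)                  # ['alpha', 'beta', 'gamma']

rejoined = ','.join(fields)    # join with the same separator
print(rejoined == csv_line)    # True
```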

---
### *Exercise*

> Use a hyphen to join the list `words`.


### B. Dictionaries

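Dictionaries are used for collections of values that are referenced by arbitrary 'keys' instead of by a (sequential) index. Since Python 3.7, a dictionary remembers the order in which its keys were inserted, but elements are still looked up by key rather than by position. Dictionaries are created using curly brackets with **keys** and **values** separated by a colon, and key:value pairs separated by commas.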

In [132]:
foo = {} # one way to initialize an empty dictionary

In [133]:
foo

{}

In [134]:
type(foo)

dict

In [135]:
foo = {'a':4, 'b':3, 'c':5} # initialize a dictionary with keys and values

In [136]:
foo

{'a': 4, 'b': 3, 'c': 5}

In [137]:
type(foo)

dict

#### Get the number of elements in a dictionary

In [138]:
len(foo)

3

#### Elements are referenced by keys:

In [139]:
foo['b']

3

In [140]:
# foo[1] # indexing by position, as we did for lists, raises a KeyError
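
Looking up a key that is not in the dictionary raises a `KeyError`. Two common ways to avoid that are testing with `in` first, or using the `get` method, which returns a default value instead. A sketch with a hypothetical dictionary `ages`:

```python
ages = {'ann': 31, 'bob': 27}

print('bob' in ages)         # True
print('carl' in ages)        # False

print(ages.get('bob'))       # 27
print(ages.get('carl'))      # None (no error raised)
print(ages.get('carl', 0))   # 0, the default supplied as the second argument
```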

**Existing values can be modified simply by assigning a new value to an existing key.**

In [141]:
foo['c'] = -99
foo

{'a': 4, 'b': 3, 'c': -99}

**New values can be added to the dictionary simply by assigning a value to a key that does not exist yet.**

In [142]:
foo['spam'] = 'eggs'
foo

{'a': 4, 'b': 3, 'c': -99, 'spam': 'eggs'}

#### The keys and values can be extracted individually using the `.keys()` and `.values()` methods of the dictionary class.

In [143]:
foo.keys()

dict_keys(['a', 'b', 'c', 'spam'])

In [144]:
type(foo.keys())

dict_keys

In [145]:
"a" in foo.keys() # Test for membership in dictionary keys

True

In [146]:
# convert the keys to list if you are more comfortable with the list format
list(foo.keys())

['a', 'b', 'c', 'spam']

In [147]:
foo.values()

dict_values([4, 3, -99, 'eggs'])

In [148]:
'eggs' in foo.values() # Test for membership in dictionary values

True

In [149]:
type(foo.values())

dict_values

In [150]:
# convert the values to list if you are more comfortable with the list format
list(foo.values())

[4, 3, -99, 'eggs']

#### The key-value pairs can be extracted using the `.items()` method of the dictionary class.

In [151]:
foo.items()

dict_items([('a', 4), ('b', 3), ('c', -99), ('spam', 'eggs')])

In [152]:
# convert to list
list(foo.items())

[('a', 4), ('b', 3), ('c', -99), ('spam', 'eggs')]
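
A list of `(key, value)` pairs like this one can be turned back into a dictionary with `dict()`. A short sketch with made-up names, assuming Python 3.7 or later so that insertion order is preserved:

```python
pairs = [('a', 4), ('b', 3), ('c', -99)]

rebuilt = dict(pairs)          # build a dictionary from key-value pairs
print(rebuilt)                 # {'a': 4, 'b': 3, 'c': -99}
print(rebuilt['c'])            # -99

# round trip: items() gives back the same pairs, in insertion order
print(list(rebuilt.items()) == pairs)   # True
```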

### C. Tuples

Tuples (pronounced `too'-puls`) are ordered sequences that can't be modified; they have only two methods, `count` and `index`, neither of which changes the tuple. Thus, they are designed to be immutable sequences. They are created like lists, but with round brackets instead of square brackets.

#### Initialize a tuple

In [153]:
foo = () # initialize an empty tuple

In [154]:
foo

()

In [155]:
type(foo)

tuple

In [156]:
foo = ('a', 'b', 'c', 'd') # initialize a tuple of values

In [157]:
foo

('a', 'b', 'c', 'd')

In [158]:
type(foo)

tuple

In [159]:
# One caveat: If you have only one value inside of the parentheses,
# then the result is not a tuple unless you add a comma at the end

a = (1)
type(a)

int

In [160]:
a

1

In [161]:
b = ("hello")
type(b)

str

In [162]:
b

'hello'

In [163]:
c = (1,)
d = ("hello",)

In [164]:
type(c)

tuple

In [165]:
c

(1,)

In [166]:
type(d)

tuple

In [167]:
d

('hello',)

#### Get the length of a tuple

In [168]:
foo

('a', 'b', 'c', 'd')

In [169]:
len(foo)

4

#### Test for membership in a tuple

In [170]:
"a" in foo

True

#### Slice a tuple

In [171]:
foo[1:3]

('b', 'c')

#### Modify the tuple values

In [172]:
# you cannot modify the values in the tuple
# foo[2] = -999  

In [173]:
# However, you can "add" tuples together to create a new tuple. 
foo + foo

('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd')
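
To see what happens when you do try to modify a tuple, here is a small sketch. It uses `try`/`except` to catch the error so that the example runs to the end; the names `letters`, `message`, and `updated` are illustrative.

```python
letters = ('a', 'b', 'c', 'd')

try:
    letters[2] = -999          # item assignment is not allowed on tuples
except TypeError as err:
    message = str(err)

print(message)   # 'tuple' object does not support item assignment

# "changing" a tuple means building a new one from pieces of the old
updated = letters[:2] + ('z',) + letters[3:]
print(updated)   # ('a', 'b', 'z', 'd')
```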

Tuples are often used when a function has multiple outputs, or as a lightweight storage container. The parentheses are optional in most contexts (it is the commas that make a tuple), so you can return or assign multiple values at a time.

In [174]:
def tuple_test():
    return 1,2,3,4

In [175]:
tuple_test()

(1, 2, 3, 4)

In [176]:
def tuple_test():
    return (1,2,3,4) # can also add parentheses if you want

In [177]:
tuple_test()

(1, 2, 3, 4)

In [178]:
a, b, c = 1, 2, 3   # Equivalent to '(a, b, c) = (1, 2, 3)'

print(a, b, c)

1 2 3


In [179]:
(a, b, c) = (1, 2, 3)
print(a, b, c)

1 2 3
