## Introduction to Python II
**Nick Kern**
<br>
**Astro 9: Python Programming in Astronomy**
<br>
**UC Berkeley**

---

Now that we've got the basics of Python syntax and data types, we will move on to consider how to organize data within structures (called data structures!) for more advanced manipulation. We will also review how to control the flow of a program with loops, `if` statements and error handling. In addition, we will study the concept of (im)mutability, and peek under the hood to see how manipulating these objects affects their location in the computer's memory.

1. [Immutability and References](#Immutability-and-References)
2. [Lists](#Lists)
3. [Tuples](#Tuples)
4. [Sets](#Sets)
5. [Dictionaries](#Dictionaries)
6. [Conditionals](#Conditionals)
7. [Loops](#Loops)
8. [Extra: Error Handling](#Extra:-Error-Handling)


### Immutability and References

For an object to be mutable means that it can be altered after it is defined. An immutable object, by contrast, cannot be changed once it has been created: any operation that appears to modify it actually produces a new object. Mutable objects keep a single location in our computer's memory while their contents change. Built-in types like `int`, `float`, `bool` and `str` are **immutable**. Containers such as lists, dictionaries and sets, and most instances of user-defined classes, are mutable.

In order to access a single character of a string, we need to index it (we will see this later on with lists). Remember, Python is **zero-indexed**, meaning that the *first* element of a sequence is accessed with a 0, the *second* element is accessed with a 1, and so on.

In [None]:
# we can access an element in greeting
greeting = "HelloWorld!"
print( greeting[0] )

In [None]:
# can we change an element in greeting?
greeting[0] = "h"
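The assignment in the cell above is expected to fail with a `TypeError`, because strings are immutable. A minimal sketch of the usual workaround is to build a *new* string from pieces of the old one; the name `new_greeting` is just an illustrative choice.

```python
greeting = "HelloWorld!"

# Build a new string: a lowercase "h" plus everything after the first character
new_greeting = "h" + greeting[1:]

print(new_greeting)              # helloWorld!
print(greeting)                  # HelloWorld! (the original is untouched)
print(new_greeting is greeting)  # False: these are two separate objects
```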

This is related to the concept of a **reference**. A variable, as Python understands it, is nothing but a reference to some location in the computer's memory, where the data we have assigned to that variable lives. Consider two different variables, `a` and `b`, to which we have assigned the same value. In this case, both are references to the same patch of memory in the computer.

<img src="imgs/references.png" width=600px/>
<center>IC: *Learning Python*, Mark Lutz</center>

Two convenient built-in functions we will be using are `id()` and `hex()`, which together print out an object's memory address in hexadecimal format. (In the standard CPython interpreter, `id()` returns the object's address in memory.)

In [None]:
# example above
a = 3
b = a
print(hex(id(a)))
print(hex(id(b)))
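To see what happens to these references when an immutable value "changes", here is a small sketch using the `is` operator, which checks whether two names refer to the same object. Reassigning `a` does not alter the integer `3`; it simply points `a` at a different object.

```python
a = 3
b = a
print(a is b)   # True: both names refer to the same object

# "Changing" a creates a new integer object and rebinds the name a to it
a = a + 1
print(a, b)     # 4 3
print(a is b)   # False: b still refers to the original 3
```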

### Data Structures

Often we want to organize and containerize data into structures. These structures hold the data for us, and allow us to perform operations on the whole set. In fact, one data structure we have been working with all along is a string; a string is a sequence of characters whose data can be accessed element by element. The built-in data structures in Python we will explore are
* list
* tuple
* dictionary
* set


### Lists 
Lists are mutable objects, meaning their data can change but their memory address will not (unless we explicitly tell it to do so). Lists are created with square brackets `[]` with comma-separated data. Lists can hold really any kind of data: numbers, strings, booleans, other lists and data structures, and even functions and class objects. 

In [None]:
# Create a list
a_list = [1, 2, 3, 'hi', True, (1 + 3j), 9.9, ['another list!']]
print(a_list)

In [None]:
# print the element at index 3 (the fourth element)
print( a_list[3] )

In [None]:
# let's change an element!
a_list[0] = 'bye'
print(a_list)

We can access the individual data elements of a list via list indexing and slicing. The syntax is
```
list[<start>:<stop>:<increment>]
```
Note that `<start>` is inclusive and `<stop>` is exclusive.
If `<start>` or `<stop>` are left blank, the default is beginning and end respectively, and if `<increment>` is left blank, the default is 1.

<img src="imgs/list_indexing.png" width=500px>
<center> A graphic of list indexing </center>

In [None]:
# create a list
a_list = ['M', 'o', 'n', 't', 'y', ' ', 'P', 'y', 't', 'h', 'o', 'n']

In [None]:
# This will give us the first three elements
print(a_list[0:3])

In [None]:
# This will give us the last three elements
print(a_list[-3:])

In [None]:
# This will give us the same thing
print(a_list[9:])

In [None]:
# This will give us every other element of the whole list
print(a_list[::2])

In [None]:
# This will reverse the list
print(a_list[::-1])

In [None]:
# NOTE: this also works for strings
my_string = 'Monty Python'
my_string[::-1]

To generate a list, you can construct it by hand, inserting each element (as we've done above), or you can use the convenient `range()` built-in function. It generates a sequence of integers from a specified start up to (but not including) a stop point, with a given increment (very similar to list indexing!). One subtlety is that in Python 3.X, the `range()` function does not create a `list`, but a lazy `range` object that produces its values on demand. To convert it into a list we only need to use the `list()` built-in function.

In [None]:
# auto-generate a list with the range function
list(range(0, 15, 3))
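A brief sketch of the difference between a `range` object and the list built from it. The variable name `r` is only for illustration.

```python
r = range(0, 15, 3)

print(r)        # range(0, 15, 3): no list has been built yet
print(len(r))   # 5
print(r[2])     # 6: a range can still be indexed like a list
print(list(r))  # [0, 3, 6, 9, 12]: the stop value 15 is excluded
```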

**Be careful** generating lists with more than a few tens-of-millions of elements: doing so can use up a considerable amount (if not all) of your computer's memory.

**Data Structure Methods** 

Data structures are objects (which we will study later). This means that they have both attributes and methods (variables and functions) attached to them. To see the kind of methods an object has, use the `dir()` function.

In [None]:
# Methods of a list object
dir(a_list)

In [None]:
# Append
new_list = [1,2,3]
new_list.append(5)
print(new_list)

new_list.append(6)
print(new_list)

In [None]:
# Extend
new_list.extend([7, 8, 9])
print(new_list)

In [None]:
# Pop: remove (and return) the element at index 0
new_list.pop(0)
print(new_list)

In [None]:
# Get length, or number of elements in a list
len(new_list)

Try exploring some other `list` methods on your own!
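As a starting point for that exploration, here is a small sketch of a few more `list` methods. The list `letters` is made up for illustration; note that these methods modify the list in place.

```python
letters = ['c', 'a', 'b']

letters.insert(1, 'z')      # insert 'z' at index 1
print(letters)              # ['c', 'z', 'a', 'b']

letters.remove('z')         # remove the first occurrence of 'z'
print(letters.index('a'))   # 1: the position of 'a'

letters.sort()              # sort in place
print(letters)              # ['a', 'b', 'c']

letters.reverse()           # reverse in place
popped = letters.pop()      # with no argument, pop removes the last element
print(popped, letters)      # a ['c', 'b']
```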

### Tuples

Tuples are similar to lists but, like strings, are **immutable**. Tuples are constructed like a `list` but use parentheses `()`.

In [None]:
# A tuple object can take any data type, and can be indexed like lists
my_tuple = ('hi there', 5, False, 7, 8, 9, 5, 6, 5)
print(my_tuple[0:3])

In [None]:
# Try to reassign...
my_tuple[0] = 'goodbye'

In [None]:
# tuples are also objects, and have attached methods
# this one counts the number of times an element occurs
my_tuple.count(5)
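Since a tuple cannot be modified in place, one common workaround is to convert it to a list, change the list, and convert back. The sketch below uses a shortened tuple and illustrative names (`as_list`, `new_tuple`); the result is a new tuple, and the original is left unchanged.

```python
my_tuple = ('hi there', 5, False)

as_list = list(my_tuple)      # a mutable copy of the tuple's contents
as_list[0] = 'goodbye'
new_tuple = tuple(as_list)    # back to an immutable tuple

print(new_tuple)   # ('goodbye', 5, False)
print(my_tuple)    # ('hi there', 5, False)
```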

### Sets

Sets are like lists and tuples in that they hold data in an element-by-element fashion, and they can mix data types (as long as the elements are hashable, which rules out mutable objects like lists). They are constructed with curly brackets, or with the `set()` function. However, sets differ from lists and tuples in some significant ways. First, they are "unordered", meaning that you cannot rely on the order in which their elements are stored or printed. Because of this, **sets do not support indexing**. Sets also do not allow for repeated data: repeated data are eliminated upon declaration. In some cases this is a desirable feature; when we only care about unique values, not storing repeated elements means the set retains a leaner footprint in memory.

In [None]:
# Make a set
my_set = {1, 2, 10, True, 0, 'hello there', False, 'hello', 10}
print(my_set)

Notice two things: 1) the order was not preserved and 2) repeated elements were erased. Look closely: which repeated elements were erased?
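One way to investigate the question above is to check how Python compares booleans with integers. This sketch assumes nothing beyond built-in behavior: `True` equals `1` and `False` equals `0`, so a set treats each pair as one element.

```python
print(1 == True, 0 == False)        # True True
print(hash(1) == hash(True))        # True: equal objects hash the same

my_set = {1, 2, 10, True, 0, 'hello there', False, 'hello', 10}
print(len(my_set))                  # 6 unique elements out of 9 listed
print(True in my_set, 1 in my_set)  # True True: they count as the same element
```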

Sets can also be used, unlike lists and tuples, to do set operations, like unions, intersections, etc.

<img src='imgs/venn.jpg' width=400px>
<center> A Venn diagram, showing the intersection and union of two sets </center>

In [None]:
# Create overlapping sets
set1 = set([1,2,3,4,5])
set2 = set([4,5,6,7,8])

In [None]:
print(set1)
print(set2)

In [None]:
# Set union
print(set1 | set2)

In [None]:
# set intersection
print(set1 & set2)

In [None]:
# set difference
print(set1 - set2)

In [None]:
# set symmetric difference
print(set1 ^ set2)

Drawing from the above examples, how could I "append" a new item to an already existing set?

In [None]:
# append a new item to an existing set
set1 = set1 | {100}
print(set1)

In [None]:
set1.update({1000})

In [None]:
print(set1)

On your own time, try exploring some of the methods of the `set()` object, in particular, its `update()` method.
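For reference, a short sketch of two in-place ways to grow a set: `add()` for a single element and `update()` for several at once. The set `numbers` is an illustrative example.

```python
numbers = {1, 2, 3}

numbers.add(4)            # add a single element
numbers.update([5, 6])    # add every element of an iterable
numbers.add(4)            # adding an existing element has no effect

print(numbers == {1, 2, 3, 4, 5, 6})   # True
print(len(numbers))                    # 6
```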

### Dictionaries

Python dictionaries are distinctly different than lists, tuples and sets. A dictionary is a data structure that holds the data itself and also *assigns each piece of data a name*. Like sets, dictionaries do not support positional indexing (although since Python 3.7 they do remember the order in which keys were inserted). Instead, to access data, we must give the dictionary the data's associated name.

The names in a dictionary are called its *keys*, and their associated data are called its *values*. Values can be any kind of data, and may be repeated. Keys must be unique (and hashable, for example strings, numbers or tuples): a second declaration of an existing key will just overwrite the first.

Dictionaries are created with curly brackets or the `dict()` function.

In [None]:
# create a dictionary
my_dictionary = {'var1':1, 'var2':2, 'var3':4}
print(my_dictionary)

In [None]:
# another way to create the same dictionary, from a list of [key, value] pairs
my_dictionary = dict( [ ['var1', 1], ['var2', 2], ['var3', 4] ] )
print(my_dictionary)
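To illustrate the rule that keys are unique while values may repeat, here is a minimal sketch with a made-up dictionary `d` in which the key `'a'` is declared twice.

```python
# 'a' appears twice as a key; the value 1 appears twice as a value
d = {'a': 1, 'b': 1, 'a': 2}

print(d)        # {'a': 2, 'b': 1}: the second 'a' overwrote the first
print(len(d))   # 2
```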

In [None]:
# inspect a dictionary's keys with the keys() method
my_dictionary.keys()

In [None]:
# inspect values with the values() method
my_dictionary.values()

In [None]:
# Access data by indexing with a key name
my_dictionary['var1']

In [None]:
# Overwrite an already existing key
my_dictionary['var1'] = 100
print(my_dictionary)

In [None]:
# Add a new key
my_dictionary['new_key'] = 'newbie'
print(my_dictionary)

In [None]:
# how do I erase a key? let's look at some dictionary methods...
dir(my_dictionary)
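Among the methods listed by `dir()`, `pop()` removes a key and returns its value; the `del` statement removes a key as well. The sketch below rebuilds the dictionary from the cells above so that it runs on its own, and also shows `in` and `get()` for checking a key without raising an error.

```python
my_dictionary = {'var1': 100, 'var2': 2, 'var3': 4, 'new_key': 'newbie'}

del my_dictionary['var3']                 # remove a key with the del statement
removed = my_dictionary.pop('new_key')    # remove a key and keep its value
print(removed)                            # newbie

print('var3' in my_dictionary)                # False: the key is gone
print(my_dictionary.get('var3', 'missing'))   # missing: a default instead of an error
print(my_dictionary)                          # {'var1': 100, 'var2': 2}
```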

### Breakout

1.
There is a message hidden in the following string, try to decode it using indexing. Hint: the message begins with the first character of the string. `T3j4hs92ij38sjsi_810i_2_sj1s_jj2a^@(_#JSh!JQiJD)dpw-d.1(e__FnJs3_jdamaa2eap2s234sjaws1(#a#*(g@#@easij9ja029dj9@`
<br><br>

2.
Create a dictionary called `cities` that has as its keys the names of a few cities, and as its values the state that each city resides in. Next, create a dictionary called `states` that has as its keys the names of a few states and as its values their state abbreviations. Using these dictionaries and given a city name, how can I print the abbreviation of the state it resides in with one line of code?
<br><br>

3.
Use `sets` and set operations to find which characters these two sentences have in common:
```
It is a wonderful day outside!
``` 
and 
```
What terrible weather we are having!
```

In [None]:
whatever = "T3j4hs92ij38sjsi_810i_2_sj1s_jj2a^@(_#JSh!JQiJD)dpw-d.1(e__FnJs3_jdamaa2eap2s234sjaws1(#a#*(g@#@easij9ja029dj9@"

In [None]:
print(whatever[:-12:4])

In [None]:
cities = {"Berkeley":"California", "Ann Arbor":"Michigan", "Seattle":"Washington"}
states = {"California":"CA", "Michigan":"MI", "Washington":"WA"}
city_name = "Berkeley"
print( states[ cities[ city_name ] ] )

In [None]:
set1 = set("It is a wonderful day outside!")
set2 = set("What terrible weather we are having!")

print(set1)
print(set2)

In [None]:
set1 & set2

### Flow Control: Loops and Conditionals


Often we want more control over a program's execution than what is allowed by a single block of linear code. We can create conditionals such that certain operations are performed only when a specific criterion is met. We can also take a block of code and loop (or iterate) over it to perform the same task multiple times. Combining these allows us to take the syntax we've learned up to now and create sophisticated and "smart" programs.

### Conditionals

Conditionals are statements that enact an "if-then" logic. An "if" statement says, if a certain condition is met then perform some operation. We can also include an "elif" (or "else-if") statement, which says, "if the previous condition was not met, and if this condition is met, then...". And lastly an "else" statement, which just says, "if all previous conditions were not met, then..." The conditionals are accepted if the result is True, and rejected if the result is False. We can construct conditionals with the comparison operators (`==, !=, <, <=, >, >=`) we learned before.

Here, we will also introduce a **key syntactical element** of Python: indentation. You may have noticed that for basic arithmetic, Python is insensitive to extra whitespace. In other words, `2*2` is the same as `2 * 2` is the same as `2    *2`. *This is not the case for indentation*. In Python, indentation plays a role similar to the one that braces {} play in C or Java: it defines a block which, casually, means "the following lines of code will be grouped together." In terms of conditionals, indentation specifies which lines of code belong to the conditional.

Let's explore the syntax of an `if` statement.

In [None]:
# http://anh.cs.luc.edu/python/hands-on/3.1/handsonHtml/ifstatements.html

temperature = float(input('What is the temperature in F? '))

if temperature > 70:
    print('Wear shorts.')

print("...end of program")

Note that if the condition isn't met, the indented code is completely skipped.

In [None]:
temperature = float(input('What is the temperature in F? '))

if temperature > 70:
    print('Wear shorts.') 
    
else:
    print('Wear pants.')
    
print("...end of program")

Note that the `else` statement is executed only if the first condition fails. It could have drastically failed, or just barely failed, either way the `else` statement is executed.

In [None]:
temperature = float(input('What is the temperature in F? '))

if temperature > 70:
    print('Wear shorts.') 

elif temperature > 20:
    print('Wear pants.')
    
elif temperature > 10:
    print('do something')
    
else:
    print("Don't go outside.")
    
print("...end of program")

Note that if the first condition is met, the later branches are skipped, even if their conditions would also be true. What would happen if we changed the `elif` to an `if`?
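One way to explore that question is to run both versions side by side. In this sketch the temperature is a fixed, hypothetical value (so no `input()` is needed), and the messages are collected in lists named `chained` and `independent` purely for illustration.

```python
temperature = 80  # hypothetical value, fixed so the example runs without input()

# Chained with elif: at most one branch runs
chained = []
if temperature > 70:
    chained.append('Wear shorts.')
elif temperature > 20:
    chained.append('Wear pants.')

# Two independent if statements: every condition is tested
independent = []
if temperature > 70:
    independent.append('Wear shorts.')
if temperature > 20:
    independent.append('Wear pants.')

print(chained)      # ['Wear shorts.']
print(independent)  # ['Wear shorts.', 'Wear pants.']
```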

### Loops

Loops allow us to iterate over a chunk of code to perform it multiple times. Like conditionals, code that is looped over is specified by **code indentation**. In general there are two kinds of loops, a **for loop** and a **while loop**. A for loop runs its block of code once for each element of a sequence (such as a list). A while loop repeats indefinitely while some condition is met, and stops when the condition is no longer true. We will see examples of both.

**`for` loops**

The general syntax of a FOR loop is:
```
<preceding code>
for <iterator> in <array>:
    <operation1>
    <operation2>
    ...
    <final operation>
    
<succeeding code>
```
You can see that the indentation of `<operations>` specifies which lines of code will be iterated over. In this case, the indented block is repeated `N` times, where `N` is the number of elements in `<array>`.

In [None]:
# a really simple for loop
for i in [1, 2, 3, 4]:
    print(i)
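A `for` loop is not limited to lists. As a small sketch, the loop below iterates over the characters of a string and then over the keys of a dictionary; the `constellations` dictionary is a made-up example.

```python
# Looping over a string visits one character at a time
collected = []
for letter in 'abc':
    collected.append(letter.upper())
print(collected)   # ['A', 'B', 'C']

# Looping over a dictionary visits its keys
constellations = {'Ori': 'Orion', 'UMa': 'Ursa Major'}
for abbr in constellations:
    print(abbr, 'is short for', constellations[abbr])
```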

We can use a `for` loop to accumulate a running total:

In [None]:
# A simple for loop to find sum of all numbers up to 99
counter = 0

for i in range(100):
    counter += i
    
print(counter)

How can I combine a conditional to find sum of all odd numbers? Hint: think of the modulus operator `%`

In [None]:
# A simple for loop to find sum of all odd numbers up to 99
counter = 0

for i in range(100):
    if i % 2:
        counter += i
        
print(counter)

Is there an even more efficient way to solve this problem?

In [None]:
# hint: range() can step by 2, so no conditional is needed
sum(range(1, 100, 2))

We can put for loops inside of other for loops!

In [None]:
# A nested FOR loop to get all possible combinations
perm = []

for i in ['a', 'b', 'c']:
    for j in ['a', 'b', 'c']:
        perm.append(i+j)
        
print(perm)

**`while` loops**

Now let's look at a `while` loop, which has a slightly different syntax.

```
<preceding code>
while <condition>:
    <operation1>
    ...
    
<succeeding code>
```
In this case, the condition is evaluated at the beginning of each pass through the loop. If it is `True`, the indented block is executed, and then the condition is checked again. This is repeated indefinitely until the condition is no longer `True`. Be careful, because if you set up a loop with no way to exit it will **actually repeat forever** (or until the batteries on your laptop run out).

In [None]:
# A while loop
counter = 0

while counter < 5:
    print("the counter =", counter)
    counter += 1
    

print("the loop finished!")
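One defensive pattern against accidental infinite loops is to add a cap on the number of iterations to the loop condition. The sketch below halves a number until it drops to 1 or below; the names `value`, `n_steps` and `max_steps` are illustrative, and the cap of 1000 is an arbitrary choice.

```python
value = 100.0
n_steps = 0
max_steps = 1000   # safety cap: the loop can never run more than this many times

while value > 1.0 and n_steps < max_steps:
    value /= 2
    n_steps += 1

print(n_steps, value)   # 7 0.78125
```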

**Exiting and Skipping Loops**

To have more control over the structure of nested loops and conditionals, we can utilize the `continue` and `break` statements, which allow us to skip an iteration within a loop and to exit a loop entirely, respectively.

Let's say you are iterating through a loop and would like to skip one particular iteration while still finishing the remaining iterations in the loop. In this case, you would use the `continue` statement.

In [None]:
# use continue in a loop to skip the rest of the code in that given iteration
numbers = [1, 2, 3, 4.0, 5, 6, 7.0, 8]
for n in numbers:
    if type(n) != int:
        continue
    print("The number %s is an integer" % n)


The `break` statement is similar to the `continue` statement, but in this case it not only stops the current iteration of the loop but exits the loop entirely.

In [None]:
# Use a break statement to kill the loop
for n in range(10):
    if n > 5:
        print("n > 5")
        break
    else:
        print("n = %d" % n)

As we have seen above, sometimes we use `for` loops to iterate over an array to perform a calculation and store the result into a data structure. There is another way to do this in a single line, called a **list comprehension**, which combines the concepts of a `list`, a `for` loop and a conditional, all in one line of code.

In [None]:
# Store all numbers from 1 to 100
result = [i for i in range(1, 101)]
print(result)

In [None]:
# Store all multiples of three from 1 to 100
result = [i for i in range(1, 101) if i % 3 == 0]
print(result)

In [None]:
# do the list comprehension explicitly
result = []
for i in range(1, 101):
    if i % 3 == 0:
        result.append(i)
        
print(result)
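The expression at the front of a list comprehension can also transform each element, and a comprehension can contain more than one `for`. A short sketch, with illustrative names `squares` and `pairs`; the second comprehension reproduces the nested loop over `['a', 'b', 'c']` from earlier.

```python
# Transform each element: the squares of 1 through 5
squares = [i**2 for i in range(1, 6)]
print(squares)   # [1, 4, 9, 16, 25]

# Two for clauses behave like nested loops (the leftmost is the outer loop)
pairs = [i + j for i in ['a', 'b', 'c'] for j in ['a', 'b', 'c']]
print(pairs)     # ['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
```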

### Breakout

1. Write your own magic 8-ball. Create a program that asks the user for a yes-or-no question. Make the program reply with 1 out of 8 possible responses, ranging from "it is certain to come true" to "impossible". You can use a random number generator to dictate which response to give the user. Upon reply, print out the user's question and the magic 8-ball's response, and give them the option to ask another question or to exit the program. You can use the following syntax to generate a random integer between 1 and 8 (inclusive). We will learn more about importing external modules in the next lecture.
```
import random
rand_number = random.randint(1, 8)
```
<br><br>
2. Starting with the two dictionaries defined below:
    1. add three more cities and three more states to their corresponding dictionary (do not add them into the cell below, create a new cell and add them to the already existing dictionary).
    2. using the `state2abbr` dictionary, create a new `abbr2state` dictionary, which has abbreviations as keys and states as values
    3. using a `for` loop and the `state2abbr` dictionary, print `<statename> has abbreviation <abbr>` for each state in the `state2abbr` dictionary
    4. using a `for` loop and the `abbr2state` dictionary, print `<cityname> is in <statename>` for each city in the `cities` dictionary
    5. write a program that asks the user for input and prints out either the abbreviation of a state or the state in which a city resides. First, have the code ask the user to choose between getting a state abbreviation and getting the state in which a city resides. Once they have made their selection, use the dictionaries we defined above to answer their question. If you cannot answer their question, tell them so, and prompt them to either start over from the beginning or exit the program.

In [None]:
state2abbr = {
    'Michigan': 'MI',
    'Oregon' : 'OR',
    'California' : 'CA',
    'Nevada' : 'NV' }

cities = {
    'Ann Arbor' : 'MI',
    'Chicago' : 'IL',
    'Portland' : 'OR',
    'Berkeley' : 'CA',
    'San Francisco' : 'CA' }

### References of mutable objects

One needs to be careful when working with mutable objects, particularly when passing references between variables. Consider the following example.

In [3]:
# Define a to be a simple list
a = [1, 2, 3]
# Assign b to refer to the same list as a
b = a

print(a)
print(b)

[1, 2, 3]
[1, 2, 3]


In [4]:
# Now change a, which is a mutable object
a[0] = 100
print(a)

[100, 2, 3]


In [5]:
# What about b?
print(b)

[100, 2, 3]


If you haven't seen something like this before, this may be a huge shock to you. How did we change `b` when we only directly altered `a`? The answer involves the two concepts we previously discussed, mutability and references. A list is a mutable object, meaning that we can alter it without having to change its memory address. Also by using `b=a` syntax, we merely passed the reference of `a` over to `b`, meaning they both point to the same object in memory. If we affect the data that `a` references, we are also affecting the data `b` references. We can confirm this by printing the memory addresses of the two variables.

In [6]:
print( hex(id(a)) )
print( hex(id(b)) )

0x103008c08
0x103008c08


You can avoid this by creating a new copy of the object in memory. Some ways to do this are to 1) slice the list (`a[:]`), 2) take a `list()` of the list, 3) multiply the list by 1, or 4) make a copy with the `copy` module.

In [7]:
# Take a `list()` of a to make a new copy in memory and assign to b
a = [1,2,3]
b = list(a)
print(hex(id(a)))
print(hex(id(b)))

0x102d892c8
0x10302cac8


In [8]:
# Alter elements of a
a[0] = 100
print(a)

[100, 2, 3]


In [9]:
# doesn't affect b
print(b)

[1, 2, 3]
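The four copying approaches listed earlier can be compared in one short sketch. Each produces a list that is equal to `a` in value but is a separate object in memory. The list name `copies` is illustrative. Note that these are shallow copies: the new list is independent, but any mutable objects nested inside it are still shared.

```python
import copy

a = [1, 2, 3]

# Four ways to make a new list with the same contents
copies = [a[:], list(a), a * 1, copy.copy(a)]

for c in copies:
    print(c == a, c is a)   # True False: equal values, different objects

a[0] = 100
print(copies[0])            # [1, 2, 3]: the copies are unaffected
```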


### Extra: Error Handling

When we anticipate that a piece of code might fail, we can use the `try: ... except:` syntax to "try" something out and, if it fails, do something else instead. If an error is not caught by an `except` block, Python raises an exception (an error) and stops the program. Sometimes this is actually desirable: anytime you write a piece of code from scratch, you are bound to get something a little off. This introduces bugs into the code. Some bugs cause the program to fail. These bugs are "good" bugs, because we are alerted that they exist when the code breaks. However, there are also "silent" bugs: bugs that change the code from doing what we expect, but don't cause the code to completely break. These kinds of bugs can be dangerous because they can go completely unnoticed. If we detect that something has gone wrong, we can manually "raise" an error with the `raise` statement.

In [10]:
# What are some kinds of errors we can raise?
1 / 0

ZeroDivisionError: division by zero

In [11]:
5 / 'five'

TypeError: unsupported operand type(s) for /: 'int' and 'str'

In [12]:
an_integer = int('hello')

ValueError: invalid literal for int() with base 10: 'hello'

In [13]:
# Practice with a try statement
while True:
    try:
        number = int(input('Please enter an integer: '))
        break
    except ValueError:
        print("That wasn't an integer, try again...")
        
print("Your number is", number)

Please enter an integer: hello
That wasn't an integer, try again...
Please enter an integer: hello
That wasn't an integer, try again...
Please enter an integer: 3
Your number is 3
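The same `try`/`except` pattern works without user input. As a sketch, the loop below tries to convert each string in a made-up list `raw_values` to an integer and skips the ones that fail, rather than stopping the whole program.

```python
raw_values = ['3', 'hello', '42', '7.5']   # illustrative input strings
parsed = []

for text in raw_values:
    try:
        parsed.append(int(text))
    except ValueError:
        # int() raises ValueError for strings that are not integer literals
        print("Skipping", repr(text))

print(parsed)   # [3, 42]
```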


We can also raise an error ourselves when some condition that our code relies on is not met:

In [15]:
password = input("Please choose a password with at least 7 characters: ")
if len(password) < 7:
    raise Exception("This password is not long enough")

Please choose a password with at least 7 characters: sdkdksksk
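Raising a more specific exception type than `Exception` lets the calling code catch exactly that kind of error. This sketch uses a hard-coded, hypothetical password (so it runs without `input()`) and the built-in `ValueError`; the choice of `ValueError` is an assumption about what fits best here, not a requirement.

```python
password = "abc"   # hypothetical password that is too short

try:
    if len(password) < 7:
        raise ValueError("This password is not long enough")
    message = "Password accepted"
except ValueError as err:
    # str(err) recovers the message passed to ValueError
    message = str(err)

print(message)   # This password is not long enough
```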
