# Module 2 – Diving Deeper into Fundamentals

_Goal: Expand the types we're comfortable with, and learn about Python's fundamental whitespace-delimited blocks._

Topics we'll cover:
* Strings continued
* Other built-in collection types
    * Tuples
    * Sets
    * Dictionaries
* Flow control
* Functions
* Conditionals
* Identity versus Equality

#### Companion Reading for This Module

[Official Tutorial](https://docs.python.org/3/tutorial/index.html): Sections 4.1-4.8, 5.1-5.8, 6.1-6.2

## 2.1 Revisiting Strings

In [None]:
"123"[0] = "4"  # remember: immutable

In [None]:
s = "Hi! I'm Lynn Root"
s[8]  # but strings are still index-able

In [None]:
s[:5] # as well as slice-able

#### The string API
_Some helpful methods to use on strings_

In [None]:
s.lower()  # all lower case

In [None]:
# can "chain" together
s.replace("Hi!", "Hello.").replace("Lynn Root", "Super Woman")

In [None]:
"  \t  a string with leading and trailing whitespace   \n  ".strip()

In [None]:
"we can also titlecase strings".title()

In [None]:
"or, split, things".split(", ")  # split on a separator string

In [None]:
"or\nsplit\nmultiple\nlines".splitlines()  # split by newline character

Python supports a few different ways to "format" strings, or fill in strings with variables.

There are some nuances behind each approach, and in the future you may reach for one or the other depending on what you're trying to do. But for now, you can pick one that you like the best, or rotate through them.

In [None]:
name = "Lynn"
surname = "Root"
"Hi! I'm " + name + " " + surname

In [None]:
f"Hi! I'm {name} {surname}"  # f-strings 

In [None]:
# old style string formatting
"Hi! I'm {} {}".format(name, surname)

In [None]:
"Hi! I'm {n} {s}".format(s=surname, n=name)

In [None]:
"Hi! I'm {} {} {}".format(name, surname)  # IndexError: three placeholders but only two arguments

In [None]:
# old _old_ style string formatting
"Hi! I'm %s %s" % (name, surname)

#### Recall

The `%` operator was used for modulus when dealing with `int`s. For `str`s, it does the above. The two uses have no real connection beyond sharing the same operator. We'll discuss this more in the context of object-oriented programming, but for now you should note that the behavior of a particular operator (or method) can differ entirely depending on the types it is applied to.
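
To make the point about operators concrete, here is a minimal sketch showing `%` and `+` behaving differently depending on the types involved (the values are made up for illustration):

```python
# With two ints, % computes the remainder (modulus)
remainder = 7 % 3

# With a str on the left, % performs string formatting instead
formatted = "%s is %d years old" % ("Lynn", 35)

# The same is true of +: addition for numbers, concatenation for strings
total = 1 + 2
joined = "1" + "2"

print(remainder)
print(formatted)
print(total, joined)
```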


Python's library of [string manipulation methods](https://docs.python.org/3/library/stdtypes.html#string-methods) is extensive and extremely useful for manipulating text.

## 2.2 More Types

### 2.2.1 Tuples

In [None]:
a = (1, 2, 3)
a

In [None]:
b = 1, 2, 3
a == b

In [None]:
c = [1, 2, 3]
a == c

When to use a tuple?

Use a tuple when you _know_ you will not change it.
For instance, you'd probably use a tuple when dealing with coordinates.

In [None]:
coords = (40.8075, 73.9626)

If you're familiar with the C programming language, it's sort of comparable to `struct`s. 

Basically, the `list` type is intended for homogeneous sequences, like all integers, all floats, all strings. `tuple`s are meant as heterogeneous data structures.

Granted, a lot of the time we create lists with different types - a list gives us a lot of flexibility. And tuples can still be homogeneous. So it's okay not to use lists or tuples the _intended_ way. It wouldn't be Python without that flexibility.

But to build on that, `tuple`s are for when order is important. Take coordinates: the first item will always be the latitude; the second is the longitude. Later in the course we'll learn about "name"-based data structures. But here, tuples can be seen as index-based data structures, where each index of a tuple has a specific meaning. Another example:

In [None]:
person = ("lynn", 35, 157.5)  # name, age, height in cm

Other reasons to use a tuple:

A tuple is typically slightly faster to create than a list with the same items, and it uses less memory.

Since it's essentially "read-only", you are making your code safer. You or someone using your code can't change the value of that tuple. 
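
A minimal sketch of what "read-only" means in practice. The `point` tuple is a made-up example, and `try`/`except` (not covered in this module) is used only so the example keeps running after the error:

```python
point = (3, 4)  # hypothetical x, y pair

try:
    point[0] = 10  # tuples do not support item assignment
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# To "change" a tuple, build a new one instead
moved = (point[0] + 1, point[1])
print(point)
print(moved)
```

The original `point` is untouched; `moved` is a brand new tuple.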

In a little bit, we'll see another use for tuples.

### 2.2.2 Sets

A set is an unordered collection of unique items.

In [None]:
my_set = {1, 2, 3}  # curly bracket "list"
my_set

In [None]:
my_set = set([1, 2, 3])  # built-in function takes an iterable

In [None]:
my_set = {1, 1, 2, 2, 3, 3}
my_set

In [None]:
my_set.add(4)

In [None]:
my_set

In [None]:
my_set.add(1)
my_set

In [None]:
my_set.remove(2)
my_set

In [None]:
my_set.remove(2)  # raises KeyError: 2 is no longer in the set

In [None]:
my_set.discard(2)  # remove if exists, otherwise do nothing 

In [None]:
my_set[0]  # no indexing, no order

In [None]:
lynns_fav_tv = {"Succession", "The Marvelous Mrs. Maisel", "Madam Secretary"}

In [None]:
petes_fav_tv = {"BoJack Horseman", "I Think You Should Leave", "Succession"}

In [None]:
lynns_fav_tv + petes_fav_tv  # can't add them together

In [None]:
lynns_fav_tv | petes_fav_tv  # union

In [None]:
lynns_fav_tv & petes_fav_tv  # intersection

In [None]:
lynns_fav_tv ^ petes_fav_tv  # symmetric difference

In [None]:
lynns_fav_tv - petes_fav_tv  # difference

In [None]:
petes_fav_tv - lynns_fav_tv  # difference

In [None]:
tv_list = list(lynns_fav_tv)  # cast to a list

In [None]:
type(lynns_fav_tv), type(tv_list)

### 2.2.3 Corollary: Unpacking

In [None]:
coords

In [None]:
lat, long = coords  # tuple unpacking
lat

In [None]:
person

In [None]:
name, age, height = person

In [None]:
name, _, _ = person  # throw away using `_`

In [None]:
name, remaining = person  # ValueError: too many values to unpack

In [None]:
name, *remaining = person  # collect remaining using *

In [None]:
name

In [None]:
remaining

In [None]:
first, second, third = lynns_fav_tv  # works for sets too (but in arbitrary order)

In [None]:
first, second, third = tv_list  # and lists

### 2.2.4 Dictionaries

Without a doubt, `dict` is the workhorse of Python's built-in types, and is arguably a large reason Python became successful early on.

`dict`s are associative mappings ("maps"). They map a set of (unique) keys to (possibly non-unique) values.

A `dict` uses curly brackets like a `set`, but inside is a mapping of keys to values (rather than just comma-separated items):

In [None]:
mapping = {"a": 2, "b": 5, "c": 7}

In [2]:
# a more readable way to define a dictionary
mapping = {
    "a": 2,
    "b": 5,
    "c": 7,
}
mapping

{'a': 2, 'b': 5, 'c': 7}

In [3]:
mapping["a"]  # look up a value by its key

2

In [4]:
"c" in mapping  # membership test

True

In [None]:
2 in mapping  # only looks at keys, not values

In [None]:
mapping["d"]  # doesn't exist

In [None]:
mapping["d"] = -5  # mutable - can add new key/value pairings

In [None]:
mapping["a"] = 10  # or update values of a key

In [None]:
# another way to define a dict - with the built-in function
mapping2 = dict(a=10, b=5, c=7, d=-5)  # keys are written as keyword arguments, but are stored as strings

In [None]:
mapping == mapping2  # can compare to see if two dicts are equal

In [None]:
mapping3 = {
    "b": 5,
    "d": -5,
    "c": 7,
    "a": 10,
}

In [None]:
mapping == mapping3  # order doesn't matter when comparing

In [None]:
mapping3  # but insertion order is preserved (guaranteed as of Python 3.7)

In [None]:
mapping.pop("d")  # removes the "d" key/value pairing and returns its value

In [None]:
mapping

#### Exercise 1

What does unpacking look like with dictionaries?

In [None]:
my_dict = {"a": 1, "b": 2, "c": 3}
first, *remain = my_dict

In [None]:
first  # just keys

#### The `dict` API
_Some methods you can use on dictionary objects._

In [None]:
mapping.keys()  # returns a set-like view of the dictionary's keys

In [None]:
mapping.values()  # returns a view of the dictionary's values (not set-like: values may repeat)

In [None]:
mapping.items()  # returns a view of the dictionary's (key, value) pairs
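
The objects returned by these methods are "views" onto the dictionary rather than copies. A small sketch, using a made-up `inventory` dictionary, of what that means:

```python
inventory = {"apples": 3, "pears": 5}  # hypothetical example data

keys = inventory.keys()
values = inventory.values()

# Views are "live": they reflect later changes to the dictionary
inventory["plums"] = 3
key_list = list(keys)
value_list = list(values)  # note that values may repeat
print(key_list)
print(value_list)

# Key views support set operations such as intersection
shared = keys & {"pears", "kiwis"}
print(shared)
```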

#### Quick Aside: dictionaries + strings
You can also use dictionaries for string formatting!

In [None]:
"Hi! I'm {name} {surname}".format(surname=surname, name=name)  # recall this from earlier

In [None]:
full_name = {
    "name": "Lynn",
    "surname": "Root",
}

In [None]:
"Hi! My name is {name} {surname}".format(**full_name)  # dict unpacking - like earlier

In [None]:
"Hi! My name is {n} {s}".format(**full_name)  # keys must match those in `{}`

#### Interlude

So far, we've dealt with `list`s and `tuple`s, which are quite happy to contain any Python object at all. (`set`s are a little pickier, for the same reason as `dict` keys below.)

The **values** of a `dict` are similar – they can be any Python object. `dict` keys, however, cannot! The precise name for objects that are suitable for use as keys is "hashable objects", which we'll discuss as part of object-oriented programming. For the moment, you may think of these objects as being roughly synonymous with "immutable objects". In particular, a `list` cannot be a key in a `dict` (or an element of a `set`).

But, this is another case for `tuple`s! A `tuple` - since it's immutable - _can_ be a key in a dictionary, as long as everything inside it is hashable too.

In [None]:
book_notes = {
    # k: (page, line); v: notes
    (5, 11): "This passage shows that...",
    (25, 19): "Here is another place where ...",
}

In [None]:
# don't worry - Python will complain if you use lists as keys
book_notes = {
    # k: (page, line); v: notes
    [5, 11]: "This passage shows that...",
    [25, 19]: "Here is another place where ...",
}
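
Hashability can be checked directly with the built-in `hash()` function. The sketch below wraps that check in a made-up helper, `is_hashable` (functions are covered later in this module, and `try`/`except` is used only to catch the error):

```python
def is_hashable(obj):
    """Return True if obj could be used as a dict key or set element."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

tuple_ok = is_hashable((5, 11))      # a tuple of ints
list_ok = is_hashable([5, 11])       # a list
nested_ok = is_hashable((5, [11]))   # a tuple that contains a list

print(tuple_ok, list_ok, nested_ok)
```

The last case shows why "immutable" is only an approximation: a tuple is hashable only if all of its items are.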

#### Exercise 2

Make a dictionary that contains each (lowercase) letter of your full name as a key. The value for each key should be the number of times the letter appears in your name. For instance, for my name, "lynn root":

    counts = {"l": 1, "y": 1, "n": 2, "r": 1, "o": 2, "t": 1}

In [None]:
counts = {"l": 1, "y": 1, "n": 2, "r": 1, "o": 2, "t": 1}

## 2.3 Python Modules

You can think of a module informally as being a "file with Python code in it". Strictly speaking, this isn't correct, and a module is a kind of object just like any other we've seen – `int`s, `list`s, `dict`s, et cetera. But mentally it's a good model for the common case.

In the case of Jupyter notebooks, you can again informally consider an entire notebook to be similar to a Python module.

We call a set of related modules a library, or sometimes a full _application_.

Python uses modules (again informally, files) to group together related code – so when writing your own Python code, you often may want to break it up into multiple files which contain related pieces of functionality. But additionally, we've so far been using Python's "built-in" objects – which are objects we can use without additional code because they're pre-imported and available for us.

Python, however, also comes with an extensive set of modules which are installed with it: not pre-imported, but easily accessible. We call this set of modules its **standard library**. Any computer with Python installed will have these modules as well.

In [None]:
my_name = "Lynn Root"
my_name = my_name.lower().replace(" ", "")  # data clean up

In [None]:
import collections  # module

In [None]:
# "Counter" class is defined in the "collections" module
# and `.` is how we access what's available in that module
char_count = collections.Counter(my_name)

In [None]:
char_count

In [None]:
counts == char_count

This module was the [`collections` standard library module](https://docs.python.org/3/library/collections.html). It contains:

> specialized container datatypes providing alternatives to Python’s general purpose built-in containers
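
As noted at the start of this section, a module is itself an object. A quick sketch showing that the imported `collections` name behaves like any other value (the `c` alias is just for illustration):

```python
import collections
import types

module_type = type(collections)
print(module_type)  # the `module` type

is_module = isinstance(collections, types.ModuleType)
print(is_module)

# Like any object, a module can be bound to another name
c = collections
letter_counts = c.Counter("lynnroot")
print(letter_counts["n"])
```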


In [None]:
# before, we imported the whole "collections" module.
# we can also import specific objects from a module
from collections import namedtuple

In [None]:
Person = namedtuple("Person", ["name", "age", "height"])

In [None]:
myself = Person("lynn", 35, 157.5)

In [None]:
myself

In [None]:
myself[1]  # indexable, just like a tuple

In [None]:
myself.age  # but now can look up via the name of the attribute on the instance

There is a huge list of [standard library modules](https://docs.python.org/3.8/library/index.html).

**Familiarizing yourself with a number of key modules is very much recommended** and will often be needed both in this course and in any future programming you do in Python.

Here is a short list of particularly noteworthy modules, only some of which we'll cover in further detail:

* `re` - an implementation of "regular expressions"
* `datetime` - manipulation of objects representing dates and times
* `math` - additional commonly used mathematical functions
* `random` - an implementation of (pseudo-) random number generation functionality
* `pathlib` - file path manipulation
* `csv` - CSV (comma-separated-value) file manipulation
* `json` - JSON format manipulation
* `os` and `sys` - interaction with your operating system and general computing environment
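
As a small taste of a few of the modules listed above, here is a sketch that uses `math`, `datetime`, `pathlib`, and `json`. The date, file path, and dictionary are made-up values, and no file is actually read or written:

```python
import json
import math
from datetime import date
from pathlib import Path

root = math.sqrt(16)                        # math: square root
iso_date = date(2024, 1, 31).isoformat()    # datetime: dates as objects
suffix = Path("notes/module2.txt").suffix   # pathlib: pieces of a file path
print(root, iso_date, suffix)

# json: convert a dict to a JSON string and back again
encoded = json.dumps({"name": "Lynn", "age": 35})
decoded = json.loads(encoded)
print(encoded)
print(decoded["name"])
```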

### Anaconda

Beyond the standard library, we in this class are also making use of Anaconda – which comes with an additional set of libraries ([depending on your operating system](https://docs.anaconda.com/anaconda/packages/pkg-docs/)).

These libraries are generally available and installable separately, but Anaconda pre-installs many useful libraries for us.

In [None]:
import requests  # not in the standard library, installed with Anaconda

response = requests.get("https://text.npr.org")
print(response.text)

### Installing Additional Modules

If a piece of functionality you need is neither in the standard library nor bundled with Anaconda (or if you happen to not be using Anaconda in the future), there is an immense set of libraries available to you.

The most common place libraries are uploaded to (and installable from) is a central index of packages known as [PyPI](https://pypi.org/) (pronounced "pie-pee-eye", _not_ "pie-pie" - although people often interchange the pronunciation).

We can install a package from PyPI using a package manager, a tool that manages the installation of Python packages. The concept of a package manager extends beyond Python, which we'll discuss in future lectures. Within Python, the most commonly used package managers are `pip` and `conda`. You might remember `pip` from your first homework; it's the most widely used. `conda` is functionally very similar to `pip`, and comes "free" when Anaconda is installed. `conda` is used more in the research subdomains of Python programming.

_Side note: `pip` is a recursive acronym for "pip installs packages"._

Here's an installation of a package called [`httpx`](https://pypi.org/project/httpx/), which is an [open-source package maintained by volunteers](https://github.com/encode/httpx):

In [None]:
# `!` is special Jupyter syntax that runs a shell command - it's not Python code
!pip install httpx 

In [None]:
import httpx  # this would error if the step above didn't happen

response = httpx.get("https://text.npr.org")
print(response.text)

`requests` and `httpx` essentially do the same thing for us here, and it's okay to use either. But you may reach for one library over another because of certain features, performance, or simply a preference for how one is used, even if the result is the same.

## 2.4 Flow Control

### 2.4.1 Looping

We've learned what a [_sequence_](https://docs.python.org/3/library/stdtypes.html#typesseq) is. Specifically, it is a data type which contains a sequence of elements, and supports operations like indexing and getting lengths.

An [iterable](https://docs.python.org/3/glossary.html#term-iterable) is a more general type of object – one which may or may not support indexing or `len()`, but which can be iterated over in a `for` loop, as in the examples below.

In [None]:
my_list = [1, 2, 3]
for x in my_list:
    sqrd = x ** 2
    to_print = f"{x}: {sqrd}"
    print(to_print)

In [None]:
for tv_show in lynns_fav_tv:
    print(tv_show, len(tv_show))
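
To make the distinction between sequences and iterables concrete, here is a minimal sketch using made-up data. `try`/`except` is used only so the example keeps running after the error:

```python
numbers_list = [10, 20, 30]   # a sequence (and therefore an iterable)
numbers_set = {10, 20, 30}    # an iterable, but not a sequence

print(numbers_list[0])        # sequences support indexing
print(len(numbers_list))

total = 0
for n in numbers_set:         # any iterable can be looped over
    total = total + n
print(total)

try:
    numbers_set[0]            # but a set cannot be indexed
except TypeError as error:
    set_error = str(error)
    print("TypeError:", set_error)
```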

#### Interlude

A `set` is an iterable (see above), but not a sequence. All sequences are iterables. 

Consider each of the other types we've learned so far. Have we seen any other iterables which are not sequences?

In [None]:
# dictionaries are not sequences but can be looped over
book_notes = {
    (5, 11): "This passage shows that...",
    (25, 19): "Here is another place where ...",
}

In [None]:
for key, value in book_notes.items():
    print(f"location: {key}")
    print(f"notes: {value}")

In [None]:
# can use unpacking for the key with the tuple
for (page, line), note in book_notes.items():
    print(f"page: {page}, line: {line}")
    print(f"notes: {note}")

In [None]:
# iterate over characters within a string;
# use the built-in `enumerate` to get the index along with the item

greeting = "Hi! My name is Lynn Root"

for index, char in enumerate(greeting):
    print(f"{index}: {char}")

In [None]:
# can use the built-in `range` function to generate a sequence of numbers
for n in range(5):  # from 0 up to but not including 5
    print(n)

In [None]:
for n in range(2, 14, 2): # start, stop, step
    print(n)

In [None]:
# another type of loop: a "while" loop

n = 2
while n < 17:
    print(n)
    n = n ** 2

### 2.4.2 Conditional Flow

Using `if`, `elif`, and `else`

In [None]:
n = 2
if n == 3:
    print("n is equal to 3")
else:
    print("n is not equal to 3")

In [None]:
n = 7
while n != 1:
    print(n)
    if n % 2:
        n = 3 * n + 1
    else:
        n = n // 2

In [None]:
n

In [None]:
for x in range(10):
    if x % 2 == 0:
        continue  # stop processing this iteration and continue to next iteration
    print(x)

In [None]:
for x in range(10):
    if x % 2 == 0:
        continue
    elif x > 7:
        break  # stop looping entirely
    print(x)

In [None]:
# the keyword "pass" can be used as a placeholder
for x in range(10):
    if x % 2 == 0:
        continue
    elif x > 7:
        pass  # TODO: remember to implement this!
    print(x)

In [None]:
# You can use "else" with "for" loops, too.

for x in range(10):
    if x % 2 == 0:
        continue  # stop processing this iteration and continue to next iteration
    print(x)
else:
    print("done processing!")

In [None]:
for x in range(10):
    if x % 2 == 0:
        continue
    elif x > 7:
        print("breaking the loop")
        break
    print(x)
else:
    print(f"last x value: {x}")  # does not get executed

## 2.5 Functions

Up until now, we've been using functions that have been defined for us. But we can define our own as well!

Unlike other languages some may be familiar with, Python uses whitespace (indentation), rather than brackets, to mark where a block of code such as a function body begins and ends.

### 2.5.1 Defining Functions

In [None]:
def greet():
    print("Hi!")  # inside function; indented with 4 spaces
    print("I am inside the function called 'greet'.")

In [None]:
greet()

In [None]:
# we can (re)define the function to take an argument
def greet(name):
    print(f"Hi {name}!")
    print("I am inside the function called 'greet'.")

In [None]:
greet("Lynn")

In [None]:
g = greet("Lynn") 

In [None]:
# What's the value of g? Why is it None?
g

In [None]:
# define a function that takes 2 arguments, and returns a value
def add(x, y):
    return x + y

In [None]:
add(2, 2)  # note: this is Jupyter automatically showing us the result

In [None]:
result = add(3, 4)

In [None]:
result  # the returned value of add

#### Exercise 3

Recall that in last week's lecture, we had the following exercise:

> Choose any real number. Write code to double that number. Add six to that. Then divide by two. Finally, subtract your original number from the result of dividing by two. You should get 3.0.

Then later we replaced the original number with a variable.

    x = 6
    (((x * 2) + 6) / 2) - x
    
Now let's make that into a function that takes one argument (the original number) and returns one float (the result).

In [None]:
def exercise(n):
    result = (((n * 2) + 6) / 2) - n
    return result

#### Exercise 4

Now, take the function you just defined, and verify that it returns `3.0` for all numbers between `-10000` and `10000`. As soon as you find a number that does not return 3.0, stop testing other numbers. But, if you do indeed find that all those numbers do return 3.0, then print the string `"Success!"`

In [None]:
for x in range(-10000, 10001):  # remember: the end of the range is exclusive
    result = exercise(x)
    if result != 3.0:
        break
else:
    print("Success!")  # should not get executed if `result != 3.0`

### 2.5.2 Scope

A _scope_ is the region of code in which a name (such as a variable) is visible. A name can't be accessed from just anywhere in a program: it can only be used within the scope where it is defined, or within scopes nested inside that one.

There are 4 types of "scope" in Python:
* built-in
* global
* local
* enclosed

You should already be familiar with the built-in scope: functions like `print`, `type`, and `help` all live there.

In [None]:
# example of global scope
a_list = [1, 2, 3]

def print_a_list():
    print(a_list)

print_a_list()

Defining our own functions introduces a new, local scope:

In [None]:
# example of local scope

result = [2, 4, 6]

def calculate(): 
    result = "a very important result"
    print("result locally:", result)
    return result

returned = calculate()
print("returned:", returned)
print("result globally:", result)

In [None]:
# example of enclosed scope

def outer_func():
    a = 1
    
    def inner_func(): # define a function w/i a function
        a = 2
        print(f"inner: {a}")
        
    inner_func()  # call the inner func within outer func
    print(f"outer: {a}")

outer_func()
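
In the example above, the assignment inside `inner_func` creates a new local `a` and leaves the outer one untouched. Going slightly beyond that example, the sketch below shows the other two possibilities: an inner function that only reads the enclosing variable, and one that rebinds it using the `nonlocal` keyword. The function names are made up for illustration:

```python
def outer_func():
    a = 1

    def read_only():
        return a  # no assignment here: `a` is looked up in the enclosing scope

    def rebind():
        nonlocal a  # assignments now target the enclosing `a`
        a = 2

    before = read_only()
    rebind()
    after = a
    return before, after

result = outer_func()
print(result)
```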

## 2.6 Comprehensions

Let's revisit our for loops. We can create dynamic lists with a single line of code with what's called a list comprehension.

In [None]:
# original
squares = []
for x in range(10):
    squares.append(x ** 2)
squares

In [None]:
# using a list comprehension
squares = [x ** 2 for x in range(10)]
squares

In [None]:
# comprehensions with multiple `for` clauses
# original 
combinations = []
for x in [1, 2, 3]:
    for y in [3, 1, 4]:
        if x != y:
            combinations.append((x, y))
combinations

In [None]:
# the order of the for & if statements is the same as in the original
combinations = [(x, y) for x in [1, 2, 3] for y in [3, 1, 4] if x != y]
combinations

We can also nest one list comprehension inside another:

In [None]:
# a list of lists
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

In [None]:
# generate a new matrix that transposes original matrix
[[row[i] for row in matrix] for i in range(4)]

In [None]:
# equivalent to:
transposed = []
for i in range(4):
    transposed.append([row[i] for row in matrix])
transposed

In [None]:
# an aside: this would be even more concise (though the rows come back as tuples):
list(zip(*matrix))

For the above example - read more on [unpacking argument lists](https://docs.python.org/3/tutorial/controlflow.html#tut-unpacking-arguments).

We can also use dictionary comprehension to create dynamic dictionaries:

In [None]:
dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
double_dict1 = {k: v * 2 for (k, v) in dict1.items()}
double_dict1

Note that not every for-loop can be condensed into a comprehension, but every comprehension can be expanded out into a for-loop.
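
A short sketch of both halves of that statement, using a made-up list of words: a dictionary comprehension alongside the loop it expands to, followed by a loop that stops early with `break`, which has no direct comprehension equivalent.

```python
words = ["tuple", "set", "dict"]  # hypothetical example data

# A dictionary comprehension...
lengths = {word: len(word) for word in words}

# ...and the for-loop it expands to
lengths_loop = {}
for word in words:
    lengths_loop[word] = len(word)

print(lengths == lengths_loop)

# A loop that stops early with `break` cannot be written directly as a comprehension
first_short = None
for word in words:
    if len(word) < 4:
        first_short = word
        break
print(first_short)
```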

## 2.7 Object Identity versus Equality

In [None]:
first = [1, 2, 3]
second = [1, 2, 3]

In [None]:
first == second  # equality

In [None]:
first is second  # identity

In [None]:
# `id()` returns the identity of the object (in CPython, its memory address)
id(first)

In [None]:
id(second)

In [None]:
third = first
first is third

In [None]:
id(first)

In [None]:
id(third)

#### Bottom-line advice

In 99.9% of cases, use `is` only with `None`, and there only because it's stylistically preferred.

Try to never care whether two objects are precisely the same object "underneath"; care instead about whether they should be considered equal.
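
A minimal sketch of this advice in practice. `find_show` is a made-up helper that returns `None` when nothing is found:

```python
def find_show(shows, title):
    """Return the title if it is in the list, otherwise None (hypothetical helper)."""
    if title in shows:
        return title
    return None

found = find_show(["Succession", "BoJack Horseman"], "Succession")
missing = find_show(["Succession", "BoJack Horseman"], "Madam Secretary")

# Identity check, reserved for None
print(missing is None)
print(found is not None)

# Everything else: compare by equality
lists_equal = [1, 2, 3] == [1, 2, 3]
print(lists_equal)
```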

You may additionally find [this wonderful article](https://nedbatchelder.com/text/names.html) by Ned Batchelder helpful to read.

## 2.8 Summary

As a quick summary, we've now covered:

* Various ways to format strings;
* More built-in types: `set`, `tuple`, `dict`;
* How to control the flow of our logic with `for`, `while`, `if`, `elif`, `else`, `continue`, `break`, and `pass`;
* How to employ comprehensions for a more concise way of creating lists and dictionaries;
* Defining and calling our own functions and what is in & out of scope;
* Importing additional functionality we may need.