#### Intro to Python (and Jupyter notebooks)
For the official Python tutorial, go to:
https://docs.python.org/3/tutorial/

I provide links to that and other parts of the documentation for many topics, so that this tutorial can also serve as a quick reference.

This tutorial was developed by me, Joseph Pedersen. I am not affiliated with the Python Software Foundation, and this tutorial is not endorsed by the Python Software Foundation.

This tutorial is intended for an audience with no Python experience.  It only covers basic Python, not third-party packages (it does show a few examples that use third-party packages, e.g. NumPy and pandas, but does not cover them in any depth).

This notebook was written for Python version 3.10, meaning that you should have version 3.10 or newer to use this tutorial.

tutorial version 20250323

Please send any errors found or suggestions for improvement to:  
[joseph.m.pedersen@gmail.com](mailto:joseph.m.pedersen@gmail.com?subject=Python_Tutorial) with ***SUBJECT: Python_Tutorial***

## Part 1: Basic syntax and some built-ins

### 1.1 Beginning

***Instructions:***

<span style="background:yellow">To complete this tutorial, RUN ***EVERY*** code cell, and complete all of the EXERCISES!</span>

If a cell produces an <span style="color:red">***Error***</span>, there should be an exercise right after it telling you how to fix the error, or an explanation of why the error was produced.

Sometimes the explanation may be ***hidden*** in a cell like the next one that says ***Remember***. Click on that cell to expand it.

<details>
    <summary><b>Remember</b></summary>
    
Explanations or solutions that I don't want you to see until after you have run a cell are hidden like this.
</details>

There are many links in this tutorial, mostly to the [***Python documentation***](https://docs.python.org/3/), that you can click on to learn more about a topic!

<span style="background:yellow; color:red">***WARNING: Make sure that you got this notebook from a trustworthy source. Never run code that may have been compromised, since it could contain malicious code that can harm your computer or compromise the data on it!***</span>

Why did I take the time to write that?  If this notebook was compromised, wouldn't the attacker remove the warning?

Maybe that's what they want you to think! 🤔

#### 1.1.1 Run a code cell in a [Jupyter notebook](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/What%20is%20the%20Jupyter%20Notebook.html) <a id="run_code"></a>

One way to [run a code cell](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Running%20Code.html) is:
1) highlight the cell by clicking to the left of it (or in it), then
2) hold shift and press enter

In [None]:
# Execute your first command in Python
print("hello, world!")

If you ran the [***code cell***](https://jupyter-notebook.readthedocs.io/en/stable/notebook.html#code-cells) above, the code in the cell was read and executed, [***printing***](https://docs.python.org/3/library/functions.html#print) `hello, world!` below the cell.

Then, the next cell (this one) was highlighted. 

This is a [***markdown cell***](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html#Markdown-Cells), so it doesn't have any code, but you can still hold shift and press enter to highlight the next cell.

#### 1.1.2 Check which Python you are running <a id="python_version"></a>

This tutorial is written for Python version 3.10 or greater. Running the next cell will show you which version you are using, and the path to that executable.

In [None]:
# Check your python version
import sys

print("Python version =", sys.version)
print("\nThe path to that Python interpreter:\n\t", sys.executable)

#### 1.1.3 Comments <a id="comments"></a>

Anything on a line after the symbol `#` is a [***comment***](https://peps.python.org/pep-0008/#comments) (unless it is inside a [string](#strings_intro)) and is ignored by Python. Notice that Jupyter Notebooks have [***syntax highlighting***](https://en.wikipedia.org/wiki/Syntax_highlighting), making the comments a different color than other code to improve readability.

Comments are important for helping anyone who reads your code understand what it is doing, including you later!

In [None]:
# This will not print anything, since it is a comment

# print("The answer to the ultimate question, of life, the universe, and everything, is", 42)

The cell above only had comments in it, so nothing was executed.

##### Exercise 1.1.3.1: Uncomment code

Sometimes code is commented out in order to not run it, without deleting it.  This is especially useful during debugging. When you want to run the code again, you can ***uncomment*** it. Uncomment the code in the cell above by removing the `#` (and the space after it) before the `print` call. Then, rerun the cell.

#### 1.1.4 Output cells <a id="output_cells"></a>

Besides printing and plotting output below the cell containing those commands, [***Jupyter notebooks***](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/What%20is%20the%20Jupyter%20Notebook.html) also sometimes create an output cell containing the value of the last expression, by default.

Running the code cell below will cause each of the three expressions to be evaluated. However, only the value of the last one is output to the screen (by default).

In [None]:
2+3  # addition
2*3  # multiplication
2**3 # exponentiation

The output cell above contains an "8" because the last expression in the corresponding input cell evaluated to "8".  Notice that the corresponding input and output cells have the same number.

##### Exercise 1.1.4.1: semicolon

In the code cell above, put a semicolon (`;`) after the last expression (`2**3`) and rerun the cell. Make sure not to put the semicolon in the comment (which is ignored by the interpreter). Put it immediately after the expression. 

***What changed?***

In Python, lines do not need to be "terminated" by a semicolon. However, in Jupyter notebooks, if you put a semicolon after the last expression in a cell, it suppresses the output.

#### 1.1.5 Adding and deleting cells <a id="cells"></a>

When a cell is highlighted in JupyterLab, there are symbols in the upper right corner that you can click on to add a cell above or below that cell. If you hover over the symbol, it also shows you the [keyboard shortcut](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Notebook%20Basics.html#Keyboard-Navigation) that you can use when in [***command mode***](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Notebook%20Basics.html#Command-mode), which is "indicated by a grey cell border with a blue left margin."  You get to command mode by clicking to the left of a cell, just outside of it, or pressing `ESC`. The shortcuts are ***A*** to add a cell ***A***bove and ***B*** to add a cell ***B***elow.

There is also a trashcan that you can click on to delete a cell. The keyboard shortcut for that when in command mode is ***D D*** to ***D***elete a cell (making you type the "D" twice so that you are sure, although you can undo the deletion with ***Z*** if you have not done anything else yet).

Once you have made a cell, you can change it to either a code cell, markdown cell, or raw cell, using the following keyboard shortcuts ***when in command mode***:

- ***Y*** = "code cell"  
- ***M*** = "markdown cell"  
- ***R*** = "raw cell"  

##### Exercise 1.1.5.1: Adding cells

Add a cell above this one.  Make it a raw cell.  Type something.

Also add a cell below this one.  Make it a code cell.  Type a comment in it.

You can use these features in this notebook to add notes/comments to yourself as you study Python!

### 1.2 Expressions <a id="expressions"></a>

Any Python syntax that evaluates to some value is referred to as an [***expression***](https://docs.python.org/3/glossary.html#term-expression).

#### 1.2.1 Parentheses <a id="parentheses"></a>

Parentheses can be used so that a long expression can span multiple lines, referred to as [***implicit line joining***](https://docs.python.org/3/reference/lexical_analysis.html#implicit-line-joining).

In [None]:
# The expression on the right hand side of this assignment statement
#     is meant to include the final term +1
too_long = 1000 + 2000 + 3000 + 4000 + 5000 + 6000 + 7000 + 8000 + 9000 + 10000 + 11000 + 12000
+ 1

print(too_long)

##### Exercise 1.2.1.1: Use parentheses for implicit line joining

Should `too_long` end in three zeros? 

Fix it by putting a left parenthesis just before the `1000`, as in `(1000 +`, and a right parenthesis just after the `1`, as in `+ 1)`. Then run the cell again.

You need to use parentheses to have an expression span multiple lines.†

†or use line continuation by putting `\` at the end of a line, referred to as [***explicit line joining***](https://docs.python.org/3/reference/lexical_analysis.html#explicit-line-joining).
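
As a minimal sketch of the two joining styles (the names `implicit_total` and `explicit_total` are made up for this illustration), both assignments below include the final `+ 1`:

```python
# Implicit line joining: the parentheses let the expression span two lines
implicit_total = (1000 + 2000 + 3000
                  + 1)

# Explicit line joining: the backslash must be the very last character on the line
explicit_total = 1000 + 2000 + 3000 \
                 + 1

print(implicit_total, explicit_total)  # 6001 6001
```

Parentheses are usually preferred, since a stray space after a backslash breaks the continuation.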

#### 1.2.2 Order of operations

Parentheses can also be used to indicate [order of operations](https://docs.python.org/3/reference/expressions.html#operator-precedence), as in math.

In [None]:
2 * (3 + 5)

### 1.3 Variables (names) <a id="variables"></a>

#### 1.3.1 Assignment statements <a id="assignment"></a>

To create a variable ([***bind a name***](https://docs.python.org/3/reference/executionmodel.html#naming-and-binding) to an object), we use an [***assignment statement***](https://docs.python.org/3/reference/simple_stmts.html#assignment), which has the following syntax:
```python
some_name = expression
```
The variable `some_name` (called the target) can be any valid identifier (name) as described in the [naming rules](#naming_rules) below.  The `expression` can be any expression. The expression will be evaluated, and the resulting object will be stored in memory, and can later be referenced using the name.

This is just one use of assignment statements. We'll see other assignment statements later, with multiple targets on the left side of the `=`, and with targets other than names.

Let's look at some examples.

In each of the lines below, the expression on the right hand side is evaluated, and that ***object*** (in each of these cases, an integer) is stored, and can later be referenced by the ***name*** on the left hand side.

In [None]:
x = 2+3  # addition
y = 2*3  # multiplication
z = 2**3 # exponentiation

##### Exercise 1.3.1.1: Print multiple values

The built-in [function](#functions) [***print***](https://docs.python.org/3/library/functions.html#print) takes an arbitrary number of [arguments](#functions), of any [type](#type).

In [None]:
# print the values of the variables
print(x, y, z)

#### 1.3.2 Naming rules <a id="naming_rules"></a>

Python names (also called [***identifiers***](https://docs.python.org/3/reference/lexical_analysis.html#identifiers)) can contain letters (a-z and A-Z), underscores, and (except for the first character) digits (0-9). Names are ***case sensitive***, as is the rest of the syntax. There is no limit to the length of a name. Names can also contain other characters; for the full specification, see [here](https://docs.python.org/3/reference/lexical_analysis.html#identifiers).

There are some [***reserved words***](https://docs.python.org/3/reference/lexical_analysis.html#keywords) which ***cannot*** be used for names.
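
If you are unsure whether a name is allowed, you can ask Python. The sketch below is only an illustration: `is_usable_name` is a hypothetical helper (not part of Python), and it uses a function definition and a loop, both of which are explained later. It combines the string method `isidentifier` with `iskeyword` from the standard `keyword` module.

```python
import keyword

def is_usable_name(candidate):
    """Hypothetical helper: True if `candidate` is a valid, non-reserved name."""
    return candidate.isidentifier() and not keyword.iskeyword(candidate)

# try a few candidate names
for candidate in ["radius", "_height2", "2fast", "my-name", "class"]:
    print(candidate, "->", is_usable_name(candidate))
```

Here `2fast` starts with a digit, `my-name` contains a hyphen, and `class` is a reserved word, so those three are rejected.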

It is also good practice not to use the names of [***built-in functions and types***](https://docs.python.org/3/library/functions.html), in case you go to use those later, forgetting that you overwrote them.

There are also some [***naming conventions***](https://peps.python.org/pep-0008/#naming-conventions).

It is best to use ***meaningful*** names whenever possible.

Rather than name your variables like this:

In [None]:
x = 10
y = 113/355

z = 3.14159 * x**2 * y

round(z, 3)

It is better to use meaningful names, since they are easier for humans to read, understand, and remember.

In [None]:
# Using meaningful names:
radius = 10
height = 113/355

volume = 3.14159 * radius**2 * height

# round the volume to three decimal places
round(volume, 3)

##### Exercise 1.3.2.1: use Python as a calculator

In [None]:
# create two variables (i.e. name two values), and print an expression containing them


##### Exercise 1.3.2.2: Who is max? <a id="max_who"></a>

The built-in [function](#functions) [***max***](https://docs.python.org/3/library/functions.html#max) returns the largest of its [arguments](#functions) (if it is given two or more arguments):

In [None]:
max(1, 2)

If we create a variable for the largest height our program allows, why might we not want to call it `max`?

In [None]:
# the maximum height allowed
max = 25

What do you think will be the output of the following cell?  Run it to see if you are correct.

In [None]:
max(1, 2)

<details>
    <summary><b>Solution</b></summary>
    
The built-in function `max` can be ***called*** with arguments, and will return the largest value.
    
However, we bound the name `max` to the value `25` in the assignment statement `max = 25`, so now the name `max` refers to the integer `25` instead of the built-in function.
    
An integer is not *callable* the way a function is, which is why we get a <span style="color:red">***TypeError***</span>`: 'int' object is not callable`
</details>

To make the name refer back to the built-in function again, you can unbind the name from the integer (delete the name binding we created) by using the [del statement](#del_keyword) described below.

In [None]:
# remove the binding that we gave to the name `max`
del max

In [None]:
# now `max` refers to the built-in function again
max(1, 2)
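
One way to avoid this problem altogether is to pick a more descriptive name. The following is a small sketch (the names `max_height` and `tallest` are made up for this example). It also shows that the built-in function lives in the standard `builtins` module.

```python
import builtins

# a descriptive name that does not shadow the built-in `max`
max_height = 25

# the built-in function is still available under its usual name
tallest = max(12, max_height, 7)
print(tallest)  # 25

# the name `max` and `builtins.max` refer to the same function object
print(builtins.max is max)  # True
```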

#### 1.3.3 Dynamic Typing <a id="dynamic_typing"></a>

Unlike many other languages, you do not need to "declare" a variable (name) before using it, and variables do not have types (objects do, more on that [later](#type)), so you can rebind a name to an object of a different type, as is done in the next cell.

It is important to keep clear the distinction between ***names*** and ***objects***. When we execute the assignment statement `msg = 42` below, we ***rebind*** the name `msg` to the object `42` (an integer); we do ***not*** change the object `"hello, world!"` (a string). If that string has another name, then it is still accessible from that name.

In [None]:
# save message (which is a string) in a variable
#     i.e. bind the name `msg` to the string object
msg = "hello, world!"

# bind another name to the same string
msg2 = msg

# the argument to the print function is the string object with name `msg`
print(msg)

In [None]:
# rebind the name `msg` so that it now refers to the integer 42
msg = 42

# the argument to the print function is the integer object with name `msg`
print(msg)

In [None]:
# this name still refers to the string
print(msg2)

#### 1.3.4 Deleting references with `del` <a id="del_keyword"></a>

As we have already seen, you can use a ***[del statement](https://docs.python.org/3/reference/simple_stmts.html#del)*** to delete a reference (e.g. unbind a name).

If the name was defined in multiple [namespaces](#namespace), like when we created our own binding for the built-in function `max`, then we can still use the name for those other references.

However, if there are no more references, then using the name will raise a <span style="color:red">***NameError***</span>.

In [None]:
# unbind the name `msg`
del msg

# using the unbound name `msg` raises a NameError
print(msg)

Reading ***[Errors](https://docs.python.org/3/tutorial/errors.html#):***  Make sure to read the errors that are produced. Then, if you run into those errors again, you'll be more likely to remember what caused them, and thus how to fix them.

The two most important lines of the error message to read are:
1. The line at the very bottom, which usually states what kind of error it is (e.g. <span style="color:red">***NameError***</span>), and other important details, like in this case the name that caused the error.
2. The line that shows which line of code (with line number) caused the error.

##### Exercise 1.3.4.1: a string by another name

Predict the output of the following cell; then run it to see if you are correct.

In [None]:
a = "another message"
b = a
del a

print(b)

<details>
    <summary><b>Remember</b></summary>
    
`del` deletes *references* to objects (i.e. unbinds names); it does ***not*** delete the objects.

If the object has other references, those can still be used without raising an error.
    
If an object has no remaining references, Python will remove it from memory (called "garbage collection"). One of the conveniences of Python is that it does this memory management for you.
</details>

### 1.4 Import statements <a id="import_keyword"></a>

Besides the built-in functions in Python, there are many modules that can be ***[imported](https://docs.python.org/3/reference/simple_stmts.html#the-import-statement)*** for increased functionality, using the syntax:
```python
import module_name
```
After importing the module, you can use variables defined in the module by using the "dot" notation:
```python
module_name.variable_name
```
If the variable refers to a [function](#functions), then you can [***call***](https://docs.python.org/3/reference/expressions.html#calls) the function using parentheses, as in:
```python
module_name.function_name(arguments)
```
You only need to import a module one time: before you use the name of the module (very similar to creating a variable).  If you try to use the module name before you import the module, it will raise a <span style="color:red">***NameError***</span>.

#### 1.4.1 Importing modules <a id="import_module"></a>

In [None]:
# import the math module
import math

# print the factorial of 4
print("4! =", math.factorial(4))

# print the value of pi
print("π =", math.pi)

**Important note:**  the `math` module is part of the [***Python Standard Library***](https://docs.python.org/3/library/index.html), so you can import it without worrying about whether it is installed.  There are many very useful modules that are not part of the Python Standard Library. Those need to be installed before they can be imported. If you are using a virtual environment, those modules need to be in that environment to be imported.
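
If you are not sure whether a module is available in your environment, one option is to ask the import system before importing it. This is only a sketch: `is_installed` is a hypothetical helper, it is meant for top-level module names only, and the second name below was made up so that it should not exist.

```python
import importlib.util

def is_installed(module_name):
    """Hypothetical helper: True if the top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None

print(is_installed("math"))                           # part of the Standard Library
print(is_installed("surely_not_a_real_module_name"))  # assumed not to exist
```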

#### 1.4.2 Importing functions from a module <a id="from_keyword"></a>

If you only want one or a few functions from a module, you can import those using the following syntax:
```python
from module_name import function1, function2
```
Then, you do not need to use the "dot" notation, e.g. `module_name.function1()`, but can rather just call the function using `function1()`.

In [None]:
# Import the factorial and sqrt functions from the math module
from math import factorial, sqrt

# Now you can call the functions without the 'dot' notation
a = factorial(4)
b = sqrt(4)

print("4! =", a, "and the square root of 4 is", b)

#### 1.4.3 Creating an alias during import <a id="as_keyword"></a>

You can also make abbreviations using the keyword [***as***](https://docs.python.org/3/reference/simple_stmts.html#import).

The following imports the module `module_name` and binds the name `mn` to that module:
```python
import module_name as mn
```

The following imports the function `function1` from the module `module_name` and binds the name `fun1` to that function:
```python
from module_name import function1 as fun1
```

In [None]:
# import the factorial function from the math module, and name it `fact`
from math import factorial as fact

# Now the name `fact` refers to the factorial function from the math module
fact(4)

##### Exercise 1.4.3.1: import

Import the `randint` function from the `random` module (part of the Python Standard Library), and call `randint(1,100)` to choose a random integer from 1 to 100 (inclusive).

In [None]:
# Put your code below


<details>
    <summary><b>Solution</b></summary>
    
```python
from random import randint
randint(1,100)
```
</details>

### 1.5 Some of the built-in [data types](https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy) <a id="builtin_types"></a>

When working with data, it is important to know what [***type***](https://docs.python.org/3/library/functions.html#type) of data it is, because the type of an object usually indicates which [attributes](#attributes) it has, and what you can do with it (e.g. whether or not it makes sense as the argument passed to some parameter in a [function](#functions)). In this section, we will become familiar with some of the built-in data types in Python.

#### 1.5.1 Numbers <a id="numbers"></a>

The three main [***numeric types***](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex) in Python are integers, real numbers (called floats), and complex numbers.

##### 1.5.1.1 Integers <a id="integers"></a>

The built-in **[int](https://docs.python.org/3/library/functions.html#int)** type can represent arbitrarily large integers ***exactly***.

In [None]:
# a famous constant
long = 602214076000000000000000

# print a string representation of the value of the object named `long`
print(long)

# print a string representation of the `type` of the object named `long`
print(type(long))
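
To see what "arbitrarily large" and "exactly" mean in practice, here is a small sketch using a googol (a 1 followed by 100 zeros). The name `googol` is chosen just for this example, and `len(str(...))` is used only to count the digits.

```python
googol = 10**100

print(googol)
print(len(str(googol)))      # number of digits: 101

# no precision is lost, even for numbers this large
print(googol + 1 - googol)   # 1
```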

##### 1.5.1.2 Floats <a id="floats"></a>

The built-in **[float](https://docs.python.org/3/library/functions.html#float)** type can represent positive or negative real numbers (most are only represented ***approximately***).

In [None]:
# a more famous constant, except the real pi goes on forever
pi = 3.14159265358979323846264
print(pi)
print(type(pi))

Notice that the float did not retain the full precision of the digits typed. Python's floats are "double-precision" (64-bit), which gives about 15-17 significant decimal digits of precision. Some information about floats is specific to your computer, and can be found by using the **[sys](https://docs.python.org/3/library/sys.html)** module (part of the Python Standard Library):

In [None]:
import sys

# largest finite float
print("The largest finite float is", sys.float_info.max)

# smallest positive normalized float
print("The smallest positive normalized float is", sys.float_info.min)
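
A few more details, sketched under the assumption that your platform uses the usual IEEE 754 double-precision floats (nearly all do). The names `tiny` and `overflowed` are made up for this example. Note that `sys.float_info.min` is the smallest positive *normalized* float; even smaller "subnormal" values exist, with reduced precision.

```python
import sys

# a subnormal float: positive, but smaller than sys.float_info.min
tiny = 5e-324
print(0 < tiny < sys.float_info.min)  # True

# multiplying past the largest finite float gives infinity
overflowed = sys.float_info.max * 10
print(overflowed)  # inf

# number of decimal digits that can always be represented faithfully
print(sys.float_info.dig)  # 15 on IEEE 754 platforms
```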

##### 1.5.1.3 Arithmetic of integers and floats <a id="arithmetic"></a>

You can use integers and floats in the same expressions. You can look up the arithmetic operations [***here***](https://docs.python.org/3/reference/expressions.html#binary-arithmetic-operations).

In [None]:
# arithmetic operators
print(2 + 3.5) # addition
print(2 - 3.5) # subtraction
print(2 * 3.5) # multiplication
print(2 / 3.5) # division

print(7 // 2) # floor division
print(7 % 2) # modulo (i.e. remainder)
print(7 ** 2) # exponentiation
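
A few details about the division operators are easy to miss, so here is a short sketch. It also uses the built-in function `divmod`, which returns the floor-division result and the remainder together.

```python
print(6 / 3)       # 2.0  (true division always gives a float)
print(7 // 2)      # 3    (floor division of two integers gives an integer)
print(7.0 // 2)    # 3.0  (if either operand is a float, so is the result)

# floor division rounds DOWN (toward negative infinity), not toward zero
print(-7 // 2)     # -4
print(-7 % 2)      # 1    (so that (-4)*2 + 1 == -7)

print(divmod(7, 2))  # (3, 1)
```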

##### 1.5.1.4 Augmented assignment <a id="augmented_assignment"></a>

If you want to update the value of a variable using the arithmetic operations above, you can perform the operation and bind the variable to the new value in an [***augmented assignment statement***](https://docs.python.org/3/reference/simple_stmts.html#augmented-assignment-statement).

For example, if `x = 5`, then you can double `x` using:
```python
x *= 2
```
This syntax multiplies `x` by `2`, then rebinds the name `x` to that value (i.e. `10`).

This accomplishes the same thing as `x = x*2`, with slightly less typing.  We will see other uses of augmented assignment in which it does more than save a few keystrokes.

Let's look at more examples:

In [None]:
x = 4
print(x)
x += 8 # addition
print(x)
x -= 3 # subtraction
print(x)
x *= 8 # multiplication
print(x)
x /= 3 # division
print(x)
x //= 3 # floor division
print(x)
x %= 5 # modulo (i.e. remainder)
print(x)
x **= 2 # exponentiation
print(x)

##### 1.5.1.5 Value <a id="value"></a>

Every Python object has a [***value***](https://docs.python.org/3/reference/expressions.html#value-comparisons). To test if **two** objects have the same value, you use ***two*** equal signs:
```python
a == b
```
The above expression will evaluate to `True` if the value of `a` equals the value of `b`, and evaluate to `False` otherwise.

Any two objects can be evaluated for equality without raising an error, even if they are different types.  Integers have the same value as the corresponding floats, e.g. the integer `1` has the same value as the float `1.0`.

In [None]:
# checking equality, to see if `1` and `1.0` have the same value.
print(1 == 1.0)

**Important tip** Note that the **double** equal sign is a *question* to Python asking, "Do these two objects have the same value?" (which Python answers with `True` or `False`) whereas a **single** equal sign is an *instruction* to Python saying, "Make this name refer to this object."

##### Exercise 1.5.1.5: Checking Equality

Predict the output of the following cell, then run it to see if you were right.

In [None]:
# checking equality
print(3 = 3.14)

<details>
    <summary><b>Remember</b></summary>
    
To check if ***two*** objects are equal (have the same value), use ***two*** equal signs.
    
To bind ***one*** name to ***one*** object, use ***one*** equal sign.
    
The error here is because `3` is not a [valid name](#naming_rules).
</details>

##### 1.5.1.6 Comparison Operators <a id="comparison"></a>

There are also the following other [***comparison operators***](https://docs.python.org/3/reference/expressions.html#value-comparisons):
  * `a != b` (`True` if `a` is not equal to `b`)
  * `a  < b` (`True` if `a` is less than `b`)
  * `a <= b` (`True` if `a` is less than or equal to `b`)
  * `a  > b` (`True` if `a` is greater than `b`)
  * `a >= b` (`True` if `a` is greater than or equal to `b`)
  
Any two objects can have their values tested for inequality.  `a != b` should be the opposite of `a == b`.

Not every pair of objects can have their values compared for order. If you try to evaluate `"cat" <= 7`, you will get a <span style="color:red">***TypeError***</span>.

You can also read more about comparison methods [***here***](https://docs.python.org/3/reference/datamodel.html#object.__lt__).

<a id="nan_comparison"></a>
***Important tip*** If you work in data analysis, and deal with [***NaNs***](https://docs.python.org/3/library/math.html#math.nan) (not-a-number), then you should be careful using comparison operators with NaNs, as the following example shows.  You should consider using other functions such as [***isnan***](https://docs.python.org/3/library/math.html#math.isnan) from the math module. 

In [None]:
x = float('nan')
print(x == float('nan'))

In [None]:
# we already imported math above
math.isnan(x)

##### Exercise 1.5.1.6: Comparing values

Predict the output of the following cells, then run them to see if you were right.

In [None]:
print(1 != 1.0) # inequality (i.e. DIFFERENT values)
print(1 < 1.0) # "less than"
print(1 <= 1.0) # "less than or equal to"
print(1 > 1.0) #  "greater than"
print(1 >= 1.0) # "greater than or equal to"

In [None]:
print(1 != "dog") # inequality (i.e. DIFFERENT values)

In [None]:
print(1 < "dog") # "less than"

In [None]:
x = float('nan')
print(x == x) # "equality"

In [None]:
print(x < x) # "less than"

In [None]:
print(x > x) # "greater than"

In [None]:
print(x != x) # "inequality"

<details>
    <summary><b>Solutions</b></summary>
    
`1` does not have the same value as `"dog"`, so the comparison `1 != "dog"` evaluates to `True`
    
Integers and strings do not support order comparison, so `1 < "dog"` raises a <span style="color:red">***TypeError***</span>.
    
`float('nan')` is not equal to anything, even itself!  Be very careful comparing values if some may be NaNs!
</details>

##### 1.5.1.7 Chaining Comparisons <a id="chained_comparisons"></a>

You can [***chain comparisons***](https://docs.python.org/3/reference/expressions.html#comparisons) in a single expression using the following syntax:
```python
x1 op1 x2 op2 x3 ... xN
```
Here the `x1, x2, ...` are variables and the `op1, op2, ...` are comparison operators. The `x` could also be any expressions that evaluate to values that can be compared. However, depending on the [order of precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence), the expressions may need to be surrounded by parentheses.

The overall expression yields `True` if ***all*** of the comparisons are `True`, and `False` if any of the comparisons are `False`. The chain ***short-circuits*** as soon as the first comparison (from left to right) yields `False`, meaning that none of the other expressions are evaluated.
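
To make the short-circuiting visible, here is a small sketch. The helper `noisy` and the list `calls` are made up for this illustration (functions and lists are explained later); `noisy` simply records each value that actually gets evaluated.

```python
calls = []  # records which operands were actually evaluated

def noisy(value):
    """Hypothetical helper: remember that `value` was evaluated, then return it."""
    calls.append(value)
    return value

# 3 < 2 is False, so the chain stops before evaluating noisy(1)
result = noisy(3) < noisy(2) < noisy(1)
print(result)  # False
print(calls)   # [3, 2]

# A common use of chaining: checking that a value lies within a range
score = 85
print(0 <= score <= 100)  # True
```

Notice that `noisy(2)` appears in two comparisons but was only evaluated once.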

Let's look at some examples.

##### Exercise 1.5.1.7: Chained comparisons

Predict the output of the following cells, then run them to see if you were right.

In [None]:
0 < 1 < 2 < 3 < 4

In [None]:
1 < 5 > 3 # ugly, but...

In [None]:
3/2 < 2/1 < 1/0

In [None]:
3/2 > 2/1 > 1/0

In [None]:
print(0 != 1 != "dog")

In [None]:
print(1 = 1 != "cat")

In [None]:
print(1 > 1 < "dog")

<details>
    <summary><b>Solutions</b></summary>
    
All of the comparisons in `0 < 1 < 2 < 3 < 4` are `True`, so the expression is `True`.
    
`1 < 5` evaluates to `True` and `5 > 3` evaluates to `True`, so `1 < 5 > 3` evaluates to `True` (even though it looks ugly).
    
`3/2 < 2/1` evaluates to `True`, but `2/1 < 1/0` raises a <span style="color:red">***ZeroDivisionError***</span>. You just created a black hole somewhere in the universe!

`3/2 > 2/1` evaluates to `False`, so the expression short-circuits to `False` ***without*** trying to evaluate `1/0`. You're safe this time!

`0 != 1` evaluates to `True` and `1 != "dog"` evaluates to `True`, so `0 != 1 != "dog"` evaluates to `True`.

`1 = 1` is ***not*** a comparison!  This raises a <span style="color:red">***SyntaxError***</span>.

`1 > 1` is `False`, so the expression short-circuits to `False` ***without*** trying to evaluate `1 < "dog"`, which would have raised a <span style="color:red">***TypeError***</span>.
</details>

##### 1.5.1.8 Floating point precision <a id="float_precision"></a>

Floats may not have the ***exact*** value displayed. They use a fixed number of bits to store a value.

In [None]:
# beware of floating point precision
x = 0.3 + 0.3 + 0.3 + 0.1
y = 1
print(x == y)

##### Exercise 1.5.1.8: Floating point limitations

Investigate by running the next cell.

In [None]:
# print x and y to see their values
print("x =", x)
print("y =", y)

This is one of the [***issues***](https://docs.python.org/3/tutorial/floatingpoint.html) with using floats.  Rather than check for exact equality of two floats, it may be better to check that they are [***close***](https://docs.python.org/3/library/math.html#math.isclose) (within some tolerance).

<span style="background:yellow">***Note***: this is a very useful debugging technique in general: ***print*** the values that you are using, to make sure they are what you think they are!</span>

In [None]:
# we already imported math above
math.isclose(x, y)
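
One caveat worth knowing: by default, `math.isclose` only uses a *relative* tolerance, so it is strict when comparing against zero. The sketch below shows the optional `abs_tol` parameter; the tolerance `1e-9` is just an example value, not a recommendation.

```python
import math

# exact equality fails because of rounding in the last bits
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True

# comparing against zero needs an absolute tolerance
print(math.isclose(1e-10, 0.0))                # False
print(math.isclose(1e-10, 0.0, abs_tol=1e-9))  # True
```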

#### 1.5.2 Boolean values <a id="bool"></a>

There are two objects of type [***bool***](https://docs.python.org/3/library/functions.html#bool), referenced by the built-in [***constants***](https://docs.python.org/3/library/constants.html) `True` and `False`, which are case sensitive (like all names).

We have already seen that comparisons evaluate to these values. They are very important to computer programming, such as in the conditional logic used to control the flow of a program, like the [***if statement***](#if_keyword) below (which will be explained in more detail later).

In [None]:
# bind the name `answer` to the built-in constant `True`
answer = True

# This is an "if statement", which is explained in more detail later
if answer:
    # this statement is executed when `answer` is True
    print("It is True!")
else:
    # this statement is executed when `answer` is False
    print("It is False!")
    
print(answer)
print(type(answer))

##### Exercise 1.5.2.1: Boolean logic

Change the value of `answer` (in the ***code cell*** above) to `False`, and run the cell again

##### 1.5.2.1 `bool` is a subclass of `int` <a id="bool_subclass"></a>

Technically the `bool` class is a subclass of `int`, with `True` having the same value as `1` and `False` having the same value as `0`.

In [None]:
# Check that `True` and `1` have the same value
print(True == 1)

# Check that `False` and `0` have the same value
print(False == 0)

In [None]:
# Just to be weird
False**False
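
One practical consequence, sketched below with made-up names, is that Boolean values can take part in arithmetic: each `True` counts as `1` and each `False` counts as `0`.

```python
# bool is a subclass of int, so every bool is also an int
is_subclass = issubclass(bool, int)
print(is_subclass)

# hypothetical results of four checks, added together
num_passed = True + False + True + True
print(num_passed)

# the result of the arithmetic is an int, not a bool
print(type(num_passed))
```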

#### 1.5.3 NoneType <a id="None"></a>

There is a single object of type [***NoneType***](https://docs.python.org/3/library/constants.html#None), the built-in constant `None`, which is often used to denote a missing value.

In [None]:
# Is the middle name unknown, or is it known that the person doesn't have one?
middle_name = None
print(middle_name)
print(type(middle_name))

##### Exercise 1.5.3.1: truth value of `None`

Change the value of `answer` (in the ***code cell in 1.5.2***) to `None`, and run the cell again

Is `None` ***truthy*** or ***falsy***?

***ANY*** Python object can be used for Boolean logic, such as in an `if` statement, not just objects of type `bool`.  Almost all values behave the same as `True` (referred to by some as *truthy*) in the Boolean logic, except for the objects listed here: [truth values](#truth_value).
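
If you are unsure how an object will behave in Boolean logic, one way to check is to pass it to the built-in `bool( )`. The values below are only illustrative:

```python
zero_truth = bool(0)          # zero is falsy
empty_text_truth = bool("")   # an empty string is falsy
number_truth = bool(42)       # a non-zero number is truthy
text_truth = bool("False")    # ANY non-empty string is truthy, even this one

print(zero_truth, empty_text_truth)
print(number_truth, text_truth)
```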

#### 1.5.4 Text (strings) <a id="strings_intro"></a>

Python [***strings***](https://docs.python.org/3/tutorial/introduction.html#text) are immutable sequences of Unicode code points.  The type of a string is [***str***](https://docs.python.org/3/library/stdtypes.html#textseq). There is no separate character type.

Being [***immutable***](https://docs.python.org/3/reference/datamodel.html#immutable-sequences) means that a string ***cannot*** be changed after it is created. However, you can create a *new* string that is based on the old string (e.g. a capitalized version).

This section covers the basics of strings. They "will be" covered in much more depth [***eventually***](#strings_more).
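
To illustrate immutability, here is a minimal sketch using the string method `upper` (methods are explained in Part 2; the names `original` and `shouted` are made up). The method returns a ***new*** string and leaves the original unchanged:

```python
original = "have a nice day"

# upper() does not modify `original`; it returns a new string
shouted = original.upper()

print(original)
print(shouted)
```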

##### 1.5.4.1 Double or Single quotes <a id="quotes"></a>

Strings can be surrounded by either double quotes, as in `"This is a string."`, or single quotes, as in `'This is a string.'`  There is no distinction between the two: both have the same value and the same type.

However, strings cannot contain the type of quotes they are surrounded by, unless those characters are [***escaped***](https://docs.python.org/3/reference/lexical_analysis.html#escape-sequences).

The built-in function [**len( )**](https://docs.python.org/3/library/functions.html#len) will return the length of the string, which is the number of [***unicode code points***](https://docs.python.org/3/howto/unicode.html#comparing-strings), not necessarily the number of characters (although they are usually the same).

In [None]:
# These two strings have the same value
print("This is a string." == 'This is a string.')

# and the same type
print(type("This is a string.") == type('This is a string.'))

In [None]:
# Strings can be surrounded by either double quotes:
msg = "Have a nice day! 😀"
print(msg)
print(type(msg))

# the built-in function `len` gives the length of the string
print(len(msg))

In [None]:
# or single quotes:
msg = '祝你今天过得愉快 😀'
print(msg)
print(type(msg))
print(len(msg))
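
The following sketch shows a case where the number of code points differs from the number of characters you see. It assumes the `\u` escape sequence, which writes a code point by its hexadecimal number: the letter é can be stored as one code point, or as an `e` followed by a combining accent.

```python
composed = "\u00e9"      # 'é' as a single code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent

print(composed, len(composed))
print(decomposed, len(decomposed))

# they usually look the same when displayed, but they are different strings
print(composed == decomposed)
```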

In [None]:
# Having both options provides flexibility
msg = 'You can't use apostrophes in strings with single quotes...'
print(msg)

##### Exercise 1.5.4.1: Mixing double & single quotes

Change the cell above so that the string uses double quotes, then run again

##### 1.5.4.2 String escapes <a id="escapes"></a>

You can use apostrophes or quotes in a string that is surrounded by the same character if you ***[escape](https://docs.python.org/3/reference/lexical_analysis.html#escape-sequences)*** the apostrophes or quotes inside the string. The other popular escape sequences are `\n` for a ***newline*** and `\t`  for a ***tab***.

In [None]:
# but there are other ways
msg = '...unless you escape the apostrophe, as in can\'t!\nThis is on line...\ttwo.'
print(msg)
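
A couple more escapes, as a short sketch (the file path is hypothetical): `\"` puts a double quote inside a double-quoted string, and `\\` produces a single backslash.

```python
quote = "She said, \"Hello!\""
windows_style = "C:\\temp\\notes.txt"   # hypothetical path, for illustration only

print(quote)
print(windows_style)

# each escape sequence stands for ONE character in the string
backslash_length = len("\\")
print(backslash_length)
```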

##### 1.5.4.3 Raw strings <a id="raw_strings"></a>

String ***[literals](https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals)*** can have different prefixes. If a string is prefixed by an `r`, escapes are ***usually*** ignored, and the backslash is treated like any other character, ***except*** that a backslash still prevents a single/double quote (whichever started the string literal) that would terminate the string from doing so.  However, even in that exception the backslash is part of the string.

In [None]:
# raw strings, prefixed by 'r', ignore escapes
msg = r'...unless you escape the apostrophe, as in can\'t!\nThis is on line...\ttwo.'
print(msg)
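
One way to see the difference is to compare lengths. In this minimal sketch, the normal string holds a single newline character, while the raw string holds two characters (a backslash and the letter `n`):

```python
normal = "\n"   # one character: a newline
raw = r"\n"     # two characters: a backslash and the letter n

print(len(normal))
print(len(raw))

# the raw string equals a normal string with an escaped backslash
same_as_escaped = (raw == "\\n")
print(same_as_escaped)
```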

<a id="raw_string_path"></a>
##### Exercise 1.5.4.3a: Raw strings for file paths

Raw strings are good to use for file paths (if you copy and paste a path in Windows), since the backslashes will not be interpreted as escape sequences.


1. Pick a folder on your computer that does not contain very many files
2. Open it
3. Right-click on one of the files
4. Click on Properties
5. Copy the path, which is to the right of the word ***Location***. It should look something like:   `C:\Users\joseph.pedersen\OneDrive - West Point\ORCEN`
6. Paste that path below, to the right of the `=`, 
7. Surround it by quotes
8. Add the `r` prefix before the quotes
9. Then run the cell below.  It should list the contents of that folder.

In [None]:
# put the path here as a raw string
path =

import os

os.listdir(path)

##### Exercise 1.5.4.3b: Forgetting the `r` prefix

Remove the `r` prefix to make the path a normal string, then run the cell again. You'll probably get an error. Get in the practice of reading errors to try to understand them. Then, if you encounter the same error again, you'll be more likely to remember the likely cause.

##### 1.5.4.4 Triple Quote strings <a id="triple_quote_strings"></a>

A ***[triple-quoted string](https://docs.python.org/3/glossary.html#term-triple-quoted-string)*** can span multiple lines (i.e. contain actual newline characters, not just their escape sequences). The quotes used can be single quotes or double quotes. 

These strings are the same type as the other strings. The only difference is that their [***literals***](https://docs.python.org/3/reference/lexical_analysis.html#literals) can contain unescaped single or double quotes (as long as there aren't three in a row that match the ones that surround the string) and can span multiple lines without needing to use line continuation.

In [None]:
# strings can also use triple quotes, which allows spanning multiple lines
msg = """This is a string
that spans multiple lines!"""
print(msg)
print(type(msg))

In [None]:
msg2 = "This is a string\nthat spans multiple lines!"
print(msg2)

# Check that the strings have the same value
print(msg == msg2)

##### Exercise 1.5.4.4: Use triple quoted strings to contain single and double quotes

If you ever copy and paste a large amount of text that contains ***both*** single quoted strings and double quoted strings, you can surround the text by triple quotes to quickly make it into a valid string literal.

Run the following cell.  Then surround the string by triple-quotes and run it again.

In [None]:
weird_string = "{'First': "John", 'M': "Q", 'Last': "Doe", 'Age': 99}"
print(weird_string)

##### 1.5.4.5 f-strings <a id="f_strings"></a>

A string that is prefixed by `f`, called an ***[f-string](https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals)***, can contain expressions inside curly brackets `{}`. The expressions are evaluated and replaced by the result converted to a string.

In [None]:
# f-strings, prefixed by 'f', replace expressions with strings
greeting = "Good morning"
name = "Bruce"
print(f"{greeting}, {name}, how are you today?")

##### Exercise 1.5.4.5: forgetting the `f` prefix

Remove the `f` prefix in the code cell above, then run the cell again.

In [None]:
# Just for fun
song = 13*"na " + "Batman!"
print(song)
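
The curly brackets of an f-string can hold any expression, not just a name. The sketch below uses made-up values; the `:.2f` after the expression is an optional format specifier that rounds the displayed value to two decimal places.

```python
price = 2.5
quantity = 3
name = "Bruce"

# an arithmetic expression inside the curly brackets
label = f"{quantity} items cost {price * quantity}"
print(label)

# the same expression, displayed with two decimal places
formatted = f"Total: {price * quantity:.2f}"
print(formatted)

# calls are expressions too
shout = f"{name.upper()} has {len(name)} letters"
print(shout)
```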

There is a lot more to learn about strings, but we will do that [later](#strings_more).

First, we'll introduce some other types.

#### 1.5.5 Lists <a id="lists_intro"></a>

[***Lists***](https://docs.python.org/3/tutorial/introduction.html#lists) are ordered sequences of objects. The items in a list can be different types. Lists are ***[mutable](https://docs.python.org/3/reference/datamodel.html#mutable-sequences)***, which means that they ***can*** be changed (more on that soon).

Lists can be written as comma-separated items surrounded by square brackets, called a  [***list display***](https://docs.python.org/3/reference/expressions.html#list-displays).

In [None]:
# a list of grades
grades_list = ['A+','C-', 'D', 0]
print(grades_list)
print(type(grades_list))
print(len(grades_list))

##### 1.5.5.1 Subscripting lists

You can access the items in a list by using square brackets containing the index of the entry that you want typed immediately after the list, as in:
```python
some_list[i]
```
Python is zero-indexed, meaning that the ***first*** item in a list is at index 0.

***Note***: the use of square brackets to access the items in a list (called [***subscription***](#subscription)) is not related to the fact that list displays are surrounded by square brackets.  Square brackets are used to access the items in any ***[container](#containers_more)*** in Python, as we'll see with tuples and dictionaries.

##### Exercise 1.5.5.1: Accessing an item

Predict the output from running the following cell. Then run it to see if you are correct.

In [None]:
# reference an item in the list
grades_list[1]
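
Because indexing starts at 0, the ***last*** item of a list is at index `len(some_list) - 1`. A minimal sketch with a made-up list:

```python
colors = ['red', 'green', 'blue']

first = colors[0]                # index 0 is the first item
last = colors[len(colors) - 1]   # index 2 is the last of three items

print(first)
print(last)

# colors[3] would raise an IndexError, since the valid indices are 0, 1, and 2
```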

##### 1.5.5.2 Item assignment

You can assign to a list (bind a list subscription) using an assignment statement with a list subscription as the target:
```python
some_list[i] = expression
```
The index `i` must already be one of the valid indices for the list (assignment to a subscription cannot change the size of the list). The `expression` will be evaluated, that object will be stored in memory, and later it can be referenced by `some_list[i]`. For more information, see [here](#subscription_target).

##### Exercise 1.5.5.2: Change one item in a list

Change the grades list so that the last grade is `'F'` instead of a zero ***USING indexing*** (changing just the one entry)

In [None]:
# Type your line of code on the line below:


# then we'll print grades to see if you were successful
print(grades_list)

<details>
    <summary><b>Solution</b></summary>
    
Since Python is zero-indexed, the last of the four items is at index `3`, so we can change which object the last item references using:
```python
grades_list[3] = 'F'
```  
<br>
    
***Note*** that this does ***not*** change the `0` into an `'F'`. Instead, it changes what `grades_list[3]` references. Integers, like `0`, are immutable. They cannot be changed. Lists are mutable. When we mutate a list, we change which items ***are in*** the list (or where they are in the list), we don't change the items themselves. This is a subtle but important point that will come up again and again.
</details>

**Important tip:** Sometimes it is easier for someone reading your code to see one item per line.  Expressions inside square brackets can span multiple lines, just like expressions inside parentheses or curly brackets. This is [***implicit line joining***](https://docs.python.org/3/reference/lexical_analysis.html#implicit-line-joining) again.

In [None]:
# Lists can span multiple lines
students = ['Alice', # suggest she try for honors
            'Carla', # is going to redo PS1 to try for a better score
            'Dave', # recommend that he come in for additional instruction
            'Fred', # talk to his DAC about an academic success plan
           ]

print(students)

There is a lot more to learn about lists, but we will do that [***later***](#lists_more).

First, we'll introduce some other ***[sequence](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range)*** and ***[container](https://docs.python.org/3/library/collections.html)*** types.

#### 1.5.6 Tuples <a id="tuples_intro"></a>

[***Tuples***](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences) are ordered sequences of objects. The items in a tuple can be different types. Tuples are ***[immutable](https://docs.python.org/3/reference/datamodel.html#immutable-sequences)***, which means that they ***cannot*** be changed (more on that soon).

Tuples can be written as comma-separated items surrounded by parentheses (sometimes the parentheses are optional).

In [None]:
# A tuple of grades 
grades_tup = ('A+','C-', 'D', 0)
print(grades_tup)
print(type(grades_tup))
print(len(grades_tup))

**Important note:** it is the ***commas*** which make the tuple, **not** the parentheses. The parentheses are optional except where they are needed to remove ambiguity. Parentheses are needed to make an empty tuple, using `()`, since that is the ***only*** tuple without commas.

We could have created the same tuple as above using the code below:

In [None]:
grades_tup = 'A+','C-', 'D', 0
print(grades_tup)
print(type(grades_tup))
print(len(grades_tup))
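
One place where the parentheses ***are*** needed is inside a function call, because there the commas would otherwise separate the arguments. A short sketch (the names are made up):

```python
# no parentheses needed here: the comma makes the tuple
point = 3, 4
print(point, type(point))

# inside a call, a tuple argument needs its own parentheses
# (len(3, 4) would be an error, since len takes exactly one argument)
point_length = len((3, 4))
print(point_length)

# the empty tuple is written with parentheses only
empty = ()
print(len(empty))
```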

##### 1.5.6.1 Subscripting tuples

Tuples are indexed the same way as lists (using square brackets).  For that matter, so are all [***sequences***](#sequences).

In [None]:
# reference an item in the tuple
grades_tup[1]

In [None]:
# reference an item in a string
'ABCDEFG'[1]

##### Exercise 1.5.6.1: Change one item in a tuple

Change the grades tuple so that the last grade is `'F'` instead of a zero USING indexing (changing just the one entry)

In [None]:
# Type your line of code on the line below:


# then we'll print grades to see if you were successful
print(grades_tup)

<details>
    <summary><b>Remember</b></summary>
    
The main difference between ***lists*** and ***tuples*** is that tuples cannot be mutated (changed).

Objects that cannot be mutated are called ***immutable***.
</details>

##### Exercise 1.5.6.2: A tuple with one item

Predict the output of the following code cell, then run it to see if you are correct.

In [None]:
# Independent study
grade_tup = ('A+')
print(grade_tup)
print(type(grade_tup))
print(len(grade_tup))

##### Exercise 1.5.6.3: Trailing commas

Add a comma to the `grade_tup` after the single item, so that it will represent a tuple, then rerun the cell.

***Trailing commas*** are allowed in Python, and ***required*** to make a tuple with just one entry without calling `tuple( )`.

<details>
    <summary><b>Solution</b></summary>
    
```python
grade_tup = ('A+',)
```
</details>

In [None]:
# Trailing commas are ok in Python
print([1,2,3,])
print((1,2,3,))

#### 1.5.7 Dictionaries <a id="dicts_intro"></a>

A [***Dictionary***](https://docs.python.org/3/tutorial/datastructures.html#dictionaries) is a collection of `key:value` pairs (a *mapping* from keys to values, rather than a sequence), with the requirement that the keys must be unique and [***hashable***](https://docs.python.org/3/glossary.html#term-hashable) (which basically means immutable). Dictionaries are mutable.

Dictionaries can be written as `key:value` pairs, surrounded by curly braces, with pairs separated by commas, called a [***dictionary display***](https://docs.python.org/3/reference/expressions.html#dictionary-displays).

In [None]:
# A dictionary of grades.
grades_dict = {'Alice':'A+++', 'Alice':'A+', 'Carla':'C-', 'Dave':'D', 'Fred':0}
print(grades_dict)
print(type(grades_dict))
print(len(grades_dict))

In [None]:
# Dictionaries are indexed using their keys
grades_dict['Carla']
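
Because keys must be unique, repeating a key in a dictionary display does not create two entries: the last value given for that key is the one that is kept. A minimal sketch with a made-up dictionary:

```python
# the key 'a' appears twice in this display
scores = {'a': 1, 'b': 2, 'a': 3}

print(scores)
print(len(scores))     # only two keys
print(scores['a'])     # the later value replaced the earlier one
```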

##### Exercise 1.5.7.1: Change one item in a dictionary

Change the grades dictionary so that Fred's grade is `'F'` instead of a zero ***USING indexing*** (changing just the one entry)

In [None]:
# Type your line of code on the line below:


# then we'll print grades to see if you were successful
print(grades_dict)

<details>
    <summary><b>Solution</b></summary>
    
```python
grades_dict['Fred'] = 'F'
```
</details>

#### 1.5.8 Nesting <a id="nesting"></a>

All of these data structures can be nested, so the values inside lists, tuples, and dictionaries can themselves be lists, tuples, and dictionaries.  However, remember that the keys of a dictionary need to be hashable, so they cannot be lists. If you have lists that you would like to use for dictionary keys, you can convert your lists to tuples using the [***tuple( )***](https://docs.python.org/3/library/stdtypes.html#tuple) constructor: `my_tuple = tuple(my_list)`.

In [None]:
# an example of a nested data structure

gradebook = {
    'HW1':[('Alice',100), ('Carla',70), ('Dave',70), ('Fred',0)],
    'PS1':[('Alice',100), ('Carla',50), ('Dave',69), ('Fred',0)],
    'HW2':[('Alice',100), ('Carla',72), ('Dave',68), ('Fred',0)],
    'HW3':[('Alice',100), ('Carla',74), ('Dave',67), ('Fred',0)],
    'WPR1':[('Alice',100), ('Carla',76), ('Dave',66), ('Fred',0)],
    'HW4':[('Alice',100), ('Carla',78), ('Dave',65), ('Fred',0)],
    'PS2':[('Alice',100), ('Carla',80), ('Dave',64), ('Fred',0)],
}

print(type(gradebook))
print(type(gradebook['PS1']))
print(type(gradebook['PS1'][1]))
print(gradebook['PS1'][1])
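
The list-to-tuple conversion mentioned above might look like the following sketch. The names `my_list`, `my_tuple`, and `scores` are placeholders, not part of the gradebook example:

```python
my_list = ['Alice', 'HW1']

# convert the list to a tuple so that it can be used as a dictionary key
my_tuple = tuple(my_list)
scores = {my_tuple: 100}

# any tuple with the same value finds the same entry
looked_up = scores[('Alice', 'HW1')]
print(looked_up)
```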

##### 1.5.8.1 Immutable containers <a id="immutable_containers"></a>

When we say that a container, like a tuple, is ***[immutable](https://docs.python.org/3/reference/datamodel.html#immutable-sequences)***, we mean that we cannot change which objects it contains.

However, that does not mean that the objects that it contains cannot be mutated, if ***they*** are mutable.

For example, a tuple is immutable, but if one of its items is a list, that list ***can*** be mutated.

##### Exercise 1.5.8.1: Changing a list inside a tuple

Change Fred's grade to an F instead of a zero USING indexing (changing just the one entry). Hint: the list containing Fred's grade is the last list, and the grade is the second item in the list, so you'll need two sets of square brackets.

In [None]:
grades = (['Alice', 'A+'], ['Carla', 'C-'], ['Dave', 'D'], ['Fred', 0])

# Put your code here:


# we'll print to see if you were successful
print(grades)

<details>
    <summary><b>Solution</b></summary>
    
```python
grades[3][1] = 'F'
```
</details>

##### Exercise 1.5.8.2: Changing which list is inside a tuple

Predict the output of the following code, then run it to see if you are correct:

In [None]:
grades = (['Alice', 'A+'], ['Carla', 'C-'], ['Dave', 'D'], ['Fred', 0])

# Change this list
grades[3] = ['Bob', 'B']

# print to see if we were successful
print(grades)

<details>
    <summary><b>Remember</b></summary>
    
We ***can*** mutate a mutable item contained in an immutable container.
    
We ***cannot*** change which items an immutable container contains.
</details>

##### Exercise 1.5.8.3: Lists as dictionary keys

Predict the output of the following code, then run it to see if you are correct:

In [None]:
grades = {['Alice']: ['A+'], ['Carla']: ['C-'], ['Dave']: ['D'], ['Fred']: [0]}

# Change this list
grades[['Fred']] = ['F']

# print to see if we were successful
print(grades)

<details>
    <summary><b>Remember</b></summary>
    
The keys for dictionaries must be ***hashable***, so they cannot be lists.
</details>

#### 1.5.9 Sets <a id="sets_intro"></a>

[***Sets***](https://docs.python.org/3/reference/datamodel.html#set-types) are unordered collections of objects. They are mutable. They can only contain [hashable](https://docs.python.org/3/glossary.html#term-hashable) (which basically means immutable) objects. An item is either in a set, or not; an item cannot be in a set multiple times.

Sets can be written as comma-separated items surrounded by curly braces, called a ***[set display](https://docs.python.org/3/reference/expressions.html#set-displays)***.

In [None]:
# a set of grades
grades_set = {100, 72, 67, 50, 72}
print(grades_set)
print(type(grades_set))
print(len(grades_set))

##### Exercise 1.5.9.1: Change one item in a set

Change the grades set so that the last grade is a `0` USING indexing (changing just the one entry)

In [None]:
# Type your line of code on the line below:


<details>
    <summary><b>Remember</b></summary>
    
Sets are ***unordered***, so there is no *last* item, and you can not use indexing on a set.

You can remove an item by value using the `remove` method: `grades_set.remove(100)`
    
You can add an item using the `add` method: `grades_set.add(0)`
</details>
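
The `remove` and `add` methods can be sketched as follows, using a separate made-up set so that `grades_set` is left untouched. The built-in `sorted( )` is used only to display the items in a predictable order:

```python
scores = {100, 72, 67, 50, 72}
print(len(scores))   # the duplicate 72 is stored only once

scores.remove(100)   # remove an item by value
scores.add(0)        # add a new item
scores.add(0)        # adding it again has no effect

# sorted() returns a list of the items in ascending order
sorted_scores = sorted(scores)
print(sorted_scores)
```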

##### Using `set` to remove duplicates from a `list` <a id="set_dedupe_list"></a>

Since a set cannot contain duplicates, you can remove the duplicates from a list by [converting](#conversion) it to a set, by passing it as the argument to the [***set( )***](https://docs.python.org/3/library/functions.html#func-set) constructor. 

You can convert it back to a list if necessary, by passing the set as the argument to the [***list( )***](https://docs.python.org/3/library/functions.html#func-list) constructor.  However, it may not be in the same order as the original list.

See the following example:

In [None]:
numbers = [5,2,7,5,8,4,8,9,7,2,3,1]
print(len(numbers))

unique = set(numbers)

unique_list = list(unique)
print(unique_list)
print(len(unique_list))
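
If the order matters, there are alternatives. The sketch below shows two common idioms as an optional aside: `sorted( )` returns the unique values in ascending order, and `dict.fromkeys( )` keeps the first occurrence of each value in its original position (this relies on dictionaries preserving insertion order, which is guaranteed since Python 3.7).

```python
numbers = [5, 2, 7, 5, 8, 4, 8, 9, 7, 2, 3, 1]

# unique values in ascending order
unique_sorted = sorted(set(numbers))
print(unique_sorted)

# unique values in order of first appearance
unique_in_order = list(dict.fromkeys(numbers))
print(unique_in_order)
```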

#### 1.5.10 Ranges <a id="ranges_intro"></a>

[***Ranges***](https://docs.python.org/3/library/stdtypes.html#ranges) are immutable sequences of integers. The [***range( )***](https://docs.python.org/3/library/functions.html#func-range) constructor requires at least one argument, the value for `stop`. If that is the only argument passed to `range()`, the values in range are the non-negative integers less than `stop`. Note that there are `N` non-negative integers in `range(N)`, if `N` is non-negative.

In [None]:
# The sequence of non-negative integers LESS THAN 6
r6 = range(6)
print(r6)
print(type(r6))
print(len(r6))

We can use the built-in functions `min` and `max` to see the minimum and maximum values in `r6`:

In [None]:
print("The min of r6 is", min(r6))
print("The max of r6 is", max(r6))

**Note:** ranges can be used like tuples, but have the advantage of not creating the entire tuple, which takes time and memory. To see all of the values, you can convert the range to a `tuple` by passing the range as the only argument in a tuple constructor expression (a call to `tuple`), as in:

In [None]:
tup6 = tuple(r6)
print(tup6)
print(type(tup6))

In [None]:
# import the sys module
import sys

In [None]:
# this is created in nanoseconds
r_big = range(200_000_000) # you can use underscores in python numbers

print(f"r_big takes up {sys.getsizeof(r_big)} bytes")

In [None]:
# this takes a lot longer
tup_big = tuple(r_big)

print(f"tup_big takes up {sys.getsizeof(tup_big)} bytes")

##### 1.5.10.1 Range `start` and `step`

Besides the required `stop` argument, the range function can also take a `start` and/or a `step`. The order of the positional arguments is `(start, stop, step)`.  If `start` is not given, it defaults to `0`. If `step` is not given, it defaults to `1`.

***Important note:*** the values in the range ***include*** the `start` (for most choices of `stop` and `step`, such as when `step > 0` and `stop > start`), but the  values in the range ***exclude*** the `stop` (always).

In [None]:
# which numbers are in this range object?
r_even = range(10, 30, 2)

##### Exercise 1.5.10.1: Last item in a `range`

What do you think is the last value in `r_even`? Convert it to a tuple and print it.

In [None]:
# Type your line of code on the line below:
print(tuple(r_even))

<details>
    <summary><b>Solution</b></summary>
    
```python
print(tuple(r_even))
```
It contains every other (`step=2`) integer from 10 (`start=10`) to 28, because it stops ***before*** (i.e. not including) 30 (`stop=30`)
</details>
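
A few more combinations of `start`, `stop`, and `step`, as a small sketch (the names are made up). A negative `step` counts downward, and a range whose `start` equals its `stop` is empty, which is one of the cases where the `start` is ***not*** included:

```python
# start=0, stop=10, step=2
evens = tuple(range(0, 10, 2))
print(evens)

# a negative step counts down; the stop (0) is still excluded
countdown = tuple(range(10, 0, -3))
print(countdown)

# start == stop gives an empty range
empty = tuple(range(5, 5))
print(empty)
```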

## Part 2: Python Objects <a id="objects"></a>

Data in Python is represented by [***objects***](https://docs.python.org/3/reference/datamodel.html). Every object has a [***type***](#type), an [***identity***](#identity), and a [***value***](#value).

We have already seen many instances of checking that two objects have the same value by using `==`. We have also seen that two objects can have the same value even if they have different types (e.g. `1 == 1.0`), although this is [very uncommon](https://docs.python.org/3/library/stdtypes.html#comparisons) for non-numeric types.

Next, we'll learn a little more about types. Then, we'll discuss identity.

### 2.1 Type <a id="type"></a>

Every object has a [***type***](https://docs.python.org/3/library/stdtypes.html). The type of an object determines what you can do with it. For example, you cannot add an integer and a string, but you can multiply an integer and a string. The type of an object also often indicates which attributes it has ([discussed shortly](#attributes)).

The type of an object can be evaluated using the built-in function [***type( )***](https://docs.python.org/3/library/stdtypes.html#bltin-type-objects). However, to check that an object is a particular type, it is recommended to use the built-in function ***[isinstance](https://docs.python.org/3/library/functions.html#isinstance)***.  We have already seen several types, such as `int`, `float`, `bool`, `NoneType`, `str`, `list`, `tuple`, `dict`, `set`, and `range`.
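
One reason `isinstance` is usually preferred is that it also accepts instances of subclasses, whereas comparing with `type( )` only matches the exact type. A minimal sketch, reusing the fact that `bool` is a subclass of `int`:

```python
flag = True

# the exact type of `flag` is bool, not int
exact_type_is_int = (type(flag) == int)
print(exact_type_is_int)

# but `flag` IS an instance of int, because bool is a subclass of int
instance_of_int = isinstance(flag, int)
print(instance_of_int)

# isinstance also accepts a tuple of types to check against
is_number = isinstance(3.5, (int, float))
print(is_number)
```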

#### 2.1.1 Instantiation <a id="instantiation"></a>

One of the types that we have not seen yet is `complex`.  We can create an [***instance***](https://docs.python.org/3/tutorial/classes.html#instance-objects) of an object of type `complex` by calling the [***complex( )***](https://docs.python.org/3/library/functions.html#complex) [class object](https://docs.python.org/3/tutorial/classes.html#class-objects) with two arguments: the real and imaginary parts. This is called ***instantiation***. Since the call to the class evaluates to a value (a new instance of the class), it is an [expression](#expressions); it's called an **[object constructor expression](https://docs.python.org/3/reference/datamodel.html#object.__new__)**.  We could also call `complex( )` with a string argument that represents the complex number.  In general, different class objects accept different arguments for instantiation.  If you're not sure which arguments a class object accepts, you can refer to the documentation, or try some guesses and see if they work.

In [None]:
# Create an instance of type complex, which has real part=2 and imaginary part=3.
z1 = complex(2, 3)
z2 = complex('2+3j') # equivalent
print(z1 == z2)

Print a string representation of `z1` and `z2`. Note that Python uses `j` instead of `i` for the imaginary unit.

In [None]:
# print the values of z1 and z2
print(z1)
print(z2)

In [None]:
# print the type of object that `z1` and `z2` are
print(type(z1))
print(type(z2))
print(isinstance(z1, complex))

Note that an object's type is data, so it is also an object, and has a type. Its type is `type`.

In [None]:
print(type(type(z1)))

In Python, some people say that "Everything is an object."

In [None]:
# print is a built-in function
type(print)

But that is a little overzealous:

In [None]:
type(+)

#### 2.1.2 Conversion <a id="conversion"></a>

Class objects can also be called to *convert* an object to a different type. This does **not** actually change the original object (an object's type ***cannot*** be changed); rather, it creates a ***new*** object. The original object is passed as an argument to the class, and is used to create and initialize the new instance: typically in the **[`__init__( )`](https://docs.python.org/3/reference/datamodel.html#object.__init__)** method, although immutable built-in types such as `str` use it earlier, in `__new__( )`.  For example, passing an object to **[str( )](https://docs.python.org/3/library/functions.html#func-str)** makes a string version of the object.

In [None]:
# Create an instance of type str
txt = str(3.14159)
print(txt)
print(type(txt))
print(isinstance(txt, str))

In [None]:
# Create an instance of type float
alot = float('Infinity')
print(alot)
print(type(alot))
print(isinstance(alot, float))

In [None]:
# Create an instance of type tuple
#   then create an instance of type dict, using that tuple, which is hashable, as a key
#   (note: dict(favorite_book=...) would use the string 'favorite_book' as the key instead)
favorite_book = tuple(["The Hitchhiker's Guide to the Galaxy", 1979])
book_authors = dict([(favorite_book, "Douglas Adams")])

print(favorite_book, type(favorite_book))
print(book_authors, type(book_authors))
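
A few more conversions, as a hedged sketch with made-up names. Note that the original object is never changed, and that not every conversion is possible (for example, `int("abc")` raises a `ValueError`):

```python
text = "42"

# int() creates a new integer from the string
number = int(text)
print(number + 1)

# the original string is unchanged
print(text, type(text))

# list() creates a list containing the items of any sequence
letters = list("abc")
print(letters)

# float() creates a float from an integer
as_float = float(7)
print(as_float)
```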

#### 2.1.3 Type vs Class <a id="type_vs_class"></a>

This is getting into the weeds, so feel free to skip it, but if you are at all confused about the difference between a `type` and a `class`, this section will try to provide a little clarity (or make you more confused).

In Python 2, the built-in types like `int`, `float`, `list`, etc., were more different from user-defined [custom classes](https://docs.python.org/3/reference/datamodel.html#custom-classes) than they are in Python 3, but starting in [Python 2.2](https://www.python.org/download/releases/2.2.3/descrintro/) they were unified (mostly?). Now, if you make your own class named `MyClass`, then (unless you use a [metaclass](https://docs.python.org/3/reference/datamodel.html#metaclasses)) `type(int) == type(MyClass)`. In fact, the types are not just equal, `type(int) is type(MyClass)` because `type(int) is type` and `type(MyClass) is type`. As we said above, a class object can be used to create instances of the class. Well, every object is an instance of a class (or multiple classes, through inheritance), so class objects are instances of a [metaclass](https://docs.python.org/3/reference/datamodel.html#metaclasses), and the default metaclass is [type](https://docs.python.org/3/reference/datamodel.html#metaclasses). In fact, type is the mother of all metaclasses, in that for ***every*** object `x`, `isinstance(type(x), type)` is True.

The built-in function [type( )](https://docs.python.org/3/library/functions.html#type) returns the type of an object, which according to the documentation at that link is ***generally the same object as returned by*** `object.__class__`, which is the class used to instantiate the object. The `type` of the object ***is*** its `class`.

We'll discuss ***[classes](#classes)*** much later, but you can create your own user-defined class using a [class definition](https://docs.python.org/3/tutorial/classes.html#class-definition-syntax). The object created by a class definition is referred to as a ***class object***, just like the object created by a function definition is  referred to as a ***function object***, but there is a confusing difference: the type of the function object is `function`, whereas the type of the class object is `type`. There is no `class` type.

If all of this is confusing, it would probably be best to think of ***type*** and ***class*** as synonyms for now, while keeping in mind the difference in syntax: the  built-in function [type( )](https://docs.python.org/3/library/functions.html#type) returns the type of an object, whereas user-defined classes are created with a [class definition](https://docs.python.org/3/reference/compound_stmts.html#class).

Technically, the "built-in function" `type( )` is ***not*** a function object, it ***is*** the metaclass object `type`, which when called with one argument returns the type of an object as stated earlier, but when called with three arguments creates a new instance of type type, i.e. a class!  This can be used for dynamic creation of class objects.

In [None]:
# type is its own type:
type(type) is type

In [None]:
# type is an instance of itself:
isinstance(type, type)
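
The three-argument form of `type( )` mentioned above can be sketched as follows. The arguments are the class name, a tuple of base classes, and a dictionary of attributes; the class name `Greeter` and its attribute `greeting` are made up for this example.

```python
# type(name, bases, namespace) builds a new class object at runtime
Greeter = type("Greeter", (), {"greeting": "hello"})

# the new class is an instance of type
class_is_type_instance = type(Greeter) is type
print(class_is_type_instance)

# the class can be called to create instances, like any other class
greeter = Greeter()
print(greeter.greeting)

# the type of the instance is its class
print(type(greeter) is Greeter)
print(type(greeter) is greeter.__class__)
```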

### 2.2 Attributes <a id="attributes"></a>

Objects can have [***attributes***](https://docs.python.org/3/glossary.html#term-attribute), which are other objects accessed using the ***dot*** notation: `obj.name`.

#### 2.2.1 Data attributes

In [None]:
# Our complex number `z` has an attribute named `imag`
z = complex(2, 3)
z.imag

Notice that although the type of `z` is `complex`, the type of its `imag` attribute is `float`:

In [None]:
type(z.imag)

#### 2.2.2 Methods (method attributes) <a id="methods"></a>

Some attributes are functions. These are called [***methods***](https://docs.python.org/3/glossary.html#term-method).

In [None]:
# assign the conjugate method of `z` to the name `w`
w = z.conjugate

print(f"w = {w}")

# the type is 'builtin_function_or_method'
print(type(w))

#### 2.2.2.1 Calling a method <a id="call"></a>

If we wanted `w` to ***be*** the [***complex conjugate***](https://docs.python.org/3/library/numbers.html#numbers.Complex.conjugate) of `z`, rather than be the *method* that computes the complex conjugate of `z`, then we need to ***call*** that method (it returns the complex conjugate). In order to [***call***](https://docs.python.org/3/reference/expressions.html#calls) a method, you need to use parentheses `()` after the method, just like when calling any function (e.g. `f(x)`) or other [***callable***](https://docs.python.org/3/glossary.html#term-callable).

In [None]:
# call the method `z.conjugate`, then assign the value returned to `w`
w = z.conjugate()

print(f"w = {w}")

# the type of the value returned by the call to the method is 'complex'
print(type(w))

In [None]:
# Because it's often handy to print the name and value of an object,
#   there's a syntax shortcut:
print(f"{w = }")
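
To tell a method apart from a data attribute without calling it, one option is the built-in function `callable()`. The following is a small sketch that reuses the same complex number as above:

```python
z = complex(2, 3)

# A method can be called, a data attribute cannot
print(callable(z.conjugate))  # True
print(callable(z.imag))       # False

# Calling the method returns a new complex number
print(z.conjugate() == complex(2, -3))  # True
```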

##### Exercise 2.2.2.1: Methods

More practice with attributes and methods

In [None]:
# NumPy is a very popular module for working with arrays
#     note that NumPy must be installed, or you will get an error
import numpy as np

arr = np.array(range(12))
print(arr)
print(type(arr))

***Exercise 2.2.2.1a: Print the value of the `shape` attribute of `arr`***

In [None]:
# Type your code on the line below this comment:


<details>
    <summary><b>Solution</b></summary>
    
```python
print(arr.shape)
```
<br>
    
Or, since this is the only expression in the code cell, your code could just be `arr.shape`, and its value would output in an output cell below your code cell.
</details>

***Exercise 2.2.2.1b: What TYPE do you think it is?***

Check to see if you are correct:

In [None]:
# Type your code on the line below this comment:


<details>
    <summary><b>Solution</b></summary>
    
```python
type(arr.shape)
```
<br>
    
Its type is `tuple`.
</details>

***Exercise 2.2.2.1c: Make the array 2d, by assigning (4,3) to the `shape` attribute***

In [None]:
# Type your code on the line below this comment:


# We'll print the array, to see if you were successful
print(arr)

<details>
    <summary><b>Solution</b></summary>
    
```python
arr.shape = (4,3)
```
</details>

***Exercise 2.2.2.1d: What is the average of all the values in the array?***

Determine that by ***calling*** the `mean` method of `arr`.

In [None]:
# Type your code on the line below this comment:


<details>
    <summary><b>Solution</b></summary>
    
```python
arr.mean()
```
</details>

***Exercise 2.2.2.1e: What are the column averages?***

This time put `axis=0` inside the parentheses for the call to mean (passing in the argument `0` for the `axis` parameter), to find the column averages.

In [None]:
# Type your code on the line below this comment:


<details>
    <summary><b>Solution</b></summary>
    
```python
arr.mean(axis=0)
```
</details>

#### 2.2.3 Use `dir` to list attributes <a id="dir"></a>

You can use the built-in function [***dir( )***](https://docs.python.org/3/library/functions.html#dir) to list the attributes of an object, both the data attributes and the method attributes.  Attributes with names that have leading and trailing double underscores (referred to as [***dunder names***](https://docs.python.org/3/reference/lexical_analysis.html#reserved-classes-of-identifiers)) are not meant to be used directly, although they can be.
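
Because the output of `dir()` can be long, it is sometimes handy to hide the dunder names. The sketch below does that for a complex number; it uses a list comprehension, a construct that has not been covered so far in this tutorial, and the name `public_names` is just an illustrative choice.

```python
z = complex(2, 3)

# Keep only the attribute names that do not start with a double underscore
public_names = [name for name in dir(z) if not name.startswith("__")]

print(public_names)
print("conjugate" in public_names)  # True
```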

In [None]:
# List the attributes of our NumPy array
dir(arr)

### 2.3 Identity and Naming <a id="identity"></a>

Every object has an [***identity***](https://docs.python.org/3/library/functions.html#id), an integer which is guaranteed to stay the same over the life of the object, and which is returned from a call to the `id()` function with the object as the only argument.  Objects that exist at the same time have different identities, even if they have the same value.

In [None]:
# Create a list, and print its id
a = [7, 8, 9]
print(id(a))

In [None]:
# Create another list
b = [7, 8, 9]

# check if the two lists are equal (have the same value)
print(a == b)

#### 2.3.1 Testing identity with `is` <a id="is_keyword"></a>

You can test if two names refer to the same object using the keyword [***is***](https://docs.python.org/3/reference/expressions.html#is).

In [None]:
# Check if the two names refer to the SAME list
print(a is b)

In [None]:
# You can see they are different by inspecting their ids
print(id(a))
print(id(b))

#### 2.3.1.1 Mutating does not change identity

If we change which items a list contains, that does not change its id, which is guaranteed to stay the same over the life of the object. ***Mutating does not change the identity***.
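
For contrast, here is a small sketch (with the hypothetical name `nums`) showing that mutating a list keeps its id, whereas building a new list and rebinding the name gives the name a different object:

```python
nums = [1, 2, 3]
id_before = id(nums)

# Mutation: the same list object now contains a different item
nums[0] = 99
same_after_mutation = id(nums) == id_before
print(same_after_mutation)   # True

# Rebinding: `nums + [5]` creates a NEW list, then the name is bound to it
nums = nums + [5]
same_after_rebinding = id(nums) == id_before
print(same_after_rebinding)  # False
```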

In [None]:
b[2] = 10
print(b)
print(id(b))

#### 2.3.2 Objects can have multiple names <a id="two_names"></a>

[Assignment statements in Python do not copy objects](https://docs.python.org/3/library/copy.html), they bind names.

In [None]:
# Bind the name `c` to the object named `a`. Now it has (at least) two names.
c = a

print(a is c)

##### Exercise 2.3.2.1: a list by any other name ...

In [None]:
# Change the last entry in the list `a` to have the value 'cat'


# We'll print `a` to see if you were successful
print(a)

<details>
    <summary><b>Solution</b></summary>
    
```python
a[-1] = 'cat'
```
</details>

***What*** do you think the output of the following print function will be?

In [None]:
# Predict the output before running the cell
print(c)

<details>
    <summary><b>Remember</b></summary>
    
Changing the last entry in the list named `a` (mutating that list) also changed the last entry in the list named `c`, because ***they are the same list!***. There is only one list, but it has two names: `a` and `c`. 
    
Executing the assignment statement `c = a`, does ***NOT*** create a second list.  It only binds the name `c` to the list named `a`, giving it a second name.
</details>

#### 2.3.3 The `list.copy()` method <a id="list_copy"></a>

If we want to ***COPY*** a list, instead of giving it another name, then we can use the `copy()` method on the list that we want to copy.  That ***will*** create a ***new*** list.  The new list will have the same value, but it will ***not*** have the same id, since it is a ***different*** list.  Note that the `copy()` method creates a [***shallow***](https://docs.python.org/3/library/copy.html#module-copy) copy, meaning that it does ***not*** copy the items in the list. It is a new list that contains the ***same*** items.
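
The three claims above (same value, different list, same items) can be checked directly. This sketch uses the hypothetical names `original` and `shallow`:

```python
original = [[1, 2], "text"]
shallow = original.copy()

print(shallow == original)        # True: same value
print(shallow is original)        # False: a different list
print(shallow[0] is original[0])  # True: the items themselves were not copied
```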

In [None]:
# make a copy of the list named `a`, and name that new list `d`
d = a.copy()

In [None]:
# Now change the second entry in the list `a` to have the value [3, 4]
a[1] = [3,4]

# We'll print `a` to see if you were successful
print(a)

##### Exercise 2.3.3.1: Mutating the original list

What do you think the output of the following print function will be?

In [None]:
# Predict the output before running the cell
print(d)

<details>
    <summary><b>Solution</b></summary>
    
Mutating the list `a` (i.e. changing which items it contains) did not mutate the list `d` because they are ***different*** lists.
</details>

#### 2.3.4 Deep copy <a id="deep_copy"></a>

You can use the [***deepcopy***](https://docs.python.org/3/library/copy.html#copy.deepcopy) function in the `copy` module in order to completely copy an object, including making copies of all of the objects it contains, the objects those contain, etc.  Note that for large objects which contain many nested objects this can take a long time, so only make a deep copy when you really need one.

In [None]:
# Let's make a copy of the list `a`
e = a.copy()

# print the values and ids of the objects in the list `e`:
print(f"{e[0]} has id {id(e[0])}")
print(f"{e[1]} has id {id(e[1])}")
print(f"{e[2]} has id {id(e[2])}")

In [None]:
# The second entry in the list `a` is [3, 4]
# Let's change the first entry in that list to 'Whoa'
a[1][0] = 'Whoa'

# We'll print `a` to see if you were successful
print(a)

##### Exercise 2.3.4.1: Mutating an item in a copy

What do you think the output of the following print function will be?

In [None]:
# Predict the output before running the cell
print(e)

<details>
    <summary><b>Remember</b></summary>
    
We made a copy of the list `a` and named that copy `e`. If we had assigned a new object to be the second item in `a` (i.e. mutated `a`), that object would ***not*** become the second item in `e`, since they are different lists: mutating `a` does not mutate `e`.
    
However, both list `a` and list `e` contain the same list as their second item.  The assignment statement `a[1][0] = 'Whoa'` mutated ***that*** list.  It did not mutate `a`, which contains the same objects that it did before that assignment statement was executed.  
    
Since `e` also contains ***that*** list, we see the mutation when we `print(e)`.
</details>

We [did not mutate](#immutable_containers) `e`; it is still a list of the same three objects (their ids have not changed from before). However, the value of one of its (mutable) objects has changed:

In [None]:
# print the values and ids of the objects in the list `e`:
print(f"{e[0]} has id {id(e[0])}")
print(f"{e[1]} has id {id(e[1])}")
print(f"{e[2]} has id {id(e[2])}")

A regular copy of a container (like a list) is a ***different*** container that contains the ***same*** items.  If any of the items are mutable, then mutations to the items are visible in both containers.

To make sure that ***NO CHANGES*** to an object affect a copy of that object, not even mutating one of its mutable items, we need to make that copy a ***deepcopy***.

A deep copy of a container (like a list) is a ***different*** container that contains ***different*** items (which are themselves deep copies of the items in the original container).

Notice the recursive nature of the deep copy.  For deeply nested data structures, making a deep copy can take a long time.
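
As a compact side-by-side sketch of the two kinds of copy (the names `nested`, `shallow`, and `deep` are only for illustration):

```python
import copy

nested = [1, [2, 3]]

shallow = copy.copy(nested)   # same effect as nested.copy()
deep = copy.deepcopy(nested)  # also copies the inner list

# Mutate the inner list through the original
nested[1][0] = "changed"

print(shallow)  # [1, ['changed', 3]]  (shares the inner list)
print(deep)     # [1, [2, 3]]          (has its own inner list)
```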

In [None]:
import copy

# make a DEEPCOPY of `a`
f = copy.deepcopy(a)
print(f)

In [None]:
# Now changes to `a` will not affect `f`
a[1][1] = 'Nellie!'

print(a)
print(f)

#### 2.3.5 Immutable objects may unexpectedly be the same <a id="only_one_None"></a>

The following case rarely comes up, but here is its explanation in case it does.

The following two assignment statements cause two ***different*** objects to be stored in memory:
```python
a = [1, 2, 3]
b = [1, 2, 3]
```
Python ***must*** make these different lists, because it does not know if you will change them to have different values later.

However, the same is not true for ***immutable*** objects.  An implementation of Python is free to use the same immutable object every time you bind a name to it, since you can never change its value. This can save memory; for example, so that `None` is not stored in memory ***many*** times unnecessarily:

In [None]:
a = None
b = None
a is b
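
The flip side is that you should not rely on two equal immutable objects being the same object. In the sketch below, the result of the `is` test depends on the Python implementation, so it is best to compare values with `==` and reserve `is` for identity checks such as `is None`. The names `n1` and `n2` are hypothetical.

```python
n1 = int("1000")
n2 = int("1000")

print(n1 == n2)  # True: the values are equal

# Whether these are the same object is an implementation detail,
# so this may print either True or False
print(n1 is n2)
```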

## Part 3: Control Flow <a id="control_flow"></a>
In this part we will go over the syntax used to [***control the flow***](https://docs.python.org/3/tutorial/controlflow.html) of a program, such as iteration, looping, and conditional branching.

### 3.1 `if` statements <a id="if_keyword"></a>

To have code execute conditionally, use the [***if***](https://docs.python.org/3/tutorial/controlflow.html#if-statements) keyword, followed by an expression (the condition), then a colon, then the code in a block (called the [***suite***](https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-suite)) ***indented*** by four spaces, such as:
```python
if some_expression:
    suite
```
The code in the suite will execute if and only if the expression evaluates as [truthy](#truth_value).

Note that when writing code in a Jupyter notebook (or other IDE), pressing enter after typing the colon after the condition automatically indents the cursor.  

Failure to include an indented block of code results in an <font color='red'>IndentationError</font>.

Many other programming languages use delimiters like curly braces `{}` to denote a block of code, but Python uses [***indentation***](https://docs.python.org/3/reference/lexical_analysis.html#indentation).

Note: there is also a different notion of a code [***block***](https://docs.python.org/3/reference/executionmodel.html#structure-of-a-program) in Python, which is ***not*** what is meant here by an *indented block of code*. Here we are only referring to code that has the same [***level of indentation***](https://docs.python.org/3/reference/lexical_analysis.html#indentation), which is how the grouping of statements is determined.

Let's look at some examples:

In [None]:
# this `if` statement is NOT followed by an indented block of code, so it will raise an error
if True:
    
a = 1

In [None]:
guess = 99

if guess < 42:
    # this is INSIDE the block of code that ONLY executes when `guess < 42` is True.
    print("Too low!") 
print("Have a nice day!") # This is not inside that block of code

##### Exercise 3.1: `if`

Change the value of `guess` to `13` and run the cell again. What changed?

<details>
    <summary><b>Solution</b></summary>
    
With `guess = 13`, the expression `guess < 42` evaluates to `True`, so the code in the suite is executed, printing `"Too low!"`.
</details>

#### 3.1.1 `else` clause <a id="else_keyword"></a>

You can optionally include an `else` clause. If present, it must include an indented block of code, as in:
```python
if some_expression:
    suite1
else:
    suite2
```
When there are not any `elif` clauses, the suite in the `else` clause executes if and only if the expression in the `if` statement evaluates as [falsy](#truth_value).

In [None]:
# You can also have an `else` block:

guess = 99

if guess < 42:
    print("Too low!")
else:
    print("Too high!")
print("Have a nice day!")

#### 3.1.2 `elif` clause <a id="elif_keyword"></a>

You can optionally include as many `elif` clauses as you want. If present, each must include an indented block of code. The conditions are checked in order: if the expression in the `if` statement is falsy, the first `elif` with an expression that evaluates as truthy has its suite executed, and any remaining `elif` clauses and the `else` clause are skipped.  When there are `elif` clause(s) and an `else` clause, the suite in the `else` clause executes if and only if the expressions in the `if` statement and all `elif` clause(s) evaluate as falsy. You do not need to have an `else` block to have `elif` block(s).

In [None]:
# of course, the guessing game above is RIGGED!

# we can add an `elif` clause to make the game possible to win:
guess = 99

if guess < 42:
    print("Too low!")
elif guess > 42:
    print("Too high!")
else:
    print("You win!")
print("Have a nice day!")

##### Exercise 3.1.2.1: `elif`

Predict the output of the following code cell.  Then run it to see if you are correct.

In [None]:
# we can add as many `elif` clauses as we want
guess = 99

if guess < 10:
    print("Way too low!")
elif guess < 42:
    print("Too low!")
elif guess > 90:
    print("Way too high!")
elif guess > 42:
    print("Too high!")
else:
    print("You win!")
print("Have a nice day!")

##### Exercise 3.1.2.2: more `elif`

Change the value of `guess` multiple times, getting each call to `print` to execute.

Can you get **"Way too high!"** AND **"Too high!"** to print?

<details>
    <summary><b>Solution</b></summary>
    
You should be able to get each print function to execute by choosing an appropriate number for `guess`, but you ***cannot*** get `"Way too high!"` ***and*** `"Too high!"` to print at the same time, because once the condition in an `elif` evaluates as true, its suite is executed and the rest of the `elif` and `else` clauses are skipped.
</details>

##### Exercise 3.1.2.3: why use `elif`?

Why do we need `elif`?  Why not just use multiple `if` statements?

Predict the output of the following cell. Then run it to see if you are correct.

In [None]:
# this time we use separate `if` statements instead of `elif` clauses
guess = 99

if guess < 10:
    print("Way too low!")
if guess < 42:
    print("Too low!")
if guess > 90:
    print("Way too high!")
if guess > 42:
    print("Too high!")
else:
    print("You win!")
print("Have a nice day!")

<details>
    <summary><b>Solution</b></summary>
    
Every `if` statement with a condition that evaluates as true will have its block of code executed. Sometimes that is what you want to happen. If so, use multiple `if` statements.
    
When you want to ensure that exactly ***one*** block of code executes, use the `if`-`elif`-`else` construct.
</details>

### 3.2 Truth Value Testing <a id="truth_value"></a>

Any expression can be used as the condition in an `if` statement or an `elif` clause, no matter what type of object it evaluates to (it does ***not*** need to evaluate to one of the two `bool` values, `True` or `False`).

The [***truth value***](https://docs.python.org/3/library/stdtypes.html#truth-value-testing) of most objects is true, in the sense that the code in the `if` block will execute (sometimes referred to as "truthy"). Besides `False`, the main built-in objects that are considered false are `None`, ***zero*** of any numeric type, or ***empty*** containers (lists, tuples, dictionaries, etc.); this is sometimes referred to as being "falsy".

##### Exercise 3.2.1: true or false

Bind the name `answer` to an object of one of the types you learned so far and predict which statement will print before running the code cell. Then run it to see if you are correct. ***REPEAT*** for many different types of values.

In [None]:
# bind the name `answer` to different values
answer = 

# Is `answer` true? (not necessarily True with a capital T)
if answer:
    print("It is True!")
else:
    print("It is False!")
    
print(answer)
print(type(answer))
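
Another way to check the truth value of an object, without writing an `if` statement, is to pass it to the built-in `bool()` constructor. A few illustrative cases:

```python
print(bool(None))     # False
print(bool(0.0))      # False: zero of any numeric type is falsy
print(bool({}))       # False: an empty dictionary is falsy
print(bool("False"))  # True: a non-empty string is truthy
print(bool([0]))      # True: a non-empty list is truthy
```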

#### 3.2.1 Pythonic `if` statements <a id="pythonic_if"></a>

Rather than check if a number does not equal zero or a container has a positive length, in Python it is common to use the truth value of the objects directly.

In [None]:
# Rather than do this:
N_STUDENTS = 0

if N_STUDENTS != 0:
    print(f"I have {N_STUDENTS} students.")
else:
    print("I don't have any students.")

In [None]:
# In Python, we can just use the value directly.
N_STUDENTS = 0

if N_STUDENTS:
    print(f"I have {N_STUDENTS} students.")
else:
    print("I don't have any students.")

In [None]:
# Similarly, instead of checking that a container is nonempty, as in:
my_students = ['Alice', 'Bob', 'Carl', 'Dave', 'Fred']

if len(my_students) > 0:
    print("My students are:", end=" ") # `end` is a newline by default, but I'm making it a space
    print(*my_students, sep=", ") # the * unpacks the list, and `sep` is what to print between items
else:
    print("I don't have any students.")

In [None]:
# In Python, we can just check the truth value of the object
my_students = ['Alice', 'Bob', 'Carl', 'Dave', 'Fred']

if my_students:
    print("My students are:", end=" ")
    print(*my_students, sep=", ")
else:
    print("I don't have any students.")

In [None]:
# Any object with length zero is falsy
msg = ""

if msg:
    print(msg)
else:
    print("I got nothin.")

***Important Tip:*** if you are using what looks like `False` but it is behaving as if it is `True`, make sure that it is ***not*** text, since `"False"` is truthy!  The only falsy string is the empty string.

In [None]:
# bind the name `answer` to the string "False"
answer = "False"

if answer:
    print("It is True!")
else:
    print("It is False!")
    
print(answer)
print(type(answer))

***Important Tip:*** If you want to ensure that the value of `some_variable` is not `None`, then do ***NOT*** rely on:
```python
if some_variable:
    block_of_code
```
...since that condition is also considered false for zero and for empty containers:

In [None]:
# Youngest child's age, or `None` if the person does not have any children
youngest_child_age = 0 #newborn

if youngest_child_age:
    print(f"Her youngest child is {youngest_child_age}.")
else:
    print("She does not have any children.")

<details>
    <summary><b>Solution</b></summary>
    
She does have children, but the child's age is zero, which is "falsy".  We should have used:
```python
if youngest_child_age is not None:
    print(f"Her youngest child is {youngest_child_age}.")
else:
    print("She does not have any children.")
```
</details>

### 3.3 Boolean operations 

The condition used for flow control can be composed of other conditions using the [***boolean operations***](https://docs.python.org/3/library/stdtypes.html#boolean-operations-and-or-not) `and`, `or`, and `not`.

#### 3.3.1 Logical negation <a id="not_keyword"></a>

The logical operation of negation uses the keyword `not`. The following expression evaluates to `True` if `some_expression` is falsy, or `False` if the expression is truthy:
```python
not some_expression
``` 

##### Exercise 3.3.1.1: `not some_expression`

Predict the value of the following expressions, and the output of the cells. Then run the cells to see if you are correct.

In [None]:
not False

In [None]:
not None

In [None]:
not 7

In [None]:
not ()

In [None]:
not [0]

In [None]:
not (0)

In [None]:
not "False"

In [None]:
not print("This is a function that does stuff.")

<details>
    <summary><b>Solution</b></summary>
    
`not False` evaluates to `True`.
    
`None` is "falsy", so `not None` also evaluates to `True`.
    
`7`, like any non-zero number, is "truthy", so `not 7` evaluates to `False`.
    
`()` is the empty tuple, and every empty container is "falsy", so `not ()` evaluates to `True`.
    
`[0]` is a non-empty list, and no matter what it contains, a non-empty container is "truthy", so `not [0]` evaluates to `False`.
    
`(0)` is the number zero, ***not*** a tuple, because there is no trailing comma. Zero is "falsy", so `not (0)` evaluates to `True`.
    
`"False"` is a non-empty string, so no matter what it 'says', it is "truthy", so `not "False"` evaluates to `False`.
    
The `print()` function prints a string representation of its arguments, but it actually returns `None`, which is "falsy", so `not print("This is a function that does stuff.")` evaluates to `True`.
</details>

##### 3.3.1.1 Negating `is`

Earlier, we learned how to check that two names refer to the same object using the keyword [***is***](#is_keyword), such as:
```python
a is b
```
Note that you *could* check that two names refer to *different* objects by negating the expression as:
```python
not a is b
```
<span style="color:red">***But do NOT do that!***</span> Instead, use the [***is not***](https://docs.python.org/3/reference/expressions.html#is-not) operator, as recommended by [***PEP 8***](https://peps.python.org/pep-0008/#programming-recommendations), because it is easier for humans to parse:
```python
a is not b
```
Note that `a is not b` means the same thing as `not (a is b)`, not the same thing as `a is (not b)`.
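
The three groupings can be compared directly. This is a small sketch using the hypothetical names `first` and `second`:

```python
first = [1, 2, 3]
second = [1, 2, 3]

print(first is not second)    # True
print(not (first is second))  # True: same meaning as the line above

# `not second` is False (a non-empty list is truthy),
# and `first` is not the object False
print(first is (not second))  # False
```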

In [None]:
a = [1,2,3]
b = [1,2,3]

# same value, but DIFFERENT objects
a is not b

#### 3.3.2 Logical conjunction <a id="and_keyword"></a>

The logical operation of conjunction uses the keyword `and`. The following expression is truthy if `expression1` ***and*** `expression2` are ***both*** truthy, otherwise the expression is falsy.
```python
expression1 and expression2
```
***Note 1:*** the expression does not necessarily evaluate to `True` or `False`. Instead, it evaluates to `expression1` if `expression1` is falsy, otherwise it evaluates to `expression2`.

***Note 2:*** the expression ***short-circuits*** if `expression1` is falsy, meaning that `expression2` will not be evaluated at all in that case.

##### Exercise 3.3.2.1: `expression1 and expression2`

Predict the value of the following expressions ***AND*** the output of the cells. Then run the cells to see if you are correct.

***REALLY THINK*** about the truth value of ***each*** term before you form an answer.

In [None]:
7 and 8

In [None]:
8 and "banana"

In [None]:
9 and None

In [None]:
print(9 and None)

In [None]:
[] and "banana"

In [None]:
[0] and print("banana")

In [None]:
(0) and not "banana"

In [None]:
0 and print("Orange you glad I didn't say banana!")

<details>
    <summary><b>Solution</b></summary>
    
`7`, like any non-zero number, is "truthy", so `7 and 8` evaluates to `8`.
    
`8` is "truthy", so `8 and "banana"` evaluates to `"banana"`.
    
`9` is "truthy", so `9 and None` evaluates to `None`, but Jupyter notebooks do not output `None`, so it did not create an output cell.
    
`print(9 and None)` prints the value of `9 and None`, which is `None`. Note that that is different than creating an output cell.
    
`[]` is an empty list, which is "falsy", so `[] and "banana"` evaluates to `[]`.
    
`[0]` is a non-empty list, which is "truthy", so `[0] and print("banana")` evaluates to the result of calling `print("banana")`, which prints `banana` and returns `None`, which is why there is no output cell. Notice that the `banana` that is printed is not surrounded by quotes, because the function ***printed*** `banana`, it did ***not*** return `"banana"`.

`(0)` is the number zero, ***not*** a tuple, so it is "falsy" and `(0) and not "banana"` evaluates to `0`.
    
`0` is "falsy", so `0 and print("Orange you glad I didn't say banana!")` short-circuits and never prints.
</details>
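
Short-circuiting is not just a curiosity; it is often used as a guard so that the second expression only runs when it is safe to do so. The following is a minimal sketch with hypothetical names (`total`, `count`, `is_large_average`):

```python
total = 10
count = 0

# If `count` is zero, `count != 0` is False, so the division on the
# right is never evaluated and no ZeroDivisionError is raised.
is_large_average = count != 0 and total / count > 3

print(is_large_average)  # False
```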

#### 3.3.3 Logical disjunction <a id="or_keyword"></a>

The logical operation of disjunction uses the keyword `or`. The following expression is truthy if either `expression1` ***or*** `expression2` is truthy (or ***both*** are truthy), otherwise the expression is falsy.
```python
expression1 or expression2
``` 
***Note 1:*** the expression does not necessarily evaluate to `True` or `False`. Instead, it evaluates to `expression1` if `expression1` is truthy, otherwise it evaluates to `expression2`.

***Note 2:*** the expression ***short-circuits*** if `expression1` is truthy, meaning that `expression2` will not be evaluated at all in that case.

##### Exercise 3.3.3.1: `expression1 or expression2`

Predict the value of the following expressions ***AND*** the output of the cells. Then run the cells to see if you are correct.

***REALLY THINK*** about the truth value of ***each*** term before you form an answer.

In [None]:
7 or 8

In [None]:
8 or "banana"

In [None]:
None or 9

In [None]:
[] or "banana"

In [None]:
[0] or print("banana")

In [None]:
(0) or not "banana"

In [None]:
0 or print("Orange you glad I didn't say banana!")

<details>
    <summary><b>Solution</b></summary>
    
`7`, like any non-zero number, is "truthy", so `7 or 8` evaluates to `7`.
    
`8` is "truthy", so `8 or "banana"` evaluates to `8`.
    
`None` is "falsy", so `None or 9` evaluates to `9`.
    
`[]` is an empty list, which is "falsy", so `[] or "banana"` evaluates to `"banana"`.
    
`[0]` is a non-empty list, which is "truthy", so `[0] or print("banana")` evaluates to `[0]`, short-circuiting without printing `"banana"`.

`(0)` is the number zero, ***not*** a tuple, so it is "falsy" and `(0) or not "banana"` evaluates to `not "banana"`, which evaluates to `False` since `"banana"` is a non-empty string.
    
`0` is "falsy", so `0 or print("Orange you glad I didn't say banana!")` evaluates to `print("Orange you glad I didn't say banana!")`, which prints `Orange you glad I didn't say banana!` and evaluates to `None`, which is why there is no output cell.
</details>

##### Exercise 3.3.3.2: Alternate uses of Boolean operations 

Earlier, we saw the conditional logic in the first two cells below, that uses `if` and `else`.

Achieve the same result in the cell after those, without using  `if` and `else`, by passing some Boolean expression as the argument to `print`. Run the cells to see if you get the same results for both approaches. Next, change `msg` to `"Now, I got somethin."` and run the cells again, checking that you still get the same result for both approaches.

In [None]:
# Any object with length zero is falsy
msg = ""

In [None]:
if msg:
    print(msg)
else:
    print("I got nothin.")

In [None]:
# put your Boolean expression in the call to `print()` below
print()

<details>
    <summary><b>Solution</b></summary>
    
```python
print(msg or "I got nothin.")
```
</details>
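
A word of caution about using `or` to supply a default value: it replaces ***every*** falsy value, not just `None`. The sketch below uses a hypothetical setting named `retries`, where zero is a legitimate value:

```python
retries = 0  # a legitimate value that happens to be falsy

# `or` treats 0 the same as "missing" and substitutes the default
print(retries or 3)  # 3

# Testing for None explicitly keeps the legitimate zero
if retries is None:
    retries = 3
print(retries)  # 0
```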

#### 3.3.4 Conditional Expressions (aka ternary operator) <a id="ternary"></a>

Python also has [***conditional expressions***](https://docs.python.org/3/reference/expressions.html#conditional-expressions) with the following syntax:
```python
x if C else y
``` 
The truth value of condition `C` is evaluated first. If `C` is truthy, this expression evaluates as `x`, otherwise it evaluates as `y`.

Given those semantics, we can give the following equivalences for `and` and `or` in terms of conditional expressions:

- `x and y` is equivalent to `x if (not x) else y`
- `x or y` is equivalent to `x if x else y`

Of course, conditional expressions are more general, and the condition `C` does not have to be at all related to `x` or `y`.
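
As a quick check of those two equivalences, here is a sketch with one falsy and one truthy operand (the values are arbitrary):

```python
x = []
y = "banana"

print(x and y)              # []
print(x if (not x) else y)  # []

print(x or y)               # banana
print(x if x else y)        # banana
```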

In [None]:
msg = ""

print(msg if msg else "I got nothin.")

In [None]:
correct_answer = 42
student_answer = 24

print("Correct. Good job!" if student_answer == correct_answer else "Incorrect. Try again.")

### 3.4 `while` loop <a id="while_keyword"></a>

The syntax of a [***while***](https://docs.python.org/3/reference/compound_stmts.html#while) loop is similar to the syntax of an `if` statement:
```python
while some_expression:
    suite
```
Unlike an `if` statement, which checks its condition one time, a `while` loop (1) checks its condition, (2) executes the suite if the condition is true, then repeats those two steps until the condition is false. If the condition is false on the first check, the suite does ***not*** get executed.

If the condition never becomes false, the suite will be run until you somehow interrupt the computer, such as by pressing the square "stop" symbol ■ or holding "ctrl" and pressing "c".  This is called an "infinite loop" (usually a bug).
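
Here is a minimal sketch of a loop that is guaranteed to end, because each pass through the suite moves `countdown` (a hypothetical name) closer to making the condition false:

```python
countdown = 3

while countdown > 0:
    print(countdown)
    countdown -= 1  # without this line, the loop would never end

print("Liftoff!")
```

This prints `3`, `2`, `1`, and then `Liftoff!`.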

##### Exercise 3.4.1: `while` loop

Predict the output of the following cell, then run it to see if you are correct.

In [None]:
# We can have the same block of code execute over and over until some condition is satisfied:
import math

guess = math.pi

# while the guess is less than 42
while guess < 42:
    
    # add 1 to guess, this is called an "augmented assignment"
    guess += 1
    
print(guess)

<details>
    <summary><b>Solution</b></summary>
    
`guess` is initialized to `math.pi` (approximately `3.1415926535898`) before the loop. Since `3.1415926535898 < 42`, the suite is executed. The suite consists of a single statement: an augmented assignment statement that increments `guess`. Each time after the suite is executed, the condition `guess < 42` is checked again. When the value of `guess` is finally large enough for `guess < 42` to evaluate to `False`, the loop is exited. In this example, the suite is executed 39 times, until `guess` is approximately `42.1415926535898`. After the loop is exited, the line of code `print(guess)` prints the string representation of `guess`.
</details>

##### Exercise 3.4.2: Guessing Game with `input`

The [***input***](https://docs.python.org/3/library/functions.html#input) function below will prompt the user (you) for an input.  The return value is a string, but passing that to the `int()` constructor will try to change it to an integer.  If you type something that Python does not know how to convert to an integer (like letters), then it will raise an error. But if you type only digits, those digits will be converted to an integer, and the name `guess` will be bound to that value.

Try to give various answers to get each statement to print.

In [None]:
# Combining a while loop and some if, elif, else statements, we can make:

guess = 0

while guess != 42:
    
    # Let user input a new guess, and convert their input to an int
    guess = int(input("Guess"))
    
    if guess < 10:
        print("Way too low!")
    elif guess < 42:
        print("Too low!")
    elif guess > 90:
        print("Way too high!")
    elif guess > 42:
        print("Too high!")
    else:
        print("You win!")
              
print("Have a nice day!")

### 3.5 `for` loops <a id="for_keyword"></a>

In this section, we will use a simplified description of `for` loops that ignores how they actually work, which involves creating an [***iterator***](https://docs.python.org/3/glossary.html#term-iterator). We will explain those details later.

For now (no pun intended), we will think of a [***for***](https://docs.python.org/3/reference/compound_stmts.html#for) loop as iterating directly over an [***iterable***](https://docs.python.org/3/glossary.html#term-iterable) (examples of iterables include lists, tuples, ranges, etc.), and executing a suite for each item in the iterable.  The syntax of a for loop is:
```python
for some_variable in some_iterable:
    suite
```
In this syntax, `for` and `in` are keywords, `some_variable` can be any variable name that you want, and `some_iterable` is the name (or expression, [literal](https://docs.python.org/3/reference/lexical_analysis.html#literals), etc.) of whatever you want to iterate over.

Each iteration, the name `some_variable` is bound to the next item from the iterable `some_iterable`, then the `suite` is executed. This continues for each item in the iterable, until there are no more items (the iterable is exhausted).

If the iterable is empty, then the suite is never executed.
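
As a minimal sketch of the description above (the names `visited`, `empty_list`, and `count` are made up for this illustration), the first loop below runs its suite once per item, while the second loop never runs its suite because its iterable is empty:

```python
# hypothetical names, used only for this sketch
visited = []

for item in (10, 20, 30):
    # `item` is bound to 10, then 20, then 30
    visited.append(item)

print(visited)   # [10, 20, 30]

empty_list = []
count = 0

for item in empty_list:
    # never reached: there is no item to bind `item` to
    count += 1

print(count)     # 0
```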

In the following example, the name `student` is bound to the first item in `my_students`, which is `'Alice'`, then the suite (two calls to `print`) is executed. Then `student` is bound to the next item, and the suite is executed again, and so on until every student has been asked.

In [None]:
my_students = ['Alice', 'Bob', 'Carl', 'Dave', 'Fred']

# Ask each of my students 
for student in my_students:
    print("Send in the next student.")
    print(f"\tWould you like to present at the next meeting, {student}?")

#### 3.5.1 `continue` statements <a id="continue_keyword"></a>

A [***continue***](https://docs.python.org/3/reference/simple_stmts.html#continue) statement skips the rest of the code in the suite for that iteration, and *continues* with the next iteration.

A continue statement can be used in a `for` or `while` loop.

In [None]:
for student in my_students:
    
    print("Send in the next student.")
    
    # if we don't want to ask Bob
    if student == "Bob":
        continue
        
    print(f"\tWould you like to present at the next meeting, {student}?")
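
Since `continue` also works in a `while` loop, here is a small sketch of that case (the names `n` and `odds` are made up for this illustration). Note that `n` is incremented ***before*** the `continue`. If the increment came after it, the loop would skip the increment on even numbers and never finish.

```python
# hypothetical example: collect the odd numbers from 1 to 10
n = 0
odds = []

while n < 10:
    n += 1

    # skip the rest of the suite for even numbers
    if n % 2 == 0:
        continue

    odds.append(n)

print(odds)   # [1, 3, 5, 7, 9]
```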

#### 3.5.2 `break` statements <a id="break_keyword"></a>

A [***break***](https://docs.python.org/3/reference/simple_stmts.html#break) statement skips the rest of the code in the suite for that iteration and *breaks* out of the loop.

A break statement can be used in a `for` or `while` loop.

In [None]:
volunteer = "Carl"

for student in my_students:
    
    print("Send in the next student.")
    
    # if we don't want to ask Bob
    if student == "Bob":
        continue
        
    print(f"\tWould you like to present at the next meeting, {student}?")
    
    # if the student volunteers
    if student == volunteer:
        print(f"{student} will present!")
        
        # don't ask anyone else
        break

#### 3.5.3 `else` clause after `for` loop <a id="for_else"></a>

An [***else***](https://docs.python.org/3/reference/compound_stmts.html#for) clause, if present, will be executed after the iterable is exhausted, ***if*** the loop was not terminated by reaching a `break` statement.

In [None]:
volunteer = "Carl"

for student in my_students:
    
    print("Send in the next student.")
    
    # if we don't want to ask Bob
    if student == "Bob":
        continue
        
    print(f"\tWould you like to present at the next meeting, {student}?")
    
    # if the student volunteers
    if student == volunteer:
        print(f"{student} will present!")
        
        # don't ask anyone else
        break
        
else:
    print("I guess I'll have to just pick someone.")

##### Exercise 3.5.3.1: `for` loop `else` clause

Look at the output of the cell above. Change the volunteer in the cell above to `None` and run the cell again. What happened?

<details>
    <summary><b>Solution</b></summary>
    
Because the `for` loop exited without executing the `break`, the `else` clause was executed.
</details>
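
A common use of the `else` clause is a search: `break` when the item is found, and let the `else` clause handle the "not found" case. The following is only a sketch, using made-up names (`numbers`, `results`, `target`), that searches the same list for two different targets:

```python
# hypothetical data for this sketch
numbers = [3, 7, 11, 15]
results = {}

for target in (11, 8):
    for n in numbers:
        if n == target:
            results[target] = "found"
            # leaving the inner loop with `break` skips its `else` clause
            break
    else:
        # only reached if the inner loop ran out of items without a `break`
        results[target] = "not found"

print(results)   # {11: 'found', 8: 'not found'}
```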

##### Exercise 3.5.3.2: `break` out of `while` loop

Add a `break` to the following while loop so that if someone guesses the value `lose`, it prints ***"You lose!"*** and doesn't give them any more tries.

In [None]:
# an initial value for guess that makes sure we enter the loop
guess = 0

# if this value is guessed, we want the person to lose without any more tries
lose = 13

while guess != 42:
    
    # Let user input a new guess, and convert their input to an int
    guess = int(input("Guess"))
    
    if guess < 10:
        print("Way too low!")
    elif guess < 42:
        print("Too low!")
    elif guess > 90:
        print("Way too high!")
    elif guess > 42:
        print("Too high!")
    else:
        print("You win!")
              
print("Have a nice day!")

<details>
    <summary><b>Solution</b></summary>
    
This certainly is not the only possible way to do it, but it's probably the most obvious way:
    
```python
# an initial value for guess that makes sure we enter the loop
guess = 0

# if this value is guessed, we want the person to lose without any more tries
lose = 13

while guess != 42:
    
    # Let user input a new guess, and convert their input to an int
    guess = int(input("Guess"))
    
    if guess == lose:
        print("You lose!")
        break
    
    if guess < 10:
        print("Way too low!")
    elif guess < 42:
        print("Too low!")
    elif guess > 90:
        print("Way too high!")
    elif guess > 42:
        print("Too high!")
    else:
        print("You win!")
              
print("Have a nice day!")
```
</details>

##### Exercise 3.5.3.3: Iterate over a string

Iterate over the string `"hello, world!"`, printing each character.  Note that strings are "iterable"; they are sequences of characters.

In [None]:
# create for loop


<details>
    <summary><b>Solution</b></summary>
    
```python
for c in "hello, world!":
    print(c)
```
</details>

### 3.6 Iterating over dictionaries <a id="iter_dict"></a>

Iterating over a dictionary only iterates over the keys.  To iterate over the values, or both the keys and the values at the same time, you need to use the methods described in this section.

In [None]:
# student: grade pairs
grades = {'Alice': 95, 'Bob': 84, 'Carl': 73, 'Dave': 68, 'Fred': 0}

# trying to print the grades, but not going to work this way
for g in grades:
    print(g)

#### 3.6.1 The `values` method of a dictionary <a id="dict_values"></a>

To iterate over the values of a dictionary, call its [***values***](https://docs.python.org/3/library/stdtypes.html#dict.values) method, which returns an object of type `dict_values`.

In [None]:
# to iterate over the values, use the `values` method
for g in grades.values():
    print(g)
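
Iterating over the values is handy when the keys don't matter, for example when computing an average. This is just a sketch, with a small made-up dictionary `grades_example` so that it can run on its own:

```python
# hypothetical data, independent of the `grades` dictionary above
grades_example = {'Alice': 95, 'Bob': 84, 'Carl': 73}

total = 0
for g in grades_example.values():
    total += g

average = total / len(grades_example)
print(average)   # 84.0
```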

#### 3.6.2 The `items` method of a dictionary <a id="dict_items"></a>

To iterate over the keys and values of a dictionary at the same time, call its [***items***](https://docs.python.org/3/tutorial/datastructures.html#looping-techniques) method, which returns an object of type `dict_items`.

In [None]:
# each iteration, select the next `key:value` pair, and
#     assign the key to the name `s` and the value to the name `g`
for s, g in grades.items():
    print(f"{s} scored {g}")

In [None]:
# I used `s` for student, and `g` for grade, but generically you will usually see
#   `k` used as the name for the key and `v` used as the name for the value, as in:
for k, v in grades.items():
    print(f"{k} scored {v}")

In [None]:
# if we convert the dictionary items to a list, we see that it becomes a list of tuples
list(grades.items())
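
Having both the key and the value available in each iteration makes it easy to select keys based on their values. A minimal sketch, assuming a made-up dictionary `grades_example` and an arbitrary passing grade of 70:

```python
# hypothetical data and threshold for this sketch
grades_example = {'Alice': 95, 'Bob': 84, 'Carl': 73, 'Dave': 68}
passing = []

for student, grade in grades_example.items():
    # keep the key (student) only if the value (grade) is high enough
    if grade >= 70:
        passing.append(student)

print(passing)   # ['Alice', 'Bob', 'Carl']
```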

### 3.7 Packing and unpacking <a id="unpacking"></a>

There is syntax in Python that performs multiple [assignments](https://docs.python.org/3/reference/simple_stmts.html#assignment-statements) ([name bindings](https://docs.python.org/3/reference/executionmodel.html#naming-and-binding)) in a single statement. 

#### 3.7.1 Sequence unpacking <a id="unpacking"></a>

[***Sequence unpacking***](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences) is when an assignment statement has multiple names (or other targets) on the left hand side (the [***target list***](https://docs.python.org/3/reference/simple_stmts.html#grammar-token-python-grammar-target_list)) and a sequence on the right hand side.  There must be as many targets as there are items in the sequence, except when performing extended iterable unpacking as described shortly.  Each name is bound to the corresponding item.

In [None]:
# bind the name `s` to the value 'Alice' and the name `g` to the value 95
s, g = ('Alice', 95)

print(f"{s=}")
print(f"{g=}")

In [None]:
# unpacking works with lists too
s, g = ['Alice', 95]
        
print(f"{s=}")
print(f"{g=}")

In [None]:
# or any "iterable", such as a string (a sequence of characters)
a, b, c = "XYZ"

print(f"{a=}")
print(f"{b=}")
print(f"{c=}")

In [None]:
my_students = ('Alice', 'Bob', 'Carl', 'Dave', 'Fred')

# but the number of values to unpack must be the same as the number of names on the left hand side...
valedictorian, others, failure = my_students

#### 3.7.2 Iterable unpacking <a id="iterable_unpacking"></a>

In [***iterable unpacking***](https://docs.python.org/3/reference/expressions.html#grammar-token-python-grammar-starred_expression), a starred expression is expanded into its items.

Compare the results of the following two cells:

In [None]:
a = [1, (2, 3), 4]
print(a)
print(len(a))

In [None]:
a = [1, *(2, 3), 4]
print(a)
print(len(a))
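
Any iterable can be expanded this way, and more than one starred expression can appear in the same list. A short sketch (the names `first`, `second`, `combined`, and `letters` are made up for this illustration):

```python
first = [1, 2]
second = (3, 4)

# expand a list and a tuple into a new list, then add one more item
combined = [*first, *second, 5]
print(combined)   # [1, 2, 3, 4, 5]

# a string is an iterable of characters, so it can be expanded too
letters = [*"abc"]
print(letters)    # ['a', 'b', 'c']
```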

#### 3.7.3 [Extended iterable unpacking](https://peps.python.org/pep-3132/) <a id="extended_unpacking"></a>

When exactly one of the targets on the left side of an assignment is preceded by an asterisk, `*`, the number of targets does not need to equal the number of items to unpack. The name preceded by the asterisk (called a starred target) is bound to a list containing all of the items that were not assigned to the other targets (possibly an empty list).

##### Exercise 3.7.3: Extended iterable unpacking

Put an asterisk before the name `others` below (without a space, as in `*others`, referred to as a starred target), then run the cell.

In [None]:
my_students = ('Alice', 'Bob', 'Carl', 'Dave', 'Fred')

# but the number of values to unpack must be the same as the number of names on the left hand side...
valedictorian, others, failure = my_students

In [None]:
# when one of the names is prefixed by an asterisk, 
#   a list of the remaining items is bound to that name
print(f"{valedictorian=}")
print(f"{others=}")
print(f"{failure=}")

In [None]:
# unpacking can be nested as well
(a, (b, c)), d, e = [1, [2, 3]], 4, 5

print(f"{a=}")
print(f"{b=}")
print(f"{c=}")
print(f"{d=}")
print(f"{e=}")
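
The starred target from extended iterable unpacking can appear in any position, and it is always bound to a list, even when unpacking a tuple or a string, and even when nothing is left over. A small sketch with made-up names:

```python
# starred target last: first item, then everything else
first, *rest = [1, 2, 3, 4]
print(first, rest)          # 1 [2, 3, 4]

# starred target first: everything except the last item
*init, last = "abc"
print(init, last)           # ['a', 'b'] c

# nothing left over for the starred target, so it is an empty list
head, *middle, tail = (1, 2)
print(head, middle, tail)   # 1 [] 2
```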

### 3.8 Use `enumerate` to iterate over values and non-negative integers <a id="enumerate"></a>

When an iterable is passed to the [***enumerate***](https://docs.python.org/3/library/functions.html#enumerate) constructor, it returns an enumerate object that is similar to a list of tuples that *enumerate* the items in the iterable.

Let's look at some examples:

In [None]:
list(enumerate(["apple", "butterfly", "cat", "dog"]))

In [None]:
my_students = ['Alice', 'Bob', 'Carl', 'Dave', 'Fred']

# use tuple unpacking with `enumerate` to number each item from the iterable:
for i, s in enumerate(my_students):
    print(f"{s} is number {i}")
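
By default the numbering starts at 0, but `enumerate` accepts an optional `start` argument. A minimal sketch (the names `students_example` and `lines` are made up for this illustration):

```python
students_example = ['Alice', 'Bob', 'Carl']
lines = []

# start counting at 1 instead of 0
for i, s in enumerate(students_example, start=1):
    lines.append(f"{i}. {s}")

print(lines)   # ['1. Alice', '2. Bob', '3. Carl']
```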

### 3.9 Use `zip` to iterate over values from multiple iterables <a id="zip"></a>

When multiple iterables are passed to the [***zip***](https://docs.python.org/3/library/functions.html#zip) constructor, it returns a zip object that is similar to a list of tuples such that the Nth tuple contains the Nth item from each iterable.  It stops once one of the iterables is exhausted.

Let's look at some examples:

In [None]:
my_zip = zip(["apple", "butterfly", "cat", "dog"],
             ["1st", "2nd", "3rd"],
             ("aaa","bbb","ccc")
            )
list(my_zip)

In [None]:
last_names = ['Anderson', 'Brown', 'Collins', 'Davis', 'Foster']

# tuple unpacking and `zip` are useful for iterating over multiple iterables at the same time
for first, last in zip(my_students, last_names):
    print(f"{first} {last}")

##### Exercise 3.9.1: Putting it all together

***(not easy)*** Create a `for` loop that uses ***only*** the `grades` dictionary and `last_names` list (***not*** the `my_students` list) to print the grades for each student, using the students' full names. (e.g. "Alice Anderson earned a 95")

<details>
    <summary><b>Solution</b></summary>
    
```python
for (first, grade), last in zip(grades.items(), last_names):
    print(first, last, "earned a", grade)
```
</details>

## Part 4: Functions <a id="functions"></a>

### 4.1 Defining with `def` <a id="def_keyword"></a>

A [***function***](https://docs.python.org/3/glossary.html#term-function) is any sequence of statements which returns a value to a caller. Functions can be defined so that  [***arguments***](https://docs.python.org/3/glossary.html#term-argument) can be passed by the caller.  Besides returning a value, functions can also execute any desired code.

We define a function using the keyword [***def***](https://docs.python.org/3/tutorial/controlflow.html#defining-functions), as in the following syntax:
```python
def name_of_function(parameters):
    block
    return something
```
The name `name_of_function` can be any [***valid***](#naming_rules) Python name. The `parameters` will be discussed shortly.  The `block` of code (called the ***body*** of the function) can be any code that you want to run when the function is ***called***. 

Whatever follows the keyword [***return***](https://docs.python.org/3/reference/simple_stmts.html#return) is the value *returned* from the *call* to the function. That value is returned to wherever the function was called, meaning that, if `f(x)` returns `42`, then you can think of that as replacing `f(x)` by `42` in the calling code.

***Note*** that *defining* the function does ***not*** run the code in the body of the function. To run that code, you need to ***call*** the function.  We [***call***](https://docs.python.org/3/reference/expressions.html#calls) a function using its name followed by parentheses `()` containing the actual parameters (arguments) that we are passing to the function, or empty if we are not passing in any arguments.

Let's look at some examples:

The next cell defines a function that returns the surface area of a box without a top. Note that *defining* the function creates an object of type function, and binds the name in the [***function definition***](https://docs.python.org/3/reference/compound_stmts.html#def) to that object, but does ***not*** execute the code in the body of the function or return a value.

In [None]:
# function to compute the surface area of a box without top
def SA_of_box_without_top(length, width, height):
    
    area = 2*length*height + 2*width*height + length*width
    
    return area

In [None]:
type(SA_of_box_without_top)

The next cell calls the function using its name, followed by a parenthesized list of arguments. The following call binds `length` to 2, `width` to 3, and `height` to 5, inside the [namespace](#namespace) of the function.

In [None]:
# Call the function, passing in arguments
SA_of_box_without_top(2, 3, 5)
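
Because a call evaluates to the returned value, calls can be used anywhere a value can, such as inside a larger expression. A minimal sketch with a made-up function `area_of_rectangle`:

```python
# hypothetical function for this sketch
def area_of_rectangle(length, width):
    return length * width

# each call is "replaced" by its return value: 6 + 20
total_area = area_of_rectangle(2, 3) + area_of_rectangle(4, 5)
print(total_area)   # 26
```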

The next cell redefines the function so that it prints multiple values, in addition to returning the area.

In [None]:
# prints the L,W,H, and SA;  returns the surface area.
def SA_of_box_without_top(length, width, height):
    
    print(f"{length=}")
    print(f"{width=}")
    print(f"{height=}")
    
    area = 2*length*height + 2*width*height + length*width
    print(f"{area=}")
    
    return area

In [None]:
# Call function with the arguments 2, 3, and 5
SA_of_box_without_top(2, 3, 5)

#### 4.1.1 `return` is optional <a id="return_keyword"></a>

We can define a function without a [***return***](https://docs.python.org/3/reference/simple_stmts.html#return) value, as in the following syntax:
```python
def name_of_function(parameters):
    block
```
When a function is defined without an explicit `return` statement, a call to the function returns `None`.

In [None]:
# Only prints the L,W,H, and SA. Does not return the SA.
def SA_of_box_without_top(length, width, height):
    
    print(f"{length=}")
    print(f"{width=}")
    print(f"{height=}")
    
    area = 2*length*height + 2*width*height + length*width
    print(f"{area=}")
    
# Call the function using its name, followed by a parenthesized list of arguments
SA_of_box_without_top(2, 3, 5)

***Tip:*** When functions are defined like the one above, their purpose is to do something other than return a value. Be careful not to use the return value of such functions under the false belief that the return value is meaningful, as in the next cell:

In [None]:
# Save the box area in a variable?
box_area = SA_of_box_without_top(2, 3, 5)

# then use it later
print("The box area is", box_area)

***Example:***

In [None]:
# an unsorted list
my_list = [4,2,9,1,7,5]

# let's sort it
my_sorted_list = my_list.sort()

# let's check it
print(my_sorted_list)

##### Exercise 4.1.1.1: Methods that return `None`

What happened?  If the [***sort***](https://docs.python.org/3/library/stdtypes.html#list.sort) method returned `None`, do you think the purpose of the method is to return a value?  Try printing `my_list`.

In [None]:
# print `my_list`


<details>
    <summary><b>Solution</b></summary>
    
```python
print(my_list)
```
</details>

The list was ***mutated*** (lists are mutable). If you want to return a sorted copy, instead of sorting the list ***in-place***, use the [***sorted***](https://docs.python.org/3/library/functions.html#sorted) built-in function:

In [None]:
# to return a sorted list (a copy):
my_sorted_list = sorted([4,2,9,1,7,5])

print(my_sorted_list)
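
To see the difference side by side, here is a short sketch (with made-up names) showing that `sorted` leaves the original list alone, while the `sort` method changes it and returns `None`:

```python
original = [4, 2, 9, 1]

# `sorted` returns a new list; `original` is unchanged
new_list = sorted(original)
print(original)       # [4, 2, 9, 1]
print(new_list)       # [1, 2, 4, 9]

# `sort` mutates the list in-place and returns None
return_value = original.sort()
print(return_value)   # None
print(original)       # [1, 2, 4, 9]
```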

#### 4.1.2 conditional `return` <a id="conditional_returns"></a>

A function can have multiple `return` statements. This is usually only useful in conjunction with conditional logic.  The first `return` statement that executes exits the body of the function and returns control to where the function was called, along with the return value.
```python
def name_of_function(parameters):
    if something:
        return value
    else:
        return some_other_value
```

If the entire body of the function executes without executing a `return` statement, then `None` is returned.
```python
def name_of_function(parameters):
    if something:
        value = 42
        return value
    else:
        value = 24 # this value is NOT returned
```
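
Here is a runnable sketch of a function with several `return` statements (the function `describe_sign` is made up for this illustration). Whichever `return` is reached first ends the call, so no `else` is needed after a suite that always returns:

```python
# hypothetical function for this sketch
def describe_sign(number):
    if number > 0:
        return "positive"
    if number < 0:
        return "negative"
    # only reached if neither `return` above was executed
    return "zero"

print(describe_sign(5))    # positive
print(describe_sign(-2))   # negative
print(describe_sign(0))    # zero
```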

##### Exercise 4.1.2.1: Which `return`?

Predict the output of the following cell. Then run it to see if you are correct.

In [None]:
def answer(something):
    if something:
        msg = 42
        return msg
    elif not something:
        msg = 24
        return msg
    else:
        msg ="This will never be the answer."
        return msg
        
print(answer(something=False))

<details>
    <summary><b>Solution</b></summary>
    
Since something is `False`, `not something` is `True`, so the `elif` suite is executed, which includes the assignment statement `msg = 24` and the return statement `return msg`.  Therefore, the value returned is `24`, which is printed.
</details>

##### Exercise 4.1.2.2: Forgetting to `return`

Predict the output of the following cell. Then run it to see if you are correct.

In [None]:
def answer(something):
    if something:
        value = 42
        return value
    else:
        value = 24
        
print(answer(something=False))

<details>
    <summary><b>Solution</b></summary>
    
Since something is `False`, the `if` suite is not executed, so the `else` suite is executed, which includes the assignment statement `value = 24`. However, that value is never used, and no return statement is reached, so the function returns `None`.  Therefore, the value returned is `None`, which is printed.
</details>

##### Exercise 4.1.2.3: `return` in nested function

Run the next two cells, then predict the output of the following cell. Then run it to see if you are correct.

In [None]:
def answer(question):
    
    # this is a function defined inside a function
    def compute_answer(question):
        return len(question)
        
    if question:
        value = compute_answer(question)
        return value
    else:
        value = compute_answer("What's the answer to the ultimate question")

In [None]:
print(answer("What's the answer to the ultimate question"))

In [None]:
print(answer(0))

<details>
    <summary><b>Solution</b></summary>
    
Since the `question` is `0`, which is [***falsy***](#truth_value), the code in the `else` block is executed:
```python
value = compute_answer("What's the answer to the ultimate question")
```
<br></br>
The function `compute_answer` returns `42`, meaning that the line above effectively becomes:
```python
value = 42
```
<br></br>
However, that is the end of function `answer`. The body of the function does not have any code that does ***anything*** with that value.
    
Since the entire body of the function `answer` has executed without executing a `return` statement (`compute_answer` returned `42` ***to*** the body of `answer`, but no `return` has been executed ***in*** the body of `answer`), `answer` returns `None`.
    
***Remember***: If your function does ***not*** execute a return statement, then it will return `None` (If it returns. It could also get stuck in an infinite loop, if it contains a while loop, or it could terminate due to an unhandled exception, discussed later).

***Remember***: If you call `function1`, and inside that function `function2` is called, then when `function2` returns a value, it returns the value to where ***it*** was called, not to where `function1` was called.
</details>

### 4.2 [Parameters and arguments](https://docs.python.org/3/faq/programming.html#faq-argument-vs-parameter) <a id="parameters"></a>

A [***parameter***](https://docs.python.org/3/glossary.html#term-parameter) is a name in a function definition that specifies a value that the function can accept when called.  When the function is called, it is passed [***arguments***](https://docs.python.org/3/glossary.html#term-argument) whose values are assigned to those names.

Previously, when we called `SA_of_box_without_top(2, 3, 5)`, we were telling Python to assign the value 2 to the first parameter, 3 to the second, and 5 to the third; where 'first', 'second', and 'third' refer to the order of the parameters in the definition of the function. This is referred to as passing ***positional arguments***.

We can also pass the arguments to the parameters as [***keyword arguments***](https://docs.python.org/3/tutorial/controlflow.html#keyword-arguments) using the syntax `parameter=argument` inside the parentheses of the function call.  When we do that, it does ***not*** matter if we specify the keyword arguments in the same order as in the definition of the function.

We can also mix the two, passing some positional arguments and some keyword arguments.  However, positional arguments ***cannot*** come after keyword arguments.

***Let's look at examples:***

If we don't remember the order of the parameters,  we can provide the arguments as ***keyword arguments***,  meaning that we precede the argument with the name of the parameter and `=`

In [None]:
SA_of_box_without_top(height=5, length=2, width=3)

If we know the first is parameter `length`, we can make that a ***positional argument*** (i.e. not a *keyword argument*), but pass the others as *keyword arguments*:

In [None]:
SA_of_box_without_top(2, height=5, width=3)

But *positional arguments* **CANNOT** come after *keyword arguments*:

In [None]:
SA_of_box_without_top(length=2, 3, 5)

#### 4.2.1 Positional Only & Keyword Only Arguments <a id="arguments"></a>

It is possible to require certain arguments to be passed by position, by including a `/` in the parameter list.  Any parameters ***before*** the `/` can only take positional arguments. Similarly, arguments can be required to be passed in as keyword arguments by including a `*` in the parameter list. Any parameters ***after*** the `*` can only take keyword arguments. This syntax is shown [here](https://docs.python.org/3/tutorial/controlflow.html#special-parameters):
```python
def f(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2):
    block
```
When the function defined above is called, the parameters `pos1` and `pos2` can only be passed positional arguments, the parameters `kwd1` and `kwd2` can only be passed keyword arguments, and `pos_or_kwd` can be passed either.

***Note*** that the names of the parameters do not affect which are positional/keyword; only their locations relative to (before/after) the `/` and `*` do. They are named that way in this example just to better illustrate which are positional, which are keyword, and which can be either.

Let's look at some examples:

##### Exercise 4.2.1.0: positional vs keyword arguments

Predict what will happen when the following cell is run. Then run it to see if you are correct.

In [None]:
# which of these is positional only, keyword only, or either?
def f(a, /, b, *, c):
    print(a, b, c)

<details>
    <summary><b>Solution</b></summary>
    
The function `f` is defined, adding the name `f` to the local namespace, but the body of the function is not executed, so nothing is printed.
</details>

##### Exercise 4.2.1.1: positional vs keyword arguments

Predict what will happen when the following cell is run. Then run it to see if you are correct.

In [None]:
# What will happen in the following function call?
f(a=1, b=2, c=3)

<details>
    <summary><b>Solution</b></summary>
    
We tried to pass in an argument for the parameter `a` using the syntax `a=1`, but `a` only accepts ***positional*** arguments, so that raised a <span style="color:red">***TypeError***</span>.
</details>

##### Exercise 4.2.1.2:

Make `a` a positional argument and run the cell again.

<details>
    <summary><b>Solution</b></summary>
    
```python
f(1, b=2, c=3)
```
    
This works.
</details>

##### Exercise 4.2.1.3: 

Now make `b` a positional argument and run the cell again.

<details>
    <summary><b>Solution</b></summary>
    
```python
f(1, 2, c=3)
```
    
This also works, because `b` can take positional arguments or keyword arguments.
</details>

##### Exercise 4.2.1.4: 

Now make `c` a positional argument, and run the cell again

<details>
    <summary><b>Solution</b></summary>
    
```python
f(1, 2, 3)
```
    
We tried to pass a positional argument for the parameter `c`, but `c` only accepts ***keyword*** arguments, so that raised a <span style="color:red">***TypeError***</span>.
</details>

***Recap:***

Any parameter ***before*** a lone `/` in the parameter list must be passed positional arguments.

Any parameter ***after*** a lone `*` in the parameter list must be passed keyword arguments
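
To tie the recap together, here is a sketch using a made-up function `divide`, whose first two parameters are positional-only and whose `rounding` parameter is keyword-only (it also has a default value of `None`). The `try` statements (exceptions are discussed later) just record which calls raise a `TypeError`:

```python
# hypothetical function for this sketch
def divide(numerator, denominator, /, *, rounding=None):
    result = numerator / denominator
    if rounding is not None:
        result = round(result, rounding)
    return result

print(divide(1, 4))               # 0.25
print(divide(1, 3, rounding=2))   # 0.33

errors = []

# positional-only parameters cannot be passed by keyword
try:
    divide(numerator=1, denominator=4)
except TypeError:
    errors.append("keyword passed to positional-only parameter")

# keyword-only parameters cannot be passed by position
try:
    divide(1, 4, 2)
except TypeError:
    errors.append("positional passed to keyword-only parameter")

print(errors)
```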

#### 4.2.2 Default arguments <a id="default_args"></a>

Some of the parameters in a function definition can be given [***default argument values***](https://docs.python.org/3/tutorial/controlflow.html#default-argument-values).  In that case, those parameters are ***not*** required to be passed argument values when the function is called, because the default values will be used if no arguments are passed.

Otherwise, arguments must be passed in for each parameter or an error will be raised.

A function can be defined with some parameters given default argument values and others not.  However, a parameter without a default argument value ***cannot*** come after a parameter with a default argument value.

In [None]:
# This throws an error for "missing 1 required positional argument"
SA_of_box_without_top(2,3)

In [None]:
# We can give default values for the arguments when defining the function
def SA_of_box_without_top(length, width=1, height=1):
    
    print(f"{length=}")
    print(f"{width=}")
    print(f"{height=}")
    
    area = 2*length*height + 2*width*height + length*width
    print(f"{area=}")
    
# Now only the length needs to be given
SA_of_box_without_top(2)

##### Exercise 4.2.2.1: Default argument values

In the cell above, make only the width have a default value, then run the cell again. What happened? Why?

<details>
    <summary><b>Remember</b></summary>
    
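A parameter without a default value cannot follow a parameter with a default value, so the function definition raises a <span style="color:red">***SyntaxError***</span>.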
</details>
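
Default values combine nicely with keyword arguments: you can override a later default while leaving an earlier one alone. A minimal sketch with a made-up function `box_volume`:

```python
# hypothetical function for this sketch
def box_volume(length, width=1, height=1):
    return length * width * height

print(box_volume(2))             # 2   (width=1, height=1)
print(box_volume(2, 3))          # 6   (height=1)
print(box_volume(2, height=5))   # 10  (width keeps its default of 1)
```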

#### 4.2.3 Arbitrary argument lists: `*args, **kwargs` <a id="arbitrary_arg_lists"></a>

A function can also accept an unspecified number of arguments by using [***arbitrary argument lists***](https://docs.python.org/3/tutorial/controlflow.html#arbitrary-argument-lists). This is denoted by a parameter of the form `*args`. Any extra positional arguments passed during a call to the function will become items in a tuple named `args` (or whatever the name after the `*` is).

Similarly, defining a function with a parameter of the form `**kwargs` will allow extra [***keyword arguments***](https://docs.python.org/3/tutorial/controlflow.html#keyword-arguments) to be passed during a function call. Those keyword arguments will become items in a dictionary named `kwargs` (or whatever the name after the `**` is).

If present, the parameter `**kwargs` must come after the parameter `*args`.

Let's look at examples:

In [None]:
# Giving too many arguments throws an error
SA_of_box_without_top(2,3,5,7)

In [None]:
# As does giving keyword arguments with the wrong name
SA_of_box_without_top(length=2, width=3, heighth=5)

In [None]:
# but we can define our function to accept extra positional or keyword arguments
#   note that `extra` and `more` are just names that I chose. You can use any names
def SA_of_box_without_top(length, width=1, height=1, *extra, **more):
    
    print(f"{length=}")
    print(f"{width=}")
    print(f"{height=}")
    
    area = 2*length*height + 2*width*height + length*width
    print(f"{area=}")
    
    if extra or more:
        print("\nI didn't use:")
    if extra:
        print(f"{extra=}, which is a {type(extra)}")
    if more:
        print(f"{more=}, which is a {type(more)}")
    
SA_of_box_without_top(2,3,5,7,11,heighth=13,girth=17)
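
In the example above, the extra arguments were simply reported. More often, a function actually uses them. Here is a sketch with two made-up functions, `average` and `make_record`, that do so:

```python
# hypothetical functions for this sketch
def average(*numbers):
    # `numbers` is a tuple of all the positional arguments
    return sum(numbers) / len(numbers)

def make_record(**fields):
    # `fields` is a dictionary of all the keyword arguments
    return fields

print(average(1, 2, 3, 4))                    # 2.5
print(make_record(name='Alice', grade=95))    # {'name': 'Alice', 'grade': 95}
```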

#### 4.2.4 Unpacking arguments <a id="unpack_args"></a>

When passing arguments during a function call, you can pass the arguments together in a tuple (or list) if you use a `*` before the tuple (or list) to [unpack](https://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists) the arguments.

Similarly, you can unpack a dictionary of keyword arguments using `**` before the dictionary in the call to the function.

Let's look at some examples

In [None]:
# You can 'go the other way', by unpacking a list or tuple into positional arguments
lwh = [2, 3, 5, 113]

# Note that here the star is before the argument (unpacking), instead of
#   before the parameter in the function definition, which would make an arbitrary argument list
SA_of_box_without_top(*lwh)

In [None]:
# You can also unpack a dictionary into keyword arguments
lwh = {'length': 2, 'width': 3, 'height': 5, 'answer': 42}

# Note that here the double star is before the argument (unpacking), instead of
#   before the parameter in the function definition, which would collect extra keyword arguments
SA_of_box_without_top(**lwh)
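
Unpacking arguments works with any function, including built-ins. The sketch below (with made-up names `bounds` and `settings`) unpacks a list into the positional arguments of `range`, and a dictionary into the keyword arguments `sep` and `end` of `print`. The dictionary keys must match parameter names that the function accepts, otherwise a `TypeError` is raised.

```python
# unpack a list into positional arguments: same as range(3, 6)
bounds = [3, 6]
numbers = list(range(*bounds))
print(numbers)   # [3, 4, 5]

# unpack a dictionary into keyword arguments: same as print('a', 'b', sep='-', end='!\n')
settings = {'sep': '-', 'end': '!\n'}
print('a', 'b', **settings)   # a-b!
```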

### 4.3 Scopes and Namespaces <a id="namespace"></a>

In Python, a [***namespace***](https://docs.python.org/3/glossary.html#term-namespace) is an association between names and objects.  A [***scope***](https://docs.python.org/3/tutorial/classes.html#python-scopes-and-namespaces) is a region of code in which a namespace is accessible.  At any point, there are multiple nested scopes whose namespaces are accessible. Different namespaces can use the same names, meaning that the same name in two different namespaces might refer to two different objects.  Python [***resolves***](https://docs.python.org/3/reference/executionmodel.html#resolution-of-names) names into objects by searching the namespace for the innermost scope first.

The code inside the body of a function does not execute when the function is defined, only when the function is called. That's when the local namespace of a function is created.  Any names in the body of the function are first searched for in that namespace.  After the function returns or raises an unhandled exception, that namespace is no longer accessible.

Let's look at some examples:

In [None]:
answer = 42

def reply(question):
    
    # since `answer` is not defined inside the function,
    # Python searches its enclosing scope, and finds `42`
    print(f"The answer is {answer}")
    
reply("What is the answer to life, the universe, and everything?")

In [None]:
# delete answer
del answer

In [None]:
# make sure it's deleted

# try to print it
try:
    print(answer)
    
# if that raised a NameError, then 'answer' was deleted
except NameError:
    print("There's no object named 'answer'!")

In [None]:
# define the function again
def reply(question):
    
    # `answer` is not defined inside the function
    print(f"The answer is {answer}")

# define `answer` before calling the function
answer = 42

# when the function is called, the code in the function is executed
#   when the name `answer` is resolved, it is not in the local namespace of the function
#   but it is found in the enclosing scope
reply("What is the answer to life, the universe, and everything?")

##### Exercise 4.3.1: Nested scopes 1

What do you think the following cell will print?

In [None]:
# Predict what this will print, then run the cell
answer = 42

def reply(question):
    
    answer = 777
    
    # now answer is defined inside the function
    print(f"The answer is {answer}")
    
answer = 42
    
reply("What is the answer to life, the universe, and everything?")

<details>
    <summary><b>Remember</b></summary>
    
The innermost scope is checked first. If a name used inside a function was bound inside that function, the function uses the local value.
</details>

##### Exercise 4.3.2:  Nested scopes 2

What do you think the following cell will print?

In [None]:
# Predict what this will print, then run the cell
print(f"The answer is {answer}")

<details>
    <summary><b>Remember</b></summary>

Names [***bound***](https://docs.python.org/3/reference/executionmodel.html#binding-of-names) inside a function definition are local to that function (unless they are declared with the keyword [***nonlocal***](https://docs.python.org/3/reference/simple_stmts.html#nonlocal) or the keyword [***global***](https://docs.python.org/3/reference/simple_stmts.html#global)). Once a function returns, its namespace is no longer accessible.
</details>
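
The two keywords mentioned in the hidden explanation can be sketched as follows. This is only an illustration; the names `counter`, `increment_global`, and `make_counter` are hypothetical and are not used elsewhere in this tutorial:

```python
counter = 0

def increment_global():
    # `global` makes assignments to `counter` rebind the module-level name
    global counter
    counter += 1

def make_counter():
    count = 0

    def increment():
        # `nonlocal` makes assignments to `count` rebind the name in `make_counter`
        nonlocal count
        count += 1
        return count

    return increment

increment_global()
increment_global()

tick = make_counter()
tick()
second_tick = tick()

print(counter, second_tick)
```

Without the `global` and `nonlocal` declarations, the augmented assignments would make `counter` and `count` local to the innermost function instead.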

In [None]:
# Predict what this will print, then run the cell
answer = [1,1,2,3]

def reply(question):
    
    # add another item to the end of the list
    answer.append(5)
    
    # `answer` is not bound inside the function, only mutated
    print(f"The answer is {answer}")
    
reply("What is the answer to life, the universe, and everything?")

##### Exercise 4.3.3: Mutable arguments

Looking at the code above, what do you think the following cell will print?

In [None]:
# Predict what this will print, then run the cell
print(f"The answer is {answer}")

<details>
    <summary><b>Remember</b></summary>

Didn't I just say that names bound inside a function definition are local to that function, and once the function returns, its namespace is no longer accessible?  
    
That's true.  They are not accessible. 
    
But `answer` was ***not*** bound inside the function; it was mutated inside the function. 
    
When a mutable is mutated inside a function, it doesn't magically unmutate itself later. See [***here***](https://nedbatchelder.com/text/names.html) for more details.
</details>

##### Example 4.3.4: The unusual case of <span style="color:red">***UnboundLocalError***</span>

```python
x = 10
def foo():
    print(x)
    x += 1
```

When you assign a value to a name anywhere in a scope (e.g. inside a function), that name becomes local to the whole scope. Because of the [augmented assignment](#augmented_assignment) `x += 1`, the name `x` is local to `foo`. Calling `foo()` therefore raises <span style="color:red">***UnboundLocalError***</span> as soon as `print(x)` tries to look up `x` in the namespace of the function `foo`, where it has not been assigned a value yet. See [***here***](https://docs.python.org/3/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value) for more details.
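
A runnable sketch of this behavior follows. It calls `foo` inside a `try` block so that the error can be inspected, and then shows one possible fix; `foo_fixed` is a hypothetical name introduced only for this illustration:

```python
x = 10

def foo():
    # `x` is assigned below, so it is local everywhere in `foo`,
    #   and this lookup fails because the local `x` has no value yet
    print(x)
    x += 1

try:
    foo()
except UnboundLocalError as err:
    error_message = str(err)
    print(f"UnboundLocalError: {error_message}")

def foo_fixed():
    # declaring `x` global makes both lines use the module-level `x`
    global x
    print(x)
    x += 1

foo_fixed()
print(x)
```

The exact wording of the error message depends on your Python version. Passing the value in as an argument and returning the new value is often a cleaner fix than using `global`.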

#### 4.3.1 Mutable default argument values <a id="mutable_default_args"></a>

<span style="color:red">Bottom line:  ***usually*** a bad idea.</span>

The code inside the ***body*** of a function is only executed when the function is called.  However, the expressions for default values in the parameter list are evaluated once, when the function is ***defined***. That means that the same default objects are reused each time the function is called without passing arguments to those parameters.

In [None]:
# a function that makes a list of its arguments, optionally appending them
#   to an already existing list (for instructional purposes only)
def list_add(*args, old_list=[]):
    
    old_list.extend(args)
    
    return old_list
    
my_list = [10,20,30]

new_list = list_add(40, 50, 60, old_list=my_list)
print(new_list)

##### Exercise 4.3.1.1: Mutable default arguments

Let's say we want to start a brand new list.  We made `old_list` optional, by giving it a default value, so let's use that default.

What do you think the following cell will print?

In [None]:
# starting a new list using the default argument for old_list
brand_new_list = list_add(1, 2, 3)

print(f"{brand_new_list=}")

##### Exercise 4.3.1.2: Mutable default arguments 2

Let's create ANOTHER brand new list, again using the default value for `old_list`.

What do you think the following cell will print?

In [None]:
# starting a new list using the default argument for old_list
another_new_list = list_add(44, 55, 66)

print(f"{another_new_list=}")

##### Exercise 4.3.1.3: Without using the default value

This time let's ***NOT*** use the default value for `old_list`.

What do you think the following cell will print?

In [None]:
my_other_list = [1000,2000,3000]

this_list = list_add(4000, 5000, 6000, old_list=my_other_list)
print(this_list)

##### Exercise 4.3.1.4: Mutable default arguments 3

Let's create ONE MORE brand new list, again using the default value for `old_list`.

What do you think the following cell will print?

In [None]:
# starting a new list using the default argument for old_list
newest_list = list_add(777, 888, 999)

print(f"{newest_list=}")

<details>
    <summary><b>Remember</b></summary>

When we defined `list_add`, the default value `old_list=[]` was created.  Every time we called `list_add` without an argument for `old_list`, it used that same list as the value for `old_list`, which is why it kept getting bigger. Mutable default arguments can give unexpected results!
</details>

##### Exercise 4.3.1.5: No mutable default arguments

Change `old_list=[]` to `old_list=None` in the definition of `list_add`.

Then add the line 
```python
if old_list is None: old_list = []
```
as the first line in the body of the function.

Finally, run that cell again, and all of the cells below it (up to this one).

What changed?
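
<details>
    <summary><b>Solution</b></summary>
    
A minimal sketch of the modified function, assuming the changes described in this exercise. It is named `list_add_fixed` here (a hypothetical name) only so that it does not overwrite your own `list_add`:

```python
# `list_add_fixed` is a hypothetical name, chosen so that it does not
#   overwrite your own `list_add`
def list_add_fixed(*args, old_list=None):

    # create a new list on every call that does not pass `old_list`
    if old_list is None:
        old_list = []

    old_list.extend(args)

    return old_list

first = list_add_fixed(1, 2, 3)
second = list_add_fixed(44, 55, 66)

print(f"{first=}")
print(f"{second=}")
```

With `None` as the default, a new empty list is created inside the body on each call, so the lists no longer share items.
</details>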

### 4.4 First-class functions <a id="first_class_functions"></a>

In Python, functions are first-class, which means that they can be used like other objects, e.g. as items in lists, or passed in as arguments during function calls, etc.
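
As a small sketch of what "used like other objects" can mean, the following cell stores two functions as values in a dictionary and then calls each of them. The names `double`, `square`, and `operations` are hypothetical and are used only for this illustration:

```python
def double(x):
    return 2 * x

def square(x):
    return x**2

# the functions themselves (not their return values) are the dictionary values
operations = {'double': double, 'square': square}

# call each stored function with the same argument
results = {name: fn(7) for name, fn in operations.items()}

print(results)
```

Note that there are no parentheses after `double` and `square` inside the dictionary, because the functions are being referred to, not called.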

In [None]:
def f1(x):
    return x + 1

def f2(x):
    return x**2

def f3(x):
    return 3*x

def f4(x):
    return x//4

def f5(x):
    return x % 5

def compose(x, fns_list):
    
    y = x
    
    for f in fns_list:
        
        # print the input and output, saving the output as the next input
        print(f"{y} -> {(y := f(y))}")
        
    return y

##### Exercise 4.4.1: list of functions

Create a list of the five functions `f1` through `f5`. Then call `compose`, passing the argument `17` to the parameter `x` and your list to the parameter `fns_list`.

In [None]:
# Put your code here:


<details>
    <summary><b>Solution</b></summary>
    
```python
fun = [f1,f2,f3,f4,f5]
compose(17, fun)
```
</details>

### 4.5 Lambda Expressions <a id="lambda_keyword"></a>

There is a shorthand for creating small functions when you don't want to name them (anonymous functions), that uses the following syntax:
```python
lambda parameters: expression
```
[***Lambdas***](https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions) can only contain a single expression. They're often used to pass simple anonymous functions as arguments, as in the following example:

In [None]:
compose(17,
        [lambda x: x + 1,
         lambda x: x**2,
         lambda x: 3*x,
         lambda x: x//4,
         lambda x: x % 5,
        ]
       )

##### Exercise 4.5.1: use a lambda as an argument

The `sorted` built-in function returns a list of the items of a sequence (e.g. list or tuple) sorted in some order. It can take an optional argument passed to a parameter named [***key***](https://docs.python.org/3/howto/sorting.html#key-functions), which should be a function (or other callable). When an argument is passed to `key`, the sequence is sorted based on the return value of that function, instead of based on the items in the sequence.
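
For instance, the following sketch sorts a hypothetical list of words (not used elsewhere in this tutorial) in two different ways, by passing a different lambda to `key` each time:

```python
words = ['banana', 'fig', 'cherry', 'kiwi']

# sort by the number of characters in each word
by_length = sorted(words, key=lambda w: len(w))

# sort by the last letter of each word
by_last_letter = sorted(words, key=lambda w: w[-1])

print(by_length)
print(by_last_letter)
```

Items with equal keys (such as `'banana'` and `'cherry'`, which have the same length) keep their original relative order.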

Pass a lambda in for the `key` to sort the following list of tuples in order from highest grade to lowest:

In [None]:
# list of (student, grade) tuples
exam1 = [('Alice', 100), ('Bob', 85), ('Carla', 70), ('Dave', 68), ('Fred', 0)]

# sorted by grade


<details>
    <summary><b>Solution</b></summary>
    
```python
sorted(exam1, key=lambda x: -x[1])
```
</details>

### 4.6 Documentation strings <a id="docstrings"></a>

The first statement in the definition of a function can be a string literal, called the documentation string, or [***docstring***](https://docs.python.org/3/tutorial/controlflow.html#documentation-strings). By [***convention***](https://peps.python.org/pep-0257/), triple-quoted strings are used. The first line of a docstring is meant to be a concise summary of the purpose of the function. If there are more lines, they usually describe the arguments, return value, any side effects, and when an error is raised.

The docstring of a function is stored in an attribute named `__doc__`, and can also be displayed by calling the built-in function `help()` on the function.  In Jupyter notebooks, the docstring is also displayed when one or two question marks are typed after the function name (two question marks additionally show the source code, when it is available), as in the following example:

In [None]:
print??

In [None]:
def letter_grade(percent, 
                 cutoffs=[0.,60,70,80,90], 
                 letters=['F','D','C','B','A']):
    """
    Compute the letter grade for a given percentage.

    Parameters
    ----------
    percent : int or float
        The grade as a percentage.
    cutoffs : tuple or list of ints or floats, optional
        The minimum percentage for each letter grade, in ascending order.
        The default is [0.,60,70,80,90].
    letters : tuple or list of str, optional
        The letter grade for each cutoff, in the same order as `cutoffs`.
        The default is ['F','D','C','B','A'].

    Raises
    ------
    ValueError
        If `percent` is not in the interval [0, 100].
        If the length of `cutoffs` is not equal to the length of `letters`.

    Returns
    -------
    str
        The letter grade for the given percentage.

    """
    
    if len(cutoffs) != len(letters):
        msg = "cutoffs and letters should have the same number of items."
        raise ValueError(msg)
        
    if percent < 0 or percent > 100:
        msg = "percent should be between 0 and 100, inclusive."
        raise ValueError(msg)
        
    index = max(t[0] if t[1] <= percent else 0 for t in enumerate(cutoffs))
    
    return letters[index]

In [None]:
help(letter_grade)
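
The `__doc__` attribute mentioned above can also be read directly. The following sketch uses a hypothetical one-line function, `fahrenheit_to_celsius`, only to keep the output short:

```python
# `fahrenheit_to_celsius` is a hypothetical function used only for illustration
def fahrenheit_to_celsius(degrees_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (degrees_f - 32) * 5 / 9

# the docstring is stored as a plain string in the `__doc__` attribute
summary = fahrenheit_to_celsius.__doc__

print(summary)
```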

## Part 5: I/O <a id="io"></a>

This part explains some of the concepts and Python syntax related to input and output (I/O). 

For input, we will mostly discuss how to open and read files, including changing directories and listing the contents of the current working directory. We will discuss several types of files: plain text files, files containing arbitrary binary data, compressed files, and specially formatted files such as json, jsonl, csv, or MS Excel files.

For output, we will discuss how to write to files or the console, including saving different types of objects for data interchange.

### 5.1 The os module <a id="os"></a>

The [***os***](https://docs.python.org/3/library/os.html) module contains functions that interface with your operating system, such as checking what your current working directory is, listing its contents, and changing it.

It contains many other functions, some of which are listed [***here***](https://docs.python.org/3/library/pathlib.html#correspondence-to-tools-in-the-os-module) in a table that shows corresponding functions in the [***pathlib***](https://docs.python.org/3/library/pathlib.html#) module.

#### 5.1.1 Check your current working directory <a id="getcwd"></a>

You can check which directory is your current working directory using the function [***getcwd***](https://docs.python.org/3/library/os.html#os.getcwd), which returns a string representing the full path to the current working directory. 

If you save a file using only a filename (instead of a full path), the file will be saved in your current working directory.

In [None]:
import os

os.getcwd()

#### 5.1.2 CHANGE your current working directory <a id="chdir"></a>

You can ***change*** which directory is your current working directory using the function [***chdir***](https://docs.python.org/3/library/os.html#os.chdir) with the path of the directory that you want to be your new working directory passed in as the only argument.
```python
os.chdir(path)
```
Remember that [raw strings](#raw_string_path) are good for file paths.

##### Exercise 5.1.2.1: change your working directory

Choose a directory on your computer to change your working directory to.  If you're not sure how to, review [this](#raw_string_path).

In [None]:
# change this to some directory on your computer
os.chdir(r'C:\Users\joseph.pedersen\ipynb\tutorial')

#### 5.1.3 List the contents of a directory <a id="listdir"></a>

You can list the contents of a directory (the files and/or sub-directories in it) using the function [***listdir***](https://docs.python.org/3/library/os.html#os.listdir), which has one parameter - the path to the directory whose contents you want to list, which defaults to your current working directory.
```python
os.listdir(path='.')
```

In [None]:
# list the contents of the current working directory
os.listdir()
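
For comparison, here is a minimal sketch of how the same two tasks might look with the `pathlib` module mentioned in section 5.1. It only reads from your current working directory and does not change anything:

```python
import os
from pathlib import Path

# Path.cwd() corresponds to os.getcwd(), but returns a Path object
cwd = Path.cwd()
print(cwd)

# Path.iterdir() corresponds to os.listdir(), but yields Path objects,
#   so we take the `name` of each one to get plain strings
names = sorted(p.name for p in cwd.iterdir())
print(names)
```

Which module to use is largely a matter of preference: `os` works with strings, while `pathlib` works with `Path` objects that have their own methods.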

### 5.2 Opening a file <a id="open"></a>

The built-in function [***open***](https://docs.python.org/3/library/functions.html#open) can be used to open a file and return a [***file object***](https://docs.python.org/3/library/io.html#overview) (also called a stream):
```python
f = open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
```
After you are finished with the file, it is recommended that you [***close***](https://docs.python.org/3/library/io.html#io.IOBase.close) the file (e.g. `f.close()`).  However, there is a convenient syntax using the [***with***](https://docs.python.org/3/reference/compound_stmts.html#the-with-statement) keyword that will close the file for you:
```python
with open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None) as f:
    indented_block_of_code
```
After the code in the `indented_block_of_code` has executed, Python will automatically close the file.
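
You can check this yourself with the `closed` attribute of the file object. The sketch below opens a file for writing and assumes a hypothetical filename, `demo.txt`, inside a temporary directory created with the standard `tempfile` module, so that none of your existing files are touched:

```python
import os
import tempfile

# work in a temporary directory so that no existing file is touched
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical filename

    with open(path, 'w', encoding='utf8') as f:
        f.write("hello")
        closed_inside = f.closed   # the file is still open inside the block

    closed_after = f.closed        # the file was closed when the block ended

print(closed_inside, closed_after)
```

The file is closed even if the indented block raises an exception, which is the main advantage over calling `f.close()` yourself.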

#### 5.2.1 file mode: r/w/a/x <a id="file_mode"></a>

The most important parameter to understand is the `mode`. If you open an existing file with `mode='w'`, its contents <span style="color:red">***WILL BE ERASED***</span> (also called *truncated*).

The first character in the mode is either `r` (read), `w` (truncate, then write), `a` (append), or `x` (create, then write). As can be seen in the signature above, if an argument is not passed to `mode`, it defaults to `r`.

In [None]:
# open a file named `blah.txt` for Writing and name the file object `f`
with open('blah.txt', 'w') as f:
    
    # write to the file object
    f.write("hello, world!")

In [None]:
# open a file named `blah.txt` for Reading and name the file object `f`
with open('blah.txt', 'r') as f:
    
    # read the whole file as a single string, and name that string `txt`
    txt = f.read()
    
print(txt)

##### Exercise 5.2.1.1: appending to a file

What do you expect the output of the following cell will be?

Form a specific answer, then run the cell to see if you are correct

In [None]:
# open a file named `blah.txt` for Appending and name the file object `f`
with open('blah.txt', 'a') as f:
    
    # write to the file object
    f.write("hello, again!")
    
# open a file named `blah.txt` for Reading and name the file object `f`
with open('blah.txt', 'r') as f:
    
    # read the whole file as a single string, and name that string `txt`
    txt = f.read()
    
print(txt)

<details>
    <summary><b>Solution</b></summary>
    
Opening the file with `mode='a'` left `hello, world!` in the file, and when we wrote `hello, again!` to the file, it ***appended*** that to the end. That's why, when we opened the file to read it, we read `hello, world!hello, again!`.
</details>

##### Exercise 5.2.1.2: writing to a file

What do you expect the output of the following cell will be?

Form a specific answer, then run the cell to see if you are correct

In [None]:
# open a file named `blah.txt` for Writing and name the file object `f`
with open('blah.txt', 'w') as f:
    
    # write to the file object
    f.write("hello, one more time!")
    
# open a file named `blah.txt` for Reading and name the file object `f`
with open('blah.txt', 'r') as f:
    
    # read the whole file as a single string, and name that string `txt`
    txt = f.read()
    
print(txt)

<details>
    <summary><b>Solution</b></summary>
    
Opening the file with `mode='w'` <span style="color:red">***ERASED THE FILE!***</span> That's why, after we wrote `hello, one more time!` to the file, that was the only thing in the file.
</details>

##### Exercise 5.2.1.3: create, then write to a file

What do you expect the output of the following cell will be?

Form a specific answer, then run the cell to see if you are correct

In [None]:
# open a file named `blah.txt` for eXclusive creation then writing,
#   and name the file object `f`
with open('blah.txt', 'x') as f:
    
    # write to the file object
    f.write("hello, one LAST time!")
    
# open a file named `blah.txt` for Reading and name the file object `f`
with open('blah.txt', 'r') as f:
    
    # read the whole file as a single string, and name that string `txt`
    txt = f.read()
    
print(txt)

<details>
    <summary><b>Solution</b></summary>
    
Opening a file with `mode='x'` only succeeds if the file does not already exist. If the file exists, as `blah.txt` does here, `open` raises <span style="color:red">***FileExistsError***</span>.
</details>

***Update mode:*** You can open a file with mode `r+`, `w+`, `a+`, or `x+`, for reading and writing with the same stream (called updating).  However, the interaction between reading and writing is complicated, due to buffering, so I do not discuss it in this introduction.

#### 5.2.2 file mode: text vs binary <a id="text_vs_bytes"></a>

After the first character in the mode (`r`, `w`, `a`, or `x`), optionally followed by a `+`, you can then optionally include either the character `t` (for text mode) or `b` (for binary mode). If you do not specify, the default is text mode.

When you open a file in ***text*** mode and name the file object `f`, `f.read()` returns a [***string***](https://docs.python.org/3/library/stdtypes.html#textseq) (in read mode) and `f.write(something)` will raise an error if `something` is not a string (in write mode).

When you open a file in ***binary*** mode and name the file object `f`, `f.read()` returns a [***bytes***](https://docs.python.org/3/library/stdtypes.html#bytes-objects) object (in read mode) and `f.write(something)` will raise an error if `something` is not a bytes object (in write mode).

***Those are the only two possibilities: strings or bytes.***  You cannot use `open` to open a JPEG image, a PDF, a Microsoft Excel spreadsheet, or any other file type ***unless*** either (1) the data is stored as plain text, and you want to open the text as a string, or (2) you want to work directly with the bytes in which the data is encoded.

There are Python packages that can open other file types, such as pandas for spreadsheets.
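
The binary-mode behavior described above can be sketched as follows, assuming a hypothetical filename, `demo.bin`, inside a temporary directory (so that none of your existing files are touched):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.bin')  # hypothetical filename

    # binary mode: write() requires a bytes object
    with open(path, 'wb') as f:
        f.write(b'\x00\x01hello')

        # writing a string in binary mode raises a TypeError
        try:
            f.write("a string")
        except TypeError as err:
            binary_error = str(err)

    # binary mode: read() returns a bytes object
    with open(path, 'rb') as f:
        data = f.read()

print(type(data), data)
print(binary_error)
```

Note that no `encoding` argument is passed in binary mode, because no conversion between bytes and text takes place.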

The next important point to understand is that ***files only contain bytes***, i.e. `0`s and `1`s. That is the language of computers.  To open a file in text mode, Python needs an [***encoding***](https://docs.python.org/3/glossary.html#term-text-encoding). If you do not specify an encoding, Python will use the default encoding for your machine. If you read a file in text mode and see many strange characters, that ***may not*** be a problem with the file. It may indicate that you opened it with a different encoding than was used to write the data to the file.

To learn more about Unicode and encodings, you can read the [***Unicode HOWTO***](https://docs.python.org/3/howto/unicode.html).

In [None]:
# open a file named `blah.txt` for Writing and name the file object `f`
with open('blah.txt', 'w', encoding="hz") as f:
    
    # write to the file object
    f.write('祝你今天过得愉快')
    
# open a file named `blah.txt` for Reading Bytes and name the file object `f`
with open('blah.txt', 'rb') as f:
    
    # read the whole file as a bytes object, and name that bytes object `data`
    data = f.read()

# look at the bytes used to encode the unicode
#   note that bytes that are printable ASCII will be displayed as such
print(data)

In [None]:
# open the file and read with a different encoding than we used to write
with open('blah.txt', 'r', encoding="utf8") as f:
    
    # read the whole file as a string, and name that string `txt`
    txt = f.read()

# look at the text
print(txt)

In [None]:
# open the file and read with the encoding that we used to write
with open('blah.txt', 'r', encoding="hz") as f:
    
    # read the whole file as a string, and name that string `txt`
    txt = f.read()

# look at the text
print(txt)

***Important note:*** the safest option in terms of data preservation and interoperability is to explicitly choose an encoding rather than relying on the platform dependent default. That way you will be sure which encoding your data is stored in and can decode the data with that encoding.

According to [python.org](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files), "Because UTF-8 is the modern de-facto standard, `encoding="utf-8"` is recommended unless you know that you need to use a different encoding." The list of standard encodings is [here](https://docs.python.org/3/library/codecs.html#standard-encodings).
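
The same conversion between text and bytes can be explored without any files, using the string method `encode` and the bytes method `decode`. The sketch below uses the word "café" (written with an escape sequence for the accented letter) as an arbitrary example:

```python
text = 'caf\u00e9'

# the same four characters become different bytes under different encodings
utf8_bytes = text.encode('utf8')
latin1_bytes = text.encode('latin1')
print(utf8_bytes)
print(latin1_bytes)

# decoding with the encoding that was used to encode recovers the text
round_trip = utf8_bytes.decode('utf8')

# decoding with a different encoding gives the wrong characters
mismatched = utf8_bytes.decode('latin1')
print(round_trip, mismatched)
```

This is the same kind of mismatch that produces strange characters when a file is read with a different encoding than the one used to write it.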

#### 5.2.3 Encoding Errors <a id="encoding"></a>

As we said [earlier](#strings_intro), strings contain Unicode code points. However, not every Unicode code point can be encoded using every encoding.  If you try to encode a string that contains code points which the encoding does not support, you will get a <span style="color:red">***UnicodeEncodeError***</span>.

In [None]:
# open a file named `blah.txt` for Writing and name the file object `f`
with open('blah.txt', 'w', encoding="latin1") as f:
    
    # write to the file object
    f.write("Have a nice day! 😀")

The `errors` parameter allows you to specify how errors should be [***handled***](https://docs.python.org/3/library/codecs.html#error-handlers). 

In [None]:
# open a file named `blah.txt` for Writing and name the file object `f`
with open('blah.txt', 'w', encoding="latin1", errors='namereplace') as f:
    
    # write to the file object
    f.write("Have a nice day! 😀")
    
# open a file named `blah.txt` for Reading and name the file object `f`
#   (latin1 can decode any byte, so no error handler is needed for reading)
with open('blah.txt', 'r', encoding="latin1") as f:
    
    # read the file object
    txt = f.read()
    
print(txt)
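
To compare several of the standard error handlers side by side, the following sketch encodes the same string with each of them, using the string method `encode` (which accepts the same `errors` values) instead of a file. The example string is arbitrary:

```python
text = "nice \u263a"   # U+263A WHITE SMILING FACE cannot be encoded in latin1

results = {}
for handler in ('ignore', 'replace', 'backslashreplace',
                'xmlcharrefreplace', 'namereplace'):

    # encode the same text, handling the unsupported character differently
    results[handler] = text.encode('latin1', errors=handler)
    print(f"{handler:>17}: {results[handler]}")
```

Note that `'ignore'` and `'replace'` lose information, while the other three handlers write an escape sequence from which the original character could be recovered.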

#### 5.2.4 newline <a id="newline"></a>

The `newline` parameter indicates how to handle reading lines from a file and writing lines to a file.  How [***lines***](https://docs.python.org/3/library/os.html#os.linesep) are separated is platform dependent. On POSIX compliant operating systems (e.g. UNIX), lines are separated by the newline character `\n` (also called a line feed: LF). On Windows, lines are separated by the two character sequence `\r\n` (a carriage return, then a line feed: CR+LF).

In [None]:
multiline_text = """Line 1
Line 2
Line 3"""

# look at the characters that make up the string
# Python represents the line breaks by the newline character \n
print(repr(multiline_text))

In [None]:
# Write the string to file without an argument to `newline` (default is None)
# On my Windows machine, this translates the \n to \r\n
with open('blah.txt', 'w') as f:
    f.write(multiline_text)

# Read the file with newline='', which keeps the newline characters as they are in the file
with open('blah.txt', 'r', newline='') as f:
    txt = f.read()
    print(repr(txt))
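
The output of the cell above depends on your platform. As a sketch that should behave the same way on any platform, the following cell passes `newline='\r\n'` explicitly when writing, then looks at the raw bytes. It assumes a hypothetical filename, `demo.txt`, inside a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical filename

    # newline='\r\n' translates every \n to \r\n when writing
    with open(path, 'w', encoding='utf8', newline='\r\n') as f:
        f.write("Line 1\nLine 2")

    # binary mode shows the bytes exactly as they are stored in the file
    with open(path, 'rb') as f:
        raw = f.read()

    # newline=None (the default) translates \r\n back to \n when reading
    with open(path, 'r', encoding='utf8') as f:
        translated = f.read()

print(raw)
print(repr(translated))
```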

### 5.3 Reading a file <a id="read"></a>

There are several ways to read a file:
- `read()` to read the whole file at once, or a specific number of characters/bytes,
- `readline()` to read the file one line at a time,
- `readlines()` to read the entire file into a list of lines, or
- iterating over the file object, line by line.

#### 5.3.1 Reading a specific number of characters/bytes

We have used `f.read()` many times already. It reads the entire file. If we pass a positive integer in as the single argument, then only that many characters (unicode code points) are read (in text mode), or that many bytes are read (in binary mode)†.

†Technically, an internal buffer may be used for efficiency, so more than that many characters/bytes may be read at a time, with only that many returned, and the rest kept in the buffer.

In [None]:
multiline_text = """I am happy! o͜o
I am indifferent. o͟o
Help, I am upside down! o͝o

"""

# write the text to a file
with open('blah.txt', 'w', encoding='utf8') as f:
    f.write(multiline_text)

with open('blah.txt', 'r', encoding='utf8') as f:
    
    # only read 22 characters
    txt = f.read(22)
    print(repr(txt))

In [None]:
"Help, I am upside down! o\u0360o"

#### 5.3.2 Reading a file line by line

Often we work with files line by line. Each time the method `readline()` is called, it returns the next line of a file object as a string.  If a file is very large, this prevents needing to read it all into memory at the same time.

In [None]:
multiline_text = """I am happy! o͜o
I am indifferent. o͟o
Help, I am upside down! o͝o
"""

# write the text to a file
with open('blah.txt', 'w', encoding='utf8') as f:
    f.write(multiline_text)

with open('blah.txt', 'r', encoding='utf8') as f:
    
    # read one line at a time
    for i in range(3):
        print(repr(f.readline()))
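
The cell above works because we know that the file has three lines. When the number of lines is not known in advance, one possible approach is to keep calling `readline()` until it returns an empty string, which only happens at the end of the file. The sketch below assumes a hypothetical filename, `demo.txt`, inside a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical filename

    with open(path, 'w', encoding='utf8') as f:
        f.write("first\nsecond\nthird\n")

    lines_read = []
    with open(path, 'r', encoding='utf8') as f:

        # readline() returns '' only at the end of the file;
        #   a blank line inside the file is returned as '\n'
        while (line := f.readline()) != '':
            lines_read.append(line)

print(lines_read)
```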

##### Exercise 5.3.2.1: working on file lines

Add conditional logic to the cell above so that it only prints lines that begin with a capital H.

<details>
    <summary><b>Solution</b></summary>
    
```python
with open('blah.txt', 'r', encoding='utf8') as f:
    
    # read one line at a time
    for i in range(3):
        this = f.readline()
        if this.startswith("H"):
            print(repr(this))
```
<br>
    
Make sure not to try the following:

```python
with open('blah.txt', 'r', encoding='utf8') as f:
    
    # read one line at a time
    for i in range(3):
        if f.readline()[0] == "H":
            print(repr(f.readline()))
```
<br>
    
If you did that, the `f.readline()` inside the `print()` would return the line ***after*** the one that was checked for a leading `"H"`.
</details>

#### 5.3.3 Iterating over a file line by line

You can also iterate directly over the file object:

In [None]:
multiline_text = """I am happy! o͜o
I am indifferent. o͟o
Help, I am upside down! o͝o
"""

# write the text to a file
with open('blah.txt', 'w', encoding='utf8') as f:
    f.write(multiline_text)

with open('blah.txt', 'r', encoding='utf8') as f:
    
    # read one line at a time
    for line in f:
        print(repr(line))

#### 5.3.4 Creating a list of lines

You can also make a list of the lines using `readlines()`.  This reads the entire file into memory, so take care not to do this if the file is too large.

In [None]:
multiline_text = """I am happy! o͜o
I am indifferent. o͟o
Help, I am upside down! o͝o
"""

# write the text to a file
with open('blah.txt', 'w', encoding='utf8') as f:
    f.write(multiline_text)

with open('blah.txt', 'r', encoding='utf8') as f:
    
    # read all of the lines into a list
    lines = f.readlines()
    
print(lines)

### 5.4 Writing and Printing to a file <a id="write"></a>

There are several ways to write to a file:
- `write()` to write a string (in text mode) or a bytes object (in binary mode) to a file,
- `writelines()` to write an iterable of strings (in text mode) or bytes objects (in binary mode) to a file, or
- `print()` to write one or more objects (not necessarily strings) to a file, with formatting.

#### 5.4.1 Writing a string or bytes object to file

We have used `f.write()` many times already. It takes a single argument, a string (in text mode) or a bytes object (in binary mode), and writes that to the file†.

†Technically, an internal buffer may be used for efficiency, in which case the string/bytes will be written to the file when the buffer [***flushes***](https://docs.python.org/3/library/io.html#io.BufferedWriter.flush). In fact, [***python.org***](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files) warns that if you do not use [***with***](https://docs.python.org/3/reference/compound_stmts.html#with) to open the file, or call [***close***](https://docs.python.org/3/library/io.html#io.IOBase.close) on the file when you are finished with it, then the string/bytes might not be completely written to the file.

#### 5.4.2 Writing a list of strings or bytes objects to file

Calling `f.writelines(some_iterable)` writes each item of the iterable (e.g. a list) to the file. It takes a single argument, an iterable of strings (in text mode) or an iterable of bytes objects (in binary mode).

A [***line separator***](https://docs.python.org/3/library/os.html#os.linesep) is not automatically added after each string. If the items should be separated by newlines or spaces, you need to add those yourself.

Do not call `f.writelines()` on a single string (or bytes object). It will work, since strings and bytes objects are iterables, but the write will take much longer than calling `f.write()`.
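
When the items do not already end in a line separator, one way to add it yourself is sketched below. The list `items` and the filename `demo.txt` (inside a temporary directory) are hypothetical, chosen only for this illustration:

```python
import os
import tempfile

items = ['apple', 'banana', 'cherry']

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical filename

    with open(path, 'w', encoding='utf8') as f:

        # writelines() adds nothing between items, so append \n to each one
        f.writelines(item + '\n' for item in items)

    with open(path, 'r', encoding='utf8') as f:
        written = f.read()

print(repr(written))
```

The argument to `writelines` here is a generator expression, which works because `writelines` accepts any iterable of strings, not only lists.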

In [None]:
multiline_text = """I am happy! o͜o
I am indifferent. o͟o
Help, I am upside down! o͝o
"""

# write the text to a file
with open('blah.txt', 'w', encoding='utf8') as f:
    f.write(multiline_text)

with open('blah.txt', 'r', encoding='utf8') as f:
    
    # read all of the lines into a list
    lines = f.readlines()
    
# this is a list of lines, each ending in a line separator
print(lines)

print() # blank line

with open('blah2.txt', 'w', encoding='utf8') as f:
    
    # write each item in `lines` to blah2.txt
    f.writelines(lines)
    
with open('blah2.txt', 'r', encoding='utf8') as f:
    
    # read the file 
    txt = f.read()

# the contents of blah2.txt
#   note that \n characters CAUSE a newline, rather than getting printed as \n
print(txt)

#### 5.4.3 Printing to file <a id="print"></a>

We have called [***print***](https://docs.python.org/3/library/functions.html#print) many times in this tutorial so far, but only for printing inside the notebook. We can also print to file. The signature for print is:
```python
print(*objects, sep=' ', end='\n', file=None, flush=False)
```
If you remember [***arbitrary argument lists***](#arbitrary_arg_lists), you will recognize that print can accept any number of positional arguments, which all get saved to the parameter `objects` (a tuple). These are what get printed. One convenience print provides that write does not is that the arguments do not need to be strings; they are automatically converted to strings using [***str( )***](https://docs.python.org/3/library/stdtypes.html#str).

If there are multiple arguments passed to `objects`, the argument passed to `sep` (which should be a string or `None`) is printed in between each. Then, the argument to `end` (another string or `None`) is printed *at the end* (even if `objects` was empty).

The argument passed to `file` should either be a file object or `None`. It cannot be a string (even if that string is a filename or path). If `file=None`, then the writing is done to [***sys.stdout***](https://docs.python.org/3/library/sys.html#sys.stdout) (e.g. in this notebook). If a file object is passed to `file`, it should be in text mode, since `print` converts everything to strings before writing.

In [None]:
print(1, 2, 3)

##### Exercise 5.4.3.1: `sep`

Pass an argument to `sep` in the call to print in the cell below so that each number is printed on its own line.

In [None]:
# only change the line below by adding a `sep` argument
print(1, 2, 3)

<details>
    <summary><b>Solution</b></summary>
    
```python
print(1, 2, 3, sep='\n')
```
</details>

##### Exercise 5.4.3.2: `sep` and `end`

Pass arguments to `sep` and `end` so that there are dashes between the numbers, instead of spaces, and there is an exclamation point after the 3.

In [None]:
# only change the line below by adding arguments for `sep` and `end`
print(1, 2, 3)

<details>
    <summary><b>Solution</b></summary>
    
```python
print(1, 2, 3, sep='-', end='!')
```
</details>
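
The pieces described in this section (conversion with `str()`, `sep`, `end`, and `file`) can be combined in one small sketch. Here `io.StringIO` stands in for a file object opened in text mode, and the name `fake_file` is just an illustrative choice.

```python
import io

# io.StringIO behaves like a file object opened in text mode
fake_file = io.StringIO()

# each argument is converted with str(), joined by `sep`, then followed by `end`
print(1, 2.5, None, sep=', ', end='.\n', file=fake_file)

# with no positional arguments, only `end` (a newline by default) is written
print(file=fake_file)

# look at exactly what was "written to file"
print(repr(fake_file.getvalue()))
```

Nothing appears in the notebook from the first two calls, because their output went to `fake_file` rather than to `sys.stdout`.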

##### Exercise 5.4.3.3: Multiple appends to the same file

Think about what the output of the following cell will be and why.  Then run it to see if you are correct.

In [None]:
f = open('blah.txt', 'w') # truncate the file
f.close() # then close it

# a function for logging messages to file
def logger(msg, logfile):
    
    with open(logfile, 'a') as f:
        print(msg, file=f)

        
# open blah.txt in Append mode, for logging messages
with open('blah.txt', 'a', encoding='utf8') as f:
    
    # print a message to the file
    print("This is the FIRST message.", file=f)
    
    msg2 = "This is the second message."
    
    # use logger to print a second message
    logger(msg2, 'blah.txt')
    
# after logging all of the messages, open the file
with open('blah.txt', 'r', encoding='utf8') as f:
    
    # read the file 
    txt = f.read()

# print the messages
print(txt)

<details>
    <summary><b>Solution</b></summary>
    
Here is the order of the code executed:
- The file `"blah.txt"` is opened in append mode, and the message `"This is the FIRST message."` is written to the ***buffer*** for that file object. 
- When logger is called, it also opens `"blah.txt"` in append mode, with its own file object, and writes `"This is the second message."` to the ***buffer*** for THAT file object. 
- When the block of code inside the with statement, inside logger, is finished executing, the buffer for that file object is flushed (printing `"This is the second message."` to the file) and the file is closed. 
- Then, after logger returns, the block of code inside the with statement (the with statement not inside logger) has finished, so the buffer for that file object is flushed (printing `"This is the FIRST message."` to the file) and that file is closed.
</details>

##### Exercise 5.4.3.4: `flush`

Add `flush=True` to the calls to `print()`.  Then run it to see if anything changes.

In [None]:
f = open('blah.txt', 'w') # truncate the file
f.close() # then close it

# a function for logging messages to file
def logger(msg, logfile):
    
    with open(logfile, 'a') as f:
        print(msg, file=f)

        
# open blah.txt in Append mode, for logging messages
with open('blah.txt', 'a', encoding='utf8') as f:
    
    # print a message to the file
    print("This is the FIRST message.", file=f)
    
    msg2 = "This is the second message."
    
    # use logger to print a second message
    logger(msg2, 'blah.txt')
    
# after logging all of the messages, open the file
with open('blah.txt', 'r', encoding='utf8') as f:
    
    # read the file 
    txt = f.read()

# print the messages
print(txt)
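
The effect of the buffer can also be observed directly, by checking the size of the file on disk before and after flushing. The following is only a sketch: it writes to a temporary directory (using the standard-library `tempfile` module) so that it does not touch `blah.txt`, and the file name `buffer_demo.txt` is made up for the example. With default buffering, the first size is typically `0`, but that depends on the buffer size and platform.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'buffer_demo.txt')

    with open(path, 'w', encoding='utf8') as f:
        print("hello", file=f)
        # the text is usually still in the buffer, not yet in the file
        size_before_flush = os.path.getsize(path)

        f.flush()
        # after flushing, the text has been handed to the operating system
        size_after_flush = os.path.getsize(path)

print(size_before_flush, size_after_flush)
```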

### 5.5 gzip files<a id="gzip"></a>

The [***gzip***](https://docs.python.org/3/library/gzip.html) module in the Python Standard Library provides an interface to compress/decompress files similar to the GNU programs gzip and gunzip.  The signature of the open function in gzip is:
```python
gzip.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)
```
Note that the default mode is ***binary***, not text like it is for the built-in function open. To select text mode, you need to explicitly include the `t` in the argument to `mode`.

The `compresslevel` parameter controls the level of compression: an integer from 0 (no compression) to 9 (most compression, and slowest).

Files compressed with gzip are usually saved with the extension `.gz` to indicate that, although that's just a convention.

Also note that there are compression methods that GNU gzip/gunzip support that are not supported by the Python module gzip.

##### Exercise 5.5.1: `gzip` `open`

Each character in the string below takes one byte to store in the UTF-8 encoding.

Predict the output of the following cell, then run it to see if you are correct.

In [None]:
import gzip
import os

long_string = "This is a short string." * 1000

with open('blah.txt', 'w', encoding='utf8') as f:
    f.write(long_string)

with gzip.open('bla.txt.gz', 'w', encoding='utf8') as f:
    f.write(long_string)
    
# look at the sizes of the files
print(f"blah.txt is {os.stat('blah.txt').st_size} bytes")
print(f"bla.txt.gz is {os.stat('bla.txt.gz').st_size} bytes")

##### Exercise 5.5.2: `gzip open mode`

Change the `mode` in the cell above from `w` to `wt`, then rerun the cell.
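
Once a file has been written in text mode, it can be read back either as text or as bytes. The sketch below assumes a temporary directory and a made-up file name (`demo.txt.gz`) so that it does not interfere with the files used in the exercises.

```python
import gzip
import os
import tempfile

text = "This is a short string.\n" * 3

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt.gz')

    # 'wt' = write in text mode, so strings are accepted and encoded
    with gzip.open(path, 'wt', encoding='utf8') as f:
        f.write(text)

    # 'rt' = read in text mode, so a string is returned
    with gzip.open(path, 'rt', encoding='utf8') as f:
        text_back = f.read()

    # the default mode, 'rb', returns the decompressed bytes instead
    with gzip.open(path) as f:
        raw_bytes = f.read()

print(text_back == text)
print(type(raw_bytes))
```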

### 5.6 json <a id="json"></a>

The [***json***](https://docs.python.org/3/library/json.html) module in the Python Standard Library provides an interface to serialize ***some*** objects to [***JSON***](https://www.json.org/) (JavaScript Object Notation), which is a very popular format for data interchange that stores data as plain text. 

The json module [can serialize](https://docs.python.org/3/library/json.html#py-to-json-table) the following Python data structures: strings, numbers, True, False, None, as well as dictionaries (called objects in JSON) and lists (called arrays in JSON) containing serializable items. If you serialize tuples, they get converted to lists. If you try to serialize an object that the json module does not know how to serialize, such as a set, it raises <span style="color:red">***TypeError***</span>: Object of type set is not JSON serializable.

[Python.org](https://docs.python.org/3/library/json.html#) warns, <span style="color:red">"Be cautious when parsing JSON data from untrusted sources. A malicious JSON string may cause the decoder to consume considerable CPU and memory resources. Limiting the size of data to be parsed is recommended."</span>

#### 5.6.1 `dump` and `load`

The json function [***dump***](https://docs.python.org/3/library/json.html#json.dump) takes several keyword arguments, but its two positional arguments are: `obj` (the object to be serialized), and `fp` (the file object to write to, which must be in text mode).

The json function [***load***](https://docs.python.org/3/library/json.html#json.load) takes a file object of a file containing JSON, and returns the corresponding Python data structure.

##### Exercise 5.6.1.1: `dump` and `load`

Predict the output of the following cell, then run it to see if you are correct.

In [None]:
import json

data = [True, None, 3.14, ('apple', 'banana', 'cherry')]

with open('blah.txt', 'w', encoding='utf8') as f:
    json.dump(data, f)
    
with open('blah.txt', 'r', encoding='utf8') as f:
    data2 = json.load(f)
    
print(data2)

<details>
    <summary><b>Solution</b></summary>
    
The tuple `('apple', 'banana', 'cherry')` is converted to a list, `['apple', 'banana', 'cherry']`, because JSON does not have two separate sequence types like Python does.
</details>

##### Exercise 5.6.1.2: more `json`

Predict the output of the following cell, then run it to see if you are correct.

In [None]:
data = [True, None, 3.14, {'apple', 'banana', 'cherry'}]

with open('blah.txt', 'w', encoding='utf8') as f:
    json.dump(data, f)
    
with open('blah.txt', 'r', encoding='utf8') as f:
    data2 = json.load(f)
    
print(data2)

<details>
    <summary><b>Solution</b></summary>
    
Sets are not a type that the json module can convert to JSON, since the JSON format does not support sets.  Therefore, this raises <span style="color:red">***TypeError***</span>: Object of type set is not JSON serializable.  Depending on what you are trying to achieve, you could convert the set to a list to convert it to JSON, or you could [pickle](#pickle) the set.
</details>
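
The suggestion to convert the set to a list could look like the following sketch. It uses `dumps` and `loads` (covered in the next section) so that no file is needed. Calling `sorted()` is an assumption made here only to get a predictable order, since sets are unordered; `list(fruit)` would also work.

```python
import json

fruit = {'apple', 'banana', 'cherry'}

# a set is not JSON serializable, but a list of its items is
# (sorted() returns a list, and gives a predictable order)
s = json.dumps(sorted(fruit))
print(s)

# JSON has no set type, so convert back explicitly after loading
fruit2 = set(json.loads(s))
print(fruit2 == fruit)
```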

#### 5.6.2 `dumps` and `loads`

The json function [***dumps***](https://docs.python.org/3/library/json.html#json.dumps) works similarly to dump, except that it returns a string instead of writing to a file.

Similarly, the json function [***loads***](https://docs.python.org/3/library/json.html#json.loads) takes a string of JSON as its only positional argument, instead of a file object, and returns the corresponding Python data structure.

In [None]:
data = {
    'answer': True, 
    'grade': None, 
    'constant': 3.14, 
    'fruit': ('apple', 'banana', 'cherry')
}

# create a string encoding the data structure as JSON
s = json.dumps(data)

# decode the JSON and bind the data structure to the name `data2`
data2 = json.loads(s)

print(data)
print(s)
print(data2)

##### Exercise 5.6.2.1: compact `dumps`

Add `separators=(',', ':')` to the call to `dumps()` in the cell below, then see if you can tell the difference in the string that it creates (compared to the one above). If you are storing large amounts of JSON, this could save memory.

In [None]:
# create a string encoding the data structure as JSON
s = json.dumps(data)

print(s)

<details>
    <summary><b>Solution</b></summary>
    
```python
s = json.dumps(data, separators=(',', ':'))
```
</details>

##### Exercise 5.6.2.2: pretty `dumps`

Add `indent=4` to the call to `dumps()` in the cell below, then see if you can tell the difference in the string that it creates (compared to the one above).

In [None]:
# create a string encoding the data structure as JSON
s = json.dumps(data)

print(s)

<details>
    <summary><b>Solution</b></summary>
    
```python
s = json.dumps(data, indent=4)
```
</details>

#### 5.6.3 JSON Lines (JSONL) <a id="jsonl"></a>

Another popular data interchange format is [***JSON Lines***](https://jsonlines.org/), which is just the same as storing multiple JSON strings in the same file, separated by `\n`.

It's ok if your platform separates lines with `\r\n`, because the `\n` is considered the separator between the JSON values, and the `\r` is treated as extra whitespace, which is ignored.

Files containing data in the JSON Lines format are usually named with the `.jsonl` extension to indicate that.

In [None]:
# practice data
a = [1,2,3]
b = {'L': 4, 'W': 5, 'H': 6}
c = "bacon cheeseburger"

with open('blah.jsonl', 'w') as f:
    json.dump(a, f)
    json.dump(b, f)
    json.dump(c, f)

##### Exercise 5.6.3.1: JSON lines

Predict the output of the following cell, then run it to see if you are correct.

In [None]:
with open('blah.jsonl', 'r') as f:
    data = json.load(f)
    
print(data)

<details>
    <summary><b>Solution</b></summary>
    
You received the error "<span style="color:red">***JSONDecodeError***</span>: Extra data" indicating that there is more data in the file beyond the JSON.  That is because this file does not contain a single JSON value. We are trying to read JSON Lines.  Try the next exercise.
</details>

##### Exercise 5.6.3.2: more JSON lines

Predict the output of the following cell, then run it to see if you are correct.

In [None]:
with open('blah.jsonl', 'r') as f:
    a = json.loads(f.readline())
    b = json.loads(f.readline())
    c = json.loads(f.readline())
    
print(a, b, c, sep='\n')

<details>
    <summary><b>Solution</b></summary>
    
Same error? "<span style="color:red">***JSONDecodeError***</span>: Extra data" indicating that there is more data in the file beyond the JSON.
    
Let's look at the data to see what is going on:
</details>

In [None]:
with open('blah.jsonl', 'r') as f:
    txt = f.read()
    
print(txt)

##### Exercise 5.6.3.3: actual JSON lines

There were NOT any newlines between the JSON values, so it was not JSON Lines.

You could write a newline to the file after each call to `dump`, or you can call `dumps` to create a string then `print` that to the file (which automatically adds the newline).

Your call.  Fix the next cell, then run it, and the one after it.

In [None]:
# practice data
a = [1,2,3]
b = {'L': 4, 'W': 5, 'H': 6}
c = "bacon cheeseburger"

with open('blah.jsonl', 'w') as f:
    json.dump(a, f)
    json.dump(b, f)
    json.dump(c, f)

In [None]:
with open('blah.jsonl', 'r') as f:
    a = json.loads(f.readline())
    b = json.loads(f.readline())
    c = json.loads(f.readline())
    
print(a, b, c, sep='\n')
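
When the number of values in a JSON Lines file is not known in advance, a loop is more practical than one `readline()` call per value. The following is one possible approach, not the only one. It uses `io.StringIO` in place of a real file, and the names `records` and `records2` are chosen just for this sketch.

```python
import io
import json

records = [[1, 2, 3], {'L': 4, 'W': 5, 'H': 6}, "bacon cheeseburger"]

# io.StringIO stands in for a file opened in text mode
f = io.StringIO()

# write one JSON value per line
for record in records:
    f.write(json.dumps(record) + '\n')

# go back to the start, as if the file had been reopened for reading
f.seek(0)

# iterating over a file object yields one line at a time
records2 = [json.loads(line) for line in f]

print(records2 == records)
```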

### 5.7 pickle <a id="pickle"></a>

The [***pickle***](https://docs.python.org/3/library/pickle.html) module in the Python Standard Library provides an interface to serialize most Python objects to bytes, called *pickling*, and deserialize back to a Python object, called *unpickling*. This is a Python-specific protocol, so it is not used for data interchange as much as JSON, but it can serialize many more types of objects, and is often much faster for large data.

The interface is very similar to that of the `json` module, except that `dump` and `load` require file objects opened in binary mode, `dumps` returns bytes and `loads` expects bytes (instead of strings), and you can call `load` multiple times on the same file if `dump` was used to pickle multiple objects to that file.

[Python.org](https://docs.python.org/3/library/pickle.html) warns, <span style="color:red">"Warning: The pickle module is not secure. Only unpickle data you trust."</span>
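
As a small sketch of the in-memory half of the interface, the snippet below pickles a set (a type that the `json` module cannot serialize) to bytes and back. Only data created in the same cell is unpickled, in keeping with the warning above.

```python
import pickle

fruit = {'apple', 'banana', 'cherry'}

# dumps() returns a bytes object (not a string)
pickled = pickle.dumps(fruit)
print(type(pickled))

# loads() takes bytes and rebuilds an equal object, which is still a set
fruit2 = pickle.loads(pickled)
print(fruit2 == fruit, type(fruit2))
```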

In [None]:
import pickle as pkl

# practice data
a = [1,2,3]
b = {'L': 4, 'W': 5, 'H': 6}
c = "bacon cheeseburger"

with open('blah.pkl', 'w') as f:
    pkl.dump(a, f)
    pkl.dump(b, f)
    pkl.dump(c, f)

##### Exercise 5.7.1: pickling object into bytes

What went wrong? Read the intro again if you need to, then fix the cell above.

In [None]:
with open('blah.pkl', 'rb') as f:
    a = pkl.load(f)
    b = pkl.load(f)
    c = pkl.load(f)
    
print(a, b, c, sep='\n')

### 5.8 Reading spreadsheets and CSVs with Pandas <a id="pandas"></a>

There is a [***csv***](https://docs.python.org/3/library/csv.html) module in the Python Standard Library that can be used to read CSV files, but I prefer [***pandas***](https://pandas.pydata.org/).  It can read many formats of tabular data: CSV, MS Excel spreadsheets, HTML, JSON Lines, and more.

Since pandas is not part of the Python Standard Library, you need to install it before you can import it. If you are using a virtual environment (such as a conda environment), then you need to have pandas installed in your current environment in order to import it.

An introduction to pandas would require an entire tutorial itself.  There is a lot of great information on the [pandas website](https://pandas.pydata.org/docs/getting_started/index.html#intro-to-pandas), including the [10 minutes to pandas](https://pandas.pydata.org/docs/user_guide/10min.html).
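
For comparison, here is a minimal sketch of reading similar data with the standard-library `csv` module mentioned above. It uses `io.StringIO` in place of a real file; with a real file, the `csv` documentation recommends opening it with `newline=''`. The sample data is made up for the example.

```python
import csv
import io

data = """Name,Height,Weight
Sam,70,185
Bill,68,179
"""

# io.StringIO stands in for a file object opened in text mode
with io.StringIO(data) as f:
    # DictReader uses the first row as the keys for every later row
    rows = list(csv.DictReader(f))

for row in rows:
    # every value is read as a string; convert numbers yourself
    print(row['Name'], int(row['Height']))
```

Unlike pandas, the `csv` module does not infer column types, which is one reason a dataframe library can be more convenient for tabular data.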

In [None]:
import pandas as pd

In [None]:
# create a practice CSV file
data = """Name,Height,Weight
Sam,70,185
Bill,68,179
Henry,73,234"""

with open('blah.csv', 'w') as f:
    f.write(data)

In [None]:
# create a dataframe
df = pd.read_csv('blah.csv')

In [None]:
# look at the first 5 rows (but there are only 3 rows)
df.head()

In [None]:
# create a practice JSONL file
data = """{"Name": "Sam", "Height": 70, "Weight": 185}
{"Name": "Bill", "Height": 68, "Weight": 179}
{"Name": "Henry", "Height": 73, "Weight": 234}
"""

with open('blah.jsonl', 'w') as f:
    f.write(data)

In [None]:
# create a dataframe
df = pd.read_json('blah.jsonl', lines=True)

# look at the first 5 rows (but there are only 3 rows)
df.head()

## Part 6: Containers (in more detail) <a id="containers_more"></a>

In this part, we will go over many of the different built-in [***container***](https://docs.python.org/3/library/collections.html) types in Python.  The three main categories of containers are [***sequences***](https://docs.python.org/3/reference/datamodel.html#sequences) (such as lists and tuples), [***mappings***](https://docs.python.org/3/reference/datamodel.html#mappings) (different flavors of dictionaries), and [***sets***](https://docs.python.org/3/reference/datamodel.html#set-types) (sets and frozensets). We will go over the methods typically associated with types in these categories, as well as the differences between the specific types in these categories.

### 6.1 Membership testing with `in` <a id="in_keyword"></a>

Abstractly speaking, a ***container*** is any object which implements the [`__contains__()`](https://docs.python.org/3/reference/datamodel.html#object.__contains__) method, which enables testing whether or not another object is *contained* inside of it. This is referred to as [***membership testing***](https://docs.python.org/3/reference/expressions.html#membership-test-details). To test if `item` is contained in `obj`, we use the keyword [***in***](https://docs.python.org/3/reference/expressions.html#in), with the following syntax:
```python
item in obj
```
If the above syntax does ***not*** follow the [***for***](#for_keyword) keyword, then the syntax is an expression that evaluates to either `True` or `False` (or raises an error).

If `obj` is not a container, a <span style="color:red">***TypeError***</span> will be raised, but the error message may seem confusing, since instead of saying that the type is not a container, it will say that the type is not iterable. This is because, if an object does ***not*** have a `__contains__()` method, membership testing will try to iterate over the object (iterables are covered later), to check if any of the values returned are (or are equal to) the `item`.

***Note*** that `for item in obj` is ***not*** membership testing; it is the [***for statement***](#for_keyword) which begins a for loop. 
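
The fallback to iteration described above can be sketched with a small class. `Countdown` is a hypothetical class written only for this illustration; it defines `__iter__()` but not `__contains__()`. Classes and iterators are covered elsewhere in this tutorial, so treat this as a preview.

```python
class Countdown:
    """A hypothetical class with __iter__() but no __contains__()."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # yields start, start-1, ..., 1
        return iter(range(self.start, 0, -1))


# `in` falls back to iterating and comparing each value to the item
found_3 = 3 in Countdown(5)
found_9 = 9 in Countdown(5)
print(found_3, found_9)

# an object with neither method raises TypeError
try:
    3 in 5.0
except TypeError as err:
    print("TypeError:", err)
```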

Let's look at some examples:

In [None]:
names = ('Randy', 'Seth', 'Tim')
ages = [31, 42, None]
group_size = 3
group_name = 'Ransim7'
hometowns = {'Randy': 'Middletown', 'Seth': 'Newburgh', 'Tim': 'Oswego'}

##### Exercise 6.1.1: Is it in?

Predict the output of the following cells, then run each to see if you are correct.

In [None]:
'Randy' in names

In [None]:
None in ages

In [None]:
31.0 in ages

In [None]:
[31, 42] in ages

In [None]:
3 in group_size

In [None]:
'r' in group_name

In [None]:
'a' in group_name

In [None]:
'sim' in group_name

In [None]:
7 in group_name

In [None]:
'Middletown' in hometowns

In [None]:
('Randy', 'Middletown') in hometowns

In [None]:
['Randy', 'Middletown'] in hometowns

In [None]:
'Randy' in hometowns

<details>
    <summary><b>Solutions</b></summary>
    
The solutions are explained in the next few subsections
    
</details>

#### 6.1.1 Membership testing for sequences

When we test membership for most sequences, such as lists and tuples, using the syntax:
```python
some_object in some_sequence
```
...the value will be `True` if there is an item in `some_sequence` that equals `some_object` (has the same [***identity***](#identity) ***or*** the same [***value***](#value)), and `False` otherwise. That's why `31.0 in ages` evaluated to `True` even though the list `ages` does not have any items that are floats.

***Note*** that we cannot use `in` to test if a list is a *sub-list*, such as with `[31, 42] in ages`. That membership test evaluated to `False` because none of the items in `ages` are equal to the list `[31, 42]`.

#### 6.1.2 Membership testing for strings

A string is an immutable sequence of Unicode code points. However, membership testing works differently for strings. The test...
```python
some_object in some_string
```
***...will*** evaluate to `True` if `some_object` is a *sub-string* of `some_string`. That's why `'sim' in group_name` evaluated to `True`. Notice that the membership test is ***case-sensitive***, which is why `'r' in group_name` evaluated to `False`.

***Note*** also that `some_object` must be a string or a <span style="color:red">***TypeError***</span> will be raised rather than simply returning `False`, such as when we tried `7 in group_name`.

#### 6.1.3 Membership testing for dictionaries

For dictionaries, the membership test...
```python
some_object in some_dictionary
```
evaluates to `True` if `some_object` is a ***key*** of `some_dictionary`. That's why both `'Middletown' in hometowns` and `('Randy', 'Middletown') in hometowns` evaluated to `False`, but `'Randy' in hometowns` evaluated to `True`.

***Note*** also that `some_object` must be hashable or a <span style="color:red">***TypeError***</span> will be raised rather than simply returning `False`, such as when we tried `['Randy', 'Middletown'] in hometowns`.  The reason why will be explained in the section on *Membership testing for sets*.

#### 6.1.4 Membership testing for sets

For sets, the membership test...
```python
some_object in some_set
```
evaluates to `True` if `some_object` is an item in `some_set`.

The ***very important*** thing to note about membership tests for sets, compared to those for lists or tuples, is that the time it takes to perform a membership test for a set does not depend on the size of the set, whereas it does for lists and tuples.  

If list `a` is a million times longer than list `b`, then on average it will take a million times longer to perform a membership test `x in a`. This is because the membership test scans through the list item by item until it finds an item equal to `x` or the list is exhausted.

However, if set `c` is a million times larger than set `d`, it doesn't take any longer, on average, to perform a membership test on `c`.  This is because sets are implemented as resizable hash tables (in CPython), so membership tests are performed by computing the hash code of `some_object` to look up its location in the set (if it is there) and check that one location.

***Note*** that this is why `some_object` must be hashable or a <span style="color:red">***TypeError***</span> will be raised rather than simply returning `False`. The same is true for dictionary keys, which are also implemented as resizable hash tables.

***Important tip***:  if you have a very large list or tuple, and are going to be testing membership on it many times, you may want to convert it to a set first so that the tests are faster. Of course, this is only possible if the items in the list or tuple are hashable. If the items are not hashable (such as lists), then you may want to try to make them hashable (such as converting lists to tuples).
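
The tip above might look like the following sketch, where the items are lists (not hashable) that get converted to tuples (hashable). The names `points` and `point_set` are made up for the example.

```python
# a list of [x, y] pairs; the inner lists are not hashable
points = [[0, 0], [1, 2], [3, 4]]

# convert each inner list to a tuple so the items can go in a set
point_set = {tuple(p) for p in points}

# the object being tested must be hashable too, so test with a tuple
has_1_2 = (1, 2) in point_set
has_5_5 = (5, 5) in point_set
print(has_1_2, has_5_5)
```

Building the set takes time proportional to the length of the list, so the conversion pays off only when many membership tests follow.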

#### 6.1.5 `not in`

Just like the `is not` operator, which is opposite of `is`, Python has a `not in` operator that is the opposite of `in` (i.e. `True` when `in` is `False`, and vice versa). For readability, use
```python
some_object not in some_container
```
...rather than
```python
not some_object in some_container
```

##### Exercise 6.1.2: Is it in?

Predict the output of the following cells. Explain ***why*** you expect that output. Then run each to see if you are correct.

In [None]:
[123] in [123, 456, 789]

In [None]:
'3' in "hello"

In [None]:
("he") in "hello"

In [None]:
"" in "hello"

In [None]:
None in "hello"

In [None]:
[55] in {"Ned": [33], "Oscar": [44], "Pete": [55]} 

In [None]:
(22) in {11, 22, 33, 44}

In [None]:
x = float('nan')
y = float('nan')
x in [y]

In [None]:
1 in [True, False, None]

In [None]:
a = [1,2,3]
b = [1,2,3]
c = [b]
a in c

<details>
    <summary><b>Solutions</b></summary>
    
The ***list*** `[123]` is ***not*** in the list `[123, 456, 789]`.
    
The string `'3'` is not in the string `"hello"`, but the string `"he"` is (it's a sub-string of it). Note that the parentheses do ***not*** make a tuple, because that would require a trailing comma, as in `("he",)`.
    
The empty string ***is*** a substring of ***every*** string, just like the empty set is a subset of every set. If this seems strange, think about a definition of sub-string: Every character in the sub-string appears in the string, in the same order and contiguous. This is true for the empty string (called vacuously true).
    
`None` is not a string, so the membership test raises a <span style="color:red">***TypeError***</span>.
    
`[55]` is not a ***key***, and it is not hashable, so it raises a <span style="color:red">***TypeError***</span>.

`22` is in the set. (again note that this is ***not*** a tuple!)
    
***Remember*** that a membership test checks whether any item in the container ***is*** the object (same identity) ***or*** is equal to it (same value). Since `x is y` ***and*** `x == y` both evaluate as `False`, so does `x in [y]`! (see [here](#nan_comparison)).
    
`1` has the same value as `True` (i.e. `1==True`), so `1 in [True, False, None]` evaluates as `True`.
    
Although you ***cannot*** access the list `a` through the list `c` (`c` does ***not*** contain a reference to `a`), the membership test evaluates as `True` because `c` contains a list with the same value (i.e. `a == b`).
    
</details>

### 6.2 Subscription using `[]` <a id="subscription"></a>

To reference an item in a container named `some_container`, use the `key` for that item inside square brackets following the container, as in:
```python
some_container[key]
```
The expression above will evaluate to the item in `some_container` designated by `key`, if there is one, otherwise an error will be raised. 

Note that `key` can be any expression that evaluates to an object with the same value as one of the keys of the container.

This is referred to as [***subscription***](https://docs.python.org/3/reference/expressions.html#subscriptions) (also called indexing). The expression is evaluated by calling the `__getitem__` [***method***](https://docs.python.org/3/reference/datamodel.html#object.__getitem__) of `some_container` with the argument `key`.

Sequences of length `N` accept integer keys (also called indices) from `0` to `N-1`, referred to as the [index set](https://docs.python.org/3/reference/datamodel.html#sequences). Built-in sequences also accept negative integer keys between `-N` and `-1`, which is described in the section on [***negative indices***](#negative_indices), and accept slice objects, described in the section on [***slicing***](#slicing). If subscription is performed with an integer outside of these ranges (i.e. an invalid index), <span style="color:red">***IndexError***</span> will be raised.

Mappings can have arbitrary hashable keys (not just integers and slices). If subscription is attempted with an unhashable key, a <span style="color:red">***TypeError***</span> will be raised. If subscription is attempted with a value that is not one of the keys in the mapping, a <span style="color:red">***KeyError***</span> will be raised. 
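
To make the role of `__getitem__` described above more concrete, here is a sketch with a hypothetical class named `Squares`, written only for this illustration. Its item at index `i` is `i` squared, and it raises the same kinds of errors that built-in sequences raise.

```python
class Squares:
    """A hypothetical sequence-like class: item i is i squared."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, key):
        # only the keys 0, 1, ..., n-1 are accepted in this sketch
        if not isinstance(key, int):
            raise TypeError("key must be an integer")
        if not 0 <= key < self.n:
            raise IndexError("index out of range")
        return key * key


squares = Squares(4)

# subscription calls __getitem__ behind the scenes
print(squares[3])
print(squares.__getitem__(3))

# the same is true for built-in containers
names = ('Randy', 'Seth', 'Tim')
print(names[1] == names.__getitem__(1))
```

Calling `__getitem__` directly is shown here only to illustrate the mechanism; in normal code, use the square bracket syntax.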

Let's look at some examples:

In [None]:
names = ('Randy', 'Seth', 'Tim')
ages = [31, 42, None]
group_size = 3
group_name = 'Ransim'
hometowns = {'Randy': 'Middletown', 'Seth': 'Newburgh', 'Tim': 'Oswego'}

##### Exercise 6.2.1: Which item is it?

Predict the output of the following cells, then run each to see if you are correct.

In [None]:
names[1]

In [None]:
ages[False]

In [None]:
ages[2]

In [None]:
group_size[3]

In [None]:
group_name[6]

In [None]:
hometowns[0]

<details>
    <summary><b>Solutions</b></summary>
    
***Remember*** that Python is zero-indexed, so that `names[1]` is the ***second*** item: `'Seth'`.
    
`False` has the same value as `0` (i.e. `False == 0`) so `ages[False]` is the same as `ages[0]`, which is `31`.
    
`ages[2]` is `None`, but that does not produce any output by default. If you evaluate `print(ages[2])`, that will print `None`.
    
`group_size` is an integer, which is not subscriptable so you get a <span style="color:red">***TypeError***</span>. Integers are not containers.
    
`group_name` only has a length of `6`, so the index `6` is "out of range", raising an <span style="color:red">***IndexError***</span>. The index set includes integers from `0` to `5`.

`0` is not one of the keys for the dictionary `hometowns`, so a <span style="color:red">***KeyError***</span> is raised. To get the first item, you need to use the first key, `'Randy'`, as in `hometowns['Randy']`, which would evaluate to `'Middletown'`.  You could also convert the dictionary's [values](#dict_values) to a list using the [list constructor](#set_dedupe_list), then select the first item in that list, as in `list(hometowns.values())[0]`
    
</details>

#### 6.2.1 Negative indices <a id="negative_indices"></a>

Built-in sequences accept negative indices, which count position backward from the end of the sequence, meaning that `some_sequence[-1]` is the last item, `some_sequence[-2]` is the item before that, and so on... up to `some_sequence[-N]`, where `N = len(some_sequence)`, which is the first item in the sequence.

For example, each of the following expressions would evaluate as `True`:
```python
"Python"[-1] == "n"
"Python"[-2] == "o"
"Python"[-3] == "h"
"Python"[-4] == "t"
"Python"[-5] == "y"
"Python"[-6] == "P"
```
Although Python uses 0-based indexing, the reason the last item is `-1`, instead of `-0`, is because `-0 == 0`.

One way to remember which item is referred to by a negative index is to add the length of the sequence to the index. So if the sequence is length `N`, then `-N` is the same as `-N+N`, or `0` (i.e. the first item), and `-1` is the same as `-1+N`, or `N-1` (i.e. the last item).
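
The rule of adding the length of the sequence to the index can be checked with a short loop. This is just a sketch using the string `"Python"` from the examples above; the names `word`, `N`, and `all_match` are chosen for illustration.

```python
word = "Python"
N = len(word)

# for every valid negative index i, word[i] is the same item as word[i + N]
for i in range(-N, 0):
    print(f"word[{i}] == word[{i + N}] == {word[i]!r}")

# check the rule for all of the negative indices at once
all_match = all(word[i] == word[i + N] for i in range(-N, 0))
print(all_match)
```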

Let's look at some examples:

In [None]:
names = ('Randy', 'Seth', 'Tim')
ages = [31, 42, None]
group_size = 3
group_name = 'Ransim'
hometowns = {0: 'Middletown', 1: 'Newburgh', 2: 'Oswego'}

##### Exercise 6.2.2: Which item is it?

Predict the output of the following cells, then run each to see if you are correct.

In [None]:
names[-0]

In [None]:
ages[-1]

In [None]:
group_size[-3]

In [None]:
group_name[-6]

In [None]:
hometowns[-1]

<details>
    <summary><b>Solutions</b></summary>
    
***Remember***, although Python is zero-indexed, `-0 == 0`, so that `names[-0]` is the ***first*** item: `'Randy'`.
    
`ages[-1]` is `None`, but that does not produce any output by default. If you evaluate `print(ages[-1])`, that will print `None`.
    
`group_size` is an integer, which is not subscriptable so you get a <span style="color:red">***TypeError***</span>. Integers are not containers.
    
`group_name` has a length of `6`, so the index `-6` is the first item. It might take practice to remember that, for a sequence of length `N`, the index `N` is "out of range" and raises an <span style="color:red">***IndexError***</span>, but the index `-N` is not out of range; it's the first item.

`-1` is not one of the keys for the dictionary `hometowns`, so a <span style="color:red">***KeyError***</span> is raised. To get the last item, you need to use the last key, `2`, as in `hometowns[2]`, which would evaluate to `'Oswego'`.  Making the keys to dictionaries be integers does ***not*** make them behave as sequences.  You could also convert the dictionary's [values](#dict_values) to a list using the [list constructor](#set_dedupe_list), then select the last item in that list, as in `list(hometowns.values())[-1]`
    
</details>

#### 6.2.2 Slicing <a id="slicing"></a>

Built-in sequences also accept [***slices***](https://docs.python.org/3/glossary.html#term-slice) as keys, which is referred to as [***slicing***](https://docs.python.org/3/reference/expressions.html#slicings) instead of subscription. A slice of a sequence is a sequence of the same type, which is a [***shallow copy***](#list_copy) of a (possibly empty) part of the sequence (or the whole thing).

The syntax for slicing a sequence named `some_sequence` is one or two colons, `:`, inside square brackets after the name, optionally with expressions before and/or after each colon that evaluate to ***integers*** or `None`. The `step` ***cannot*** be zero.  Here are some examples of the syntax:
```python
some_sequence1[start:stop]
some_sequence2[start:stop:step]
```
Before describing `start`, `stop`, and `step` in detail (they behave somewhat like they do in [ranges](#ranges_intro)), let's look at some examples:

In [None]:
numbers = (0,1,2,3,4,5,6,7,8,9)
letters = "abcdefghijklmnopqrstuvwxyz"

print(numbers[:]) # shallow copy
print(letters[:]) # shallow copy
print(numbers[2:6])
print(letters[2:6])
print(numbers[7:4]) # empty tuple
print(letters[7:4]) # empty string
print(numbers[1:9:2]) # every other, starting at 1, stopping before 9
print(letters[1:9:2]) # every other, starting at 1, stopping before 9
print(numbers[9:2:-3])
print(letters[9:2:-3])
print(numbers[::-1]) # reversed
print(letters[::-1]) # reversed

##### 6.2.2.1 `start`

When `start` is given, the item at that index is the first item in the new sequence, if the new sequence is non-empty.

If `start` is not given, it defaults to `None`, which will make the first item in the new sequence the same as the first item in the original sequence ***if step is positive***, or the last item in the original sequence if step is negative.

If `start` is negative, the length of the sequence is added to it, just like for negative indices. 

However, `start` is also ***clipped*** to the bounds of the sequence (for a sequence of length `N` and a positive `step`, to between `0` and `N`), so you can use ***any integer*** value for `start` without raising an error.

If, after adding the length of the sequence to negative values of `start` and/or `stop`, `start >= stop` and `step > 0` or `step is None`, then the slice will be empty. Similarly, if `start <= stop` and `step < 0`, then the slice will be empty.

In [None]:
# running this code will print examples
x = (0,1,2,3,4,5)
print(f"For x = {x}, which has length {len(x)}\n")
for i in range(-7, 6):
    print(f"{f'x[{i}:4] =':>9}", x[i:4])
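
The rules above can also be checked one at a time. This is only a small sketch using the same tuple `x`; the name `N` is introduced here just for illustration:

```python
x = (0, 1, 2, 3, 4, 5)
N = len(x)

print(x[-2:] == x[N-2:])  # True: a negative start has the length added to it
print(x[-99:] == x[0:])   # True: a start that is too negative is clipped to 0
print(x[99:])             # (): a start that is too large is clipped, so nothing is left
print(x[4:2])             # (): start >= stop with a positive step
print(x[2:4:-1])          # (): start <= stop with a negative step
```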

##### 6.2.2.2 `stop`

When `stop` is given, the item at that index will ***not*** be in the slice. This is similar to [ranges](#ranges_intro).

If `stop` is not given, it defaults to `None`, which for a sequence of length `N` is the same as having `stop = N` for `step > 0`, or `stop = -(N+1)` for `step < 0`.

If `stop` is negative, the length of the sequence is added to it, just like for negative indices. 

Just like `start`, `stop` is ***clipped*** so you can use ***any integer*** value for `stop` without raising an error.

Again, after adding the length of the sequence to negative values of `start` and/or `stop`, if `start >= stop` and `step > 0` or `step is None`, then the slice will be empty. Similarly, if `start <= stop` and `step < 0`, then the slice will be empty.

In [None]:
x = (0,1,2,3,4,5)

print(x[:3]) # everything up to, but NOT including the value at index 3
print(x[:4])
print(x[:17]) # the whole sequence
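print(x[:-99]) # empty tuple, since stop is clipped to 0
print(x[:-99:-1]) # the whole sequence, reversed
print(x[:2:-1]) # from the last item backwards, stopping before index 2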
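
The default for a missing `stop` with a negative `step` can look strange, so here is a short sketch that compares the defaults with their explicit equivalents. It assumes the same tuple `x`; the name `N` is only for illustration:

```python
x = (0, 1, 2, 3, 4, 5)
N = len(x)

print(x[2:] == x[2:N])           # True: with a positive step, a missing stop acts like N
print(x[::-1] == x[:-(N+1):-1])  # True: with a negative step, a missing stop acts like -(N+1)
print(x[:-1:-1])                 # (): stop = -1 means index N-1, which is where the slice starts
print(x[:-2] == x[:N-2])         # True: a negative stop has the length added to it
```

This is why you cannot write an explicit `stop` of `-1` to mean "go all the way to the first item" when stepping backwards; leave `stop` out instead.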

##### 6.2.2.3 `step`

The `step` has a similar effect as it does in [ranges](#ranges_intro).

If `step` is `None`, it behaves as `step = 1`.

If `step > 0`, the slice is taken in order of increasing indices from the original sequence.

If `step < 0`, the slice is taken in order of ***decreasing*** indices from the original sequence.

`step` cannot be `0`.

If `step` is not `1` or `-1`, then items are skipped.

In [None]:
x = (0,1,2,3,4,5)

print(x[::]) # same as x[:]
print(x[::1]) # same as x[:]
print(x[::2]) # every other item
print(x[::3]) # every third item
print(x[::-1]) # reversed
print(x[::-2]) # every other item, reversed
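
To make the comparison with ranges concrete, the following sketch builds the same tuples two ways: by slicing, and by looking up each index produced by a `range` with the same `start`, `stop`, and `step`. The names `by_slice` and `by_range` are made up for this example, and the comparison only holds this directly when `start` and `stop` are non-negative and in range:

```python
x = (0, 1, 2, 3, 4, 5)

by_slice = x[1:6:2]
by_range = tuple(x[i] for i in range(1, 6, 2))  # indices 1, 3, 5
print(by_slice, by_range)

by_slice_back = x[5:0:-2]
by_range_back = tuple(x[i] for i in range(5, 0, -2))  # indices 5, 3, 1
print(by_slice_back, by_range_back)

# x[::0] would raise a ValueError, just like range(0, 6, 0) would
```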

##### Exercise 6.2.2.3: Predict that slice!

Predict the output of the following cells, then run each to see if you are correct.

In [None]:
x = (0,1,2,3,4,5,6,7,8,9)

x[3:6]

In [None]:
x[8:2]

In [None]:
x[0:8:2]

In [None]:
x[6::-1]

In [None]:
x[9:0:-1]

In [None]:
x[0:11]

In [None]:
x[3:6:-1]

<details>
    <summary><b>Solutions</b></summary>
    
`x[3:6]` is a ***copy*** of the part of the tuple that starts at index `3` and ends ***before*** index `6` (increasing indices).
    
Since `8 > 2` and `step` is not given, `x[8:2]` is an empty tuple.
    
`x[0:8:2]` has a `step = 2`, so it is every other number, starting at `0` and ending ***before*** `8`.
    
`x[6::-1]` has `step = -1`, so it is every item starting at index `6` and going backwards (decreasing indices).
    
`x[9:0:-1]` is a ***copy*** of ***almost*** the whole tuple backwards, but it stops ***before*** reaching item `0`.
    
`x[0:11]` is a ***copy*** of the whole tuple. ***Remember*** that indices that are "out of range" are ***clipped***, so that there is no error raised.
    
`3 < 6`, but `step < 0`, so `x[3:6:-1]` is an empty tuple.
</details>

##### 6.2.2.4 slice objects

When the syntax `some_container[i:j:k]` is parsed, it creates a [***slice object***](https://docs.python.org/3/reference/datamodel.html#slice-objects) that is equal to the slice object created using the built-in [***slice( )***](https://docs.python.org/3/library/functions.html#slice) called as:
```python
slice(i, j, k)
```
If any of `i`, `j`, or `k` are missing, the equivalent slice object would be created by using `None` for that argument. Note that `slice( )` only accepts positional arguments, so `slice(start=i, stop=j, step=k)` raises a <span style="color:red">***TypeError***</span>.

If you ever want to create a slice object programmatically, you could use:
```python
x = slice(i, j, k)
some_container[x]
```
...to do the same thing as:
```python
some_container[i:j:k]
```
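
As a small sketch of why this can be useful, a slice object can be given a name and reused. The names `every_third`, `first_five`, and `backwards` are made up for this example:

```python
letters = "abcdefghijklmnopqrstuvwxyz"

every_third = slice(1, 20, 3)      # same as 1:20:3 inside square brackets
first_five = slice(5)              # a single argument is the stop, like range(5)
backwards = slice(None, None, -1)  # None stands in for a missing value, like ::-1

print(letters[every_third] == letters[1:20:3])  # True
print(letters[first_five])                      # abcde
print(letters[backwards] == letters[::-1])      # True
print(every_third.start, every_third.stop, every_third.step)  # 1 20 3
```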

#### 6.2.3 Assignment statements with subscriptions

This section will cover several examples of [***assignment statements***](#assignment) with subscriptions (and slicings), including subscriptions as targets on the left hand side and subscriptions as expressions on the right hand side.

As the target on the left hand side, there's an important difference between a mapping subscription, a sequence subscription, and a sequence slicing.

##### 6.2.3.1 Subscriptions in the expression list

A normal subscription (i.e. not a slicing) in the expression list of an assignment statement will be evaluated, and that ***item*** of the container will be assigned to the appropriate target in the target list. Now that item will have at least two names. For example:
```python
some_name = some_container[some_key]
```
In the above assignment, the name `some_name` is bound to the item in `some_container` designated by the key `some_key`. Now that item can be referenced either as `some_name` or as `some_container[some_key]`.

***Note*** that for mutable objects, you can mutate the object through either reference.
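
Here is a minimal sketch of that idea, using a hypothetical dictionary named `pets` (not one of the containers used elsewhere in this tutorial):

```python
# hypothetical data, for illustration only
pets = {'dogs': ['Rex'], 'count': 1}

dogs = pets['dogs']          # no copy is made
print(dogs is pets['dogs'])  # True: two references to the same list
dogs.append('Fido')          # mutate the list through the new name...
print(pets)                  # ...and the change is visible through the container
```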

##### Exercise 6.2.3.1: subscription in the expression list

Predict the output of the following cells, then run each to see if you are correct.

In [None]:
a = [0,1,2,3,4,5]
b = a[3]
print(b)

In [None]:
b = 6
print(a)

In [None]:
x = {'a': [1], 'b': (2), 'c': {3}}
y = x['a']
print(y)

In [None]:
y[0] = 4
print(x)

<details>
    <summary><b>Solutions</b></summary>

`a[3]` equals `3`, so now `b` equals `3`.
    
`b = 6` rebinds the name `b` so that it now refers to `6`. That does ***not*** change the item that `a[3]` refers to, which is the integer `3`. In fact, since integers are not mutable, there is no way to affect `a` through `b`.
    
`x['a']` equals the list `[1]`, so now the name `y` refers to that ***same*** list (remember that assignment does not create a copy).
    
`y[0] = 4` mutates the list that `y` refers to, by changing its first item to the integer 4. That is the ***same*** list that `x['a']` refers to, so we see that mutation when we print `x`.
    
***Note*** the value in `x` identified by the key `'b'`. It is not a tuple: `(2)` is just the integer `2` in parentheses. A tuple with one item would be written `(2,)`.
</details>

##### 6.2.3.2 Slicing in the expression list

A slicing in the expression list of an assignment statement will be evaluated, and that ***new container object*** will be assigned to the appropriate target in the target list. If the container is not empty, then it is a  [***shallow copy***](#list_copy) of all or part of the container that was sliced. For example:
```python
some_name = some_container[some_slice]
```
In the above assignment, the name `some_name` is bound to a container of the same type as `some_container`. If there are any items in the container `some_name`, those ***same*** items are also in `some_container` and can be referenced by subscripting either name.

***Note*** that for mutable objects, you can mutate the object through either name.
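
The difference between the new container and its shared items can be checked with `is`. This is only a sketch, with a hypothetical list of lists named `rows`:

```python
# hypothetical data, for illustration only
rows = [[1, 2], [3, 4], [5, 6]]
first_two = rows[0:2]

print(first_two is rows)        # False: the slice is a new list
print(first_two[0] is rows[0])  # True: but its items are the same objects
first_two[0][0] = 'changed'     # mutating a shared item...
print(rows)                     # ...shows up through both lists
print(first_two)
```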

##### Exercise 6.2.3.2: slicing in the expression list

Predict the output of the following cells, then run each to see if you are correct.

In [None]:
a = [0,1,2,3,4,5]
b = a[2:5]
print(b)

In [None]:
b[1] = 6
print(a)

In [None]:
x = ['a', (2,3), [4,5,6], None, lambda x: x**2, True]
y = x[1:5]
print(y)

In [None]:
y[2] = "789"
print(x)

In [None]:
y[1][0] = "four"
print(x)

<details>
    <summary><b>Solutions</b></summary>
    
`a[2:5]` makes a ***new*** list that contains the items `a[2]`, `a[3]`, and `a[4]`. In fact, in this example, `a[2:5]` does the same thing as `[a[2], a[3], a[4]]` would. That's not true in general though; this is only equivalent when `a` is a list with at least five items.
    
`b[1] = 6` mutates the list `b`, but that does not affect the list `a`, since they are distinct lists.
    
`x[1:5]` makes a ***new*** list that contains the items `x[1]`, `x[2]`, `x[3]`, and `x[4]`.
    
`y[2] = "789"` mutates the list `y` (changes the third item in `y` to be a different object), but that does not affect the list `x`.
    
`y[1][0] = "four"`  mutates the list `y[1]` (changes the first item in `y[1]` to be a different object), which is the ***same*** list as `x[2]`. Changing the first item of `y[1]` to `"four"` is the ***same*** as changing the first item of `x[2]` to `"four"`, because they are the ***same*** list.
    
***This is important to understand:***
    
`y = x[1:5]` makes a ***new*** list and binds the name `y` to it, so `y` is a ***different*** list than `x`, ***but*** all of the ***items*** in `y` are the ***same*** as items in `x`, since the slice is a [***shallow copy***](#list_copy).
    
There is a difference between mutating a list, and mutating one of the mutable items ***in*** the list. `y[1][0] = "four"` does ***not*** mutate `y` (it still contains the same items), but it does mutate one of the items contained in `y`.
    
The lists `x` and `y` are ***different***, meaning that `(y is x) == False`, but the item `y[1]` is the ***same*** as the item `x[2]`, meaning that  `(y[1] is x[2]) == True`. That item is a list, which is mutable. Mutating `y[1]` is the same as mutating `x[2]`, because `x[2]` and `y[1]` are two names for ***one*** list.
</details>

##### 6.2.3.3 Subscriptions in the target list <a id="subscription_target"></a>

A normal subscription (i.e. not a slicing) in the target list of an assignment statement will raise a <span style="color:red">***TypeError***</span> if the container is not mutable, stating that the object `does not support item assignment`.

For mutable containers, there is an important difference in behavior for sequences versus mappings. For sequences (e.g. lists), the key in the subscription must be a valid index for the sequence. Assigning to a subscription ***cannot*** change the size of the sequence. For example:
```python
some_sequence[some_key] = some_expression
```
In the above assignment, if `some_key` is not a valid index for the sequence, an <span style="color:red">***IndexError***</span> will be raised.  If it is a valid index, then `some_expression` will be evaluated and `some_sequence[some_key]` will be bound to that object.

For mappings (e.g. dictionaries), the key in the subscription does ***not*** need to be one of the keys for the mapping. Assigning to a subscription ***can*** change the size of the mapping. For example:
```python
some_mapping[some_key] = some_expression
```
In the above assignment, if `some_key` ***is*** already one of the keys of `some_mapping`, then the assignment rebinds `some_mapping[some_key]` to refer to the object that `some_expression` evaluated to. Otherwise, `some_key` becomes one of the keys for `some_mapping`, and refers to the object that `some_expression` evaluated to.
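
The contrast between the two cases can be sketched side by side. The names `scores_list` and `scores_dict` are hypothetical and only used here:

```python
# hypothetical data, for illustration only
scores_list = [10, 20, 30]
scores_dict = {'a': 10, 'b': 20, 'c': 30}

scores_list[2] = 33    # valid index: rebinds the item, the length is unchanged
scores_dict['c'] = 33  # existing key: rebinds the value, the size is unchanged
scores_dict['d'] = 40  # new key: the dictionary grows by one key:value pair

print(scores_list, len(scores_list))
print(scores_dict, len(scores_dict))
# scores_list[3] = 40 would raise an IndexError, since 3 is not a valid index
```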

##### Exercise 6.2.3.3: subscription in the target list

Predict the output of the following cells, then run each to see if you are correct.

In [None]:
p = ['a', 'b', 'c', 'd', 'e']
p[5] = 'f'
print(p)

In [None]:
q = (0, 1, 2)
q[1] = 4
print(q)

In [None]:
x = {'a': [1], 'b': (2), 'c': {'d', 4}}
x['e'] = 5
print(x)

In [None]:
x['a'][0] = 11
print(x)

In [None]:
x['b'][0] = 22
print(x)

In [None]:
x['c']['d'] = 44
print(x)

In [None]:
y = ([0], [1], [2])
y[1][0] = 11
print(y)

<details>
    <summary><b>Solutions</b></summary>
    
`p` only has 5 items, so `p[5]` raises an <span style="color:red">***IndexError***</span> (the valid indices are `0` through `4`).
    
`q` is a tuple, which is immutable, so `q[1] = 4` raises a <span style="color:red">***TypeError***</span>. You cannot assign anything to an item of a tuple, so a subscription or slicing of a tuple should never be in the target list of an assignment statement.
    
`x['e'] = 5` adds a `key:value` pair to the dictionary, so that the key `'e'` now refers to the value `5`.
    
`x['a'][0] = 11` mutates the list referenced by `x['a']` so that its first item is `11`.
    
`x['b']` is an integer, which is not a container, so it does not support item assignment. Hence, `x['b'][0] = 22` raises a <span style="color:red">***TypeError***</span>.

`x['c']` is a set. Sets are mutable, but they are not subscriptable (their items have no keys or indices), so `x['c']['d'] = 44` raises a <span style="color:red">***TypeError***</span>.
    
`y[1]` refers to a list with at least one item, so `y[1][0] = 11` mutates that list so that its first item is `11`. Remember, you ***cannot*** mutate a tuple (i.e. change which items it contains), but you ***can*** mutate the mutable items ***in*** a tuple.
</details>

# Work in progress

#### 6.2.4 `del` statements with subscriptions

- Sequences
- Mappings
- Slicings

### 6.3 Sequences <a id="sequences"></a>

### 6.4 Sets <a id="sets"></a>

### 6.5 Mappings <a id="mappings"></a>

### 6.6 Strings <a id="strings"></a>

### 6.7 Other Containers <a id="other_containers"></a>

In [None]:
# Mutable sequences should provide the methods append(), count(), index(), extend(), insert(), pop(), remove(), reverse() and sort()

In [None]:
# Mapping methods: keys(), values(), items(), get(), clear(), setdefault(), pop(), popitem(), copy(), and update()

What to cover:
* [containers](https://docs.python.org/3/library/collections.html)
  * [membership](https://docs.python.org/3/reference/expressions.html#membership-test-operations)
  * set vs list
  * dictionary - key, not pair
* sequences
  * subscription (indexing) - including negative, assignment, augmented assignment
  * adding, multiplying (same list multiple times)
  * slices, are copies, assignment, augmented assignment
  * immutable methods
  * mutable methods <a id="lists_more"></a>
  * namedtuple
* sets
  * methods
  * frozenset
* mappings
  * subscription, assignment, augmented assignment
  * methods
  * ordereddict
  * defaultdict
  * counter
* strings <a id="strings_more"></a>
  * methods
  * formatting
  * unicode: `chr` and `ord`
  * bytes
* others
  * arrays
  * bytearrays
  * ?

## Part 7: Dates and Times

What to cover:
* time module
  * time stamps
  * UTC
  * time zones - aware/naive
  * leap seconds
  * strftime
  * timetuples
* datetime
  * dates
  * times
  * timezones - UTC offsets
  * datetimes
  * timedeltas

<a id="Iters_Gens"></a>
## Part 8: Iterators and Generators

What to cover:
* iteration
  * iterables
  * iter
  * over same iterable concurrently
  * next
* generators
  * generating functions
  * yield
  * infinite generators
  * short circuiting
  * exhaustion
  * send
  * throw
* assignment expression in while loop

## Part 9: Error Handling <a id="error_handling"></a>

What to cover:
* the concept of errors
  * overt/covert
  * persistent/intermittent
  * syntax errors, runtime errors, and logic errors
* raise
* try, except, else, finally
* identify existence, find in code, understand cause, fix
* testing (blackbox and glassbox) and debugging
* finding errors
  * print and logging
  * commenting out code
  * setting the seed
  * defensive programming - fail fast (pros & cons)
* error messages
  * reading
  * documentation
  * googling
* understanding errors
  * object types
  * mutability/immutability
  * incorrect logic
* common errors for beginners, and Python gotchas
* warnings
  * vs errors
  * creating your own

## Part 10: Classes <a id="classes"></a>

## Part 11: Regex and match statements <a id="regex_and_match"></a>

## Part 12: HTML and XML <a id="html_and_xml"></a>

## Part 13: Modules and Packages <a id="modules"></a>

## Part 14: Advanced Function Topics

What to cover:
* Scoping: global, local, nonlocal
* Closures
* functools
* Decorators
* recursive functions

## Part 15: More on file/os functionality

What to cover:
* Checking if files or directories exist
* Making directories
* Copying files and directories
* Deleting directories
* Walking directories
* Filename pattern matching
* Constructing paths
* File/Directory Metadata
* Disk Usage

## Part 16: Multiprocessing

## Part 17: Databases/SQLite

# Index

## reused markdown syntax

<details>
    <summary><b>Solution</b></summary>
    
    <span style="color:red">***NameError***</span>
</details>

<details>
    <summary><b>Remember</b></summary>
    
    What is a <span style="color:red">***NameError***</span>?
</details>

***[Errors](https://docs.python.org/3/tutorial/errors.html#):***

The two most important lines to read are:
1. The line at the very bottom, which usually states what kind of error it is (e.g. <span style="color:red">***NameError***</span>).
2. The line that shows which line (with line number) caused the error.

<a id="some_id"></a>

<details>
    <summary><b>Remember</b></summary>

What is a <span style="color:red">***NameError***</span>?
</details>

<details>
    <summary><b>Remember</b></summary>
What is a <span style="color:red">***NameError***</span>?
</details>
