# In this notebook, we will look at the basics of Python syntax, data types, and control flow
We're mainly going to focus on the core operations, the ones that you will use most often.
There is much, much more that you can do with all of the different data types in Python.

There's a lot here so it may seem pretty dry, but it will be useful for you going forward
as a quick **Python cheatsheet**.

## Data type: int and float

In [112]:
# Arithmetic
x = 1 + 4        # 5
x = -2.5 * 8     # -20.0
x = -16 / 3      # -5.33 (Note: Division always returns a float, even if the result is a whole number)
x = 16 // 3      # 5 (Division rounded down to the nearest whole number. Returns an int if both operands are ints)
x = 8 % 3        # 2 (Returns the remainder of division. Returns an int if both operands are ints)
x = 2 ** 3       # 8 (Exponentiation [i.e. 2 * 2 * 2]. Returns an int if both operands are ints)
x += 5           # 13 (Equivalent to saying x = x + 5 [i.e. x = 8 + 5])

All of the above operations can be applied to ints, floats, or mixes of the two. As a general rule,
the result is an `int` when both operands are `int`s, and a `float` as soon as either operand is a `float`.
The main exception is division (`/`), which will always return a `float` no matter what (raising an `int` to a
negative power, e.g. `2 ** -1`, also gives a `float`). If you need to convert a `float` to an `int`, you can
"cast" it as shown below; this drops the fractional part rather than rounding. If you're ever unsure of the type
of a variable, you can always get it using the built-in `type` function.

In [32]:
x = 2.5
print(x, type(x))
y = int(x)
print(y, type(y))

2.5 <class 'float'>
2 <class 'int'>
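
Casting with `int` is not the same as rounding, and the difference shows up most clearly with negative numbers. The short sketch below compares the options side by side (the variable names are just for illustration):

```python
import math

truncated = int(-2.5)         # -2 (int() drops the fractional part, moving toward zero)
floored = math.floor(-2.5)    # -3 (rounds down, toward negative infinity)
nearest = round(2.6)          # 3 (rounds to the nearest whole number)

floor_div = -16 // 3          # -6 (floor division also rounds down, not toward zero)
remainder = -16 % 3           # 2 (chosen so that floor_div * 3 + remainder == -16)

print(truncated, floored, nearest, floor_div, remainder)
```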


## Data type: string (str)

In [61]:
# We can join two strings using the + operator
x = 'We can join this string' + ' with this one'
print(x)

# We can join a list of strings and fill whatever we want in between
x = '-'.join(['We can join', 'a list of strings', 'with stuff in between', '(in this case, dashes)'])
print(x)

# Similarly, we can split a string into a list by whatever we want
x = 'We can split this string by its spaces'.split(' ')
print(x)

# We can "cast" (i.e. convert) numbers to strings
x = str(2.5)
print(x, type(x))

# We can format strings with certain placeholder markers (useful for neat printing)
pi = 3.14159
x = 'PI is {}, and rounded to the nearest 100th it is {:.2f}'.format(pi, pi)
print(x)

# If we want to use certain special characters in our string (for instance, '), we can "escape" them by prepending a \
x = 'Let\'s escape the "\'" character'
print(x)

We can join this string with this one
We can join-a list of strings-with stuff in between-(in this case, dashes)
['We', 'can', 'split', 'this', 'string', 'by', 'its', 'spaces']
2.5 <class 'str'>
PI is 3.14159, and rounded to the nearest 100th it is 3.14
Let's escape the "'" character
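
As an alternative to `.format`, Python 3.6 and later also support "f-strings", where the values are written directly inside the placeholders. A minimal sketch, reusing the `pi` example from above (the `label` variable is made up for illustration):

```python
pi = 3.14159
label = 'pi'    # Hypothetical variable, just to show that expressions work inside the braces

# The "f" prefix lets us put variables (and format specifiers) right inside the braces
x = f'{label.upper()} is {pi}, and rounded to the nearest 100th it is {pi:.2f}'
print(x)

# The same string built with .format, for comparison
same = 'PI is {}, and rounded to the nearest 100th it is {:.2f}'.format(pi, pi)
print(x == same)
```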


## Data type: boolean (bool) [i.e. True/False]

In [89]:
# Literals
x = True
y = False

# "and" returns True when both sides are True
z = x and y        # False
z = x and x        # True
z = y and y        # False

# "or" returns True when at least one side is True
z = x or y        # True
z = x or x        # True
z = y or y        # False

# "not" returns the opposite of what comes after it
z = not x         # False
z = not y         # True

# Order of operations: not -> and -> or
z = x or y and not x    # What do you think this outputs?

# Comparison operators
z = 5 > 2         # True (greater than). There is also <
z = 5 >= 5        # True (greater than or equal to). There is also <=
z = 5 == 2        # False (equal to). Remember that just a single "=" is used for assignment, so Python uses a double "=" for equality
z = 5 != 2        # True (not equal to)
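
If the order of operations is hard to keep in your head, parentheses make the grouping explicit. The sketch below shows how Python groups the expression from the cell above, and how a different grouping changes the result:

```python
x = True
y = False

implicit = x or y and not x               # Python applies "not" first, then "and", then "or"
explicit = x or (y and (not x))           # The same expression with the grouping written out
other_grouping = (x or y) and (not x)     # Forcing "or" to go first gives a different answer

print(implicit, explicit, other_grouping)
```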

## Data type: list

In [72]:
# Definition
x = [5, 1, 0, 12, 6]

# Getting the size of the list
y = len(x)      # 5 (x has 5 entries)

# Indexing one entry
y = x[0]        # 5 (1st entry, because Python starts counting from 0)
y = x[2]        # 0 (3rd entry)

# Indexing a sequence of entries
y = x[1:4]      # [1, 0, 12] (Read as start:[end - 1], so 2nd - 4th in this case)
y = x[:4]       # [5, 1, 0, 12] (Read as 0:[end - 1], so 1st - 4th in this case)
y = x[1:]       # [1, 0, 12, 6] (Read as start:[len(x) - 1], so 2nd - 5th in this case)
y = x[1:len(x)] # [1, 0, 12, 6] (Same as above)

# Adding to lists
x.append(1)     # x is now [5, 1, 0, 12, 6, 1]
y = [8, 2] + x  # [8, 2, 5, 1, 0, 12, 6, 1]

# Modifying entries
x[3] = 20       # x is now [5, 1, 0, 20, 6, 1]

# Checking membership
y = 20 in x     # True (x contains the value 20)
y = 12 in x     # False (the 12 was overwritten with 20 above)
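
Two more indexing details that come up often: negative indices count from the end of the list, and a slice is a new list, so changing it leaves the original alone. A small sketch using the same starting list:

```python
x = [5, 1, 0, 12, 6]

last = x[-1]          # 6 (negative indices count backwards from the end)
last_two = x[-2:]     # [12, 6]

first_three = x[:3]   # [5, 1, 0] (a slice is a new list)
first_three[0] = 99   # Only the new list changes

print(first_three, x)
```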

## Data type: tuple

In [101]:
# Definition
x = ('eric', 'python', 'canada')

# Getting the size of the tuple
y = len(x)      # 3

# Indexing one entry
y = x[0]        # 'eric'
y = x[1]        # 'python'

# Indexing a sequence of entries
y = x[1:3]      # ('python', 'canada')

# "Unpacking" tuples
name, language, country = x    # name = 'eric', language = 'python', and country = 'canada'

# Combining tuples
y = (8, 2) + x  # (8, 2, 'eric', 'python', 'canada')

# Checking membership
y = 12 in x    # False (x does not contain the value 12)

# You cannot modify tuple elements! They are "immutable" (this is the essential difference with lists)
# Executing the code below would cause errors
#x.append(1)    # AttributeError: 'tuple' object has no attribute 'append'
#x[3] = 20      # TypeError: 'tuple' object does not support item assignment

It might seem weird that tuples are essentially just lists that can't be modified. What are they good for?
Essentially, the difference is that they're used for different purposes:
- When you use a tuple, you're telling the people who read your code "this variable will not change".
- In practice, people use lists for "homogeneous" data (i.e. a collection of the same sort of stuff), whereas the entries in tuples are often "heterogeneous" and might represent very different things. In the example above, the first entry represents my name, the second entry represents the coding language I'm using, and the third entry represents the country I live in. It would be strange to define these different concepts in the same list. Tuples are often even composed of different data types.
- Tuples can be used as "keys" in dictionary data structures (which we'll see below), whereas lists cannot.

These differences communicate the variable's function, and help make your code more understandable.
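
To make the point about dictionary keys concrete, here is a minimal sketch. The `grid` dictionary and its contents are made up for illustration:

```python
# A tuple works as a dictionary key (hypothetical example: labels for (row, column) positions)
grid = {(0, 0): 'origin', (1, 2): 'treasure'}
label = grid[(1, 2)]
print(label)

# A list does not, because lists can be modified and are therefore not "hashable"
try:
    bad = {[0, 0]: 'origin'}
    error_message = None
except TypeError as e:
    error_message = str(e)
print(error_message)
```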

## Data type: dictionary (dict)
These are one of the more interesting data types. They're kind of like lists, but you index them using "keys" rather
than their position, where these "keys" can take on numerous data types (e.g. `str`, `tuple`, `int`, ...), so long
as the data type is what we call "hashable" (Google this term if you're interested, it's pretty clever!).

In [103]:
# Definition
x = {'bonnie': 82, 'eric': 100, 'lynxi': 12}    # These are 3 key:value pairs. For instance, maybe test marks for different people

# Getting the size of the dictionary
y = len(x)            # 3

# Indexing
y = x['eric']         # 100 (The value at the 'eric'th entry)

# Adding new entries to dictionaries
x['gad'] = 76         # {'bonnie': 82, 'eric': 100, 'lynxi': 12, 'gad': 76}

# Modifying existing entries in dictionaries
x['eric'] = 60        # {'bonnie': 82, 'eric': 60, 'lynxi': 12, 'gad': 76}

# Checking membership (of keys!)
y = 'eric' in x       # True (x contains the key 'eric')
y = 12 in x           # False (Although the value 12 is in x, it is not a key)

# Checking membership of values
y = 12 in x.values()  # True (x contains the value 12)

**Note**: As of Python 3.7, dictionaries remember the order in which their keys were inserted, but it is usually a mistake
to write your code in a way that gives this order any meaning. Entries are looked up by key, not by position, so it makes
no sense to say that the 'bonnie'th entry comes before the 'eric'th entry, even though this is how we have written
it above.
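
Two other dictionary operations are worth having on the cheatsheet: looking up a key that might not exist, and removing an entry. A short sketch with the same marks as above (the variable names are just for illustration):

```python
x = {'bonnie': 82, 'eric': 100, 'lynxi': 12}

# .get returns None (or a default of our choosing) instead of failing when the key is missing
found = x.get('eric')            # 100
missing = x.get('donald')        # None
fallback = x.get('donald', 0)    # 0

# .pop removes an entry and gives back its value
removed = x.pop('lynxi')         # 12, and x is now {'bonnie': 82, 'eric': 100}

print(found, missing, fallback, removed, x)
```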

## Control flow: "if" statements

"if" statements are one of the most essential concepts in any programming language because they allow the code to make decisions. Without "if" statements, programs would pretty much be like large, complex calculators. The format of an if statement is as follows:
```python
if [boolean expression]:
    [what to do when the boolean expression evaluates to True]
else:   # optional
    [what to do when the boolean expression evaluates to False]
```

In [98]:
# Using the data structure above, let's output something different depending on whether or not someone passed the course
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
pass_mark = 60

if course_marks['lynxi'] > pass_mark:
    print('Lynxi passed the course!')
    # Note that we can have as many lines inside here as we want
else:
    print('Lynxi failed :(')

# The else statement is optional. Often, you want to do nothing if the condition is not met
if course_marks['lynxi'] > pass_mark:
    print('Lynxi passed the course!')
    # Note that we can have as many lines inside here as we want

Lynxi failed :(
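
When there are more than two cases, `elif` (short for "else if") lets us chain several conditions together; only the first branch whose condition is True gets executed. A minimal sketch, where the grade thresholds are made up for illustration:

```python
mark = 82    # e.g. bonnie's mark from above

# Conditions are checked from top to bottom, and only the first match runs
if mark >= 90:
    grade = 'A'
elif mark >= 80:
    grade = 'B'
elif mark >= 60:
    grade = 'C'
else:
    grade = 'F'

print(grade)
```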


## Control flow: "for" loops
If all we were able to do with lists, tuples, and dictionaries was store data in them, they would essentially
just be useful for organizing our code. Luckily, we can iterate through them using "for" loops. The "for" loop
has the following format:
```python
for [variable] in [iterable data structure (like a list)]:
    [what you want to do using the current value each iteration]
```

In [113]:
# Let's compute an average course mark using different data structures

# Iterating through lists (or tuples)
course_marks = [82, 100, 12]
marks_sum = 0
for mark in course_marks:    # ("mark" takes on the value 82 the first time the loop executes, then 100, then 12)
    marks_sum += mark
    # At the end of what's in the "body" of the "for" loop, it starts again. This happens until we reach the end of the list.
mean = marks_sum / len(course_marks)
print(mean)

# Iterating through lists (or tuples) by their index (this is sometimes useful, for instance if you want to change the values in the list)
course_marks = [82, 100, 12]
for i in range(len(course_marks)):    # range(len(course_marks)) produces the ints going from 0 to [len(course_marks) - 1]
    course_marks[i] -= 10
print(course_marks)

# Iterating through dictionary keys
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
marks_sum = 0
for name in course_marks:
    mark = course_marks[name]
    marks_sum += mark
mean = marks_sum / len(course_marks)

# Iterating through dictionary values
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
marks_sum = 0
for mark in course_marks.values():
    marks_sum += mark
mean = marks_sum / len(course_marks)

# Iterating through dictionary keys and values simultaneously
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
marks_sum = 0
for name, mark in course_marks.items():
    print('{} has mark: {}'.format(name, mark))
    marks_sum += mark
mean = marks_sum / len(course_marks)

64.66666666666667
[72, 90, 2]
bonnie has mark: 82
eric has mark: 100
lynxi has mark: 12
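
Python's built-ins can shorten two of the patterns above: `sum` adds up the values of a list, and `enumerate` gives us the index and the value at the same time. A minimal sketch of the same computations:

```python
course_marks = [82, 100, 12]

# sum() replaces the loop that accumulated marks_sum
mean = sum(course_marks) / len(course_marks)
print(mean)

# enumerate() yields (index, value) pairs, so we don't need range(len(...))
for i, mark in enumerate(course_marks):
    course_marks[i] = mark - 10
print(course_marks)
```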


## Control flow: "while" loops
Sometimes, we don't want our loop to iterate through the values of some data structure, but
instead want it to execute until some condition is no longer met. For this, we use a "while" loop, which has the
following format:
```python
while [boolean expression]:
    [what you want to do each iteration]
```

In [114]:
# Let's keep bumping up everyone's grade until we have an average of 80 or above
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}

marks_sum = 0
for mark in course_marks.values():
    marks_sum += mark
mean = marks_sum / len(course_marks)

while mean < 80:    # Stop the loop once the variable "mean" is 80 or above
    # Mean is below 80, so increase everyone's mark by 1 (but don't go past 100!)
    marks_sum = 0
    for name, mark in course_marks.items():    # FYI we can nest loops and "if" statements as much as we like
        if mark < 100:
            mark += 1
            course_marks[name] = mark
        marks_sum += mark
    mean = marks_sum / len(course_marks)
    
print(course_marks)

{'bonnie': 100, 'eric': 100, 'lynxi': 40}


*Warning*: If you accidentally write a condition that will always evaluate to True (for instance, by forgetting to recompute `mean`
at the end of the loop), the loop will keep running forever. This is called an **infinite loop** (I made this mistake when writing
this tutorial!).
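
One defensive habit while you are still debugging a "while" loop is to add a second condition that caps the number of iterations. This is only a sketch; `max_iterations` and its value are arbitrary choices made for illustration:

```python
marks = [82, 100, 12]
max_iterations = 1000    # Hypothetical safety limit, far more than we expect to need
iterations = 0

mean = sum(marks) / len(marks)
while mean < 80 and iterations < max_iterations:    # Stops when either condition fails
    for i in range(len(marks)):
        if marks[i] < 100:
            marks[i] += 1
    mean = sum(marks) / len(marks)    # Forgetting this line would no longer loop forever
    iterations += 1

print(marks, iterations)
```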

## Functions
Parts of code are bound to be reused. For instance, above, we keep rewriting the same exact code for computing the mean
of a class's marks. Not only does this make our code longer, but it also makes it less readable. Every time we see
the code for computing the mean, we have to read it a bit to know what it does.

A **function** lets us define a chunk of code that takes some **inputs** and returns some **outputs**. Think of it as
a mini-program that we can reuse as many times as we want in the rest of our code. This is one of the most powerful
tools at your disposal as a programmer, and you should try to make a function out of any chunk of code that shares a
concrete purpose. As a general rule, if you can give a simple name to what the chunk of code does, it should be a function.
This is called **modular programming**, and it will make your code easier both to write and to understand.

The format of a function definition is as follows:
```python
def function_name(argument_1, argument_2, etc.):
    [do something using the arguments]
    return val_1, val_2, etc.
```
Note that a function need not take in any arguments at all, and need not return anything.

Elsewhere in our program, we can "call" our function as follows, substituting any values we want
for the arguments:
```python
output = function_name(value_1, value_2, etc.)
```

To illustrate the concept, let's rewrite the code above using functions.

In [122]:
# Takes as input a list of numbers, and returns their mean
def mean(numbers):
    numbers_sum = 0
    for num in numbers:
        numbers_sum += num
    numbers_mean = numbers_sum / len(numbers)
    return numbers_mean

# Takes in a dictionary of string:number pairs and increments each number by "increment",
# without letting any number go past 100.
# By default, we are setting "increment" to 1 (i.e. if this function is "called" without specifying
# a value for "increment" it will just take on the value 1)
def increment_marks(course_marks, increment=1):
    for name, mark in course_marks.items():
        if mark < 100:
            course_marks[name] = min(mark + increment, 100)
    # This function doesn't return anything (it modifies the dictionary it was given)
    
# Let's use our functions to bump up everyone's grade until we have an average of 80 or above.
# Notice how much more readable it is than what we had in the previous code, because we've
# split up the individual pieces of logic into their own functions
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
course_mean = mean(course_marks.values())
while course_mean < 80:
    increment_marks(course_marks)
    course_mean = mean(course_marks.values())
print(course_marks)

# Let's pass in a value for the "increment" argument in the "increment_marks" function instead of using the default
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
course_mean = mean(course_marks.values())
while course_mean < 80:
    increment_marks(course_marks, 5)    # Increment the marks by 5 at a time so that the loop runs faster (fewer iterations)
    course_mean = mean(course_marks.values())

# We can also call functions with their argument names. This makes it easier for people reading
# the code to understand what the function does without having to go look at the argument names
# in the function's definition. This is most useful when a function has many arguments
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
course_mean = mean(numbers=course_marks.values())
while course_mean < 80:
    increment_marks(course_marks, increment=5)    # We can specify the names of only some of the arguments if we want. Here, we only specify it for "increment"
    course_mean = mean(numbers=course_marks.values())

{'bonnie': 100, 'eric': 100, 'lynxi': 40}
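
The function format shown earlier allows returning several values at once, which none of the functions above do. Here is a minimal sketch; `mark_range` is a made-up helper, and it leans on the built-in `min` and `max` functions:

```python
# Takes as input a collection of numbers, and returns the smallest and the largest
def mark_range(marks):
    return min(marks), max(marks)    # The two values are returned together as a tuple

course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}

# We can "unpack" the returned tuple straight into two variables
lowest, highest = mark_range(course_marks.values())
print(lowest, highest)
```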


## Errors and exception handling
When we write a line of code that the computer couldn't execute, the program
stops running and we get an **error**. This error will tell us what line of
code caused the problem, and ideally will also give us some useful hints
about what went wrong. Let's look at an example.

In [123]:
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
donalds_mark = course_marks['donald']

KeyError: 'donald'

We got a `KeyError`. This means that we were trying to access an entry for which there was no existing
key in our dictionary (i.e. it doesn't contain an entry for 'donald'). In this case, the error didn't give
us much detailed information, because `KeyError` is such a common occurrence and its cause is so obvious.

If we expect certain errors to occur in our program and don't want the program to stop when they are encountered,
we can "catch" them. The syntax for doing so looks like this:
```python
try:
    [write the code that might cause an error]
except (error_type_1, error_type_2, etc.) as error_variable:
    [the code that you want to run when the error occurs]
```
If an error occurs within the `try` block, **and** the error is of a type that we're trying to
catch (e.g. error_type_1), then the code within the `except` block executes and the program will
continue to run normally. In addition, `error_variable` will be an exception object that contains
information about the error that occurred, in case we want to do something with it (e.g. log
its occurrence in a file). For instance, we could catch the above error using the following:

In [126]:
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
try:
    donalds_mark = course_marks['donald']
except KeyError as e:
    print('Uh oh, an error occurred: {} with type {}'.format(e, type(e)))

Uh oh, an error occurred: 'donald' with type <class 'KeyError'>


Catching exceptions like this is common and perfectly acceptable in Python (the style is often summed up as "easier to ask for forgiveness than permission").
The alternative is to use an `if` statement to check beforehand whether or not an error could happen ("look before you leap"), which can read more clearly in simple cases like this one.
For instance, we could rewrite the above as:

In [127]:
course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
if 'donald' in course_marks:
    donalds_mark = course_marks['donald']
else:
    print('Uh oh, donald is not in the course')

Uh oh, donald is not in the course
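
A single `except` can also catch several error types at once by listing them in a parenthesized tuple. Here is a minimal sketch; `safe_ratio` is a made-up helper for illustration:

```python
# Returns marks[name] / total, or None if the name is missing or total is 0
def safe_ratio(marks, name, total):
    try:
        return marks[name] / total
    except (KeyError, ZeroDivisionError) as e:    # Either error type lands here
        print('Could not compute the ratio: {!r}'.format(e))
        return None

course_marks = {'bonnie': 82, 'eric': 100, 'lynxi': 12}
ok = safe_ratio(course_marks, 'eric', 200)            # 0.5
no_such_name = safe_ratio(course_marks, 'donald', 200)    # None (KeyError was caught)
zero_total = safe_ratio(course_marks, 'eric', 0)      # None (ZeroDivisionError was caught)
```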


## Copying
Variable assignment in Python never copies an object. It simply makes the variable on the left-hand side a name for
whatever object is on the right-hand side, so two variables can end up being **aliases** (different names for the same thing).
Whether you ever notice this depends on whether the object can be modified in place. Consider the examples below.

In [128]:
x = 5
y = x
y = 7
print('x = {}, y = {}'.format(x, y))

x = 5, y = 7


When we wrote `y = x`, both names referred to the same `int` object. Writing `y = 7` afterwards did not modify that object;
it made `y` a name for a different one, so `x` did not change. Because `int`s (like `float`s, `str`s, and `tuple`s) are
**immutable**, there is no way to modify them in place, and in practice they behave as if they were copied on assignment.

In [129]:
x = [3, 5, 1]
y = x
y[1] = 12
print('x = {}, y = {}'.format(x, y))

x = [3, 12, 1], y = [3, 12, 1]


When we wrote `y = x`, `y` became an **alias** of `x`: two different variable names both referencing the same `list` object.
Unlike in the previous example, `y[1] = 12` modifies that object in place, so the change is visible through both names.

Aliasing only matters for **mutable** data types such as `list` and `dict`, which tend to be the more complex ones. Luckily, if we want
an independent copy of such an object, there is an easy solution using the built-in `copy` module.

In [134]:
from copy import copy # from the "copy" module (built-in python file), import the "copy" function for use in our program

x = [3, 5, 1]
y = copy(x)
y[1] = 12
print('x = {}, y = {}'.format(x, y))

x = [3, 5, 1], y = [3, 12, 1]


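Unfortunately, the `copy` function does not get applied recursively: it creates a new outer list, but the entries of that list are still the same objects as in the original. Consider the following.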

In [140]:
from copy import copy

# This works fine (x does not get modified)
x = [[1, 2], [3, 4], [5, 6]]    # Lists within a list
y = copy(x)
y[1] = [7, 8]
print('x = {}, y = {}'.format(x, y))

# This does not work (x gets modified)
x = [[1, 2], [3, 4], [5, 6]]    # Lists within a list
y = copy(x)
y[1][0] = 7
print('x = {}, y = {}'.format(x, y))

x = [[1, 2], [3, 4], [5, 6]], y = [[1, 2], [7, 8], [5, 6]]
x = [[1, 2], [7, 4], [5, 6]], y = [[1, 2], [7, 4], [5, 6]]


To get around this, we simply use the `deepcopy` function from the `copy` module.

In [142]:
from copy import deepcopy

x = [[1, 2], [3, 4], [5, 6]]    # Lists within a list
y = deepcopy(x)
y[1][0] = 7
print('x = {}, y = {}'.format(x, y))

x = [[1, 2], [3, 4], [5, 6]], y = [[1, 2], [7, 4], [5, 6]]
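
If you are ever unsure whether two variables refer to the same object, the `is` operator checks exactly that (whereas `==` only checks that the contents are equal). A short sketch comparing plain assignment, `copy`, and `deepcopy`; the variable names are just for illustration:

```python
from copy import copy, deepcopy

x = [[1, 2], [3, 4]]
alias = x             # Plain assignment: same object
shallow = copy(x)     # New outer list, but the inner lists are shared
deep = deepcopy(x)    # New outer list and new inner lists

print(alias is x)             # True
print(shallow is x)           # False
print(shallow[0] is x[0])     # True (this is why modifying y[1][0] above also changed x)
print(deep[0] is x[0])        # False
print(deep == x)              # True (equal contents, different objects)
```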


## Bonus data types: date and time
Python contains built-in data types for handling date and time. This is great, because as any programmer eventually learns, 
dates and times are one of the most annoying things to deal with due to all the various exceptions (think time zones, leap years, daylight savings, etc.).

Here, we're just going to go over some **very** basic uses of the `datetime` module in Python. For more, see [this helpful tutorial](https://www.programiz.com/python-programming/datetime).

In [165]:
import datetime

# Get the current time in your computer's timezone
now = datetime.datetime.now()
print(now)

# Get parts of the datetime
current_year = now.year
print(current_year)

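# Get the current time in the standard UTC timezone
# (Note: utcnow() is deprecated as of Python 3.12; datetime.datetime.now(datetime.timezone.utc) is the recommended replacement)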
utcnow = datetime.datetime.utcnow()
print(utcnow)

# Just get the current date
today = datetime.date.today()
print(today)

# Create your own datetime object
some_date = datetime.datetime(year=1994, month=11, day=23, hour=11)  # Can also specify minute, etc.
print(some_date)

# Compute the passage of time up to sub-millisecond precision
start = datetime.datetime.now()
x = 0
for i in range(1000):
    x += 1
end = datetime.datetime.now()
elapsed_time = end - start
print(elapsed_time.total_seconds())

2020-09-13 02:05:11.376362
2020
2020-09-13 06:05:11.376572
2020-09-13
1994-11-23 11:00:00
0.000175
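
A few more basics that tend to come up quickly: shifting a date by some amount of time, and converting between datetimes and strings. A minimal sketch, where the format string is just one possible choice:

```python
import datetime

some_date = datetime.datetime(year=1994, month=11, day=23, hour=11)

# Adding a timedelta shifts a datetime by a fixed amount of time
one_week_later = some_date + datetime.timedelta(days=7)
print(one_week_later)

# Converting a datetime to a string, and a string back to a datetime
formatted = some_date.strftime('%Y-%m-%d %H:%M')
parsed = datetime.datetime.strptime(formatted, '%Y-%m-%d %H:%M')
print(formatted, parsed == some_date)

# The current time in UTC as a timezone-aware object
now_utc = datetime.datetime.now(datetime.timezone.utc)
print(now_utc.tzinfo)
```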
