# Class 2 - 11.3.19


## Conditionals and Iteration

Even though you've already covered these subjects on your own in the first class and in your homework, we'll go through them (quickly) one more time, for the sake of completeness.

### The `if` Statement

In [1]:
# Notice the indentation and colon
x = 10
if x > 10:
    print("x is bigger than 10")

In [2]:
y = 11
if x > y:
    print("x")
else:
    print("y")

y


In [3]:
# Chained conditionals
z = 12
if x > y:
    print("x")
elif x > z:
    print("x > z")
elif z < y:
    print("z is small")
    if z < x:
        print("wow, z IS small")
# No need for else

### Iteration
One of Python's strongest features.

#### How would one loop in other programming languages? Let's examine MATLAB:


```matlab
% Sum all items in array
data = [10, 20, 30];
result = 0;
for idx = 1:numel(data)
    result = result + data(idx);
end
```
    


In [4]:
# In Python we can iterate over the values themselves
data = [10, 20, 30]
result = 0
for value in data:
    print(value)
    result += value

print("The result is:", result)

# We iterate over the values with the "in" keyword.

10
20
30
The result is: 60


In [5]:
# Of course we can iterate over indices
indices = [0, 1, 2]  # 0-based indexing
data = [10, 20, 30]
result = 0
for idx in indices:
    result += data[idx]
print(result)

60


In [6]:
# In some cases you really do need to use indices. In these cases, the range() function is your friend.
for i in range(10):
    print(i)

# range() returns a lazy sequence (similar to a "generator", but not quite one), which
# means that it doesn't create the actual list of items when it's called - the values
# are produced only when iterated over.
# We'll discuss generators at a later stage of our course.
range(10000000)  # requires nearly 0 memory

0
1
2
3
4
5
6
7
8
9


range(0, 10000000)
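To make the memory claim a bit more concrete, here is a small sketch that compares a `range` object with the list built from it, using `sys.getsizeof`. The exact byte counts depend on your Python build, so only the comparison matters; the variable names are just for illustration.

```python
import sys

lazy = range(1_000_000)   # stores only start, stop and step
eager = list(lazy)        # stores a reference to every single item

range_size = sys.getsizeof(lazy)
list_size = sys.getsizeof(eager)

print(f"range object: {range_size} bytes")
print(f"list object:  {list_size} bytes")

# A range still behaves like a sequence: it supports len(), indexing and "in"
print(len(lazy), lazy[10], 999_999 in lazy)
```

Note that `sys.getsizeof` reports only the size of the container itself, not of the integers it refers to, so the real gap is even larger.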

In [7]:
# Range has a start, stop and step arguments
list(range(1, 10, 2))

[1, 3, 5, 7, 9]

In [8]:
# We can iterate over nearly anything - two examples follow:
tup = (1, 2, True, 3.0, 'four')
for item in tup:
    print(item)

print('--------')
string = "abcdef"
for letter in string:
    print(letter)

1
2
True
3.0
four
--------
a
b
c
d
e
f


In [9]:
# Getting both the index and the value of some sequence is done using the built-in "enumerate" function:
tup = (1, 2, True, 3.0, 'four')
for index, item in enumerate(tup):
    print("The index of the item %s is %d" %(item, index))
    

The index of the item 1 is 0
The index of the item 2 is 1
The index of the item True is 2
The index of the item 3.0 is 3
The index of the item four is 4


In [10]:
# By the way, it means that strings are also sliceable
string = 'abcde'
print(string[0])
print(string[-1])
print(string[-1:-3:-1])

a
e
ed


In [11]:
# Dictionary iteration - the default is over the keys
print('Key iteration:')
dict1 = {'a': 1, 'b': 2, 'c': 3}
for key in dict1:
    print(key)
    
print('-----')
print('Value iteration:')
# If you wish to iterate over the values you have to be explicit:
for val in dict1.values():
    print(val)

print('-----')
print('Pairs iteration:')
# Iterating over the pair is also easy:
for key, val in dict1.items():
    print(key, val)

Key iteration:
a
b
c
-----
Value iteration:
1
2
3
-----
Pairs iteration:
a 1
b 2
c 3


### While Loop

In [12]:
def countdown(n):
    """ Explodes a bomb when n is zero. """
    while n > 0:
        print("{}...".format(n))
        n = n - 1
    print("BOOM!")
    return True

In [13]:
n = 10
val = countdown(n)

print(val)

10...
9...
8...
7...
6...
5...
4...
3...
2...
1...
BOOM!
True


In [14]:
# Stopping a loop is done with break
data = [1., 1., 1., 4.,]
for datum in data:
    if datum != 1.:
        print(datum)
        break
    print("Still 1...")

Still 1...
Still 1...
Still 1...
4.0


## Formatting

We can print variables and text together in several different ways:

In [15]:
# Not recommended as it doesn't allow for customizations
a = 42
print("The value of a is", a)

The value of a is 42


In [16]:
# Older version, similar to other languages
a = 42
b = 32
print("The value of a is %d, while the value of b is %d" % (a, b))

The value of a is 42, while the value of b is 32


In [17]:
# A decent option
a = 42
b = 32
print("The value of a is {}, while the value of b is {}".format(a, b))

The value of a is 42, while the value of b is 32


In [18]:
# Only for Python 3.6+ - but it's pretty cool. It's called f-strings
a = 42
b = 32
print(f"The value of a is {a}, while the value of b is {b}")

The value of a is 42, while the value of b is 32


In [19]:
# Another example
a = 42
b = 32
print(f"The value of a is {a:.2f}, while the value of b is not {b + 1}.")
# You can write any expression you'd like inside the curly brackets.

The value of a is 42.00, while the value of b is not 33.


Throughout the course you'll see me using mostly the f-string option, which is the most readable assuming you only work on Python 3.6+.
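The `:.2f` part in the last example is a format specifier. As a rough illustration of what else can go after the colon, here are a few common specifiers; the variable names and values are made up for this sketch.

```python
pi_approx = 3.14159
count = 42
big = 1234567

two_decimals = f"{pi_approx:.2f}"   # fixed-point, two digits after the decimal point
padded = f"{count:05d}"             # integer, zero-padded to a width of 5
grouped = f"{big:,}"                # thousands separator
aligned = f"{'spam':>8}"            # right-aligned in a field of width 8

print(two_decimals)
print(padded)
print(grouped)
print(aligned)

# The same specifiers work with str.format()
print("{:.2f}".format(pi_approx))
```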

## Comprehensions

If you're still not convinced about this whole Python business, comprehensions might be the thing that will flip you to the good side. Comprehensions are a different way to create lists and other iterables. They're usually faster and more readable than the alternatives.

Assume we wish to create a list with the squared values of the numbers in the range [0, 10). Let's observe the different implementation options we have:

In [20]:
# Populate the list the usual way
squares = []
for item in range(10):
    squares.append(item ** 2)
    
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


In [21]:
# A classic Python one-liner - list comprehension
squares = [x ** 2 for x in range(10)]

print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


This amazing piece of software iterates over the range with the variable `x`, and places its square inside a list, which is then assigned to the `squares` variable.

What other goodies do comprehensions allow? Filtering.

In [22]:
squares = [x ** 2 for x in range(10) if x != 8]

print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 81]


The `if` clause is evaluated for each value of `x`, and only the values that pass it end up in the list. Notice how similar to English this expression is?

Funnily enough, performance isn't hindered:

In [23]:
%timeit squares = [x ** 2 for x in range(100) if x != 8]

34.3 µs ± 1.9 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [24]:
%%timeit
squares = []
for item in range(100):
    if item != 8:
        squares.append(item ** 2)

41.1 µs ± 3.19 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [25]:
# Another example
strings = [cons.upper() for cons in 'abcdefghij' if cons not in 'aeiou']

print(strings)

['B', 'C', 'D', 'F', 'G', 'H', 'J']


Happily enough, comprehensions aren't limited to lists. You can also comprehend dictionaries, but for that you'll need the built-in `zip` function:

In [26]:
keys = 'abcde'
vals = [1, 2, 3, 4, 5]
bools = [True, True, False]
# Zip bundles up these iterables together, allowing iteration
for a, b, c in zip(keys, vals, bools):
    print(a, b, c)

a 1 True
b 2 True
c 3 False


In [27]:
# With zip we can bundle up the pairs of key-value
keys = 'abcde'
vals = [1, 2, 3, 4, 5]
dic = {key: val for key, val in zip(keys, vals) if key != 'c'}  # compare values with !=, not "is not"

print(dic)

# If we didn't want to use the "if" part of the comprehension - this is easier:    
dict3 = dict(zip(keys, vals))
print(dict3)

{'a': 1, 'b': 2, 'd': 4, 'e': 5}
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}


In [28]:
# And set comprehension
se = {item for item in [1, 1, 1, 5, 6, 7, 7]}

print(se)

{1, 5, 6, 7}


Lastly, list comprehensions can iterate over more than one iterable:

In [29]:
summed_list = [a + b for a in range(10) for b in range(10, 20)]

# Identical to:
"""
summed_list = []
for a in range(10):
    for b in range(10, 20):
        summed_list.append(a + b)
"""
print(summed_list)

[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28]


Once your eyes and brain are used to looking at comprehensions, they're the most "natural" way to iterate over iterables. Their brevity on the one hand, and readability on the other, are their most important features. That's why comprehensions are strongly encouraged when writing truly "Pythonic" code.

## Function Default Arguments

One thing we didn't show when introducing functions was the default argument value feature of function definitions.

Say you have a function which calculates the nth power of any number it's given. It obviously has to be given two inputs - the number and the power to which it will be raised. Here's such a function:

In [44]:
import math


def power_of(x, power):
    """ Raises x to the power of power """
    return math.pow(x, power)

This function works wonderfully, but we're assuming that most of the time users will raise `x` to the power of two. If we want to save the caller the hassle, we can use default arguments when defining the function:

In [45]:
def power_of(x, power=2):
    """ Raises x to the power of power """
    return math.pow(x, power)

We can now call this function either with the power argument or without it, which will default it to 2:

In [47]:
print(power_of(10, 3))

print(power_of(10))

1000.0
100.0
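Arguments with default values can also be passed by name. The sketch below repeats the `power_of` definition from above so that it runs on its own, and shows the different calling styles. Keep in mind that parameters with defaults must come after the parameters without them in the function definition.

```python
import math


def power_of(x, power=2):
    """Raise x to the given power (squares x by default)."""
    return math.pow(x, power)


squared = power_of(3)              # positional argument, default power
cubed = power_of(3, power=3)       # the default is overridden by keyword
swapped = power_of(power=4, x=2)   # keyword arguments can come in any order

print(squared, cubed, swapped)
```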


# Version Control

## Introduction

Version control is the active management of the history of your source code. It is an essential part of every developer's work cycle, for both small and large projects.

With version control, at any point in time during the work on your code you can decide to "commit" the change. Committing your code means that the system will remember the current state of your work (all files in a folder), and will allow you to return to this exact state of your codebase whenever you wish.

Using version control is orthogonal to the traditional save operation. Saving records the current state of your codebase, but usually doesn't allow you to "go back in time" to previous versions.

This property is useful on many occasions. For example, suppose you have a working version of some function, but you wish to make it better - add a feature, or change its internal structure (refactoring). Version control allows you to record this point in time - when you have a good, functioning function - and change the working copy of the function however you'd like. If you fail to refactor the function you can simply jump back to the latest working version.

There are many version control applications, but the most popular one is Git, developed by Linus Torvalds (the Linux guy) in 2005. In this course we'll be using Git with GitHub.com - an online hosting service for your code.

There are numerous great Git and GitHub tutorials ([here's one](https://github.com/pluralsight/git-internals-pdf)), but for our course we'll be using only the most basic features of Git and GitHub, so going through all features and intricacies of the software is unnecessary. `PythonSetup.md` contains instructions on how to set up Git and GitHub, which should be enough for homework submission.

## A Simple Walkthrough (command line)

Assume I have a directory that I wish to track with Git:
- /home/
    - my_proj/
        - `__init__.py`
        - `class1.py`
        - `class2.py`
        - README.md
    - etc/
    - etc2/

Git is usually run from the command line. Navigate (in the command line) to the `my_proj` folder (using `cd`) and write `git init`. This signifies that the current folder should be tracked by Git.

Next I wish to tell Git which files to track, which is called "adding the files". I can either name all specific files inside `my_proj`, or just write `git add .`. The dot means "the current folder and everything in it".

Now Git remembers that inside the `my_proj` folder, all files should be tracked.

Finally I wish to commit the initial state of my repository (repo). To do so I write `git commit -m "Initial commit"`.

The `-m` flag supplies the commit message - it should describe what has changed in this commit since the last one.

If you create new files, you have to remember to add them with `git add new_file.py`, otherwise they won't be tracked and committed.

This creates a local repo on our computer. If the whole folder is erased, our backups and commit history are erased with it.

To make an online backup for a repository you use `git push`. I won't go over it right now, but `HW1.md` has instructions about it, and you can also ask me if there's anything unclear. Since GitHub is the only available way to submit homework in this course, learning how to push commits is essentially mandatory :)
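The three commands from the walkthrough (`git init`, `git add .` and `git commit -m`) can be tried safely in a scratch folder. The following is only a sketch: it drives Git from Python with `subprocess`, assumes Git is installed (it returns `None` otherwise), and uses a made-up identity and a temporary folder so that it never touches your real projects. The helper names `git` and `demo` are hypothetical.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """Run one git command inside cwd and return its standard output."""
    # The -c options set a throwaway identity so that committing works anywhere.
    cmd = [
        "git",
        "-c", "user.name=Example",
        "-c", "user.email=example@example.com",
        "-c", "commit.gpgsign=false",
        *args,
    ]
    done = subprocess.run(cmd, cwd=cwd, check=True, capture_output=True, text=True)
    return done.stdout


def demo():
    """Create a scratch repo, commit one file and return the commit messages."""
    if shutil.which("git") is None:
        return None  # Git isn't installed, so there's nothing to demonstrate
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "class1.py").write_text("print('hello')\n")
        git("init", cwd=folder)                            # start tracking the folder
        git("add", ".", cwd=folder)                        # stage every file in it
        git("commit", "-m", "Initial commit", cwd=folder)  # record a snapshot
        return git("log", "--format=%s", cwd=folder).splitlines()


messages = demo()
print(messages)
```

In day-to-day work you would simply type the same three commands in the terminal, inside your project folder.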

# The Python Stack

## How to Run Python?

### How does MATLAB do it?

MATLAB has its excellent application GUI which essentially everyone uses all the time.

But all interpreted programming languages, including Python and MATLAB, can be run from the command line. 

In the case of MATLAB, though, rarely do you see people running it from the command line:

![MATLAB from the CL](matlab_cl.png)

It's obviously possible, but less comfortable than the standard GUI we all know.

If all you wish to do is run a MATLAB `.m` file, you can also do it from the command line by simply writing `matlab -r myfile` (the name of the script, without the `.m` extension).

### Python is very similar
Just like MATLAB, the quickest option to run Python is from the command line, by simply writing `python`:

![Python CL](python_cl.png)

Running Python scripts, like `myfile.py`, is as easy as `python myfile.py`.

### Python "GUI"

However, more often than not we wish to write a script, experiment with it a little, and then run it, just as some of us are used to doing in the MATLAB environment. Python, being non-proprietary, has several similar solutions that go by the general name IDE - integrated development environment.

### Spyder
Spyder is an open-source science-oriented IDE that was designed with MATLAB's GUI in mind. It contains many similar functions and might look very similar at first glance:

![Spyder IDE](extra_material/spyder.png)

Unfortunately, Spyder's main financial support was cut off in November 2017. It's still in active development, but at a much slower pace, and its future is unclear at the moment.

### PyCharm
PyCharm is a full-blown IDE which contains many advanced features that any modern IDE has, like refactoring capabilities, testing suites and more. It has a free community edition, and a paid professional edition - which is actually free for poor students like us:

![PyCharm IDE](extra_material/pycharm.jpg)

### Visual Studio Code
VSCode is a free, open-source editor for nearly all existing programming languages. It doesn't have all the features that PyCharm has, but it's great nonetheless. The fact that it's a lightweight editor is an advantage for some.

![VS Code](extra_material/vs_code.jpg)

### Jupyter Notebook
While not technically an IDE, Jupyter is designed with data exploration in mind. It's less suited for writing long, complex applications, but great when it comes to a quick "plot-n-go" on some data you recently acquired.

### What's Best?
There's no __best__ option, but in broad strokes:
* When you wish to do something "quick-and-dirty", you should probably use Jupyter. This can be thought of as the stand-in replacement for "smart" Excel spreadsheets.
* For more complicated analysis scripts I prefer VSCode, which puts enough power at your fingertips without being too complicated.

# The Module System and Its Uses

## Namespaces

One of the largest differences between Python and MATLAB is the concept of namespaces. At first it's also one of the most aggravating differences. But I promise that sooner rather than later you'll understand its importance, and start disliking MATLAB's approach.

Namespaces are a system designed to avoid ambiguity. We have many Hagais, but far fewer Hagai Har-Gils. In computers, we can have many files with the name `test1.py` saved in _different folders_ of our hard drive. In contrast - we can only have one `test1.py` per folder, since the file system dislikes these collisions.

Similarly, in Python we have to declare which namespace our function belongs to. Usually functions aren't included in the scope of the program until we `import` them. After importing we can use the function with its "pathname", i.e. the module it belongs to. This way, the same identifier (function name, class name, etc.) can be used multiple times in different modules.
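As a small illustration of the same identifier living in two different modules, the standard library has a `sqrt` function both in `math` (real numbers) and in `cmath` (complex numbers). The variable names below are just for this sketch.

```python
import cmath
import math

# Both modules define a function called sqrt, but the module prefix
# tells Python (and the reader) which one is meant.
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-4)

print(real_root)        # 2.0
print(complex_root)     # 2j
print(math.sqrt is cmath.sqrt)   # False - two different functions
```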

You might have realised that you're already familiar with this topic from your previous usage of functions. The following example should be clear:

In [30]:
# Namespaces are defined inside functions
a = 1
def f(a):
    """ Scopes and namespaces exemplified """
    a = 2
    print(f"Inside the function, a={a}")  # Python 3.6 f-strings are kinda awesome...
f(a)
print(f"But outside of it, a={a}")

Inside the function, a=2
But outside of it, a=1


Namespaces for modules work the same way. A Python module is an object with variables, classes and functions in it. A module is typically a single `.py` file, while a folder containing files and other sub-folders is called a package. As we've said, a module is brought into scope (into our namespace) with the import statement:

In [31]:
# The objects "math" or "pi" do not exist here yet.
import math
math.pi

3.141592653589793

`pi` is only defined in the context of the `math` module. Without the import statement there's no special meaning attached to either `math` or `pi`, and they can be used as normal variable names.

In [32]:
import os
os.sep

'/'

When we try to use a function, class or constant from a package (module) we didn't import, we'll receive a `NameError`. That's also the exception raised when we try to use a variable that didn't exist beforehand.
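Here's a minimal sketch of that behaviour, using the standard library's `statistics` module as an arbitrary example of something that hasn't been imported yet:

```python
try:
    statistics.mean([1, 2, 3])   # "statistics" was never imported
except NameError as error:
    message = str(error)
    print("Caught a NameError:", message)

import statistics

average = statistics.mean([1, 2, 3])   # works once the module is in scope
print(average)
```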

## The Default Python Namespace

Python comes with many functions already defined, which we've obviously met:
```python
list(), dict(), set(), int(), float()
```
and it also has reserved keywords, which can't be used as names for your own objects:
```python
if, class, def, return
```
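If you're ever unsure whether a name is a reserved keyword or "just" a built-in, the standard library can tell you. This is only a quick sketch; the names in the tuple are arbitrary examples.

```python
import builtins
import keyword

for name in ("int", "list", "class", "def", "return"):
    kind = "keyword" if keyword.iskeyword(name) else "built-in"
    print(f"{name:>6}: {kind}")

# Built-in names live in the builtins module and can be (accidentally) rebound,
# while keywords cannot be used as identifiers at all.
int_is_builtin = hasattr(builtins, "int")
int_is_keyword = keyword.iskeyword("int")
print(int_is_builtin, int_is_keyword)
```

This is why writing `list = [1, 2, 3]` is legal (though a bad idea, since it hides the built-in `list`), while `class = 3` is a syntax error.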

The true power of Python comes from its ecosystem. It's one of the largest and most comprehensive around, certainly for a scripting language, and allows you to do basically whatever you want with minimum effort.

In [33]:
import antigravity  # try this at home!

![Antigravity XKCD](./extra_material/antigravity.png)

The standard library of Python includes the packages that come with every installation of Python. I didn't have to do anything special to import the `math` module - it was just there, "waiting" for me to import it. 

Most of the functions you'd expect a programming language to have are indeed included in the standard library, available automatically to everyone who downloaded Python. Other modules, including many popular ones, are available online. We'll discuss later how to import them.

I'll let you discover for yourself what is included in the standard library, but some of the highlights include:

In [34]:
import pathlib
p = pathlib.Path(".")
print(p.absolute())

/home/hagaihargil/Classes/PythonCourseMaster/classes


In [35]:
import urllib.request
url = urllib.request.urlopen("https://www.google.com")
print(url.read())

b'<!doctype html><html dir="rtl" itemscope="" itemtype="http://schema.org/WebPage" lang="iw"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title><script nonce="bIJi6JUgvDKqxMAjHx8ztQ==">(function(){window.google={kEI:\'lVWGXM3ABcyXmwXvkZi4Dw\',kEXPI:\'0,1353747,57,1957,2423,698,527,730,224,1575,30,927,301,805,1088,70,2,513,160,191,78,203,2334033,329476,1294,12383,4855,32692,15247,867,12163,14325,2196,363,3320,5505,1240,1196,266,5107,575,835,284,2,205,1101,2432,1361,4323,4968,773,2251,2820,3074,2,1970,2590,3601,669,535,515,1808,1397,81,7,491,620,29,2373,6722,904,307,874,412,2,554,33,3420,796,1220,38,363,557,584,134,35,120,1217,1364,484,47,1080,2736,1558,1245,258,2,631,217,2345,2,4,2,670,44,935,1252,598,1519,354,125,1051,1556,632,1136,3,275,350,25,938,2,46,108,21,318,55,4,175,20,2,526,783,349,1318,138,404,135,357,251,721,322,274,569,111,2,22,207,549,85,1

It can be quite irritating to write the full namespace address of all functions. To this end we have the `from` and `as` keywords:

In [36]:
from pprint import pprint as pp

In [37]:
pp(urllib.request.urlopen("https://www.google.com").read())

(b'<!doctype html><html dir="rtl" itemscope="" itemtype="http://schema.org/WebP'
 b'age" lang="iw"><head><meta content="text/html; charset=UTF-8" http-equiv="Co'
 b'ntent-Type"><meta content="/images/branding/googleg/1x/googleg_standard_colo'
 b'r_128dp.png" itemprop="image"><title>Google</title><script nonce="ospBrifJSf'
 b'7X9EvHan7i1w==">(function(){window.google={kEI:\'lVWGXKrhIsGNmwXAnbqIBQ\','
 b"kEXPI:'0,1353746,58,1957,2423,667,30,528,731,223,1575,30,1227,806,1088,69,33"
 b'7,179,350,80,55,147,2334025,329484,1294,12383,4855,32691,8162,7086,867,12163'
 b',16521,363,3320,1262,4243,2442,260,1028,4079,575,835,284,2,205,374,727,2432,'
 b'1361,4323,4967,774,2253,4741,1151,2,1970,2590,1021,2580,669,1050,1808,1397,8'
 b'1,7,491,620,29,2373,6264,458,47,1162,694,182,412,2,554,3451,798,1222,36,363,'
 b'557,681,37,36,119,1214,1214,85,68,484,47,552,4,19,1,504,2736,49,1509,1245,25'
 b'8,2,631,217,2345,2,4,2,670,46,2185,598,960,403,156,354,124,1161,1449,630,464'
 b',206,744,350,25,439,490,25,

In [38]:
# Enumerations are great, you should familiarize yourself with them
from enum import Enum
class Day(Enum):
    SUNDAY = 1
    MONDAY = 2
    TUESDAY = 3
    WEDNESDAY = 4
    THURSDAY = 5
    FRIDAY = 6
    SATURDAY = 7

d = Day.MONDAY
print(d)
pp(d)

Day.MONDAY
<Day.MONDAY: 2>


In [39]:
from random import randint, randrange

In [40]:
# Instead (or on top) of from:
import multiprocessing.pool as mpool

In [41]:
# A non-recommended version of importing is the star import
from math import *
print(pi)
print(e)
print(cos(pi))

3.141592653589793
2.718281828459045
-1.0
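Why is the star import discouraged? It dumps every public name of the module into your namespace, and some of them may silently replace names you already rely on. The sketch below runs the star import inside a scratch dictionary (via `exec`, purely to keep the demonstration contained) and inspects what arrived.

```python
import builtins
import math

namespace = {}
exec("from math import *", namespace)   # run the star import in a scratch namespace

imported = [name for name in namespace if not name.startswith("_")]
print(f"{len(imported)} names were added at once")

# One of them is "pow", which shadows the built-in pow inside that namespace.
shadowed = namespace["pow"] is math.pow
print(shadowed)

builtin_result = builtins.pow(2, 3)    # the built-in keeps integers as integers
math_result = namespace["pow"](2, 3)   # math.pow always returns a float
print(builtin_result, math_result)
```

With `import math` (or `from math import pi`) you always know exactly which names entered your namespace and where they came from.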


The list of all standard library modules in Python 3 is [here](https://docs.python.org/3/library/index.html). However, it's just a drop in the ocean that is the Python ecosystem. 

## `pip` and External Modules

As noted before, Python's gigantic ecosystem is one of its biggest strengths. And the fact that you can easily install many of these packages with a simple command-line instruction is even more important. The standard tool to do that is `pip`, a recursive acronym that stands for "`pip` installs packages".

`pip` itself is a Python program, __but it's not run from inside the Python interpreter.__ Instead it's run from a shell - the Windows command line (or PowerShell), for example, or the Mac's terminal - making it an external application to Python. Happily enough, basically all Python distributions come with `pip` pre-installed, so you don't have to install it yourself.

`pip`, like any package manager, has two main jobs. The first is to provide a convenient API to a package repository. In essence, it's a download manager for a single site - the [Python Package Index](https://pypi.python.org/pypi) (PyPI, pronounced _pai-pee-eye_), the official Python repository for packages.

PyPI holds installation files for the packages hosted in it, along with some metadata, like version number and dependencies. `pip` (and other tools we'll discuss soon) downloads the wanted package from PyPI, together with its dependencies, and installs them in a pre-defined location on our computer.

`pip`'s second important job is handling dependencies. Many packages rely on other packages, which in turn rely on other, more basic packages, finally leading to the basic Python interpreter. `pip` has to make sure to install all dependencies of the package you're currently after, and to avoid any collisions with other installed packages. For example, a common problem in the Python world is the Python 2-3 schism, which means that packages written for Python 2 can't run on a Python 3 interpreter, and vice versa. The package manager's job is to give you the right version of the package you're looking for.
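The metadata mentioned above (version numbers and dependencies) stays on your computer after a package is installed, and can be read from within Python. The following sketch doesn't run `pip` itself; it assumes Python 3.8 or newer, where the standard library's `importlib.metadata` module is available, and simply lists a few of the packages installed in the current environment. What it prints depends entirely on your installation.

```python
from importlib import metadata   # part of the standard library since Python 3.8

installed = []
for dist in metadata.distributions():
    name = dist.metadata.get("Name", "unknown")
    # dist.requires holds the declared dependencies (None if there are none)
    installed.append((name, dist.version, dist.requires))

installed.sort(key=lambda entry: entry[0].lower())

for name, version, requires in installed[:5]:
    print(f"{name} {version} depends on: {requires}")
```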

## File Structure

`import` statements are useful for more than just importing code - they're also our way of arranging our project's files. Here's the standard arrangement of files I'd like you to use throughout the course:

The base folder can contain many other files, including sample data, for example. The point here is that your actual code is confined to the `project_name` folder, which has an empty `__init__.py`. This file allows Python to import user-defined objects from that folder.

Thus, if you wish to use a function you defined in `class_1.py` in `class_2.py`, you should write inside `class_2.py` the following statement: `from class_1 import my_func` (note that the `.py` extension is dropped).
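To see this import at work without touching your own files, the sketch below builds a throwaway `project_name` folder in a temporary directory and runs `class_2.py` as a script. The contents of both files are made up for the illustration; only the file names and the import statement come from the description above.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    package = Path(base) / "project_name"
    package.mkdir()
    (package / "__init__.py").write_text("")   # empty, as described above
    (package / "class_1.py").write_text(
        "def my_func():\n"
        "    return 'hello from class_1'\n"
    )
    (package / "class_2.py").write_text(
        "from class_1 import my_func\n"
        "print(my_func())\n"
    )
    # Running class_2.py as a script puts its own folder on the import path,
    # which is why the plain "from class_1 import ..." resolves.
    finished = subprocess.run(
        [sys.executable, str(package / "class_2.py")],
        capture_output=True, text=True, check=True,
    )

output = finished.stdout.strip()
print(output)
```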