# Class 2 - 16.3.20


## Conditionals and Iteration

Even though you've learned it by yourself in the first class and while doing homework, we'll go through these subjects (quickly) one more time, for the sake of completeness.

### The `if` Statement

In [1]:
# Notice the indentation and colon
x = 10
if x > 10:
    print("x is bigger than 10")

In [2]:
y = 11
if x > y:
    print("x")
else:
    print("y")

y


In [3]:
# Chained conditionals
z = 12
if x > y:
    print("x")
elif x > z:
    print("x > z")
elif z < y:
    print("z is small")
    if z < x:
        print("wow, z IS small")
# No need for else

### Iteration

One of Python's strongest features.

#### How would one loop in other programming languages? Let's examine MATLAB:

```matlab
% Sum all items in array
data = [10, 20, 30];
result = 0;
for idx = 1:numel(data)
    result = result + data(idx);
end
```
    

In [4]:
# In Python we can iterate over the values themselves
data = [10, 20, 30]
result = 0
for value in data:
    print(value)
    result += value

print("The result is:", result)

# The "for ... in" syntax gives us the values themselves, one at a time.

10
20
30
The result is: 60


Note: I know that technically you can iterate over values in newer versions of MATLAB, but it's very clunky (row vs. column vectors) and doesn't work in some important cases (e.g. ``parfor``).

In [5]:
# Of course we can iterate over indices
indices = [0, 1, 2]  # 0-based indexing
data = [10, 20, 30]
result = 0
for idx in indices:
    result += data[idx]
print(result)

60


In [51]:
# In some cases you really do need to use indices. In these cases, the range() function is your friend.
for i in range(10):
    print(i)

# range is lazy (a bit like a "generator"), which means that it doesn't create the actual
# list of items when it's called - the items are produced only when iterated over.
# We'll discuss generators at a later stage of our course.
range(10_000_000_000)  # requires nearly 0 memory

0
1
2
3
4
5
6
7
8
9


range(0, 10000000000)

In [7]:
# Range has a start, stop and step arguments
list(range(1, 10, 2))

[1, 3, 5, 7, 9]
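
To make the "nearly 0 memory" remark a bit more concrete, here's a small sketch that compares a `range` object with the equivalent list using `sys.getsizeof`. The exact byte counts depend on your Python build, so only the comparison is shown; the variable names are just for illustration.

```python
import sys

lazy = range(1_000_000)          # stores only start, stop and step
eager = list(range(1_000_000))   # stores a million references

print(sys.getsizeof(lazy) < sys.getsizeof(eager))  # True

# A range still behaves like a sequence: it supports indexing and
# membership tests without ever building the full list.
huge = range(10_000_000_000)
print(huge[-1])        # 9999999999
print(5_000 in huge)   # True
print(len(lazy))       # 1000000
```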

We can iterate over nearly anything - two examples follow:


In [8]:
tup = (1, 2, True, 3.0, 'four')
for item in tup:
    print(item)

1
2
True
3.0
four


In [9]:
string = "abcdef"
for letter in string:
    print(letter)

a
b
c
d
e
f


In [10]:
# Getting both the index and the value of some sequence is done using the built-in "enumerate" function:
tup = (1, 2, True, 3.0, 'four')
for index, item in enumerate(tup):
    print("The index of the item %s is %d" %(item, index))
    

The index of the item 1 is 0
The index of the item 2 is 1
The index of the item True is 2
The index of the item 3.0 is 3
The index of the item four is 4


In [11]:
# By the way, it means that strings are also sliceable
string = 'abcde'
print(string[0])
print(string[-1])
print(string[-1:-3:-1])

a
e
ed


In [12]:
# Dictionary iteration - the default is over the keys
print('Key iteration:')
dict1 = {'a': 1, 'b': 2, 'c': 3}
for key in dict1:
    print(key)
    
print('-----')
print('Value iteration:')
# If you wish to iterate over the values you have to be explicit:
for val in dict1.values():
    print(val)

print('-----')
print('Pairs iteration:')
# Iterating over the pair is also easy:
for key, val in dict1.items():
    print(key, val)

Key iteration:
a
b
c
-----
Value iteration:
1
2
3
-----
Pairs iteration:
a 1
b 2
c 3


### While Loop

In [13]:
def countdown(n):
    """ Explodes a bomb when n is zero. """
    while n > 0:
        print("{}...".format(n))
        n = n - 1
    print("BOOM!")
    return True

In [14]:
n = 10
val = countdown(n)

print(val)

10...
9...
8...
7...
6...
5...
4...
3...
2...
1...
BOOM!
True


In [15]:
# Stopping a loop is done with break, and in order to continue execution from the start of the loop we use 'continue'
data = [1., 2., 1., 1., 4., 1.]
for datum in data:
    if datum == 2.:
        continue
    if datum != 1.:
        print(datum)
        break
    print("Still 1...")

Still 1...
Still 1...
Still 1...
4.0


## Formatting

We can combine variables and text in a printed message in several different ways:

In [16]:
# Not recommended as it doesn't allow for customizations
a = 42
print("The value of a is", a)

The value of a is 42


In [17]:
# Older version, similar to other languages
a = 42
b = 32
print("The value of a is %d, while the value of b is %d" % (a, b))

The value of a is 42, while the value of b is 32


In [18]:
# A decent option
a = 42
b = 32
print("The value of a is {}, while the value of b is {}".format(a, b))

The value of a is 42, while the value of b is 32


In [19]:
# Only for Python 3.6+ - but it's pretty cool. It's called f-strings
a = 42
b = 32
print(f"The value of a is {a}, while the value of b is {b}")

The value of a is 42, while the value of b is 32


In [20]:
# Another example
a = 42
b = 32
print(f"The value of a is {a:.2f}, while the value of b is not {b + 1}.")
# You can write any expression you'd like inside the curly brackets.

The value of a is 42.00, while the value of b is not 33.


Throughout the course you'll see me using mostly the f-string option, which is the most readable assuming you only work on Python 3.6+.
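
As a short sketch of the customizations mentioned above: the part after the colon inside the curly brackets is a format specification. The values below are made up for illustration.

```python
value = 1234.5678
name = "a"

print(f"{value:.1f}")      # one digit after the decimal point -> 1234.6
print(f"{value:,.2f}")     # thousands separator -> 1,234.57
print(f"{value:>12.2f}|")  # right-aligned in a 12-character field
print(f"{name!r}")         # uses repr() of the value -> 'a'
print(f"{42:05d}")         # zero-padded to width 5 -> 00042
```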

## Comprehensions

If you're still not convinced by this whole Python business, comprehensions might be the thing that will flip you to the good side. Comprehensions are a different way to create lists and other iterables. They're usually faster and more readable than the alternatives.

Assume we wish to create a list with the squared values of the numbers in the range [0, 10). Let's observe the different implementation options we have:

In [21]:
# Populate the list the usual way
squares = []
for item in range(10):
    squares.append(item ** 2)
    
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


In [22]:
# A classic Python one-liner - list comprehension
squares = [x ** 2 for x in range(10)]

print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


This amazing piece of software iterates over the range with the variable `x`, and places its square inside a list, which is then assigned to the `squares` variable.

What other goodies do comprehensions allow? Filtering.

In [23]:
squares = [
    x ** 2 
    for x in range(10) 
    if x != 8
]

print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 81]


The `if` statement is evaluated on each iteration of `x`. Notice how similar to English this expression is?

Funnily enough, performance isn't hindered:

In [24]:
%timeit squares = [x ** 2 for x in range(100) if x != 8]

32.9 µs ± 2.65 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [52]:
%%timeit
squares = []
for item in range(100):
    if item != 8:
        squares.append(item ** 2)

51.7 µs ± 4.05 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
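
`%timeit` is a Jupyter/IPython "magic" and won't work in a plain `.py` file. A rough equivalent, assuming you only have the standard library, is the `timeit` module. The two function names below are my own, and the timings you get will differ from the ones above.

```python
import timeit

def with_comprehension():
    return [x ** 2 for x in range(100) if x != 8]

def with_loop():
    squares = []
    for item in range(100):
        if item != 8:
            squares.append(item ** 2)
    return squares

# Both approaches must produce the same list before timing them makes sense.
assert with_comprehension() == with_loop()

# timeit.timeit returns the total time (in seconds) for `number` calls.
comp_time = timeit.timeit(with_comprehension, number=2_000)
loop_time = timeit.timeit(with_loop, number=2_000)
print(f"comprehension: {comp_time:.4f} s, loop: {loop_time:.4f} s")
```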


In [26]:
# Another example
strings = [cons.upper()
           for cons in 'abcdefghij' 
           if cons not in 'aeiou']

print(strings)

['B', 'C', 'D', 'F', 'G', 'H', 'J']


Happily enough, comprehensions aren't limited to lists. You can also write dictionary comprehensions, and these pair up nicely with the built-in `zip` function:

In [27]:
keys = 'abcde'
vals = [1, 2, 3, 4, 5]
bools = [True, True, False]
# Zip bundles up these iterables together, allowing iteration
for a, b, c in zip(keys, vals, bools):
    print(a, b, c)

a 1 True
b 2 True
c 3 False
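
Notice that `zip` stopped after three items, since `bools` is the shortest iterable. If you'd rather keep going until the longest one is exhausted, a possible alternative is `itertools.zip_longest` from the standard library, sketched here:

```python
from itertools import zip_longest

keys = 'abcde'
bools = [True, True, False]

print(list(zip(keys, bools)))
# [('a', True), ('b', True), ('c', False)]

# Missing items are replaced by fillvalue (None by default)
print(list(zip_longest(keys, bools, fillvalue=None)))
# [('a', True), ('b', True), ('c', False), ('d', None), ('e', None)]
```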


In [28]:
# With zip we can bundle up the pairs of key-value
keys = 'abcde'
vals = [1, 2, 3, 4, 5]
dic = {aaa: lll 
       for aaa, lll in zip(keys, vals) 
       if aaa != 'c'}  # compare values with "!=", not with "is not"

print(dic)

# If we didn't want to use the "if" part of the comprehension - this is easier:    
dict3 = dict(zip(keys, vals))
print(dict3)

{'a': 1, 'b': 2, 'd': 4, 'e': 5}
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}


In [29]:
# And set comprehension
se = {item for item in [1, 1, 1, 5, 6, 7, 7]}

print(se)

{1, 5, 6, 7}


Lastly, list comprehensions can iterate over more than one iterable:

In [30]:
summed_list = [outer + inner 
               for outer in range(10) 
               for inner in range(10, 20)]

# Identical to:
"""
summed_list = []
for outer in range(10):
    for inner in range(10, 20):
        summed_list.append(outer + inner)
"""
print(summed_list)

[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28]


Once your eyes and brain are used to looking at comprehensions, they're the most "natural" way to iterate over iterables. Their brevity on the one hand, and readability on the other, are their most important features. That's why comprehensions are strongly encouraged when writing true "Pythonic" code.

## Function Default Arguments and Types

One thing we didn't show when introducing functions was the default argument feature of function definitions, as well as the optional typing syntax. We'll start by showing what default arguments are.

Say you have a function which calculates the nth power of any number it's given. It obviously has to be given two inputs - the number and the power to which it will be raised. Here's such a function:

In [31]:
def power_of(x, power):
    """ Raises x to the power of power """
    return x ** power

This function works wonderfully, but we're assuming that most of the time users will raise `x` to the power of two. If we want to save the hassle for the caller, we can use default arguments when defining the function:

In [54]:
def power_of(x, power=2):
    """ Raises x to the power of power """
    return x ** power

We can now call this function either with the power argument or without it, which will default it to 2:

In [55]:
print(power_of(10, 3))

print(power_of(10))

1000
100
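
As a small aside, arguments can also be passed by name, which often reads better when a function has defaults. Here's a sketch using the same `power_of` function:

```python
def power_of(x, power=2):
    """ Raises x to the power of power """
    return x ** power

print(power_of(10, power=3))   # 1000
print(power_of(x=3))           # 9, power falls back to its default
print(power_of(power=3, x=2))  # 8, the order of keyword arguments doesn't matter
```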


#### Type hinting

Another interesting feature of Python is its optional typing capabilities (the `typing` module arrived in Python 3.5, and variable annotations in 3.6). These allow us coders to 'hint' at the wanted types of certain variables. We say 'hint' since these annotations have no effect during runtime, i.e. a variable can be marked as a string but later receive an integer value without the Python interpreter caring. For functions, Python added a special "arrow" sign, `->`, that marks the type of the returned value. Let's see how it works:

In [34]:
def typed_add(a: int, b: int = 0) -> int:  # combination of type hinting and default arguments
    """Adds the two inputs together"""
    return a + b

In [35]:
a: float = 0.25
b: str = 'a'

a = True  # Python doesn't care that we've overridden the type hint

Type hinting is used for documentation purposes mostly. Later in the course we'll discuss its more advanced features by utilizing the `mypy` package.
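
To see what "no effect during runtime" means in practice, here's a quick sketch with the same `typed_add` function. The hints are stored on the function object, but nothing checks them when the function is called:

```python
def typed_add(a: int, b: int = 0) -> int:
    """Adds the two inputs together"""
    return a + b

# The hints are kept in a plain dictionary on the function
print(typed_add.__annotations__)
# {'a': <class 'int'>, 'b': <class 'int'>, 'return': <class 'int'>}

# Strings aren't ints, yet the call succeeds because + also works on strings
print(typed_add("type ", "hints"))  # type hints
```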

## File Input/Output

Yet another error-prone area in applications is their I/O (input-output) module. Interfacing with objects outside the scope of your own project should always be handled carefully. You never know what's really out there.

Assume we wish to write some data to a file - a list filled with counts of some sort, for example.

To write to (or read from) a file, you have to do several operations:
1. Define the file path and name.
2. Open the file with the appropriate mode - read, write, etc.
3. Write the data out (or read it in).
4. Close the file.

Here's a mediocre example of how it's done:

In [56]:
data_to_write = 'A B C D E F'
filename = 'data.txt'
file = open(filename, 'w')  # w is write, 'open' is a built-in function
file.write(data_to_write)
file.close()

The variable `file` is a file object, and it has many useful methods, such as:
* `.read()` - reads the entire file.
* `.readline()` - reads a single line.
* `.readlines()` - reads the entire file into a list of strings, one per line.
* `.seek(offset)` - goes to the `offset` position in the file.

File objects in Python can be opened as text files (the default) or as binary files (e.g. `open(filename, 'rb')` for reading), in which case their content will be interpreted as bytes rather than text.
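
Here's a minimal sketch of these methods in action. It writes a small file into a temporary folder (so nothing is left behind), and the file name and contents are made up for the example:

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()                  # a throwaway folder
filename = os.path.join(folder, "data.txt")  # hypothetical file name

file = open(filename, "w")
file.write("first line\nsecond line\n")
file.close()

file = open(filename, "r")
whole = file.read()       # the entire file as one string
file.seek(0)              # jump back to the beginning
first = file.readline()   # only the first line
rest = file.readlines()   # the remaining lines, as a list
file.close()

file = open(filename, "rb")  # binary mode: the content comes back as bytes
raw = file.read()
file.close()

shutil.rmtree(folder)        # clean up the temporary folder

print(repr(whole))   # 'first line\nsecond line\n'
print(repr(first))   # 'first line\n'
print(rest)          # ['second line\n']
print(type(raw))     # <class 'bytes'>
```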

When dealing with files, we generally first `open()` them, `read()` or `write()` something, and `close()` them. The real issue stems from the fact that these steps are very error prone. For example, you can open a file to write something to it, but while the file is open someone else (or some other process) can modify or even delete it.

Another example - some connection error might occur after you've flushed the data into the file, but before you managed to close it, leading to a file that can't be accessed by the operating system.

Luckily, Python is here to help, and its main method of doing so is context managers, called upon with the `with` keyword. Context managers are awesome, and I'll only briefly describe their capabilities. That being said, they shine the most when doing I/O, like in the following example:

In [37]:
data_to_write = 'A B C D E F'
filename = 'data.txt'
with open(filename, 'w') as file:
    file.write(data_to_write)
    file.write('abc')
    a = 1 + 2
a = 1

The unique thing here is that once we've opened the file, the `with` block guarantees that the file will be closed, regardless of what code is executed. 

Even if an error occurs while the file is open - the context manager will ensure proper handling of the file and prevent our data from disappearing into the void of the file system.
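
We can check this guarantee ourselves. The sketch below raises an error on purpose in the middle of the `with` block, and then inspects the file object's `closed` attribute. The file lives in a temporary folder, and the error message is made up:

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()
filename = os.path.join(folder, "data.txt")

try:
    with open(filename, "w") as file:
        file.write("A B C")
        raise RuntimeError("something went wrong mid-write")  # simulated failure
except RuntimeError as err:
    print(f"Caught: {err}")

print(file.closed)  # True: the file was closed despite the error

# The data written before the error made it safely to the disk
with open(filename) as file:
    content = file.read()
print(content)  # A B C

shutil.rmtree(folder)  # clean up the temporary folder
```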

# The Python Stack

## How to Run Python?

### How does MATLAB do it?

MATLAB has its excellent application GUI which essentially everyone uses all the time.

But all interpreted programming languages, including Python and MATLAB, can be run from the command line. 

In the case of MATLAB, though, rarely do you see people running it from the command line:

![MATLAB from the CL](extra_material/matlab_cl.png)

It's obviously possible, but less comfortable than the standard GUI we all know. If all you wish to do is run a MATLAB `.m` file, you can also do it from the command line by simply writing `matlab -r myfile` (the file name without its `.m` extension). 

Thinking about it further, the MATLAB application we're familiar with combines a few sub-applications:
1. Text editor
2. Debugger and variable explorer
3. Command prompt (REPL, a place to write and immediately evaluate MATLAB expressions)
4. MATLAB's engine or interpreter - the program that actually does the job of reading the source files and 'computing' them

Unfortunately in the Python world we don't have an application as comprehensive as MATLAB (the app) is. VSCode comes very close with some of its features, and its many other features are actually _more_ advanced than MATLAB's, but it will probably feel a bit clunky at the start.

### Python is very similar
Just like MATLAB, the quickest option to run Python is from the command line, by simply writing `python`:

![Python CL](extra_material/python_cl.png)

Running Python scripts, like `myfile.py`, is as easy as `python myfile.py`.

### Python "GUI"

However, more often than not we wish to write a script, experiment with it a little, and then run it, just as some of us are used to doing in the MATLAB environment. Python, being non-proprietary, has several similar solutions that go by the general name IDE - integrated development environment.

### Spyder
Spyder is an open-source science-oriented IDE that was designed with MATLAB's GUI in mind. It contains many similar functions and might look very similar at first glance:

![Spyder IDE](extra_material/spyder.png)

Unfortunately, Spyder's main financial support was cut off in November 2017. It's still in active development, but at a much slower pace, and its future is unclear at the moment.

### PyCharm
PyCharm is a full-blown IDE which contains many advanced features that any modern IDE has, like refactoring capabilities, testing suites and more. It has a free community edition, and a paid professional edition - which is actually free for poor students like us:

![PyCharm IDE](extra_material/pycharm.jpg)

### Visual Studio Code
VSCode is a free, open-source editor for nearly all existing programming languages. It doesn't have all features that PyCharm has, but it's great nonetheless. The fact that it's a light-weight editor is an advantage for some.

![VS Code](extra_material/vs_code.jpg)

### Jupyter Notebook
While not technically an IDE, Jupyter is designed with data exploration in mind. It's less suited for writing long, complex applications, but great when it comes to a quick "plot-n-go" on some data you recently acquired.

### What's Best?
There's no __best__ option, but in broad strokes:
* When you wish to do something "quick-and-dirty", you should probably use either VSCode or Jupyter. This can be thought of as a replacement for "smart" Excel spreadsheets.
* For more complicated analysis scripts I prefer VSCode, which puts enough power at your fingertips without being too complicated. 

[In class we show a quick tour of VSCode here]

# Version Control

## Introduction

Version control is the active management of the history of your source code. It is an essential part of every developer's work cycle, for both small and large projects.

With version control, at any point in time during the work on your code you can decide to "commit" the change. Committing your code means that the system will remember the current state of your work (all files in a folder), and will allow you to return to this exact state of your codebase whenever you wish.

![Final.doc, from PhD Comics](extra_material/vcs_final.png)

Using version control is orthogonal to the traditional save operation. Saving records the current state of your codebase, but usually doesn't allow you to "go back in time" to previous versions.

This property is useful on many occasions. For example, say you have a working version of some function, but you wish to make it better - add a feature, or change its internal structure (refactoring). Version control allows you to record this point in time - when you have a good, functioning function - and change the working copy of the function however you'd like. If you fail to refactor the function you can simply jump back to the latest working version.

Another important version control use case is collaborative work. Version control systems (VCS) can help you communicate changes in code base between developers, without having to somehow transfer updated versions of files from one person to the other.

There are many version control applications, but the most popular one is Git, developed by Linus Torvalds (the Linux guy) in 2005. In this course we'll be using Git with GitHub.com - an online hosting and backup site for your code.

There are numerous great Git and GitHub tutorials ([here's one](https://github.com/pluralsight/git-internals-pdf)), but for our course we'll be using only the most basic features of Git and GitHub, so going through all features and intricacies of the software is unnecessary. `PythonSetup.md` and `CreateGitRepo.md` contain instructions on how to set up Git and GitHub, which should be enough for homework submission.

## A Simple Walkthrough (Using VSCode)

Assume I have a directory that I wish to track with Git:
- /home/
    - my_proj/
        - `__init__.py`
        - `class1.py`
        - `class2.py`
        - `README.md`
    - etc/
    - etc2/

Git can be run from the command line, but in this quick tutorial we'll be using the built-in Git interface in VSCode, which should be good enough for 99% of your Git needs. 

Open VSCode, click 'Open Folder...' and point it to the `my_proj` directory. Click the 'fork' icon to the right, which symbolizes version control. You should see a message saying **"No source control providers registered"**, which makes sense. Click the "+" icon and choose the current folder. This process is called "initialization", and now the folder is tracked by Git.

Next we wish to tell Git which files to track, which is called "adding" or "staging" the files. The "stage" is a metaphor for us putting these files on the stage - and from there we'll decide what to do with them. We see our files and folders under the "Changes" headline, which means that Git detected changes in these files since its previous snapshot of our repository (which was really a 'clean slate' without any files). There's a green _U_ next to each file signifying that it's "Untracked", i.e. Git doesn't yet acknowledge the existence of this file. To start tracking the files, we should click the "+" button next to the _U_, which appears when we hover with our mouse next to the names of these files. Alternatively, if we just want to add all files to the Git archive, we can click the "+" on the "Changes" line (again, visible when we hover with the mouse on it).

Stage all files by clicking the "+". You should see them under a new headline - "Staged Changes". They have an _A_ for "Added".

Now git knows that inside the `my_proj` folder, all files should be tracked but it hasn't captured a snapshot of the repository status quite yet.

To do that, we'll need to "commit" the current state of the repository (repo). Committing is the name given in the git lingo to the act of marking some version of our codebase as an important one, which we can name, label, describe and go back to if we need it. We almost always commit files right after we add, or stage, them.

Before committing it's recommended to write a one-liner describing what this commit contains, in terms of changes to the codebase. The message goes in the line above "Staged Changes"; write "Initial commit" there - it's good enough for this tutorial. Now we can commit by clicking the "V" checkmark in the line above.

Congratulations! You've made your first git commit, and you're now a real programmer.

If you create new files, you have to remember to add them with `git add new_file.py`, otherwise they won't be tracked and committed. You'll see a green _U_ that should remind you of that.

If you remember, earlier I described how git can also be used as a backup service to our code by publishing it online to the cloud, to a potentially private repository which is a clone of our local repository. However, our actions so far merely created a local repo in our computer. If the whole folder is erased our backups and commit history go away with it as well. 

To make an online backup for a repository we use a command called "Push", found when clicking the "..." icon to the right of the "Commit" checkmark. I won't go over it right now since it's included in the second homework assignment, and you can also ask me if anything is unclear. Since GitHub is the only available way to submit homework in this course, learning how to push commits is essentially mandatory :)

# The Module System and Its Uses

## Namespaces

One of the largest differences between Python and MATLAB is the concept of namespaces. At first it's also one of the most aggravating differences. But I promise that sooner rather than later you'll understand its importance, and start disliking MATLAB's approach.

A namespace is a system designed to avoid ambiguity. There are many Hagais, but far fewer Hagai Har-Gils. In computers, we can have many files with the name `test1.py` saved in _different folders_ of our hard drive. In contrast - we can only have one `test1.py` per folder, since the file system dislikes these collisions.

Similarly, in Python we have to declare which namespace our function belongs to. Usually functions aren't included in the scope of the program until we `import` them. After importing we can use the function with its "pathname", i.e. the module it belongs to. This way, the same identifier (function name, class name, etc.) can be used multiple times in different modules.

You might have realised that you're already familiar with this topic from your previous usage of functions. The following example should be clear:

In [38]:
# Namespaces are different inside functions
a = 1
def f(a):
    """ Scopes and namespaces exemplified """
    a = 2
    print(f"Inside the function, a={a}")
f(a)
print(f"But outside of it, a={a}")

Inside the function, a=2
But outside of it, a=1


Namespaces for modules work the same way. A Python module is an object with variables, classes and functions in it. A module is usually a single file, while a folder containing files and other sub-folders is called a package. As we've said, a module is brought into scope (into our namespace) with the import statement:

In [39]:
# The objects "math" or "pi" do not exist here yet.
import math
math.pi

3.141592653589793

`pi` is only defined in the context of the `math` module. Without the import statement there's no special meaning attached to either `math` or `pi`, and they can be used as normal variables.
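
A short sketch of why this matters: two standard library modules, `math` and `cmath` (its complex-number counterpart), each define their own `sqrt`, and the module prefix tells them apart. The local `pi` variable is only there to show that our own names don't clash with the module's.

```python
import math
import cmath

# Two different functions share the name "sqrt", one per module
print(math.sqrt(4))    # 2.0
print(cmath.sqrt(-4))  # 2j

# Our own "pi" is just another name in the current namespace,
# and it leaves math.pi alone
pi = 3
print(pi, math.pi)     # 3 3.141592653589793
```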

In [40]:
import os
os.sep

'/'

When we try to use a function, class or constant from a package (module) we didn't import, we'll receive a `NameError`. That's also the exception raised when we try to use a variable that didn't exist beforehand.

In [41]:
# An exception is raised
cos  # we probably want math.cos, but we didn't import it, so we got a NameError

NameError: name 'cos' is not defined

## The Default Python Namespace

Python comes with many functions already defined, which we've obviously met:
```python
list(), dict(), set(), int(), float()
```
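and it also has several names with special meaning. Some are reserved keywords (`class`, `def`, `return`), while others, like `int`, are ordinary built-in names which can technically be reassigned: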
```python
int, class, def, return
```

However, the number of functions and symbols inside the default Python namespace is extremely limited on purpose, definitely in comparison to MATLAB's vast array of default functions.
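
If you're curious, the default namespace can be inspected from Python itself. This is a small sketch using the standard library's `builtins` and `keyword` modules; the exact counts depend on your Python version, so they aren't shown here.

```python
import builtins
import keyword

print(keyword.iskeyword("def"))  # True: reserved, can't be used as a name
print(keyword.iskeyword("int"))  # False: int is a built-in name, not a keyword
print(hasattr(builtins, "int"))  # True

# The default namespace is small enough to list in full
public_builtins = [name for name in dir(builtins) if not name.startswith("_")]
print(len(public_builtins), "built-in names and", len(keyword.kwlist), "keywords")
```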

The true power of Python comes from its ecosystem. It's one of the largest and most comprehensive around, certainly for a scripting language, and allows you to do basically whatever you want with minimum effort.

In [42]:
import antigravity  # try this at home!

![Antigravity XKCD](./extra_material/antigravity.png)

The standard library of Python includes the packages that come with every installation of Python. I didn't have to do anything special to import the `math` module - it was just there, "waiting" for me to import it. 

Most of the functions you'd expect a programming language to have are indeed included in the standard library, available automatically to everyone who downloaded Python. Other modules, including many popular ones, are available online. We'll discuss later how to import them.

I'll let you discover yourself what is included in the standard library, but some of the highlights include:

In [43]:
import pathlib
p = pathlib.Path(".")
print(p.absolute())

/home/hagai/Teaching/python_master/classes


In [44]:
import urllib.request
url = urllib.request.urlopen("https://www.google.com")
print(url.read())

b'<!doctype html><html dir="rtl" itemscope="" itemtype="http://schema.org/WebPage" lang="iw"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title><script nonce="5JpWRWZw0R6nbq4sMxT8Mg==">(function(){window.google={kEI:\'AHVvXpyrAsG1kwWBso6YDg\',kEXPI:\'0,1353746,5663,731,223,5104,207,3204,10,1051,175,364,1435,4,60,239,503,75,383,246,5,960,168,226,82,291,439,104,5,1126845,1197789,235,125,41,329077,1294,12383,4855,32692,15247,867,28684,363,8825,8384,4858,1362,4323,4967,3026,2819,1923,2136,982,7915,1808,4998,7931,5297,2054,622,298,873,1217,2975,6430,11307,2883,20,318,1981,2536,2777,519,400,2277,8,570,2226,1593,1279,2212,202,328,149,1103,327,513,517,1466,8,48,157,4101,109,203,1137,2,2669,1839,184,1777,520,1947,748,428,1053,93,328,1284,17,2926,2246,474,459,880,748,1039,3227,773,257,1815,7,1320,3488,2825,4682,1833,2660,642,632,1817,2459,1226,1462,280,3655,127

It can be quite irritating to write the full namespace address of all functions. To this end we have the `from` and `as` keywords:

In [61]:
from math import cos as cosine
cosine(1.)

0.5403023058681398

In [47]:
# Enumerations are great, you should familiarize yourself with them
from enum import Enum
class Day(Enum):
    SUNDAY = 1
    MONDAY = 2
    TUESDAY = 3
    WEDNESDAY = 4
    THURSDAY = 5
    FRIDAY = 6
    SATURDAY = 7

d = Day.MONDAY
print(d)
print(repr(d))

Day.MONDAY
<Day.MONDAY: 2>
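
A few more things you can do with an enumeration, sketched on a shortened version of `Day` (only two members, to keep the example small):

```python
from enum import Enum

class Day(Enum):
    SUNDAY = 1
    MONDAY = 2

d = Day.MONDAY
print(d.name, d.value)            # MONDAY 2
print(Day(2) is d)                # True, lookup by value
print(Day["MONDAY"] is d)         # True, lookup by name
print(d == 2)                     # False, a member isn't equal to its raw value
print([day.name for day in Day])  # ['SUNDAY', 'MONDAY']
```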


In [48]:
from random import randint, randrange

In [49]:
# Instead (or on top) of from:
import multiprocessing.pool as mpool

In [62]:
# A non-recommended version of importing is the star import
from math import *
print(pi)
print(e)
print(cos(pi))

3.141592653589793
2.718281828459045
-1.0
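
Why is the star import discouraged? Here's a rough sketch. To keep it self-contained I use two plain dictionaries as stand-ins for the namespaces of two separate files, and run the import statements inside them with `exec`. That's only a trick for this demonstration, not something you'd do in real code.

```python
# Two dictionaries stand in for the namespaces of two separate modules
explicit_ns = {}
star_ns = {}

exec("from math import sqrt", explicit_ns)
exec("from math import *\nfrom cmath import *", star_ns)

def public_names(namespace):
    """Names a reader of the module would see (dunder entries excluded)."""
    return [name for name in namespace if not name.startswith("__")]

print(public_names(explicit_ns))        # ['sqrt']
print(len(public_names(star_ns)) > 20)  # True: dozens of names were dumped in

# The second star import silently replaced math's sqrt with cmath's
print(star_ns["sqrt"](4))               # (2+0j)
```

With the explicit import we know exactly which names entered our namespace. With the star imports we got many names at once, and the later import quietly overrode the earlier one.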



The list of all standard library modules in Python 3 is [here](https://docs.python.org/3/library/index.html). However, it's just a drop in the ocean that is the Python ecosystem. 

## `pip` and External Modules

As noted before, Python's gigantic ecosystem is one of its biggest strengths. And the fact that you can easily install many of these packages with a simple command-line instruction is even more important. The standard tool to do that is `pip`, a recursive acronym that stands for "`pip` installs packages" (or, alternatively, "`pip` installs Python").

`pip` itself is a Python program, __but it's not run from inside the Python interpreter.__ Instead it's run from a shell - the Windows command line (or PowerShell), for example, or the Mac's terminal - making it an external application to Python. Happily enough, basically all Python distributions come with `pip` pre-installed, so you don't have to install it yourself.

To install a library with pip, open a command line (in VSCode you can simply press Ctrl + ~) and type `pip install package_name`. 

`pip`, like any package manager, has two main jobs. The first is to provide a convenient API to a package repository. In essence, it's a download manager for a single site - the [Python Package Index](https://pypi.python.org/pypi) (PyPI, pronounced **pai-pee-eye**), the official Python repository for packages.

PyPI holds installation files for the packages hosted in it, along with some metadata, like version number and dependencies. `pip` (and other tools we'll discuss soon) downloads the wanted package from PyPI, together with its dependencies, and installs them in a pre-defined location in our personal computer.

`pip`'s second important job is handling dependencies. Many packages rely on other packages, which in turn rely on other, more basic packages, finally leading to the basic Python interpreter. `pip` has to make sure to install all dependencies of the package you're currently after, and to avoid any collisions with other installed packages. For example, a common problem in the Python world is the Python 2-3 schism, which means that packages written for Python 2 can't run on a Python 3 interpreter, and vice versa. The package manager's job is to grant you the right version of the package you're looking for.
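
Once a package is installed, its metadata can also be read from inside Python. The following is a minimal sketch that assumes Python 3.8 or newer (where `importlib.metadata` is part of the standard library). The helper `installed_version` and the second package name are made up for the example.

```python
from importlib import metadata  # standard library, Python 3.8+

def installed_version(package_name):
    """Return the installed version of a package, or None if it isn't installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

# A version string if pip is installed in this environment, otherwise None
print(installed_version("pip"))

# A made-up name that shouldn't be installed anywhere
print(installed_version("not-a-real-package-xyz"))  # None
```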

### Comparison with MATLAB's approach

We should take a minute here to contrast Python's approach with that of MATLAB. In MATLAB, once we added a directory to the path using `pathtool`, each file in that directory is now directly in the MATLAB namespace. This means that we don't have to `import` anything - adding something to `pathtool` is essentially `import`ing the entire folder into the general namespace, which is the only namespace in MATLAB world. This is a pretty straightforward approach, but it's also one that no other programming language, especially a modern one, uses. This is because cluttering the one and only namespace is a bad idea, since you can quite easily overwrite names of functions from files with names of variables you use, and you won't even know it. Moreover, you don't need **all** functions around **all** the time. Each file and project will usually need a few different functions from a couple of toolboxes, and that's it. 

In Python we have another layer between our code and the `import`-able functions. Inside our code we can only use built-in functions (`list()`, `int()`, `print()`, etc.) and functions that we explicitly imported, which will vary between files and projects. In addition, we can only call `import` on files and packages which are in our Python's path. We'll discuss Python's equivalent of `pathtool` in the next class, but for now you should see the separation Python forces on us between the specific file's namespace and the general Python namespace (what packages can we even import).

## File Structure

`import` statements are useful for more than just importing code - they're also our way of arranging our project's files. Here's the standard arrangement of files I'd like you to use throughout the course:

The base folder can contain many other files, including sample data, for example. The point here is that your actual code is confined to the `project_name` folder, which has an empty `__init__.py`. This file allows Python to import user-defined objects from that folder.

Thus, if you wish to use a function you defined in `class_1.py` in `class_2.py`, you should write inside `class_2.py` the following statement: `from class_1 import my_func` (note that there's no `.py` suffix in the import statement).
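
To see such an import at work without touching your own files, here's a sketch that builds a tiny `project_name` folder inside a temporary directory and then imports `class_2`, which in turn imports `my_func` from `class_1`. The body of `my_func` is made up, and adding the folder to `sys.path` by hand is only my way of mimicking what happens when you run `python class_2.py` from inside that folder.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    pkg = Path(base) / "project_name"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")  # the empty __init__.py
    (pkg / "class_1.py").write_text(
        "def my_func():\n"
        "    return 'hello from class_1'\n"
    )
    (pkg / "class_2.py").write_text(
        "from class_1 import my_func  # no .py suffix\n"
        "\n"
        "result = my_func()\n"
    )

    # Mimic running a file from inside project_name: its folder is on the path
    sys.path.insert(0, str(pkg))
    importlib.invalidate_caches()
    try:
        class_2 = importlib.import_module("class_2")
    finally:
        sys.path.remove(str(pkg))

print(class_2.result)  # hello from class_1
```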

## Scripts and Functions 
#### (Code examples in `import_demonstration` folder)

If you're familiar with MATLAB there's a good chance that you've written a script before. A script is a file which is run sequentially, while using other functions and definitions. Python supports scripts as well, as can be seen in `main.py`.

However, in the Python world people usually prefer to stay away from scripts. This is due to a number of reasons, the most important one being that running a `main()` function is as easy as running a `main.py` script. You can see examples of a procedural and script-like approach in the `main.py` file, but keep in mind that the script version is discouraged.

If you wish to run a file full of functions from the command line or from your IDE, you should include the following lines:
```python
if __name__ == '__main__':
    run_main()
```

Every Python file which is being run has a `__name__`. If the file was run directly by the Python interpreter its `__name__` will be `'__main__'`, while a file that was imported by another file gets its own module name instead. This `if` statement basically tells the Python interpreter "Start from here", and is the conventional way to run Python procedures.
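
Here's how a complete (if tiny) file might look with this pattern. It's only a sketch: the `mean` function and the contents of `run_main` are invented for the example.

```python
def mean(values):
    """Returns the arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def run_main():
    """Entry point: everything a script would do lives here."""
    result = mean([10, 20, 30])
    print(f"The mean is {result}")
    return result


if __name__ == '__main__':
    # True only when the file is run directly, e.g. `python my_module.py`.
    # When another file imports it, __name__ is the module's own name
    # and run_main() isn't called automatically.
    run_main()
```

Another file can now write `from my_module import mean` (assuming this file is saved as `my_module.py`) and reuse the function without triggering the printout.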

In this course you're highly encouraged to divide your code into many small functions and methods in well-defined compact classes. Each method should have a single purpose, documented in its docstring. Each class should have a logical structure that envelops its methods and attributes in a sensible way.

Beware of God classes, or God scripts and functions. These are monolithic objects that encompass the entirety of your application, and are very hard to reason about. Simply create more files, each with a descriptive name and a bunch of related functions, and import these files into your other files and the main file.

Another important reason to partition our code into many small bits is *unit testing* which we'll cover later on in the course.

Importing code from one file to the other isn't as easy as you'd like it to be, especially since each language deals with this issue in a different way. It's completely expected that you won't be able to 'nail' the first couple of import statements you write. Keep trying, verify that your code is in the right directory structure, and don't be afraid to ask friends, Google or me.