# Environment Basics

This notebook presents a crash course in the Python language and the Jupyter environment.

## Python

 - one of the most widely used languages for scientific computing & AI/ML/DS
 - friendly syntax
 - huge community support

This is an introductory tour, not an in-depth one: it covers most of the things you'll need.

It also includes some tips that you might not know, even if you're familiar with Python (string interpolation, `for`/`else`, type hints, doctest).

It assumes some familiarity with programming.

Classes are not covered at all.

In [139]:
# this is a comment

### Primitives

The most used Python primitives are:
 - `int`: integers
 - `float`: real (floating-point) numbers
 - `str`: strings (note that there is no character primitive)
 - `bool`: truth value
 - `None`: absence of a value (similar to `null` in other languages)
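
As a quick check, the sketch below asks Python for the type of one literal of each kind. The `samples` list is only an illustration and is not used elsewhere in the notebook.

```python
# one literal value for each primitive listed above
samples = [42, 3.14, 'text', True, None]

# type(v).__name__ gives the name of the value's type as a string
type_names = [type(v).__name__ for v in samples]
print(type_names)  # ['int', 'float', 'str', 'bool', 'NoneType']
```

Note that `None` is a value; its type is called `NoneType`.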

### Basic arithmetic

In [4]:
1 + 5

6

In [79]:
2 ** 3  # power operator

8

### Variables

In [185]:
name = 'Peter Parker'  # either single or double quotes
age = 21
height = 5.84
married = False

In [6]:
age / 2

10.5

In [80]:
age // 2  # floor division: only the whole part

10
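
One detail worth knowing: `//` rounds down (towards negative infinity) rather than towards zero, which matters for negative numbers. A small sketch, with values chosen only for illustration:

```python
print(7 // 2)    # 3
print(-7 // 2)   # -4: floor division rounds down, not towards zero
print(7 % 2)     # 1: the remainder of the division

# divmod returns the quotient and the remainder in one call
quotient, remainder = divmod(21, 2)
print(quotient, remainder)  # 10 1
```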

---

In [188]:
type(age)  # checking the type of a variable

int

In [189]:
type(height) is int

False

In [190]:
int(2.8)  # type conversion

2

In [119]:
int('2')

2

In [120]:
float(2)

2.0

In [121]:
str(123)

'123'

### Data Structures

The most used basic data structures are:
 - `list`: a sequence of elements
 - `dict`: dictionary mapping keys to values

### List

In [61]:
squares = [0, 1, 4, 9]

In [62]:
squares.append(25)  # add one element at the end

In [63]:
len(squares)  # number of elements

5

In [64]:
squares[0]  # zero-indexed

0

In [65]:
squares[-1]  # last element

25

In [74]:
4 in squares  # test membership

True

In [75]:
5 in squares  # the list does not contain 5

False

In [66]:
squares.index(4)  # get the index of an element

2

In [67]:
squares[1:]  # slice: from the second element onwards

[1, 4, 9, 25]

In [143]:
squares[:-1]  # all but the last element

[0, 1, 4, 9]

In [68]:
squares + [49, 'large', 'too large']  # concatenate lists

[0, 1, 4, 9, 25, 49, 'large', 'too large']

---

In [69]:
# a string behaves much like a list of characters (except that it cannot be modified in place)
len(name)

12

In [70]:
name[0]

'P'

In [38]:
name[1:]

'eter Parker'

In [41]:
'Hello ' + name

'Hello Peter Parker'

In [40]:
'ha' * 3

'hahaha'

In [71]:
# string-specific functions
name.split()  # by words

['Peter', 'Parker']

In [72]:
name.lower()

'peter parker'

In [126]:
'-'.join(['a', 'b', 'c'])

'a-b-c'

In [134]:
f'{name} is {age} years old'  # note the f at beginning

'Peter Parker is 21 years old'
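
f-strings also accept arbitrary expressions inside the braces, and an optional format specification after a colon. A brief sketch, repeating the variables from above so the block is self-contained:

```python
name = 'Peter Parker'
age = 21
height = 5.84

one_decimal = f'{height:.1f}'            # format with one decimal place
next_year = f'{name} will be {age + 1}'  # any expression can go inside the braces
print(one_decimal)  # 5.8
print(next_year)    # Peter Parker will be 22
```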

### Dictionary

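A dictionary maps keys to values: it behaves like a list whose indices (the keys) do not have to be numbers.

It has the time complexity of a hash map (constant-time lookup on average), yet, through some very clever engineering, its elements keep their insertion order (guaranteed since Python 3.7).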

In [192]:
superhero_ages = {
    'Ironman':   36,
    'Hulk':      38,
    'Thor':      'varies',  # does not have to be homogeneous
}

In [193]:
len(superhero_ages)

3

In [195]:
superhero_ages['Ironman']

36

In [196]:
superhero_ages['Spiderman'] = 21  # add an element

In [197]:
del superhero_ages['Hulk']  # remove an element

In [198]:
superhero_ages

{'Ironman': 36, 'Thor': 'varies', 'Spiderman': 21}

In [78]:
'Deadpool' in superhero_ages  # whether the key is in the dictionary

False
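
As mentioned at the start of this subsection, a dictionary keeps its elements in insertion order. The sketch below illustrates that, along with `get`, which returns a default instead of raising `KeyError` for a missing key. The `inventory` dictionary is made up for this example.

```python
inventory = {}
inventory['pears'] = 3
inventory['apples'] = 5
inventory['kiwis'] = 0

keys_in_order = list(inventory)       # insertion order, not alphabetical
missing = inventory.get('plums', 0)   # default value for a missing key
print(keys_in_order)  # ['pears', 'apples', 'kiwis']
print(missing)        # 0
```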

---

In [191]:
# tuples should be thought of as lightweight classes: they behave similarly to lists, but are immutable and meant for heterogeneous elements

In [None]:
t = ()  # an empty tuple
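
A brief sketch of what that means in practice (the `hero` tuple is an illustrative value): a tuple groups a fixed number of fields of possibly different types. It can be indexed and unpacked like a list, but it cannot be modified.

```python
hero = ('Peter Parker', 21, False)  # name, age, married

hero_name, hero_age, hero_married = hero  # unpack into separate variables
print(hero[0])     # Peter Parker
print(len(hero))   # 3

try:
    hero[1] = 22   # tuples are immutable, so this raises TypeError
except TypeError as e:
    print('cannot modify a tuple:', e)
```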

### Iteration

In [82]:
# list iteration
for sq in squares:
    print(sq)

0
1
4
9
25


In [124]:
# dict iteration
for hero, hero_age in superhero_ages.items():  # distinct names, so that `name` and `age` from above are not overwritten
    print(hero, hero_age)

Ironman 36
Thor varies
Spiderman 21


In [84]:
for i in range(10):  # equivalent of the C-style for (i = 0; i < 10; i++)
    print(i)

0
1
2
3
4
5
6
7
8
9


Iteration-related functions:

In [85]:
colors = ['red', 'green', 'blue', 'black']

In [88]:
for col in reversed(colors):
    print(col)

black
blue
green
red


In [86]:
for i, col in enumerate(colors):
    print(i, col)

0 red
1 green
2 blue
3 black


In [89]:
for sq, col in zip(squares, colors):
    print(sq, col)

0 red
1 green
4 blue
9 black
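
Note that `zip` stops at the end of the shortest input, which is why `25` never appears above. Since it yields pairs, it also combines well with `dict`. A short sketch using the same two lists:

```python
squares = [0, 1, 4, 9, 25]
colors = ['red', 'green', 'blue', 'black']

pairs = list(zip(squares, colors))            # stops after 4 pairs: colors is shorter
color_to_square = dict(zip(colors, squares))  # each pair becomes a key and a value
print(len(pairs))        # 4
print(color_to_square)   # {'red': 0, 'green': 1, 'blue': 4, 'black': 9}
```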


In [None]:
# `while` is not used as often
k = 0
while k <= 10:
    k += 1  # idiom: Python has no `k++` operator

---

Multiple ways of creating a collection based on another's elements:

In [90]:
color_lengths = []
for col in colors:
    l = len(col)
    color_lengths.append(l)

In [91]:
color_lengths

[3, 5, 4, 5]

In [92]:
[len(col) for col in colors] # list comprehension, equivalent but shorter

[3, 5, 4, 5]

In the next subsection we'll learn an even shorter way, using `map`

In [93]:
{col: len(col) for col in colors} # dict comprehension

{'red': 3, 'green': 5, 'blue': 4, 'black': 5}


### Functions

In [106]:
def double(n):
    # takes an argument and returns it doubled
    return n * 2

In [None]:
double = lambda n: n * 2  # equivalent but shorter way to write simple functions

In [144]:
double(5)

10

Duck-typing allows the function to work on any kind of argument that supports the `*` operator:

In [147]:
double(1.2)

2.4

In [145]:
double('ha')

'haha'

In [146]:
double([1, 2, 3])

[1, 2, 3, 1, 2, 3]
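
Conversely, an argument that does not support `*` fails only when the function is actually called. A minimal sketch (the `None` argument is just an example of an unsupported type):

```python
def double(n):
    return n * 2

try:
    double(None)  # None does not support the * operator
except TypeError as e:
    message = str(e)
    print('cannot double:', message)
```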

---

Functions can take anything as arguments, even other functions.

One such special function is `map`, which takes two arguments, a function and a collection, and it applies the function to each element of the collection.

In [169]:
map(len, colors)  # get the length of each color, equivalent to definitions above

<map at 0x10572af98>


*Note*: to save memory and computation, `map` is lazy: it returns an iterator (a `map` object) that produces one element at a time, only when asked for it.
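
The laziness can be observed directly with the built-in `next`, which asks an iterator for a single element. A small sketch, with `colors` repeated so the block is self-contained:

```python
colors = ['red', 'green', 'blue', 'black']

lengths = map(len, colors)   # nothing is computed yet
first = next(lengths)        # computes only len('red')
second = next(lengths)       # computes only len('green')
remaining = list(lengths)    # consumes whatever is left
print(first, second, remaining)  # 3 5 [4, 5]
```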

In [170]:
result = map(len, colors)
for l in result:
    print(l)

3
5
4
5


Iterating over it again yields no elements, as they have all been consumed:

In [171]:
for l in result:
    print(l)

In [172]:
# evaluate all elements
list(map(double, squares))

[0, 2, 8, 18, 50]

---

A function can have default arguments:

In [150]:
def repeat(s, times=3):
    # by default, it repeats 3 times
    return s * times

In [151]:
repeat('ha ')

'ha ha ha '

In [152]:
repeat('ha ', 5)

'ha ha ha ha ha '

In [153]:
repeat('ha ', times=5)  # arguments can be named

'ha ha ha ha ha '

In [156]:
def repeat_extra(s):
    # returns multiple values
    return len(s), s * 3

In [158]:
repeat_extra('ha')

(2, 'hahaha')

In [162]:
result = repeat_extra('ha')
length   = result[0]
repeated = result[1]

In [159]:
length, repeated = repeat_extra('ha')  # shorter way of destructuring the result

In [160]:
length

2

In [161]:
repeated

'hahaha'

In [173]:
a, b, c = [1, 2, 3]  # unpacking works with any iterable, not just function results

In [174]:
a

1

In [175]:
[a, b, c] = [1, 2, 3, 4]  # gives an error upon mismatch of length

ValueError: too many values to unpack (expected 3)
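
When the number of elements is not known in advance, a starred name can collect the surplus. A short sketch with illustrative variable names:

```python
first, *rest = [1, 2, 3, 4]     # rest collects everything after the first element
head, *middle, last = 'hello'   # works on any iterable, including strings
print(first, rest)          # 1 [2, 3, 4]
print(head, middle, last)   # h ['e', 'l', 'l'] o
```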

### Conditional statements

For control flow

In [101]:
def is_even(n):
    return n % 2 == 0

In [128]:
if is_even(2):
    print('works')
else:
    print('broken')

works


In [129]:
small_squares = []
for sq in squares:
    if sq < 5:
        small_squares.append(sq)

In [130]:
small_squares

[0, 1, 4]

In [131]:
[sq for sq in squares if sq < 5]  # works in list comprehension as well

[0, 1, 4]

In [114]:
filter(is_even, squares)  # filter is another higher-order function: it keeps just those elements that pass the predicate

<filter at 0x1052ae278>
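
Like `map`, `filter` is lazy, so the output above is an iterator rather than a list. Wrapping it in `list` evaluates it. A self-contained sketch:

```python
def is_even(n):
    return n % 2 == 0

squares = [0, 1, 4, 9, 25]

even_squares = list(filter(is_even, squares))
print(even_squares)  # [0, 4]

# the equivalent list comprehension
print([sq for sq in squares if is_even(sq)])  # [0, 4]
```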

In [202]:
# little known fact, sometimes useful: an "else" can also be attached to a for/while loop

target = 10  # change it! make it something that is and then something that is not in the list

for x in [2, 5, 1, 4, 2]:
    if x == target:
        break

else:  # no break
    print('target not in the list')

target not in the list
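
To see both outcomes without editing `target`, the same loop can be wrapped in a function. This is only a sketch; `find_index` is a name made up for the example.

```python
def find_index(items, target):
    # returns the position of target in items, or -1 if it is absent
    for i, x in enumerate(items):
        if x == target:
            break
    else:  # runs only if the loop finished without hitting `break`
        return -1
    return i

print(find_index([2, 5, 1, 4, 2], 1))   # 2
print(find_index([2, 5, 1, 4, 2], 10))  # -1
```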


### Error handling

In [177]:
l = [1, 2, 3]

In [178]:
l[10]  # the list does not have that many elements, thus an error is given

IndexError: list index out of range

In [179]:
try:
    l[10]
except:
    print('an error occurred')

an error occurred


In [180]:
try:
    lizt[10]  # misspelled name: this raises a NameError, not an IndexError
except IndexError:
    print('bad index')
except Exception as e:
    print('a different error occurred:', str(e))

a different error occurred: name 'lizt' is not defined
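
A common pattern is to catch only the error you expect and fall back to a default value. A minimal sketch, where `get_or_default` is a hypothetical helper written for this example:

```python
def get_or_default(items, index, default=None):
    # returns items[index], or default when the index is out of range
    try:
        return items[index]
    except IndexError:
        return default

l = [1, 2, 3]
print(get_or_default(l, 1))       # 2
print(get_or_default(l, 10, 0))   # 0
```

Any other kind of error (for example a misspelled variable name) is deliberately not caught here, so it still surfaces.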


### Documentation

In [181]:
# type hints tell the reader (and static type checkers such as mypy) what the function expects to receive and to return; they are not enforced at runtime:
def is_even(n: int) -> bool:
    return n % 2 == 0
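
Python itself does not check hints when the code runs; they are simply stored on the function as metadata. A small sketch illustrating this:

```python
def is_even(n: int) -> bool:
    return n % 2 == 0

# hints are stored as metadata on the function
print(is_even.__annotations__)  # {'n': <class 'int'>, 'return': <class 'bool'>}

# a float is accepted at runtime even though the hint says int
print(is_even(4.0))  # True
```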

In [104]:
def is_even(n):
    """
    This multi-line string (a docstring) explains what the function does.
    This function returns whether the integer passed as argument is even.
    """
    return n % 2 == 0

In [182]:
# doctests are quick unit tests that also show examples of usage
def is_even(n):
    """
    The examples below show usage and expected output:
    
    >>> is_even(2)
    True

    >>> is_even(5)
    False
    
    >>> is_even(1)
    True
    """
    return n % 2 == 0

In [183]:
# they can be tested automatically for correctness:
import doctest
doctest.testmod()

**********************************************************************
File "__main__", line 12, in __main__.is_even
Failed example:
    is_even(1)
Expected:
    True
Got:
    False
**********************************************************************
1 items had failures:
   1 of   3 in __main__.is_even
***Test Failed*** 1 failures.


TestResults(failed=1, attempted=3)
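
The failure above comes from the third example, which expects `True` for an odd number. Below is a sketch of a corrected docstring. It is checked with `doctest.DocTestParser` and `doctest.DocTestRunner` from the standard library instead of `testmod`, an alternative chosen here so that only this one function is tested.

```python
import doctest

def is_even(n):
    """
    >>> is_even(2)
    True

    >>> is_even(1)
    False
    """
    return n % 2 == 0

# build a test from the docstring and run it in isolation
test = doctest.DocTestParser().get_doctest(
    is_even.__doc__, {'is_even': is_even}, 'is_even', None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)  # 0 2
```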

## Exercises

**Exercise**: create a list of the first 100 squares

**Exercise**: create a `yell` function that takes a string and a boolean `exclaim`.
It returns the string in all caps and adds an exclamation mark at the end if `exclaim` is true (off by default).

**Exercise**: create a function that returns the minimum and maximum of a dictionary's numeric values

## Jupyter


You can read all kinds of files.

You can open multiple tabs and display multiple files at once, in split view (drag and drop to position).

- **Jupyter**: interactive notebooks that tell a story: code, results (variables, plots), and text
 - Actions quick intro:
   - edit cell
   - run cell: **`ctrl` + `enter`**
   - insert new cell: **`B`**
   - delete cell: **`X`**

 - Text
   - LaTeX: equations $e^{i\pi} + 1 = 0$
   
 - More
   - interactivity
   - HTML
   - system commands

Text can be **bold**, _italic_ or be a [link](example.com)

Below is a separation line:

---

![python logo](https://pbs.twimg.com/profile_images/439154912719413248/pUBY5pVj_400x400.png)

cell operations:
- **`z`** undo
- **`shift-z`** redo
- **`c`** copy
- **`v`** paste

change cell type:
- **`m`** to markdown
- **`y`** to code
- **`1`** to header 1

A cell shows the value of its last expression (except if it is `None`).

particularity for collaboratory

In [259]:
In[10]

"repeat('hi ')"

In [258]:
Out[10]

'hi hi hi '

In [261]:
_10  # as a shortcut for Out[10]

'hi hi hi '

separating line:

---

### Magics

In [249]:
from time import sleep

timing magics

In [253]:
%%time
for i in range(3):
    print(i)
    sleep(.5)

0
1
2
CPU times: user 5.03 ms, sys: 2.16 ms, total: 7.2 ms
Wall time: 1.51 s


In [244]:
%time _ = [n ** 2 for n in range(1_000_000)]

CPU times: user 368 ms, sys: 60.8 ms, total: 429 ms
Wall time: 428 ms


In [245]:
%time _ = list(map(lambda n: n ** 2, range(1_000_000)))

CPU times: user 309 ms, sys: 14.9 ms, total: 324 ms
Wall time: 323 ms


A single measurement is noisy: if you run it again, it might come out different. `%timeit` runs multiple trials and reports the mean and standard deviation:

In [247]:
%timeit _ = [n ** 2 for n in range(1_000_000)]

240 ms ± 2.71 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [248]:
%timeit _ = list(map(lambda n: n ** 2, range(1_000_000)))

291 ms ± 3.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
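
Outside a notebook, the standard-library `timeit` module offers the same kind of measurement. A rough sketch; the size of the statement and the `number`/`repeat` values are arbitrary choices for the example.

```python
import timeit

# run the statement 100 times per trial, for 5 trials
durations = timeit.repeat(
    '[n ** 2 for n in range(1_000)]', repeat=5, number=100)

best = min(durations) / 100  # seconds per single execution, in the best trial
print(f'{best * 1e6:.1f} microseconds per loop (best of 5)')
```

Taking the minimum is one common choice, since slower trials are usually caused by other activity on the machine rather than by the code itself.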


---

In [262]:
%load_ext autoreload
%autoreload 2
# load the extension and activate it, automatically refreshing all imports

In [266]:
from lucky import lucky_number

In [268]:
lucky_number()

24

Change the number in `lucky.py`, save the file and then run the cell above again.
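
The contents of `lucky.py` are not shown in this notebook. A minimal version, assumed here purely for illustration, could look like this:

```python
# lucky.py (hypothetical contents, saved next to the notebook)
def lucky_number():
    # edit the returned value and save the file to see autoreload pick it up
    return 24
```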

Jupyter is meant to be a complement to your IDE, not a replacement. The bulk of your processing (functions, classes, etc.) should be organized in `.py` files, while notebooks should be used for running them and looking at the results.

### External

In [256]:
from IPython.display import HTML, clear_output

In [257]:
for i in range(5):
    clear_output()
    print(i)
    sleep(.5)

4


In [140]:
!ls

Untitled.ipynb   readme.md        workshop-1.ipynb workshop-3.ipynb
index.html       requirements.txt workshop-2.ipynb workshop-4.ipynb


In [235]:
!echo 'Hello from shell!'

Hello from shell!


In [233]:
# you can even assign the output to a python variable
n_files = !ls | wc -l

In [234]:
int(n_files[0].strip())

8
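
In a plain `.py` file, where the `!` syntax is not available, the standard-library `subprocess` module does the same job. A sketch, assuming a Unix-like system where the `echo` command is available:

```python
import subprocess

# run the command and capture its standard output as text
result = subprocess.run(
    ['echo', 'Hello from shell!'], capture_output=True, text=True, check=True)
greeting = result.stdout.strip()
print(greeting)  # Hello from shell!
```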

In [237]:
# most useful is interacting with the package manager (`pip`) and installing new packages without leaving the notebook
!pip list

Package           Version
----------------- -------
appnope           0.1.0  
backcall          0.1.0  
bleach            3.0.2  
cycler            0.10.0 
decorator         4.3.0  
defusedxml        0.5.0  
entrypoints       0.2.3  
ipykernel         5.1.0  
ipython           7.2.0  
ipython-genutils  0.2.0  
jedi              0.13.1 
Jinja2            2.10   
jsonschema        2.6.0  
jupyter-client    5.2.3  
jupyter-core      4.4.0  
jupyterlab        0.35.4 
jupyterlab-server 0.2.0  
kiwisolver        1.0.1  
MarkupSafe        1.1.0  
matplotlib        3.0.2  
mistune           0.8.4  
nbconvert         5.4.0  
nbformat          4.4.0  
notebook          5.7.2  
numpy             1.15.4 
pandas            0.23.4 
pandocfilters     1.4.2  
parso             0.3.1  
pexpect           4.6.0  
pickleshare       0.7.5  
pip               18.1   
prometheus-client 0.4.2  
prompt-toolkit    2.0.7  
ptyprocess        0.6.0  
Pygments          2.3.0  
pyparsing         2.3.0  
python-dateu

In [230]:
# this workshop is not about sh, but you can run any command you wish

In [232]:
HTML('<div style="text-align: center; color: orange; font-size: 30px">Hello from HTML!</div>')
# this workshop is not about html, but you can do anything you wish

In [236]:
HTML('<script>alert("Hello from JavaScript!")</script>')