# Environment Basics

This notebook presents a crash course in the Python language and the Jupyter environment.

TODO: short description

### Table of contents

- [The Python language](#python)
 - [Primitives](#primitives)
 - [Variables](#variables)
 - [Data structures](#data-structures)
   - [List](#list)
   - [String](#string)
   - [Dictionary](#dictionary)
   - [Other](#other-data-structures)
 - [Functions](#functions)
 - [Conditional statements](#conditional-statements)
 - [Error handling](#error-handling)
 - [Documentation](#documentation)
 - [Relevant built-ins](#relevant-built-ins)
   - [Misc functions](#misc-relevant-functions)
   - [Regular expressions](#regex)
   - [OS](#os)
   - [JSON](#json)
   - [Date and time](#datetime)
   - [Serialization](#serialization)


- [The Jupyter environment](#jupyter)
 - [Navigation](#navigation)
 - [Features](#features)
 - [Cell operations](#cell-operations)
 - [Text markup](#text)
   - [Markdown](#markdown)
   - [Latex](#latex)
   - [HTML](#html)
 - [Magics](#magics)
   - [Timing](#timing)
   - [Auto-reload](#autoreload)
 - [Interactivity](#interactivity)


- [Further reading](#further-reading)

TODO: make links

## Python

 - one of the most widely used languages for scientific computing and AI/ML/DS
 - friendly syntax
 - huge community support

This is an introductory tour, not an in-depth one: it covers most of the things you'll need.

It also includes some tips that you might not have known, even if you're familiar with Python (string interpolation, for/else, type hints, doctest).

It assumes some familiarity with programming.

Classes and modules are not covered at all.

In [1]:
# this is a comment

### Primitives

The most used Python primitives are:
 - `int`: integers
 - `float`: real (floating-point) numbers
 - `str`: strings (note that there is no character primitive)
 - `bool`: truth value
 - `None`: absence of a value (similar to `null` in other languages)
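
As a quick check of the list above, the built-in `type` function reports which primitive a value belongs to. This is only a small sketch; the example values and the name `type_names` are made up for illustration.

```python
# one illustrative value per primitive type
type_names = []
for value in [42, 3.14, 'text', True, None]:
    type_names.append(type(value).__name__)

print(type_names)  # ['int', 'float', 'str', 'bool', 'NoneType']
```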

### Basic arithmetic

In [2]:
1 + 9

10

In [3]:
2 ** 3  # power operator

8

### Variables

In [4]:
name = 'Peter Parker'  # either single or double quotes
age = 21
height = 5.84
married = False

In [5]:
age / 2

10.5

In [6]:
age // 2  # floor division: keeps only the whole part

10

---

In [7]:
type(age)  # checking the type of a variable

int

In [8]:
type(height) is int

False

In [9]:
int(2.8)  # type conversion

2

In [10]:
int('2')

2

In [11]:
float(2)

2.0

In [12]:
str(123)

'123'

### Data Structures

The most used basic data structures are:
 - `list`: a sequence of elements
 - `dict`: dictionary mapping keys to values

### List

In [13]:
squares = [0, 1, 4, 9]

In [14]:
squares

[0, 1, 4, 9]

In [15]:
squares.append(25)  # add one element at the end

In [16]:
squares

[0, 1, 4, 9, 25]

In [17]:
len(squares)  # number of elements

5

In [18]:
squares[0]  # zero-indexed

0

In [19]:
squares[-1]  # last element

25

In [20]:
4 in squares  # test membership

True

In [21]:
5 in squares  # the list does not contain 5

False

In [22]:
squares.index(4)  # get the index of an element

2

In [23]:
squares

[0, 1, 4, 9, 25]

In [24]:
squares[1:]  # slice: from the second element onwards

[1, 4, 9, 25]

In [25]:
squares[:-1]  # all but the last element

[0, 1, 4, 9]

In [26]:
squares + [49, 'abc', 'xyz']  # concatenate lists

[0, 1, 4, 9, 25, 49, 'abc', 'xyz']

### Strings

In [27]:
name

'Peter Parker'

In [28]:
# a string supports most list operations (but, unlike a list, it is immutable)
len(name)

12

In [29]:
name[0]

'P'

In [30]:
name[1:]

'eter Parker'

In [31]:
'Hello ' + name

'Hello Peter Parker'

In [32]:
'ha' * 3

'hahaha'

In [33]:
name.split()  # by default, it splits on whitespace

['Peter', 'Parker']

In [34]:
name.lower()

'peter parker'

In [37]:
'-'.join(['ab', 'cd', 'yz'])

'ab-cd-yz'

In [38]:
f'{name} is {age} years old'  # note the f at beginning

'Peter Parker is 21 years old'

In [41]:
'abc abcd'.replace('ab', 'X')

'Xc Xcd'

In [42]:
'  abc  '.strip()

'abc'

In [43]:
pi = 3.14159

In [48]:
f'π is {pi}'

'π is 3.14159'

In [44]:
f'π as a whole number {pi:.0f}'  # no decimals

'π as a whole number 3'

In [47]:
f'first decimals of π {pi:.3f}'

'first decimals of π 3.142'

In [52]:
per = 0.705  # 70.5%
f'as a percentage {per:.1%}'

'as a percentage 70.5%'

In [54]:
f'as a percentage {per:.4%}'

'as a percentage 70.5000%'

### Dictionary

A dictionary behaves like a list whose keys do not have to be numbers.

It has the time complexity of a hash map (lookup, insertion, and deletion in $O(1)$ on average), and since Python 3.7 it is also guaranteed to preserve insertion order.

In [55]:
superhero_ages = {
    'Ironman':   36,
    'Hulk':      38,
    'Thor':      'varies',  # does not have to be homogeneous
}

In [56]:
len(superhero_ages)

3

In [57]:
superhero_ages['Ironman']

36

In [58]:
superhero_ages['Spiderman'] = 21  # add an element

In [59]:
superhero_ages

{'Ironman': 36, 'Hulk': 38, 'Thor': 'varies', 'Spiderman': 21}

In [60]:
del superhero_ages['Hulk']  # remove an element

In [61]:
superhero_ages

{'Ironman': 36, 'Thor': 'varies', 'Spiderman': 21}

In [62]:
'Deadpool' in superhero_ages  # whether the key is in the dictionary

False
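
To illustrate the ordering behaviour of dictionaries, here is a minimal sketch. The `villain_ages` data is hypothetical and not part of the notebook; `.get` is the standard `dict` method for looking up a key that may be missing.

```python
# keys come back in the order in which they were inserted (Python 3.7+)
villain_ages = {}            # hypothetical data, for illustration only
villain_ages['Thanos'] = 1000
villain_ages['Loki'] = 1054
villain_ages['Ultron'] = 1

print(list(villain_ages))           # ['Thanos', 'Loki', 'Ultron']

# .get returns None (or a chosen default) instead of raising KeyError
print(villain_ages.get('Hela'))     # None
print(villain_ages.get('Hela', 0))  # 0
```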

---

### Other data structures

Tuples are similar to lists, but they are immutable and meant for heterogeneous data (think of them as lightweight classes).

In [67]:
info = ('Los Angeles', 21, 'vanilla')  # information about a person

In [68]:
info[0]

'Los Angeles'
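
A short sketch of two things tuples are commonly used for: unpacking into named variables, and guarding against accidental modification. The variable names are illustrative.

```python
info = ('Los Angeles', 21, 'vanilla')

# unpack the tuple into one variable per field
city, person_age, flavor = info
print(city, person_age, flavor)  # Los Angeles 21 vanilla

# tuples cannot be modified in place
try:
    info[0] = 'New York'
except TypeError as error:
    message = str(error)

print(message)
```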

Sets are similar to lists, but they are unordered, contain no duplicates, and test membership in $O(1)$ time on average.

In [69]:
s = {2, 1, 4}

In [70]:
2 in s

True
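
Beyond membership tests, sets are handy for removing duplicates and for union, intersection, and difference. A minimal sketch with made-up data:

```python
colors_seen = ['red', 'blue', 'red', 'green', 'blue']  # hypothetical data

unique_colors = set(colors_seen)   # duplicates are dropped
print(len(unique_colors))          # 3
print(sorted(unique_colors))       # ['blue', 'green', 'red']

a = {1, 2, 3}
b = {3, 4}
print(a | b)  # union: {1, 2, 3, 4}
print(a & b)  # intersection: {3}
print(a - b)  # difference: {1, 2}
```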

### Iteration

TODO: break, continue

In [71]:
# list iteration
for sq in squares:
    print(sq)

0
1
4
9
25


In [73]:
# dict iteration
for name, age in superhero_ages.items():
    print(name, age)

Ironman 36
Thor varies
Spiderman 21


In [74]:
for i in range(10):  # idiom for C-like languages' for(i = 0; i<10; i++)
    print(i)

0
1
2
3
4
5
6
7
8
9


Iteration-related functions:

In [75]:
colors = ['red', 'green', 'blue', 'black']

In [77]:
for col in colors:
    print(col)

red
green
blue
black


In [76]:
for col in reversed(colors):
    print(col)

black
blue
green
red


In [78]:
for i, col in enumerate(colors):
    print(i, col)

0 red
1 green
2 blue
3 black


In [79]:
squares

[0, 1, 4, 9, 25]

In [80]:
for sq, col in zip(squares, colors):
    print(sq, col)

0 red
1 green
4 blue
9 black


In [81]:
# `while` is not used as often
k = 0
while k <= 10:
    k += 1  # idiom: Python has no k++ operator

In [82]:
k

11
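
The loops above always run to completion. As a sketch of how to alter that, `continue` skips to the next iteration and `break` leaves the loop entirely (the list name `found` is illustrative):

```python
found = []
for n in range(10):
    if n % 2 == 1:
        continue   # skip odd numbers, move on to the next n
    if n > 6:
        break      # stop the loop as soon as an even n exceeds 6
    found.append(n)

print(found)  # [0, 2, 4, 6]
```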

---

Multiple ways of creating a collection based on another's elements:

In [83]:
color_lengths = []

for col in colors:
    l = len(col)
    color_lengths.append(l)

In [84]:
color_lengths

[3, 5, 4, 5]

In [85]:
[len(col) for col in colors]  # list comprehension, equivalent but shorter

[3, 5, 4, 5]

In the next subsection we'll learn an even shorter way, using `map`

In [86]:
{col: len(col) for col in colors}  # dict comprehension

{'red': 3, 'green': 5, 'blue': 4, 'black': 5}

### Functions

TODO: move earlier

In [87]:
def double(n):
    # takes an argument and returns it doubled
    return n * 2

In [88]:
double = lambda n: n * 2  # equivalent but shorter way to write simple functions

In [89]:
double(5)

10

Duck-typing allows the function to work on any kind of argument that supports the `*` operator:

In [90]:
double(1.2)

2.4

In [91]:
double('ha')

'haha'

In [92]:
double([1, 2, 3])

[1, 2, 3, 1, 2, 3]

---

Functions can take anything as arguments, even other functions.

One such special function is `map`, which takes two arguments, a function and a collection, and it applies the function to each element of the collection.

In [None]:
map(len, colors)  # get the length of each color, equivalent to definitions above


*Note*: to save memory and computation, `map` is lazily evaluated: it returns an iterator (a `map` object) that yields one element at a time, on request.

In [None]:
result = map(len, colors)
for l in result:
    print(l)

Iterating over it again correctly yields no elements, as they have all been consumed:

In [None]:
for l in result:
    print(l)

In [None]:
# evaluate all elements
list(map(double, squares))
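
The lazy behaviour can also be observed one element at a time with the built-in `next`. This is a small self-contained sketch; the `words` list is made up for illustration.

```python
words = ['red', 'green', 'blue']
lengths = map(len, words)   # nothing is computed yet

first = next(lengths)       # computes only the first element
rest = list(lengths)        # consumes the remaining elements
leftover = list(lengths)    # the iterator is now exhausted

print(first)     # 3
print(rest)      # [5, 4]
print(leftover)  # []
```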

---

A function can have default arguments:

In [93]:
def repeat(s, times=3):
    # by default, it repeats 3 times
    return s * times

In [94]:
repeat('ha ')

'ha ha ha '

In [95]:
repeat('ha ', 5)

'ha ha ha ha ha '

In [96]:
repeat('ha ', times=5)  # arguments can be named

'ha ha ha ha ha '

In [97]:
def repeat_extra(s):
    # returns multiple values
    return len(s), s * 3

In [98]:
repeat_extra('ha')

(2, 'hahaha')

In [99]:
result = repeat_extra('ha')
length   = result[0]
repeated = result[1]

In [100]:
length, repeated = repeat_extra('ha')  # shorter way of destructuring the result

In [101]:
length

2

In [102]:
repeated

'hahaha'

In [103]:
a, b, c = [1, 2, 3]  # destructuring works with any iterable, not just function results

In [104]:
a

1

In [105]:
[a, b, c] = [1, 2, 3, 4]  # gives an error upon mismatch of length

ValueError: too many values to unpack (expected 3)

### Conditional statements

For control flow

In [106]:
def is_even(n):
    # tell whether the argument is even
    return (n % 2 == 0)

In [107]:
if is_even(2):
    print('works')
else:
    print('broken')

works


In [108]:
small_squares = []
for sq in squares:
    if sq < 5:
        small_squares.append(sq)

In [109]:
small_squares

[0, 1, 4]

In [110]:
[sq for sq in squares if sq < 5]  # works in list comprehension as well

[0, 1, 4]

In [None]:
filter(is_even, squares)  # filter is another higher-order function: it lazily yields only the elements that pass the predicate
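
Like `map`, `filter` returns a lazy iterator, so wrapping it in `list` is needed to see the elements. A self-contained sketch that repeats the notebook's `is_even` and `squares` definitions:

```python
def is_even(n):
    # tell whether the argument is even
    return n % 2 == 0

squares = [0, 1, 4, 9, 25]

even_squares = list(filter(is_even, squares))
print(even_squares)  # [0, 4]

# the equivalent list comprehension
same_result = [sq for sq in squares if is_even(sq)]
print(same_result == even_squares)  # True
```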

In [None]:
# little known fact, sometimes useful: an "else" can also be attached to a for/while loop

target = 10  # change it! make it something that is and then something that is not in the list

for x in [2, 5, 1, 4, 2]:
    if x == target:
        break

else:  # no break
    print('target not in the list')
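
To see both outcomes of the for/else construct side by side, the loop can be wrapped in a function. This is only a sketch; the name `describe` and its return strings are illustrative.

```python
def describe(target, numbers):
    for x in numbers:
        if x == target:
            break                      # found it: the else branch is skipped
    else:                              # runs only if the loop ended without break
        return 'target not in the list'
    return 'target found'

print(describe(5, [2, 5, 1, 4, 2]))   # target found
print(describe(10, [2, 5, 1, 4, 2]))  # target not in the list
```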

## Error handling

In [111]:
l = [1, 2, 3]

In [112]:
l[10]  # the list does not have that many elements, thus an error is given

IndexError: list index out of range

In [113]:
try:
    l[10]
except:  # a bare except catches every error; naming the exception type is usually safer
    print('an error occurred')

an error occurred


In [114]:
try:
    lizt[10]
except IndexError:
    print('bad index')
except Exception as e:
    print('a different error occurred:', str(e))

a different error occurred: name 'lizt' is not defined
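
Two common patterns build on the examples above: catching one specific exception to supply a fallback value, and raising an exception yourself with `raise`. The helper names `safe_get` and `parse_age` are hypothetical, chosen only for this sketch.

```python
def safe_get(items, index, default=None):
    """Return items[index], or default when the index is out of range."""
    try:
        return items[index]
    except IndexError:
        return default


def parse_age(text):
    """Convert text to a non-negative int, raising ValueError otherwise."""
    age = int(text)  # int() itself raises ValueError for input such as 'abc'
    if age < 0:
        raise ValueError(f'age cannot be negative: {age}')
    return age


print(safe_get([1, 2, 3], 1))      # 2
print(safe_get([1, 2, 3], 10))     # None
print(safe_get([1, 2, 3], 10, 0))  # 0
print(parse_age('21'))             # 21
```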


### Documentation

In [None]:
# type hints tell the user (and static type checkers such as mypy) what the function is expected to receive and return; they are not enforced at runtime:
def is_even(n: int) -> bool:
    return n % 2 == 0
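
A brief sketch of what type hints do and do not do at runtime: they are stored on the function as metadata, but Python itself does not check them.

```python
def is_even(n: int) -> bool:
    return n % 2 == 0

# the hints are available as plain metadata
print(is_even.__annotations__)  # {'n': <class 'int'>, 'return': <class 'bool'>}

# a float is accepted without complaint, despite the `int` hint
print(is_even(4.0))  # True
```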

In [None]:
def is_even(n):
    """
    This multi-line string (a docstring) explains what the function does.
    This function returns whether the integer passed as argument is even.
    """
    return n % 2 == 0

In [None]:
# doctests are quick unit tests that double as usage examples
def is_even(n):
    """
    The examples below show usage and expected output:
    
    >>> is_even(2)
    True

    >>> is_even(5)
    False
    
    >>> is_even(1)
    True
    """
    return n % 2 == 0

In [None]:
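# they can be tested automatically for correctness; an example whose actual output differs from the documented one (such as `is_even(1)` above) is reported as a failure: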
import doctest
doctest.testmod()
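
If you want to check a single function rather than the whole module, the `doctest` module also exposes a finder and a runner. The sketch below is one possible way to do it; `is_odd` is a hypothetical function written only for this example.

```python
import doctest


def is_odd(n):
    """
    Return whether the integer n is odd.

    >>> is_odd(3)
    True
    >>> is_odd(4)
    False
    """
    return n % 2 == 1


# collect the examples from the docstring and run them
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(is_odd, name='is_odd', module=False, globs={'is_odd': is_odd}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.failed, results.attempted)  # 0 2
```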

## Exercises

TODO: add more, easier ones at the beginning

**Exercise**: create a list of the first 100 squares (programmatically)

In [118]:
first_100_squares = [x ** 2 for x in range(100)]

**Exercise**: create a `yell` function that takes a string and a boolean `exclaim`.
It returns the string in all caps and adds an exclamation mark at the end if `exclaim` is true (off by default).

**Exercise**: create a function that returns the minimum and maximum of a dictionary's numeric values

## Relevant built-in functions

In [None]:
sorted([9, 1, 4])

In [None]:
with open('example_files/plain.txt') as f:
    print('file contents:', f.read())

In [None]:
any([1 < 3, 1 + 1 == 2, 0 == 1])  # compare values with ==, not `is`

In [None]:
all([1 < 3, 1 + 1 == 2, 0 == 1])

## Relevant built-in packages

Regular expressions extract information from structured text:

In [None]:
import re

In [None]:
re.sub(
    pattern=r'\d{5}',  # raw string; matches any five consecutive digits
    repl='[removed]',
    string='Hi, this is funny_bunny_94 and my zip code is 90007',
)
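
Besides substitution, two other commonly used functions are `re.findall` (all matches as a list) and `re.search` (first match anywhere in the string, with capture groups). A sketch reusing the sample text above; the second input string is made up.

```python
import re

text = 'Hi, this is funny_bunny_94 and my zip code is 90007'

# \b marks a word boundary, so only standalone five-digit runs match
zip_codes = re.findall(r'\b\d{5}\b', text)
print(zip_codes)  # ['90007']

# parentheses capture parts of the match
match = re.search(r'(\w+)@(\w+)\.com', 'contact: name94@example.com')
user, domain = match.groups()
print(user, domain)  # name94 example
```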

In [None]:
# rudimentary email, just for show
# one or more word characters (letters, digits, underscore), followed by @, followed by at least three lowercase letters and ending in .com
if re.fullmatch(r'\w+@[a-z]{3,}\.com', 'name94@example.com'):  # fullmatch requires the whole string to match
    print('matched')

---

In [None]:
import os

In [None]:
os.makedirs('example_files', exist_ok=True)

In [None]:
from pathlib import Path

In [None]:
current_folder = Path('.')

In [None]:
# print only files in the current folder
for entry in current_folder.iterdir():
    if not entry.is_dir():
        print(entry)

In [None]:
current_folder / 'example_files'  # traverse using the / operator
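
A slightly larger sketch of what `Path` objects offer, run inside a temporary directory so that nothing is left behind. The folder and file names are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / 'notes'          # build paths with the / operator
    folder.mkdir()

    file_path = folder / 'hello.txt'
    file_path.write_text('hi there')      # create the file and write to it
    contents = file_path.read_text()

    # useful parts of a path
    name, stem, suffix = file_path.name, file_path.stem, file_path.suffix

    # list matching files in a folder
    txt_files = sorted(p.name for p in folder.glob('*.txt'))

print(contents)            # hi there
print(name, stem, suffix)  # hello.txt hello .txt
print(txt_files)           # ['hello.txt']
```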

---

In [None]:
import json

In [None]:
with open('example_files/objects.json') as f:
    print(json.load(f))
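
JSON can also be converted to and from strings, without a file, using `json.dumps` and `json.loads`. A minimal round-trip sketch with made-up data:

```python
import json

hero = {'name': 'Peter Parker', 'age': 21, 'married': False, 'aliases': ['Spiderman']}

text = json.dumps(hero)      # Python object -> JSON string
restored = json.loads(text)  # JSON string -> Python object

print(text)
print(restored == hero)  # True
```

Note that Python's `False` is written as `false` in the JSON text.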

---

In [None]:
from datetime import datetime

In [None]:
datetime.now()

In [None]:
datetime(year=2018, month=2, day=28)
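
Dates support arithmetic through `timedelta`, and can be converted to and from strings with `strftime` and `strptime`. A short sketch:

```python
from datetime import datetime, timedelta

start = datetime(year=2018, month=2, day=28)

next_day = start + timedelta(days=1)        # 2018 is not a leap year
formatted = next_day.strftime('%Y-%m-%d')   # datetime -> string
parsed = datetime.strptime('2018-03-01', '%Y-%m-%d')  # string -> datetime

print(formatted)               # 2018-03-01
print(parsed == next_day)      # True
print((next_day - start).days) # 1
```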

In [None]:
# not built-in but related, and very useful
from dateparser import parse as parse_date

In [None]:
parse_date('1 day and 2 hours ago')

In [None]:
parse_date('28 feb')

---

In [None]:
import pickle

In [None]:
serialized = pickle.dumps(superhero_ages)  # you can save these bytes to a binary file

In [None]:
serialized

In [None]:
pickle.loads(serialized)
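
Saving to a binary file, as mentioned above, could look like the following sketch. It uses a temporary directory and a made-up file name (`ages.pkl`) so that it is self-contained.

```python
import pickle
import tempfile
from pathlib import Path

ages = {'Ironman': 36, 'Thor': 'varies'}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'ages.pkl'

    with open(path, 'wb') as f:   # 'wb': write in binary mode
        pickle.dump(ages, f)

    with open(path, 'rb') as f:   # 'rb': read in binary mode
        restored = pickle.load(f)

print(restored == ages)  # True
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.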

<a id="jupyter"></a>
## Jupyter

You can open all kinds of files.

You can open multiple tabs and display multiple files at once in split view (drag and drop to position them).

Select one or more cells and drag them to reposition them within a notebook, or to move them between notebooks.

TODO: indicator of working kernel

Interface interaction:
 - navigate folders and files in the [File Browser]  (on the sidebar to the right)
 - select multiple, move files using drag-n-drop
 - download/upload using drag-n-drop to/from your system's file browser
 - search all commands in the [Commands Palette] (on the sidebar to the right)
 - [View] > [Show line numbers] to more easily track error lines

TODO: tab completion, shift_tab function argument

**Jupyter**: interactive notebooks that tell a story: code, results (variables, plotting), and

Actions quick intro:
 - add cell
 - run cell 

When you open a notebook, a new kernel is started. Once you close it, all variable values are lost.

If the kernel is stuck, or processing takes too long, you can interrupt it. Variables remain intact; only the cell that was running is stopped.

 - **`ctrl+s`** save

running:
 - **`ctrl+enter`** run cell
 - **`shift+enter`** run cell and advance to next one
 - **`ii`** interrupt kernel
 - **`00`** restart kernel
 - [Edit] > [Clear all outputs]

cell operations:
- **`esc`** to exit out of edit mode
- **`z`** undo
- **`shift-z`** redo
- **`x`** cut (use as delete)
- **`c`** copy
- **`v`** paste
- **`shift-m`** merge cells


change cell type:
- **`m`** to markdown
- **`y`** to code
- **`1`** to header 1, etc

These options are also available in the menu bar at the top

A cell shows the value of its last expression (unless it is `None`).

particularity for collaboratory

---

In [None]:
_  # the last result

In [None]:
In[10]

In [None]:
Out[10]

In [None]:
_10  # as a shortcut for Out[10]

## Text markup

### Markdown language

Text styles:
 - regular
 - **bold**
 - _italic_ 
 - `code` 
 - [link](example.com)
 
 
 
> this is a quote

block of code:
```html
<div id="greeting">hello</div>
```


Ordered lists:
 1. first
 2. second
 3. third
   - this is
   - a sublist

Headers (1`#` through 6`######`)

Below is a separation line:

---

This is an embedded image (note the `!` before the link):

![python logo](https://pbs.twimg.com/profile_images/439154912719413248/pUBY5pVj_bigger.png)

## Latex

equations $e^{i\pi} + 1 = 0$

## HTML

In [None]:
from IPython.display import HTML

In [None]:
HTML('<div style="text-align: center; color: orange; font-size: 30px">Hello from HTML!</div>')
# this workshop is not about html, but you can do anything you wish

### Magics

In [None]:
from time import sleep

In [None]:
%%time
for i in range(3):
    print(i)
    sleep(.5)

In [None]:
%time _ = [n ** 2 for n in range(1_000_000)]

In [None]:
%time _ = list(map(lambda n: n ** 2, range(1_000_000)))

If you run it again, the timing might come out different, so you may want to average over multiple trials:

In [None]:
%timeit _ = [n ** 2 for n in range(1_000_000)]

In [None]:
%timeit _ = list(map(lambda n: n ** 2, range(1_000_000)))
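
Outside of Jupyter, where the `%timeit` magic is not available, the standard library's `timeit` module offers similar repeated measurements. A minimal sketch (the statement and the repeat counts are arbitrary choices for illustration):

```python
import timeit

# run the statement 100 times per trial, for 3 trials
timings = timeit.repeat(
    stmt='[n ** 2 for n in range(1_000)]',
    repeat=3,
    number=100,
)

best = min(timings)  # the fastest trial is usually the least noisy
print(f'best of {len(timings)} trials: {best:.4f} s per 100 runs')
```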

---

In [None]:
%load_ext autoreload
%autoreload 2
# load the extension and activate it, automatically refreshing all imports

In [None]:
from lucky import lucky_number

In [None]:
lucky_number()

Change the number in `lucky.py`, save the file and then run the cell above again.

Jupyter is meant to be a complement to your IDE, not a replacement. The bulk of your processing (functions, classes, etc.) should be organized in `.py` files, while notebooks should be used for running them and looking at the results.

That is why it is also not possible to import from other notebooks.

---

In [None]:
# because this is not a package, an extra step is needed in order to import from folders
from sys import path
path.append('.')

In [None]:
from example_files.module import f

In [None]:
f()

### External languages

In [None]:
!ls

In [None]:
!echo 'Hello from shell!'

In [None]:
# you can even assign the output to a python variable
n_files = !ls | wc -l

In [None]:
int(n_files[0].strip())

In [None]:
# most useful is interacting with the package manager (`pip`) and installing new packages without leaving the notebook
!pip install numpy

---

In [None]:
HTML('alert("Hello from JavaScript!")')  # put <script> tags around it and run it!

### Interactivity

In [None]:
from IPython.display import clear_output

In [None]:
for i in range(5):
    clear_output()
    print(i)
    sleep(.5)

---

In [None]:
#!jupyter labextension install @jupyter-widgets/jupyterlab-manager

In [None]:
from ipywidgets import interact

In [None]:
import ipywidgets

In [None]:
def power(base, exp, negative):
    result = base ** exp
    if negative:
        result *= -1
    return round(result, 2)

_ = interact(power, base=2.5, exp=3, negative=False)

## Further reading
 - cheatsheet: TODO
 - Python: [interactive tutorial](https://www.codecademy.com/learn/learn-python-3)
 - Regex: [reference](https://docs.python.org/3/library/re.html)
 - Latex: [tutorial series](https://www.sharelatex.com/blog/latex-guides/beginners-tutorial.html)
 - Markdown: [GFM guide](https://guides.github.com/features/mastering-markdown/)
 - HTML: [interactive tutorial](https://www.codecademy.com/learn/learn-html)
 - JavaScript: [MDN guide](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide)
 - Command line: [interactive tutorial](https://www.codecademy.com/learn/learn-the-command-line)
 - Date formatting: [reference](https://pyformat.info)
 - Number formatting: [reference](https://docs.python.org/3/library/string.html#formatspec)
 - IDE: [PyCharm](https://www.jetbrains.com/pycharm/) (free with .edu email)
