# Crash course in Jupyter and Python

- Introduction to Jupyter
    - Using Markdown
    - Magic functions
    - REPL
    - Saving and exporting Jupyter notebooks
- Python
    - Data types
    - Operators
    - Collections
    - Functions and methods
    - Control flow
    - Loops and comprehensions
    - Packages and namespace
    - Coding style
    - Understanding error messages
    - Getting help

## Class Repository

Course material will be posted here. Please make any personal modifications to a **copy** of the notebook to avoid merge conflicts.

https://github.com/cliburn/sta-663-2021.git

## Introduction to Jupyter

- [Official Jupyter docs](https://jupyter.readthedocs.io/en/latest/)
- User interface and kernels
- Notebook, editor, terminal
- Literate programming: Making code very readable, mixing code and markdown
- Code and markdown cells
- Menu and toolbar
- Key bindings
- Polyglot programming: Can mix and match languages

### Using Markdown

- What is markdown?
- Headers
- Formatting text
- Syntax-highlighted code
- Lists
- Hyperlinks and images
- LaTeX

See `Help | Markdown`

### Magic functions

- [List of magic functions](https://ipython.readthedocs.io/en/stable/interactive/magics.html)
- `%magic`
- Shell access
- Convenience functions
- Quick and dirty text files

In [3]:
%magic

In [4]:
! ls # The ! prefix runs a shell command

S01_Jupyter_and_Python.ipynb exercises
S02_Text.ipynb               solutions


In [6]:
%%bash
# Use the %%bash cell magic for longer shell scripts (it must be the first line of the cell)

ls

S01_Jupyter_and_Python.ipynb
S02_Text.ipynb
exercises
solutions


### REPL

- Read, Eval, Print, Loop
- Learn by experimentation

In [None]:
1 + 2

### Saving and exporting Jupyter notebooks

- The File menu item
- Save and Checkpoint
- Exporting
- Close and Halt
- Cleaning up with the Running tab

## Introduction to Python

Python is a general-purpose language (in contrast to R, which was developed for statistics).

- [Official Python docs](https://docs.python.org/3/)
- [Why Python?](https://insights.stackoverflow.com/trends?tags=python%2Cjavascript%2Cjava%2Cc%2B%2B%2Cr%2Cjulia-lang%2Cscala&utm_source=so-owned&utm_medium=blog&utm_campaign=gen-blog&utm_content=blog-link&utm_term=incredible-growth-python)
- General purpose language (web, databases, introductory programming classes)
- Language for scientific computation (physics, engineering, statistics, ML, AI)
- Human readable
- Interpreted
- Dynamic typing
- Strong typing
- Multi-paradigm
- Implementations (CPython, PyPy, Jython, IronPython)
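
The "dynamic typing" and "strong typing" bullets above are easy to confuse, so here is a minimal sketch of the difference. The variable names are only illustrative.

```python
# Dynamic typing: a name can be rebound to objects of different types.
value = 42
first_type = type(value).__name__     # 'int'
value = "forty-two"
second_type = type(value).__name__    # 'str'

# Strong typing: Python does not silently coerce unrelated types.
try:
    result = "1" + 2
except TypeError as err:
    result = None
    message = str(err)

print(first_type, second_type)
print(message)
```

The type belongs to the object rather than to the name, which is why rebinding `value` works while mixing `str` and `int` in `+` raises a `TypeError`.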

![img](https://d6vdma9166ldh.cloudfront.net/media/images/1560163643152-Image-26.jpg)

### Data types

- boolean 
- int, float, complex
- strings
- None

In [9]:
# No T, F shortcuts like we have in R
True, False

(True, False)

In [None]:
1, 2, 3

In [8]:
import numpy as np

np.pi, np.e

(3.141592653589793, 2.718281828459045)

In [12]:
# Uses j, rather than i, for complex numbers
3 + 4j

(3+4j)

In [10]:
'hello, world'

'hello, world'

In [14]:
# Double quotes helpful if string contains single quote (no need to escape it)
"hell's bells"

"hell's bells"

In [13]:
"""三轮车跑的快
上面坐个老太太
要五毛给一块
你说奇怪不奇怪"""

'三轮车跑的快\n上面坐个老太太\n要五毛给一块\n你说奇怪不奇怪'

In [None]:
None

In [15]:
# Use "is" to check for identity... == for equality
None is None

True

### Operators

- mathematical
- logical
- bitwise
- membership
- identity
- assignment and in-place operators
- operator precedence

#### Arithmetic

In [16]:
# Exponentiation
2 ** 3

8

In [17]:
11 / 3

3.6666666666666665

In [18]:
# Integer division
11 // 3

3

In [19]:
# Modulo (remainder)
11 % 3

2

#### Logical

In [21]:
True and False

False

In [22]:
True or False

True

In [23]:
not (True or False)

False

#### Relational

In [None]:
2 == 2, 2 == 3, 2 != 3, 2 < 3, 2 <= 3, 2 > 3, 2 >= 3

#### Bitwise

In [24]:
format(10, '04b')

'1010'

In [None]:
format(7, '04b')

In [None]:
x = 10 & 7
x, format(x, '04b')

In [None]:
x = 10 | 7
x, format(x, '04b')

In [None]:
x = 10 ^ 7
x, format(x, '04b')
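
The bitwise cells above are shown without output, so the following sketch works through the same operands (10 and 7) with the expected bit patterns in comments. The variable names are hypothetical.

```python
a, b = 10, 7                      # 0b1010 and 0b0111

and_bits = format(a & b, '04b')   # bits set in both operands
or_bits = format(a | b, '04b')    # bits set in either operand
xor_bits = format(a ^ b, '04b')   # bits set in exactly one operand

print(a & b, and_bits)   # 2 0010
print(a | b, or_bits)    # 15 1111
print(a ^ b, xor_bits)   # 13 1101
```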

#### Membership

In [25]:
'hell' in 'hello'

True

In [26]:
# range(n) runs from 0 up to n - 1
3 in range(5), 7 in range(5)

(True, False)

In [27]:
'a' in dict(zip('abc', range(3)))

True

#### Identity

In [30]:
x = [2,3]
y = [2,3]
x == y, x is y # "is" returns false because x and y are stored in different places in memory

(True, False)

In [31]:
id(x), id(y)

(140201278187232, 140201278187312)

In [32]:
x = 'hello'
y = 'hello'

In [33]:
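# CPython often reuses (interns) identical string literals, so both names can refer to the same object
# This is an implementation detail: always compare strings with ==, not "is"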
x == y, x is y

(True, True)

In [None]:
id(x), id(y)

#### Assignment

In [34]:
# No arrow notation. Just use equals for assignment
x = 2

In [35]:
x = x + 2

In [36]:
x

4

In [37]:
# Means x = x * 2
x *= 2

In [38]:
x

8
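
Operator precedence is listed in the topics for this section but has no cell of its own. The sketch below illustrates a few common cases; when in doubt, parentheses make the intent explicit.

```python
product_first = 2 + 3 * 4        # * binds tighter than +
grouped = (2 + 3) * 4            # parentheses override precedence
negated_power = -2 ** 2          # ** binds tighter than unary minus
right_assoc = 2 ** 3 ** 2        # ** is right-associative: 2 ** (3 ** 2)
not_before_or = not True or True # parsed as (not True) or True

print(product_first, grouped, negated_power, right_assoc, not_before_or)
# 14 20 -4 512 True
```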

### Collections

- Sequence containers - list, tuple
- Set and mapping containers - set, dict
- The [`collections`](https://docs.python.org/3/library/collections.html) module

#### Lists

In [39]:
# Python is zero-indexed
xs = [1,2,3]
xs[0], xs[-1] # Counts back from the end

(1, 3)

In [42]:
# A list can contain a mix of different types
[1, 2.3, 'hello', ['a', 'b']]

[1, 2.3, 'hello', ['a', 'b']]

In [40]:
xs[1] = 9
xs

[1, 9, 3]
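
Beyond single-element indexing, lists (and tuples and strings) also support slicing with `start:stop:step`. This is a short sketch with a made-up list; `stop` is exclusive, matching the zero-based indexing shown above.

```python
letters = list('abcdef')

first_three = letters[:3]      # indices 0, 1, 2
last_two = letters[-2:]        # negative indices count from the end
every_other = letters[::2]     # step of 2
reversed_copy = letters[::-1]  # a negative step walks backwards

print(first_three, last_two, every_other, reversed_copy)
```

Slicing returns a new list, so `letters` itself is unchanged.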

#### Tuples

In [41]:
# Tuples are immutable (while lists are mutable)
ys = (1,2,3)
ys[0], ys[-1]

(1, 3)

In [43]:
try:
    ys[1] = 9
except TypeError as e:
    print(e)

'tuple' object does not support item assignment


#### Sets

In [44]:
zs = [1,1,2,2,2,3,3,3,3]

In [45]:
# Sets have no duplicates
set(zs)

{1, 2, 3}
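
Sets also support the usual set-algebra operators. The two sets below are made up for illustration.

```python
evens = {0, 2, 4, 6}
small = {0, 1, 2, 3}

union = evens | small          # in either set
intersection = evens & small   # in both sets
difference = evens - small     # in evens but not in small
symmetric = evens ^ small      # in exactly one of the sets

# Sets are unordered, so sort before printing for a stable display
print(sorted(union), sorted(intersection), sorted(difference), sorted(symmetric))
```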

#### Dictionaries

In [46]:
# Dictionary contains key, value pairs. 3 ways to create dictionaries shown below
{'a': 0, 'b': 1, 'c': 2}

{'a': 0, 'b': 1, 'c': 2}

In [47]:
dict(a=0, b=1, c=2)

{'a': 0, 'b': 1, 'c': 2}

In [48]:
dict(zip('abc', range(3)))

{'a': 0, 'b': 1, 'c': 2}
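
The `collections` module mentioned at the start of this section provides specialized containers. The sketch below shows three commonly used ones (`Counter`, `defaultdict`, `namedtuple`); the data and the `Point` type are hypothetical examples.

```python
from collections import Counter, defaultdict, namedtuple

# Counter: tally hashable items
counts = Counter('abracadabra')
most_common = counts.most_common(1)   # [('a', 5)]

# defaultdict: missing keys get a default value (here, an empty list)
by_letter = defaultdict(list)
for word in ['apple', 'avocado', 'banana', 'blueberry', 'cherry']:
    by_letter[word[0]].append(word)

# namedtuple: a tuple whose fields can be accessed by name
Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)

print(most_common, dict(by_letter), p.x + p.y)
```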

### Functions and methods

- Anatomy of a function
- Docstrings
- Class methods

In [49]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [50]:
# List comprehension (we'll talk more about this later)
[item for item in dir() if not item.startswith('_')]

['In', 'Out', 'exit', 'get_ipython', 'np', 'quit', 'x', 'xs', 'y', 'ys', 'zs']

In [51]:
def f(a, b):
    """Do something with a and b.
    
    Assume that the + and * operators are defined for a and b.
    """
    
    return 2*a + 3*b

In [52]:
# Maps arguments in order
f(2, 3)

13

In [53]:
f(3, 2)

12

In [54]:
f(b=3, a=2)

13

In [55]:
# Use * to unpack the elements of a tuple as positional arguments
f(*(2,3))

13

In [56]:
# Use ** to unpack the items of a dictionary as keyword arguments
f(**dict(a=2, b=3))

13

In [57]:
# Multiplication and addition are defined for strings (repeat, then concatenate)
f('hello', 'world')

'hellohelloworldworldworld'

In [58]:
# Multiplication and addition are also defined for lists (repeat, then concatenate)
f([1,2,3], ['a', 'b', 'c'])

[1, 2, 3, 1, 2, 3, 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c']
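
The topic list for this section mentions docstrings and class methods, which the cells above only touch on. The following is a minimal sketch using a hypothetical `Circle` class; it shows an ordinary (instance) method, an alternative constructor written with `@classmethod`, and how to read a docstring through `__doc__`.

```python
import math


class Circle:
    """A circle defined by its radius."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        """Return the area of the circle."""
        return math.pi * self.radius ** 2

    @classmethod
    def from_diameter(cls, diameter):
        """Build a Circle from a diameter instead of a radius."""
        return cls(diameter / 2)


c = Circle.from_diameter(4)   # called on the class, not on an instance
print(c.radius, c.area())
print(Circle.area.__doc__)    # the docstring is what help() displays
```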

### Control flow

- if and the ternary operator
- Checking conditions - what evaluates as true/false?
- if-elif-else
- while
- break, continue
- pass

In [59]:
if 1 + 1 == 2:
    print("Phew!")    

Phew!


In [60]:
# One-line version of if statement
'vegan' if 1 + 1  == 2 else 'carnivore'

'vegan'

In [61]:
'vegan' if 1 + 1  == 3 else 'carnivore'

'carnivore'

In [62]:
if 1+1 == 3:
    print("oops")
else:
    print("Phew!")

Phew!


In [63]:
for grade in [94, 79, 81, 57]:
    if grade > 90:
        print('A')
    elif grade > 80:
        print('B')
    elif grade > 70:
        print('C')
    else:
        print('Are you in the right class?')

A
C
B
Are you in the right class?


In [64]:
i = 10
while i > 0:
    print(i)
    i -= 1    

10
9
8
7
6
5
4
3
2
1


In [65]:
# "Continue" goes to the next iteration of the for loop
for i in range(1, 10):
    if i % 2 == 0:
        continue
    print(i)

1
3
5
7
9


In [66]:
# "Break" exits the entire loop
for i in range(1, 10):
    if i % 2 == 0:
        break
    print(i)

1


In [67]:
for i in range(1, 10):
    if i % 2 == 0:
        pass
    else:
        print(i)

1
3
5
7
9


### Loops and comprehensions

- for, range, enumerate
- lazy and eager evaluation
- list, set, dict comprehensions
- generator expression

In [68]:
# Specifying the "end" overrides default new line
for i in range(1,5):
    print(i**2, end=',')

1,4,9,16,

In [69]:
# i gives position, x gives value
for i, x in enumerate(range(1,5)):
    print(i, x**2)

0 1
1 4
2 9
3 16


In [71]:
# Can start with an offset
for i, x in enumerate(range(1,5), start=10):
    print(i, x**2)

10 1
11 4
12 9
13 16


In [72]:
range(5)

range(0, 5)

In [74]:
# Wrap the range in list() to see its elements
# range is lazy: it produces values on demand instead of storing them all
list(range(5))

[0, 1, 2, 3, 4]

#### Comprehensions

In [75]:
# List comprehension
[x**3 % 3 for x in range(10)]

[0, 1, 2, 0, 1, 2, 0, 1, 2, 0]

In [76]:
# Braces create a set (set comprehension)
{x**3 % 3 for x in range(10)}

{0, 1, 2}

In [77]:
# Dictionary comprehension (specifying key, value pair)
{k: v for k, v in enumerate('abcde')}

{0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e'}

In [78]:
# Generator expression. Lazy expression. Typically would use in a loop
(x**3 for x in range(10))

<generator object <genexpr> at 0x7f8327967e50>

In [79]:
list(x**3 for x in range(10))

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
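
To make the lazy versus eager distinction concrete, here is a small sketch of how a generator expression behaves: values are computed only when requested, and the generator can be consumed only once.

```python
squares = (x**2 for x in range(5))   # nothing is computed yet

first = next(squares)      # computes only the first value
rest = list(squares)       # consumes the remaining values
leftover = list(squares)   # the generator is now exhausted

# Generators can be passed straight to functions that consume iterables
total = sum(x**2 for x in range(5))

print(first, rest, leftover, total)   # 0 [1, 4, 9, 16] [] 30
```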

### Packages and namespace

- Modules (file)
- Package (hierarchical modules)
- Namespace and naming conflicts
- Using `import`
- [Batteries included](https://docs.python.org/3/library/index.html)

The `%%file` cell magic saves the rest of the cell to `foo.py`. A cell magic must be the first line of its cell.

In [80]:
%%file foo.py

def foo(x):
    return f"And FOO you too, {x}"

Writing foo.py


In [82]:
# See contents of file
! cat foo.py


def foo(x):
    return f"And FOO you too, {x}"


In [83]:
# Import foo just as we would anything else
import foo

In [84]:
foo.foo("Winnie the Pooh")

'And FOO you too, Winnie the Pooh'

In [85]:
from foo import foo

In [86]:
foo("Winnie the Pooh")

'And FOO you too, Winnie the Pooh'

In [87]:
import numpy as np

In [88]:
np.random.randint(0, 10, (5,5))

array([[2, 0, 5, 7, 0],
       [3, 7, 2, 5, 6],
       [6, 5, 2, 3, 6],
       [8, 3, 3, 3, 5],
       [3, 9, 3, 0, 7]])
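
The "namespace and naming conflicts" bullet above can be illustrated with two standard library modules that both define `sqrt`. This is a sketch of why qualified names (`module.name`) are usually safer than `from module import name`.

```python
import math
import cmath

# Qualified names keep the two functions apart
real_root = math.sqrt(4)        # 2.0
complex_root = cmath.sqrt(-1)   # 1j

# Importing the bare name twice: the second import silently replaces the first
from math import sqrt
from cmath import sqrt

shadowed = sqrt(-1)   # this is now cmath.sqrt; math.sqrt(-1) would raise ValueError

print(real_root, complex_root, shadowed)
```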

### Coding style

- [PEP 8 — the Style Guide for Python Code](https://pep8.org/)


- Many code editors can be used with linters to check if your code conforms to PEP 8 style guidelines.
- E.g. see [jupyter-autopep8](https://github.com/kenkoooo/jupyter-autopep8)

### Understanding error messages

- [Built-in exceptions](https://docs.python.org/3/library/exceptions.html)

In [89]:
try:
    1 / 0
except ZeroDivisionError as e:
    print(e)

division by zero
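
The exception's type name is often the most useful part of an error message, since it can be looked up in the built-in exceptions list linked above. The sketch below uses a hypothetical helper, `safe_divide`, to show how different failures map to different exception types, along with the `else` clause that runs only when no exception occurred.

```python
def safe_divide(numerator, denominator):
    """Divide two values, returning a readable message if the division fails."""
    try:
        result = numerator / denominator
    except ZeroDivisionError as err:
        return f"{type(err).__name__}: {err}"
    except TypeError as err:
        return f"{type(err).__name__}: {err}"
    else:
        # Runs only if the try block raised nothing
        return result


ok = safe_divide(1, 4)
by_zero = safe_divide(1, 0)
bad_type = safe_divide(1, 'a')

print(ok)
print(by_zero)
print(bad_type)
```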


### Getting help

- `?foo`, `foo?`, `help(foo)`
- Use a search engine
- Use Stack Overflow
- Ask your TA

In [90]:
help(help)

Help on _Helper in module _sitebuiltins object:

class _Helper(builtins.object)
 |  Define the builtin 'help'.
 |  
 |  This is a wrapper around pydoc.help that provides a helpful message
 |  when 'help' is typed at the Python interactive prompt.
 |  
 |  Calling help() at the Python prompt starts an interactive help session.
 |  Calling help(thing) prints help for the python object 'thing'.
 |  
 |  Methods defined here:
 |  
 |  __call__(self, *args, **kwds)
 |      Call self as a function.
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



In [92]:
# Function help, just like R
?range

In [None]:
# Greek letters are valid names: type the LaTeX name and press tab (e.g., \beta + tab)
β = 0.5
β

Tab completion can be very helpful. For instance, typing `np.` and then pressing tab lists the functions and attributes available in `np`.
Use shift + tab to see the arguments a function accepts.
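
The same information that `?foo` and shift + tab display can also be retrieved programmatically with the standard library `inspect` module. This is a small sketch; the function `scale` is a made-up example.

```python
import inspect


def scale(x, factor=2):
    """Return x multiplied by factor."""
    return x * factor


signature = str(inspect.signature(scale))   # the arguments and their defaults
doc = inspect.getdoc(scale)                 # the cleaned-up docstring

print(signature)
print(doc)
```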

### Using R

In [93]:
! python3 -m pip install --quiet pip tzlocal

In [95]:
! python3 -m pip install --quiet rpy2

In [96]:
%load_ext rpy2.ipython

In [97]:
import warnings
warnings.simplefilter('ignore', FutureWarning)

In [98]:
# % is a line magic (applies to one line); %% is a cell magic (applies to the whole cell)
df = %R iris

In [99]:
df.head()

Unnamed: 0,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
1,5.1,3.5,1.4,0.2,setosa
2,4.9,3.0,1.4,0.2,setosa
3,4.7,3.2,1.3,0.2,setosa
4,4.6,3.1,1.5,0.2,setosa
5,5.0,3.6,1.4,0.2,setosa


In [102]:
%%R -i df -o res

library(tidyverse)
res <- df %>% group_by(Species) %>% summarize_all(mean)

R[write to console]: ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

R[write to console]: ✔ ggplot2 3.3.3     ✔ purrr   0.3.4
✔ tibble  3.0.4     ✔ dplyr   1.0.2
✔ tidyr   1.1.2     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.5.0

R[write to console]: ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()



`-i df` passes the Python variable `df` into R as input; `-o res` copies the R variable `res` back to Python as output.

In [103]:
res

Unnamed: 0,Species,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width
1,setosa,5.006,3.428,1.462,0.246
2,versicolor,5.936,2.77,4.26,1.326
3,virginica,6.588,2.974,5.552,2.026
