<a href="https://colab.research.google.com/github/kovacova/random-magic/blob/master/python/01-whirlwind-tour-of-python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# A Whirlwind Tour of Python

_Book Notes_


The appeal of Python is in its simplicity and beauty, as well as the convenience of the large ecosystem of domain-specific tools that have been built on top of it. For example, most of the Python code in scientific computing and data science is built around a group of mature and useful packages:

**NumPy** provides efficient storage and computation for multidimensional data arrays.

**SciPy** contains a wide array of numerical tools such as numerical integration and interpolation.

**Pandas** provides a DataFrame object along with a powerful set of methods to manipulate, filter, group, and transform data.

**Matplotlib** provides a useful interface for creation of publication-quality plots and figures.

**Scikit-Learn** provides a uniform toolkit for applying common machine learning algorithms to data.

**IPython/Jupyter** provides an enhanced terminal and an interactive notebook environment that is useful for exploratory analysis, as well as creation of interactive, executable documents. For example, the manuscript for this report was composed entirely in Jupyter notebooks.

<br>

📌**TODOs**: 
- Read through the [PEP8 style guide](https://www.python.org/dev/peps/pep-0008/)
- Combined bitwise operators
- Binary vs. base-10 representation of numbers
- Complex numbers
- Python documentation for extras
- Collections module


## Python Language Syntax

⚠️ "Not enough values to unpack" (line 5 bug): the two assignments on that line must be separated by a semicolon (;), not a comma.

In [0]:
# Set the midpoint (this is an assignment operation)
midpoint = 5

# Make two empty lists
lower = []; upper = []

# Split the numbers into lower and upper
for i in range(10):
  if (i < midpoint):
    lower.append(i)
  else: 
    upper.append(i)
    
print('lower: ', lower)
print('upper: ', upper)

lower:  [0, 1, 2, 3, 4]
upper:  [5, 6, 7, 8, 9]


### Multiline Statements & Terminating Statements

When writing a multiline statement with a backslash, make sure the backslash is the last character on the line (no trailing spaces) and that the rest of the statement continues on the next line.


A semicolon **;** can optionally terminate a statement.

In [0]:
# Writing multiline statements

x = 1 + 2 + 3 + 4 + \
    5 + 6 + 7 + 8

x

36

In [0]:
# We can accomplish the same without \ by using parentheses
# This is generally the best practice

x = (1 + 2 + 3 + 4 +
     5 + 6 + 7 + 8)

x

36

In [0]:
# To terminate the statement, we can use ;
lower = []; upper = []

# Is equivalent to:
lower = []
upper = []

### Indentation

In programming languages, a block of code is a set of statements that should be treated as a unit. In C, for example, code blocks are denoted by curly braces.


In Python, code blocks are denoted by **indentation**, and indented code blocks are always preceded by a **colon**:

Note: Whitespace within lines does not matter.

In [0]:
# Example 1: This will be executed ONLY if x < 4
if x < 4:
  y = x * 2
  print(x)

# This print statement will ALWAYS be executed
if x < 4:
  y = x * 2
print(x)

36


### Parentheses

They can be used in the typical way to group statements or mathematical operations. 

They can also be used to indicate that a function is being called. In the next snippet, the **print()** function is used to display the contents of a variable. 

The **function call** is indicated by a pair of opening and closing parentheses, with the **arguments** to the function contained within:

In [0]:
# Parenthesis example
2 * (3 + 4)

# Function call example
print('First value:', 1)

First value: 1


Some **functions can be called with no arguments** at all, in which case the opening and closing parentheses still must be used to **indicate a function evaluation.**

The () after sort indicates that the function should be executed,
and is required even if no arguments are necessary.

In [0]:
# Example of calling a function
L = [4, 3, 2, 1]
L.sort()
print(L)

[1, 2, 3, 4]


## Python Semantics

As opposed to the syntax covered in the previous section, the semantics of a language involve the meaning of the statements.

### Variables and Objects

This section will cover the semantics of variables and objects, which are the main ways you store, reference, and operate on data within a Python script. 

Python variables are **pointers**. Assigning variables in Python is as easy as putting a variable name to the left of the equals sign (=):

> `x = 4` 

In C code, we would assign a variable as: 

> `int x = 4;`


### Containers and Pointers 

In many programming languages, variables are best thought of as containers or buckets into which you put data. So in C, for example, when you write:

> `int x = 4;`

you are essentially **defining a “memory bucket”** named x, and putting the value 4 into it. In Python, by contrast, **variables are best thought of not as containers but as pointers**. So in Python, when you write:

> `x = 4`

you are essentially **defining a pointer named x that points to some other bucket** containing the value 4. Essentially, because Python variables just point to various objects, there is **no need to “declare”** the variable, or even require the variable to always point to information of the same type! This is the sense in which people say Python is **dynamically typed**: variable names can point to objects of any type. 

In [0]:
# Python is dynamically typed: the same name can be re-assigned to objects of different types

x = 1           # x is an integer
x = 'hello'     # now x is a string
x = [1, 2, 3]   # now x is a list 

IMPORTANT: If we have two variable names pointing to the same mutable object, then changing one will change the other as well! 

The **=** operator simply **changes** what the name points to.

In [0]:
x = [1, 2, 3]
y = x

print(y)

[1, 2, 3]


In [0]:
x.append(4)   # we append 4 to the list pointed to by x
print(y)      # y's list is modified as well!

[1, 2, 3, 4]


In [0]:
x = 'something else'
print(y)      # y is unchanged because we only changed the x pointer

[1, 2, 3, 4]
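
When an independent list is wanted rather than a second name for the same object, one option is to copy it. The following is a minimal sketch using the built-in `id()` function and the `list.copy()` method; the names `original`, `alias`, and `snapshot` are illustrative and not from the book:

```python
original = [1, 2, 3]
alias = original             # a second name for the same list object
snapshot = original.copy()   # a new list with the same elements (shallow copy)

original.append(4)

print(alias)      # the alias sees the appended value
print(snapshot)   # the copy is unaffected

# id() returns an identifier for the object a name points to
print(id(alias) == id(original))
print(id(snapshot) == id(original))
```

Note that `copy()` is shallow: nested lists inside the copy would still be shared with the original.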


Numbers, strings, and other simple types are **immutable**: you can’t change their value—you can only change what values the variables point to. 

When we call **x += 5**, we are not modifying the value of the 10 object pointed to by x, but rather we are changing the object to which x points. For this reason, the value of y is not affected by the operation.

In [0]:
x = 10
y = x
x += 5       # add 5 to x's value, and assign it to x
print('x =', x)
print('y =', y)

x = 15
y = 10


## Python as an Object-Oriented Programming Language

❓type-free languages (type refers to str, int, float)

Python has types; however, the types are linked **not to the variable names but to the objects** themselves.
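
A short sketch of this idea, using only the built-in `type()` function: the type reported for `x` changes as the name is pointed at different objects, yet each object still enforces its own type rules. The variable names `first_type`, `second_type`, and `third_type` are illustrative.

```python
x = 1
first_type = type(x)      # the type belongs to the object 1

x = 'hello'
second_type = type(x)     # same name, now pointing at a str object

x = [1, 2, 3]
third_type = type(x)      # and now at a list object

print(first_type, second_type, third_type)

# Objects still have types: mixing incompatible ones raises an error
try:
    'hello' + 1
except TypeError as err:
    print('TypeError:', err)
```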

In object-oriented programming languages like Python, **an object is an entity that contains data along with associated metadata and/or functionality**. In Python, **everything is an object**, which means every entity has some metadata (called **attributes**) and associated functionality (called **methods**). 

These attributes and methods are accessed via **the dot syntax**.



In [0]:
# Even simple types have attached ATTRIBUTES AND METHODS
x = 4.5
print(x.real, '+', x.imag, 'i')

4.5 + 0.0 i


**Methods** are like attributes, except they **are functions** that you can call using a pair of opening and closing **parentheses**. 

In [0]:
x = 4.5
x.is_integer()

x = 4.0
x.is_integer()

True

In [0]:
# Everything in Python is an object
# Even the attributes and methods of objects are themselves objects
type(x.is_integer)

builtin_function_or_method

### Python Arithmetic Operations

Python implements seven basic binary arithmetic operators, two of which can double as unary operators. They are summarized in the following table:

<img src='http://masterprograming.com/wp-content/uploads/2019/05/Arithmetic-operators.jpg' width=50%></img>
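
As a plain-text companion to the table, the sketch below exercises the seven binary operators (`+`, `-`, `*`, `/`, `//`, `%`, `**`) and the two that double as unary operators (`-` and `+`). The operand values 7 and 2 are arbitrary choices for illustration.

```python
a, b = 7, 2

print(a + b)    # addition
print(a - b)    # subtraction
print(a * b)    # multiplication
print(a / b)    # true division (always returns a float)
print(a // b)   # floor division (quotient with the fractional part removed)
print(a % b)    # modulus (remainder after division)
print(a ** b)   # exponentiation

print(-a, +a)   # unary negation and unary plus
```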



In [0]:
(4 + 8) * (6.5 - 3)

42.0

In [0]:
# True division
print(11 / 2)

# Floor Division
print(11 // 2)

5.5
5


In [0]:
# The a @ b operator is meant to indicate the matrix product of a and b
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])

a @ b

14

### Bitwise Operations

| Operator | Name | Description |
|---|---|---|
|  a & b | Bitwise AND  | Bits defined in both a and b   |
|  a \| b | Bitwise OR | Bits defined in a or b or both |
| a ^ b  | Bitwise XOR | Bits defined in a or b but not both |
| a << b | Bit shift left | Shifts bits of a left by b units |
| a >> b | Bit shift right | Shifts bits of a right by b units |
| ~a | Bitwise NOT | Bitwise negation of a |

These bitwise operators only make sense in terms of the binary representation of numbers, which you can see using the built-in bin function:

In [0]:
# The result is prefixed with 0b, which indicates a binary representation
bin(10)

'0b1010'

In [0]:
# This combines the bits of 4 and 10
4 | 10

14
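
To see why `4 | 10` gives 14, it can help to line up the operands in binary. The sketch below prints several of the operators from the table next to their binary form; the dictionary name `results` and the five-digit padding are illustrative choices.

```python
a, b = 4, 10

results = {
    'a': a,
    'b': b,
    'a & b': a & b,
    'a | b': a | b,
    'a ^ b': a ^ b,
    'a << 1': a << 1,
    'b >> 1': b >> 1,
}

for label, value in results.items():
    # The 'b' format code renders an integer in binary; 05 pads it to five digits
    print(f'{label:>6} = {value:2d} = {value:05b}')
```

Because 4 (`00100`) and 10 (`01010`) share no set bits, AND gives 0 while OR and XOR both give 14 (`01110`).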

### Updating Values

Frequently, we update a variable to reflect a new value (e.g., a = a + 2). Because this type of combined operation and assignment is so common, Python includes built-in update operators for all of the arithmetic operations.


There is an **augmented assignment operator** corresponding to each of the binary operators listed earlier; in brief, they are:

|   |  |   |  |
|---|---|---|--|
| a += b | a -= b | a \*= b | a /= b | 
| a //= b | a %= b | a \*\*= b | a &= b | 
| a \|= b | a ^= b | a <<= b | a >>= b |


In [0]:
a = 5
a += 2     # equivalent to a = a + 2
a

7
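
A few more of the augmented operators from the table, chained on one variable so each step can be followed in the comments. The second half is an extra observation, not from the notes above: on a mutable object such as a list, `+=` modifies the object in place, so other names pointing to it see the change.

```python
a = 5
a += 2     # 5 + 2   -> 7
a *= 3     # 7 * 3   -> 21
a //= 4    # 21 // 4 -> 5
a **= 2    # 5 ** 2  -> 25
a %= 7     # 25 % 7  -> 4
a <<= 1    # 4 << 1  -> 8
a |= 1     # 8 | 1   -> 9
print(a)

# With a list, += extends the existing object rather than creating a new one
x = [1, 2]
y = x
x += [3]
print(y)
```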

### Comparison Operations

Another type of operation that can be very useful is comparison of different values. For this, Python implements standard comparison operators, which return Boolean values True and False. The comparison operations are listed in the following table:

|  Operation | Description | 
|---|--|
|a == b | a equal to b | 
|a != b | a not equal to b |
|a < b | a less than b |
|a > b | a greater than b |
|a <= b | a less than or equal to b |
|a >= b| a greater than or equal to b |



In [0]:
# Is 25 odd?
25 % 2 == 1

True

In [0]:
# Is 66 odd?
66 % 2 == 1

False

In [0]:
# Check if a is between 15 and 30
a = 25
15 < a < 30

True

In [0]:
# ~ is the bit-flip operator
# and evidently when you flip all the bits of zero you end up with -1. 🤷‍♀️
-1 == ~0

True
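
The result is less mysterious once you know that Python integers behave as if stored in two's complement with an unlimited number of sign bits, which makes `~x` equal to `-x - 1` for any integer. A small check of that identity; the sample values and the name `all_match` are arbitrary:

```python
for x in [0, 1, 5, -3]:
    flipped = ~x
    print(x, flipped, flipped == -x - 1)

# The identity holds across a range of positive and negative integers
all_match = all(~x == -x - 1 for x in range(-10, 11))
print(all_match)
```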

### Boolean Operations

When working with Boolean values, Python provides operators to combine the values using the standard concepts of “and”, “or”, and “not”. Predictably, these operators are expressed using the words and, or, and not:

In [0]:
x = 4
(x < 6) and (x > 2)

True

In [0]:
(x > 10) or (x % 2 == 0)

True

In [0]:
not (x < 6)

False

One sometimes confusing thing about the language is when to use Boolean operators (and, or, not), and when to use bitwise operations (&, |, ~). The answer lies in their names: 

**Boolean operators** should be used when you want to compute Boolean values (i.e., **truth or falsehood**) of entire statements. 

**Bitwise operations** should be used when you want to **operate on individual bits or components** of the objects in question.
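
The difference shows up clearly on plain integers. In this sketch, `and` works on the truthiness of whole values, while `&` compares individual bits; the names `bool_result` and `bit_result` are illustrative.

```python
x = 4

# Boolean operators combine the truth values of whole expressions
print((x < 6) and (x > 2))

# On plain integers the two families give different answers
print(4 and 2)   # `and` returns its second operand when the first is truthy
print(4 & 2)     # bitwise AND of 0b100 and 0b010 has no bits in common

bool_result = bool(4 and 2)
bit_result = bool(4 & 2)
print(bool_result, bit_result)
```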

### Identity Operators

The identity operators, is and is not, check for object identity. Object **identity** is different than **equality**.

The **is** operator **checks whether the two variables are pointing to the same container (object)**, rather than referring to what the container contains. With this in mind, in most cases that a beginner is tempted to use is, what they really mean is **==**.

In [0]:
a = [1, 2, 3]
b = [1, 2, 3]

a == b

True

In [0]:
a is b

False

In [0]:
a is not b

True

In [0]:
# Two names pointing to the same object
a = [1, 2, 3]
b = a 

a is b

True

### Membership operators

Membership operators check for membership within **compound** objects.

These membership operations are an example of what makes Python so easy to use compared to lower-level languages such as C. In C, membership would generally be determined by manually constructing a loop over the list and checking for equality of each value. In Python, you just type what you want to know, in a manner reminiscent of straightforward English prose.

In [0]:
1 in [1, 2, 3]

True

In [0]:
2 not in [1, 2, 3]

False
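
For comparison, here is roughly what the manual loop described above looks like when written out in Python. The helper `contains` is hypothetical and exists only to show what `in` saves you from writing; the last line shows that `in` also works on strings (substring test) and dictionaries (key test).

```python
def contains(items, target):
    """Hypothetical helper: a hand-written membership test."""
    for item in items:
        if item == target:
            return True
    return False

values = [1, 2, 3]
print(contains(values, 2), 2 in values)
print(contains(values, 5), 5 in values)

# `in` is not limited to lists
print('pa' in 'spam', 'two' in {'one': 1, 'two': 2})
```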

## Built-In types 

### Simple Values

These are:
- Integers
- Floating-point numbers (i.e., real numbers)
- Complex numbers (i.e., numbers with a real and imaginary part (x = 1 + 2j))
- Booleans (True/False values)
- Strings
- NoneType (special objects indicating nulls)

You’ll see **None** used in many places, but perhaps most commonly it is used as **the default return value of a function.**
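
A minimal illustration of that default: a function with no `return` statement hands back `None`. The function name `greet` is made up for this sketch.

```python
def greet(name):
    print('Hello,', name)   # no return statement

result = greet('world')
print(result)
print(result is None)
```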

A convenient feature of Python integers is that, in Python 3, **division upcasts to floating-point** type by default:

In [0]:
5 / 2     # We can use floor division // if we want to retain ints

2.5

In the exponential notation, the e or E can be read “...times ten to the...”, so that 1.4e6 is interpreted as $1.4 \times 10^6$.

In [0]:
# Floating point type can store fractional numbers

# Standard decimal notation
x = 0.000005

# Exponential notation
y = 5e-6

print(x == y)

True


Rounding error is a necessary evil of working with floating-point numbers. The best way to deal with it is to always keep in mind that floating-point arithmetic is approximate, and **never rely on exact equality tests with floating-point values**.

In [0]:
# With floats the precision is limited and equality tests may be unstable
0.1 + 0.2 == 0.3

False

In [0]:
print('0.1 = {0:.17f}'.format(0.1))
print('0.2 = {0:.17f}'.format(0.2))
print('0.3 = {0:.17f}'.format(0.3))

0.1 = 0.10000000000000001
0.2 = 0.20000000000000001
0.3 = 0.29999999999999999
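
One common alternative to an exact equality test is to compare within a tolerance. The sketch below uses `math.isclose` from the standard library, plus a hand-rolled check whose tolerance of `1e-9` is an arbitrary choice for illustration:

```python
import math

total = 0.1 + 0.2

print(total == 0.3)               # exact comparison fails
print(math.isclose(total, 0.3))   # comparison within a relative tolerance
print(abs(total - 0.3) < 1e-9)    # explicit absolute tolerance
```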


Computers usually store values in binary notation, so that each number is expressed as a sum of powers of 2:

$1/8 = 0 \cdot 2^{-1} + 0 \cdot 2^{-2} + 1 \cdot 2^{-3}$
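
This can be checked from Python itself. In the sketch below, `as_integer_ratio()` shows that 0.125 is stored exactly as 1/8, while 0.1 is stored as a nearby fraction whose denominator is a power of two. The use of `fractions.Fraction` for the exact comparison is a choice made here, not something the notes rely on.

```python
from fractions import Fraction

# 1/8 is a power of two, so it is stored exactly
print((0.125).as_integer_ratio())

# 1/10 is not, so the stored value is only the closest binary fraction
num, den = (0.1).as_integer_ratio()
print(num, den)

# The denominator is a power of two (exactly one bit set)
print(den & (den - 1) == 0)

# The stored value is close to, but not exactly, one tenth
print(Fraction(num, den) == Fraction(1, 10))
```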

In [0]:
# Accessing individual characters (zero-based indexing)
message = 'spam'
message[0]

's'

## Built-In Data Structures

We have seen Python’s simple types: int, float, complex, bool, str, and so on. Python also has several built-in compound types, which **act as containers** for other types. These compound types are:

| Type Name | Example | Description |
|---|---|---|
|  list | [1, 2, 3] | Ordered Collection |
|  tuple | (1, 2, 3) | Immutable ordered collection |
| dict  | {'a': 1, 'b': 2, 'c': 3} | (key, value) mapping (insertion-ordered since Python 3.7) |
| set | {1, 2, 3} | Unordered collection of unique values |

As you can see, **round, square, and curly brackets have distinct meanings** when it comes to the type of collection produced.





### Lists

Lists are an _ordered_ and _mutable_ data collection type. We can use **`.append()`**, **`.sort()`**, and many more methods on them.

In [0]:
# Lists can contain objects of any type, or even a mix of types
L = [1, 'two', 3.14, [0, 3, 5]]

### List Indexing and Slicing

Python provides access to elements in compound types through indexing for single elements, and slicing for multiple elements. As we’ll see, both are indicated by a square-bracket syntax.

Python uses zero-based indexing, so we can access the first and second elements using the following syntax:

In [0]:
L = [2, 3, 5, 7, 11]

L[0]

L[1]

3

In [0]:
# Negative numbers allow us to pick elements from the END of the list
L[-1]

11

Where **indexing** is a means of **fetching a single value** from the list, **slicing** is a means of **accessing multiple values** in sublists. It uses a colon to indicate the **start point (inclusive)** and **end point (non-inclusive)** of the sublist. For example, to get the first three elements of the list, we can write it as follows:

In [0]:
# List slicing example
L[0:3]

# We can also leave out the first value; Python will assume it is 0
L[:3]

[2, 3, 5]

It is possible to specify a third integer that represents the **step size**; for example, to select every second element of the list, we can write:

In [0]:
L[::2]

[2, 5, 11]

In [0]:
# A particularly useful version of this is to specify a negative step, which will reverse the list
L[::-1]

[11, 7, 5, 3, 2]

### Tuples

They are very similar to lists, but use parentheses rather than square brackets, or alternatively they can be defined without any brackets at all. 

Tuples have a length, and individual elements can be extracted using square-bracket indexing.

The main distinguishing feature of tuples is that they are **immutable**, meaning that once they are created, **their size and contents cannot be changed**. 

Indexing and slicing logic also **works** with tuples.

In [0]:
t = (1, 2, 3)
t = 1, 2, 3
print(t)

(1, 2, 3)


In [0]:
len(t)

3

In [0]:
t[0]

1

In [0]:
# Both of these will raise an error:

# TypeError: 'tuple' object does not support item assignment
# t[1] = 4


# AttributeError: 'tuple' object has no attribute 'append'
# t.append(4)

Tuples are often used in a Python program; a particularly common case is in functions that have multiple return values. For example, the **as_integer_ratio()** method of floating-point objects returns a numerator and a denominator; this dual return value comes in the form of a tuple:

In [0]:
x = 0.125
x.as_integer_ratio()

(1, 8)

In [0]:
# To reverse this:

# These multiple return values can be individually assigned:
numerator, denominator = x.as_integer_ratio()
print(numerator/denominator)

0.125


### Dictionaries

Dictionaries are extremely flexible mappings of keys to values, and form the basis of much of Python’s internal implementation. They can be created via **a comma-separated list of key:value pairs** within curly braces:

Items are accessed and set via the indexing syntax used for lists and tuples, except here **the index is** not a zero-based position but a **valid key** in the dictionary:

In [91]:
numbers = {'one':1, 'two':2, 'three':3}

# Access a value via the key
numbers['two']

2

In [92]:
# We can set a new key/value pair

numbers['ninety'] = 90
print(numbers)

{'one': 1, 'two': 2, 'three': 3, 'ninety': 90}
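
Beyond square-bracket access, a few other everyday dictionary operations are worth knowing. This sketch uses only built-in `dict` features (`in`, `.get()`, and `.items()`); the missing key `'four'` is chosen for illustration.

```python
numbers = {'one': 1, 'two': 2, 'three': 3}

# Membership tests look at the keys
print('two' in numbers)

# .get() returns None (or a supplied default) instead of raising KeyError
print(numbers.get('four'))
print(numbers.get('four', 0))

# .items() yields (key, value) pairs for looping
for key, value in numbers.items():
    print(key, '->', value)
```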


Keep in mind that dictionaries are **not indexed by position**: elements are looked up by key. Since Python 3.7, dictionaries preserve insertion order (as the output above shows), but older versions made no ordering guarantee. The hash-based design allows dictionaries to be implemented very efficiently, so that **random element access is very fast**, regardless of the size of the dictionary (if you’re curious how this works, read about the concept of **a hash table**). 

In [107]:
import pandas as pd
df = pd.Series(numbers)
df.head()

one        1
two        2
three      3
ninety    90
dtype: int64

### Sets

Sets contain **unordered collections of unique items**. They are defined much like lists and tuples, except they use the **curly brackets** of dictionaries.

If you’re familiar with the **mathematics of sets**, you’ll be familiar with operations like the **union, intersection, difference, symmetric difference, and others**. Python’s sets have all of these operations built in via **methods** or **operators**.

In [0]:
primes = {2, 3, 5, 7}
odds = {1, 3, 5, 7, 9}

In [111]:
# Union: items appearing in either
primes | odds      # with an operator
primes.union(odds) # equivalently with a method

{1, 2, 3, 5, 7, 9}

In [112]:
# Intersection: items appearing in both
primes & odds             # with an operator
primes.intersection(odds) # equivalently with a method

{3, 5, 7}

In [114]:
# Difference: items in primes but not in odds
primes - odds
primes.difference(odds) 

{2}

In [116]:
# Symmetric difference: items appearing in only one set
primes ^ odds
primes.symmetric_difference(odds)

{1, 2, 9}

### Specialized Data Structures

They can be found in the built-in **collections** module.

Examples:
- collections.**namedtuple** (like a tuple, but each value has a name)
- collections.**defaultdict**(unspecified keys have a user-specified default value)
- collections.**OrderedDict**(like a dictionary, but the order of keys is maintained)
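
A brief sketch of the three types listed above, using only the standard-library `collections` module. The names `Point`, `counts`, and `od`, and the sample string, are illustrative.

```python
from collections import namedtuple, defaultdict, OrderedDict

# namedtuple: a tuple whose fields can also be accessed by name
Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)
print(p.x, p[1])

# defaultdict: missing keys get a default value (int() is 0)
counts = defaultdict(int)
for letter in 'spam and eggs':
    counts[letter] += 1
print(counts['s'], counts['z'])

# OrderedDict: a dictionary with extra order-related methods
od = OrderedDict([('one', 1), ('two', 2)])
od.move_to_end('one')
print(list(od))
```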

## Control Flow

With control flow, you can **execute** certain code blocks **conditionally and/or repeatedly**: these basic building blocks can be combined to create surprisingly sophisticated programs!

Here we’ll cover conditional statements (including **if**, **elif**, and **else**) and loop statements (including **for** and **while**, and the accompanying **break**, **continue**, and **pass**).

### Conditional Statements: if, elif, and else

Conditional statements, often referred to as if-then statements, allow the programmer to execute certain pieces of code **depending** on some **Boolean condition**. 

**elif** - a contraction of else if

In [117]:
x = -15

if x == 0:
  print(x, 'is zero')
elif x > 0:
  print(x, 'is positive')
elif x < 0:
  print(x, 'is negative')
else:
  print(x, "is unlike anything I've ever seen...")

-15 is negative


### For Loops

Loops in Python are a way to **repeatedly execute some code** statement. So, for example, if we’d like to print each of the items in a list, we can use a for loop.

Notice the simplicity of the for loop: we specify the **variable** we want to use, the **sequence we want to loop over**, and use the **in** operator to link them together in an intuitive and readable way. More precisely, the object to the right of the in can be any Python **iterable**. 

The **range** object is a good example of **an iterable**.

In [118]:
for N in [2, 3, 4, 7]:
  print(N, end=' ')  # print all on the same line

2 3 4 7 

By default, a range starts at zero, and by convention the top of the range is not included in the output. Range objects can also be given a start value and a step size. In Python 2, range() produces a list, while in Python 3, range() produces **an iterable object**.

In [1]:
# Range from 5 to 10
list(range(5,10))

[5, 6, 7, 8, 9]

In [2]:
# Range from 0 to 10 by 2
list(range(0,10, 2))

[0, 2, 4, 6, 8]
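
Strictly speaking, a range object is an iterable: it can be looped over as many times as you like, and the built-in `iter()` function produces a one-shot iterator from it. A small sketch of the distinction, using the built-ins `iter()` and `next()`; the names `r` and `it` are illustrative.

```python
r = range(3)
it = iter(r)   # a for loop does this behind the scenes

# next() pulls values from the iterator one at a time
print(next(it), next(it), next(it))

print(list(r))    # the range itself can be traversed again
print(list(it))   # the iterator is now exhausted
```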

### While Loops

A while loop iterates for as long as its condition holds. The argument of the while loop is evaluated as a Boolean statement, and the loop is executed until the statement evaluates to False.

In [3]:
i = 0
while i < 10:
  print(i, end=' ')
  i += 1

0 1 2 3 4 5 6 7 8 9 

We can fine-tune how our for and while loops are executed using **break** and **continue** statements.

The **break** statement breaks out of the loop entirely.

The **continue** statement skips the remainder of the current
loop, and goes to the next iteration.


In [5]:
for n in range(20):
  # Check if n is even
  if n % 2 == 0:
    continue
  print(n, end=' ')

1 3 5 7 9 11 13 15 17 19 

In [1]:
# Listing all Fibonacci numbers up to a certain value

a, b = 0, 1
amax = 100
L = []

while True:
  (a, b) = (b, a + b) # ❓
  if a > amax:
    break
  L.append(a)
  
print(L)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
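
The line marked with ❓ in the cell above relies on tuple assignment: the whole right-hand side is evaluated first, and only then are the results assigned to `a` and `b`. The sketch below contrasts this with two separate assignments, which give a different answer; the names `simultaneous` and `sequential` are illustrative.

```python
# Tuple assignment: b and a + b are both computed from the OLD a and b
a, b = 0, 1
a, b = b, a + b
simultaneous = (a, b)

# Two separate statements: the second line already sees the NEW a
a, b = 0, 1
a = b
b = a + b
sequential = (a, b)

print(simultaneous, sequential)
```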


### Loops with an Else Block

One rarely used pattern available in Python is the else statement as part of a for or while loop. We discussed the else block earlier: it executes if all the if and elif statements evaluate to False. The loop-else is perhaps one of the more confusingly named statements in Python; I prefer to think of it as a nobreak statement: that is, the else block is executed only if the loop ends naturally, without encountering a break statement.

**Sieve of Eratosthenes**, a well-known algorithm for finding prime numbers 👇

In [10]:
# Loops with an else block 
L = []
nmax = 30


# DEBUG THIS ✅ debugged - indentation issue w/ else and print statement
for n in range(2, nmax):
  for factor in L:
    if n % factor == 0:
      break
  else: # No break
    L.append(n)
print(L)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
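
The "nobreak" reading is perhaps easier to see in a smaller example. In this sketch, the hypothetical helper `describe` searches a list for a negative value; the else block runs only when the loop finishes without hitting break.

```python
def describe(values):
    """Hypothetical helper: report the first negative value, if any."""
    for v in values:
        if v < 0:
            result = f'first negative: {v}'
            break
    else:  # reached only if the loop was not broken out of
        result = 'no negatives'
    return result

print(describe([3, -1, 4]))
print(describe([1, 2]))
print(describe([]))   # an empty loop also ends "naturally"
```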


## Functions

There are two ways of creating functions - the **def** statement, useful for any type of function, and the **lambda** statement, useful for creating **short anonymous functions**.

In a call such as `print('abc')`, print is the function name, and **'abc'** is the function’s **argument**.
In addition to arguments, there are **keyword arguments** that are
specified by name. One available keyword argument for the print() function (in Python 3) is sep, which tells what character or characters should be used to separate multiple items.

When **non-keyword** arguments are used **together with keyword** arguments, the keyword arguments must come at the **end**.

In [11]:
print(1, 2, 3, sep='--')

1--2--3


In [12]:
# Fibonacci Function

def fibonacci(N):
  L = []
  a, b = 0, 1
  while len(L) < N:
    a, b = b, a + b
    L.append(a)
  return L

fibonacci(10)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

In [14]:
def real_imag_conj(val):
  return val.real, val.imag, val.conjugate()

r, i, c = real_imag_conj(3 + 4j)
print(r, i, c)

3.0 4.0 (3-4j)


We can also use **default starting values** for arguments in our functions. We can improve our Fibonacci function as follows: 

In [15]:
def fibonacci(N, a=0, b=1):
  L=[]
  while len(L) < N:
    a, b = b, a + b
    L.append(a)
  return L

fibonacci(10)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

In [16]:
# Defining new starting values
fibonacci(10, 0, 2)

[2, 2, 4, 6, 10, 16, 26, 42, 68, 110]

In [17]:
# Order of our arguments does not matter if specified by name
fibonacci(10, b=3, a=1)

[3, 4, 7, 11, 18, 29, 47, 76, 123, 199]
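
The lambda form mentioned at the start of this section has not been shown yet. Here is a minimal sketch: a lambda next to the equivalent def, followed by a typical use as the `key` argument of the built-in `sorted()`. The names `add`, `add_def`, `pairs`, and `by_number` are made up for illustration.

```python
# A short anonymous function assigned to a name...
add = lambda x, y: x + y

# ...is roughly equivalent to this def
def add_def(x, y):
    return x + y

print(add(1, 2), add_def(1, 2))

# Lambdas are handy where a small function is passed as an argument
pairs = [('three', 3), ('one', 1), ('two', 2)]
by_number = sorted(pairs, key=lambda pair: pair[1])
print(by_number)
```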

### \*args and \*\*kwargs: Flexible Arguments

Sometimes you might wish to write a function in which you **don’t initially know how many arguments the user will pass**. In this case, you can use the special form \*args and \*\*kwargs to catch all arguments that are passed.

Here it is not the names args and kwargs that are important, but the \* characters preceding them. args and kwargs are just the variable names often used by convention, short for “arguments” and “keyword arguments”. 

The operative difference is the asterisk characters: **a single \* before** a variable means “**expand this as a sequence**”, while **a double \*\*** before a variable means “**expand this as a dictionary**”. In fact, this syntax can be used not only with the function definition, but with the function call as well!

In [18]:
def catch_all(*args, **kwargs):
  print('args =', args)
  print('kwargs =', kwargs)

catch_all(1, 2, 3, a=4, b=5)

args = (1, 2, 3)
kwargs = {'a': 4, 'b': 5}


In [20]:
catch_all('a', keyword=2)

args = ('a',)
kwargs = {'keyword': 2}


In [21]:
inputs = (1, 2, 3)
keywords = {'pi': 3.14}

catch_all(*inputs, **keywords)

args = (1, 2, 3)
kwargs = {'pi': 3.14}
