##### based on A Whirlwind Tour of Python by Jake VanderPlas

# Important Notes about Python Syntax

### Comments Are Marked by `#`

In [1]:
# this is a comment and is not run

### Lines
The end of a line terminates a statement, so no semicolon is needed. You can, however, use a semicolon to put two statements on one line.

If you want a single statement to span multiple lines, you can end each line with a backslash `\` or enclose the statement in parentheses. If you are defining a list or another data structure that already uses some sort of bracket, line continuation is handled automatically.

In [2]:
# examples
x = 5
print(x)

5


In [3]:
# semicolon to include multiple statements in one line
y = 6; z = 7
print(y + z)

13


In [4]:
# backslash to continue a statement over multiple lines
a = 1 + 2 + 3 \
    + 4 + 5
print(a)

15


In [5]:
# or use parentheses
b = (1 + 2 + 3
    + 4 + 5)
print(b)

15


In [6]:
l = ['a', 2, 3, 'd',
    'e', 6]
print(l)

['a', 2, 3, 'd', 'e', 6]


### Indentation defines code blocks

Python does not use curly braces `{}` to define code blocks. Instead, a colon `:` begins a block, and the indented lines that follow belong to it.
IPython automatically indents the next line after you type a colon `:`.

In [7]:
# we will learn if statements later, but here's an example
x = 8
if x > 5:
    print('x is greater than 5')   # the two indented lines only run 
    print(x)                       # when the if statement is true
print('hello')    # this line is not indented and will run regardless of if statement

x is greater than 5
8
hello


In [8]:
x = 4
if x > 5:
    print('x is greater than 5')   # the two indented lines only run 
    print(x)                       # when the if statement is true
print('hello')    # this line is not indented and will run regardless of if statement

hello


In [9]:
x = 4
if x > 5:
    print('x is greater than 5')
print(x)
print('hello')

4
hello


# Data types

Python has several data types:

- integers
- floating point numbers
- strings
- booleans 
- complex numbers

## int and float

In [10]:
type(3)  # if there are no decimals, python sees an integer

int

In [11]:
type(3.)  # if there is a decimal, it's a float

float

In [12]:
a = 10
type(a)

int

In [13]:
b = 2
type(b)

int

In [14]:
# python automatically upcasts (coerces) types
# a and b are both integers
# division always results in a float, even if the answer is a whole number
c = a/b
print(c)
type(c)

5.0


float

In [15]:
11//3  # floor division rounds down to a whole number; with two ints the result is an int

3

In [16]:
11/3  # of course regular division will result in a float

3.6666666666666665
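
As a small supplementary sketch (not part of the original notebook), the cell below shows two details of `//` that the examples above do not cover: it rounds down rather than toward zero, and it returns a float when either operand is a float. The variable names are only illustrative.

```python
# Floor division rounds down (toward negative infinity), not toward zero.
pos = 11 // 3       # 3
neg = -11 // 3      # -4, because -3.67 rounds down to -4
mixed = 11.0 // 3   # 3.0, a float because one operand is a float

# divmod() returns the floor-division quotient and the remainder together.
quotient, remainder = divmod(11, 3)

print(pos, neg, mixed)        # 3 -4 3.0
print(quotient, remainder)    # 3 2
print(type(mixed))            # <class 'float'>
```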

Floats are displayed with a decimal point (or in scientific notation), even when the value is a whole number.

In [17]:
# multiplication may result in an integer or a float depending on the inputs
d = a * b
print(d)
type(d)

20


int

In [18]:
e = 5. * 2
print(e)
type(e)

10.0


float

In [19]:
# integers have arbitrary precision, so you can do monstrous calculations without running into overflow errors
# for example, in R the largest integer allowed is 2^31 - 1
2 ** 1023

89884656743115795386465259539451236680898848947115328636715040578866337902750481566354238661203768010560056939935696678829394884407208311246423715319737062188883946712432742638151109800623047059726541476042502884419075341171231440736956555270413618581675255342293149119973622969239858152417678164812112068608

In [20]:
2 ** 1024

179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216

In [21]:
# floats, on the other hand, will overflow
2.0 ** 1023

8.98846567431158e+307

In [22]:
2.0 ** 1024

OverflowError: (34, 'Result too large')
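
The following sketch, added for illustration, shows where the float limit sits and how overflow can surface in two different ways. It assumes a standard CPython build with IEEE 754 double-precision floats; the exact text of the error message may differ by platform.

```python
import sys

# Largest finite float on this platform (about 1.8e308 for IEEE 754 doubles).
print(sys.float_info.max)

# The ** operator raises OverflowError when the result is too large...
overflowed = False
try:
    2.0 ** 1024
except OverflowError as err:
    overflowed = True
    print("OverflowError:", err)

# ...whereas float multiplication overflows silently to infinity.
big = 1e308 * 10
print(big)   # inf
```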

In [23]:
# standard warnings about floating point precision need to be respected
q = 0.1
r = 0.2
s = q + r
print(q)
print(r)
print(s)
print(s == 0.3)

0.1
0.2
0.30000000000000004
False
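
One common way to compare floats safely is to test whether they are close rather than exactly equal. The sketch below uses `math.isclose` from the standard library with its default tolerance; whether that tolerance suits a given calculation is an assumption you should check for your own data.

```python
import math

s = 0.1 + 0.2

exact = (s == 0.3)             # exact comparison fails
close = math.isclose(s, 0.3)   # equal within a relative tolerance

print(exact)   # False
print(close)   # True
```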


## bool
The values `True` and `False` are boolean values.

In [24]:
type(True)

bool

`True` and `False` are written with only the first letter capitalized.

`TRUE` or `true` will not be recognized.

## str

Strings are enclosed in quotes; either single or double quotes work.

In [25]:
type('hello')

str

In [26]:
type("hello")

str

In [27]:
print("you're")  # if you use double quotes, you can include single quotes w/o issue

you're


In [28]:
print('you\'re')  # you can escape the quote if you want to include a literal quote

you're


## None

The null object in Python is called `None` and has its own type.

In [29]:
type(None)

NoneType

In [30]:
n = None

To check for 'noneness', use `is None`.

In [31]:
n is None

True

In [32]:
n == None
# This seems to work, but it is not what you should use:
# a class can override `==`, whereas `is` always checks identity.
# There's a full explanation on Stack Overflow: https://stackoverflow.com/a/48504780/2155820

True

In [33]:
if n is None:
    print('hello')

hello
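
To make the difference between `is None` and `== None` concrete, here is a minimal sketch. The class `AlwaysEqual` is hypothetical and exists only to show that `==` can be overridden while `is` cannot.

```python
class AlwaysEqual:
    """Hypothetical class whose == comparison always returns True."""

    def __eq__(self, other):
        return True


thing = AlwaysEqual()

equal_to_none = (thing == None)   # uses the overridden __eq__
is_none = (thing is None)         # checks identity, cannot be overridden

print(equal_to_none)   # True, even though thing is not None
print(is_none)         # False
```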


# Data Structures

## Lists

We will start with lists in Python.

## List Creation
Use square brackets. Lists can contain any mix of data types. You can nest lists inside other lists.

In [34]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

In [35]:
fam2 = [["liz", 1.73],
        ["emma", 1.68],
        ["mom", 1.71],
        ["dad", 1.89]]

In [36]:
fam

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [37]:
fam2

[['liz', 1.73], ['emma', 1.68], ['mom', 1.71], ['dad', 1.89]]

## Subsetting lists
- indexing starts at 0 (the hardest part for R users to adapt to)
- use a series of square brackets for nested lists
- use negative numbers to count from the end

In [38]:
fam[0]

'liz'

In [39]:
fam2[0]

['liz', 1.73]

In [40]:
fam2[0][0]

'liz'

In [41]:
fam[-1]

1.89

In [42]:
fam2[-1]

['dad', 1.89]

In [43]:
fam2[-1][-1]

1.89

## List Slicing
A slice `fam[start:stop]` includes the item at index `start` but excludes the item at index `stop`.
You can think of the slice as happening at the commas corresponding to the numbers.
So `fam[1:3]` cuts the list at the first and third commas and extracts `[1.73, 'emma']`.

In [44]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam[1:3]

[1.73, 'emma']

In [45]:
fam[1:2]

[1.73]

In [46]:
fam[1:1]  # there is nothing between the first and first commas

[]

In [47]:
fam[0:2]

['liz', 1.73]

In [48]:
fam[6:8]

['dad', 1.89]

In [49]:
fam[2:]

['emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [50]:
fam[:4]

['liz', 1.73, 'emma', 1.68]

In [51]:
fam[:]  # slice with no indices will create a (shallow) copy of the list.

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [52]:
fam[] # throws error

SyntaxError: invalid syntax (<ipython-input-52-792e48a646bd>, line 1)

In [53]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
print(fam)
print(fam[-5:-2])

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
[1.68, 'mom', 1.71]


There is no simple built-in syntax for subsetting disjoint items in a list; nothing is equivalent to R's `list[c(1, 3, 7)]`.

A workaround (from Stack Exchange): if the list is

`a = ['0', 'a', 'b', 3, 4, 'e', 6, 7, 8]`

and the list of indexes is stored in

`b = [1,3,5]`

then a simple one-line solution is

`c = [a[i] for i in b]`

In [54]:
a = ['0', 'a', 'b', 3, 4, 'e', 6, 7, 8]
b = [1,3,5]
c = [a[i] for i in b]  # this is technically called a list comprehension
print(c)

['a', 3, 'e']
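
As an alternative sketch to the list comprehension above, the standard library's `operator.itemgetter` can also pick out several positions at once. This is just another option, not something the original workaround uses.

```python
from operator import itemgetter

a = ['0', 'a', 'b', 3, 4, 'e', 6, 7, 8]
b = [1, 3, 5]

# itemgetter(1, 3, 5) builds a callable that fetches those positions.
# With two or more indexes it returns a tuple, so convert to a list if needed.
c = list(itemgetter(*b)(a))
print(c)   # ['a', 3, 'e']
```

Note that with a single index `itemgetter` returns the bare item rather than a tuple, so the list comprehension is the safer choice when the number of indexes can vary.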


## Lists are mutable
This means a list can be changed in place: list methods modify the list itself rather than returning a new list.
If the list is assigned to another name, both names refer to the exact same object.

In [55]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
print(fam)
second = fam    # second references fam. second is not a copy of fam.
second[0] = "sister"  # we make a change to the list 'second'
print(second)
print(fam) # changing the list 'second' has changed the list 'fam'

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['sister', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['sister', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]


In [56]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
print(fam)
second = fam[:]  # creates a copy of the list
# second = fam.copy() # you can also create a copy using the copy() method
second[0] = "sister"
print(second)
print(fam) # changing the list second does not modify fam because second is a copy

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['sister', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]


In [57]:
third = fam.copy()
print(third)
third[1] = 1.65
print(third)
print(fam)

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['liz', 1.65, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
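
If you are unsure whether two names refer to the same list or to separate copies, the `is` operator answers that directly. The short sketch below uses the illustrative names `alias` and `clone`.

```python
fam = ["liz", 1.73, "emma", 1.68]

alias = fam          # a second name for the same list
clone = fam.copy()   # a new list with the same contents

print(alias is fam)   # True: same object
print(clone is fam)   # False: different object
print(clone == fam)   # True: equal contents
```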


You can use list slicing in conjunction with assignment to change values.

In [58]:
print(fam)
fam[1:3] = [1.8, "jenny"]
print(fam)

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['liz', 1.8, 'jenny', 1.68, 'mom', 1.71, 'dad', 1.89]
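
The replacement does not have to be the same length as the slice it replaces. The sketch below, added for illustration, shows a slice assignment that shrinks the list and one that deletes items.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

fam[1:3] = ["x"]   # replace two items with one; the list shrinks
print(fam)         # ['liz', 'x', 1.68, 'mom', 1.71, 'dad', 1.89]

fam[1:2] = []      # assigning an empty list deletes the slice
print(fam)         # ['liz', 1.68, 'mom', 1.71, 'dad', 1.89]
```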


## List Methods

- `list.copy()`
    - Return a shallow copy of the list. Equivalent to a[:]
- `list.append(x)`
    - Add an item to the end of the list. Equivalent to a[len(a):] = [x].

In [59]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.append("me")   # unlike R, you don't have to "capture" the result of the function. 
# the list itself is modified. You can only append one item.
print(fam)

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'me']


In [60]:
fam = fam + [1.8]  # you can also append to a list with the addition `+` operator
# note that this output needs to be 'captured' and assigned back to fam
print(fam)

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'me', 1.8]
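
The difference between `+` and in-place modification matters when another name refers to the same list. In this sketch the names `team`, `roster`, and the two aliases are made up for illustration.

```python
team = ["liz", 1.73]
team_alias = team

team = team + ["emma"]   # builds a new list and rebinds the name team
print(team)         # ['liz', 1.73, 'emma']
print(team_alias)   # ['liz', 1.73]  (the original list is unchanged)

roster = ["liz", 1.73]
roster_alias = roster

roster += ["emma"]       # += extends the existing list in place
print(roster_alias)      # ['liz', 1.73, 'emma']
```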


- `list.insert(i, x)`
    - Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).

- `list.extend(iterable)`
    - Extend the list by appending all the items from the iterable. Equivalent to a[len(a):] = iterable.

In [61]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.insert(4, "joe") # inserts joe at the location of the 4th comma between 1.68 and mom
print(fam)

['liz', 1.73, 'emma', 1.68, 'joe', 'mom', 1.71, 'dad', 1.89]


In [62]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.insert(4, ["joe", 2.0])  # trying to insert multiple items by using a list inserts a list
print(fam)

['liz', 1.73, 'emma', 1.68, ['joe', 2.0], 'mom', 1.71, 'dad', 1.89]


In [63]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.insert(4, "joe", 2.0)  # like append, you can only insert one item
# trying to insert multiple items causes an error
print(fam)

TypeError: insert() takes exactly 2 arguments (3 given)

In [64]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.extend(["joe", 2.0]) # lets you add multiple items, but at the end
print(fam)

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'joe', 2.0]


In [65]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam[4:4] = ["joe", 2.0] # Use slice and assignment to insert multiple items in a specific position
print(fam)

['liz', 1.73, 'emma', 1.68, 'joe', 2.0, 'mom', 1.71, 'dad', 1.89]


### shallow versus deep copy

There are two kinds of copy:
- `list.copy()`, which makes a shallow copy
- `copy.deepcopy(list)`, which makes a deep copy (after `import copy`)

The difference is noticeable when you have other objects (e.g. other lists) nested in lists.

- A shallow copy makes a new list that holds references to the same nested objects.
- A deep copy also makes copies of the nested objects.

In [66]:
a = ["a", 1, 2]
b = ["b", 3, 4]
c = [a, b]

d = c  # this does not make a copy. both d and c refer to the exact same object.
print(c)
print(d)

[['a', 1, 2], ['b', 3, 4]]
[['a', 1, 2], ['b', 3, 4]]


In [67]:
c[1] = "x"  # this change affects both
print(c)
print(d)

[['a', 1, 2], 'x']
[['a', 1, 2], 'x']


#### Shallow copy example

In [68]:
a = ["a", 1, 2]
b = ["b", 3, 4]
c = [a, b]

d = c.copy()
c[1] = "x"  # this change affects only c. it does not affect d because d is a copy.
print(c)
print(d)

[['a', 1, 2], 'x']
[['a', 1, 2], ['b', 3, 4]]


In [69]:
a.append(100) # We update list a. Lists c and d refer to list a. So this change affects c and d
print(c)
print(d)

[['a', 1, 2, 100], 'x']
[['a', 1, 2, 100], ['b', 3, 4]]


#### Deep Copy Example

In [70]:
a = ["a", 1, 2]
b = ["b", 3, 4]
c = [a, b]

import copy
e = copy.deepcopy(c)

c[1] = "x"  # this change affects only c. it does not affect e because e is a copy
print(c)
print(e)

[['a', 1, 2], 'x']
[['a', 1, 2], ['b', 3, 4]]


In [71]:
a.append(100) # list c refers to list a, but e made its own copy of list a. So this change affects only c, not e
print(c)
print(e)

[['a', 1, 2, 100], 'x']
[['a', 1, 2], ['b', 3, 4]]
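
The shallow and deep copy examples above can be summarized with identity checks. This is a compact sketch with illustrative names (`shallow`, `deep`); it tests whether each copy shares the nested list `a`.

```python
import copy

a = ["a", 1, 2]
c = [a, ["b", 3, 4]]

shallow = c.copy()
deep = copy.deepcopy(c)

print(shallow[0] is a)   # True: the shallow copy shares the nested list
print(deep[0] is a)      # False: the deep copy has its own nested list
print(deep[0] == a)      # True: same contents, different object
```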


- `list.remove(x)`
    - Remove the first item from the list whose value is x. It is an error if there is no such item.

- `list.pop([i])`
    - Remove the item at the given position in the list, and return it. If no index is specified, a.pop() removes and returns the last item in the list.

- `list.clear()`
    - Remove all items from the list. Equivalent to del a[:].


In [72]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.remove("liz")
print(fam)

[1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]


In [73]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
j = fam.pop()  # if you don't specify an index, it pops the last item in the list
# default behavior of pop() without any arguments is like a stack. last in first out
print(j)
print(fam)

1.89
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad']


In [74]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
j = fam.pop(0)  # you can also specify an index.
# Using index 0 makes pop behave like a queue. first in first out
print(j)
print(fam)

fam.clear()
print(fam)

liz
[1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
[]
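
Using `pop(0)` as a queue works, but every remaining item has to shift one position each time. If you need a queue in practice, `collections.deque` from the standard library is built for adding and removing at both ends. The sketch below is one possible approach, with made-up names.

```python
from collections import deque

queue = deque(["liz", "emma", "mom", "dad"])

first = queue.popleft()   # remove from the front (first in, first out)
queue.append("joe")       # add to the back

print(first)         # liz
print(list(queue))   # ['emma', 'mom', 'dad', 'joe']
```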



- `list.index(x)`
    - Return zero-based index in the list of the first item whose value is x. Raises a ValueError if there is no such item.
- `list.count(x)`
    - Return the number of times x appears in the list.

In [75]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.index("emma")

2

In [76]:
letters = ["a", "b", "c", "a", "a"]
print(letters.count("a"))

3


In [77]:
fam2 = [["liz", 1.73],
        ["emma", 1.68],
        ["mom", 1.71],
        ["dad", 1.89]]
print(fam2.count("emma"))  # the string by itself does not exist
print(fam2.count(["emma", 1.68]))

0
1
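
Because `index()` and `remove()` raise a `ValueError` when the item is missing, it can help to guard the call. Here is a minimal sketch of two ways to do that; the value `"joe"` is a hypothetical name that is not in the list.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
name = "joe"   # hypothetical value that is not in the list

# Option 1: test membership with `in` before calling index().
position = fam.index(name) if name in fam else None
print(position)   # None

# Option 2: catch the ValueError that remove() (or index()) raises.
try:
    fam.remove(name)
except ValueError:
    print(name, "is not in the list")
```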


- `list.sort(key=None, reverse=False)`
    - Sort the items of the list in place (the arguments can be used for sort customization, see sorted() for their explanation).

- `list.reverse()`
    - Reverse the elements of the list in place.

In [78]:
fam.reverse()  # no output to 'capture', the list is changed in place

In [79]:
print(fam)

[1.89, 'dad', 1.71, 'mom', 1.68, 'emma', 1.73, 'liz']


In [80]:
fam.sort()  # can't sort a list that mixes floats and strings

TypeError: '<' not supported between instances of 'str' and 'float'

In [81]:
some_digits = [4,2,7,9,2,5.1,3]
some_digits.sort()  # the list is sorted in place. no need to resave the output

In [82]:
print(some_digits)  # preserves numeric data types

[2, 2, 3, 4, 5.1, 7, 9]


In [83]:
type(some_digits[4])

float

In [84]:
some_digits.sort(reverse = True)
print(some_digits)

[9, 7, 5.1, 4, 3, 2, 2]
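
The description of `list.sort()` above points to `sorted()`, so here is a short sketch of how the two differ: `sorted()` returns a new list and leaves the original alone. The `key=str` example is just one illustrative way to order a mixed list, by comparing every item as text.

```python
some_digits = [4, 2, 7, 9, 2, 5.1, 3]

ordered = sorted(some_digits)   # returns a new sorted list
print(ordered)       # [2, 2, 3, 4, 5.1, 7, 9]
print(some_digits)   # [4, 2, 7, 9, 2, 5.1, 3]  (unchanged)

# A key function is applied to each item before comparing.
# Here every item is compared by its string form, so mixed types can be ordered.
fam = ["liz", 1.73, "emma", 1.68]
by_text = sorted(fam, key=str)
print(by_text)       # [1.68, 1.73, 'emma', 'liz']
```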
