# Python: the basics

## Using Jupyter notebooks: a quick tour

***Insert -> Insert Cell Below***

Type Python code in the cell, eg:

```
print("Hello Jupyter !")
```

***Shift-Enter*** to run the contents of the cell

When the text to the left of the cell reads `In [*]` (with an asterisk rather than a number), the cell is still running. It's usually best to wait until one cell has finished before running the next.

In [1]:
print("Hello Jupyter !")

Hello Jupyter !


In Jupyter, just typing the name of a variable on the last line of a cell displays its representation:

In [2]:
message = "Hello again !"
message

'Hello again !'

In [3]:
# A 'hash' symbol denotes a comment
# This is a comment. Anything after the 'hash' symbol on the line is ignored by the Python interpreter

print("No comment")  # comment

No comment


## Variables and data types
### Integers, floats, strings

In [4]:
a = 5

In [5]:
a

5

In [6]:
type(a)

int

Adding a decimal point creates a `float`

In [7]:
b = 5.0

In [8]:
b

5.0

In [9]:
type(b)

float

`int` and `float` are collectively called 'numeric' types

(There are also other numeric types, like `complex` for complex numbers. Hexadecimal is not a separate type: a literal like `0xff` is just another way of writing an `int`.)
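A minimal sketch of a complex number and a hexadecimal literal (the variable names `z` and `h` are just for illustration):

```python
# A complex number has a real and an imaginary part (the imaginary part uses a 'j' suffix)
z = 3 + 4j
print(type(z), abs(z))

# A hexadecimal literal is just another way to write an int
h = 0xff
print(type(h), h)

# hex() converts an int to its hexadecimal string form
print(hex(255))
```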

## Challenge

What is the type of the variable `letters` defined below ?

`letters = "ABACBS"`

## Solution

In [10]:
letters = "ABACBS"
type(letters)

str

### Strings

In [11]:
some_words = "Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇"

In [12]:
some_words

'Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇'

In [13]:
type(some_words)

str

In [14]:
more_words = 'You can use "single" quotes'
more_words

'You can use "single" quotes'

In [15]:
triple_quoted_multiline = """In the last years of the nineteenth century,
human affairs were being watched from the timeless worlds of space.
Nobody would have believed that we were being scrutinized as a ....

.. etc ..
"""

print(triple_quoted_multiline)

In the last years of the nineteenth century,
human affairs were being watched from the timeless worlds of space.
Nobody would have believed that we were being scrutinized as a ....

.. etc ..



In [16]:
# You can substitute variables into a string like this.
# The variables listed after the string replace each `{0}`, `{1}` etc, in order

formatted = "{0} and BTW, did I mention that {1}".format(more_words, some_words)
print(formatted)

# The example above is 'new-style' string formatting. 
# You may also see 'old-style' (C-style) string formatting in examples, which looks like: 

oldskool = "%s and BTW, did I mention that %s" % (more_words, some_words)

# There are lots of fancy ways to format numbers in strings (eg number of decimal places, scientific notation)
# that we won't go into today. See: https://pyformat.info/

You can use "single" quotes and BTW, did I mention that Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇
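As a small taste of number formatting, here is a hedged sketch using the same `.format()` style plus an f-string (available from Python 3.6). The variable names `pi_ish` and `count` are made up for this example:

```python
pi_ish = 3.14159265
count = 42

# Fixed number of decimal places
two_places = "{:.2f}".format(pi_ish)
print(two_places)     # 3.14

# Scientific notation
scientific = "{:.3e}".format(123456.789)
print(scientific)     # 1.235e+05

# f-strings put the variable (and an optional format) directly inside the string
summary = f"{count} items, pi is roughly {pi_ish:.3f}"
print(summary)        # 42 items, pi is roughly 3.142
```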


## Operators

`+`  `-`  `*`  `/`  `%`  `**`  `//`  

`+=`  `*=`  `-=`  `/=`

In [17]:
# int + int = int
a = 5
a + 1

6

In [18]:
# float + int = float
b = 5.0
b + 1

6.0

In [19]:
a + b

10.0

In [20]:
some_words = "Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇"
a = 6
a + some_words

TypeError: unsupported operand type(s) for +: 'int' and 'str'

In [21]:
str(a) + " " + some_words

'6 Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇'
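Conversion works in the other direction too. A short sketch, with illustrative variable names:

```python
count = int("42")        # str -> int
ratio = float("2.5")     # str -> float
whole = int(3.9)         # float -> int drops the fractional part (no rounding)
label = str(count) + " samples"

print(count + 1, ratio * 2, whole, label)   # 43 5.0 3 42 samples
```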

In [22]:
# Multiplication
a * 10

60

In [23]:
# Division
a / 2

3.0

In [24]:
# Power
a**2

36

In [25]:
# Modulus - divide as whole numbers and return the remainder
a % 2

0

In [26]:
# Shorthand: operators with assignment
a += 1
a

7
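The `//` operator listed above is floor division. A quick sketch comparing it with `/` and `%` (the names `n`, `q` and `r` are only for illustration):

```python
n = 7

print(n / 2)     # 3.5  true division always gives a float
print(n // 2)    # 3    floor division rounds down to a whole number
print(n % 2)     # 1    the remainder
print(-7 // 2)   # -4   'rounds down' means toward negative infinity

# divmod() returns the floor division result and the remainder together
q, r = divmod(n, 2)
print(q, r)      # 3 1
```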

## Lists and sequence types

### Lists

In [27]:
numbers = [2, 4, 6, 8, 10]
numbers

[2, 4, 6, 8, 10]

In [28]:
len(numbers)

5

In [29]:
# Lists can contain multiple data types
mixed_list = ["asdf", 2, 3.142]
mixed_list

['asdf', 2, 3.142]

In [30]:
list_of_lists = [mixed_list, numbers, ['a', 'b', 'c']]
list_of_lists

[['asdf', 2, 3.142], [2, 4, 6, 8, 10], ['a', 'b', 'c']]

In [31]:
numbers[0]

2

In [32]:
numbers[3]

8

In [33]:
numbers[3] = numbers[3] * 100
numbers

[2, 4, 6, 800, 10]

In [34]:
numbers.append(12)
numbers

[2, 4, 6, 800, 10, 12]

In [35]:
numbers.extend([14, 16, 18])
numbers

[2, 4, 6, 800, 10, 12, 14, 16, 18]

In [36]:
# The '+' operator joins lists like list.extend(), but returns a new list and leaves `numbers` unchanged
numbers + [100, 200, 300, 400]

[2, 4, 6, 800, 10, 12, 14, 16, 18, 100, 200, 300, 400]
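The difference between `+` and `extend()` is easiest to see side by side. A minimal sketch with made-up list names:

```python
first = [1, 2]

# '+' builds a brand new list; `first` is left alone
combined = first + [3, 4]
print(first, combined)    # [1, 2] [1, 2, 3, 4]

# extend() changes `first` in place and returns None
returned = first.extend([5, 6])
print(first, returned)    # [1, 2, 5, 6] None
```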

### Tuples

In [37]:
tuples_are_immutable = ("bar", 100, 200, "foo")
tuples_are_immutable

('bar', 100, 200, 'foo')

In [38]:
tuples_are_immutable[1]

100

In [39]:
tuples_are_immutable[1] = 666

TypeError: 'tuple' object does not support item assignment

### Sets

In [40]:
unique_items = set([1, 1, 2, 2, 3, 4, 1, 2, 3, 4])
# or curly brackets
# unique_items = {1, 1, 2, 2, 3, 4, 1, 2, 3, 4}
unique_items

{1, 2, 3, 4}
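Sets are also handy for membership tests and for comparing collections. A short sketch (the names `a_set` and `b_set` are illustrative):

```python
a_set = {1, 2, 3, 4}
b_set = {3, 4, 5}

print(3 in a_set)       # membership test

print(a_set | b_set)    # union: items in either set
print(a_set & b_set)    # intersection: items in both sets
print(a_set - b_set)    # difference: items in a_set but not in b_set
```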

### Slicing

In [41]:
numbers = [2, 4, 6, 8, 10, 12]

# list[start:end]
# start is inclusive, end isn't

numbers[0:3]

[2, 4, 6]

In [42]:
numbers[4:7]

[10, 12]

In [43]:
numbers[:3] # omitting start implies 0 (the very start)

[2, 4, 6]

In [44]:
numbers[3:] # omitting end means to the very end, ie len(numbers)

[8, 10, 12]

In [45]:
numbers[-1:] # negative values count backwards from the end of the list

[12]

In [46]:
numbers[:-1]

[2, 4, 6, 8, 10]

In [47]:
# you can also specify a step size
# list[start:end:step]

numbers[0:6:2]

[2, 6, 10]
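A few more slicing patterns, sketched on the same six-item list, showing negative indices and a negative step:

```python
numbers = [2, 4, 6, 8, 10, 12]

print(numbers[-1])     # 12                    a single negative index counts from the end
print(numbers[-2:])    # [10, 12]              the last two items
print(numbers[1::2])   # [4, 8, 12]            every second item, starting at index 1
print(numbers[::-1])   # [12, 10, 8, 6, 4, 2]  a step of -1 walks the list backwards
```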

In [48]:
# [:] is a shorthand for copying a list.
# Equivalent to:
# n_copy = list(numbers)

n_copy = numbers[:]
n_copy

[2, 4, 6, 8, 10, 12]

In [49]:
# Changing the copy does not change the original list
n_copy[3] = 800
n_copy

[2, 4, 6, 800, 10, 12]

In [50]:
numbers

[2, 4, 6, 8, 10, 12]
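Copying matters because plain assignment does not copy a list, it only gives the same list a second name. A minimal sketch with hypothetical names:

```python
original = [2, 4, 6]

alias = original        # NOT a copy: both names point at the same list
real_copy = original[:] # a new list with the same items

alias[0] = 'changed'
real_copy[1] = 'only in copy'

print(original)    # ['changed', 4, 6]
print(real_copy)   # [2, 'only in copy', 6]
```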

## Challenge

Given the list: `['banana', 'cherry', 'strawberry', 'orange']`

Return a list of just the red fruits.

## Solution

In [51]:
fruits = ['banana', 'cherry', 'strawberry', 'orange']
red_ones = fruits[1:3]
red_ones

['cherry', 'strawberry']

### Dictionaries

Dictionaries store a mapping of key-value pairs. Since Python 3.7 they keep their keys in insertion order; in older versions the order was not guaranteed.

Other programming languages might call this a 'hash', 'hashtable' or 'hashmap'.

In [52]:
pairs = {'Apple': 1, 'Orange': 2, 'Pear': 4}
pairs

{'Apple': 1, 'Orange': 2, 'Pear': 4}

In [53]:
pairs['Orange']

2

In [54]:
pairs['Orange'] = 16
pairs

{'Apple': 1, 'Orange': 16, 'Pear': 4}

In [55]:
pairs.items()
# list(pairs.items())

dict_items([('Apple', 1), ('Orange', 16), ('Pear', 4)])

In [56]:
pairs.values()
# list(pairs.values())

dict_values([1, 16, 4])

In [57]:
pairs.keys()
# list(pairs.keys())

dict_keys(['Apple', 'Orange', 'Pear'])

In [58]:
len(pairs)

3

In [59]:
dict_of_dicts = {'first': {1:2, 2: 4, 4: 8, 8: 16}, 'second': {'a': 2.2, 'b': 4.4}}
dict_of_dicts

{'first': {1: 2, 2: 4, 4: 8, 8: 16}, 'second': {'a': 2.2, 'b': 4.4}}
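Some other common dictionary operations, sketched with a made-up `stock` dictionary: testing for a key, looking up with a default, and adding or removing entries.

```python
stock = {'Apple': 1, 'Orange': 2}

print('Apple' in stock)        # True   `in` tests the keys
print(stock.get('Kiwi', 0))    # 0      get() returns a default instead of raising KeyError

stock['Kiwi'] = 10             # assigning to a new key adds it
del stock['Apple']             # del removes a key and its value
print(stock)                   # {'Orange': 2, 'Kiwi': 10}
```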

## Functions

Functions wrap up reusable pieces of code - the *DRY* ("Don't Repeat Yourself") principle

Significant whitespace: the body of the function is indicated by indenting by 4 spaces

*(We also use these indented blocks for if/else, for and while statements .. later !)*

`return` statements immediately return a value (or `None` if no value is given)

Any code in the function after the `return` statement does not get executed.

In [60]:
def square(x):
    return x**2

def hyphenate(a, b):
    return a + '-' + b
    print("We will never get here")

print(square(16), hyphenate('python', 'esque'))

256 python-esque
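To illustrate the point about `None`: a function with no `return` statement still hands back a value, and that value is `None`. A short sketch (both function names are hypothetical):

```python
def greet(name):
    print("Hello " + name)   # prints, but has no return statement

result = greet("Jupyter")
print(result)                # None

def first_negative(values):
    for v in values:
        if v < 0:
            return v         # leaves the function straight away
    return None              # reached only if no negative value was found

print(first_negative([3, -1, -7]), first_negative([1, 2]))   # -1 None
```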


### Indentation and whitespace

* Python uses spaces at the start of a line to indicate a 'block' of code.
* A new block of code should be indented by **four** spaces.

* For a function, all the indented code is part of the function.
* This also applies to loops like `for` and `while` and conditionals like `if`

(Indenting/dedenting by four spaces in Python is equivalent to opening **{** and closing **}** curly brackets in languages like Java, Javascript, C, C++, C# etc)

(You can technically use tab characters, but please don't. The official Python style guide prefers spaces https://www.python.org/dev/peps/pep-0008/).
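Blocks can be nested, with each level indented by another four spaces. A rough sketch, using a hypothetical `describe` function:

```python
def describe(values):
    # Function body: indented 4 spaces
    for v in values:
        # Loop body: indented 8 spaces
        if v % 2 == 0:
            # Body of the `if`: indented 12 spaces
            print(v, "is even")
        else:
            print(v, "is odd")
    # Dedenting back to 4 spaces ends the loop; this line runs once, after it
    return len(values)

describe([1, 2, 3])
```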

In [61]:
# Functions can return multiple values (just return a tuple and unpack it)
def lengths(a, b, c):
    return len(a), len(b), len(c)

x, y, z = lengths("long", "longer", "LONGEREST")
print(x, y, z)

4 6 9


In [62]:
def split_at(seq, residue='K'):
    """
    Takes a protein sequence (as a string) and splits it at each K residue,
    or the residue specified in the `residue` keyword argument. Split point
    residue is discarded.
    
    Returns a list of strings.
    """
    return seq.split(residue)

split_at('MILKGROGDRINKPINEAPPLE')

['MIL', 'GROGDRIN', 'PINEAPPLE']

In [63]:
# Functions can have an indeterminate number of arguments and keyword arguments using * and **
import math

def vector_magnitude(x, y, *args, **kwargs):
    
    # print(args)    # args is a tuple
    # print(kwargs)  # kwargs is a dictionary
    
    scale = kwargs.get('scale', 1)
    
    vector = [x,y] + list(args)
    return math.sqrt(sum(v**2 for v in vector)) * scale

In [64]:
# Note: vector_magnitude only looks for a `scale` keyword, so `m=2` is accepted but ignored
print(vector_magnitude(1, 2, 4, 8, m=2))

9.219544457292887
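To see what `*args` and `**kwargs` actually collect, here is a minimal sketch with a hypothetical `show_args` function that simply returns what it was given. The last call also shows `*` and `**` used the other way round, to unpack a list and a dictionary into arguments:

```python
def show_args(first, *args, **kwargs):
    """Return the arguments exactly as the function received them."""
    return first, args, kwargs

print(show_args(1))
# (1, (), {})

print(show_args(1, 2, 3, scale=2, units='cm'))
# (1, (2, 3), {'scale': 2, 'units': 'cm'})

coords = [3, 4]
options = {'scale': 10}
print(show_args(*coords, **options))
# (3, (4,), {'scale': 10})
```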


## Conditionals

In [65]:
a = 10
b = 0
a > 1

True

In [66]:
if a > 1:
    print("a is greater than one")

a is greater than one


In [67]:
word = 'Bird'

# Note: Double equals for a conditional vs single equals for assignment !
if word == 'Bird':
    print('Bird is the word.')
    
if word != 'Girt':
    print('The word is not girt.')

Bird is the word.
The word is not girt.


In [68]:
if 'ird' in word:
    print("'ird' is in Bird.")
    
letters = ['B', 'i', 'r', 'd']
if 'i' in letters:
    print("'i' is in letters.")

'ird' is in Bird.
'i' is in letters.


*Protip*: Long lines can be split across two or more using a backslash ('\')

This can make your code more readable.

There should be nothing after the backslash, including whitespace.

Try to keep lines to 79 characters or fewer for a PEP-8 style bonus.

In [69]:
if 'I' not in 'team' or \
   'I' not in 'TEAM':
    print("There is no 'I' in team (or TEAM).")

There is no 'I' in team (or TEAM).
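An alternative to the backslash, and the one PEP-8 prefers, is to wrap the expression in parentheses: Python then continues the line automatically. A small sketch (the names `verdict` and `total` are illustrative):

```python
# Implied continuation: inside parentheses no backslash is needed
if ('I' not in 'team' or
        'I' not in 'TEAM'):
    verdict = "There is no 'I' in team (or TEAM)."

# The same works for long calculations, lists and function calls
total = (1 + 2 + 3 +
         4 + 5 + 6)

print(verdict, total)
```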


In [70]:
# Boolean logic
# True and True => True
a > 1 and b <= 0

True

In [71]:
# True or False => True
a > 1 or b > 1

True

In [72]:
if a > 100:
    print("a is greater than one hundred")
elif a > 50:
    print("a is greater than fifty but less than one hundred")
else:
    print("a is less than fifty")
    
# For better or worse, Python has no classic case/switch statement (Python 3.10 added `match` for pattern matching) - for simple branching you just use if/elif/elif/else

a is less than fifty


In [73]:
# Truthiness
if a:
    print("A non-zero int is truthy")

if not (a - 10):
    print("The int 0 is 'falsey' ... not False => True !")

if '' or [] or () or dict():
    print("We will never see this since an empty string, list, tuple and dict are all 'falsey'")
    
if "    ":
    print("A non-empty string, even whitespace, is 'truthy'")

A non-zero int is truthy
The int 0 is 'falsey' ... not False => True !
A non-empty string, even whitespace, is 'truthy'
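If you are unsure whether a value is truthy or falsey, `bool()` will tell you. A quick sketch over a handful of example values:

```python
values = [0, 1, -1, 0.0, '', ' ', [], [0], None]

for value in values:
    print(repr(value), '->', bool(value))

# Note that [0] is truthy: the list is not empty, even though its only item is falsey
results = [bool(v) for v in values]
```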


## Loops

A `for` loop works on any iterable: sequence types, generators and iterators

(this includes lists, tuples, strings and dictionaries)

In [74]:
for letter in "ABCD..meh":
    print(letter)

A
B
C
D
.
.
m
e
h


In [75]:
ts = [('Z', 99), ('Y', 98), ('X', 97)]

for t in ts:
    print(t)
    
# using tuple unpacking
for m, n in ts:
    print(m, n)

('Z', 99)
('Y', 98)
('X', 97)
Z 99
Y 98
X 97


In [76]:
# for on dictionary.items()
d = {'A': 1, 'B': 2, 'C': 3}

for item in d.items():
    # print(type(item))
    print(item)

('A', 1)
('B', 2)
('C', 3)


In [77]:
for k, v in d.items():
    print(k, v)

A 1
B 2
C 3
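Two built-in helpers often used with `for` loops are `range()`, which produces a run of integers, and `enumerate()`, which pairs each item with its index. A brief sketch with made-up data:

```python
# range(3) yields 0, 1, 2 (the end value is excluded, just like slicing)
for i in range(3):
    print(i)

# enumerate() gives (index, item) pairs, unpacked here into two variables
fruits = ['banana', 'cherry']
for index, fruit in enumerate(fruits):
    print(index, fruit)

# Looping over a dictionary directly gives its keys
for key in {'A': 1, 'B': 2}:
    print(key)
```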


`while` loops keep looping while their condition is true:

```
while some_condition:
    do_stuff()
```

Note: If the condition for your `while` loop never becomes `False`, the loop will run forever (in Jupyter you can do *Kernel -> Interrupt* to break out of the infinite loop).

In [78]:
a = 0
while a < 16:
    print(a, end=' ')
    a += 1

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 

`break` immediately exits a loop

`continue` immediately starts the next iteration of the loop

Any code inside the loop after a `break` or `continue` is skipped.

In [79]:
a = 0
while True:
    a += 1
    
    if a > 16:
        break
        print('We will never see this.')
    
    if a % 2:
        continue
        print('We will also never see this.')
        
    print(a, end=' ')

2 4 6 8 10 12 14 16 
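`break` and `continue` behave the same way inside a `for` loop. A minimal sketch that collects even numbers until it meets a value greater than 10 (the data is made up):

```python
evens = []
for n in [1, 2, 3, 4, 99, 6]:
    if n > 10:
        break        # stop the loop entirely; the 6 is never reached
    if n % 2:
        continue     # odd number: skip to the next item
    evens.append(n)

print(evens)   # [2, 4]
```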

### List comprehensions

List comprehensions are a shorthand way to loop over a list, modify the items and create a new list.

In [80]:
# Instead of doing
new_list = []
for i in range(0,11):
    new_list.append(i**2)

new_list

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

In [81]:
# Use a list comprehension instead
new_list = [i**2 for i in range(0,11)]
new_list

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

In [82]:
# You can also `filter` values using an if statement inside the list comprehension
new_list = [i**2 for i in range(0,11) if i < 4]
new_list

[0, 1, 4, 9]
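Comprehensions are not limited to numbers. A short sketch over a made-up list of words, transforming and filtering in one step. The last line uses the same syntax with curly brackets to build a dictionary:

```python
words = ['banana', 'cherry', 'kiwi']

# Transform (upper case) and filter (longer than 4 letters) together
shouted = [w.upper() for w in words if len(w) > 4]
print(shouted)   # ['BANANA', 'CHERRY']

# A dictionary comprehension: key: value pairs inside curly brackets
lengths = {w: len(w) for w in words}
print(lengths)   # {'banana': 6, 'cherry': 6, 'kiwi': 4}
```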

***End part 1. Stand up and stretch for a moment.***