# Week 2: Python Basics

## POP77001 Computer Programming for Social Scientists

### Tom Paskhalis

##### Module website: [bit.ly/POP77001](https://bit.ly/POP77001)

<table>
    <tr>
        <td><img width="500" src='../imgs/python_snake.jpg'></td>
        <td><img width="500" src='../imgs/python_monty.png'></td>
    </tr>
</table>

<div style="text-align: center;">
    <img width="500" height="300" src="../imgs/xkcd_353.png">
</div>

Source: [xkcd](https://xkcd.com/353/)

## Python background

<table>
    <tr>
        <td><img width="200" height="100" src='../imgs/guido.gif'></td>
        <td><img width="200" height="100" src='../imgs/python_logo.png'></td>
    </tr>
</table>

Source: [Guido van Rossum](https://gvanrossum.github.io/), [Python Software Foundation](https://www.python.org/psf-landing/)

- Started as a side-project in 1989 by Guido van Rossum, BDFL (benevolent dictator for life) until 2018.
- Python 3, first released in 2008, is the current major version
- Python 2 support stopped on 1 January 2020

## Python basics

- Python is an *interpreted* language (like R and Stata)
- Every program is executed one *command* (aka *statement*) at a time
- Which also means that work can be done interactively

In [1]:
print("Hello World!")

Hello World!


## Python conceptual hierarchy

Python programs can be decomposed into modules, statements, expressions, and objects, as follows:

1. *Programs* are composed of *modules*
2. *Modules* contain *statements*
3. *Statements* contain *expressions*
4. *Expressions* create and process *objects*
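
As a rough illustration of this hierarchy, the sketch below could be saved as a one-file module. The file name `hierarchy_demo.py` and the variable names are hypothetical and not part of the course materials.

```python
# Contents of a hypothetical module file, hierarchy_demo.py

radius = 2                    # Statement (assignment); the expression `2` creates an int object
area = 3.14159 * radius ** 2  # Statement containing the expression `3.14159 * radius ** 2`
print(area)                   # Statement; the call expression processes a float object
```

Each line is one statement, each statement contains at least one expression, and each expression creates or processes objects.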

## Python objects

- Everything that Python operates on is an *object*
- This includes numbers, strings, data structures, functions, etc.
- Each object has a *type* (e.g. string or function) and internal data
- Objects can be *mutable* (e.g. list) and *immutable* (e.g. string)
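
A minimal sketch of these points, using illustrative variable names: `type()` reports the type of any object, a list can be changed in place, and an attempt to change a string in place raises a `TypeError`.

```python
number = 42
greeting = 'hello'
scores = [1, 2, 3]

# Every object has a type, including functions such as len
types = (type(number), type(greeting), type(scores), type(len))

# A list is mutable: it can be changed in place
scores.append(4)

# A string is immutable: item assignment raises TypeError
try:
    greeting[0] = 'J'
    string_changed = True
except TypeError:
    string_changed = False
```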

## Operators

*Objects* and *operators* are combined to form *expressions*. Key *operators* are:

- Arithmetic (`+`, `-`, `*`, `**`, `/`, `//`, `%`)
- Boolean (`and`, `or`, `not`)
- Relational (`==`, `!=`, `>`, `>=`, `<`, `<=`)
- Assignment (`=`, `+=`, `-=`, `*=`, `/=`)
- Membership (`in`)
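
A short sketch of some operators from the list above. The variable names are illustrative; the values in the comments follow from standard Python 3 semantics.

```python
true_division = 7 / 2       # 3.5, `/` always returns a float
floor_division = 7 // 2     # 3, quotient rounded down
remainder = 7 % 2           # 1, remainder of the division
negative_floor = -7 // 2    # -4, rounding is towards negative infinity

negated = not (3 > 2)       # False, `not` flips a Boolean value

total = 10
total *= 3                  # Equivalent to total = total * 3
total /= 4                  # Equivalent to total = total / 4
```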

## Basic mathematical operations in Python

In [2]:
1 + 1

2

In [3]:
5 - 3

2

In [4]:
6 / 2

3.0

In [5]:
4 * 4

16

In [6]:
# Exponentiation <- Python comments start with #
2 ** 4

16

## Basic logical operations in Python

In [7]:
3 != 1 # Not equal

True

In [8]:
3 > 3 # Greater than

False

In [9]:
3 >= 3 # Greater than or equal

True

In [10]:
False or True # True if either first or second operand is True, False otherwise

True

In [11]:
3 > 3 or 3 >= 3 # Combining two comparisons with `or`

True

## Assignment operations

Assignments create object references. The *target* (or *name*) on the left is bound to the *object* on the right.

In [12]:
x = 3

In [13]:
x

3

In [14]:
x += 2 # Increment assignment, equivalent to x = x + 2

In [15]:
x

5

## Assignment vs Comparison Operators

Because the `=` (assignment) and `==` (equality comparison) operators look very similar, they can sometimes cause confusion.

In [16]:
x = 3

In [17]:
x

3

In [18]:
x == 3

True

## Membership operations

The `in` operator returns `True` if the object on the left side is contained in the sequence (or other collection) on the right.

In [19]:
'a' in 'abc'

True

In [20]:
4 in [1, 2, 3] # [1,2,3] is a list

False

In [21]:
4 not in [1, 2, 3]

True

## Object types

Python objects can have *scalar* and *non-scalar* types. Scalar objects are indivisible.

4 main types of scalar objects in Python:

- Integer (`int`)
- Real number (`float`)
- Boolean (`bool`)
- Null value (`None`)

## Scalar types

In [22]:
type(7)

int

In [23]:
type(3.14)

float

In [24]:
type(True)

bool

In [25]:
type(None)

NoneType

In [26]:
int(3.14) # Scalar type conversion (casting)

3
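
A few more conversions between scalar types, sketched with illustrative names. Note that `int()` truncates towards zero, whereas the built-in `round()` rounds to the nearest integer.

```python
truncated = int(-3.9)      # -3, the fractional part is dropped
rounded = round(3.7)       # 4
from_text = float('2.5')   # 2.5, strings can be parsed into numbers
as_text = str(7)           # '7'
falsy = bool(0)            # False, zero converts to False
truthy = bool(3.14)        # True, any non-zero number converts to True
```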

## Non-scalar types

In contrast to scalars, non-scalar objects have some internal structure. For *sequences* (strings, tuples, lists) this allows indexing, slicing and other interesting operations.

Most common non-scalar types in Python are:

- String (`str`) - *immutable* ordered sequence of characters
- Tuple (`tuple`) - *immutable* ordered sequence of elements
- List (`list`) - *mutable* ordered sequence of elements
- Set (`set`) - *mutable* unordered collection of unique elements
- Dictionary (`dict`) - *mutable* collection of key-value pairs (insertion-ordered since Python 3.7, but not indexed by position)

## Examples of non-scalar types

In [27]:
s = 'time flies like a banana'
t = (0, 'one', 1, 2)
l = [0, 'one', 1, 2]
o = {'apple', 'banana', 'watermelon'}
d = {'apple': 150.0, 'banana': 120.0, 'watermelon': 3000.0}

In [28]:
type(s)

str

In [29]:
type(t)

tuple

In [30]:
type(l)

list

In [31]:
type(o)

set

In [32]:
type(d)

dict

## Indexing in Python starts from 0

<div style="text-align: center;">
    <img width="600" height="400" src="../imgs/xkcd_163.png">
</div>

Source: [xkcd](https://xkcd.com/163/)

## Strings

In [33]:
s

'time flies like a banana'

In [34]:
len(s) # Length of string (including whitespace)

24

In [35]:
s[0] # Subset 1st element (indexing in Python starts from zero!)

't'

In [36]:
s[5:] # Subset all elements starting from 6th

'flies like a banana'

In [37]:
s + '!' # Strings can be concatenated together

'time flies like a banana!'

## Objects have methods

- Python objects of built-in types have *methods* associated with them
- They can be thought of as function-like objects
- However, their syntax is `object.method()` as opposed to `function(object)`

In [38]:
len(s) # Function

24

In [39]:
s.upper() # Method (makes string upper-case)

'TIME FLIES LIKE A BANANA'

## String methods

Some examples of methods associated with strings. More details [here](https://docs.python.org/3/library/stdtypes.html#string-methods).

In [40]:
s.capitalize() # Note that only the first character gets capitalized

'Time flies like a banana'

In [41]:
s.split(sep = ' ') # Here we supply an argument 'sep' to our method call

['time', 'flies', 'like', 'a', 'banana']

In [42]:
s.replace(' ', '-') # Arguments can also be matched by position, not just name

'time-flies-like-a-banana'

In [43]:
'-'.join(s.split(sep = ' ')) # Method calls can be nested within each other

'time-flies-like-a-banana'

## Tuples

In [44]:
t # Tuples can contain elements of different types

(0, 'one', 1, 2)

In [45]:
len(t)

4

In [46]:
t[1:]

('one', 1, 2)

In [47]:
t + ('three', 5) # Like strings, tuples can be concatenated

(0, 'one', 1, 2, 'three', 5)

## Lists

In [48]:
l # Like tuples, lists can contain elements of different types

[0, 'one', 1, 2]

In [49]:
l[1] = 1 # Unlike tuples, lists are mutable

In [50]:
l

[0, 1, 1, 2]

In [51]:
t[1] = 1 # Compare to tuple

TypeError: 'tuple' object does not support item assignment

## More on subsetting

In [52]:
l

[0, 1, 1, 2]

In [53]:
l[1:] # Subset all elements starting from 2nd

[1, 1, 2]

In [54]:
l[-1] # Subset the last element

2

In [55]:
l[::2] # Subset every second element, list[start:stop:step]

[0, 1]

In [56]:
l[::-1] # Subset all elements in reverse order

[2, 1, 1, 0]
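
The same `[start:stop:step]` syntax applies to any sequence. The sketch below uses an illustrative string; `start` is inclusive, `stop` is exclusive, and omitted parts default to the whole sequence.

```python
word = 'banana'

first_three = word[:3]       # 'ban', elements 0, 1 and 2
middle = word[1:5]           # 'anan', element 5 is excluded
every_other = word[::2]      # 'bnn', elements 0, 2 and 4
last_two = word[-2:]         # 'na', negative indices count from the end
reversed_word = word[::-1]   # 'ananab', negative step walks backwards
```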

## Sets

In [57]:
o

{'apple', 'banana', 'watermelon'}

In [58]:
{'apple', 'apple', 'banana', 'watermelon'} # Sets retain only unique values

{'apple', 'banana', 'watermelon'}

In [59]:
{'apple'} < o # Sets can be compared (e.g. one being subset of another)

True

In [60]:
o[1] # Unlike strings, tuples and lists, sets are unordered

TypeError: 'set' object is not subscriptable

## Set methods in Python

<div style="text-align: center;">
    <img width="500" height="300" src="../imgs/venn_diagram_sets.png">
</div>
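
The main set methods and their infix equivalents can be sketched on two small illustrative sets (the names `a` and `b` are arbitrary):

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a.union(b)                        # a | b, elements in either set
intersection = a.intersection(b)          # a & b, elements in both sets
difference = a.difference(b)              # a - b, elements in a but not in b
symmetric = a.symmetric_difference(b)     # a ^ b, elements in exactly one set
is_subset = {3, 4}.issubset(a)            # {3, 4} <= a
```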

## Example of set methods

<div style="text-align: center;">
    <img width="500" height="300" src="../imgs/europe_diagram.png">
</div>

Source: [Wikipedia](https://en.wikipedia.org/w/index.php?title=File:Supranational_European_Bodies-en.svg)

## Example of set methods (cont'd)

In [61]:
nordic = {'Denmark', 'Iceland', 'Finland', 'Norway', 'Sweden'}
eu = {'Denmark', 'Finland', 'Sweden'}
krones = {'Denmark', 'Sweden'}

In [62]:
euro = eu.difference(krones) # The same can be expressed using the infix operator `eu - krones`
euro

{'Finland'}

In [63]:
efta = nordic.difference(eu).union({'Liechtenstein', 'Switzerland'}) # Method calls can also be 'chained'
efta

{'Iceland', 'Liechtenstein', 'Norway', 'Switzerland'}

In [64]:
efta.intersection(nordic) # efta & nordic

{'Iceland', 'Norway'}

In [65]:
schengen = efta.union(eu) # efta | eu
schengen

{'Denmark',
 'Finland',
 'Iceland',
 'Liechtenstein',
 'Norway',
 'Sweden',
 'Switzerland'}

## Dictionaries

In [66]:
d # key:value pair, fruit_name:average_weight

{'apple': 150.0, 'banana': 120.0, 'watermelon': 3000.0}

In [67]:
d['apple'] # Unlike strings, tuples and lists, dictionaries are indexed by 'keys'

150.0

In [68]:
d[0] # Rather than integers

KeyError: 0

In [69]:
d['strawberry'] = 12.0 # They are, however, mutable like lists and sets
d

{'apple': 150.0, 'banana': 120.0, 'watermelon': 3000.0, 'strawberry': 12.0}
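
A few commonly used dictionary operations, sketched on a separate illustrative dictionary (`weights` is a hypothetical name) so that `d` above stays unchanged:

```python
weights = {'apple': 150.0, 'banana': 120.0}

fruit_names = list(weights.keys())       # Keys, in insertion order
fruit_weights = list(weights.values())   # Values, in the same order
has_kiwi = 'kiwi' in weights             # Membership is tested against keys
kiwi_weight = weights.get('kiwi', 0.0)   # Default value instead of a KeyError
removed = weights.pop('banana')          # Remove a key and return its value
```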

## Conversion between non-scalar types

In [70]:
t ## Tuple

(0, 'one', 1, 2)

In [71]:
list(t) ## Convert to list with a `list` function

[0, 'one', 1, 2]

In [72]:
[x for x in t] ## List comprehension, [expr for elem in iterable if test]

[0, 'one', 1, 2]

In [73]:
set([0, 1, 1, 2]) ## Conversion to set retains only unique values

{0, 1, 2}
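
The optional `if test` part of a list comprehension filters elements, while `expr` can transform them. A minimal sketch, redefining the tuple `t` so that it runs on its own:

```python
t = (0, 'one', 1, 2)

# Keep only the integer elements of the tuple
numbers_only = [x for x in t if isinstance(x, int)]

# Transform each retained element
doubled = [x * 2 for x in numbers_only]

# Convert back to a tuple, and split a string into a list of characters
back_to_tuple = tuple(doubled)
characters = list('abc')
```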

## Aliasing vs copying in Python

- Assignment binds the variable name on the left of the `=` sign to the object on the right.
- But the same object can have different names.
- Immutable objects cannot be modified in place: operations on them create a new object, to which the name is then re-bound.
- But for mutable objects (lists, sets, dictionaries) this can create hard-to-track problems.

## Example of aliasing/copying for immutable types

In [74]:
x = 'test' # Object of type string is assigned to variable 'x'
x

'test'

In [75]:
y = x # y is created as an alias (alternative name) of x
y

'test'

In [76]:
x = 'rest' # Another object of type string is assigned to 'x'
x

'rest'

In [77]:
y

'test'

## Example of aliasing/copying for mutable types

In [78]:
d

{'apple': 150.0, 'banana': 120.0, 'watermelon': 3000.0, 'strawberry': 12.0}

In [79]:
d1 = d # Just an alias
d2 = d.copy() # Create a copy
d['watermelon'] = 500 # Modify original dictionary

In [80]:
d1

{'apple': 150.0, 'banana': 120.0, 'watermelon': 500, 'strawberry': 12.0}

In [81]:
d2

{'apple': 150.0, 'banana': 120.0, 'watermelon': 3000.0, 'strawberry': 12.0}
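
The same behaviour can be sketched with lists. The identity operator `is`, not covered in the operator list earlier, checks whether two names refer to the very same object. The variable names here are illustrative.

```python
original = [1, 2, 3]
alias = original              # Second name for the same list object
duplicate = original.copy()   # New list object with the same elements

same_object = alias is original          # True
separate_object = duplicate is original  # False

original.append(4)            # Modify the list through one of its names
# alias now also shows the change, duplicate does not
```

Note that `.copy()` makes a shallow copy: the outer list is new, but any mutable elements inside it are still shared.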

## Summary of built-in object types in Python

|  Type   |  Description  |   Scalar   | Mutability |   Order   |
| :-----: | :-----------: | :--------: | :--------: | :-------: |
|  `int`  |    integer    |   scalar   | immutable  |           |
| `float` |  real number  |   scalar   | immutable  |           |
| `bool`  |    Boolean    |   scalar   | immutable  |           |
| `None`  | Python 'Null' |   scalar   | immutable  |           |
|  `str`  |    string     | non-scalar | immutable  |  ordered  |
| `tuple` |     tuple     | non-scalar | immutable  |  ordered  |
| `list`  |     list      | non-scalar |  mutable   |  ordered  |
|  `set`  |      set      | non-scalar |  mutable   | unordered |
| `dict`  |  dictionary   | non-scalar |  mutable   | insertion-ordered (Python 3.7+) |

Extra: [Extensive documentation on built-in types](https://docs.python.org/3/library/stdtypes.html)

## Modules

- Python's power lies in its extensibility
- This is usually achieved by loading additional modules (libraries)
- A module can be just a `.py` file that you import into your program (script)
- However, often this refers to external libraries installed using `pip` or `conda`
- Standard Python installation also includes a number of modules (full list [here](https://docs.python.org/3/library/index.html))
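
A brief sketch of the three common forms of the `import` statement, using only modules from the standard library (the alias `st` is an arbitrary choice):

```python
import math                   # Import the whole module, access via `math.`
from statistics import mean   # Import a single name from a module
import statistics as st       # Import a module under a shorter alias

root = math.sqrt(16)             # 4.0
average = mean([1, 2, 3])        # 2
middle = st.median([1, 2, 3, 4]) # 2.5
```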

## Basic statistical operations

In [82]:
import statistics # Standard Python module
fib = [0, 1, 1, 2, 3, 5]

In [83]:
statistics.mean(fib) # Mean

2

In [84]:
statistics.median(fib) # Median

1.5

In [85]:
statistics.mode(fib) # Mode

1

In [86]:
statistics.stdev(fib) # Standard deviation

1.7888543819998317
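
One detail worth checking in the documentation: `statistics.stdev()` computes the *sample* standard deviation (dividing by n - 1), while `statistics.pstdev()` computes the *population* version (dividing by n). A short sketch on the same data:

```python
import statistics

fib = [0, 1, 1, 2, 3, 5]

sample_variance = statistics.variance(fib)   # Sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(fib)            # Square root of the sample variance
population_sd = statistics.pstdev(fib)       # Divides by n, hence slightly smaller
```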

## Help!

Python has an inbuilt help facility which provides more information about any object (the `?` shortcut works only in IPython/Jupyter, while `help()` works everywhere):

In [87]:
?s

In [88]:
help(s.join)

Help on built-in function join:

join(iterable, /) method of builtins.str instance
    Concatenate any number of strings.
    
    The string whose method is called is inserted in between each given string.
    The result is returned as a new string.
    
    Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'



- The quality of the documentation varies hugely across libraries
- [Stack Overflow](https://stackoverflow.com/) is a good resource for many standard tasks
- For custom packages it is often helpful to check the **issues** page on [GitHub](https://github.com/)
- E.g. for `pandas`: [https://github.com/pandas-dev/pandas/issues](https://github.com/pandas-dev/pandas/issues)
- Or, indeed, any search engine [#LMDDGTFY](https://lmddgtfy.net/)

## Next

- Tutorial: Python objects, types, basic operations and methods
- Assignment 1: Due at 11:00 on Monday, 27th September (submission on Blackboard)
- Next week: Control flow in Python
