# Welcome to the PhUSE EU Connect Introduction to Python Course

This course is a primer on the Python language, with a focus on topics relevant to data analysis and management.

## About Python
Python is an open-source programming language created by [Guido van Rossum](https://en.wikipedia.org/wiki/Guido_van_Rossum) and first released in 1991.

It is an interpreted language: it does not require a separate compilation step (turning source code into machine code) before a program can be run.

It is supported on nearly every platform, and can also be run on cloud computing platforms that load and execute Jupyter Notebooks.

## Let's start!

Manipulating values in any language is done through the use of variables and operations.

### Variables

A variable is a holder for data and allows the programmer to pass around references to the data.  
The values that variables refer to are generally said to be:
* **mutable** - the value can be changed in place after creation
* **immutable** - the value is fixed and cannot be changed in place (although there's nothing to stop you assigning a new value to the same variable name)
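
As a minimal sketch of the difference, the example below uses a list (mutable) and a tuple (immutable), both of which are covered later in this course. The variable names are illustrative only.

```python
# a list is mutable: it can be changed in place
numbers = [1, 2, 3]
numbers[0] = 10
print(numbers)  # [10, 2, 3]

# a tuple is immutable: trying to change it in place raises a TypeError
point = (1, 2, 3)
try:
    point[0] = 10
except TypeError as error:
    print("Cannot modify a tuple:", error)

# nothing stops you assigning a brand new value to the same name
point = (10, 2, 3)
print(point)  # (10, 2, 3)
```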

Variables have data types, and Python is a little unusual with respect to its type system: it is dynamically typed.  

This means:
* You don't have to declare what type a variable is - the interpreter will work it out as you *assign* the variable a value

`type` is a built-in function that returns the Python type of a value.

Following this is a set of assignments for the fundamental Python types.


In [1]:
# first off - an integer
a = 1
type(a)

int

In [2]:
# next a float
a = 1.1
type(a)

float

In [3]:
# next a string
a = "1"
type(a)

str

In [4]:
# next a boolean
a = True
type(a)

bool

In [6]:
# next a complex number
a = 1.0 - 1.0j
print(type(a))
print(a.real, a.imag)

<class 'complex'>
1.0 -1.0


In each of these cases we have not declared up front what type of variable `a` is; we assign the value to the variable and Python works it out.

The type of a variable matters when you want to use it. In these examples we use the augmented assignment operator `+=` to add 1 to values of different types.

In [None]:
# note that += 1 is a shorthand for a = a + 1 (+= is an operator)
a = 1
a += 1
a

In [None]:
# a bit strange
a = 1.1
a += 1
a

In [None]:
# very strange - a str and an int cannot be added, so this raises a TypeError
a = "1"
a += 1
a

In [None]:
# are you out of your mind??? (True counts as 1 here, so the result is the int 2)
a = True
a += 1
a

Note that Python has just worked it out where it makes sense, and where it doesn't make sense it raises an error (which you can anticipate and handle, but more on that later).
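
As a brief preview of handling such an error, here is a minimal sketch using `try` and `except`. It is only one way to deal with the problem, and error handling is covered properly later.

```python
a = "1"
try:
    a += 1  # adding an int to a str is not defined
except TypeError as error:
    message = str(error)
    print("Could not add 1 to a string:", message)

# the failed operation leaves the variable untouched
print(a, type(a))

# bool is a subclass of int, so True behaves like 1 in arithmetic
b = True
b += 1
print(b, type(b))  # 2 <class 'int'>
```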

## Type casting
Python will try to do the right thing when you mix compatible types in an operation, as with the `int` and `float` above.  Explicitly converting a value from one type to another is called *casting*.

In [None]:
x = 1.5     # a float 
print(x, type(x))

In [None]:
x = int(x) # cast the float to an int - note the cast to int truncates towards zero (1.5 becomes 1)
print(x, type(x))
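
A few more casts, as a sketch of what to expect. The variable names are illustrative; the built-ins `int`, `float` and `str` are standard Python.

```python
truncated_positive = int(1.9)    # 1
truncated_negative = int(-1.9)   # -1, truncation is towards zero rather than rounding down
from_string = int("42")          # 42
as_float = float(3)              # 3.0
as_text = str(3.5)               # the string "3.5"
print(truncated_positive, truncated_negative, from_string, as_float, as_text)

# not every cast makes sense - this one raises a ValueError
try:
    int("forty-two")
except ValueError as error:
    print("Cannot cast:", error)
```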


## Operators

### Arithmetic Operators
* `+` (addition)
* `-` (subtraction)
* `*` (multiplication)
* `/` (division)
* `//` (floor division)
* `**` (power)
* `%` (modulus)

In [None]:
(1 + 2, 1.0 + 2.0)

In [None]:
(2 - 1, 2.0 - 1.0)

In [None]:
(3 * 4, 3.0 * 4.0)

In [None]:
(3 / 4, 3.0 / 4.0)

In [None]:
(3.0 // 4.0)

In [None]:
(3**2, 3.0**2.0)

In [None]:
( 5 % 2, 5.0 % 2)
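
The three division-related operators are easy to mix up, so here is a short sketch that puts them side by side. `divmod` is a Python built-in that returns the quotient and remainder together.

```python
true_division = 7 / 2      # 3.5, `/` always returns a float
floor_division = 7 // 2    # 3, `//` rounds down to a whole number
negative_floor = -7 // 2   # -4, rounding is towards negative infinity
float_floor = 7.0 // 2     # 3.0, the result stays a float if an operand is a float
remainder = 7 % 2          # 1

quotient, rest = divmod(7, 2)
print(true_division, floor_division, negative_floor, float_floor, remainder)
print(quotient, rest)  # 3 1
```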

### Logical Operators
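* `not`
* `and` (`&` gives the same result for two booleans)
* `or` (`|` gives the same result for two booleans)
* xor (`^`)

Note that `&`, `|` and `^` are bitwise operators that also happen to work on booleans. Python has no `!` operator; `!` only appears as part of `!=` (not equal).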

In [None]:
not True

In [None]:
(True and False, True & False)

In [None]:
(True or False, True | False)

In [None]:
# Exclusive or!
(True ^ False, True ^ True, False ^ False)
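
One practical difference between `and` and `&` is that `and` stops evaluating as soon as the result is known. The sketch below shows this with a hypothetical helper, `check`, which records each time it is called.

```python
evaluated = []

def check(label, result):
    """Hypothetical helper: record that it was called, then return result."""
    evaluated.append(label)
    return result

# `and` stops at the first falsy operand, so "right" is never evaluated
outcome_and = check("left", False) and check("right", True)
and_calls = list(evaluated)
print(outcome_and, and_calls)  # False ['left']

evaluated.clear()

# `&` always evaluates both sides before combining them
outcome_amp = check("left", False) & check("right", True)
amp_calls = list(evaluated)
print(outcome_amp, amp_calls)  # False ['left', 'right']
```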


### Comparison Operators
* `==` equals
* `<` less than
* `>` greater than
* `<=` less than or equal to
* `>=` greater than or equal to

In [None]:
a = 1
b = 2

a == b

a < b

a > b
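
In a notebook only the last expression in a cell is displayed, so a sketch using `print` makes every result visible. It also shows `!=` (not equal) and a chained comparison, both of which are standard Python although not listed above.

```python
a = 1
b = 2

print(a == b)     # False
print(a != b)     # True
print(a < b)      # True
print(a > b)      # False
print(a <= b)     # True
print(a >= b)     # False
print(0 < a < b)  # True, comparisons can be chained

results = [a == b, a != b, a < b, a > b, a <= b, a >= b, 0 < a < b]
```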

## Compound Types
Compound types are variables that hold multiple values.

The three core compound types in Python are:
* Lists 
* Tuples 
* Dictionaries

There are different use cases for each of these which we'll go through now:

### Lists
Python lists are simply ordered lists of values.

In [None]:
# a is a list
a = [1,2,3,4,5]
# get the length of a
len(a)
# get the maximum value of a
max(a)
# get the minimum value of a
min(a)

# Lists can be updated using the `append` method
a.append(8)

print(a)

# lists can also be extended
b = [12, 14, 18]
a.extend(b)

print(a)

### Tuples
Tuples are similar to lists, with the notable exception of being immutable.

In [None]:
# tuples are created using `(` and `)`
a = (1,2,3,4,5)

# get the length of a
len(a)
# get the maximum value of a
max(a)
# get the minimum value of a
min(a)
# get the arithmetic sum of a
sum(a)

# you cannot change a tuple - this raises an AttributeError because tuples have no append method
a.append(2)

#### Using Lists and Tuples

When you want to use a list or tuple you can access elements by index (**NOTE**: Python indexing starts at 0)

In [None]:
l = [1,2,3,4,5]
t = (5,4,3,2,1)
print(l[0])  # Get the first element
print(t[0])  # Get the first element

You can also take slices

In [None]:
print(l[1:3]) # take the 2nd and 3rd elements (the end index is not included)
print(t[1:3]) # take the 2nd and 3rd elements (the end index is not included)

If you want to get elements from the tail of the list/tuple use negative indices

In [None]:
print(l[-1])  # the last element
print(t[-1])  # the last element

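The `:` character can be greedy: if you leave out the start or the end of a slice, it runs all the way to the beginning or the end of the list/tuple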

In [None]:
print(l[-2:]) # take the last two elements
print(t[-2:]) # take the last two elements
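
To summarise the slice forms, here is a short sketch. The optional third value (the step) is standard Python but goes slightly beyond what is shown above.

```python
l = [1, 2, 3, 4, 5]

first_two = l[:2]     # [1, 2], from the start up to (not including) index 2
from_third = l[2:]    # [3, 4, 5], from index 2 to the end
middle = l[1:4]       # [2, 3, 4]
every_other = l[::2]  # [1, 3, 5], a step of 2
reversed_l = l[::-1]  # [5, 4, 3, 2, 1], a step of -1 walks backwards

print(first_two, from_third, middle, every_other, reversed_l)
print(l)  # slicing returns a new list, the original is unchanged
```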

### Searching for a value in a Tuple or List

Both lists and tuples are iterable.  This means you can step through their values one at a time.  

Let's use this to check and see if a value is in a list (for the purposes of this exercise we'll consider tuples and lists interchangeably)

In [None]:
# define our list to search
l = [1, 3, 4, 7, 12, 19, 25]

# initialise our variable 
found = False
search_value = 12

# now, iterate over the values using a for loop
for value in l:
    if value == search_value:
        found = True  # found our value, mark the search as a success
# use a conditional statement to report the result

if found:  # a boolean can be tested directly, there is no need to compare it with True
    print("Found value", search_value, "in", l)
else:  # didn't find the value, report that
    print("Didn't find value", search_value, "in", l)

We can short circuit the search somewhat!

In [None]:
# define our list to search
l = [1, 3, 4, 7, 12, 19, 25]

# initialise our variable 
found = False
search_value = 12

# now, iterate over the values using a for loop
for value in l:
    if value == search_value:
        found = True  # found our value, mark the search as a success
        break         # break stops the iteration

if found:  # a boolean can be tested directly, there is no need to compare it with True
    print("Found value", search_value, "in", l)
else:  # didn't find the value, report that
    print("Didn't find value", search_value, "in", l)

And for the simplest version, we can use the `else` clause of the `for` loop

In [None]:
# define our list to search
l = [1, 3, 4, 7, 12, 19, 25]

# initialise our variable 
search_value = 12

# now, iterate over the values using a for loop
for value in l:
    if value == search_value:
        print("Found value", search_value, "in", l)
        break         # break stops the iteration
else:
    # else runs only if the loop finished without hitting break
    print("Didn't find value", search_value, "in", l)

Say we wanted to know whereabouts the value we searched for is; we can use the `enumerate` function 

In [7]:
# define our list to search
l = [1, 3, 4, 7, 12, 19, 25]

# initialise our variable 
search_value = 12

# the enumerate function wraps the iteration, and returns a tuple; the index of the current value and the value
for i, value in enumerate(l):
    if value == search_value:
        print("Found value", search_value, "at position", i)
        break         # break stops the iteration
else:
    # else runs only if the loop finished without hitting break
    print("Didn't find value", search_value, "in", l)

Found value 12 at position 4


`enumerate` takes a `start` argument, which sets the number the count begins at - by default it is 0
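
A minimal sketch of `start` in use, counting positions from 1 instead of 0. The variable `found_position` is illustrative only.

```python
l = [1, 3, 4, 7, 12, 19, 25]
search_value = 12
found_position = None

# start=1 changes the counter only, the same values are visited in the same order
for position, value in enumerate(l, start=1):
    if value == search_value:
        found_position = position
        break

print("Found value", search_value, "at position", found_position)  # position 5 when counting from 1
```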

Those of you who have read ahead will know an easier way...

In [8]:
# define our list to search
l = [1, 3, 4, 7, 12, 19, 25]

# initialise our variable
search_value = 12

# the in operator implements a search for a value
if search_value in l:
    # the index method on a list or tuple returns the first position the value is found at
    print("Found value", search_value, "at position", l.index(search_value))
else:
    print("Didn't find value", search_value, "in", l)

Found value 12 at position 4


Now, an exercise for you!  Using what we've discussed so far, write the code to work out the mean of the following list:

In [None]:
c = [23, -57, -87, -17, 29, -5, 22, 66, -52, -9, 63, -47, 64, -83, 55, -15, 91, 39, -66, -28, 34, -65, 42, -94, 62, 1, 71, -79, -29, -32, 45, -50, -51, 5, -39, 45, -29, -38, -70, -58, -57, 35, -18, -72, -43, -34, -63, 74, -36, 70]


### List comprehensions
List comprehensions are a bit of syntactic sugar that let you build a list from an expression applied to each element of an iterable, with an optional filter.  As an example, if we wanted to get all the positive values in `c` we can use the following list comprehension

In [None]:
positive_c = [x for x in c if x > 0]  # 0 is not positive, so test with > rather than >=

Or, get the absolute values for all the elements (note, this doesn't change the original list)

In [None]:
abs_val = [abs(x) for x in c]  # abs is a python builtin to take the absolute value
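
To see what a comprehension is shorthand for, the sketch below builds the same list twice: once with an explicit loop and once with a comprehension. It uses a short illustrative list, `values`, rather than `c`.

```python
values = [23, -57, -87, 29, -5]  # illustrative data

# the long way: an explicit loop with a filter
positive_loop = []
for x in values:
    if x > 0:
        positive_loop.append(x)

# the same thing as a list comprehension
positive_comprehension = [x for x in values if x > 0]

print(positive_loop)           # [23, 29]
print(positive_comprehension)  # [23, 29]
```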

### Dictionaries
Dictionaries are a way of maintaining a set of key-value pairs as a variable.  They are created as follows:

In [2]:
t = {"Fruit": ["Tomato", "Pear", "Apple"], "Vegetable": ["Carrot", "Parsnip"]}

# the accessors for a dictionary are the keys:
print(t["Fruit"])
print(t["Vegetable"])

# Asking for a missing value using the [] lookup will raise a KeyError
print(t["Pet"])

['Tomato', 'Pear', 'Apple']
['Carrot', 'Parsnip']


KeyError: 'Pet'

In [3]:
# However, there is another way of accessing elements from a dict
print(t.get("Pet", []))
# The `get` syntax tries to get the value for the key, but if it's not found it will return a default value
#  in this case an empty list because that's what was passed - but it will default to None
print(t.get("Pet"))

[]
None


Dictionaries are a pretty useful data structure, especially for processing nested data.  
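
As a small sketch of nested data, the dictionary below holds further dictionaries as its values. The `sites` structure and its contents are hypothetical, made up for illustration.

```python
# hypothetical nested data: one dictionary per site, keyed by a site identifier
sites = {
    "site_a": {"country": "UK", "subjects": ["001", "002"]},
    "site_b": {"country": "DE", "subjects": ["003"]},
}

# chain the [] lookups to reach a nested value
print(sites["site_a"]["country"])  # UK

# iterating over a dictionary yields its keys
subject_count = 0
for name in sites:
    subject_count += len(sites[name]["subjects"])
print(subject_count)  # 3

# chaining get with an empty dict as the default avoids a KeyError for a missing site
missing_country = sites.get("site_c", {}).get("country")
print(missing_country)  # None
```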

There are some nice accessors that make life pretty simple

In [14]:
# setdefault will return the value if the key is set, but if it is not it will add the key, set it to the 
# value passed as the second argument, and return that value
print(t.get("Pet"))

for pet in ("Dog", "Cat", "Budgie"):
    t.setdefault("Pet", []).append(pet)

print(t["Pet"])

# you can create a dictionary using the dict constructor with keyword arguments
p = dict(mobile_phones=["Apple", "Samsung", "Google"])
print(p)

# update will merge two dictionaries together
c = dict(Computer=["Mac", "PC", "Commodore64"]) # this is another way of creating a dictionary
t.update(c)

print(t["Computer"])

None
['Dog', 'Cat', 'Budgie']
{'mobile_phones': ['Apple', 'Samsung', 'Google']}
['Mac', 'PC', 'Commodore64']


Loops with dictionaries are a little different

In [15]:
# items returns a view of (key, value) tuples
for category, values in t.items():
    print(category, "->", values)

# keys returns a view of the keys
for category in t.keys():
    print(category)

# values returns a view of the values
for values in t.values():
    print(values)


Fruit -> ['Tomato', 'Pear', 'Apple']
Vegetable -> ['Carrot', 'Parsnip']
Pet -> ['Dog', 'Cat', 'Budgie']
Computer -> ['Mac', 'PC', 'Commodore64']
Fruit
Vegetable
Pet
Computer
['Tomato', 'Pear', 'Apple']
['Carrot', 'Parsnip']
['Dog', 'Cat', 'Budgie']
['Mac', 'PC', 'Commodore64']


The `in` operator checks against the keys of a dictionary, not the values

In [17]:
print("Computer" in t)
print("Astronaut" in t)
print("Parsnip" in t)

True
False
False
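
If you do want to search the values, you have to say so explicitly. The sketch below is one possible approach, using a fresh copy of the fruit and vegetable dictionary; the name `found_in` is illustrative.

```python
t = {"Fruit": ["Tomato", "Pear", "Apple"], "Vegetable": ["Carrot", "Parsnip"]}

# "Parsnip" is not a key, so a plain `in` test is False
key_match = "Parsnip" in t
print(key_match)  # False

# look inside each list of values to find the categories that contain it
found_in = [category for category, values in t.items() if "Parsnip" in values]
print(found_in)  # ['Vegetable']

# any() answers the yes/no question without building a list
value_match = any("Parsnip" in values for values in t.values())
print(value_match)  # True
```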


## Sets

Sets are unordered collections of unique values, and they make set operations (such as union, intersection and difference) much more straightforward.  


In [5]:
a = [1, 1, 3, 5, 7, 9]
b = [1, 2, 3, 12, 15]

# note the deduplication
print(set(a))

# intersection
print(set(a).intersection(set(b)))

# difference
print(set(a).difference(set(b)))

# disjoint
print(set(a).isdisjoint(set(b)))

# union
print(set(a).union(set(b)))

{1, 3, 5, 7, 9}
{1, 3}
{9, 5, 7}
False
{1, 2, 3, 5, 7, 9, 12, 15}
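
Sets also support operator forms of these methods. The sketch below uses set literals (`{...}`) and wraps the results in `sorted` so that the printed order is predictable.

```python
a = {1, 3, 5, 7, 9}
b = {1, 2, 3, 12, 15}

common = a & b       # same as a.intersection(b)
only_in_a = a - b    # same as a.difference(b)
combined = a | b     # same as a.union(b)
in_one_only = a ^ b  # symmetric difference: values in exactly one of the sets

print(sorted(common))       # [1, 3]
print(sorted(only_in_a))    # [5, 7, 9]
print(sorted(combined))     # [1, 2, 3, 5, 7, 9, 12, 15]
print(sorted(in_one_only))  # [2, 5, 7, 9, 12, 15]
```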


## Gotchas

A few things you should be aware of and plan to guard against.

### Inherent Truthiness in Python

In some other languages only boolean or 1/0 values can be used for tests of truth.  Python is a little more flexible: any value can be tested, and zero or empty values count as false

In [18]:
a = True

if a:
    print("A is True")

a = 1

if a:
    print("A is True")

a = "False"  # any non-empty string is truthy, even this one

if a:
    print("A is True")

A is True
A is True
A is True


In [19]:
a = False

if not a:
    print("A is False")

a = 0
if not a:
    print("A is False")

a = ""
if not a:
    print("A is False")



A is False
A is False
A is False
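
The same rule extends to containers and `None`. As a quick sketch, the built-in `bool` shows how Python will treat a value in an `if` test; the two lists of sample values are illustrative.

```python
falsy_values = [False, 0, 0.0, "", [], (), {}, set(), None]
truthy_values = [True, 1, -1, "False", [0], (0,), {"a": 1}]

for value in falsy_values:
    print(repr(value), "->", bool(value))  # every line ends in False

for value in truthy_values:
    print(repr(value), "->", bool(value))  # every line ends in True
```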


You don't have to declare a variable in Python before assigning to it, but using a name that has never been assigned raises a `NameError`.  In cases where you might want to protect your program you can initialise a variable with the value `None`.  You can then use `a is None` or `a is not None` to establish whether the variable has been given a real value. 
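
Here is a minimal sketch of why the `is not None` test matters, using a hypothetical list of readings where the value being searched for happens to be 0.

```python
readings = [3, 0, 8]  # hypothetical data

first_even = None  # None means "nothing found yet"
for reading in readings:
    if reading % 2 == 0:
        first_even = reading
        break

# the explicit test against None reports the 0 correctly
if first_even is not None:
    print("Found an even reading:", first_even)
else:
    print("No even reading found")

# a plain truthiness test would get this wrong, because 0 is falsy
print(bool(first_even))  # False
```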
