## Data Science Workshop 1: Fundamentals of Python
Welcome to the first Data Science Society workshop on Python! The aim of these workshops is to teach you enough Python to use machine learning algorithms on various datasets for data analysis. We will avoid the more complex maths behind these algorithms, leaning instead on packages like Scikit Learn that already provide them. To begin with, we will cover the very basics of Python. The workshops will mainly go over the NumPy, Matplotlib, Pandas and Scikit Learn packages, but there are many more Python packages that will not be covered.

### Jupyter Notebook
Throughout these workshops we will be using the Jupyter Notebook editor. In this editor, code is written in code cells and text is written in markdown cells. You can change the type of a cell using the dropdown menu at the top of the page, below the words Kernel and Widgets. The options to edit cells are also found at the top of the page: "+" to create a new cell, scissors to cut a cell, etc.
#### Markdown Cells
The text you are reading now is in a markdown cell. To edit the text double click on the cell. To run the cell press Shift + Enter. These cells don't actually run code but mainly serve as a means of allowing you to describe what your code is supposed to do, so that you can inform someone else or remind yourself of what your code is doing when you revisit it. These cells also allow for the formatting of equations when you place something in between two single dollar signs. For instance: $ E = mc^2 $. Doing so between two double dollar signs places the equation on a new line like so: $$ \nabla \cdot B = 0. $$ 
Don't worry about what these equations mean!
#### Code Cells
Code cells are, as the name suggests, where you actually write your code. You can run a code cell by pressing Shift + Enter. When you run a cell, the output is displayed below it. Below is a code cell for you to run.

In [None]:
print("If you are reading this, this cell has been run!")

### Commenting 
Inside code cells you can write notes by using the # symbol. Putting it in front of a line of code means that line won't be run when the cell is run. This is useful when you want to see what removing a line of code does without deleting the line entirely and having to rewrite it. The # symbol is also used to comment on your code so that another user, or you when you return to the code, can figure out what it does. It is good practice to leave comments in your code; however, they should be concise, since overcommenting can make your code harder to read. If ever in doubt, imagine that two kittens die whenever you undercomment and one kitten dies whenever you overcomment. Below is a code cell identical to the one above, save for the # symbol in front of the line of code. If you run it you will not see any output.

In [None]:
# print("If you are reading this, this cell has been run!")

### Print Statements
One of the most basic functions in Python is the ````print```` function we used above. In Python 3 ````print```` is a function, though calls to it are still commonly called print statements. As the name suggests, it "prints" the arguments you pass to it. What happens if we run the first code cell without ````print````?

In [None]:
"If you are reading this, this cell has been run!"

The cell still produces the output. What if we had two lines instead?

In [None]:
"Line 1"
"Line 2"

Only the last line is outputted, because a notebook cell only displays the value of its final expression. Placing each line inside a ````print```` statement means both will be outputted.

In [None]:
print("Line 1")
print("Line 2")

Commas in ````print```` statements, and indeed in all functions, are used to separate different arguments. In ````print```` statements you typically do this when "printing" different data types. Passing ````print("The population of London is", 8.136, "million")```` will output ````The population of London is 8.136 million````. As can be seen, the arguments are printed with a space between them. In this example the arguments in quotation marks are strings, essentially sequences of characters, while the second argument is a float, essentially a number with a decimal point. We will discuss data types in more detail later.

In [None]:
print("The population of London is", 8.136, "million")

Now use the print statement in the code cell below to print a message of your choosing.

In [None]:
print()

### Basic Operations
Below are some basic mathematical operations, which are coded in the code cell that follows.
$$ 2 + 2 = 4 $$
$$ 5 - 2 = 3 $$
$$ 2 \times 4 = 8 $$
$$ \frac{6}{3} = 2 $$
$$ 5^3 = 125 $$
Running the code cell below should give you the right hand side of each equation.

In [None]:
# Addition
print("Addition:", 2 + 2)

# Subtraction 
print("Subtraction:", 5 - 2)

# Multiplication
print("Multiplication:", 2*4)

# Division
print("Division:", 6/3)

# Powers
print("Powers:", 5**3)

# Floor division (division rounded down to a whole number, discarding the remainder)
print("Division without remainder:", 7//3)

# Returns remainder
print("Division returning the remainder:", 7%3)

Be aware that it is easy to make syntax errors when performing mathematical operations. A common one is writing something like ````2x```` instead of ````2*x````: the former is not valid Python and raises a syntax error, while the latter multiplies a variable ````x```` by $2$. In large calculations that use brackets it is very easy to make mistakes, so stay alert. Clear code is easier to read and makes mistakes easier to spot. Rather than having an entire calculation in one messy line of code, it's often best to split the calculation into several lines that are easier to read.
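
As a small sketch of splitting a calculation up, the cell below computes the same value twice: once in a single crowded line and once in named steps. It uses variables, which are covered in the next section, and every name and value in it (`a`, `b`, `c`, `numerator`, `denominator`) is made up purely for illustration.

```python
# Hypothetical values chosen only for illustration
a = 3
b = 4
c = 2

# Everything in one line: correct, but the brackets are easy to get wrong
result_one_line = ((a + b)**2 - 2*c) / (a*b - c)

# The same calculation split into smaller, named steps
numerator = (a + b)**2 - 2*c
denominator = a*b - c
result_split = numerator / denominator

print("One line:", result_one_line)
print("Split up:", result_split)
```

Both lines print the same number; the second version simply gives you intermediate values that you can print and check individually.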

In the code cell below print the result of the following operations: $$ 4^7 $$ $$ 8\times123 $$ $$ \frac{4523}{84} $$

In [None]:
# 4 to the power of 7
print()

# 8 x 123
print()

# 4523 divided by 84
print()

### Defining variables 
You can define variables and then redefine them later in the following ways:

In [None]:
# Defining a variable x as 5
x = 5
print("x =", x)

# Redefining a variable x as a string "string"
x = "string"
print("x =", x)

# Redefining a variable x as 7
x = 7 
print("x =", x)

# Redefining x as 8 by adding 1 (The same can be done for other mathematical operations)
x = x + 1
print("x =", x)

# Redefining x as 9 by adding 1 with the shorthand += (The same can be done for other mathematical operations)
x += 1
print("x =", x)
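
The comments above note that the same shorthand works for other operations. A minimal sketch of what that might look like is given below; the starting value of $10$ is arbitrary.

```python
# Start from an arbitrary value
x = 10

x -= 4      # same as x = x - 4
print("After x -= 4:", x)

x *= 3      # same as x = x * 3
print("After x *= 3:", x)

x /= 2      # same as x = x / 2 (division with / always gives a float)
print("After x /= 2:", x)

x **= 2     # same as x = x ** 2
print("After x **= 2:", x)
```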

In the code cell below, define a variable ````x```` to be the number $12$ and then print it by passing ````x```` to a print statement. Then redefine ````x```` as a string of your choice and print this string in the same way.

In [None]:
x = 
print()

x = 
print()

### Lists 
What if you have a set of items you want to assign to one variable? For this we can use lists.

In [None]:
# Defining a variable y as a list 
y = [1,2,3,4]
print("y =",y)

# One can even have a list of lists
y = [y,y]
print("y =",y)

# Defining two variables at once from a list
x, y = [1,2]
print("x =", x, ", y =",y)

In the code cell below define a list ````y```` to be $[0,5,10,15,20]$ and print it. Then define three variables ````x````, ````y```` and ````z```` as $5$, $9$ and $4$ respectively in the same line using a list. In the same print statement print these three variables.

In [None]:
# Define y as a list
y =
print()

# Defining three variables
x, y, z = 
print()

Lists can be indexed so that you can pick a particular item from the list. This is done by writing ````y[i]````, where ````y```` is a list and ````i```` is the index of the item. What is an index? An index is effectively a label for each item in the list, corresponding to the item's place in the list. One thing to bear in mind is that Python starts counting from $0$, so the first item has index $0$, the second index $1$ and so forth. One can also select sections of a list, known as slices, using colons. Examples are displayed below.

In [None]:
# Defining a variable y as a list 
y = [1,2,3,4,5,6,7,8,9,10]

# Single items from the list
print("First item in list:", y[0])
print("Second item in list:", y[1])

# Section from the list
# Nothing before the colon means the slice starts at the first index; the last index included is the one before the number after the colon
print("First to third item:", y[:3])

# Number before colon indicates starting index, no number after colon indicates the end point is the last index
print("Seventh to last item in list:", y[6:])

# Number before colon is the starting index, last index is the one just before the number after the colon
print("Fourth to sixth item:", y[3:6])
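
To make the idea of an index as a label more concrete, here is a short sketch. It uses the built-in functions ````enumerate```` and ````len````, which are not otherwise covered in this workshop, and a made-up list called `letters` so that the list ````y```` above is left untouched.

```python
# A hypothetical list used only for this illustration
letters = ["a", "b", "c", "d", "e", "f"]

# enumerate pairs each index with its item; list() makes the pairs visible
print(list(enumerate(letters)))

# A slice [start:stop] includes index start but stops just before index stop
start = 1
stop = 4
section = letters[start:stop]
print(section)

# So the slice contains stop - start items
print(len(section), "items")
```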

In the code cell below print the slice of all values from $5$ to $10$ inclusive and all values from $4$ to $9$ inclusive from the list ````y````.

In [None]:
# Slice from 5 to 10 inclusive
print(y[])

# Slice from 4 to 9 inclusive
print(y[])

### Data Types
Notice how the code in the Basic Operations section produces $2.0$ as the output instead of $2$ for division. Is there any difference? The answer is yes, as $2.0$ is a float and $2$ is an integer. This may seem a trivial distinction but it can be important, as some functions will only take integers, so if you pass a float instead you'll receive an error. Both floats and integers represent real numbers, but an integer has no fractional part. In addition to floats and integers there are also strings, which are a collection of characters, often text but sometimes digits. Using the functions ````str````, ````float```` and ````int```` you can convert between these data types, provided the input can be interpreted as the type you are converting to. Examples are given in the cell below.

In [None]:
# Converting to float (The input "2" is a string)
print(float("2"))

# Converting to integer (The input 2.0 is a float)
print(int(2.0))

# Converting to string (The input 2 is an integer)
print(str(2))

# Trying to add a number and a string (this raises a TypeError)
print(2 + str(2))
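
A quick way to check which data type a value has is the built-in ````type```` function, which is not otherwise used in this workshop. The sketch below applies it to a few of the values seen so far, and shows one way the failed addition can be made to work by converting the string first.

```python
# Division with / always gives a float, even when the answer is whole
print(6/3, type(6/3))

# Floor division with // on two integers gives an integer
print(6//3, type(6//3))

# Anything in quotation marks is a string, even if it looks like a number
print("2", type("2"))

# Converting the string to an integer first makes the addition work
total = 2 + int("2")
print(total)
```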

#### Strings
As can be seen above, the attempt to add an integer and a string failed. This shows that the two data types behave differently. Strings can, however, be added to each other, which joins them end to end:

In [None]:
# Adding two strings
print("2" + "5")
print("H" + "i")

In the code cell below add the strings "Ye" and "et" together outputting the result.

In [None]:
print()

Strings are essentially lists of characters. One can pick out specific characters in a string by indexing or slicing the string as you would a list. Once again Python starts counting from $0$. Strings separated with spaces may also be split into a list of strings. This means that if your string is a sentence, you can split it into a list containing the words that make up that sentence. This is all demonstrated below.

In [None]:
# Defining x as a specific string
x = "One does not simply walk into ExCel."

# First character of the string x
print(x[0])

# Third character of the string x
print(x[2])

# First three characters of string x
print(x[:3])

# Fifth to eighth character of string x
print(x[4:8])

# Splitting the string x into a list of strings
x = x.split()
print("Upon splitting the string we get:", x)

# Prints seventh word/term of the split string x (One could also have used x.split()[6] instead) 
print(x[6])

# Prints first letter in the seventh word/term of the split string x
print(x[6][0])
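
The claim that strings behave like lists of characters can be checked directly. The sketch below uses the built-in functions ````list```` and ````len````, which are not otherwise covered in this workshop, on a made-up string called `word`.

```python
# Hypothetical example string
word = "ExCel"

# list() breaks a string into its individual characters
characters = list(word)
print(characters)

# Indexing the string and indexing the list of characters give the same result
print(word[0], characters[0])

# len() counts the characters in a string, just as it counts the items in a list
print(len(word), len(characters))
```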

In the code cell below define ````y```` as a string of your choosing that consists of multiple words. Print the first five characters of the string. Then split the string and print the second word by indexing the split string.

In [None]:
# Define y as a string
y = 
# Return first five characters
print(y[])

# Split the string 
y = 
# Return second word
print(y[])

### Dictionaries
Dictionaries are collections of items in which each item is looked up by a key rather than by its position. Each item consists of a key and a value assigned to that key. Both keys and values can be integers, floats or strings. An example is shown below. Here the planet names are the keys and the number assigned to each one is the value.

In [None]:
# Defining a dictionary for the planets in the Solar System
planets = {
    "Mercury": 1,
    "Venus": 2,
    "Earth": 3,
    "Mars": 4,
    "Jupiter": 5,
    "Saturn": 6,
    "Uranus": 7,       
}

# Return the value for "Saturn"
planets["Saturn"]
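
Since keys and values do not all have to be of the same type, a single dictionary can mix them. The dictionary `mixed` below is a made-up illustration of this, with an approximate figure for surface gravity.

```python
# Hypothetical dictionary mixing key and value types
mixed = {
    "name": "Mars",        # string key, string value
    4: "fourth planet",    # integer key, string value
    "moons": 2,            # string key, integer value
    "gravity": 3.72,       # string key, float value (approximate, in m/s^2)
}

# Values are looked up by key, whatever the type of the key
print(mixed["name"])
print(mixed[4])

# A value behaves like any other value of its type once it is looked up
print(mixed["moons"] + 1)
```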

You can add new items in a dictionary simply by defining a value for a new key as follows:

In [None]:
# Adds new item to planets dictionary
planets["Neptune"] = 8

print(planets)

In [None]:
# The same can be done for empty dictionaries
emptydict = {}

# Adding new item to emptydict
emptydict["Best Science"] = "Physics"
print(emptydict)

In the code cell below define an empty dictionary as a variable called ````shopping_list````. Fill this dictionary with a shopping list of your choice. Include four objects or more in the list. The key should be the item on the list and the value should be the price (doesn't need to be accurate). Once the list is complete print out the resulting dictionary.

In [None]:
# Defining an empty dictionary
shopping_list = 

# Filling the shopping list
shopping_list[] = 
shopping_list[] = 
shopping_list[] = 
shopping_list[] = 

# Print the shopping list
print()

## Further Work (Optional)
At the end of the workshops I may include content that I feel is too maths oriented for the workshops. The idea here is that you are free to explore these but they will not be covered in workshops. 
### Complex Numbers
In addition to the float and integer data types there is also a complex number data type. The function ````complex(u,v)```` generates a complex number with a real component ````u```` and imaginary component ````v````. Note that $j$ is used as the imaginary unit in Python.

In [None]:
# Creates a complex number z = 2 + 4j
z = complex(2,4)
print(z)

# Alternatively 
z = 2 + 4j
print(z)

# Real part of z
z_real = z.real
print(z_real)

# Imaginary part of z
z_imag = z.imag
print(z_imag)

# Returns the conjugate of the complex number
z_conj = z.conjugate()
print(z_conj)
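
Complex numbers also support the basic operations from earlier in the workshop. As a brief sketch, the cell below multiplies ````z```` by its conjugate and uses the built-in ````abs```` function, which is not covered above, to get the modulus.

```python
z = 2 + 4j

# Adding two complex numbers adds the real and imaginary parts separately
print(z + (1 - 1j))

# Multiplying a complex number by its conjugate gives the squared modulus
product = z * z.conjugate()
print(product)

# abs() returns the modulus of a complex number
modulus = abs(z)
print(modulus)
```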