# Lab 01 - Introduction to Python

This lab is an introduction to Python, a programming language which can be used not only for social data analysis but also for general-purpose programming, statistical analysis, and other data science tasks. Python has been around since the early 1990s, has a vibrant community, and has an open-source codebase. This means that if you run into an issue with Python, there are many resources available for solving it.

## Setup

The version of Python we're using in this course is [Anaconda Python](https://www.continuum.io/downloads), Python 3.5 version. We're using this because there are many data science tools which come included automatically. We don't have to go through the trouble of installing anything else for what we plan to do.

You can download this on your home computer or laptop if you'd like. For this class, we're going to use the version installed on the lab computers in DH 2010.

The first thing you need to do is start the Anaconda Navigator, which is under the start menu in Windows. It will be in a different place on Mac, probably Applications.

![Anaconda Navigator](img/00a-desktop.png)

This will start a program which will give you a host of tools for writing Python code. For this class we will be using Jupyter Notebook, which is a way to put your code and notes about that code (called *documentation*) in the same file. The file you're looking at now was written in Jupyter Notebook.

Start the Jupyter Notebook by clicking Launch.

![Jupyter Notebook](img/01a-jupyter.png)

After that you will get a window which looks like this. It will list a bunch of files. You should click Desktop and use the dropdown menu to create a new folder. Call it Lab1.

![Jupyter screen](img/02-jupyter-window.png)
![Jupyter desktop](img/04a-jupyter-new-file.png)

After that, enter the folder and create a new Python file by selecting **Python [Root]**. Once you do that, you should get a new screen which looks like this.

![New file](img/05a-new-file.png)

Click the **Untitled** text and give the file a new name, such as Lab 1. 

![Title](img/06a-title.png)

Jupyter Notebooks work by operating on what are called *cells*. Those are the boxes which make up the notebook. A new cell will start with <code>In [ ]</code> and by default will take Python code. You can change the cell type by selecting a content type in the center of the toolbar. For the first cell, let's select the type called *Markdown*. Markdown is a markup language which lets you create headers, lists, and emphasis.

Try selecting Markdown, then entering the text <code># Lab 1</code>. This will be our title. Once you enter that, press the button in the toolbar which looks like a play button. This will *run* the cell.

![Title2](img/07a-title2.png)

Now we're working. In the next box, we can start actually writing code. Let's do something easy. Type <code>1 + 1</code> in the box and press the Run Cell button.

![1 + 1](img/08a-math.png)

That's pretty much it when it comes to Jupyter Notebooks. Now, let's get to learning the mechanics of the Python programming language.

## Types

There are several different *types* available in Python. Python *objects* can take on any one of these types. 

Basic numbers can take on several types, the two most important being *integers* (e.g. 1, 2, -5) and *floating point* numbers or *floats* (e.g. 1.3, 0.5).

In [None]:
## Integers
1
2
-5

## Floats
1.4
20.3453

You can add documentation with *comments*. Comments start with the <code>#</code> character.

There are many different types in Python. For today, we'll cover strings and booleans.

## Strings

*Strings* are bits of text.

In [None]:
"This is a string."

You can also put two strings together with a plus sign.

In [None]:
"This is a string." + " " + "This is another string."

*Booleans* are logical values. Booleans can be denoted by keywords or produced by logical statements. The two boolean keywords in Python are <code>True</code> and <code>False</code>.

In [None]:
True
False
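
If you are ever unsure which type an object has, Python's built-in <code>type</code> function will tell you. The short sketch below checks one example of each type covered so far; the variable name <code>type_of_one</code> is only there for illustration.

```python
# type() reports the type of any Python object.
print(type(1))                    # integer: <class 'int'>
print(type(1.4))                  # float: <class 'float'>
print(type("This is a string."))  # string: <class 'str'>
print(type(True))                 # boolean: <class 'bool'>

# The result can be stored in a variable like any other value.
type_of_one = type(1)
```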

We'll go further into logical statements and operators below.

## Arithmetic

Let's start off easy. Python can function as a simple calculator with all the usual operations: addition, subtraction, multiplication, and division.

In [1]:
1 + 3 # addition

4

In [2]:
3 - 5 # subtraction

-2

In [3]:
4 * 10 # multiplication

40

In [4]:
10 / 3 # division

3.3333333333333335
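
Notice that dividing two integers with <code>/</code> gives a float. Two related operators, which this lab does not otherwise cover, are floor division (<code>//</code>) and modulo (<code>%</code>). A minimal sketch, with <code>quotient</code> and <code>remainder</code> as illustrative names:

```python
quotient = 10 // 3   # floor division: drops the fractional part
remainder = 10 % 3   # modulo: what is left over after dividing

print(quotient)   # 3
print(remainder)  # 1
```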

Python will also obey the usual order of operations.

In [5]:
1 + 3 * 5 + 10

26

But you can also use parentheses to order operations as you'd like.

In [10]:
(1 + 3) * 5 + 10

30

Exponentiation is also built in, using the <code>**</code> operator. Raising a number to the power 0.5 gives its square root.

In [12]:
10**2

100

In [13]:
64**0.5

8.0

**Exercise 1**: Try to do some arithmetic yourself for a few minutes. 

1. What seems to work? What gives you an error?
2. What happens when you try to add a number with a boolean or string?

## Variables

Usually you want to work with a named representation of an object, called a *variable*, instead of the object itself. This helps because you can reuse it later and give it a descriptive name. You can also use variables in place of numbers in calculations.

In [16]:
height_cm = 200
weight_kg = 75

bmi = weight_kg / (height_cm**2) * 10**4
bmi

18.75

So far, we have been printing out variables by typing them, but you can also print out using the <code>print</code> function. A *function* is a bit of code which you can reuse in other parts of the code. There are many different built-in functions that come with Python. 

In [17]:
print(bmi)

18.75
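
You can also write your own functions with the <code>def</code> keyword. As a preview, here is one way the BMI calculation above could be wrapped in a function. The name <code>compute_bmi</code> and its parameters are made up for this sketch and are not part of Python.

```python
def compute_bmi(weight_kg, height_cm):
    """Return the body mass index for a weight in kg and a height in cm."""
    height_m = height_cm / 100  # convert centimetres to metres
    return weight_kg / height_m**2

bmi = compute_bmi(75, 200)
print(bmi)  # 18.75
```

Once the function is defined, you can call it again with different numbers instead of retyping the formula.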


You can also store strings or booleans in variables and treat them like usual.

In [26]:
first_name = "john"
last_name = "smith"

name = first_name + " " + last_name
print(name)

john smith


In [28]:
truth = True
false = False

result = truth and false ## and is a logical operator 
print(result)

False
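
Besides <code>and</code>, Python has the logical operators <code>or</code> and <code>not</code>, and comparisons such as <code>></code> or <code>==</code> produce booleans. The sketch below reuses the height and weight numbers from earlier; the cutoffs 180 and 100 are arbitrary values picked for illustration.

```python
height_cm = 200
weight_kg = 75

is_tall = height_cm > 180    # a comparison gives a boolean: True
is_heavy = weight_kg >= 100  # False

print(is_tall and is_heavy)  # True only if both are True: False
print(is_tall or is_heavy)   # True if at least one is True: True
print(not is_tall)           # flips the value: False
print(height_cm == 200)      # == tests equality: True
```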


## Data structures

*Data structures* are objects which hold other objects. Over the course of the term, we're going to learn about several different ones which are better suited for statistical and data analysis. For now, we will focus on three which are part of the base Python package: *lists*, *dictionaries*, and *tuples*. They each are good for particular things.

### Lists

*Lists* are ordered groupings of items. Usually every item in the list is of the same type.

In [32]:
list_1 = [1, 2, 4, 8]
list_2 = ["one", "two", "three"]
list_3 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] # list of lists

Lists are also what we call *mutable*. That means you can add elements to the end or even the middle of a list.

In [54]:
mutable_list = [] # this list is empty
mutable_list.append(1)
mutable_list.append(3)
mutable_list.append(5)
mutable_list

[1, 3, 5]
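
The cell above only adds to the end of the list. As a sketch of the other changes a mutable list allows, the example below inserts into the middle, overwrites an element, and removes one. Positions are counted from 0.

```python
mutable_list = [1, 3, 5]

mutable_list.insert(1, 2)  # put the value 2 at position 1
print(mutable_list)        # [1, 2, 3, 5]

mutable_list[0] = 100      # overwrite the first element
print(mutable_list)        # [100, 2, 3, 5]

mutable_list.remove(3)     # remove the first element equal to 3
print(mutable_list)        # [100, 2, 5]
```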

Once you've created a list, you need to be able to access the elements of the list. You can do this using an *index*, starting at 0.

In [35]:
list_1[0]

1

In [36]:
list_2[1]

'two'

In [37]:
list_3[2]

[7, 8, 9]

You can also access *slices* of lists, which represent parts of the list.

In [40]:
list_4 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
list_4[0:4]

[1, 2, 3, 4]
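
Note that a slice includes the start index but stops just before the end index, so <code>list_4[0:4]</code> returns the elements at positions 0 through 3. A few other slicing forms you may find handy are sketched below; the variable names are only for illustration.

```python
list_4 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

first_three = list_4[:3]   # leaving out the start means "from the beginning"
last_two = list_4[-2:]     # negative numbers count from the end
every_other = list_4[::2]  # a third number sets the step size
last_item = list_4[-1]     # a negative index also works for single items

print(first_three)  # [1, 2, 3]
print(last_two)     # [9, 10]
print(every_other)  # [1, 3, 5, 7, 9]
print(last_item)    # 10
```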

### Dictionaries

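*Dictionaries* are collections of key-value pairs. In Python 3.5, the version used in this course, they are unordered; from Python 3.7 onward they keep the order in which items were inserted. Either way, you index a dictionary with a key, rather than with a position number. This is useful when you want to use a string to access an item.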

In [43]:
mayors = {
    "rob": "ford",
    "john": "tory"
}

us_election = {
    "hillary": "clinton",
    "donald": "trump"    
}

prime_ministers = {
    "justin": "trudeau",
    "stephen": "harper"
}

mayors["rob"]

'ford'
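
Dictionaries are mutable too. The sketch below adds a new pair, checks whether a key exists with <code>in</code>, and uses the <code>get</code> method to supply a fallback for a missing key. It uses a separate variable, <code>toronto_mayors</code>, so that the <code>mayors</code> dictionary above stays unchanged.

```python
toronto_mayors = {"rob": "ford", "john": "tory"}

toronto_mayors["david"] = "miller"  # add a new key-value pair

print("david" in toronto_mayors)             # True
print("mel" in toronto_mayors)               # False
print(toronto_mayors.get("mel", "unknown"))  # unknown
print(len(toronto_mayors))                   # 3
```

Asking for a missing key with square brackets, as in <code>toronto_mayors["mel"]</code>, raises a <code>KeyError</code>, which is why <code>get</code> is often the safer choice.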

### Tuples

*Tuples* are similar to lists, but they are usually thought of as containing heterogeneous kinds of information. So while the items in a list are usually all the same type, the items in a tuple are often different types. Tuples are also *immutable*, which means you can't add, remove, or change their elements once they are created.

You can index elements of a tuple the same way you can access elements of a list.

In [42]:
tuple_1 = ("rob", "ford", 1)
tuple_1[1]

'ford'
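
To see immutability in action, the sketch below tries to change an element of a tuple. Python raises a <code>TypeError</code>, which is caught here with <code>try</code> and <code>except</code> (not covered in this lab) so that the cell keeps running. It also shows *unpacking*, a convenient way to give each element of a tuple its own name.

```python
tuple_1 = ("rob", "ford", 1)

try:
    tuple_1[2] = 2  # not allowed: tuples cannot be changed
except TypeError as error:
    message = str(error)
    print("Could not change the tuple:", message)

# Unpacking assigns each element to its own variable.
first_name, last_name, number = tuple_1
print(first_name, last_name, number)  # rob ford 1
```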

## Loops

If we have a data structure where we want to have a look at every element of the structure, we can repeat the task using a *loop*. The most common loop we'll use in this course is the *for* loop, and more specifically the *foreach* loop.

Say we want to print 10 plus the number for every number in our list. We can do the following.

In [45]:
for item in list_4:
    print(item + 10)

11
12
13
14
15
16
17
18
19
20
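
Printing is not the only thing you can do inside a loop. A common pattern, sketched below, is to start with an empty list and append a result on every pass. The name <code>shifted</code> is just an illustrative choice.

```python
list_4 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

shifted = []                   # start with an empty list
for item in list_4:
    shifted.append(item + 10)  # store the result instead of printing it

print(shifted)  # [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
```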


This works the same for tuples. But, we have to be careful because tuples usually have different kinds of items in them. So we can't add 10 to each, or else we would get an error.

In [47]:
for item in tuple_1:
    print(item)

rob
ford
1
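
If you do need to treat every element of a mixed tuple the same way, one option is to convert each item to a string first with the built-in <code>str</code> function. A minimal sketch, where <code>labels</code> is an illustrative name:

```python
tuple_1 = ("rob", "ford", 1)

labels = []
for item in tuple_1:
    # str() turns the number 1 into the text "1", so + joins two strings
    labels.append("item: " + str(item))

print(labels)  # ['item: rob', 'item: ford', 'item: 1']
```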


This looks a little different for dictionaries. We have to be able to separate the key from the value.

In [48]:
for key, value in mayors.items():
    print(key + " " + value + " has been the mayor of Toronto.")

rob ford has been the mayor of Toronto.
john tory has been the mayor of Toronto.


**Exercise 2**

1. Create a list variable called <code>likert</code> with numbers from -3 to 3. I will use the notation [-3, 3] to denote this.
2. Create a list variable called <code>responses</code> with 100 random elements in the range [-3, 3]. You can generate a single random integer in that range with the following code:
<pre> 
  import random
  random.randint(-3, 3)
</pre>
3. Create a dictionary variable called <code>codebook</code> which has three elements: <code>q1</code>, <code>q2</code>, and <code>q3</code>. <code>q1</code> will be a list of 10 random elements from [-3, 3], <code>q2</code> will be 100 random elements of the same range, and <code>q3</code> will be 1000 random elements of the same range.
4. Generate the mean (average) of each element in <code>codebook</code>.