# Introduction to Data Science for Public Policy
## Class 2: Python Basics
## Thomas Monk

# Overview
- Now that we've set up our environment, we can start learning how to code.
- **Aim**: By the end of this session, we will be able to build our first simple Python program!
- **Structure**: In general, I'll introduce you to a few concepts, which we'll then apply in some practice problems. I'll ask you to solve these in groups, and we can discuss the solutions.

# Basics: Arithmetic
- Python can do many things. Let's start with some very basic arithmetic.
- Python's *syntax* (the grammar of the programming language) is simple.
- We can just ask it to do things in a very natural way.

In [2]:
2+2

4

In [8]:
2*3 + 1/2

6.5

In [7]:
4**2

16

You can control the order of operations in long calculations with brackets.

In [5]:
((1 + 3) * (9 - 2) / 2) ** 2

196.0

In general, Python follows the BODMAS rule (Brackets, Orders, Division and Multiplication, Addition and Subtraction) when deciding the order of operations.
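As a quick check of the order of operations, the sketch below runs the same numbers with and without brackets. The numbers are chosen purely for illustration.

```python
# Without brackets: the power is done first, then the multiplication, then the addition.
# 4**2 = 16, then 3*16 = 48, then 2+48 = 50
print(2 + 3 * 4 ** 2)

# With brackets: the innermost bracket is done first.
# 2+3 = 5, then 5*4 = 20, then 20**2 = 400
print(((2 + 3) * 4) ** 2)
```

When in doubt, adding brackets makes the intended order explicit and costs nothing.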

# Basics: Printing
By default, our notebook displays the value of the last line of a cell.


In [10]:
2+2
2*1

2

Notice above that only the result of `2*1` was displayed.

If we want to print something to the screen explicitly, we use the `print` *function* we've already seen. (We'll get back to the meaning of *function* later on.)

In [11]:
print("Hello, World!")
2*1

Hello, World!


2

Now the output of both lines is shown. This is equivalent to typing:

In [13]:
print("Hello, World!")
print(2*1)

Hello, World!
2


# Data types
We've seen a couple of different *types* of data here:

- integers (whole numbers, which can be negative, zero, or positive: -2, 0, 3...)
- floats (numbers with a decimal part, e.g. 1.5 or 196.0)
- strings (text, e.g. "Hello, World!"), written inside quotation marks

In our transition from Stata to Python, we're going to have to think explicitly about data types more often, and ensure they are compatible. But we need to do this much less than in other languages, like C!

Python can tell us the types of data like so:

In [1]:
type(1)

int

In [2]:
type("Hello, World!")

str

In [9]:
type(1.0)

float
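Types also matter when values are combined in arithmetic. A minimal sketch, using numbers chosen for illustration: mixing an integer with a float gives a float, and division with `/` gives a float even when the answer is a whole number. That is why the bracketed calculation earlier returned 196.0 rather than 196.

```python
# An integer plus a float gives a float.
print(type(1 + 1.0))

# Division with / gives a float, even though 4 / 2 is a whole number.
print(4 / 2)
print(type(4 / 2))

# Multiplying two integers stays an integer.
print(type(2 * 3))
```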

## Data types
What happens when we add a string to an integer?

In [3]:
"Hello, World!" + 1

TypeError: can only concatenate str (not "int") to str

We get an error! Notice the error is different when we do this the other way round - we can think more about this later.

In [4]:
2 + "Hello, World!"

TypeError: unsupported operand type(s) for +: 'int' and 'str'
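One way around these errors is to convert one value so that both sides have the same type. The sketch below uses the built-in functions `str()` and `int()`, which have not been introduced in these notes yet, so treat it as a preview rather than the only approach.

```python
# Turn the integer into a string, then join the two strings together.
print("Hello, World!" + str(1))

# Turn a string of digits into an integer, then add the two numbers.
print(2 + int("3"))
```

Note that `int()` only works on strings that look like whole numbers; `int("Hello, World!")` raises an error of its own.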

## Other useful data types

### Booleans
Booleans represent one of two values: `True` or `False`. In the code cell below, `True` has the type `bool`.

In [16]:
type(True)

bool

Bools represent truth values in general. Here, Python will tell us if 5 is less than 6:

In [19]:
print(5<6)
print(type(5<6))

True
<class 'bool'>


### Booleans
We can also modify our booleans. I'll invert a `True` with a `not`.

In [20]:
not True

False
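`not` works on any expression that produces a boolean, not just on a literal `True`. The sketch below also uses two comparison operators that have not appeared in these notes so far, `==` (is equal to) and `!=` (is not equal to).

```python
# Comparisons produce booleans.
print(5 == 6)  # is 5 equal to 6?
print(5 != 6)  # is 5 different from 6?

# not flips the result of the comparison 5 < 6.
print(not 5 < 6)
```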

## Comments
We use comments to annotate what code is doing. These become much more important when we have large amounts of code.

For instance, in the next code cell, we multiply 3 by 2. We also add a comment (# Multiply 3 by 2) above the code to describe what the code is doing.

In [10]:
# Multiply 3 by 2
print(3 * 2)

6


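To indicate to Python that something is a comment (and not Python code), write a hash sign (#) before it. Python ignores everything from the hash sign to the end of the line, so a comment can sit on its own line or after some code.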
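A hash sign can also appear after code on the same line. A short sketch of how Python treats each case (the strings and numbers are illustrative):

```python
# A comment on its own line: Python skips the whole line.
print(3 * 2)  # A comment after code: Python runs the code and skips the rest.

# A hash sign inside quotation marks is part of the string, not a comment.
print("Room #5")
```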

## Variables

So far, our calculations have been ephemeral. We ask Python to do something, and we lose the output forever.

We can now introduce something reasonably foreign to us from Stata: the concept of a **variable**. These are persistent stores of data, which we can name however we want.

In [12]:
my_variable = 5
print(my_variable)

5


Here, we create the variable *my_variable*, and set it to the value of 5, which we then print to the screen.

We assign with the equals sign: the value on the right of the `=` is stored in the variable on the left.

## Manipulating multiple variables

We can define as many variables as we like. It's normal to have a large number of variables in your code.

Give them descriptive names, so that you can tell what they hold when you come back to the code later!

We can also modify the contents of our variables at any point in time.

In [21]:
my_var1 = 1
my_var2 = 2

print(my_var1 + my_var2)

3


In [22]:
myvar_1 = 1
myvar_1 = 2
print(myvar_1)

2
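A new value can also be built from the variable's current value, because Python works out the right-hand side before it assigns. A minimal sketch, with `counter` as a hypothetical variable name:

```python
counter = 5
# The right-hand side is evaluated first (5 + 1), then stored back in counter.
counter = counter + 1
print(counter)
```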


## Manipulating multiple variables

In [25]:
# Create variables
num_years = 4
days_per_year = 365 
hours_per_day = 24
mins_per_hour = 60
secs_per_min = 60

# Calculate number of seconds in four years - how?

In [26]:
total_secs = secs_per_min * mins_per_hour * hours_per_day * days_per_year * num_years
print(total_secs)

126144000


## Modules

As I've mentioned, there is a large amount of software available for us to use in Python.

Millions of people use Python, and a significant proportion of them make *modules* for us to use.

Here we're going to import the `math` module, and use the code within it.
We do this with the `import` statement.

In [4]:
import math
math.pi # Provides us easy access to pi! This is one of many things the module provides.

3.141592653589793
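The same `module.name` pattern gives access to the functions in the module. The sketch below uses `math.sqrt` and `math.floor` from the standard `math` module; the variable names `radius` and `area` are illustrative.

```python
import math

# Square root of 16. The result is a float.
print(math.sqrt(16))

# Round 6.5 down to the nearest integer.
print(math.floor(6.5))

# Combine math.pi with our own variables: the area of a circle of radius 2.
radius = 2
area = math.pi * radius ** 2
print(area)
```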