# POLSCI 3 Fall 2019

## Discussion 1: Analyzing Quantitative Data using Python

Estimated time: 40 minutes

Welcome! In this notebook, you will review statistical concepts from lecture and learn how to use Jupyter Notebooks (like this one!) and the Python programming language to analyze quantitative data.

### Jupyter Notebooks

A Jupyter Notebook is an online, interactive computing environment, composed of different types of __cells__. Cells are chunks of code or text that are used to break up a larger notebook into smaller, more manageable parts and to let the viewer modify and interact with the elements of the notebook.

Notice that the notebook consists of 2 different kinds of cells: **markdown** and **code**. A markdown cell (like this one) contains text, while a code cell contains expressions in Python, the programming language that you will be using. 

### Running Cells

"Running" a cell is similar to pressing 'Enter' on a calculator once you've typed in an expression; it computes all of the expressions contained within the cell.

To run a code cell, you can do one of the following:
- press __Shift + Enter__
- click __Cell -> Run Cells__ in the toolbar at the top of the screen.

You can navigate the cells by either clicking on them or by using your up and down arrow keys. Try running the cell below to see what happens. 

In [1]:
print("Hello, World!")

Hello, World!


The input of the cell consists of the text/code that is contained within the cell's enclosing box. Here, the input is an expression in Python that "prints" or repeats whatever text or number is passed in. 

The output of running a cell is shown in the line immediately after it. Notice that markdown (text) cells have no output.

### Expressions

An expression is a combination of numbers, variables, operators, and/or other Python elements that the language interprets and acts upon. You can think of expressions as a set of __step-by-step instructions__ for Python to follow in order to produce a specific output. To edit a code cell, simply double click on it and make your changes! 

In [2]:
# Replace the word 'friend' below with your name. Then run the cell. 
print("Welcome to Jupyter notebooks, friend.")

Welcome to Jupyter notebooks, friend.


Code cells can be used to evaluate arithmetic expressions. Below are a few basic examples of what Python can do!

In [3]:
# Addition
20+20

40

In [4]:
# Multiplication
10*8.5

85.0

In [5]:
# Division
625/25

25.0

In [6]:
# Exponents: To raise a to the power of b, call a**b
4**4

256

In [7]:
# A series of arithmetic operations
(2-4*5+7) + 18**2

313
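The result above follows from Python's order of operations: parentheses group sub-expressions, exponents come before multiplication and division, and those come before addition and subtraction. As a rough sketch, the cell below rewrites the same expression one piece at a time; every line prints 313.

```python
# Each line replaces one piece of the original expression with its value.
print((2 - 4*5 + 7) + 18**2)   # the original expression
print((2 - 20 + 7) + 18**2)    # 4*5 becomes 20
print(-11 + 18**2)             # 2 - 20 + 7 becomes -11
print(-11 + 324)               # 18**2 becomes 324
```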

Note that code after a # (hashtag) is not run, so we use lines starting with hashtags to add comments or notes on our code. Here's an example.

In [8]:
# By using a comment at the beginning of the cell, we can describe what will occur when you run the cell.
# Add ten to 8
10+8 # Note how we can add a comment after the expression


18

### Python Variables

Aside from numbers, Python has **variables**, names that act as placeholders for certain values. For example, let the variables `x` and `y` equal 10 and 9, respectively. This action is called "defining a variable".

In [9]:
x = 10
y = 9

Notice that assigning a number to a variable name such as `x` produces no output. To view the value of `x`, place it on the last line of a code cell, like below.

In [10]:
x

10

Now, we can use the variables `x` and `y` in expressions.

In [11]:
10 + 9

19

In [12]:
x + y

19

Now what happens when the value of `x` changes?

In [13]:
x = 12*2

Then, the value of the expression also changes.

In [14]:
x + y

33

**This is why the order in which you run code cells is important.** The expression `x + y` can yield different results depending on which cells you ran before.

In [15]:
x = 5
y = 20
x+y

25
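The same effect can be seen inside a single cell. The sketch below uses the hypothetical names `a`, `b`, `first_sum`, and `second_sum` (chosen so that the notebook's `x` and `y` are left untouched) and evaluates the same expression before and after a reassignment.

```python
a = 10
b = 9
first_sum = a + b     # uses a = 10

a = 12 * 2            # a now refers to 24
second_sum = a + b    # same expression, new result

print(first_sum, second_sum)
```

Python runs the lines from top to bottom, so each sum reflects whatever value `a` had at the moment that line ran.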

What happens if you try to use a variable without assigning it to a value first?

In [16]:
x + y + z

NameError: name 'z' is not defined

You'll see that Python outputs a `NameError`. Python tried to find the value of `z`, but `z` hadn't been defined yet!

**Important:** If you see this error again in this notebook or in future notebooks, it is an indication that you might not have run all the previous cells or that you might be using variables without assigning values to them first.

Add a hashtag before `x + y + z` in the previous cell so it doesn't give you trouble later.

Run the next cell to define `z`.

In [17]:
# Defining z here
z = 2019

In [18]:
# Good to go
x + y + z

2044
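For the curious, here is a minimal sketch of the same define-before-use rule in one cell. It relies on `try` and `except`, which this notebook does not cover; they are used here only so the cell keeps running after the error. The name `not_yet_defined` is hypothetical and assumed not to have been assigned anywhere earlier.

```python
try:
    print(not_yet_defined)          # this name has not been assigned yet
except NameError as error:
    message = str(error)            # the text of Python's error message
    print("Python raised a NameError:", message)

not_yet_defined = 2019              # assign a value first...
print(not_yet_defined)              # ...and now the name can be used
```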

### Variable Types
As you saw in the examples above, two common types of variables are __integers__ (positive and negative whole numbers), and __decimals__ (positive and negative decimal numbers, which Python calls floats).

Another important type of variable is a __string__. Strings are sequences of characters, such as words or sentences. Strings are always surrounded by quotes. For example, `"Sociology"` is a string because it is surrounded by quotes, but `berkeley` and `1868` are not.

In [19]:
# String 
'Political Science'

'Political Science'

In [20]:
# The variable subject is a string
subject = "Political Science"

# The variable berkeley is an integer
berkeley = 1868
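If you are ever unsure what type a value has, Python's built-in `type()` function will tell you. The sketch below uses a few example values of our own; note that Python's names for these types are `int`, `float` (a decimal), and `str` (a string).

```python
print(type(1868))                 # <class 'int'>
print(type(8.4))                  # <class 'float'>
print(type("Political Science"))  # <class 'str'>
print(type("1868"))               # <class 'str'>: the quotes make it a string
```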

### Arrays

An array (which Python itself calls a list) is a special type of variable that can hold more than one value at a time. You can think of it as a collection that stores multiple values in a single variable.


Now, let's create our first array. To create an array, separate each of your values with a comma, surrounding the list in brackets. In the cell below, we create an array of the x, y, and z values from before.

In [21]:
[x, y, z]

[5, 20, 2019]

We can assign an array to a variable, just as we assigned numbers and strings to variables earlier. Note that in the cell below, we use the last line to display the value of the <code>numbers</code> array.

In [22]:
numbers = [x, y, z]
numbers

[5, 20, 2019]
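Once values are stored in an array, you can ask how many entries it has and look up an entry by its position. A minimal sketch, using a hypothetical variable `scores` that holds the same values as above; note that Python counts positions starting from 0.

```python
scores = [5, 20, 2019]   # hypothetical array with the same values as above

print(len(scores))       # number of entries: 3
print(scores[0])         # first entry: 5
print(scores[2])         # third entry: 2019
print(type(scores))      # <class 'list'>
```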

### Variables Overview

Here is a table that summarizes all of the information on variables above. 

|__Variable Type__ | __Definition__                        | __Examples__   |
|------------------|---------------------------------------|----------------|
|Integer           | Positive and negative whole numbers   | 1868, 0, -200  |
|Decimal           | Positive and negative decimal numbers | -9.7, 0.0, 8.4 |
|String            | Sequence of characters                | "Berkeley"     |
|Array             | List of variables/numbers/strings     | [1, 2, 3]      |

### Functions in Python

At a high level, functions are series of expressions that take in inputs, perform some action with them, and return some output. The `print()` function is a very simple example. It takes in some number or text, and simply outputs its input.

In [23]:
print("Hello World")

Hello World


In [24]:
print(z)

2019


Here, we define a function that squares its input. In the following cell, we use the <code>def</code> keyword to indicate we are *defining* a function. This is followed by the name of our function, in this case *square*. We follow the name with parentheses surrounding whatever inputs the function uses. The <code>return</code> keyword comes right before the output we want our function to produce, in this case the input value raised to the power of 2.

In [25]:
def square(x):
    return x**2

In [26]:
h = square(15)
h

225

Here, the variable `x` inside the function is assigned to the input 15. 

`x` is considered a parameter to the function `square()`. As you can see from the examples above, the parameters to a function are passed in by specifying them within the parentheses immediately after the function name.
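A parameter only takes on the passed-in value while the function is running; a variable with the same name outside the function is not affected. The sketch below illustrates this with a hypothetical function `double()` and a variable `n`, neither of which appears elsewhere in this notebook.

```python
def double(n):
    return n * 2      # inside the function, n is whatever value was passed in

n = 100               # a separate variable that happens to share the name
result = double(7)    # during this call, the parameter n is 7

print(result)         # 14
print(n)              # still 100
```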

Note that functions can have more than one argument, and the order you enter the arguments does matter. Here `exponentiate()` takes 2 arguments, `a` and `b`, and raises the first argument to the power of the second.

In [27]:
def exponentiate(a, b):
    return a**b

In [28]:
exponentiate(4, 5)

1024

In [29]:
exponentiate(5, 4)

625
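Python also lets you name the arguments when you call a function, in which case the names decide which value goes where rather than the order. A short sketch, repeating the definition of `exponentiate()` from above so the cell runs on its own:

```python
def exponentiate(a, b):
    return a**b

print(exponentiate(a=4, b=5))   # 1024
print(exponentiate(b=5, a=4))   # 1024: the names, not the positions, are matched
print(exponentiate(4, 5) == exponentiate(5, 4))   # False: without names, order matters
```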

## Conclusion
Great! Over the course of this notebook, you were introduced to the basic types of objects in Python, how to store them, how to create lists of them, and how to use them as inputs and outputs for functions. In the next discussion, you will build further on the coding and quantitative skills you learned today. Stay tuned!

__Peer Consulting Office Hours__

If you had trouble with any content in this notebook, Data Peer Consultants are here to help! You can view their locations and availabilities at this link: https://data.berkeley.edu/education/data-peer-consulting. Peer Consultants are there to answer all data-related questions, whether about the content of this notebook, applications of data science in the world, or other data science courses offered at Berkeley. Make sure to take advantage of this wonderful resource!

## Saving Your Notebook
Now that you've finished the discussion, we need to save it! To do this, run the next cell. If that doesn't work, click <code>File</code> $\rightarrow$ <code>Download as</code> $\rightarrow$ <code>PDF via Chrome</code>

In [4]:
from otter import Notebook
Notebook.export('Discussion Notebook 1, Introduction to Python.ipynb', filter_type='tags')