# NYU Wagner - Python Coding for Public Policy

Aidan Feldman

## Welcome!

![Elmo waving](https://media.giphy.com/media/LPgFwCQg4HQBvPihcn/source.gif)

### A bit more about me

- Coding since 2005 🖥
- Government since 2014 🦅
- Also a modern dancer 💃, cyclist 🚲, and baker 🍞
- Passionate about open source

### Day jobs

Past roles include:

- Technology Director at [TTS](https://www.gsa.gov/about-us/organization/federal-acquisition-service/technology-transformation-services)
- [Census xD](https://www.xd.gov/)
- [NYC Planning Labs](https://labs.planning.nyc.gov/)
- GitHub
- [18F](https://18f.gsa.gov)

### Introductions

Share the following:

- Name (what you go by)
- Pronouns
- What you're studying
- Fun fact

### Who are you (as a whole)

[Survey results](https://docs.google.com/forms/d/1VmpiA6uOdJ0BB-ZuHbyTO7iprvPrLJeJt5DCPzx0rOQ/edit#responses)

### Where we're at

- The world is still scary
- Childcare, mental health issues, etc.
- Reach out
- Your future responsibilities

## Class structure

### Class materials walkthrough

1. [Homepage](https://github.com/afeld/python-public-policy)
1. [Syllabus](https://github.com/afeld/python-public-policy/blob/main/syllabus.md#readme)
1. [Brightspace](https://brightspace.nyu.edu/d2l/home/156784)

### Homework

- **Online tutorials:** In advance of classes, online tutorials will be assigned as homework. The following lecture will focus on applying those concepts.
- **Coding:** Complete Python coding exercises that apply the concepts covered in the last lecture.

![homework workflow](extras/img/hw_workflow.png)
 

## Disclaimers

- You are not going to be good at coding by the end of this class
- Not everything is going to make sense the first time
- Here to teach you to:
  - Understand the power of code
  - Not be afraid of code
  - Do a lot with just a little code
  - Troubleshoot
  - Google stuff
- Not a statistician
- Still new to Brightspace and JupyterHub, so there will be some trial and error
- You will get out of it what you put into it

### Questions you might ask

- Can you remind us what that means?
- Can you say that differently?
- Can you give an example?
- How might this show up in our jobs?
- When did _you_ first learn about this?
- Why does this matter?

_Stolen from [Andrew Maier](https://twitter.com/andrewmaier/status/1258738594430291969)_

## Spreadsheets vs. programming languages

### Why spreadsheets

- The easy stuff is easy
- Lots of people know how to use them
- Mostly just have to point, click, and scroll
- Data and logic live together as one

### Why programming languages

- Data and logic _don't_ live together
  - Why might this matter?

- More powerful, flexible, and expressive than spreadsheet formulas

  - Don't have to cram into a single line

    ```excel
    =SUM(INDEX(C3:E9,MATCH(B13,C3:C9,0),MATCH(B14,C3:E3,0)))
    ```

  - Can have more descriptive data references than `Sheet1!A:A`

- Better at working with large data
  - Excel caps a sheet at 1,048,576 rows and Google Sheets caps a spreadsheet at 10 million cells, and both get slow long before that
- Reusable code (packages)
- Automation
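
To illustrate the points about readability, here is one way the two-way lookup from the spreadsheet formula above might look in Python. This is only a sketch: the data, the names (`spending_by_agency`, `agency`, `category`), and the amounts are made up, and in this class the same idea would usually be done with `pandas`.

```python
# Made-up sample data standing in for the spreadsheet range C3:E9
spending_by_agency = {
    "Parks": {"Salaries": 120, "Supplies": 30},
    "Sanitation": {"Salaries": 200, "Supplies": 45},
}

# The values being looked up (cells B13 and B14 in the formula)
agency = "Sanitation"
category = "Supplies"

# Each step gets its own line and a descriptive name
agency_spending = spending_by_agency[agency]
amount = agency_spending[category]
print(amount)
```

Note that the data (`spending_by_agency`) is defined separately from the logic that looks things up in it.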

### Side-by-side<sup>1</sup>

|                       Task |      Spreadsheets      | Programming languages |
| -------------------------: | :--------------------: | :-------------------: |
|           **Loading data** |          Easy          |        Medium         |
|           **Viewing data** |          Easy          |        Medium         |
|         **Filtering data** |          Easy          |        Medium         |
|      **Manipulating data** |         Medium         |        Medium         |
|           **Joining data** |          Hard          |        Medium         |
| **Complicated transforms** | Impossible<sup>2</sup> |        Medium         |
|             **Automation** | Impossible<sup>2</sup> |        Medium         |
|        **Making reusable** | Impossible<sup>2</sup> |        Medium         |
|         **Large datasets** |       Impossible       |         Hard          |

<sup>1</sup> Ratings are obviously subjective.<br/>
<sup>2</sup> Not including scripting.
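
The "Automation" and "Making reusable" rows may be easier to picture with an example. The sketch below defines a calculation once and reuses it across several inputs. The function name `percent_change` and the ridership numbers are invented for illustration.

```python
def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) * 100 / old


# Made-up (before, after) rider counts for three hypothetical bus routes
ridership = {
    "Route A": (1000, 1100),
    "Route B": (800, 600),
    "Route C": (500, 500),
}

# Reuse the same logic for every route instead of copying a formula down a column
changes = {}
for route, (before, after) in ridership.items():
    changes[route] = percent_change(before, after)
    print(route, changes[route])
```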

### Python vs. other languages

Why are you taking _this_ class instead of R or whatever else?

![Python logo](https://upload.wikimedia.org/wikipedia/commons/thumb/c/c3/Python-logo-notext.svg/110px-Python-logo-notext.svg.png)

### Python vs. other languages

- Good for general-purpose _and_ data stuff
- Widely used in both industry and academia
- Relatively easy to learn
- Open source

![Python logo](https://upload.wikimedia.org/wikipedia/commons/thumb/c/c3/Python-logo-notext.svg/110px-Python-logo-notext.svg.png)

## What _is_ Python?

- A general-purpose programming language
- Text that your computer understands
    - Usually saved in a text file
    - _This is true of most programming languages_
- Popular for data analysis and data science

### Packages

- a.k.a. "libraries" or "modules"
- Developers create them to make code/functionality reusable and easily shareable
- Software plugins that you `import`
- Packages we’ll use:
    - `pandas`
    - `plotly`

### Where to Python

Python can be run in:

- A text file, using the `python` command
- [The interactive Python interpreter / command prompt / shell](https://www.python.org/shell/)
- A Jupyter Notebook
    - [Google Colab](https://colab.research.google.com/), [Mode](https://mode.com/), [Kaggle](https://www.kaggle.com/), and other sites/tools are built around it
    - What we'll be using for this class

Each can be on your computer ("local"), or in the cloud somewhere.
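
For the first option, a minimal sketch: save the lines below in a text file (the filename `hello.py` is just an example), then run `python hello.py` from the command line. Depending on how Python is installed, the command may be `python3` instead.

```python
# Contents of hello.py (a hypothetical filename)
# Run from the command line with: python hello.py
greeting = "Hello, NYU Wagner!"
print(greeting)
```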

![Trinity using the command line in the Matrix](https://nmap.org/movies/matrix/trinity-nmapscreen-hd-crop-1200x728.jpg)

### Try it!

1. Go to [python.org/shell](https://www.python.org/shell/)
1. Do some math (after typing each line, press `Enter` to submit)
    1. `1 + 1`
    1. `10 / 4`
    1. `10 / 3`
    1. Calculate the number of minutes in a year
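
If you get stuck, the sketch below shows the operators involved, using minutes in a day (rather than a year) as the worked example. `//` and `%` aren't needed for the exercise; they're included as related operators. In the shell you can leave out `print()` and type each expression directly.

```python
# Addition
print(1 + 1)

# Division with / always produces a decimal number (a "float")
print(10 / 4)
print(10 / 3)

# Related operators: whole-number division and remainder
print(10 // 3)
print(10 % 3)

# Worked example: minutes in a day
minutes_per_hour = 60
hours_per_day = 24
minutes_per_day = minutes_per_hour * hours_per_day
print(minutes_per_day)
```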

### Try to break it!

It's ok, you won't hurt it.

What happened?
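
A few typical ways to break it, as a sketch. Each entry in `broken_snippets` is an arbitrary example. The loop uses `eval()` only so that all three errors can be shown in one block without stopping at the first one; in the shell, you would just type each snippet directly.

```python
# Arbitrary examples of input that Python can't handle
broken_snippets = [
    "1 / 0",  # dividing by zero
    "undefined_name",  # a name that was never defined
    "1 +",  # an incomplete expression
]

error_names = []
for snippet in broken_snippets:
    try:
        eval(snippet)
    except Exception as error:
        # Record and show the kind of error, along with Python's message
        error_names.append(type(error).__name__)
        print(f"{snippet!r} -> {type(error).__name__}: {error}")
```

The error type and message are Python's way of telling you what went wrong. Nothing is damaged, and you can keep typing.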

## Jupyter

- Web-based programming environment
- Supports Python by default, and other languages with plugins
- Nicely displays output of your code so you can check and share the results
- Avoids using the command line
- Avoids installation problems across different computer operating systems

We're using [JupyterHub, offered by NYU's High Performance Computing (HPC) group](https://sites.google.com/nyu.edu/nyu-hpc/training-support/resources-for-classes/jupyterhub).

### Command line vs. Jupyter

![Command line vs. Jupyter output](extras/img/cli_vs_jupyter.png)

### Try it!

1. Go to the [class JupyterHub](https://padmgp-4506-spring.rcnyu.org/)
1. Create a notebook
   1. Click `New`
   1. Under `Notebook`, click `Python [conda env:python-public-policy]`
1. Paste in [the following example](https://plotly.com/python/linear-fits/#linear-fit-trendlines-with-plotly-express)
1. Press the ▶️ button (or `Control`+`Enter` on your keyboard)

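```python
import plotly.express as px
import plotly.io as pio

# Render charts so they display in the notebook and in PDF exports
pio.renderers.default = "notebook_connected+pdf"

# Load a sample dataset of restaurant bills and tips
df = px.data.tips()
# Scatter plot of tip vs. total bill, with an ordinary least squares (OLS) trendline
fig = px.scatter(df, x="total_bill", y="tip", trendline="ols")
fig.show()
```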

FYI `px.data.tips()` loads one of [Plotly's sample datasets](https://plotly.com/python-api-reference/generated/plotly.express.data.html).

### Jupyter basics

A "cell" can be either code or [Markdown](https://www.markdownguide.org/getting-started/) (text). Raw Markdown looks like this:

```
## A heading

Plain text

[A link](https://somewhere.com)
```

#### Running

- You "run" a cell by either:
    - Pressing the ▶️ button
    - Pressing `Control`+`Enter` on your keyboard
- Cells don't run unless you tell them to, and they run in the order you run them, not the order they appear on the page
    - Generally, you want to run them from the top every time you open a notebook
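
To see why order matters, picture the two cells below (marked with comments here, since this is a sketch rather than a real notebook). The variable names and values are made up. Running the second cell before the first in a fresh session would fail with a `NameError`, because `bill` and `tip` wouldn't exist yet.

```python
# Cell 1: defines the variables
bill = 40
tip = 8

# Cell 2: depends on Cell 1 having been run first
total = bill + tip
print(total)
```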

#### Output

- The last thing in a code cell is what gets displayed when it's run
- The output gets saved as part of the notebook
- Existing output under a cell doesn't mean that cell has been run during this session
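
A sketch of the first point, with made-up values: in the hypothetical cell below, only the final line would be displayed automatically. Anything earlier in the cell needs `print()` to show up.

```python
hours_per_day = 24
days_per_week = 7

# Not the last line, so it needs print() to be shown
print(hours_per_day)

# Last expression in the cell: Jupyter displays its value without print()
hours_per_day * days_per_week
```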

## Computers are not smart.

They do exactly what you tell them to do (not what you _meant_ them to do) in the order you tell them to do it.
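
One small illustration, with made-up numbers: suppose you meant to average two values. Python follows the standard order of operations, so leaving out the parentheses divides before adding.

```python
low = 10
high = 20

# What was typed: division happens before addition
typed = low + high / 2

# What was meant: add first, then divide
meant = (low + high) / 2

print(typed, meant)
```

Python raises no error for the first version. It computes exactly what was typed, which is why checking your output matters.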

## [Homework 0](https://padmgp-4506-spring.rcnyu.org/user-redirect/notebooks/class_materials/hw_0.ipynb)

1. Walk through the assignment
1. [How to submit](https://github.com/afeld/python-public-policy#turning-in-assignments)
1. If there's time, start on it