<a href="https://colab.research.google.com/github/afeld/python-public-policy/blob/master/lecture_0.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **NYU Wagner - Python Coding for Public Policy**

Aidan Feldman

[Shared notes](https://docs.google.com/document/d/1umb8kbKZuKR05K7Bvl2WD4N_EfGHW2WsOQpqFAlLQXU/edit#)

![Reel-to-reel projector](https://p0.pikrepo.com/preview/271/1009/gray-reel-to-reel-projector-against-gray-background-thumbnail.jpg)

# Class 0: Intro to coding

## Welcome!

![Elmo waving](https://media.giphy.com/media/LPgFwCQg4HQBvPihcn/source.gif)

### A bit more about me

- Coding since 2005 🖥
- Government since 2014 🦅
- Also a modern dancer 💃 cyclist 🚲 and baker 🍞
- Passionate about open source
- Run a meetup called [Hacker Hours](https://hackerhours.org/) for people learning to code

### Day jobs

- **Current:** Technology Director at [TTS](https://www.gsa.gov/about-us/organization/federal-acquisition-service/technology-transformation-services)
- **Past:**
  - [Census xD](https://www.xd.gov/)
  - [NYC Planning Labs](https://labs.planning.nyc.gov/)
  - GitHub
  - [18F](https://18f.gsa.gov)

### Introductions

In the Zoom meeting chat, share the following:

- Name (what you go by)
- What you're studying
- Go-to quarantine snack

### Where we're at

- The world is scary right now
- Childcare, mental health issues, etc.
- Reach out
- Your future responsibilities

## Class structure

### Zoom

- **Video:**
    - Encouraged to keep it on, but "face mute" if you need
    - [Gallery view](https://support.zoom.us/hc/en-us/articles/360000005883-Displaying-participants-in-gallery-view)
    - [Screen sharing](https://support.zoom.us/hc/en-us/articles/201362153-Sharing-your-screen) for hands-on time
- **Audio:** Keep muted by default
- **In-meeting chat:** Feel free to drop questions as we go
- **Channel:** Group messaging between classes
- [Non-verbal feedback](https://support.zoom.us/hc/en-us/articles/115001286183-Nonverbal-Feedback-During-Meetings#h_50523139-7bac-403b-9c59-1755ada65ad9)

### Class materials walkthrough

1. [Shared notes](https://docs.google.com/document/d/1umb8kbKZuKR05K7Bvl2WD4N_EfGHW2WsOQpqFAlLQXU/edit#)
1. [Homepage](https://github.com/afeld/python-public-policy)
1. [Syllabus](https://github.com/afeld/python-public-policy/blob/master/syllabus.md#readme)
1. [NYU Classes page](https://newclasses.nyu.edu/portal/site/08599039-1e6b-4c40-98f4-1a862293dbee)

### Homework

- **Online tutorials:** In advance of classes, online tutorials will be assigned as homework. The following lecture will focus on applying those concepts.
- **Coding:** Complete Python coding exercises that apply the concepts covered in the last lecture.

![homework workflow](img/hw_workflow.png)
 

### Disclaimers

- You are not going to be good at coding by the end of this class
- I am here to teach you to:
  - Understand the power of code
  - Not be afraid of code
  - Do a lot with just a little code
  - Troubleshoot
  - Google stuff
- Not a statistician
- This class came together in two weeks
  - Rough around the edges
  - Subject to change
  - Ask for your patience

## Spreadsheets vs. programming languages

### Why spreadsheets

- The easy stuff is easy
- Lots of people know how to use them
- Mostly just have to point, click, and scroll
- Data and logic live together as one

### Why programming languages

- Data and logic _don't_ live together
  - Why might this matter?

- More powerful, flexible, and expressive than spreadsheet formulas

  - Don't have to cram into a single line

    ```
    =SUM(INDEX(C3:E9,MATCH(B13,C3:C9,0),MATCH(B14,C3:E3,0)))
    ```

  - Can have more descriptive data references than `Sheet1!A:A`

- Better at working with large data
  - Excel caps a worksheet at 1,048,576 rows and Google Sheets caps a spreadsheet's total cell count in the millions, and both get slow long before that
- Reusable code (packages)
- Automation
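
To make the points about descriptive names and not cramming everything into one line more concrete, here is a minimal sketch of the same kind of two-way lookup as the formula above, written in Python. The data and the names (`sales_by_region`, `region`, `quarter`) are made up for illustration; they are assumptions, not data from this course.

```python
# Hypothetical data: sales for each region, broken down by quarter.
sales_by_region = {
    "North": {"Q1": 120, "Q2": 135, "Q3": 150},
    "South": {"Q1": 90, "Q2": 110, "Q3": 95},
}

# The values we want to look up (the role cells B13 and B14 play in the formula).
region = "South"
quarter = "Q2"

# Each step gets its own line and a descriptive name.
sales_for_region = sales_by_region[region]
sales = sales_for_region[quarter]

print(sales)
```

The data lives in one place and the lookup logic in another, so either can change without touching the other.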

### Side-by-side\*

|                       Task |  Spreadsheets  | Programming languages |
| -------------------------: | :------------: | :-------------------: |
|           **Loading data** |      Easy      |        Medium         |
|           **Viewing data** |      Easy      |        Medium         |
|         **Filtering data** |      Easy      |        Medium         |
|      **Manipulating data** |     Medium     |        Medium         |
|           **Joining data** |      Hard      |        Medium         |
| **Complicated transforms** | Impossible\*\* |        Medium         |
|             **Automation** | Impossible\*\* |        Medium         |
|        **Making reusable** | Impossible\*\* |        Medium         |
|         **Large datasets** |   Impossible   |         Hard          |

_\*Ratings are obviously somewhat subjective._

_\*\*Not including scripting._

## Python vs. other languages

- Good for general-purpose _and_ data stuff
- Widely used in both industry and academia
- Relatively easy to learn
- Open source

![Python logo](https://upload.wikimedia.org/wikipedia/commons/thumb/c/c3/Python-logo-notext.svg/110px-Python-logo-notext.svg.png)

## What _is_ Python?

- A general-purpose programming language
- Text that your computer understands
    - Usually saved in a text file
    - _This is true of most programming languages_
- Popular for data analysis and data science

### Packages

- a.k.a. "libraries" or "modules"
- Developers create them to make code/functionality reusable and easily sharable
- Software plugins that you `import`
- Packages we’ll use:
    - `pandas`
    - `plotly`
    - `spacy`

### Where to Python

Python can be run in:

- A text file, using the `python` command
- [The interactive Python interpreter / command prompt / shell](https://www.python.org/shell/)
- A Jupyter Notebook
    - Google Colab, [Mode](https://mode.com/), [Kaggle](https://www.kaggle.com/), and other sites/tools are built around it
    - What we'll be using for this class

Each can be on your computer ("local"), or in the cloud somewhere.
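
For the first option, a minimal sketch of what such a text file might contain is shown below. The filename `tip.py` is hypothetical; saved under that name, it would be run with `python tip.py`. Unlike the interactive shell, a script only shows what you explicitly `print`.

```python
# Contents of a hypothetical file named tip.py
bill = 40
tip_rate = 0.25

tip = bill * tip_rate

# Nothing appears on screen unless we print it.
print(tip)
```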

![Trinity using the command line in the Matrix](https://nmap.org/movies/matrix/trinity-nmapscreen-hd-crop-1200x728.jpg)

### Try it!

1. Go to [python.org/shell](https://www.python.org/shell/)
1. Do some math (after typing each line, press `Enter` to submit)
    1. `1 + 1`
    1. `10 / 4`
    1. `10 / 3`
    1. Calculate the number of minutes in a year

### Try to break it!

It's ok, you won't hurt it.

What happened?
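
One way to break it, as a sketch: divide by zero. Python responds with an error that names what went wrong (here, a `ZeroDivisionError`) and then carries on. The `try`/`except` below is only there so the example can capture the error's name and keep going; in the shell, typing `10 / 0` on its own would simply show the error message.

```python
try:
    result = 10 / 0
except ZeroDivisionError as error:
    # Python tells us the kind of error that occurred.
    error_name = type(error).__name__
    print("Python raised:", error_name)
```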

## Jupyter

For our purposes, synonymous with Google Colab.

- Web-based programming environment
- Supports Python by default, and other languages with plugins
- Nicely displays output of your code so you can check and share the results
- Connects with Google Drive
- Avoids using the command line
- Avoids installation problems across different computer operating systems

### Command line vs. Jupyter

![Command line vs. Jupyter output](img/cli_vs_jupyter.png)

### Try it!

1. Create a Colab notebook
   1. Go to [colab.research.google.com](https://colab.research.google.com)
   1. Click `NEW NOTEBOOK`
1. Paste in [the following example](https://plotly.com/python/linear-fits/#linear-fit-trendlines-with-plotly-express):

    ```python
    import plotly.express as px

    df = px.data.tips()
    fig = px.scatter(df, x="total_bill", y="tip", trendline="ols")
    fig.show()
    ```

1. Press the ▶️ button (or `Control`+`Enter` on your keyboard)

### Jupyter basics

A "cell" can be either code or [Markdown](https://www.markdownguide.org/getting-started/) (text). Raw Markdown looks like this:

```
## A heading

Plain text

[A link](https://somewhere.com)
```

#### Running

- You "run" a cell by either:
    - Pressing the ▶️ button
    - Pressing `Control`+`Enter` on your keyboard
- Cells only run when you tell them to, and in the order you run them
    - Generally, you want to run them from the top every time you open a notebook
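
Here's a rough illustration of why order matters. Each commented section stands in for a separate notebook cell, and the variable names are made up. If the second cell were run before the first in a fresh session, Python would raise a `NameError`, because `students` wouldn't exist yet.

```python
# --- Cell 1: define a value ---
students = 30

# --- Cell 2: use the value defined in Cell 1 ---
groups_of_five = students / 5
print(groups_of_five)
```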

#### Output

- The value of the last line in a code cell is what gets displayed when it's run
- The output gets saved as part of the notebook
- Existing output from a cell doesn't mean that cell has been run during this session
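
A small sketch of how display works, assuming this code is pasted into a single notebook cell (the names are made up). Only the final line's value is displayed automatically; to show anything earlier in the cell, use `print`.

```python
total_bill = 40
tip = 6

print(tip)        # shown because we asked for it explicitly
total_bill        # not the last line of the cell, so not displayed
total_bill + tip  # the last line: a notebook displays its value
```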

## Start [Homework 0](https://colab.research.google.com/github/afeld/python-public-policy/blob/master/hw_0.ipynb)