# ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) Introduction to Python

June 17, 2017
NYC

# Introductions

Hi, I'm Winston. ![Hi](http://www.charbase.com/images/glyph/128075)
- Data scientist at Argo Digital
- BA Economics & Politics from Oxford
- Former data science bootcamp instructor at General Assembly

# How about you?

- What's your name?
- What do you want to learn today?

![](https://btwbpress.files.wordpress.com/2016/10/name_is.png?w=1000)

# Learning objectives

You will be able to:

- Write and run Python scripts from the command line
- Use Jupyter Notebooks to organize and run your Python code
- Use Python for basic data analysis tasks (data cleaning, exploration and analysis)
- Identify Python’s role as a tool in the data analysis ecosystem
- Plan and implement your own Python-based projects for work or play!

# So...how are we going to do this?

- We're starting from scratch, but will get hands-on ASAP
- Ask questions as we go! I'll put in-depth ones in the "parking lot" until our breaks
- Errors are _instructional_! You won't break anything, so learn by trial and error


- We can't cover everything: I'll focus on the tools you need to start using Python in your work
- We'll move fast... Don't worry if you feel temporarily lost! We have chunks of independent practice to solidify what you learn
- Kitchen and bathrooms are on this floor

# Our schedule
| TIME  | TOPIC  |
|:-:|---|
| 10:00 - 10:15 | Introductions: Who's here? |
| 10:15 - 10:30 | **Requirements checklist** |
| 10:30 - 11:00 | The command line, .py files and Jupyter |
| 11:00 - 11:15 | BREAK |
| 11:15 - 12:15 | Python: data types, containers and control flow |
| 12:15 - 1:00 | LUNCH |
| 1:00 - 1:30 | Python practice |
| 1:30 - 1:50 | The data science ecosystem |
| 1:50 - 2:50 | Pandas: dataframes, series and plots|
| 2:50 - 3:00 | BREAK |
| 3:00 - 3:45 | Python + pandas practice |
| 3:45 - 4:00 |Your next steps |
---

# Setup checklist

1. Laptops with an internet connection
2. Anaconda for Python 2.7 or 3.6
3. Access to the command line (Terminal for Mac / \*nix, Command Prompt for Windows)
    - Try opening it now:
    - ⌘ (Command) + Space, then type "Terminal" (Mac OS)
    - Start Menu, search bar, "Command Prompt" (Windows)


![Terminal from finder](././intro_python_bootcamp/opening_terminal.png)

* Ideal but optional -- Python in your PATH variable. Try these commands in your terminal:
    - `which python` (Mac / \*nix) or `where python` (Windows)
    - Open a browser, then in your terminal type: `jupyter notebook`
    - If those fail, don't worry -- we'll fix it
* Ideal but optional -- a plain text editor (not Word / Pages)

![Terminal from finder](././intro_python_bootcamp/terminal_python_0.png)


### LESSON GUIDE
| TIME  | TOPIC  |
|:-:|---|
| 10:00 - 10:15 | Introductions: Who's here? |
| 10:15 - 10:30 | Requirements checklist |
| 10:30 - 11:00 | **Python, the command line, .py files and Jupyter** |
| 11:00 - 11:15 | BREAK |
| 11:15 - 12:15 | Python: data types, containers and control flow |
| 12:15 - 1:00 | LUNCH |
| 1:00 - 1:30 | Python practice |
| 1:30 - 1:50 | The data science ecosystem |
| 1:50 - 2:50 | Pandas: dataframes, series and plots|
| 2:50 - 3:00 | BREAK |
| 3:00 - 3:45 | Python + pandas practice |
| 3:45 - 4:00 |Your next steps |
---

# Python

Python is a general-purpose programming language created by Guido van Rossum, aka "Benevolent Dictator for Life", in the early 1990s.

![](https://www.python.org/static/img/python-logo@2x.png)

# Why Python?

- Readability
- Flexibility 
- Interpreted language, doesn't need to compile
- Dynamic typing
- Supports multiple programming paradigms (object oriented, functional, procedural)
- Entire ecosystem for many domains: web, data, scientific




# Why did I install Anaconda?

- A distribution of Python and commonly used libraries of tools
- Easier than individually installing many libraries
- Ensures the versions of each library are compatible with each other

# But how do I _use_ Python?

You've got options! You can write and run Python code:

- Within the Python interpreter environment
- By running scripts at the OS command line
- In Jupyter Notebooks (formerly IPython Notebook)
- IDEs, like Spyder

We'll cover the first three in this class, but mainly we'll work with Jupyter Notebooks.

# Let's get going!

# Using the Command Line


There was a point when computers didn't come with Graphical User Interfaces (GUI).

Instead, everyone interacted with the computer using text commands in what we call a Command Line Interface (CLI).

![Grandpa Simpson](http://1.bp.blogspot.com/-pD3rd_ueBYM/U0S_WsWllyI/AAAAAAAAAmA/EsCZEfROBhk/s1600/47grandpa.png)

The command line still exists! Knowing how to use the CLI becomes essential as you program more.

### Opening and closing the terminal

Spotlight in OS X is the easiest and fastest way to open the terminal:

- ⌘ (Command) + Space
- "Terminal"
- Enter

You can actually hit enter as soon as the field autocompletes.



For users of Windows <= 7:
- Click Start
- All Programs - Accessories - Command Prompt

For Windows 8+: https://www.lifewire.com/how-to-open-command-prompt-2618089

### Getting comfortable in the CLI

- For many programs, you can open multiple tabs by pressing **⌘-T**. Try it in your terminal!
- You can close the current tab or window with **⌘-W**. This goes for most applications on a Mac. Try _that_ in your terminal!


You are in a "shell" which lets you interact with the system. Try typing these commands, hitting enter after each:
```bash
pwd
ls
cd ..
```
    
> Check: can you infer what each of those commands did? How can you "undo" the `cd ..` command?

# Python from the command line

At the command prompt, simply type `python`. This will run the Python interpreter.

![Python interpreter](./intro_python_bootcamp/terminal_python_1.png)


Ideally this mentions Anaconda and includes the version number 3.6 or 2.7.

If it doesn't, the `python` command may not point to the version we want. Try the Anaconda Command Prompt, or open Spyder via Anaconda Navigator.

# Finally... programming!

`print("Hello Room 3E!")`

And hit enter.

\* When you see `text like this`, that's Python code; type it as you see it. Hit enter after each line.

You 'bind' _variables_ to _values_:

`x = "Hello Room 3E"`

`print(x)`

Now the variable `x` is bound to the value "Hello Room 3E".  You can do things with that value by using the variable `x`.

(Less pedantically, we're assigning a value to `x`.)

Of course, variables can bind to all sorts of values, including numbers:

`y = 42`

`y + 8`

`(y + 8) ** 2`

Let's try those commands together.

# Your turn

We'll go over all the basic operations soon, but first: take a minute and experiment with variables and arithmetic on your own!

- Try multiplying numbers with \*
- Divide numbers with /
- Do a division that should have a fractional answer. What did you see? (We'll come back to this!)

# So...we run programs one line at a time?

![Slow](./intro_python_bootcamp/ibm_punchcard.jpeg)

# Nope. Enter the text editor.

You don't need anything special to write and save whole Python scripts: a simple text editor will do.

- Open Sublime (recommended), TextEdit, Notepad, or anything else that lets you specify the extensions of your saved text files.



Type some code:

`x = "3E"`

`print("Hello Room: {}".format(x))`

Notice that we're using the *function* `print()` and the string *method* `format()`, and each requires opening and closing parentheses.

What do you think that example code will do?

Save this file as example.py to the directory you're in.

Now, from the command line:

`python example.py`

![It's beautiful!](http://s2.quickmeme.com/img/46/46d395cdf9c87ae2a7955a09acaf0ff36eb758b54299b24b1a0298c38ff5e7fe.jpg)

# Jupyter Notebooks

- An interactive "environment" for running Python.
- Extremely popular for data analysis and visualization.

With an internet browser open, go to the command line and type `jupyter notebook`

You've just launched a server running the notebook program. Your browser should show this on a new tab.



Click New - Notebook - Python 3.6

![JN](./intro_python_bootcamp/jupyter_notebooks_0.png)


This slideshow is a Jupyter notebook!

On the command line, you can go to the directory you saved it in, then open it with:

`jupyter notebook intro_to_python.ipynb`

The notebook is made up of cells.

To run a cell, hit ctrl+Enter while your cursor is in it.

To create a new cell, click outside your current cell, so your cursor isn't showing, then hit `a` or `b` to insert a new cell above or below.

### LESSON GUIDE
| TIME  | TOPIC  |
|:-:|---|
| 10:00 - 10:15 | Introductions: Who's here? |
| 10:15 - 10:30 | Requirements checklist |
| 10:30 - 11:00 | The command line, .py files and Jupyter |
| 11:00 - 11:15 | **BREAK** |
| 11:15 - 12:15 | **Python: data types, containers and control flow** |
| 12:15 - 1:00 | LUNCH |
| 1:00 - 1:30 | Python practice |
| 1:30 - 1:50 | The data science ecosystem |
| 1:50 - 2:50 | Pandas: dataframes, series and plots|
| 2:50 - 3:00 | BREAK |
| 3:00 - 3:45 | Python + pandas practice |
| 3:45 - 4:00 |Your next steps |
---

# Data types

All data are of specific "types", which tell the Python interpreter how to work with them.

>Python has several "built-in" types:
- **numerics**
- **sequences**
- **mappings**
- files
- classes
- instances
- exceptions

## Numerics

>- **int** (integers)
>- **float** (includes decimals)
>- Boolean (true/false)
>- long (Python 2 only)
>- complex

Python has a function that returns the type of a value: `type(<value>)`

### 1) Integers - whole numbers, either positive or negative

In [None]:
4
x = 786

### 2) Floats - numbers with a decimal point

In [None]:
3.2

In [None]:
# Lines starting with hashes are comments
# They don't run

# What type are each of these values?
type(1)
# type(2.5)
# type(True)

## Sequences

>- **strings**
>- **lists**
>- tuples
>- unicodes
>- bytearrays
>- buffers

### Sequences: Strings

In [3]:
x = "string here"
y = "here too"

In [None]:
print(x)
type(x)

In [111]:
x + ". " + y

'string here. here too'

In [1]:
'stringy' * 3

'stringystringystringy'

In [4]:
1 + x 
#str(1) + " " + x

TypeError: unsupported operand type(s) for +: 'int' and 'str'

### String indexing

In [106]:
print(x)
x[0]

string here


's'

### What if we want more than just a single character?

In [None]:
x[0:2]

### Notice that this slicing is 'exclusive' - it returns the 0th and the 1st element, but not the 2nd

## Can we index from the right side?

In [None]:
print(x)
x[-1]
#x[-3:]
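
Here is a minimal sketch that pulls the indexing and slicing rules above into one place. The variable names are only for illustration and are not part of the lesson's notebook.

```python
# Illustrative sketch: indexing and slicing a string.
word = "string here"

first_char = word[0]      # index 0 is the first character
first_two = word[0:2]     # the start is included, the stop is excluded
last_char = word[-1]      # negative indexes count from the right
last_three = word[-3:]    # from the third-to-last character to the end

print(first_char)
print(first_two)
print(last_char)
print(last_three)
```

Leaving out the number after the colon means "keep going to the end", which is why `word[-3:]` returns the last three characters.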

### Sequences: Lists - [ ]

### A list is a mutable sequence, i.e., the items in it can be replaced. It is denoted with square brackets.

In [109]:
### How do we index 'Dog' in our list?
l = ['Cat', 2, 'Dog', x]
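
To see what "mutable" means in practice, here is a small sketch using a made-up list called `colors` (not the list from the cell above). It replaces one item and appends another.

```python
# Illustrative sketch: lists are mutable, so items can be replaced in place.
colors = ['red', 'green', 'blue']   # hypothetical list for this example

second = colors[1]       # indexing works like strings: counting starts at 0
colors[1] = 'yellow'     # replace the item at index 1
colors.append('purple')  # add a new item to the end

print(second)
print(colors)
```

Trying the same replacement on a string, for example `"abc"[1] = "z"`, raises a `TypeError`, because strings cannot be changed in place.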

# Mappings: Dicts - { } 

Dictionaries are also known as key-value stores.

Like lists they are mutable in that the values for a given key can be replaced.

They are denoted with curly braces.

In [121]:
d = {'key_a': 1, 'key_b': 'dog', 'key_c': [0,1]}
d

{'key_a': 1, 'key_b': 'dog', 'key_c': [0, 1]}

# We get a value back by calling its key

In [187]:
d['key_c']

[0, 1]

# We can add new key/value pairs

In [122]:
d['key_d'] = 2.1
d.items()

[('key_d', 2.1), ('key_a', 1), ('key_c', [0, 1]), ('key_b', 'dog')]
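
The output above looks shuffled because older versions of Python do not keep dictionary keys in the order you added them. The sketch below uses a made-up `prices` dictionary to show replacing a value, adding a key, and looping over the keys in a predictable (sorted) order.

```python
# Illustrative sketch: dictionaries are mutable key-value stores.
prices = {'apple': 0.5, 'bread': 2.25}   # hypothetical data

prices['apple'] = 0.6          # replace the value for an existing key
prices['milk'] = 1.2           # add a new key/value pair
has_eggs = 'eggs' in prices    # membership tests check the keys

# Sorting the keys gives a predictable order on any Python version
for item in sorted(prices):
    print(item, prices[item])

print(has_eggs)
```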

> Check: we covered a lot of types! Can you concat (+) and can you multiply (\*):
>> Integers?

>> Floats?

>> Strings?

>> Lists?

>> Dictionaries?



# Functions

- Reusable snippets of code
- Define the function once 
- Call the function to execute your code as many times as you like
- Can receive inputs and return results

![Function syntax](./intro_python_bootcamp/function_syntax.png)



In [110]:
def simplestFunction():
    print("I made a function")

simplestFunction()

I made a function


In [None]:
# Functions can take parameters

def square(x):
    return x ** 2

square(5)
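
Functions can take more than one parameter, and a parameter can have a default value. The function below, `rectangle_area`, is a made-up example for illustration, not something from the lesson's notebook.

```python
# Illustrative sketch: a function with two parameters, one with a default value.
def rectangle_area(width, height=1):
    """Return the area of a rectangle; height defaults to 1."""
    return width * height

area_a = rectangle_area(3, 4)   # both arguments given
area_b = rectangle_area(3)      # height falls back to its default

print(area_a)
print(area_b)
```

Note the difference between `return` and `print`: `return` hands the value back so you can store it in a variable, while `print` only displays it.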

# Pair programming exercise

### One person dictates the code, one person types

1) Define a function that converts Celsius to Fahrenheit. It returns the converted value, accurate to at least one decimal place.

### When you're done, switch!

2) Update your function to return a sentence (string type) with the Celsius and Fahrenheit values inserted into the string, e.g.: "You input 10. The converted value is …"

_Hint: I've shown you a couple methods you can use to do this..._


# If statements
- Used to execute commands when defined conditions are met
- Contains a conditional statement that has a True/False value
- If the statement is True, a series of commands is executed
- If the statement is False, those commands are skipped

# If statement syntax
![If statement syntax](./intro_python_bootcamp/if_statement_syntax.png)

# Conditionals
![Conditional statements](./intro_python_bootcamp/conditional_statements.png)

In [None]:
# What will you see?

x = 3
if x > 0:
    print(x)

# If-else statements

Run this in your notebook:

```python
temperature = float(input('What is the temperature? '))
```


In [190]:
temperature = float(input('What is the temperature? '))

What is the temperature? 68


In [191]:
if temperature > 70:
    print('Wear shorts.')
else:
    print('Wear long pants.')
print('Get some exercise outside.')

Wear long pants.
Get some exercise outside.


```python
if temperature > 70:
    print('Wear shorts.')
else:
    print('Wear long pants.')
print('Get some exercise outside.')
```

There are two indented blocks: one comes after the `if` heading and is executed when the condition in the `if` heading is true.

This is followed by an `else:` line, followed by another indented block executed when the original condition is false.

In an `if-else` statement exactly one of two possible indented blocks is executed.

# What if we have more than one condition?

The syntax for an if-*elif*-else statement is:

```python
if condition1:
    indentedStatementBlockForTrueCondition1
elif condition2:
    indentedStatementBlockForFirstTrueCondition2
elif condition3:
    indentedStatementBlockForFirstTrueCondition3
elif condition4:
    indentedStatementBlockForFirstTrueCondition4
else:
    indentedStatementBlockForEachConditionFalse
```

![If-elif-else](./intro_python_bootcamp/if_elif_syntax.png)

Type:
```python
# input() works on Python 3; on Python 2.7 the equivalent is raw_input()
x = int(input("Please enter an integer: "))
```


```python
if x < 0:
    x = 0
    print('Negative changed to zero')
elif x == 0:
    print('Zero')
elif x == 1:
    print('One')
else:
    print('More')
```

The if, each elif, and the final else line are all aligned. There can be any number of elif lines, each followed by an indented block.

Exactly one of the indented blocks is executed: the one corresponding to the first True condition.



> Check: what happens if more than one condition in an if-elif-else statement is true?
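
One way to explore that check is to wrap a chain in a function and call it with different values. This is a sketch with a made-up function, `size_label`, and arbitrary cut-offs.

```python
# Illustrative sketch: only the first True branch of an if-elif-else chain runs.
def size_label(n):
    if n > 100:
        return 'big'
    elif n > 10:        # also true when n > 100, but never reached in that case
        return 'medium'
    else:
        return 'small'

print(size_label(500))
print(size_label(50))
print(size_label(5))
```

Because the conditions are checked from top to bottom, the order you write them in matters.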

# For-loops

Python’s `for` statement iterates over the items of any sequence (such as a list or a string), in the order that they appear in the sequence.

The syntax for loops is:

```python
for iterator_name in iterating_sequence:
        …statements…
```




Try:
```python
words = ['plant', 'window', 'defenestrate']
for w in words:
    print(w, len(w))
```


and it prints:
```bash
plant 5
window 6
defenestrate 12
```

> Check: what's going on? Discuss with your neighbor. And then report back.

Any sequence can be your iterating sequence.

Some functions can create particularly useful sequences:

`range(<integer>)`
- Creates a sequence of integers (a list in Python 2, a `range` object in Python 3)
- Starts with zero and each subsequent value is incremented by 1
- The sequence's length equals the input integer
- The last item is the input minus 1, since the sequence starts with zero



`len(<object>)`

- Checks the length of your object
- Useful as an input to your range() function
- For example, with a list: `range(len(mylist))`
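
A short sketch of `range()` and `len()` working together, using a made-up list called `mylist`. Wrapping `range()` in `list()` is only there so the values print the same way on Python 2 and Python 3.

```python
# Illustrative sketch: range() and len() in a for-loop.
mylist = ['a', 'b', 'c']   # hypothetical list

# list() makes the values visible on both Python 2 and Python 3
indexes = list(range(len(mylist)))
print(indexes)

# Use each index to look up the matching item
for i in range(len(mylist)):
    print(mylist[i])
```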

# LUNCH

### LESSON GUIDE
| TIME  | TOPIC  |
|:-:|---|
| 10:00 - 10:15 | Introductions: Who's here? |
| 10:15 - 10:30 | Requirements checklist |
| 10:30 - 11:00 | The command line, .py files and Jupyter |
| 11:00 - 11:15 | BREAK |
| 11:15 - 12:15 | Python: data types, containers and control flow |
| 12:15 - 1:00 | LUNCH |
| 1:00 - 1:30 | **Python practice** |
| 1:30 - 1:50 | The data science ecosystem |
| 1:50 - 2:50 | Pandas: dataframes, series and plots|
| 2:50 - 3:00 | BREAK |
| 3:00 - 3:45 | Python + pandas practice |
| 3:45 - 4:00 |Your next steps |
---

# Independent practice
1. Create a function that checks the type of an input and returns True if the input is numeric (float or integer) or False if it is another data type.
2. Update your temperature conversion function to return an error message if a string is entered instead of a number
3. Create a function that receives a list of numbers as an input, adds 1 to each number and returns the results as a list
4. Update your temperature conversion function to accept a list of Celsius temperatures and return a list of Fahrenheit temperatures

Bonus:
Add error handling to your temperature conversion function!

### LESSON GUIDE
| TIME  | TOPIC  |
|:-:|---|
| 10:00 - 10:15 | Introductions: Who's here? |
| 10:15 - 10:30 | Requirements checklist |
| 10:30 - 11:00 | The command line, .py files and Jupyter |
| 11:00 - 11:15 | BREAK |
| 11:15 - 12:15 | Python: data types, containers and control flow |
| 12:15 - 1:00 | LUNCH |
| 1:00 - 1:30 | Python practice  |
| 1:30 - 1:50 | **The data science ecosystem** |
| 1:50 - 2:50 | Pandas: dataframes, series and plots|
| 2:50 - 3:00 | BREAK |
| 3:00 - 3:45 | Python + pandas practice |
| 3:45 - 4:00 |Your next steps |
---

# The data science ecosystem

- NumPy, Pandas, sklearn, matplotlib, BeautifulSoup, and many more
- Libraries are designed to make specific tasks much easier

# The Python data science ecosystem: key components
![](./intro_python_bootcamp/python_ds_ecosystem.png)

You can use these libraries in your programs by importing them:



In [125]:
import pandas as pd
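
The `as` part of an import gives the library a shorter nickname. Here is a sketch of the same idea using the standard library's `math` module, chosen only because it needs no extra installation. The alias `m` is an arbitrary choice for this example.

```python
# Illustrative sketch: importing a library, with and without an alias.
import math         # use the library's own name
import math as m    # or give it a shorter alias, like `pd` for pandas

root = math.sqrt(16)     # call a function through the library name
same_root = m.sqrt(16)   # the alias points to the same library

print(root)
print(same_root)
```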

![XKCD Python import library](https://imgs.xkcd.com/comics/python.png)

# Where is Python used?

- Very common on data science teams, for analysis and making data products
- Also seen as backend language for websites
- Used in lots of contexts for quick, ad hoc "scripting" 

### LESSON GUIDE
| TIME  | TOPIC  |
|:-:|---|
| 10:00 - 10:15 | Introductions: Who's here? |
| 10:15 - 10:30 | Requirements checklist |
| 10:30 - 11:00 | The command line, .py files and Jupyter |
| 11:00 - 11:15 | BREAK |
| 11:15 - 12:15 | Python: data types, containers and control flow |
| 12:15 - 1:00 | LUNCH |
| 1:00 - 1:30 | Python practice  |
| 1:30 - 1:50 | The data science ecosystem |
| 1:50 - 2:50 | **Pandas: dataframes, series and plots** |
| 2:50 - 3:00 | BREAK |
| 3:00 - 3:45 | Python + pandas practice |
| 3:45 - 4:00 |Your next steps |
---

## Pandas

![Pandas](http://i.imgur.com/OKffmnL.png)

pandas is a Python package providing **fast, flexible, and expressive data structures** designed to make working with “relational” or “labeled” data both easy and intuitive.

## Let's talk about those data structures

The two basic data structures are:
    
- Series
- DataFrame

## Series

A Series is a 1D data structure. A Series always has an index and optionally a name.

In [160]:
my_series = pd.Series([10, 20, 30, 40, 50])
my_series

0    10
1    20
2    30
3    40
4    50
dtype: int64

In [161]:
my_series = pd.Series([10, 20, 30, 40, 50], index=[2012, 2013, 2014, 2015, 2016])
my_series

2012    10
2013    20
2014    30
2015    40
2016    50
dtype: int64

> Check: why are we wrapping those lists in pd.Series()? What is `pd`? 
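
As a sketch of what the index buys you, the example below rebuilds the Series from above and looks values up by label and by position. It assumes pandas is installed, as it is with Anaconda.

```python
# Illustrative sketch: the index lets you look values up by label.
import pandas as pd

my_series = pd.Series([10, 20, 30, 40, 50],
                      index=[2012, 2013, 2014, 2015, 2016])

value_2014 = my_series.loc[2014]   # look up by index label
first_value = my_series.iloc[0]    # look up by position
total = my_series.sum()            # Series come with built-in calculations

print(value_2014)
print(first_value)
print(total)
```

`.loc` always means "by label" and `.iloc` always means "by position", which avoids confusion when the labels are themselves numbers.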

## DataFrame

A DataFrame is a 2D data structure. It also has an index, and each column -- itself a series -- in the DataFrame has a column name.

A DataFrame is similar to a spreadsheet in structure.
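
Before loading real data, here is a sketch of a tiny DataFrame built by hand. The table `people` and its values are made up for illustration.

```python
# Illustrative sketch: building a small DataFrame by hand.
import pandas as pd

# Hypothetical data: each dictionary key becomes a column name
people = pd.DataFrame({
    'name': ['Ana', 'Ben', 'Cy'],
    'age': [31, 45, 27],
})

ages = people['age']            # one column, returned as a Series
n_rows, n_cols = people.shape   # (number of rows, number of columns)

print(people)
print(type(ages))
print(n_rows, n_cols)
```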

<img src="http://i.imgur.com/Z5PAHRQ.png" width=400>

In [164]:
# how to read in a csv
df = pd.read_csv('https://vincentarelbundock.github.io/Rdatasets/csv/car/Salaries.csv')
df

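|   | Unnamed: 0 | rank | discipline | yrs.since.phd | yrs.service | sex | salary |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Prof | B | 19 | 18 | Male | 139750 |
| 1 | 2 | Prof | B | 20 | 16 | Male | 173200 |
| 2 | 3 | AsstProf | B | 4 | 3 | Male | 79750 |
| 3 | 4 | Prof | B | 45 | 39 | Male | 115000 |
| 4 | 5 | Prof | B | 40 | 41 | Male | 141500 |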


In [165]:
df.head()

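|   | Unnamed: 0 | rank | discipline | yrs.since.phd | yrs.service | sex | salary |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Prof | B | 19 | 18 | Male | 139750 |
| 1 | 2 | Prof | B | 20 | 16 | Male | 173200 |
| 2 | 3 | AsstProf | B | 4 | 3 | Male | 79750 |
| 3 | 4 | Prof | B | 45 | 39 | Male | 115000 |
| 4 | 5 | Prof | B | 40 | 41 | Male | 141500 |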


In [163]:
df.describe()

|   | Unnamed: 0 | yrs.since.phd | yrs.service | salary |
|---|---|---|---|---|
| count | 397.0 | 397.0 | 397.0 | 397.0 |
| mean | 199.0 | 22.314861 | 17.61461 | 113706.458438 |
| std | 114.748275 | 12.887003 | 13.006024 | 30289.038695 |
| min | 1.0 | 1.0 | 0.0 | 57800.0 |
| 25% | 100.0 | 12.0 | 7.0 | 91000.0 |
| 50% | 199.0 | 21.0 | 16.0 | 107300.0 |
| 75% | 298.0 | 32.0 | 27.0 | 134185.0 |
| max | 397.0 | 56.0 | 60.0 | 231545.0 |


## What if I want to select an individual column?

In [None]:
df['rank']

## Why did that look different?

In [None]:
type(df)
type(df['rank'])
# type(df[['rank']])

# Let's plot data!

In [None]:
%matplotlib inline
df['salary'].plot(kind='hist')

## What if I want to pick my row and columns together?

In [None]:
# .iloc[rows to select, columns to select]
df.iloc[0:10,2:4]
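
To make the row and column positions easier to check by eye, here is the same idea on a tiny made-up table called `grid`. This is a sketch for illustration, not the salaries data.

```python
# Illustrative sketch: .iloc selects by position, rows first, then columns.
import pandas as pd

# Hypothetical data, small enough to check by eye
grid = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [4, 5, 6],
    'c': [7, 8, 9],
})

top_left = grid.iloc[0:2, 0:2]   # first two rows, first two columns
one_cell = grid.iloc[1, 2]       # second row, third column

print(top_left)
print(one_cell)
```

As with string slicing, the stop position is excluded, so `0:2` means positions 0 and 1.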

## Let's fix that existing index from the csv

In [133]:
df = pd.read_csv('https://vincentarelbundock.github.io/Rdatasets/csv/car/Salaries.csv')
df.head()

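|   | Unnamed: 0 | rank | discipline | yrs.since.phd | yrs.service | sex | salary |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Prof | B | 19 | 18 | Male | 139750 |
| 1 | 2 | Prof | B | 20 | 16 | Male | 173200 |
| 2 | 3 | AsstProf | B | 4 | 3 | Male | 79750 |
| 3 | 4 | Prof | B | 45 | 39 | Male | 115000 |
| 4 | 5 | Prof | B | 40 | 41 | Male | 141500 |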


In [None]:
df = df.iloc[?:,:]
df.head()

## Exercise

Use the dataset in `df` to:
- Select the first and third columns two different ways 
- Using .iloc, select the second column's second data point
- Select the 'discipline' column as a Series and then as a DataFrame
- Plot a histogram of yrs.service
- Read this documentation for the df.plot.scatter() function: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.plot.scatter.html. Can you make a scatterplot of salary vs yrs.service?

# pd.DataFrame.groupby()

When looking at data, we often want to ask the same question for different groupings in our data.

For example, we have salaries and ranks -- what is the *mean salary* for *each* rank?

Pandas describes this as:
- Split: separate into groups, by some variable
- Apply: perform the same operation on each group
- Combine: return the results in a tidy data structure

The typical syntax for `groupby()` is:

### `dataframe.groupby(variable_to_group_by) [columns_to_apply_to] .some_function()`

In [156]:
# For example
df.groupby('rank')['yrs.service'].max()


rank
AssocProf    53
AsstProf      6
Prof         60
Name: yrs.service, dtype: int64
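
The split-apply-combine steps are easier to follow on a table small enough to work out by hand. This sketch uses a made-up `sales` table, and you can check the group means yourself.

```python
# Illustrative sketch: split-apply-combine on a tiny, made-up table.
import pandas as pd

sales = pd.DataFrame({
    'store': ['north', 'north', 'south', 'south', 'south'],
    'amount': [10, 20, 5, 15, 40],
})

# Split by store, apply mean() to each group's amounts, combine into a Series
mean_by_store = sales.groupby('store')['amount'].mean()

print(mean_by_store)
```

The result is a Series whose index holds the group names, so `mean_by_store['north']` gives the mean for that one store.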

In [None]:
# You can tidy up the result with the .to_frame() method
df.groupby('rank')['yrs.service'].min().to_frame()


In [157]:
# Other functions work
df.groupby('discipline')['rank'].nunique()
df.groupby('sex')['salary'].count()


sex
Female     39
Male      358
Name: salary, dtype: int64

# Multiple dimensions and functions

You can make hierarchical groups!

You can also apply multiple functions to your groups, by passing them inside a list as an argument to the `.agg()` method.

In [158]:
# We're importing numpy so we can use its mean function
import numpy as np
df.groupby(['rank','sex'])['salary'].agg([max, min, np.mean])

| rank | sex | max | min | mean |
|---|---|---|---|---|
| AssocProf | Female | 109650 | 62884 | 88512.8 |
| AssocProf | Male | 126431 | 70000 | 94869.703704 |
| AsstProf | Female | 97032 | 63100 | 78049.909091 |
| AsstProf | Male | 95079 | 63900 | 81311.464286 |
| Prof | Female | 161101 | 90450 | 121967.611111 |
| Prof | Male | 231545 | 57800 | 127120.822581 |


![Magic](http://i.imgur.com/YsbKHg1.gif)

# pd.Series.map()

9 times out of 10 we need to change our data in some way after we get it.

For example, we might want to squash yrs.since.phd into three bands, or remove outliers.

The concept of *mapping* a transformation -- applying the same change to every element in a series -- is very powerful.



In [5]:
# Define our function, or use a built-in one
def return_abbrv(sex):
    if sex == "Male" or sex == "Female":
        return sex[0]
    else:
        return "Other"

# Choose the Series to apply the function to
# And map it!
df.head(10)['sex'].map(return_abbrv)


### This is frequently used to change a column or create a new column in a dataframe.

In [180]:
df['sex'][:3]

0    Male
1    Male
2    Male
Name: sex, dtype: object

In [181]:
df['sex'] = df['sex'].map(return_abbrv)
df['sex'][:3]

0    M
1    M
2    M
Name: sex, dtype: object
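
The same pattern handles the "squash into bands" idea mentioned earlier. This is a sketch on a made-up Series: the function name `experience_band` and its cut-offs are arbitrary choices for illustration.

```python
# Illustrative sketch: mapping a banding function over a Series.
import pandas as pd

def experience_band(years):
    """Return a label for a number of years; the cut-offs are arbitrary."""
    if years < 10:
        return 'early'
    elif years < 25:
        return 'mid'
    else:
        return 'late'

years = pd.Series([4, 19, 45])   # hypothetical values
bands = years.map(experience_band)

print(bands.tolist())
```

`.map()` returns a new Series and leaves the original untouched, so assign the result back to a column if you want to keep it.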

### If you want a 1:1 or n:1 mapping, you can also use a dictionary

In [6]:
# Let's reverse what we've done
abbrv_dict = {"M":"Male", "F":"Female"}
df['sex'] = df['sex'].map(abbrv_dict)
df['sex'][:3]


# Once again...



![Magic](http://i.imgur.com/YsbKHg1.gif)

### LESSON GUIDE
| TIME  | TOPIC  |
|:-:|---|
| 10:00 - 10:15 | Introductions: Who's here? |
| 10:15 - 10:30 | Requirements checklist |
| 10:30 - 11:00 | The command line, .py files and Jupyter |
| 11:00 - 11:15 | BREAK |
| 11:15 - 12:15 | Python: data types, containers and control flow |
| 12:15 - 1:00 | LUNCH |
| 1:00 - 1:30 | Python practice  |
| 1:30 - 1:50 | The data science ecosystem |
| 1:50 - 2:50 | Pandas: dataframes, series and plots |
| 2:50 - 3:00 | **BREAK** |
| 3:00 - 3:45 | **Python + pandas practice** |
| 3:45 - 4:00 |Your next steps |
---

# Independent practice

1. Rewrite your function from this morning to convert USD to EUR
2. Let's assume our dataset's salaries are in USD; map them to EUR with your function
3. Change the discipline names: A is Engineering, B is Philosophy.
4. Plot a histogram of the salaries for each discipline's faculty
5. Find the maximum salary for each rank within each discipline.


BONUS: Load some of your own data into a dataframe!

### LESSON GUIDE
| TIME  | TOPIC  |
|:-:|---|
| 10:00 - 10:15 | Introductions: Who's here? |
| 10:15 - 10:30 | Requirements checklist |
| 10:30 - 11:00 | The command line, .py files and Jupyter |
| 11:00 - 11:15 | BREAK |
| 11:15 - 12:15 | Python: data types, containers and control flow |
| 12:15 - 1:00 | LUNCH |
| 1:00 - 1:30 | Python practice  |
| 1:30 - 1:50 | The data science ecosystem |
| 1:50 - 2:50 | Pandas: dataframes, series and plots |
| 2:50 - 3:00 | BREAK |
| 3:00 - 3:45 | Python + pandas practice |
| 3:45 - 4:00 | **Your next steps** |
---

![Questionnaire](./intro_python_bootcamp/ga_feedback.png)

# Next steps: choose your own adventure...

- ### Scraping web data?
- ### Replacing your Excel macros?
- ### Machine learning?

You've made the first steps toward any of these applications, and more.

# A Python 102 curriculum

There is a lot more to learn about Python. I'd recommend prioritizing:
- Python tuple, dictionary and list methods
- Python classes
- pandas joins, .apply(), dataframe creation
- Learning to read libraries' documentation
- Effective searches on Stack Overflow via Google
- Stylistic best practices: follow PEP 8


# Thank you!

Feel free to connect: https://www.linkedin.com/in/winston-featherly-bean-6a050635/
