# Introduction to Programming Review

### 1. Python Basics

You'll be using Python for data mining throughout this course. Below are the fundamentals you should already know. If any of them feel unfamiliar, please take time to review the suggested resources and last course's lessons.

#### Variables and Math

Know how to create a variable in Python. As a reminder, a variable is a name that stands for a piece of data, just like in algebra. Also know how to do math with data in Python: like a calculator, Python can add and subtract numbers whether or not they are stored in variables.


In [None]:
# variables
a = 1
my_variable = a + 1

In [None]:
my_variable

In [None]:
my_variable + 3
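
Beyond addition and subtraction, the same calculator-style arithmetic covers the other common operators. Here is a minimal sketch, where `x` and `y` are made-up values chosen only for illustration:

```python
# x and y are hypothetical values, not part of the lesson's data.
x = 7
y = 2

total = x + y         # addition: 9
difference = x - y    # subtraction: 5
product = x * y       # multiplication: 14
quotient = x / y      # true division always gives a float: 3.5
whole_part = x // y   # floor division drops the remainder: 3
remainder = x % y     # modulo gives the remainder: 1
power = x ** y        # exponent: 49

print(total, difference, product, quotient, whole_part, remainder, power)
```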

#### Lists

Know how to use lists in Python. When we manipulate data, we must often change the way it is shaped. This becomes especially important as we deal with more complex data structures. Review what lists can do.

In [None]:
# lists
my_list = [my_variable, 9, 7, 'a', 'this is a longer string', 
           'and another']

In [None]:
my_list[0]

In [None]:
my_list[-1]

In [None]:
# reading exceptions: my_list mixes numbers and strings,
# so sorted() raises a TypeError in Python 3

sorted(my_list)

In [None]:
sorted(my_list[-3:])

In [None]:
sorted(my_list[:-3])

In [None]:
sorted(my_list[:3])
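
If you need to sort a list that mixes numbers and strings, one possible approach is to separate the types first, or to sort everything by its text form. The sketch below uses a hypothetical list named `mixed` that mirrors the shape of `my_list`; the variable names are assumptions made for illustration.

```python
# Hypothetical list with the same shape as my_list above.
mixed = [2, 9, 7, 'a', 'this is a longer string', 'and another']

# Option 1: sort each type separately.
numbers = sorted(x for x in mixed if isinstance(x, (int, float)))
strings = sorted(x for x in mixed if isinstance(x, str))

# Option 2: sort everything by its string form.
# Note that numbers are then compared as text, not by value.
as_text = sorted(mixed, key=str)

print(numbers)
print(strings)
print(as_text)
```

Which option makes sense depends on what the data means; sorting by text is rarely what you want for real numeric columns.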

#### Loops and Logic

Know how to use for loops and logic statements. These can help you figure out how to transform data. Remember, a for loop can iterate over any list or iterable. 


In [None]:
# loops
for element in my_list:
    print(element)

In [None]:
# logic
for element in my_list:
    if isinstance(element, str):
        print(element)
    else:
        print(element / 2)

In [None]:
my_list
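
To illustrate the point that a for loop works on any iterable, here is a small sketch that loops over a string, a range and a dictionary. The `ages` dictionary and all variable names are made up for this example.

```python
# Strings are iterable: the loop visits one character at a time.
letters = []
for character in 'abc':
    letters.append(character)

# range(4) yields the integers 0, 1, 2, 3.
squares = []
for number in range(4):
    squares.append(number ** 2)

# Hypothetical dictionary, used only for illustration.
ages = {'alice': 30, 'bob': 25}
names_over_26 = []
for name, age in ages.items():  # .items() yields (key, value) pairs
    if age > 26:
        names_over_26.append(name)

print(letters, squares, names_over_26)
```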

#### Functions

Know how to create a function. Functions let you reuse the same code many times. Recall that you write a def statement followed by the function's name, then any arguments or keyword arguments in parentheses. Once the function is defined, you can call it in another cell. Remember, if you leave out a required argument you will get an error.

In [None]:
def add_one(x):
    '''Takes an integer or number and adds one.'''
    return x + 1

In [None]:
add_one(4)

In [None]:
# calling without the required argument raises a TypeError
add_one()

In [None]:
add_one?
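
The cells above only use a positional argument. As a sketch of a keyword argument with a default value, consider the hypothetical function `add_amount` below (it is not part of the lesson's code):

```python
def add_amount(x, amount=1):
    '''Takes a number and adds amount to it (1 unless told otherwise).'''
    return x + amount

with_default = add_amount(4)              # amount falls back to 1
with_keyword = add_amount(4, amount=10)   # amount is set by name

# Leaving out the required argument x raises a TypeError.
try:
    add_amount()
except TypeError as error:
    missing_message = str(error)

print(with_default, with_keyword)
print(missing_message)
```

The default makes `amount` optional, while `x` stays required, which is why only the last call fails.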

#### Debugging

Remember how to debug your errors. Sometimes it is as easy as searching the web for the error message, and sometimes the message itself contains a good clue. For better search results, remove the parts of the error message that are specific to the data you are using.

In [None]:
# debugging
add_one('a')

In [None]:
def add_one(x):
    '''Takes a number and adds one; strings are rejected with a message.'''
    if isinstance(x, str):
        print('Sorry, we cannot handle your request.')
        return
    return x + 1

In [None]:
add_one('a')

In [None]:
add_one(1.4)
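
When you want to inspect an error rather than let it stop the notebook, one option is to catch it and look at its type and message separately. This is a minimal sketch; `add_one_unchecked` is a hypothetical copy of the first `add_one`, without the string check.

```python
def add_one_unchecked(x):
    '''Takes a number and adds one, with no type checking.'''
    return x + 1

try:
    add_one_unchecked('a')
except TypeError as error:
    # The error type and the generic message are the parts worth searching for.
    error_type = type(error).__name__
    error_message = str(error)
    print(error_type, '-', error_message)
```

The exact wording of the message can differ between Python versions, so the error type is usually the most stable search term.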

### 2. Statistics in Python

You'll be using statistical libraries in Python for data mining throughout this course. Below are the libraries and concepts you should already know. If any of them feel unfamiliar, please take time to review the suggested resources and last course's lessons.

#### Statistical Aggregates in Pandas and Numpy

You should remember how to calculate statistics like mean, median, mode, interquartile range and correlation with dataframes or arrays in Python. 

In [None]:
import pandas as pd
import numpy as np
%matplotlib inline

df = pd.read_csv('../data/full_titanic.csv')

In [None]:
df.describe()

In [None]:
np.median([1, 1, 1, 4, 7, 7, 8, 10])
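
As a quick refresher on the aggregates listed above, the sketch below computes each one on a tiny DataFrame. The `toy` data and its column names are made up for illustration and are not taken from the Titanic file.

```python
import pandas as pd

# Made-up numbers, purely for illustration.
toy = pd.DataFrame({
    'age':  [22, 38, 26, 35, 35, 54],
    'fare': [7.25, 71.28, 7.93, 53.10, 8.05, 51.86],
})

mean_age = toy['age'].mean()
median_age = toy['age'].median()
mode_age = toy['age'].mode()   # a Series, since there can be several modes

# Interquartile range: third quartile minus first quartile.
q1, q3 = toy['age'].quantile([.25, .75])
iqr_age = q3 - q1

# Pearson correlation between the two columns.
age_fare_corr = toy['age'].corr(toy['fare'])

print(mean_age, median_age, list(mode_age), iqr_age, age_fare_corr)
```

Calling `toy.corr()` instead would return the full correlation matrix for all numeric columns.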

#### Data Visualization Basics

You should remember how to visualize your dataframe or array quickly using Python to see if there are particular outliers or trends. 

In [None]:
df['Age'].hist()

In [None]:
df['Pclass'].value_counts().plot(kind='pie')

In [None]:
df.plot(x='Fare', y='Age', style='o')

In [None]:
fare_first_quartile, fare_third_quartile = df['Fare'].quantile(
    [.25, .75])
fare_iqr = fare_third_quartile - fare_first_quartile
fare_mean = df['Fare'].mean()

In [None]:
fare_iqr

In [None]:
outliers = df[(df['Fare'] > fare_third_quartile + 1.5 * fare_iqr) | 
              (df['Fare'] < fare_first_quartile - 1.5 * fare_iqr)]

In [None]:
outliers.shape

In [None]:
df.shape
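
The outlier cells above follow the common 1.5 × IQR rule. If you expect to reuse that rule on other columns, one way to package it is a small helper. The function name `iqr_outliers`, its parameter `k` and the `toy_fares` data are assumptions made for this sketch.

```python
import pandas as pd

def iqr_outliers(values, k=1.5):
    '''Return the entries of a pandas Series that lie more than
    k * IQR below the first quartile or above the third quartile.'''
    q1, q3 = values.quantile([.25, .75])
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return values[(values < lower) | (values > upper)]

# Made-up fares with one obviously extreme value.
toy_fares = pd.Series([7, 8, 8, 9, 10, 11, 12, 250])
flagged = iqr_outliers(toy_fares)
print(flagged)
```

With the Titanic data loaded, the same helper could be called as `iqr_outliers(df['Fare'])`.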

#### Resources for Review

- [DataCamp: Introduction to Pandas](https://www.datacamp.com/courses/pandas-foundations)
- [Wes McKinney's Pandas Book](http://wesmckinney.com/pages/book.html)
- [Julia Evans's Pandas Cookbook](https://github.com/jvns/pandas-cookbook)

Many more tutorials on YouTube! (Try searching for PyData or SciPy conference tutorials!)

### 3. Debugging, Questioning and Exploring Fundamentals

As in the previous course, you will need to debug code, make things work and piece together bits from different lessons into a working piece of code, notebook or script. You will be expected to search for and find answers to error messages. You will be expected to question the data and figure out what to ask when presented with a new dataset. You will also want to find new questions and ideas through data exploration, and perhaps by exploring the topic further in research and books.

### 4. Challenging One Another

We will continue with group projects and group chat. Even if in our previous course you did not use the group channel often, I highly encourage you to do so this year. By helping one another via projects and assignments, you can garner a deeper understanding of the topics. 
