# Introduction to Python

This introduction to Python is split into two main topics. This part focuses on the first: basic Python syntax.
The second covers tools for data manipulation, analysis, and plotting.

The Python syntax we will cover is:
- loading libraries and modules,
- printing information,
- opening and reading files,
- creating lists,
- functions,
- for loops, and
- control statements (if statements).

IPython notebooks are organized by "cells." Each cell can have its own code and can be run independently and in any order (although cells are usually run top to bottom in a notebook). To run a cell and move to the next cell, press ```Shift+Enter```. To run a cell and stay on that cell, press ```Control+Enter```.

Questions to be discussed in groups are highlighted in <font color='green'>green</font>. If you don't understand a function that is used, try googling something like "python function-name".

## Loading CSVs

Many times, data will come to us in a format called "CSV" or "comma separated values." Generally, these files will contain a "header" row that contains the column names and then a number of rows containing the data entries.

We'll use the 'csv' library that comes with Python.

In [82]:
# Import the csv library. Also, this is how you make a comment!
# Run me with `Shift+Enter`.
import csv
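
To make the "header row plus data rows" layout concrete, here is a minimal sketch that parses a tiny CSV held in a string. It assumes Python 3, and the sample text and the names `sample_text`, `header`, and `entries` are made up for illustration; they are not part of the course data.

```python
import csv
import io

# Hypothetical two-student sample with a header row, invented for this sketch.
sample_text = "ID,MT1,HW\n0,0.5,0.75\n1,1.0,0.25\n"

# io.StringIO lets csv.reader treat a string as if it were an open file.
reader = csv.reader(io.StringIO(sample_text), delimiter=',')

header = next(reader)    # the first row holds the column names
entries = list(reader)   # every remaining row is one data entry

print(header)
print(entries)
```

Note that every value comes back as a string, even the ones that look like numbers.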

Before we load it into Python, have a look at the "student_data.csv" file in a text editor. It's filled with fake student grade data.

<font color='green'>
1. How is the data formatted?<br>
2. How should the data be "read in"?
</font>

After you've answered these questions, run the following cell and look at what is printed. Then run the cell after it to see what the variable "rows" contains.

You can click on the white space to the left of the output to minimize or maximize it.

In [83]:
with open('student_data.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    rows = []
    for row in reader:
        print(row)
        rows.append(row)

['ID', 'MT1', 'MT2', 'HW', 'Final']
['0', '0.6185606717497889', '0.52307659617995528', '0.71269241271932049', '0.81050233677508476']
['1', '0.56688004713495843', '0.49888269232116766', '0.65758470044597461', '0.53792487930500854']
['2', '0.56708374182047172', '1.0', '0.75853960517151209', '0.50327846447710534']
['3', '1.0', '0.57696137570817552', '0.59272832863954539', '0.98736220267195729']
['4', '0.70320063012792033', '0.52947692932883039', '0.70602692943238332', '0.63219431967191186']
['5', '0.85857204437161694', '0.57554747021982611', '0.65759885042622446', '0.65969605145136012']
['6', '0.26562804038956517', '0.50341833772919986', '0.73538757925315901', '0.18463914952768501']
['7', '0.50481450319153964', '0.58556925477668653', '0.66374509008946259', '0.62927115062462169']
['8', '0.88155813339717826', '1.0', '0.73280097137760825', '0.78282276268030981']
['9', '0.70307625737941626', '0.94471049886614944', '0.72690962794784819', '1.0']
['10', '0.72355359915934858', '0.2072430367192815

In [84]:
rows

[['ID', 'MT1', 'MT2', 'HW', 'Final'],
 ['0',
  '0.6185606717497889',
  '0.52307659617995528',
  '0.71269241271932049',
  '0.81050233677508476'],
 ['1',
  '0.56688004713495843',
  '0.49888269232116766',
  '0.65758470044597461',
  '0.53792487930500854'],
 ['2',
  '0.56708374182047172',
  '1.0',
  '0.75853960517151209',
  '0.50327846447710534'],
 ['3',
  '1.0',
  '0.57696137570817552',
  '0.59272832863954539',
  '0.98736220267195729'],
 ['4',
  '0.70320063012792033',
  '0.52947692932883039',
  '0.70602692943238332',
  '0.63219431967191186'],
 ['5',
  '0.85857204437161694',
  '0.57554747021982611',
  '0.65759885042622446',
  '0.65969605145136012'],
 ['6',
  '0.26562804038956517',
  '0.50341833772919986',
  '0.73538757925315901',
  '0.18463914952768501'],
 ['7',
  '0.50481450319153964',
  '0.58556925477668653',
  '0.66374509008946259',
  '0.62927115062462169'],
 ['8',
  '0.88155813339717826',
  '1.0',
  '0.73280097137760825',
  '0.78282276268030981'],
 ['9',
  '0.70307625737941626',
  '0.94
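
The `with open(...)` pattern used above closes the file automatically once the indented block ends. The sketch below illustrates that on a throwaway file so it does not depend on "student_data.csv". It assumes Python 3; the file name `tiny.csv` and the variable names are hypothetical.

```python
import csv
import os
import tempfile

# Create a temporary folder that is deleted when this block ends.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'tiny.csv')

    # Write a tiny stand-in file with a header row and one data row.
    with open(path, 'w', newline='') as outfile:
        outfile.write('ID,MT1\n0,0.5\n')

    # Read it back; newline='' is what the csv documentation suggests in Python 3.
    with open(path, 'r', newline='') as csvfile:
        tiny_rows = list(csv.reader(csvfile, delimiter=','))

# The file was closed as soon as its "with" block finished.
print(csvfile.closed)
print(tiny_rows)
```

The rows stay available in `tiny_rows` after the file is closed because they were copied into a list while the file was still open.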

## For Loops

A common procedure in programming is to loop through a bunch of data or lists and do something specific with each item. Here, we have a bunch of students' grade information. Let's say we want to calculate each student's final grade based on the weighting:
- MT1: 25%
- MT2: 25%
- HW: 20%
- Final: 30%

and print out their final score. Since we want to calculate the weighting a bunch of times, we'll write a function to do that for us.
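
Before writing the function, it can help to work the weighting out once by hand. This sketch applies the four weights to student 0's scores as printed earlier; the variable names are just for illustration.

```python
# Student 0's scores, copied from the printed rows and written as floats.
mt1 = 0.6185606717497889
mt2 = 0.52307659617995528
hw = 0.71269241271932049
final = 0.81050233677508476

# The four weights should add up to 100%.
weights_total = 0.25 + 0.25 + 0.20 + 0.30

# Weighted sum: each score is multiplied by its share of the final grade.
score = 0.25 * mt1 + 0.25 * mt2 + 0.20 * hw + 0.30 * final

print(weights_total)
print(score)
```

The result should agree with the grade printed for student 0 once the function is run on the full data.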

In [85]:
# Function to take grade data and calculate final score
def final_grade(grade_data):
    # Look up "list comprehension" to understand the next line
    float_data = [float(data) for data in grade_data]
    # Weights: MT1 25%, MT2 25%, HW 20%, Final 30%
    return 0.25*float_data[0] + 0.25*float_data[1] + 0.2*float_data[2] + 0.3*float_data[3]

# Look up "slices" in Python to understand what the [1:] syntax is doing.
for student in rows[1:]:
    print('Student: '+student[0]+', Grade: '+str(final_grade(student[1:])))

Student: 0, Grade: 0.671098500559
Student: 1, Grade: 0.559335088745
Student: 2, Grade: 0.694462395833
Student: 3, Grade: 0.808994670457
Student: 4, Grade: 0.639033071652
Student: 5, Grade: 0.687958464169
Student: 6, Grade: 0.394730855239
Student: 7, Grade: 0.594126302697
Student: 8, Grade: 0.851796556429
Student: 9, Grade: 0.857328614651
Student: 10, Grade: 0.477977252849
Student: 11, Grade: 0.673736072074
Student: 12, Grade: 0.647198688275
Student: 13, Grade: 0.676094715117
Student: 14, Grade: 0.624342795016
Student: 15, Grade: 0.773739222054
Student: 16, Grade: 0.498637801837
Student: 17, Grade: 0.743048347749
Student: 18, Grade: 0.781180610652
Student: 19, Grade: 0.658127638686
Student: 20, Grade: 0.565339697635
Student: 21, Grade: 0.510398496892
Student: 22, Grade: 0.654334177592
Student: 23, Grade: 0.792642474647
Student: 24, Grade: 0.712215645783
Student: 25, Grade: 0.550504162074
Student: 26, Grade: 0.642576893539
Student: 27, Grade: 0.609205792664
Student: 28, Grade: 0.64423236
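
The cell above leans on two pieces of syntax: the `[1:]` slice and the list comprehension. Here is a small sketch of each on a made-up row (the values are hypothetical, chosen to be easy to read), with the comprehension written out as an ordinary for loop for comparison.

```python
# A hypothetical row in the same shape as the rows read from the CSV.
student = ['7', '0.5', '1.0', '0.25', '0.75']

# Slice: everything from index 1 onward, which skips the ID at index 0.
grade_strings = student[1:]

# List comprehension: build a new list by applying float() to each item.
grade_floats = [float(s) for s in grade_strings]

# The same thing written as an explicit for loop.
grade_floats_loop = []
for s in grade_strings:
    grade_floats_loop.append(float(s))

print(grade_strings)
print(grade_floats)
print(grade_floats == grade_floats_loop)
```

Both approaches produce the same list; the comprehension is simply more compact.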

## If Statements

Another common procedure is to have different branches in code. This is commonly done with an "if statement." An example of this would be to print "S" if the student gets 60% or above and "U" if they do not.

In [86]:
for student in rows[1:]:
    grade = final_grade(student[1:])
    if grade >= .6:
        letter = 'S'
    else:
        letter = 'U'
    print('Student: '+student[0]+', Grade: '+str(grade)+', '+letter)

Student: 0, Grade: 0.671098500559, S
Student: 1, Grade: 0.559335088745, U
Student: 2, Grade: 0.694462395833, S
Student: 3, Grade: 0.808994670457, S
Student: 4, Grade: 0.639033071652, S
Student: 5, Grade: 0.687958464169, S
Student: 6, Grade: 0.394730855239, U
Student: 7, Grade: 0.594126302697, U
Student: 8, Grade: 0.851796556429, S
Student: 9, Grade: 0.857328614651, S
Student: 10, Grade: 0.477977252849, U
Student: 11, Grade: 0.673736072074, S
Student: 12, Grade: 0.647198688275, S
Student: 13, Grade: 0.676094715117, S
Student: 14, Grade: 0.624342795016, S
Student: 15, Grade: 0.773739222054, S
Student: 16, Grade: 0.498637801837, U
Student: 17, Grade: 0.743048347749, S
Student: 18, Grade: 0.781180610652, S
Student: 19, Grade: 0.658127638686, S
Student: 20, Grade: 0.565339697635, U
Student: 21, Grade: 0.510398496892, U
Student: 22, Grade: 0.654334177592, S
Student: 23, Grade: 0.792642474647, S
Student: 24, Grade: 0.712215645783, S
Student: 25, Grade: 0.550504162074, U
Student: 26, Grade: 0.
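
One detail worth checking in an if statement is what happens exactly at the cutoff. This sketch wraps the same branch in a hypothetical helper, `pass_fail` (not part of the notebook), and tries a grade just below, exactly at, and above 60%.

```python
def pass_fail(grade, cutoff=0.6):
    """Return 'S' at or above the cutoff, otherwise 'U' (hypothetical helper)."""
    if grade >= cutoff:
        return 'S'
    else:
        return 'U'

# Try a grade just below, exactly at, and well above the cutoff.
for g in [0.59, 0.6, 0.85]:
    print(g, pass_fail(g))
```

Because the comparison is `>=`, a grade of exactly 0.6 counts as "S"; with `>` it would count as "U".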