# Day 1: Introduction to Python 2.7 and the IPython Notebook

### by Justin B. Kinney

Today's tutorial gives an introduction to only the most basic features of Python 2.7 and the Jupyter IPython Notebook. The programming topics covered are **variables, numbers, boolean numbers, strings, lists, dictionaries, conditionals, loops, and functions.**

## About Python

Python is a very flexible language. You can use it to analyze data on the fly, write small analysis scripts, or write major software applications. I program almost exclusively in python in my day-to-day work. 

Python is a user-friendly language. It is designed for maximum readability. More than almost any language out there, valid Python code looks like pseudocode. Many aspects of the language help to enforce clarity. There is also a set of conventions, called the PEP 8 Style Guide, that most programmers have adopted in order to make their code as clear as possible. I recommend trying to follow PEP 8 as much as possible, if only to make your code easier to decipher when you come back to it after a year or two of disuse. 

Python is a rapid prototyping language. It is designed to allow you to write programs as quickly and as painlessly as possible. One feature of Python that enables rapid prototyping is that it is an "interpreted" language: each line is executed by the Python "interpreter" one after the next. This is different from C, C++, or Java, which require that programs first be "compiled", i.e. translated in their entirety from source code to machine code or bytecode, before they are run. 

Python is a very well-supported language. There is an enormous user base for python, and a large number of mature packages are now available for a wide variety of tasks. This was NOT the case 12 years ago when I first started learning Python. But now I am able to use Python for virtually all of my programming needs. And when I have a question about how to code something (which is about once every 2 minutes), Google is usually able to satisfy my question within a minute or two. 

The official Python website is https://www.python.org/. Lots of good documentation and tutorials can be found there. Another great resource for programmers is Stack Overflow at http://stackoverflow.com/. Just google a question about Python programming, and the first hit will likely be a post on Stack Overflow that answers your question. 

**Beware: There are 2 mutually incompatible versions of Python in widespread use: Python 2.7 and Python 3.5. Most packages have been written for 2.7, and this is the version used by most scientists. We will use 2.7 throughout this tutorial. Unless you know what you are doing, I suggest not even installing Python 3.5 on your computer.**

## The IPython Notebook

This is an **IPython Notebook**. You can learn more about these notebooks on the Jupyter project website, http://jupyter.org/. IPython notebooks provide a very convenient interface to Python. They allow you to include fully functional Python code inside of a document that contains "markdown" (i.e. text like this) and figures that show analysis results. Note how the menu above allows you to specify whether each cell is a "code" cell (i.e. contains Python code) or a markdown cell. This type of hybrid document provides a very powerful method of interactive data analysis. I strongly suggest using IPython Notebooks for most of your data analysis tasks. Only when you start writing production code does it make sense to move away from the Notebook format. 

All of my IPython notebooks begin with the following series of mysterious incantations. I will go through each of these lines later in the tutorial. In the meantime, it's best to just take the content of this cell on faith. Execute this code by clicking on it and pressing Shift+Enter. 

In [None]:
# Always put this first
%matplotlib inline
from __future__ import division
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

The "Hello World!" program, used in many language tutorials to illustrate the simplest program, is indeed very simple. Here it is. Again, execute it by pressing Shift+Enter.

In [None]:
print "Hello World!" # The Hello World program!

All this program does is display the text "Hello World!". This is done using print, which is a statement in Python 2.7 (in Python 3 it became a function). The text "Hello World!" is what gets printed. The text itself is what is called a "string". Print is the single most useful command in Python, allowing one to inspect the contents of many variables.

The hash mark # signifies the beginning of a comment. All text following # on a line is ignored by the interpreter. 

You can also run Python scripts, along with command line commands, from within the notebook:

In [None]:
# Run the program stored in the file hello_world.py
%run hello_world.py

In [None]:
# Display the contents of the file hello_world.py (UNIX only)
%cat hello_world.py

In [None]:
# List the files in the current directory (UNIX only)
%ls -lah

In [None]:
# You can plot stuff in an IPython notebook. Don't worry about the commands for now. 
x = np.arange(-10,10,0.01)
y = np.sin(x)
plt.plot(x,y)

In [None]:
# You can load pictures! Don't worry about the commands for now. 
from IPython.display import Image
Image(filename='cshl.jpg') 

## Variables

Variables are containers that hold different numerical values, text values, etc. The = sign is used to set the value of a variable. Unlike in Java or C, variables do not have to be declared before a value is assigned to them. 

In [None]:
# Assign values to variables
an_int = 5           # An integer
a_float = 4.5        # A floating point number
a_bool = True        # A boolean variable
a_string = "Justin"  # A string
b_string = 'Justin'  # A different way to write a string
a_multiline = """   
I am a paragraph.       
I have multiple lines.
""" # A multiline string

# Now print all this stuff
print an_int
print a_float
print a_bool
print a_string
print b_string
print a_multiline

## Numbers

Mathematical operations are performed using familiar symbols (and perhaps a few unfamiliar ones):

In [None]:
x = 5
y = 2

print x+y     # Addition
print x-y     # Subtraction
print x*y     # Multiplication
print x/y     # Division
print x**y    # Power
print x%y     # Modulus
print x//y    # Floor division

To change the value of a variable, the following notation is useful

In [None]:
x = 5
print x

x += 2  # Add 2 to x
print x

x *= 2  # Multiply x by 2
print x

x /= 2  # Divide x by 2
print x

x -= 2  # Subtract 2 from x
print x

x %= 2  # Take x mod 2
print x

Order of operations matters in mathematical expressions. When in doubt, use parentheses.

In [None]:
print 3+4*5
print 3+(4*5)
print (3+4)*5 

Every variable has a "type". You can determine this type within code

In [None]:
x = 5
y = 2

print type(x) 
print type(y) 
print type(x*y) 
print type(x/y)

# Use isinstance to test what type a variable is
print isinstance(x,int)

Notice that Python is smart enough to realize that, even though x and y are integers, x/y should be a floating point number (that is, a number with a decimal point). Python 3.5 knows to do this automatically, but Python 2.7 (which we are using) does not: by default it discards the fractional part when dividing two integers. The incantation 
```
from __future__ import division
```
that we put at the top of this notebook makes Python 2.7 act like 3.5 in this regard.
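
If you ever work in a Python 2.7 session where that import is missing, a minimal sketch of two alternatives follows: converting one operand to a float forces true division, and `//` asks for floor division explicitly. The variable names below are just for illustration, and the snippet is written so that it behaves the same under Python 2.7 and Python 3.

```python
x = 5
y = 2

# Converting one operand to float forces true division,
# with or without the __future__ import.
true_quotient = float(x) / y

# Floor division discards the fractional part.
floor_quotient = x // y

# Floor division rounds toward negative infinity, not toward zero.
negative_floor = -5 // 2

print(true_quotient)
print(floor_quotient)
print(negative_floor)
```

Note that `-5 // 2` gives -3 rather than -2, which occasionally surprises people coming from C.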

## Boolean numbers

Booleans are numbers that can take on only two values: True and False. Their operators are as follows:

In [None]:
# Remember to capitalize True and False
a = True
b = False

print a and b    # And
print a or b     # Or
print not a      # Not

Booleans are typically used to store the results of different tests. 

In [None]:
x = 5
y = 2
a = True
b = False

print x > y 
print x >= y 
print x == y 
print x != y 
print (x != y) and a 
print (x < y) or (x != y)

End Day 1
================

## Strings

Python makes working with strings very easy, especially when compared to C, C++, or Matlab. 

In [None]:
a = "foo"
b = "bar"
c = "foo bar baz"
d = [a,b,c]

print a + b + c    # Concatenate
print a*5          # This is actually useful sometimes
print a == b       # Test equality
print b in c       # Test if c contains b
print '-'.join(d)  # Join strings together with a specified separator
print c.split()    # Split strings based on white space

Indexing allows one to extract different characters from the string

In [None]:
s = 'Justin'

print s        # Show the entire string
print s[0]     # First character, corresponds to index 0
print s[-1]    # Last character
print s[:2]    # The first two characters
print s[-2:]   # The last two characters
print s[::2]   # Every other character
print s[::-1]  # Reverse the string
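
All of the slices above are special cases of the general form `s[start:stop:step]`, where the character at `stop` is excluded and any omitted field takes a default. The sketch below spells a few of them out; the variable names are only illustrative, and the parenthesized `print` is used so the snippet runs unchanged under Python 2.7 and Python 3.

```python
s = 'Justin'

first_two = s[0:2:1]          # Same as s[:2]
middle = s[1:4]               # Indices 1, 2, 3 (index 4 is excluded)
every_other = s[0:len(s):2]   # Same as s[::2]
reversed_s = s[::-1]          # A negative step walks backwards

print(first_two)
print(middle)
print(every_other)
print(reversed_s)
```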

## Lists

In the previous example we used "lists" (e.g. the variable d). Lists are arrays of Python variables (or other objects), kept in a well-defined order. The elements in a list can be of all different types. The elements of a list are accessed using brackets:

In [None]:
v = [1, 'hi', [True,False], 57.3]

print v        # Show the entire list
print v[0]     # First element, corresponds to index 0
print v[-1]    # Last element, same as v[3]
print v[:2]    # The first two elements
print v[-2:]   # The last two elements
print v[::2]   # Skip every other element
print v[::-1]  # Reverse the list
print 'hi' in v   # Test whether list contains an element

# Change an element of v
v[0] = 42
print v

# Add an element to the end of v
v.append('x')
print v

# Insert an element in v
v.insert(2,'y')
print v

# Append a second list to v
v.extend([0,1,2])
print v

# Delete an element of v
del v[1]
print v
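
One common point of confusion is the difference between `append` and `extend` when the argument is itself a list. Here is a small sketch using two throwaway lists `a` and `b` (names chosen only for this illustration):

```python
a = [1, 2]
a.append([3, 4])   # Adds the list itself as ONE new element

b = [1, 2]
b.extend([3, 4])   # Adds each element of the list separately

print(a)
print(b)
print(len(a))
print(len(b))
```

After these calls `a` holds three elements, the last of which is a nested list, while `b` holds four plain numbers.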

## Dictionaries

Dictionaries are one of Python's most useful datatypes. They can be thought of as a collection of key-value pairs, in which each value can easily be looked up via its key. A key can be any immutable ("hashable") object, such as a number or a string.

In [None]:
# Define a dictionary
d = {'A':'Justin', 5:'Python','B':5}
print d

# Access elements
print d['A']
print d[5]

# Add an element to the dictionary
d['foo'] = 'bar'
print d

# Remove an element from the dictionary
del d['B']
print d

# Get list of dictionary keys
print d.keys()

# Get list of dictionary values
print d.values()
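
Looking up a key that is not in the dictionary raises an error, so it is often worth checking first. The sketch below uses a made-up dictionary `ages` (hypothetical data, not part of the tutorial) to show the `in` test, the `get` method with a default value, and a loop over the keys.

```python
ages = {'alice': 31, 'bob': 27}   # Hypothetical example data

has_alice = 'alice' in ages        # Test whether a key is present
has_carol = 'carol' in ages

# get() returns the second argument instead of raising an error
carol_age = ages.get('carol', -1)

# Loop over the keys in sorted order and look up each value
for name in sorted(ages):
    print(name + ' is ' + str(ages[name]))

print(has_alice)
print(has_carol)
print(carol_age)
```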

## If, elif, else blocks

If blocks allow blocks of code to be executed only under specific conditions.

In [None]:
x = 5
y = 6

if x==y:
    print 'In block 1'
    print 'They are equal!'
elif x>y:
    print 'In block 2'
    print 'x is more than y'
else:
    print 'In block 3'
    print 'y is more than x'

Note the indentation within each code block. It is essential that all code within the same block have the same indentation level. For instance, the following code won't even run.

In [None]:
if x==y:
    print 'In block 1'
    print 'They are equal!'
elif x>y:
    print 'In block 2'
     print 'x is more than y'  # Notice the extra space here
else:
    print 'In block 3'
    print 'y is more than x'

PEP 8 style specifies that code blocks should be indented not with tabs but with **4 spaces**. This makes code maintenance a lot easier. I strongly recommend you adhere to this convention.

## Loops

In some fundamental sense, "loops" are what make a program a program. As in most programming languages there are two primary kinds of loops: "for" loops and "while" loops. 

In [None]:
# Print characters in a string one-by-one
s = 'Hi there URPs!'
for c in s:
    print c

In [None]:
# Print all numbers from 0 to 9
print range(10)

for x in range(10):
    print x

In [None]:
x = 1
while x < 3:
    x *= 1.1
    print x

When using while loops, make very sure that your loop will actually end at some point. If your loop continues without end, go to "Kernel -> Interrupt" in the menu above. If your computer still acts strange, select "Kernel -> Restart". You will then have to evaluate your ipython notebook from the beginning. 
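
One defensive habit, sketched below, is to add a step counter to the loop condition so that the loop stops even if the main condition never becomes false. The names `num_steps` and `max_steps` and the limit of 1000 are arbitrary choices for this illustration.

```python
x = 1
num_steps = 0
max_steps = 1000   # Arbitrary safety limit chosen for this sketch

# Stop when x reaches 3 OR when the safety limit is hit
while x < 3 and num_steps < max_steps:
    x *= 1.1
    num_steps += 1

print(num_steps)
print(x)
```

If the loop ever exits with `num_steps` equal to `max_steps`, that is a hint that the main condition is not behaving as expected.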

## Functions

Finally, we illustrate how to define a function. Instead of defining a single line function (which is readily done), the following example illustrates various good practices

In [None]:
def factorial(n):
    """Returns n factorial. n must be a nonnegative integer.""" # This is a "doc string"
    
    # Throw an error if n does not have the right form
    assert isinstance(n,int),'Input is not an integer' 
    assert n >= 0,'Input is not nonnegative' 
    assert n <= 1000,'Input is too large!'
    
    # Initialize return variable
    val = 1
    
    # Loop over i=1,2,...,n
    for i in range(1,n+1):   
        val *= i
        
    return val  # Returns val to the user

We test this function by computing n! for n=0,1,...,9

In [None]:
for n in range(10):
    print str(n) + '! is ' + str(factorial(n))

Just as important as making sure functions correctly process valid input is making sure they FAIL when provided with invalid input. Before a function does anything, it should test the validity of its input

In [None]:
# This should fail
print factorial(1.1)

In [None]:
# This should fail
print factorial(-10)

In [None]:
# This should fail
print factorial("I'm not even a number!")

In [None]:
# Also worth testing boundary cases
print factorial(1000)
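
The failing cells above each stop with an error. If you would rather check all the bad inputs in one cell, one possible approach is to catch the `AssertionError` with a try/except block, a construct this tutorial does not otherwise cover. The sketch below repeats a condensed copy of `factorial` so that it is self-contained; the list `bad_inputs` and the list `messages` are names invented for this example.

```python
def factorial(n):
    """Returns n factorial. n must be a nonnegative integer."""
    assert isinstance(n, int), 'Input is not an integer'
    assert n >= 0, 'Input is not nonnegative'
    assert n <= 1000, 'Input is too large!'
    val = 1
    for i in range(1, n + 1):
        val *= i
    return val

bad_inputs = [1.1, -10, "I'm not even a number!"]
messages = []

for bad in bad_inputs:
    try:
        factorial(bad)
    except AssertionError as e:
        # The failed assertion's message is stored in the exception
        messages.append(str(e))
        print('Rejected ' + repr(bad) + ': ' + str(e))
```

Keep in mind that assert statements are skipped when Python is run with the -O flag, so they are best thought of as a development aid.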

The docstring is accessible from within python, and is often very useful. Execute the following command and a window will pop up that describes what this function does.

In [None]:
factorial?
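
The `?` syntax only works inside IPython. In a plain Python script, the same text is stored in the function's `__doc__` attribute, and the built-in `help(factorial)` displays it as well. A minimal sketch, again with a condensed copy of `factorial` so that it runs on its own:

```python
def factorial(n):
    """Returns n factorial. n must be a nonnegative integer."""
    val = 1
    for i in range(1, n + 1):
        val *= i
    return val

# The docstring is stored as an ordinary string on the function object
doc = factorial.__doc__
print(doc)
```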