# Python: syntax and concepts

This is a tutorial on basic Python syntax and concepts for the [KIPAC computing boot camp](http://kipac.github.io/BootCamp).

Author: [Sean McLaughlin](https://github.com/mclaughlin6464)

----
## Part 0: Basic syntax:

Python syntax differs from most other mainstream languages, sometimes in subtle ways.

As a result, otherwise skilled programmers often write very bad Python code.

Writing slick, clean Python code is easy once you know the basics. Unlike languages like C++ or Java, Python has:

- No semicolons (;). Each line needs to be a complete statement, unless it is continued with a backslash (\).
- Dynamic typing. Variables are created on assignment, with no declaration.
- Nested statements opened by colons (:), with structure specified by **indentation**.
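
The three points above can be seen together in a small sketch. The variable names are made up for illustration, and the code sticks to syntax that runs under both Python 2 and Python 3.

```python
# A colon opens a block; indentation (four spaces here) defines its body.
total = 0
for n in [1, 2, 3]:
    if n % 2 == 1:       # nested block: indented one level further
        total += n       # runs only for odd n
# Back at the left margin: the loop has ended.

# A backslash continues a statement on the next line.
long_sum = 1 + 2 + \
           3 + 4

print(total)     # 4
print(long_sum)  # 10
```

Note that neither variable was declared before use, and no line ends in a semicolon.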

Let's start with something simple: The Zen of python. Press Shift+Enter to evaluate the cell below.

In [19]:
import this

This encapsulates the design ideas of python. If something is complex or obtuse, python probably has a better way to do it! 

Not everything will be covered here, so make sure to check out the [python docs](https://docs.python.org/2/) to learn more! 

Let's start with a 'hello world' in the cell below! 

In [20]:
print 'Hello world!'

Hello world!


-----
### Part 1: Simple Types

All variables in python are dynamically typed; no declaration necessary! Here are the built-in data types in python.

In [21]:
#PSST. I'm a comment! I don't affect anything!
#These are all 6 of the built-ins in python
x =1 #int
print x
x = 2.2 #float
print x
x = 10**4L #Long
print x
x  = 1.0J #complex
print x
x = 'STRANG' #string
print x
x = True #bool
print x

1
2.2
10000
1j
STRANG
True


Certain operations will cast variables from one type to another; it's also possible to manually cast them. 

In [22]:
s= '10'
print s
y = int(s)
print y+10
print y*3.0 #cast as float
print long(y)

10
20
30.0
10


One other thing worth mentioning: python's simple types are **immutable**. You can't change a value in place; assigning to a variable just points the name at a different object.

In [23]:
#the id function returns the address of the variable in memory
x = True
print 'X is True id:  %d'%id(x)#I'll explain string formatting and printing in a second!
x = False
print 'X is False id: %d'%id(x)
y = False
print 'Y is False id: %d'%id(y)

X is True id:  9431952
X is False id: 9431920
Y is False id: 9431920


Why have multiple bools when they can only ever be True or False?

In [24]:
#Same is true for all simples!
x = 1
print id(x)
x+=1
print id(x)
print id(2)

14754168
14754144
14754144



Details on all of python's built-ins can be found [here](https://docs.python.org/2/library/stdtypes.html). I'm going to shy away from droning on about the details of the syntax; what you want to know you can look up there. I'll just make a few key mentions.
 * Longs have unlimited size (limited only by the size of your memory). If you've never heard of scientific notation, go nuts.
 * `bool` is a subclass of `int`. When you cast `True` to an `int`, you get `1`. What do you think `False` does?
 * Most types have an interesting `bool` casting that can be used in control flow. More on that in a bit.
 * The `is` operator is kinda cool, but it compares memory addresses (identity), not equality. Sometimes this is the same thing and sometimes it's not. The cell below prints `True` only because the interpreter happens to reuse small integer objects, so don't rely on `is` to compare values.
 

In [25]:
x = 2
print x is 2

True
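
To make the last three bullets concrete, here is a minimal sketch. It uses lists to show the difference between `is` and `==`, because two equal lists are reliably separate objects. The variable names are illustrative, and `print(...)` is called with a single argument so the cell runs under both Python 2 and Python 3.

```python
# bool is a subclass of int, so True and False behave like 1 and 0.
print(int(True))                   # 1
print(True + True)                 # 2
n_true = sum([True, False, True])  # counts the True values
print(n_true)                      # 2

# Truthiness: empty containers, zero and None count as False.
print(bool([]))    # False
print(bool([0]))   # True (a non-empty list, whatever it holds)
print(bool(''))    # False

# `==` compares values; `is` compares identity (same object in memory).
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name for the same object
print(a == b)   # True
print(a is b)   # False
print(a is c)   # True
```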


### Part 1.1: String Aside

Strings are different enough from the numeric types that they warrant a little bit of their own discussion. A string can be created using either double or single quotes, so you can use one kind of quote to put the other kind inside the string. Escapes are also possible.

In [26]:
print "I'm a string!"
print 'I\'m a string too!'

I'm a string!
I'm a string too!


One of the most important things with strings is formatting. This is done using the `%` sign and the right letter. Once again, details [here](https://docs.python.org/2/library/stdtypes.html#string-formatting). Here's a few short examples of printing fancy strings.

In [27]:
print "Look it's a one: %d"%1
print "Look it's a one followed by a float 2!: %d, %f"%(1, 2.0)
pi = 3.141592653
print "And finally, pi to 3 digits of precision is %.3f ya know?"%pi

Look it's a one: 1
Look it's a one followed by a float 2!: 1, 2.000000
And finally, pi to 3 digits of precision is 3.142 ya know?


One of the really interesting things about python strings is that they're kinda like container types. I'll be going into great detail about those in the next section, but here are a few examples of things you can do. I'm also going to use some of the cool string methods/functions. There's a list of them [here](https://docs.python.org/2/library/stdtypes.html#string-methods). There are a **ton** of them and they're really useful. I suggest you familiarize yourself with them. 

In [28]:
s = 'supacalifragilisticexpialidocious'
for char in s: #go character by character
    print char,
print '\n' #newline
    
print s[0] #first char
print s[-2] #second to last char
print s.capitalize()
print s.split('i') #remove i's and partition into chunks around them

s u p a c a l i f r a g i l i s t i c e x p i a l i d o c i o u s 

s
u
Supacalifragilisticexpialidocious
['supacal', 'frag', 'l', 'st', 'cexp', 'al', 'doc', 'ous']


### Exercise 1
#### Answers at the bottom!

What is the first digit of `4**16-27`? How about the last?

------
## Part 2: Container Types

Now that you know all there is to know about the basics, you can put them places! There are four container types in python:
 * lists
 * tuples
 * sets
 * dictionaries

They have a lot of similarities, but let's break them down one by one.

#### Lists
Lists are like arrays in other languages, but with 2 major differences.

1) They can hold data of any type.
They're meant to hold data of the same type, like the names of the files in a directory or data from a voltage source. However, they can hold data of any type. Look!

In [29]:
x = [1,2,3,4,5,6]
y = ['potato', 3, True, 89.6]
print x, y

[1, 2, 3, 4, 5, 6] ['potato', 3, True, 89.6]


2) They are dynamically sized.
This is **AWESOME**. No more worrying about allocating memory in advance.

In [30]:
x = []
x.append(1)
print x
x.extend([2,3,4])
x.append('I LOVE APPENDING')
print x

[1]
[1, 2, 3, 4, 'I LOVE APPENDING']


There are also performance costs with this; those are laid out [here](https://wiki.python.org/moin/TimeComplexity) along with the time complexity of a few other operations I'll show you below. 

As for other regular list stuff, check out the examples below. The rest of the list functions are [here](https://docs.python.org/2/tutorial/datastructures.html). As per my usual spiel, there are many and they are awesome.

In [31]:
arr = [1,8,2,-1, 37,140, -1]
print arr[1] #access the second element
print arr[1:4]#this is slicing. It cuts the array from the second to the 4th element
print len(arr)# get the array's length
arr.sort() #sort the array in place
print arr
arr.remove(1)#remove the first element valued 1
print arr

8
[8, 2, -1]
7
[-1, -1, 1, 2, 8, 37, 140]
[-1, -1, 2, 8, 37, 140]


We'll be revisiting lists in a bit after we cover control flow structures. They're pretty integral to the layout of `for` loops.

#### Tuples

Tuples are...weird. They are like lists in a lot of ways, except for one major one.

In [32]:
tup = (1,2,3,4,5)
print tup[0] #1st elem
print len(tup) #length
tup[0] = -1#modify

1
5


TypeError: 'tuple' object does not support item assignment

The last line raises a `TypeError`: `'tuple' object does not support item assignment`.

Tuples differ from lists in that tuples are **immutable**. Once they're set, they can't be changed. This may seem a little bit odd at first, but there are reasons for this. Tuples are not supposed to be homogeneous sequences like lists; they are heterogeneous. Each element represents a different idea, like Cartesian coordinates. Their properties also give them some unique roles:
 * They are hashable (as long as their elements are), so they can be used as keys for dicts or as elements of sets (see that in a sec)
 * They can be used to have functions return multiple values at once (see that in 2 secs)
 * Tuple unpacking is crazy cool.
 
The last one I want to elaborate on. As it runs, python automagically evaluates comma-separated things as tuples. 

In [33]:
x = 1,2,3,4,5
print x
print type(x)

(1, 2, 3, 4, 5)
<type 'tuple'>


This also works in reverse: you can unpack a tuple into individual variables.

In [34]:
a,b,c,d,e = x
print a,b,c,d+e

1 2 3 9


By combining these 2 effects you encounter something truly unique. In other languages, variable swapping takes a 3rd temp variable like so:

In [35]:
a = 1
b =2
tmp = a
a = b
b = tmp
print a,b

2 1


Python says no more! One line variable swaps!

In [36]:
a, b = 1,2
#GAHHHHHHH
b,a = a,b
print a,b

2 1


The future is here!

#### Sets

Sets behave like mathematical sets. 

In [37]:
a = set([1,2,3])
b = set([3,4,5])
print a | b #a U b
print a & b # a intersect b
print a-b #a-b
a.add(1) #add an element already in a
print a #duplicates not allowed

set([1, 2, 3, 4, 5])
set([3])
set([1, 2])
set([1, 2, 3])


Sets can only store hashable objects, which for the built-in types means immutable ones. You can't store a list, but you can store a tuple (as long as everything inside it is hashable too). This is because sets are ***hash driven***. The details of how that works aren't too important; just remember that the mutable built-in containers (lists, sets and dicts) are not hashable. 

The most important thing about sets is that membership checks take constant time on average. Check it out below.
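
As a quick illustration of the hashing restriction, the sketch below adds a tuple to a set and then tries the same with a list. The name `points` and the `failed` flag are just for this example.

```python
points = set()
points.add((0, 1))      # a tuple is hashable, so this works
points.add((0, 1))      # adding it again changes nothing
print(len(points))      # 1

try:
    points.add([0, 1])  # a list is mutable and unhashable
except TypeError as err:
    failed = True
    print('TypeError: %s' % err)
else:
    failed = False
```

The `try`/`except` is only there so the cell finishes cleanly; exceptions are covered further down.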

In [38]:
for n in xrange(1,7):#We're doing control flow structures next!
    N = 10**n
    print 'Membership is size %d'%N
    l = range(N)#make a list of size N
    s = set(l) #note that the creation of a set is O(N), so you need to be careful about how you do it.
    print 'List:'
    #I'm using some ipython magic here; I'll get to those later too.
    %timeit N/2 in l#check membership
    print 'Set:'
    %timeit N/2 in s#check membership
    print '_'*20

Membership is size 10
List:
10000000 loops, best of 3: 93.1 ns per loop
Set:
10000000 loops, best of 3: 59.7 ns per loop
____________________
Membership is size 100
List:
The slowest run took 4.34 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 440 ns per loop
Set:
10000000 loops, best of 3: 59.5 ns per loop
____________________
Membership is size 1000
List:
100000 loops, best of 3: 3.86 µs per loop
Set:
10000000 loops, best of 3: 73.9 ns per loop
____________________
Membership is size 10000
List:
10000 loops, best of 3: 37.4 µs per loop
Set:
10000000 loops, best of 3: 73.9 ns per loop
____________________
Membership is size 100000
List:
1000 loops, best of 3: 375 µs per loop
Set:
The slowest run took 25.56 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 74.6 ns per loop
____________________
Membership is size 1000000
List:
100 loops, best of 3: 

This is an insanely useful feature, and I recommend you look for ways to use it in your code wherever you can to increase performance. 
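
One common way to take advantage of this, sketched here with made-up data, is to convert a list to a set once and then filter against it. The names `seen_ids` and `batch` are hypothetical.

```python
# Hypothetical data: IDs we have already processed, and a new batch.
seen_ids = [3, 17, 42, 56, 99]
batch = [42, 7, 99, 8, 3]

seen = set(seen_ids)   # O(N) to build, paid once

# Each `in seen` check is O(1) on average, instead of a scan of the list.
new_ids = [i for i in batch if i not in seen]
print(new_ids)         # [7, 8]
```

For five elements the difference is invisible, but the timings above suggest how quickly it grows with the size of the collection.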

#### Dictionaries

Dictionaries are python's best data structure; even the language designers have said so. They're like the set's older, better brother. Dictionaries are a mapping type, so they pair a key with a value. The key must be hashable (for built-ins, that means immutable), but values need not be. They have a variety of uses, like binning or counting.

In [None]:
#counting
colors = ['red', 'yellow', 'cyan', 'red', 'green', 'green', 'blue', 'red', 'black']
colorDict = {}
for color in colors:
    if color not in colorDict: #O(1) membership checks
        colorDict[color] = 0
    colorDict[color]+=1
print colorDict
print colorDict['red']

#binning
divDict= {} #dictionary of the smallest prime factor of a number.
from math import sqrt #I'll show off imports later
for n in xrange(1,10**3):#iterate from 1 to 999
    for d in xrange(2, int(sqrt(n))+1):#iterate from 2 to sqrt(N) in the denominator
        result = n/float(d)
        if int(result) == result: #it's an int
            if d not in divDict: #initialize the dict
                divDict[d] = []
            divDict[d].append(n)
            break
print divDict.keys()
print divDict[31],divDict[11]

#Making a simple pseudo object
person = {}
person['Name'] = 'Sean'
person['Height'] = 6.166 #feet
person['Eye Color'] = 'blue'
print person

Dicts are also just useful to treat like arrays, but with keys that aren't easily encoded as ints. Use them often.
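
For instance, a dict keyed by coordinate tuples can stand in for a sparse 2D array. This is only a sketch; the `grid` name and its contents are invented for illustration.

```python
# Hypothetical sparse grid: only occupied cells are stored.
grid = {}
grid[(0, 0)] = 'start'
grid[(2, 5)] = 'wall'
grid[(9, 9)] = 'goal'

print(grid[(2, 5)])               # wall
print((1, 1) in grid)             # False
# .get returns a default instead of raising a KeyError for a missing key.
print(grid.get((1, 1), 'empty'))  # empty
```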

### Exercise 2
#### Answers at the bottom
Using the dictionary below, print 'Hello World' in the ICAO alphabet. Can you generalize it to any message?

In [None]:
d = {'a':'alfa', 'b':'bravo', 'c':'charlie', 'd':'delta', 'e':'echo', 'f':'foxtrot',
     'g':'golf', 'h':'hotel', 'i':'india', 'j':'juliett', 'k':'kilo', 'l':'lima',
     'm':'mike', 'n':'november', 'o':'oscar', 'p':'papa', 'q':'quebec', 'r':'romeo',
     's':'sierra', 't':'tango', 'u':'uniform', 'v':'victor', 'w':'whiskey', 
     'x':'x-ray', 'y':'yankee', 'z':'zulu'}

#your code here
#HINT You'll need a for loop. If you'd like, read ahead and come back.

## Modules

Many functionalities and features are available by loading modules.

[See here](https://docs.python.org/2/library/) for a list of modules in the Python standard library.

In [39]:
cos(0)

NameError: name 'cos' is not defined

In [None]:
import math

math.cos(0)

In [None]:
from math import cos

cos(0)

In [None]:
sin = 1.0
print(sin)

from math import sin
print(sin)

In [None]:
from math import *

In [None]:
import os

print os.listdir(os.curdir)
print os.path.getsize('Python (2).ipynb')

In [None]:
import time

time.sleep(3)
print time.time()

In [None]:
import subprocess

subprocess.check_output(['wc', '-l', 'Zen.txt'])

In [None]:
import antigravity

## Functions and lambda expressions

In [None]:
def add(x, y):
    return x + y

add(1, 2)

In [None]:
def addone(x, y=1):
    return x + y

print(addone(1))
print(addone(1, 3))

A lambda expression is a one-line anonymous function whose body is a single expression; the value of that expression is what it returns.

In [None]:
addone = lambda x, y=1: x + y
addone(1)

In [None]:
from scipy.integrate import quad

quad(lambda x: x*x*x, 0, 1)

In [None]:
def cube(x):
    return x*x*x

quad(cube, 0, 1)

### Variable scopes

Variable scope is implicitly determined by the scope **in which one assigns a value to the variable**, unless scope is explicitly declared with `global`.

See the difference between the following three cells.

In [None]:
x = 1

def my_function():
    print(x)
    
my_function()

print(x)

In [None]:
x = 1

def my_function():
    x = 3
    print(x)
    
my_function()

print(x)

In [None]:
x = 1

def my_function():
    global x
    x = 3
    print(x)
    
my_function()

print(x)
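
One consequence of this rule is worth a sketch of its own: an assignment anywhere in a function makes the name local for the whole function body, so reading it before that assignment fails. The function name `read_then_assign` is made up for this example.

```python
x = 1

def read_then_assign():
    # The assignment below makes `x` local for the WHOLE function body,
    # so this read happens before the local `x` has a value.
    value = x
    x = 3
    return value

try:
    read_then_assign()
    raised = False
except UnboundLocalError:
    raised = True

print(raised)  # True
print(x)       # 1: the global is untouched
```

Adding `global x` as the first line of the function would make both the read and the assignment refer to the module-level `x`.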

## Unpacking

In [None]:
a, b = ('this is a', 'this is b')
print(a)
print(b)

In [None]:
a, b = b, a
print(a)
print(b)

In [None]:
def func(a, b):
    return a+b, a-b, a*b, a/b

x, y, z, w = func(1, 3)

print(x)
print(y)
print(z)
print(w)

## Slicing

In [None]:
s = 'Happy birthday to you!'

print s[:5]
print s[6:14]
print s[-4:]
print s[6:-4]
print s[::2]
print s[1::2]
print s[::-1]

## Dictionaries

In [None]:
d = dict(a=1, b=2, c=3)
print d['a']
print d['b']
print d['c']
print d['d'] # this would raise a KeyError

## Iterator

### For loop syntax

    for <counter> in <iterator>:
        <do ...>
        
You can iterate over many different objects!

In [None]:
for i in xrange(10):
    print(i)

In [None]:
for i in xrange(6, 22, 4):
    print(i)

Note that the counter is just a variable that gets reassigned on each iteration; changing it inside the body does *not* affect how the for loop runs.

In [None]:
for i in xrange(6, 20, 4):
    i = 0
    print(i)

In [None]:
for i in 'Hello World!':
    print(i)

In [None]:
for i in [1, 2.23, None, 'string', lambda x:x+1, [1,2,3]]:
    print(i)

In [None]:
with open('Zen.txt') as f:
    for line in f:
        print(line)

Iterators can also be used in expressions!

In [None]:
any(x == 9 for x in xrange(9))

In [None]:
all(x < 9 for x in xrange(9))

In [None]:
list(x*x for x in xrange(9))

# or use list comprehension
[x*x for x in xrange(9)]

In [None]:
dict((str(x), x) for x in xrange(9))

# or use dictionary comprehension
{str(x): x for x in xrange(9)}

In [None]:
arr = [1,2,3,4,5,6,7,8,9,10,11]

Compare the three methods below:

In [None]:
count = 0
for i in range(len(arr)):
    if arr[i] > 1:
        count += 1

In [None]:
count = 0
for i in arr:
    if i > 1:
        count += 1

In [None]:
count = sum(1 for i in arr if i > 1)

Good practice: if you use this expression more than once, name it a function.

In [None]:
def countif(f, arr):
    return sum(1 for i in arr if f(i))

countif(lambda x: x > 1, arr)

## Exceptions

In [None]:
a = 'abc'
a = int(a) # this would raise a ValueError

In [None]:
try:
    a = int(a)
except (ValueError, TypeError):
    a = 0
    
print(a)

# Tasks

Some problems to work out. 

(Thanks to Joshua Meyers for providing some of the one-liner solutions!)

## Task 1

Write a function: 
- Take one string as input
- Return `True` if there exists an integer character *AND* its value matches its index in that string. Return `False` otherwise.

Examples:

    task1('000')    => True      # '0' at index 0
    task1('abc')    => False     # no integer character
    task1('321')    => False     # no matching indices
    task1('abc321') => True      # '3' at index 3

In [None]:
def task1(s):
    return any(str(i)==c for i, c in enumerate(s[:10])) 

## Task 2

Write a function: 
-   Take one list of numbers as input
-   Return `True` if the numbers in the list are monotone (strictly) decreasing. Return `False` otherwise.

Examples:

    task2([4,3,1,2])   => False
    task2([4,3,2,1])   => True
    task2([4,3,2,2,1]) => False

In [None]:
def task2(numbers):
    try:
        n = numbers[0]
    except IndexError: # the input list is empty
        return True
    for m in numbers[1:]:
        if m >= n:
            return False
        n = m
    return True

In [None]:
# alternative, a one-liner, using `zip`

def task2(numbers):
    return all(m < n for m, n in zip(numbers[1:], numbers[:-1]))

## Task 3

Write a function: 
-   Take two lists of numbers as input
-   Return `True` if the following two criteria are both true:
    1.  The second list is longer than or of the same length as the first list.
    2.  Each element in the second list is larger than or equal to the corresponding element (i.e., at the same index)
        in the first list.
-  Return `False` otherwise.

Examples:

    task3([3,2,1,0], [1,2,3,4])   => False
    task3([3,2,1,0], [9,8,7])     => False
    task3([3,2,1,0], [9,8,7,6])   => True
    task3([3,2,1,0], [9,8,7,6,5]) => True


In [None]:
def task3(list1, list2):
    return len(list2) >= len(list1) and all(y >= x for x, y in zip(list1, list2))

## Task 4

Write a function: 
-   Take a string as input
-   Return a tuple which contains two numbers:
    1. The first number is the number of unique characters in the input string.
    2. The second number is the maximal number of occurrences of a single character.

Examples:

    task4('apple')       => (4, 2)
    task4('universe')    => (7, 2)
    task4('mississippi') => (4, 4)
    task4('')            => (0, 0)

In [None]:
# using `dict`

def task4(s):
    d = dict()
    for c in s:
        if c in d:
            d[c] += 1
        else:
            d[c] = 1
    return len(d), max(d.values()) if d else 0

In [None]:
# using `defaultdict`

from collections import defaultdict

def task4(s):
    d = defaultdict(int)
    for c in s:
        d[c] += 1
    return len(d), max(d.values()) if d else 0

In [None]:
# using `set`

def task4(s):
    unique_char = set(s)
    max_occurance = max(s.count(c) for c in unique_char) if unique_char else 0
    return len(unique_char), max_occurance

In [None]:
# alternatively, a one-liner

def task4(s):
    return len(set(s)), max(s.count(c) for c in s) if s else 0

## Task 5

Write a function: 
-   Take two arguments:
    1. a list of lists
    2. an integer, n
-   Return a list, which collects the n-th element of each list in the input list, in the same order. (0-indexed)
-   Note: if a list in the input list is shorter than n+1, put `None` as the corresponding element in the returned list.

Examples:

    task5([[1, 2, 3], [4, 5, 6], [7, 8], [9, 10]], 1)       => [2, 5, 8, 10]
    task5([[1, 2, 3], [4, 5, 6], [7, 8], [9, 10]], 2)       => [3, 6, None, None]
    task5([[], [], [], [], []], 0)                          => [None, None, None, None, None]

In [None]:
def task5(lists, n):
    return_arr = list()
    for this_list in lists:
        try:
            item = this_list[n]
        except IndexError:
            item = None
        return_arr.append(item)
    return return_arr

In [None]:
# alternative, a one-liner, using list comprehension

def task5(lists, n):
    return [this_list[n] if n < len(this_list) and n >= -len(this_list) else None for this_list in lists]