# A Crash Course in Python

This is not a comprehensive Python tutorial but instead is intended to highlight the parts of the language that will be most important to data folk (some of which are often not the focus of Python tutorials). If you have never used Python before, you probably want to supplement this with some sort of beginner tutorial.

## Whitespace Formatting
Many languages use curly braces to delimit blocks of code. Python uses indentation:

In [None]:
for i in [1, 2, 3]:
    print(i)  # first line in "for i" block
    for j in [4, 5]:
        print(j)  # first line in "for j" block
        print(i + j)  # last line in "for j" block
    print(i)  # last line in "for i" block
print("done looping")
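One related point, which is standard Python behaviour although the loop above does not show it: whitespace is ignored inside parentheses and brackets. That helps with long computations and with laying out nested literals. A small sketch with made-up names:

```python
# Inside parentheses, line breaks and extra indentation are ignored
long_winded_computation = (1 + 2 + 3 + 4 + 5 +
                           6 + 7 + 8 + 9 + 10)

# The same holds inside square brackets, which keeps nested lists readable
easier_to_read_list_of_lists = [[1, 2, 3],
                                [4, 5, 6],
                                [7, 8, 9]]

print(long_winded_computation)          # 55
print(easier_to_read_list_of_lists[1])  # [4, 5, 6]
```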

## Modules

Certain features of Python are not loaded by default. These include features that ship with the language (the standard library) as well as third-party packages that you install yourself. To use them, you’ll need to import the modules that contain them.


In [None]:
import re
my_regex = re.compile("[0-9]+", re.I)  # assign to a variable with '='

In [None]:
import re as regex
my_regex = regex.compile("[0-9]+", regex.I)
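If you only need a few specific names from a module, you can also import them explicitly and use them without a prefix. A minimal sketch using the standard `math` module (chosen here only for illustration):

```python
import math                # whole module; names are reached through the prefix
from math import sqrt, pi  # selected names only; usable without a prefix

root_with_prefix = math.sqrt(16)
root_without_prefix = sqrt(16)

print(root_with_prefix, root_without_prefix)  # 4.0 4.0
print(pi == math.pi)                          # True
```

Importing explicit names keeps it obvious where each one came from, which is harder to track when everything is pulled in at once.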

## (Main) types

In [None]:
type(1.0)

In [None]:
type(1)

In [None]:
type('Hello world')

In [None]:
type("Hello world double quot.")

### Strings

Python is great for string processing. Check the [documentation](https://docs.python.org/3.7/library/stdtypes.html) for more on string methods.


In [None]:
tab_string = "\t"  # represents the tab character

In [None]:
not_tab_string = r"\t"  # raw string: backslashes stay backslashes, so this is the two characters '\' and 't'

In [None]:
first_name = 'Nick'
second_name = 'Staines    '

In [None]:
full_name = first_name + " " + second_name
full_name

In [None]:
f"{first_name} {second_name}" # f strings are super cool

In [None]:
full_name.split(" ")

In [None]:
full_name.lower()

In [None]:
full_name.strip()
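The trailing spaces in the name above make the difference between these methods visible. A short sketch, assuming the same name with four trailing spaces (the variable names are illustrative):

```python
# Build the string explicitly so the four trailing spaces are unambiguous
full_name = "Nick" + " " + "Staines" + " " * 4

pieces_on_space = full_name.split(" ")  # every single space is a separator
pieces_default = full_name.split()      # runs of whitespace count once, ends are trimmed
cleaned = full_name.strip()             # returns a new string; full_name is unchanged

print(pieces_on_space)  # ['Nick', 'Staines', '', '', '', '']
print(pieces_default)   # ['Nick', 'Staines']
print(repr(cleaned))    # 'Nick Staines'
```

Strings are immutable, so `strip()`, `lower()` and friends always return new strings rather than changing `full_name`.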

## Data structures

### Lists
Probably the most fundamental data structure in Python is the list, which is simply an ordered collection.

In [None]:
integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [integer_list,
                 heterogeneous_list,
                 []]

In [None]:
# number of elements in the list
len(integer_list)

In [None]:
# using built-in sum function 
sum(integer_list)

#### Indexing

In [None]:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

x[0]

In [None]:
x[-1] # Pythonic for last element

In [None]:
x[-2] 

#### Slicing 

In [None]:
first_three = x[:3]
first_three

In [None]:
three_to_end = x[3:]
three_to_end

In [None]:
without_first_and_last = x[1:-1]
without_first_and_last

In [None]:
copy_of_x = x[:]
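The slice `x[:]` builds a new list, so changing the copy leaves the original alone, whereas plain assignment only creates a second name for the same list. A minimal sketch (`alias_of_x` and the values written are just for illustration):

```python
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

copy_of_x = x[:]    # a new list with the same elements
copy_of_x[0] = 99   # changes only the copy

alias_of_x = x      # no copy: both names refer to one list
alias_of_x[1] = -1  # visible through x as well

print(x[:3])          # [0, -1, 2]
print(copy_of_x[:3])  # [99, 1, 2]
```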

#### Membership

**The `in` check examines the elements of the list one at a time, so you probably shouldn’t use it unless you know your list is pretty small.**

In [None]:
1 in [1,2,3] 

In [None]:
0 in [1,2,3]

#### Methods

In [None]:
x = [1, 2, 3]
x.extend([4, 5, 6])  # modifies x in place
x

In [None]:
x = [1, 2, 3]
x.append(0)  # also modifies x in place
x

In [None]:
x, y = [1, 2]  # unpacking: x is 1, y is 2

In [None]:
x

In [None]:
y

### Tuples

Tuples are lists’ immutable cousins. Pretty much anything you can do to a list that doesn’t involve modifying it, you can do to a tuple. You specify a tuple by using parentheses (or nothing) instead of square brackets.

In [None]:
my_list = [1, 2]
my_tuple = (1, 2)
other_tuple = 3, 4

In [None]:
my_list[1]=3
my_list

In [None]:
my_tuple[1] = 3  # raises TypeError: tuples are immutable
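Tuples are a convenient way to return several values from a function, and they combine nicely with unpacking. A small sketch; `sum_and_product` is a hypothetical helper, not something defined elsewhere in this notebook:

```python
def sum_and_product(x, y):
    """Return two results at once, packed into a tuple."""
    return (x + y), (x * y)

sp = sum_and_product(2, 3)     # sp is the tuple (5, 6)
s, p = sum_and_product(5, 10)  # unpacking: s is 15, p is 50

# Reading works like a list; writing raises TypeError
try:
    sp[0] = 99
except TypeError as error:
    print(f"cannot modify a tuple: {error}")

print(sp, s, p)
```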

### Dictionaries

Another fundamental data structure is a dictionary, which associates values with keys and allows you to quickly retrieve the value corresponding to a given key:

In [None]:
empty_dict = {}
grades = {"FGS": 10, "SPC": 11,
          'NS': 20, 'CH': 30}

In [None]:
grades['CH']

In [None]:
grades['DA']  # raises KeyError: 'DA' is not a key

In [None]:
'DA' in grades, 'FGS' in grades  # `in` checks the keys: look ma, no brackets!

#### Methods

In [None]:
grades.get('SPC', 0)

In [None]:
grades.get('DA', 0)

In [None]:
grades.keys() 

In [None]:
grades.values()

In [None]:
grades.items()  # very useful!
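One reason `items()` is so handy is that it lets you loop over keys and values together. A short sketch reusing the `grades` dictionary from above (the added `'DA'` entry and `best_student` are illustrative):

```python
grades = {"FGS": 10, "SPC": 11, "NS": 20, "CH": 30}

# Each item is a (key, value) tuple, so it unpacks directly in the loop
for name, grade in grades.items():
    print(f"{name}: {grade}")

# Assigning to a new key adds it; assigning to an existing key replaces the value
grades["DA"] = 15
grades["FGS"] = 12

# The key whose value is largest
best_student = max(grades, key=grades.get)
print(best_student)  # CH
```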

#### Counters

A `Counter` turns a sequence of values into a `defaultdict(int)`-like object mapping keys to counts (reading up on `defaultdict` is **homework**):

In [None]:
from collections import Counter

In [None]:
c = Counter([0, 1, 2, 0])
c
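As a hint for the homework, here is roughly what a `Counter` saves you from writing by hand. This is only a sketch of the idea, not how `Counter` is implemented internally:

```python
from collections import Counter, defaultdict

values = [0, 1, 2, 0]

# Manual counting: defaultdict(int) supplies 0 for any key not seen yet
manual_counts = defaultdict(int)
for value in values:
    manual_counts[value] += 1

# The same counts in a single call
counter_counts = Counter(values)

print(dict(manual_counts))   # {0: 2, 1: 1, 2: 1}
print(dict(counter_counts))  # {0: 2, 1: 1, 2: 1}
```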

In [None]:
import urllib3 # ignore code for now

url = "https://gist.githubusercontent.com/provpup/2fc41686eab7400b796b/raw/b575bd01a58494dfddc1d6429ef0167e709abf9b/hamlet.txt"

http = urllib3.PoolManager()
response = http.request('GET', url)
data = response.data.decode('utf-8')

In [None]:
h = Counter(word.strip().lower()
            for word in data.split(" ")
            if word)
h  # Python is fast when you pick the right tool for the task

In [None]:
h.most_common(10) # very useful

### Sets

Another useful data structure is the set, which represents a collection of distinct elements. You can define a set by listing its elements between curly braces.
However, that doesn’t work for empty sets, as `{}` already means “empty dict.” In that case you’ll need to use `set()` itself:

In [None]:
primes_below_10 = {2,3,5,7}

In [None]:
s = set()
s.add(1)  # s is now {1}
s.add(2)  # s is now {1, 2}
s.add(2)  # s is still {1, 2}
x = len(s)  # equals 2
2 in s  # sets are VERY FAST for membership checking!

We’ll use sets for two main reasons. The first is that `in` is a very fast operation on sets.
If we have a large collection of items that we want to use for a membership test, a set is more appropriate than a list.

In [None]:
million_list = list(range(1_000_000))

In [None]:
%timeit 999000 in million_list

In [None]:
million_set = set(range(1_000_000))

In [None]:
%timeit 999000 in million_set
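`%timeit` is an IPython magic, so it only works inside a notebook or IPython shell. Outside of one, the standard `timeit` module gives a comparable measurement. A rough sketch, with a smaller size and repeat count picked arbitrarily to keep it quick:

```python
import timeit

size = 100_000     # assumption: smaller than in the cells above, just to run fast
as_list = list(range(size))
as_set = set(as_list)
target = size - 1  # worst case for the list: the very last element

# Time 200 membership checks against each container
list_seconds = timeit.timeit(lambda: target in as_list, number=200)
set_seconds = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_seconds:.6f} s for 200 lookups")
print(f"set:  {set_seconds:.6f} s for 200 lookups")
```

The exact numbers depend on your machine, but the list has to scan its elements one by one while the set uses hashing, so the gap grows with the size of the collection.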

The second reason is to find the distinct items in a collection:

In [None]:
item_list = [1, 2, 3, 1, 2, 3]
set(item_list)

In [None]:
len(set(data.split(" ")))  # number of distinct space-separated tokens in Hamlet

## Looping 

Python’s `for` loops are actually for-each loops: they iterate directly over the elements of a collection rather than over index positions.

When we don’t care about the indexes, there is no need for a counter variable at all:

In [None]:
colors = ["red", "green", "blue", "purple"]
for color in colors:
    print(color)
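For contrast, here is the index-based style that people often bring over from other languages, next to the for-each version. Both sketches collect the same values; the list names are made up for the comparison:

```python
colors = ["red", "green", "blue", "purple"]

# Index-based: works, but the index is only used to look the element up again
by_index = []
for i in range(len(colors)):
    by_index.append(colors[i])

# For-each: the loop variable is the element itself
for_each = []
for color in colors:
    for_each.append(color)

print(by_index == for_each)  # True
```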

In [None]:
# if we need indexes
presidents = ["Washington", "Adams", "Jefferson", "Madison", "Monroe", "Adams", "Jackson"]
for num, name in enumerate(presidents, start=1):
    print("President {}: {}".format(num, name))

## Control Flow

In [None]:
if 1 > 2:
    message = "if only 1 were greater than two..."
elif 1 > 3: 
    message = "elif stands for 'else if'"
else:
    message = "when all else fails use else (if you want to)"

In [None]:
message

In [None]:
x = 0
while x < 10:
    print(f"{x} is less than 10")
    x += 1  # shorthand for x = x + 1

In [None]:
# range(10) is the numbers 0, 1, ..., 9
for x in range(10):
    print(f"{x} is less than 10")

If you need more complex logic, you can use continue and break:

In [None]:
for x in range(10):
    if x == 3:  # == tests equality
        continue  # go immediately to the next iteration
    if x == 5:
        break  # exit the loop
    print(x)

## Truthiness

In [None]:
one_is_less_than_two = 1 < 2  # equals True
true_equals_false = True == False  # equals False

In [None]:
x = None

assert x == None, "this is not the Pythonic way to check for None"  # assert raises AssertionError if the condition is false
assert x is None, "this is the Pythonic way to check for None"

In [None]:
falsy_items = [
    False,
    None,
    [],
    {},
    "",
    set(),
    0,
    0.0   
]

In [None]:
for item in falsy_items:
    print(f"item is {item!r} and its truth value is {bool(item)}")

Pretty much anything else gets treated as True.

In [None]:
bool(10), bool("hello"), bool([1,2,3])
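Truthiness is what lets you write `if items:` instead of `if len(items) > 0:`. A few common idioms, sketched with hypothetical names (`describe`, `maybe_name`):

```python
def describe(items):
    """Return a short description, relying on the truthiness of `items`."""
    if items:  # an empty list, dict, set, or string is falsy
        return f"{len(items)} item(s)"
    return "nothing here"

print(describe([1, 2, 3]))  # 3 item(s)
print(describe([]))         # nothing here

# `or` returns the first truthy operand, which gives a compact default
maybe_name = None
display_name = maybe_name or "anonymous"

# all() and any() apply truthiness across a whole collection
all_truthy = all([True, 1, {3}])   # True: every element is truthy
some_truthy = any([False, 0, ""])  # False: no element is truthy
```

Be a little careful with the `or` default: it also replaces legitimate falsy values such as `0` or `""`, so use `is None` when those are valid inputs.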

## Functions

A function is a rule for taking zero or more inputs and returning a corresponding output. In Python, we typically define functions using `def`:

In [None]:
def double(x):
    """
    This is where you put an optional docstring 
    that explains what the function does. 
    For example, this function multiplies its input by 2
    """
    return x * 2

xs = [1,10,100]

for x in xs:
    print(double(x=x))

Python functions are first-class, which means that we can assign them to variables and pass them into functions just like any other arguments:

In [None]:
def apply_to_one(f):
    """Calls the function f with 1 as its argument"""
    return f(1)

my_double = double  # assign a function to a variable
print(apply_to_one(my_double))
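For short one-off functions you can skip the `def` and pass an anonymous function, a `lambda`, instead. A minimal sketch; `add_four` is a hypothetical name used only for comparison:

```python
def apply_to_one(f):
    """Calls the function f with 1 as its argument"""
    return f(1)

# A lambda is a function without a name, written inline
y = apply_to_one(lambda x: x + 4)

# The equivalent named function
def add_four(x):
    return x + 4

same_y = apply_to_one(add_four)

print(y, same_y)  # 5 5
```

If you find yourself assigning a lambda to a variable, a regular `def` is usually the clearer choice.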
    

Function parameters can also be given default arguments, which only need to be specified when you want a value other than the default:

In [None]:
def full_name(first="What's-his-name", last="Something"):
    return first + " " + last

In [None]:
full_name()
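Defaults can be overridden by position or by name, and naming an argument lets you skip the ones before it. A quick sketch with the same function (the result variable names are illustrative):

```python
def full_name(first="What's-his-name", last="Something"):
    return first + " " + last

both_given = full_name("Nick", "Staines")  # positional arguments, in order
first_only = full_name("Nick")             # last falls back to its default
last_only = full_name(last="Staines")      # keyword argument skips first

print(both_given)  # Nick Staines
print(first_only)  # Nick Something
print(last_only)   # What's-his-name Staines
```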