# A Crash Course in Python

This is not a comprehensive Python tutorial but instead is intended to highlight the parts of the language that will be most important to us (some of which are often not the focus of Python tutorials). If you have never used Python before, you probably want to supplement this with some sort of beginner tutorial.

--------------------------------------------------------------------------

## Virtual Environment

As a matter of good discipline, you should always work in a virtual environment and never use the “base” Python installation.

To create an (Anaconda) virtual environment:

In [None]:
# create a Python 3.6 environment named "dsfs"
conda create -n dsfs python=3.6

In [None]:
source activate dsfs # Activate the env (on conda 4.4 and later, use: conda activate dsfs)

## Whitespace Formatting

In [None]:
# The pound sign marks the start of a comment. Python itself
# ignores the comments, but they're helpful for anyone reading the code.
for i in [1, 2, 3, 4, 5]:
    print(i) # first line in "for i" block
    for j in [1,2,3,4,5]:
        print(j)
        print( i+j )
    print(i)

print("Done looping")

Whitespace is ignored inside parentheses and brackets, which can be helpful for long-winded computations:

In [2]:
long_winded_computation = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20)

lists_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

more_list_of_lists = [[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9]]

In [5]:
# You can also use a backslash to indicate that a statement continues onto the next line

two_plus_three = 2 + \
                3
print(two_plus_three)

5


## Modules

Certain features of Python are not loaded by default. These include both features that are included as part of the language as well as third-party features that you download yourself. In order to use these features, you’ll need to import the modules that contain them.

In [7]:
# An example

import re

# re is the module containing functions and constants for working with regular expressions.
my_regex = re.compile("[0-9]+", re.I)

If you need a few specific values from a module, you can import them explicitly and use them without qualification:

In [8]:
from collections import defaultdict, Counter

lookup = defaultdict(int)
my_counter = Counter()

In [10]:
# Importing the entire contents of a module into your namespace might inadvertently overwrite variables you’ve already defined:

match = 10
from re import * # uh oh, re has a match function
print(match) # "<function match at 0x10281e6a8>"

<function match at 0x000001BB7B3D1D90>
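
One way to avoid this kind of clash is to import the module under an alias and keep its names qualified. The following is a minimal sketch using the standard-library re module; the alias regex and the sample text are only illustrative:

```python
import re as regex  # alias the module; nothing is copied into our namespace

match = 10                                   # our own variable stays intact
my_regex = regex.compile("[0-9]+", regex.I)  # module names are reached via the alias

found = my_regex.search("order 66 shipped")  # hypothetical sample text
print(match)          # 10
print(found.group())  # 66
```

Because every name from the module is reached through the alias, nothing in the module can shadow a variable you have already defined.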


## Functions

A function is a rule for taking zero or more inputs and returning a corresponding output. In Python, we typically define functions using def.

Python functions are first-class, which means that we can assign them to variables and pass them into functions just like any other arguments:

In [12]:
def double(x):
    """
    This is where you put an optional docstring that explains what the
    function does. For example, this function multiplies its input by 2.
    """
    return x*2

def apply_to_one(f):
    """Calls the function f with 1 as its argument"""
    return f(1)

my_double = double
x = apply_to_one(my_double)

In [14]:
# It is also easy to create short anonymous functions, or lambdas:
y = apply_to_one(lambda x: x + 4)  # Equals 5

print(y)

5


In [15]:
# Function parameters can also be given default arguments, which only need to be 
# specified when you want a value other than the default:

def my_print(message  = "Hello World, I'm Chris"):
    print(message)

my_print("What's Up")
my_print()

What's Up
Hello World, I'm Chris


In [19]:
def full_name(f_name="First Name", l_name="Last Name"):
    return f_name + " " + l_name

print(full_name("Chris", "Barsolai"))
print(full_name("Chris"))
print(full_name(l_name="Barso"))

Chris Barsolai
Chris Last Name
First Name Barso


## Strings

Python uses backslashes to encode special characters. For example:

In [20]:
tab_string="\t"

len(tab_string)

1

In [21]:
# To create a raw string, in which a backslash stays a literal backslash, prefix the string with r:

not_tab_string = r"\t"
len(not_tab_string)

2

In [23]:
# Strings can be combined in two ways: with + or with the format method
first = "Chris"
last = "Barsolai"

full_name_1 = first + " " + last
full_name_2 = "{0} {1}".format(first, last)

print(full_name_1)
print(full_name_2)

Chris Barsolai
Chris Barsolai
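
On Python 3.6 and later, an f-string offers a third way to build the same string. This is a small sketch; the name full_name_3 is only illustrative:

```python
first = "Chris"
last = "Barsolai"

# An f-string evaluates the expressions inside the braces in place
full_name_3 = f"{first} {last}"

print(full_name_3)  # Chris Barsolai
```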


## Exceptions

When something goes wrong, Python raises an exception. Unhandled, exceptions will cause your program to crash. You can handle them using try and except:

In [24]:
try:
    print(0/0)
except ZeroDivisionError:
    print("Cannot divide by zero")

Cannot divide by zero
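
The same pattern can be wrapped in a function, and the exception object itself can be captured with as. A minimal sketch, where safe_divide is a hypothetical helper and returning None on failure is just one possible design choice:

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None if the division is undefined."""
    try:
        return numerator / denominator
    except ZeroDivisionError as error:
        # 'error' holds the exception object, including its message
        print("Cannot divide:", error)
        return None

print(safe_divide(6, 3))  # 2.0
print(safe_divide(1, 0))  # prints the error message, then None
```

Catching only ZeroDivisionError means any other, unexpected error still surfaces instead of being silently swallowed.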


## Lists

In [27]:
int_list = [1,2,3]
heterogeneous_list = ["word", 0.1, True]
list_of_lists = [int_list, heterogeneous_list, []]

list_length = len(int_list)
list_sum = sum(int_list)

print(list_length)
print(list_sum)

3
6


You can also use square brackets to _slice lists_. The slice i:j means all elements from i (inclusive) to j (not inclusive). If you leave off the start of the slice, you’ll slice from the beginning of the list, and if you leave off the end of the slice, you’ll slice until the end of the list:

In [29]:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

first_three = x[:3]                # [0,1,2]
three_to_end = x[3:]               # [3,4, ..... ,9]
one_to_four = x[1:5]               # [1,2,3,4]
last_three = x[-3:]                # [7,8,9]
without_first_and_last = x[1:-1]   # [1,2 ... 8]
copy_of_x = x[:]

# You can similarly slice strings and other “sequential” types.
# A slice can take a third argument to indicate its stride, which can be negative:

every_third = x[::3]               # [0,3,6,9]
five_to_three = x[5:2:-1]          # [5,4,3]
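
The same slice syntax works on strings and tuples. A short sketch with made-up values:

```python
word = "datascience"
point = (10, 20, 30, 40)

first_four = word[:4]        # "data"
reversed_word = word[::-1]   # "ecneicsatad" (a stride of -1 walks backwards)
middle = point[1:-1]         # (20, 30), and the result is still a tuple

print(first_four, reversed_word, middle)
```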

Python has an in operator to check for list membership:

In [30]:
1 in [0,1,2]

True

In [31]:
1 in [4,7,5]

False

It is easy to concatenate lists together. If you want to modify a list in place, you can use extend to add items from another collection:

In [33]:
x = [1,2,3]
x.extend([4,5,6])

# Using list addition
y = x + [7,8,9]   # y is [1, 2, 3, 4, 5, 6, 7, 8, 9]; x is unchanged

x.append(0)       # x is now [1, 2, 3, 4, 5, 6, 0]
y = x[-1]         # equals 0
z = len(x)        # equals 7
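
The difference between modifying a list in place and building a new one is easiest to see when a second name refers to the same list. A minimal sketch; the name same_list is only illustrative:

```python
x = [1, 2, 3]
same_list = x            # a second name for the same list object

x.extend([4, 5, 6])      # modifies the list in place
y = x + [7, 8, 9]        # builds a new list; x is not changed

print(x)          # [1, 2, 3, 4, 5, 6]
print(same_list)  # [1, 2, 3, 4, 5, 6]  (it is the same object as x)
print(y)          # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```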

In [34]:
# It’s often convenient to unpack lists when you know how many elements they contain:
x, y = [1, 2] # now x is 1, y is 2

# You will get a ValueError if the two sides don’t have the same number of elements.
# A common idiom is to use an underscore for a value you’re going to throw away:

_, y = [1, 2] # now y == 2, didn't care about the first element
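
The ValueError mentioned above can be reproduced as follows. This is only a sketch; the flag unpack_failed exists just to make the outcome visible:

```python
unpack_failed = False
try:
    a, b = [1, 2, 3]     # three values on the right, only two names on the left
except ValueError as error:
    unpack_failed = True
    print("Unpacking failed:", error)

print(unpack_failed)  # True
```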

## Tuples

Tuples are lists’ immutable cousins. Pretty much anything you can do to a list that doesn’t involve modifying it, you can do to a tuple. You specify a tuple by using parentheses (or nothing) instead of square brackets.

In [36]:
my_list = [1,2]
my_tuple = (1,2)
other_tuple = 3,4
my_list[1] = 3         # my_list is now [1, 3]

try:
    my_tuple[1] = 3
except TypeError:
    print("Cannot modify a tuple")

Cannot modify a tuple


In [39]:
def sum_and_product(x,y):
    return (x+y), (x*y)

sp = sum_and_product(2, 3)       # sp is (5, 6)
s, p = sum_and_product(5,10)     # s is 15, p is 50
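
Because the right-hand side is packed into a tuple before it is unpacked, the same mechanism allows multiple assignment and swapping without a temporary variable. A brief sketch:

```python
x, y = 1, 2      # now x is 1, y is 2
x, y = y, x      # swap: now x is 2, y is 1

print(x, y)      # 2 1
```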

----------------------------------------------------------

## Dictionaries

A dictionary associates values with keys:

In [2]:
empty_dict = {}          # Pythonic
empty_dict2 = dict()     # Less pythonic
age = {"Chris": 25, "Nevis": 19}

chris_age = age["Chris"]      # Equals 25

In [3]:
# But you’ll get a KeyError if you ask for a key that’s not in the dictionary:

try:
    nel_age = age["Nelson"]
except KeyError:
    print("No age listed")

No age listed


In [4]:
# You can check for the existence of a key using 'in':

nevis_has_age = "Nevis" in age    # True
nel_has_age = "Nelson" in age     # False

In [5]:
# Dictionaries have a get method that returns a default value (instead of raising an exception) 
# when you look up a key that’s not in the dictionary:

chris_age = age.get("Chris", 0)        # Equals 25
nelson_age = age.get("Nelson", 0)      # Equals 0
no_ones_age = age.get("No One")        # default is None

# You can assign key/value pairs using the same square brackets:

age["Irene"] = 25
age["Gloria"] = 24

In [6]:
print(age["Irene"])

25


In [7]:
# You can use dictionaries to represent structured data:

tweet = {
    "user": "chrisbarso",
    "text": "Data Science is Awesome",
    "retweet_count": 30,
    "hashtag": ["data", "ml", "datascience", "GoogleML"]
}

# We can also look at all keys or values:

tweet_keys = tweet.keys()        # iterable for the keys
tweet_values = tweet.values()    # iterable for the values
tweet_items = tweet.items()      # iterable for the (key, value) tuples

In [8]:
print(tweet_keys)

dict_keys(['user', 'text', 'retweet_count', 'hashtag'])


In [9]:
"user" in tweet_keys          # True
"user" in tweet               # True (Pythonic way of checking for keys)
"chrisbarso" in tweet_values  # True (slow but the only way to check)

True
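
The items view is most often used in a loop, unpacking each pair into a key and a value. A small sketch, using a shortened stand-in for the tweet dictionary above:

```python
# A smaller, hypothetical stand-in for the tweet dictionary
tweet = {"user": "chrisbarso", "retweet_count": 30}

for key, value in tweet.items():
    print(key, "->", value)

# Iterating over a dictionary directly yields its keys
keys_as_list = list(tweet)
print(keys_as_list)
```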

[1] Imagine that you’re trying to count the words in a document. An obvious approach is to create a dictionary in which the keys are words and the values are counts. As you check each word, you can increment its count if it’s already in the dictionary and add it to the dictionary if it’s not:

In [None]:
document = ["data", "science", "data"]   # hypothetical sample document (a list of words)
word_counts = {}
for word in document:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1

[2] You could also use the “forgiveness is better than permission” approach and just handle the exception from trying to look up a missing key:

In [None]:
word_counts = {}
for word in document:
    try:
        word_counts[word] += 1
    except KeyError:
        word_counts[word] = 1

[3] A third approach is to use get, which behaves gracefully for missing keys:

In [None]:
word_counts = {}
for word in document:
    previous_count = word_counts.get(word, 0)
    word_counts[word] = previous_count + 1

A defaultdict is like a regular dictionary, except that when you try to look up a key it doesn’t contain, it first adds a value for it using a zero-argument function you provided when you created it. In order to use defaultdicts, you have to import them from collections:

In [11]:
from collections import defaultdict

word_counts = defaultdict(int)
for word in document:
    word_counts[word] += 1
    

# They can also be useful with list or dict, or even your own functions:

dd_list = defaultdict(list)                # list() produces an empty list
dd_list[2].append(1)                       # now dd_list contains {2: [1]}
dd_dict = defaultdict(dict)                # dict() produces an empty dict
dd_dict["Joel"]["City"] = "Seattle"        # {"Joel": {"City": "Seattle"}}
dd_pair = defaultdict(lambda: [0, 0])
dd_pair[2][1] = 1                          # now dd_pair contains {2: [0, 1]}
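
To see that these approaches agree, here is a runnable sketch that counts words with get and with defaultdict. The document list is a hypothetical stand-in for real text:

```python
from collections import defaultdict

# Hypothetical stand-in for the document: a list of words
document = ["data", "science", "is", "data", "plus", "science", "plus", "data"]

# Approach [3]: dict.get with a default of 0
counts_with_get = {}
for word in document:
    counts_with_get[word] = counts_with_get.get(word, 0) + 1

# defaultdict(int): a missing key starts at int(), which is 0
counts_with_defaultdict = defaultdict(int)
for word in document:
    counts_with_defaultdict[word] += 1

print(counts_with_get["data"])                           # 3
print(dict(counts_with_defaultdict) == counts_with_get)  # True
```

One thing to keep in mind: merely looking up a missing key in a defaultdict adds that key, so use in or get when you only want to check for a key.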