# Python crash course

Write Pythonic code, guided by the Zen of Python.

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


In [1]:
# The pound sign marks the start of a comment. Python itself
# ignores the comments, but they're helpful for anyone reading the code.
for i in [1, 2, 3, 4, 5]:
    print(i)                    # first line in "for i" block
    for j in [1, 2, 3, 4, 5]:
        print(j)                # first line in "for j" block
        print(i + j)            # last line in "for j" block
    print(i)                    # last line in "for i" block
print("done looping")

1
1
2
2
3
3
4
4
5
5
6
1
2
1
3
2
4
3
5
4
6
5
7
2
3
1
4
2
5
3
6
4
7
5
8
3
4
1
5
2
6
3
7
4
8
5
9
4
5
1
6
2
7
3
8
4
9
5
10
5
done looping


In [2]:
# "module" is a placeholder name, so running this cell as-is raises the error below
import module
from module import value1, value2

ModuleNotFoundError: No module named 'module'

## Please Don't

In [3]:
match = 10
from re import *    # uh oh, re has a match function
print(match)        # "<function match at 0x10281e6a8>"

<function match at 0x7f3eefb128e0>
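
A minimal sketch of two safer alternatives to the star import, assuming you only need a few names from re: import the module itself, or import a single name under an explicit alias. The alias regex_match is an illustrative name, not something the re module defines.

```python
import re                              # names stay behind the "re." prefix
from re import match as regex_match    # or import one name under an explicit alias

match = 10                             # our own variable is left untouched
print(match)                                   # 10
print(re.match("a", "abc") is not None)        # True
print(regex_match("a", "abc") is not None)     # True
```

Either way, nothing in your namespace is overwritten unless you ask for it by name.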


## Functions

Python functions are first-class, which means that we can assign them to variables and pass them into functions just like any other arguments:

In [5]:
def double(x):
    return x * 2

def apply_to_one(f):
    """Calls a function f with 1 as its argument"""
    return f(1)

my_double = double

x = apply_to_one(my_double)
x

2

In [7]:
y = apply_to_one(lambda x: x + 4)
y

5

In [16]:
def full_name(name="Gustavo", lastname="Borges"):
    return f'{name} {lastname}'

print(full_name())
print(full_name(lastname="Souza"))
print(full_name("Doutor", "Abobrinha"))
print(full_name("Magic", 1))

Gustavo Borges
Gustavo Souza
Doutor Abobrinha
Magic 1


## Strings

In [24]:
# normal string
single_quoted = 'Aloooo'
double_quoted = "aaaa"      # a distinct name, so the double() function above is not overwritten

# raw string
raw = r"c:\dev"

multiline = """ 
This
is
multi
line
"""

fstring = f'{raw} {multiline}'


fstring

'c:\\dev  \nThis\nis\nmulti\nline\n'

## Lists

In [25]:
integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [integer_list, heterogeneous_list, []]

list_length = len(integer_list)     # equals 3
list_sum    = sum(integer_list)     # equals 6

In [26]:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

zero = x[0]          # equals 0, lists are 0-indexed
one = x[1]           # equals 1
nine = x[-1]         # equals 9, 'Pythonic' for last element
eight = x[-2]        # equals 8, 'Pythonic' for next-to-last element
x[0] = -1            # now x is [-1, 1, 2, 3, ..., 9]

You can also use square brackets to slice lists. The slice i:j means all elements from i (inclusive) to j (not inclusive). If you leave off the start of the slice, you’ll slice from the beginning of the list, and if you leave off the end of the slice, you’ll slice until the end of the list:

In [28]:
first_three = x[:3]                 # [-1, 1, 2]
three_to_end = x[3:]                # [3, 4, ..., 9]
one_to_four = x[1:5]                # [1, 2, 3, 4]
last_three = x[-3:]                 # [7, 8, 9]
without_first_and_last = x[1:-1]    # [1, 2, ..., 8]
copy_of_x = x[:]                    # [-1, 1, 2, ..., 9]

every_third = x[::3]                 # [-1, 3, 6, 9]
five_to_three = x[5:2:-1]            # [5, 4, 3]

## Tuples

Tuples are like lists, but immutable: once a tuple is created, its elements cannot be reassigned.

In [29]:
my_tuple = (1, 2)
other_tuple = 3, 4

They are very useful for returning multiple values from a function and for multiple assignment:

In [33]:
def sum_and_product(x, y):
    return (x+y), (x*y)

# avoid naming a variable "sum": it would shadow the built-in sum function
pair_sum, pair_product = sum_and_product(6, 5)

print(f"{pair_sum} {pair_product} ")

11 30 


## Dictionaries

In [47]:
empty_dict = {}     # avoid the name "dict": it would shadow the built-in type

tweets = {
    "user" : "borges",
    "text" : "Data Science is Awesome",
    "retweet_count" : 100,
    "hashtags" : ["#data", "#science", "#datascience", "#awesome", "#yolo"]
}

user_in = "user" in tweets      # True

user = tweets.get("user")       # 'borges'

tweet_keys   = tweets.keys()     # iterable for the keys
tweet_values = tweets.values()   # iterable for the values
tweet_items  = tweets.items()    # iterable for the (key, value) tuples

"borges" in tweet_values    # True (slow but the only way to check)

print(f""" 
{tweet_keys}
{tweet_values}
{tweet_items}
{"borges" in tweet_values}
""")

 
dict_keys(['user', 'text', 'retweet_count', 'hashtags'])
dict_values(['borges', 'Data Science is Awesome', 100, ['#data', '#science', '#datascience', '#awesome', '#yolo']])
dict_items([('user', 'borges'), ('text', 'Data Science is Awesome'), ('retweet_count', 100), ('hashtags', ['#data', '#science', '#datascience', '#awesome', '#yolo'])])
True



In [55]:
# Defaultdicts

document = 'A defaultdict is like a regular dictionary, except that when you try to look up a key it doesn’t contain, it first adds a value for it using a zero-argument function you provided when you created it'

from collections import defaultdict

# int, list, and dict called with no arguments return 0, [], and {}
word_counts = defaultdict(int)
for word in document.split():    # split on whitespace to loop over words, not characters
    word_counts[word] += 1
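
The other factories mentioned in the comment above behave the same way. Here is a small sketch; the variable names and sample data are made up for illustration:

```python
from collections import defaultdict

# list factory: a missing key starts out as an empty list
words_by_length = defaultdict(list)
for word in ["a", "to", "be", "cat"]:
    words_by_length[len(word)].append(word)

# dict factory: a missing key starts out as an empty dict
profiles = defaultdict(dict)
profiles["borges"]["language"] = "Python"

# any zero-argument callable works, including a lambda
pairs_by_key = defaultdict(lambda: [0, 0])
pairs_by_key[2][1] = 1

print(dict(words_by_length))   # {1: ['a'], 2: ['to', 'be'], 3: ['cat']}
print(dict(profiles))          # {'borges': {'language': 'Python'}}
print(dict(pairs_by_key))      # {2: [0, 1]}
```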

## Counters

A Counter turns a sequence of values into a defaultdict(int)-like object mapping keys to counts:

In [56]:
from collections import Counter

c = Counter([0,1,1,1,2,3,35,1,6])
c

Counter({1: 4, 0: 1, 2: 1, 3: 1, 35: 1, 6: 1})

In [58]:
# it has a most_common method that returns the most frequent values and their counts
c.most_common(2)

[(1, 4), (0, 1)]
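
As a sketch of how this applies to word counting, a Counter can be built directly from a list of words. The sample sentence below is invented for the example:

```python
from collections import Counter

sentence = "the quick brown fox jumps over the lazy dog the end"
word_counts = Counter(sentence.split())

print(word_counts["the"])            # 3
print(word_counts["cat"])            # 0, a missing key counts as zero
print(word_counts.most_common(1))    # [('the', 3)]
```

Unlike a defaultdict, looking up a missing key in a Counter returns 0 without adding the key.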

## Sets

In [60]:
s  = {1,2,4,5,6}
s.add(8)

print(f"{2 in s} |  len= {len(s)} | {3  in  s}")

True |  len= 6 | False


We’ll use sets for two main reasons. The first is that the in operator is very fast on sets. If we have a large collection of items that we want to use for a membership test, a set is more appropriate than a list:

In [63]:
hundreds_of_other_words = ['lala', "shuaa"]
stopwords_list = ["a", "an", "at"] + hundreds_of_other_words + ["yet", "you"]

"zip" in stopwords_list     # False, but have to check every element

stopwords_set = set(stopwords_list)
"zip" in stopwords_set      # very fast to check

False
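
A rough way to see the difference is to time both lookups with the standard timeit module. The sizes and repetition count below are arbitrary choices, and the exact timings depend on your machine, so none are shown here:

```python
import timeit

big_list = list(range(100_000))
big_set = set(big_list)

# Look up a value that is absent, so the list has to scan every element
list_time = timeit.timeit(lambda: -1 in big_list, number=20)
set_time = timeit.timeit(lambda: -1 in big_set, number=20)

print(f"list: {list_time:.6f}s | set: {set_time:.6f}s")
```

On a typical machine the set lookup should come out far faster, because a set finds items by hashing instead of scanning.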

The second reason is to find the distinct items in a collection:


In [65]:
item_list = [1, 2, 3, 1, 2, 3]
num_items = len(item_list)                # 6

item_set = set(item_list)                 # {1, 2, 3}
num_distinct_items = len(item_set)        # 3

distinct_item_list = list(item_set)       # [1, 2, 3]

## Control flow

In [71]:
# if condition1:
#     "aaa"
# elif condition2:
#     "bbb"
# else:
#     "ccc"

# while condition:
#     "aaaa"

# for item in iterable:
#     "bbbb"
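
The templates above can be turned into a small runnable sketch. The values and variable names here are arbitrary, chosen only to exercise each construct:

```python
x = 7

# if / elif / else: only the first branch whose condition is true runs
if x > 10:
    size = "big"
elif x > 5:
    size = "medium"
else:
    size = "small"

# while: repeats for as long as the condition holds
countdown = 3
while countdown > 0:
    countdown -= 1

# for: visits each element of an iterable; continue skips, break stops
seen = []
for n in range(10):
    if n == 3:
        continue        # skip 3
    if n == 5:
        break           # stop before 5
    seen.append(n)

print(size, countdown, seen)   # medium 0 [0, 1, 2, 4]
```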

## List Comprehensions

Frequently, you’ll want to transform a list into another list by choosing only certain elements, by transforming elements, or both. The Pythonic way to do this is with list comprehensions:

In [74]:
even_numbers = [x for x in range(5) if x % 2 == 0]
print(even_numbers)


square_dict = {x: x * x for x in range(5)}
print(square_dict)

square_set = {x * x for x in range(5)}
print(square_set)

[0, 2, 4]
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
{0, 1, 4, 9, 16}


In [83]:
# if you don't need the value from the list, use an underscore as the variable
zeros = [0 for _ in even_numbers]
zeros

[0, 0, 0]

In [88]:
# it can include multiple fors

pairs = [(x, y) for x in range(5) for y in range(2)]
pairs

[(0, 0),
 (0, 1),
 (1, 0),
 (1, 1),
 (2, 0),
 (2, 1),
 (3, 0),
 (3, 1),
 (4, 0),
 (4, 1)]

In [94]:
increasing_pairs = [(x, y) for x in range(5) for y in range(x + 1, 5)]
increasing_pairs

[(0, 1),
 (0, 2),
 (0, 3),
 (0, 4),
 (1, 2),
 (1, 3),
 (1, 4),
 (2, 3),
 (2, 4),
 (3, 4)]

## Automated testing and assert

In [97]:
assert 1 + 1 == 2
assert 1 + 1 == 2, "1 + 1 should equal 2, but it did not"
# the optional second part is the message printed if the assertion fails

In [104]:
# It’s not particularly interesting to assert that 1 + 1 = 2. What’s more interesting is to assert that functions you write are doing what you expect them to:
def smallest_item(xs):
    return min(xs)

assert smallest_item([256, 53,56, 5, 8,88]) == 5
assert smallest_item([-2,2,5,4,1,-1]) == -2

assert smallest_item([-2,2,5,4,1,-1]) == -1, "It should be -1"

AssertionError: It should be -1

In [107]:
# Another less common use is to assert things about inputs to functions:

def smallest_item(xs):
    assert xs, "empty list has no smallest item"
    return min(xs)

smallest_item([])

AssertionError: empty list has no smallest item
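
One caveat: assert statements are skipped when Python runs with the -O flag, so input checks that must always run are usually written as explicit exceptions. A minimal sketch follows; smallest_item_checked is a hypothetical variant of the function above:

```python
def smallest_item_checked(xs):
    """Hypothetical variant that raises ValueError instead of asserting."""
    if not xs:
        raise ValueError("empty list has no smallest item")
    return min(xs)

try:
    smallest_item_checked([])
except ValueError as error:
    print(f"caught: {error}")             # caught: empty list has no smallest item

print(smallest_item_checked([3, 1, 2]))   # 1
```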

## Object-oriented programming (OOP)

In [128]:
# class names use PascalCase
class CountingClicker:
    """A class should have a docstring, just like a function"""
    def __init__(self, count=0) -> None:
        self.count = count

    def __repr__(self):
        return f"CountingClicker(count={self.count})"

    def click(self, num_times=1):
        self.count += num_times
    
    def read_clicks(self) -> int:
        return self.count
    
    def reset(self):
        self.count = 0

clicker1 = CountingClicker()
clicker2 = CountingClicker(1)
clicker2 = CountingClicker(count=1)

clicker1.click(2)
clicker1.click(5)
clicker1.click()
clicker1.read_clicks()

8

In [129]:
# now that we have defined it, let's write some tests for our clicker

mock_clicker = CountingClicker()
assert mock_clicker.read_clicks() == 0, "Clicker should start with count 0"
mock_clicker.click()
mock_clicker.click()
assert mock_clicker.read_clicks() == 2, "After 2 clicks, clicker should have read count 2"
mock_clicker.click(8)
assert mock_clicker.read_clicks() == 10, "After adding 8 clicks, clicker should have read count 10"
mock_clicker.reset()
assert mock_clicker.read_clicks() == 0, "After reset, clicker should have read count 0"

### subclass

In [136]:
class NoResetClicker(CountingClicker):
    def reset(self):
        """Overrides reset so that it does nothing: the count stays the same."""
        pass

noreset = NoResetClicker()
assert noreset.count == 0, "Clicker should start with count 0"
noreset.click(2)
assert noreset.read_clicks() == 2, "After 2 clicks, clicker should have read count 2"
noreset.reset()
assert noreset.read_clicks() == 2, "Reset should not do anything"

## Iterables and generators

In [143]:
def generate_range(n):
    i = 0
    while i < n:
        yield i  # each yield produces one value of the generator
        i += 1

for i in generate_range(10):
    print(f"i: {i}")

i: 0
i: 1
i: 2
i: 3
i: 4
i: 5
i: 6
i: 7
i: 8
i: 9


In [150]:
# generator comprehension
evens_below_20 = (i for i in generate_range(20) if i % 2 == 0)
evens_below_20

# Such a “generator comprehension” doesn’t do any work until you iterate over it (using for or next), which makes it possible to build up elaborate data-processing pipelines.

<generator object <genexpr> at 0x7f3ece6f0860>
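
Here is a sketch of such a lazy pipeline. The helper natural_numbers is an illustrative name defined only for this example:

```python
def natural_numbers():
    """Illustrative helper: yields 1, 2, 3, ... forever."""
    n = 1
    while True:
        yield n
        n += 1

# None of these lines computes anything yet
data = natural_numbers()
evens = (x for x in data if x % 2 == 0)
even_squares = (x ** 2 for x in evens)

# Work happens only when values are pulled out with next (or a for loop)
first_three = [next(even_squares) for _ in range(3)]
print(first_three)     # [4, 16, 36]
```

Because each stage is lazy, the infinite source is safe to use: only as many numbers are generated as the final step asks for.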

In [154]:
# Not infrequently, when we’re iterating over a list or a generator we’ll want not just the values but also their indices. For this common case Python provides an enumerate function, which turns values into pairs (index, value):

names = ['Alice', 'Cheshire', 'Mad hatter', 'The queen']

# Pythonic
for i, name in enumerate(names):
    print(f"name in index {i} is {name}")

name in index 0 is Alice
name in index 1 is Cheshire
name in index 2 is Mad hatter
name in index 3 is The queen
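
For contrast, here is a sketch of the index-by-hand version that enumerate replaces, plus the start parameter of enumerate for 1-based numbering. The variable numbered is an illustrative name:

```python
names = ['Alice', 'Cheshire', 'Mad hatter', 'The queen']

# Not Pythonic: manual index bookkeeping
for i in range(len(names)):
    print(f"name in index {i} is {names[i]}")

# enumerate accepts a start value, handy for 1-based numbering
numbered = [f"{position}. {name}" for position, name in enumerate(names, start=1)]
print(numbered[0])     # 1. Alice
```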


## Randomness

In [182]:
# The random module actually produces pseudorandom (that is, deterministic) numbers based on an internal state that you can set with random.seed if you want to get reproducible results

import random
random.seed(42)  # seed to get the same results every time

four_uniform_randoms = [random.random() for _ in range(4)]
four_uniform_randoms

[0.6394267984578837,
 0.025010755222666936,
 0.27502931836911926,
 0.22321073814882275]
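
To see the determinism directly, here is a sketch with separate random.Random instances. Each takes a seed the same way random.seed does, but keeps its own state, so the module-level generator used in this notebook is untouched. The seed values 10 and 11 are arbitrary:

```python
import random

rng_a = random.Random(10)
rng_b = random.Random(10)      # same seed, so the same sequence
rng_c = random.Random(11)      # different seed

a = [rng_a.random() for _ in range(3)]
b = [rng_b.random() for _ in range(3)]
c = [rng_c.random() for _ in range(3)]

print(a == b)      # True
print(a == c)      # False
```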

In [205]:

up_to_ten = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
random.shuffle(up_to_ten)

up_to_ten, random.choice(up_to_ten), random.randrange(20, 99)


([5, 7, 8, 2, 9, 3, 10, 6, 1, 4], 2, 83)

In [209]:
# if you need to randomly choose a sample of elements without replacement (i.e., with no duplicates), you can use random.sample:
lottery_numbers = range(60)
winning_numbers = random.sample(lottery_numbers, 6)  # six distinct numbers between 0 and 59

winning_numbers

[14, 8, 32, 31, 5, 48]

## Regex

In [213]:
import re

re_examples = [                        # All of these are True, because
    not re.match("a", "cat"),              #  'cat' doesn't start with 'a'
    re.search("a", "cat"),                 #  'cat' has an 'a' in it
    not re.search("c", "dog"),             #  'dog' doesn't have a 'c' in it.
    3 == len(re.split("[ab]", "carbs")),   #  Split on a or b to ['c','r','s'].
    "R-D-" == re.sub("[0-9]", "-", "R2D2") #  Replace digits with dashes.
    ]

assert all(re_examples), "all the regex examples should be True"

# One important thing to note is that re.match checks whether the beginning of a string matches a regular expression, while re.search checks whether any part of a string matches a regular expression. At some point you will mix these two up and it will cause you grief.

# Documentation: https://docs.python.org/3/library/re.html
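
A small sketch of the match/search difference on one string (the sample text is arbitrary). re.fullmatch is included as a third variant that requires the whole string to match:

```python
import re

text = "data science"

print(re.match("science", text))            # None: the string does not start with "science"
print(re.search("science", text).span())    # (5, 12)
print(re.match("data", text).group())       # data
print(bool(re.fullmatch("data", text)))     # False: fullmatch must cover the whole string
```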

## Functional Programming

## zip and argument unpacking

Often we will need to zip two or more iterables together. The zip function transforms multiple iterables into a single iterable of tuples of corresponding elements:

In [217]:
list1 = ['a', 'b', 'c']
list2 = [1,2,3]

pairs = [pair for pair in zip(list1, list2)]
pairs

[('a', 1), ('b', 2), ('c', 3)]
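
If the inputs have different lengths, zip stops as soon as the shortest one is exhausted. A short sketch (the list names are illustrative); itertools.zip_longest is the standard-library alternative that pads instead:

```python
from itertools import zip_longest

three_letters = ['a', 'b', 'c']
two_numbers = [1, 2]

print(list(zip(three_letters, two_numbers)))           # [('a', 1), ('b', 2)]
print(list(zip_longest(three_letters, two_numbers)))   # [('a', 1), ('b', 2), ('c', None)]
```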

In [219]:
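# "unzip" a list of pairs: the asterisk (*) performs argument unpacking,
# passing each element of pairs to zip as a separate argument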

letters, numbers = zip(*pairs)
letters, numbers

(('a', 'b', 'c'), (1, 2, 3))

In [220]:
def add(a, b):
    return a + b

add(1,2)

try:
    add([1,2])
except TypeError:
    print("add expects two inputs")
add(*[1,2])

add expects two inputs


3

## args and kwargs

In a function that accepts arbitrary arguments, args is a tuple of the unnamed (positional) arguments it received and kwargs is a dict of the named (keyword) arguments.

In [226]:
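# context: a higher-order function that takes a function f and returns a new function that doubles f's result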

def doubler(f):
    # Here we define a new function that keeps a reference to f
    def g(x):
        return 2 * f(x)

    # And return that new function
    return g

In [228]:
# this works when f takes a single argument
def f1(x):
    return x + 1

g = doubler(f1)
assert g(3) == 8,  "(3 + 1) * 2 should equal 8"
assert g(-1) == 0, "(-1 + 1) * 2 should equal 0"

In [229]:
def f2(x, y):
    return x + y

g = doubler(f2)

try:
    g(1, 2)
except TypeError:
    print("as defined, g only takes one argument")

as defined, g only takes one argument


In [222]:
# What we need is a way to specify a function that takes arbitrary arguments. We can do this with argument unpacking and a little bit of magic:

def magic(*args, **kwargs):
    print("unnamed args: ", args)
    print("keyword args: ", kwargs)

magic(1,2, key='word', key2='word2')

unnamed args:  (1, 2)
keyword args:  {'key': 'word', 'key2': 'word2'}


It works the other way too, if you want to use a list (or tuple) and dict to supply arguments to a function:

In [224]:
def magic_trick(x,y,z):
    return x+y+z

x_y_list = [1, 2]      # supplies x and y positionally
z_dict = {"z": 3}      # supplies z by keyword

assert magic_trick(*x_y_list, **z_dict) == 6, "1 + 2 + 3 should be 6"

You could do all sorts of strange tricks with this; we will only use it to produce higher-order functions whose inputs can accept arbitrary arguments:

In [231]:
def doubler_correct(f):
    """Works no matter what kind of inputs f expects"""

    def g(*args, **kwargs):
        """Whatever arguments g is supplied, pass them through to f"""
        return 2 * f(*args, **kwargs)
    return g

x = doubler_correct(f2)

assert x(1, 2) == 6, "doubler should work now"

As a general rule, your code will be more correct and more readable if you are explicit about what sorts of arguments your functions require; accordingly, we will use args and kwargs only when we have no other option.

## Type annotations

In [244]:
# use them: they are great for making your code readable

# OK: annotated, but the element type is left unspecified
def total(xs: list) -> float:
    return sum(xs)

# better: a typed list
from typing import List

def total(xs: List[float]) -> float:
    return sum(xs)


In [246]:
# OK, but unnecessary: the type is obvious from the value
x: int = 5

# good cases: the type is not obvious from the initial value
from typing import Optional

values: List[int] = []
best_so_far: Optional[float] = None  # either a float or None

In [253]:
from typing import Dict, Iterable, Tuple

counts: Dict[str, int] = {"Data": 1, "Science": 2}

lazy = True

# lists and generators are both iterable
if lazy:
    evens: Iterable[int] = (x for x in range(10) if x % 2 == 0)
else:
    evens = [0,2,4,6,8]

# tuples specify a type for each element
triple: Tuple[int, float, int] = (1, 1.5, 2)

In [257]:
from typing import Callable

def twice(repeater: Callable[[str, int], str], s: str) -> str:
    return repeater(s, 2)

def comma_repeater(s: str, n: int) -> str:
    n_copies = [s for _ in range(n)]
    return ', '.join(n_copies)

assert twice(comma_repeater, "type hints") == "type hints, type hints"

In [262]:
# As type annotations are just Python objects, we can assign them to variables to make them easier to refer to:

Num = int
Numbers = List[Num]

def total(xs: Numbers) -> Num:
    return sum(xs)
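
Python does not check annotations when the code runs: they are hints for readers and for external type checkers such as mypy. A short sketch; the function repeat is a hypothetical example, not part of the notebook above:

```python
def repeat(s: str, n: int) -> str:
    """Hypothetical example: annotated, but Python does not enforce the hints."""
    return s * n

print(repeat("ab", 2))      # abab
print(repeat([0], 2))       # [0, 0], runs even though a list is not a str

# The hints are stored on the function, where tools can inspect them
print(list(repeat.__annotations__))    # ['s', 'n', 'return']
```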