# Modules

You'll need to `import` the modules that contain features you need.

In [4]:
import re
my_regex = re.compile("[0-9]+", re.I)

You can also give a module an `alias` if it has an unwieldy name or if you have to type it a lot.

In [5]:
import matplotlib.pyplot as plt

If you need a few specific values from a module, you can import them explicitly and use them without qualification.

In [6]:
from collections import defaultdict, Counter

lookup = defaultdict(int)
my_counter = Counter()

# Functions

A function is a rule for taking zero or more inputs and returning a corresponding output.

In [7]:
def double(x):
    """this function multiplies its input by 2"""
    return x * 2

Functions in Python are first-class, which means that we can assign them to variables and pass them into other functions just like any other argument.

In [9]:
def apply_to_one(f):
    """calls the function f with 1 as its argument"""
    return f(1)

my_double = double
x = apply_to_one(my_double)
print(x)

2


Short anonymous functions, or `lambdas`:

In [10]:
y = apply_to_one(lambda x: x + 4)
print(y)

5


You can assign `lambdas` to variables as well.

In [11]:
another_double = lambda x: x * 2
# But don't do this, do this instead
def another_double(x): return x * 2

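Function parameters can also be given `default arguments`, which only need to be specified when you want a value other than the default.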

In [12]:
def subtract(a=0, b=0):
    return a - b

print(subtract(10, 5))
print(subtract(0, 5))
print(subtract(b=6))

5
-5
-6


# Exceptions

In [15]:
try:
    print(0/0)
except ZeroDivisionError:
    print("Cannot divide by zero")

Cannot divide by zero


# Lists

The `list` is a fundamental data structure in Python. A `list` is simply an ordered collection.

In [18]:
integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [integer_list, heterogeneous_list, []]

`Get` or `set` the `n`th element of a list with square brackets.

In [30]:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [47]:
# Get the last element
nine = x[-1]
print(nine)
# Get the next-to-last element
eight = x[-2]
print(eight)
# Change the first element(index 0) to -1
x[0] = -1
print(x)

9
8
[-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]


`Slice` a list.

In [44]:
first_three = x[:3]
three_to_end = x[3:]
one_to_four = x[1:5]
last_three = x[-3:]
without_first_and_last = x[1:-1]
copy_of_x = x[:]

print("The original of X")
print(x)
print("------------------")
print("First three: " + str(first_three))
print("Three to end: " + str(three_to_end))
print("One to four: " + str(one_to_four))
print("Last three: " + str(last_three))
print("Without first and last: " + str(without_first_and_last))
print("Copy of X: " + str(copy_of_x))

The original of X
[-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
------------------
First three: [-1, 1, 2]
Three to end: [3, 4, 5, 6, 7, 8, 9]
One to four: [1, 2, 3, 4]
Last three: [7, 8, 9]
Without first and last: [1, 2, 3, 4, 5, 6, 7, 8]
Copy of X: [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]


`List membership` checking.

In [45]:
1 in [1, 2, 3]

True

In [48]:
0 in [1, 2, 3]

False

`Concatenate` lists together.

In [50]:
x = [1, 2, 3]
x.extend([4, 5, 6])
x

[1, 2, 3, 4, 5, 6]

In [55]:
# if you don't want to modify the original list.
x = [1, 2, 3]
y = x + [4, 5, 6]
y

[1, 2, 3, 4, 5, 6]

In [56]:
# x is unchanged
x

[1, 2, 3]

`Append` to the list.

In [61]:
z = [7, 8, 9]
z

[7, 8, 9]

In [62]:
z.append(10)
z

[7, 8, 9, 10]

`Unpack` a list: you have to know how many elements it contains, because both sides of the assignment must have the same number of elements.

In [63]:
x, y = [1, 2]

In [64]:
x

1

In [65]:
y

2

In [68]:
# Use an underscore for an unwanted value.
_, y = [3, 4]
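
As a minimal sketch of the rule above (the names `a`, `b`, and `unpack_error` are illustrative, not from the original notebook), unpacking a list with the wrong number of elements raises a `ValueError`:

```python
# Unpacking three values into two names fails with a ValueError.
try:
    a, b = [1, 2, 3]
    unpack_error = None
except ValueError as err:
    unpack_error = str(err)

print(unpack_error)  # the message describes the mismatch
```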

# Tuples

`Tuples` are immutable lists. You can do pretty much anything to a tuple that you can do to a list, except modify it.

In [70]:
# create a list and two tuples
my_list =  [1, 2]
my_tuple = (1, 2)
other_tuple = 3, 4

# modify list
my_list[1] = 3
my_list

[1, 3]

In [71]:
# modify tuple
try:
    my_tuple[1] = 3
except TypeError:
    print("cannot modify tuple")

cannot modify tuple


`Tuples` are a convenient way to return multiple values from a function.

In [72]:
def sum_and_product(x, y):
    return (x + y), (x * y)

sp = sum_and_product(2, 3)

In [73]:
sp

(5, 6)

In [82]:
# Multiple assignment
s, p = sum_and_product(5, 10)

In [78]:
s

15

In [83]:
p

50

`Pythonic` way to swap variables.

In [84]:
x, y = 1, 2
x, y = y, x

In [85]:
x

2

In [86]:
y

1

# Dictionaries

`Dictionaries` are another fundamental data structure in Python, associating `values` with `keys` and allowing you to quickly retrieve the value corresponding to a given key. Dictionary keys must be `immutable`; you cannot use a `list` as a key, but you can use a `tuple` or a `string` if you need a multipart key.
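
A small sketch of the key restriction, using a hypothetical `visits` dictionary that is not part of the original notebook. Strictly speaking Python requires keys to be hashable, which the built-in immutable types are and lists are not:

```python
# Hypothetical example: grid coordinates as multipart keys.
visits = {}
visits[(0, 1)] = "start"      # a tuple is hashable, so it works as a key
visits["home"] = "end"        # so is a string

try:
    visits[[0, 1]] = "oops"   # a list is mutable and unhashable
    list_key_error = None
except TypeError as err:
    list_key_error = str(err)

print(visits)
print(list_key_error)
```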

In [88]:
empty_dict = {}
empty_dict

{}

In [89]:
grades = {"Dill": 85, "Smiile": 95}
grades

{'Dill': 85, 'Smiile': 95}

In [90]:
grades["Dill"]

85

You'll get `KeyError` if you ask for a non-existent key.

In [95]:
dude_grade = grades["Dude"]

KeyError: 'Dude'

Check for existence of a key using `in`

In [96]:
grades

{'Dill': 85, 'Smiile': 95}

In [98]:
dill_has_grade = "Dill" in grades
dill_has_grade

True

In [99]:
dude_has_grade = "Dude" in grades
dude_has_grade

False

Dictionaries have a `get` method that returns a default value instead of raising an exception when you look up a key that's not in the dictionary.

In [100]:
grades

{'Dill': 85, 'Smiile': 95}

In [101]:
rossi_grade = grades.get("Rossi", 0)
rossi_grade

0

In [104]:
no_one_grade = grades.get("No one")
print(no_one_grade) # default value of get method is None

None


Use `square brackets` to assign and add `key-value` pairs.

In [105]:
grades

{'Dill': 85, 'Smiile': 95}

In [106]:
grades["Dill"] = 90
grades

{'Dill': 90, 'Smiile': 95}

In [107]:
grades["Rossi"] = 100
grades

{'Dill': 90, 'Rossi': 100, 'Smiile': 95}

Example of using dictionaries to represent `structured data`.

In [110]:
tweet = {
    "user": "Dill",
    "text": "Deep learning is good things to study.",
    "retweet_count": 100,
    "hashtags": ["#data", "#datascience", "#machinelearning"]
}

tweet

{'hashtags': ['#data', '#datascience', '#machinelearning'],
 'retweet_count': 100,
 'text': 'Deep learning is good things to study.',
 'user': 'Dill'}

`Methods` of dictionaries.

In [111]:
tweet.keys()

dict_keys(['user', 'text', 'retweet_count', 'hashtags'])

In [112]:
tweet.values()

dict_values(['Dill', 'Deep learning is good things to study.', 100, ['#data', '#datascience', '#machinelearning']])

In [113]:
tweet.items()

dict_items([('user', 'Dill'), ('text', 'Deep learning is good things to study.'), ('retweet_count', 100), ('hashtags', ['#data', '#datascience', '#machinelearning'])])

Looking for the `specific key`.

In [116]:
"user" in tweet.keys()

True

In [117]:
"user" in tweet # Pythonic way: test membership on the dict itself

True

# defaultdict

Imagine that you are trying to count the words in a document.

#1: The simplest approach, checking each key explicitly.

In [120]:
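# hypothetical example document: a list of words
document = ["data", "science", "is", "fun", "data"]

word_counts = {}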

for word in document:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1

#2: Use the dictionary's `get` method.

In [121]:
word_counts = {}

for word in document:
    previous_count = word_counts.get(word, 0)
    word_counts[word] = previous_count + 1

Both of these approaches are slightly unwieldy.
That's where `defaultdict` comes into play.

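A `defaultdict` is like a regular dictionary, except that when you look up a key it doesn't contain, it first adds a value for that key using the zero-argument function you provided when you created it.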

In [123]:
from collections import defaultdict

In [None]:
word_counts = defaultdict(int)  # int() produces 0
for word in document:
    word_counts[word] += 1

`defaultdict` can be useful with `list` and `dict` and even your `own functions`.

In [138]:
dd_list = defaultdict(list)
dd_list

defaultdict(list, {})

In [139]:
dd_list[2]

[]

In [140]:
dd_list[2].append(1)
dd_list[3].append("test")
dd_list

defaultdict(list, {2: [1], 3: ['test']})

In [143]:
dd_dict = defaultdict(dict)
dd_dict["Dill"]["City"] = "Bangkok"
dd_dict

defaultdict(dict, {'Dill': {'City': 'Bangkok'}})

In [145]:
dd_func = defaultdict(lambda: [0, 0])
dd_func[2]

[0, 0]

In [147]:
dd_func[2][1] = 5
dd_func

defaultdict(<function __main__.<lambda>>, {2: [0, 5]})

Again, these are useful when you're `collecting results by some key` and do not want to `check every time to see if the key exists yet`.

# Counter

In [148]:
from collections import Counter

c = Counter([5, 5, 1, 3, 0, 5, 0, 0, 2, 3, 6])
c

Counter({0: 3, 1: 1, 2: 1, 3: 2, 5: 3, 6: 1})

`most_common` method.

In [149]:
for number, count in c.most_common(3):
    print(number, count)

5 3
0 3
3 2
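
A `Counter` also gives a short solution to the word-counting problem from the `defaultdict` section. The sketch below assumes a hypothetical `document` that is simply a list of words:

```python
from collections import Counter

# Hypothetical document: a list of words.
document = ["data", "science", "is", "data", "fun", "science", "data"]

# Counter does the counting in one step.
word_counts = Counter(document)

# Print the two most common words with their counts.
for word, count in word_counts.most_common(2):
    print(word, count)
```

Looking up a word that never appeared, such as `word_counts["missing"]`, returns `0` rather than raising a `KeyError`.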


# Sets

`set` is another data structure in Python, which represents a collection of `distinct` elements.

In [150]:
s = set()
s.add(1)
s.add(1)
s.add(2)
s

{1, 2}

The `in` operation on a `set` is much faster than on a `list`, because a set uses hashing instead of scanning every element.

In [151]:
1 in s

True

In [152]:
2 in s

True

In [153]:
3 in s

False
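
Because a set keeps only distinct elements, it is also a convenient way to find the distinct items in a collection. A minimal sketch, with an illustrative `item_list`:

```python
item_list = [1, 2, 3, 1, 2, 3]

num_items = len(item_list)              # counts every element, duplicates included
item_set = set(item_list)               # duplicates collapse into one entry each
num_distinct_items = len(item_set)      # counts only distinct elements
distinct_item_list = sorted(item_set)   # back to a list, in sorted order

print(num_items, num_distinct_items, distinct_item_list)
```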

# Control Flow

`if` statement:

In [154]:
if 2 > 3:
    print("2 is more than 3")
elif 2 > 4:
    print("2 is more than 4")
else:
    print("2 is the best")

2 is the best


`ternary if-then-else`

In [155]:
result = "even" if x % 2 == 0 else "odd"

`while` loop:

In [156]:
x = 0
while x < 3:
    print(x)
    x += 1

0
1
2


Using `for` and `in`:

In [157]:
for i in range(5):
    print(i)

0
1
2
3
4


More complex logic with `continue` and `break`.

In [158]:
for x in range(10):
    if x == 3:
        continue
    if x == 5:
        break
    print(x)

0
1
2
4


# Sorting

In [166]:
x = [4, 6, 2, 3, 1]
# sorted returns a new list and leaves the original unchanged
y = sorted(x)
y

[1, 2, 3, 4, 6]

In [167]:
x

[4, 6, 2, 3, 1]

In [169]:
# the sort method sorts the original list in place
x.sort()
x

[1, 2, 3, 4, 6]

`Sorting` from `largest` to `smallest` by using `reverse=True`.

In [171]:
sorted_x = sorted([-4, 5, -7, 0, 1], reverse=True)
sorted_x

[5, 1, 0, -4, -7]

In [172]:
sorted_abs_x = sorted([-4, 5, -7, 0, 1], key=abs, reverse=True)
sorted_abs_x

[-7, 5, -4, 1, 0]
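
The `key` argument accepts any one-argument function, including a `lambda`. As a sketch, assuming a hypothetical `word_counts` dictionary, this sorts words from the highest count to the lowest:

```python
# Hypothetical word counts.
word_counts = {"data": 3, "science": 2, "python": 5}

# Sort (word, count) pairs by the count, highest first.
by_count = sorted(word_counts.items(),
                  key=lambda pair: pair[1],
                  reverse=True)

print(by_count)
```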

# List Comprehensions

List comprehensions are the `Pythonic` way to `transform` a list into another list, whether by keeping only certain elements, by transforming each element, or both.

In [199]:
even_numbers = [x for x in range(10) if x % 2 == 0]
even_numbers

[0, 2, 4, 6, 8]

In [200]:
squares = [x * x for x in range(5)]
squares

[0, 1, 4, 9, 16]

In [201]:
even_squares = [x * x for x in even_numbers]
even_squares

[0, 4, 16, 36, 64]

Turning a `list` into a `dictionary`.

In [202]:
squares_dict = {x: x * x for x in range(5)}
squares_dict

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Turning a `list` into a `set`.

In [204]:
squares_set = {x * x for x in [-2, 2]}
squares_set

{4}

In [205]:
ones = [1 for _ in range(5)]
ones

[1, 1, 1, 1, 1]

In [208]:
pairs = [(x, y)
        for x in range(2)
        for y in range(5)]
pairs

[(0, 0),
 (0, 1),
 (0, 2),
 (0, 3),
 (0, 4),
 (1, 0),
 (1, 1),
 (1, 2),
 (1, 3),
 (1, 4)]

In [209]:
increasing_pairs = [(x, y)
                   for x in range(2)
                   for y in range(x + 1, 3)]
increasing_pairs

[(0, 1), (0, 2), (1, 2)]

# Randomness

In `data science`, we will frequently need to generate `random numbers`.

In [210]:
import random

four_uniform_randoms = [random.random() for _ in range(4)]
four_uniform_randoms

[0.11484047069520775,
 0.5139082135499312,
 0.23365718552392534,
 0.33094551944020634]

The `random` module actually produces pseudorandom (that is, deterministic) numbers based on an internal state that you can set with `random.seed()` if you want reproducible results.

In [227]:
random.seed(10)
print(random.random())
random.seed(2)
print(random.random())

0.5714025946899135
0.9560342718892494
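
To make the reproducibility concrete, here is a small sketch (the names `first` and `second` are illustrative): seeding twice with the same value yields the same number.

```python
import random

random.seed(10)
first = random.random()

random.seed(10)            # reset the generator to the same state
second = random.random()

print(first == second)     # the two draws are identical
```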


Randomly choose an element from a specific range with `random.randrange`.

In [232]:
print(random.randrange(10))
print(random.randrange(11, 20))

4
15


Reorder the elements of a list by `random.shuffle`.

In [257]:
up_to_five = [x for x in range(5)]
random.shuffle(up_to_five)
print(up_to_five)

[0, 4, 2, 3, 1]


Randomly pick one element from a list by `random.choice`.

In [241]:
my_best_car = ["Nissan", "Honda", "Toyota"]
print(random.choice(my_best_car))

Honda


Randomly choose a `sample` of elements `without duplicates`.

In [252]:
generated_number = [x for x in range(10)]
print(random.sample(generated_number, 3))

[3, 9, 0]


Randomly choose a `sample` of elements, `allowing duplicates`, by making multiple calls to `random.choice`.

In [245]:
print([random.choice(range(0, 10)) for _ in range(3)])

[2, 8, 2]


# enumerate

Use `enumerate` when you want to iterate over a list and use both its `elements` and their `indexes`.

In [261]:
generated_list = [random.choice(range(0, 100)) for _ in range(10)]
generated_list

[83, 67, 31, 62, 35, 63, 64, 65, 45, 84]

In [262]:
for i, ele in enumerate(generated_list):
    print(i, ele)

0 83
1 67
2 31
3 62
4 35
5 63
6 64
7 65
8 45
9 84
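
Two common variations, sketched with a hypothetical `colors` list: `enumerate` accepts a `start` argument when counting should not begin at zero, and an underscore can stand in for the element when only the indexes are needed.

```python
colors = ["red", "green", "blue"]   # hypothetical example data

# Count from 1 instead of 0.
numbered = [(i, color) for i, color in enumerate(colors, start=1)]

# Keep only the indexes, ignoring the elements.
indexes = [i for i, _ in enumerate(colors)]

print(numbered)
print(indexes)
```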


# zip and Argument Unpacking

`zip` combines multiple lists into a single sequence of tuples of corresponding elements. In Python 3 it returns a lazy iterator rather than a list.

In [265]:
list1 = ["a", "b", "c"]
list2 = [1, 2, 3]
zipped_list = zip(list1, list2)

for x in zipped_list:
    print(x)

('a', 1)
('b', 2)
('c', 3)


In [281]:
triple_zip = zip(('a', 1), ('b', 2), ('c', 3))
for x in triple_zip:
    print(x)

('a', 'b', 'c')
(1, 2, 3)


`Unzip` a list of tuples using the `*` trick (argument unpacking).

In [278]:
pairs = [("d", 1), ("t", 2)]
letters, numbers = zip(*pairs)
letters

('d', 't')

In [279]:
numbers

(1, 2)
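
The `*` performs argument unpacking, and it works with any function, not just `zip`. The sketch below uses a hypothetical `add` function, and also shows that `zip` stops as soon as its shortest input is exhausted:

```python
def add(a, b):
    return a + b

pair = [1, 2]
total = add(*pair)    # equivalent to add(1, 2)

# zip stops at the end of the shortest input, so "c" is dropped.
short = list(zip(["a", "b", "c"], [1, 2]))

print(total)
print(short)
```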

# map, filter

`map` function

In [289]:
def double(x):
    return 2 * x

test = [x for x in range(1, 5)]

twice_test1 = list(map(double, test)) # [2, 4, 6, 8]
twice_test2 = [double(x) for x in test]

In [293]:
def multiply(x, y):
    return x * y

products1 = list(map(multiply, [1, 3], [2, 5])) # [2, 15]
products2 = [multiply(x, y) for x, y in zip([1, 3], [2, 5])]

`filter` function

In [294]:
test

[1, 2, 3, 4]

In [298]:
def is_even(x):
    return x % 2 == 0

x_evens1 = list(filter(is_even, test)) # [2, 4]
x_evens2 = [x for x in test if is_even(x)]
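
In Python 3, `map` and `filter` return lazy iterators rather than lists, so the results have to be passed to `list` to see them all at once. A minimal sketch of that behavior, with illustrative names:

```python
numbers = [1, 2, 3, 4]

doubled = map(lambda x: 2 * x, numbers)   # a lazy iterator, not a list
first_pass = list(doubled)                # consumes the iterator
second_pass = list(doubled)               # nothing left the second time

evens = list(filter(lambda x: x % 2 == 0, numbers))

print(first_pass, second_pass, evens)
```

This one-shot behavior is one reason a list comprehension is often the simpler choice.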

# Object-Oriented Programming

Python allows you to define a `class` that `encapsulates` data and the methods that operate on it.

In [300]:
# by convention, we use PascalCase names
class Set:
    # every method takes 'self' as its first parameter (by convention)
    # 'self' refers to the particular 'Set' object being used.
    def __init__(self, values=None):
        """This is the constructor.
        It's called when you create a new 'Set'."""
        self.dict = {}
        
        if values is not None:
            for value in values:
                self.add(value)
                
    def __repr__(self):
        """This is the string representation of a 'Set' object."""
        return "Set: " + str(self.dict.keys())
    
    def add(self, value):
        self.dict[value] = True
        
    def contains(self, value):
        return value in self.dict
    
    def remove(self, value):
        del self.dict[value]

In [301]:
s = Set([1, 2, 3])
s

Set: dict_keys([1, 2, 3])

In [302]:
s.add(4)
print(s.contains(4))

True


In [303]:
s.remove(3)
s

Set: dict_keys([1, 2, 4])

# args and kwargs

Suppose you want to create a `higher-order function` that takes some `function` as input and returns a `new function`.

In [305]:
def doubler(f):
    def g(x):
        return 2 * f(x)
    return g

In [306]:
def f1(x):
    return x + 1

g = doubler(f1)
print(g(2))

6


In [307]:
def f2(x, y):
    return x + y

g2 = doubler(f2)
print(g2(1, 2))

TypeError: g() takes 1 positional argument but 2 were given

This is the problem: `g` breaks when we pass more than one argument. We need a way to specify a function that takes an arbitrary number of arguments, which is what `*args` and `**kwargs` provide.

In [308]:
def doubler_correct(f):
    def g(*args, **kwargs):
        return 2 * f(*args, **kwargs)
    return g

In [312]:
g3 = doubler_correct(f2)
print(g3(1, 2))

6
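
To see what `*args` and `**kwargs` actually hold, here is a small sketch with two hypothetical functions, `magic` and `other_way_magic`. Inside the function, `args` is a tuple of the unnamed arguments and `kwargs` is a dict of the named ones. The same syntax works in the other direction, to supply arguments from a list and a dict:

```python
def magic(*args, **kwargs):
    # args is a tuple of the positional arguments,
    # kwargs is a dict of the keyword arguments.
    return args, kwargs

args, kwargs = magic(1, 2, key="word", key2="word2")
print(args)
print(kwargs)

def other_way_magic(x, y, z):
    return x + y + z

x_y_list = [1, 2]
z_dict = {"z": 3}

# The list fills x and y; the dict fills z.
total = other_way_magic(*x_y_list, **z_dict)
print(total)
```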
