<a href="https://colab.research.google.com/github/Asuskf/introduction_dataScience_exercises/blob/main/Chapter2/book1_chapter2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **The Zen of Python**
### Best practices for programming in **Python**



In [None]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


## **Book1 - exercise 2 part1**

*   Whitespace Formatting
*   Modules



### Whitespace Formatting
*Notes:*
*    The `#` sign starts a comment in **Python**.
*    Python uses indentation, not braces, to delimit blocks of code.
*    Whitespace is ignored inside parentheses and brackets, which can be helpful for long-winded computations:

In [None]:
# The pound sign marks the start of a comment. Python itself
# ignores the comments, but they're helpful for anyone reading the code.
for i in [1, 2, 3, 4, 5]:
    print(i)                    # first line in "for i" block
    for j in [1, 2, 3, 4, 5]:
        print("j", j)           # first line in "for j" block
        print("plus", i + j)    # second line in "for j" block
        print("i", i)           # last line in "for j" block: indented under "for j", so it runs for every j
print("done looping")

1
j 1
plus 2
i 1
j 2
plus 3
i 1
j 3
plus 4
i 1
j 4
plus 5
i 1
j 5
plus 6
i 1
2
j 1
plus 3
i 2
j 2
plus 4
i 2
j 3
plus 5
i 2
j 4
plus 6
i 2
j 5
plus 7
i 2
3
j 1
plus 4
i 3
j 2
plus 5
i 3
j 3
plus 6
i 3
j 4
plus 7
i 3
j 5
plus 8
i 3
4
j 1
plus 5
i 4
j 2
plus 6
i 4
j 3
plus 7
i 4
j 4
plus 8
i 4
j 5
plus 9
i 4
5
j 1
plus 6
i 5
j 2
plus 7
i 5
j 3
plus 8
i 5
j 4
plus 9
i 5
j 5
plus 10
i 5
done looping


In [None]:
long_winded_computation = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 +
                           11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20)
long_winded_computation

210


#### Tips for Whitespace Formatting

In [None]:
# Harder to read: everything on one line
list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Easier to read: one inner list per line
easier_to_read_list_of_lists = [[1, 2, 3],
                                [4, 5, 6],
                                [7, 8, 9]]

In [None]:
two_plus_three = 2 + \
                 3

In [None]:
two_plus_three

5

In [None]:
for i in [1, 2, 3, 4, 5]:

    # notice the blank line
    print(i)

1
2
3
4
5


### Modules (features)
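* Third-party modules must be installed before they can be used.
* Standard-library modules ship with Python but must be loaded with **import**.
* Built-in features such as `print` and `len` are available by default, with no import.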

#### Modules that you need to import

In [None]:
# Import a module by name
import re

# regular expressions
my_regex = re.compile("[0-9]", re.I)

In [None]:
# Import a module under an alias
import re as regex
my_regex = regex.compile("[0-9]+", regex.I)

In [None]:
from collections import defaultdict, Counter
lookup = defaultdict(int)
my_counter = Counter()

In [None]:
lookup

defaultdict(int, {})

In [None]:
lookup["s"]

0

In [None]:
my_counter.update([1,2,2,3,1])

In [None]:
my_counter

Counter({1: 2, 2: 2, 3: 1})

#### Mistake: importing everything with `*`

In [None]:
match = 10
from re import * # uh oh, re has a match function
print(match) # "<function match at 0x10281e6a8>"

<function match at 0x7fac6c3eda20>
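One way to avoid this kind of clash is to import the module itself, so that its functions stay behind the module's namespace. A minimal sketch follows; the variable name `found` and the sample text are only illustrative.

```python
import re

match = 10                                 # our own variable
found = re.match("[0-9]+", "42 apples")    # the module's function, reached through its namespace

print(match)          # our variable was not overwritten
print(found.group())  # the text matched by the pattern
```

Because `match` and `re.match` live in different namespaces, neither one shadows the other.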


## **Book1 - exercise 2 part2**
*   Functions
*   Strings
*   Exceptions

Functions are defined with the keyword **def**.


### **Functions**

In [None]:
def double(x):
    """
    This is where you put an optional docstring that explains what the
    function does. For example, this function multiplies its input by 2.
    """
    return x * 2

Note

**Docstring**: short for "documentation string", a string literal placed as the first statement of a function, class, module, or method to document it. Python stores it in the object's `__doc__` attribute, and `help()` displays it.
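As a small illustration of how a docstring can be read back at run time, the sketch below redefines `double` with a one-line docstring (shortened here for the example) and prints it.

```python
def double(x):
    """Multiplies its input by 2."""
    return x * 2

# The docstring is stored on the function object itself
print(double.__doc__)
```

Calling `help(double)` in a notebook or interpreter shows the same text together with the function signature.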

In [None]:
def apply_to_one(f):
    """Calls the function f with 1 as its argument"""
    return f(1)

In [None]:
my_double = double # refers to the previously defined function
x = apply_to_one(my_double) # equals 2

In [None]:
my_double

<function __main__.double(x)>

In [None]:
apply_to_one(my_double)

2

#### Default arguments

In [None]:
def my_print(message = "my default message"):
    print(message)
my_print("hello") # prints 'hello'
my_print() # prints 'my default message'

hello
my default message


In [None]:
def full_name(first = "What's-his-name", last = "Something"):
    return first + " " + last    # return the string; printing is left to the caller
print(full_name("Joel", "Grus")) # "Joel Grus"
print(full_name("Joel")) # "Joel Something"
print(full_name(last="Grus")) # "What's-his-name Grus"

Joel Grus
Joel Something
What's-his-name Grus


#### **Anonymous functions, or lambdas**

In [None]:
y = apply_to_one(lambda x: x + 4)

In [None]:
another_double = lambda x: 2 * x # don't do this

In [None]:
def another_double(x):
    """Do this instead"""
    return 2 * x
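Lambdas tend to be most useful when a short function is needed once, as an argument to another function. A hedged sketch, using made-up sample data in `pairs`:

```python
pairs = [("Tim", 95), ("Joel", 80), ("Kate", 100)]  # hypothetical sample data

# Sort by the second element of each pair; the lambda is used once and never named
by_grade = sorted(pairs, key=lambda pair: pair[1])
print(by_grade)
```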

### Strings

In [None]:
single_quoted_string = 'data science'
double_quoted_string = "data science"

In [None]:
print(single_quoted_string)

data science


In [None]:
tab_string = "\t" # represents the tab character
len(tab_string) # is 1

1

In [None]:
single_quoted_string = 'data\t science'
print(single_quoted_string)

data	 science


In [None]:
not_tab_string = r"\t" # represents the characters '\' and 't'
len(not_tab_string) # is 2

2

In [None]:
single_quoted_string = r'data\t science'
print(single_quoted_string)

data\t science


In [None]:
multi_line_string = """This is the first line.
and this is the second line
and this is the third line"""

In [None]:
first_name = "Joel"
last_name = "Grus"

In [None]:
full_name1 = first_name + " " + last_name # string addition
full_name2 = "{0} {1}".format(first_name, last_name) # string.format
full_name3 = f"{first_name} {last_name}"  # f-Strings

In [None]:
full_name1

'Joel Grus'

In [None]:
full_name2

'Joel Grus'

In [None]:
full_name3

'Joel Grus'
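An f-string can also hold expressions and format specifications inside the braces. A minimal sketch, where `grade` is a hypothetical value introduced only for this example:

```python
first_name = "Joel"
grade = 80.456  # hypothetical value, not part of the exercise

# An expression and a format spec (one decimal place) inside the braces
summary = f"{first_name.upper()} scored {grade:.1f}"
print(summary)
```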

### Exceptions

In [None]:
try:
    print(0 / 0)
except ZeroDivisionError:
    print("Cannot divide by zero")

Cannot divide by zero
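The same pattern can be wrapped in a function. The sketch below uses a hypothetical helper, `safe_divide`, to show how the exception object can be captured with `as` and how an `else` branch runs only when nothing was raised.

```python
def safe_divide(numerator, denominator):
    """Hypothetical helper: returns None instead of raising on division by zero."""
    try:
        result = numerator / denominator
    except ZeroDivisionError as error:
        print("Cannot divide:", error)   # the exception object carries a message
        return None
    else:
        return result                    # runs only when no exception was raised

print(safe_divide(6, 3))
print(safe_divide(1, 0))
```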


## **Book1 - exercise 2 part3**
* Lists
* Tuples

### Lists

In [None]:
integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [integer_list, heterogeneous_list, []]

list_length = len(integer_list) # equals 3
list_sum = sum(integer_list) # equals 6

#### Get value from list using index

In [None]:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
zero = x[0] # equals 0, lists are 0-indexed
one = x[1] # equals 1
nine = x[-1] # equals 9, 'Pythonic' for last element
eight = x[-2] # equals 8, 'Pythonic' for next-to-last element
x[0] = -1 # now x is [-1, 1, 2, 3, ..., 9]

#### Slice a list (no for loop needed)

In [None]:
first_three = x[:3] # [-1, 1, 2]
three_to_end = x[3:] # [3, 4, ..., 9]
one_to_four = x[1:5] # [1, 2, 3, 4]
last_three = x[-3:] # [7, 8, 9]
without_first_and_last = x[1:-1] # [1, 2, ..., 8]
copy_of_x = x[:] # [-1, 1, 2, ..., 9]

#### Slice a list with a stride

In [None]:
every_third = x[::3] # [-1, 3, 6, 9]
five_to_three = x[5:2:-1] # [5, 4, 3]
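Two further points about slices may be worth seeing in code: a negative stride walks the list backwards, and a slice produces a new list rather than a reference to the original. A short sketch, with `x` redefined so the block runs on its own:

```python
x = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]   # same contents as x after the assignment above

reversed_x = x[::-1]       # a negative stride walks the list backwards
every_second = x[0::2]     # every second element, starting at index 0
copy_of_x = x[:]
copy_of_x[0] = 100         # changing the copy leaves x untouched

print(reversed_x)
print(every_second)
print(x[0], copy_of_x[0])
```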

#### Check value in a list

In [None]:
1 in [1, 2, 3] # True

True

In [None]:
0 in [1, 2, 3] # False

False

#### Add values in a list

In [None]:
x = [1, 2, 3]
x.extend([4, 5, 6])
x

[1, 2, 3, 4, 5, 6]

In [None]:
x = [1, 2, 3]
y = x + [4, 5, 6] # y is [1, 2, 3, 4, 5, 6]; x is unchanged
y

[1, 2, 3, 4, 5, 6]

In [None]:
x = [1, 2, 3]
x.append(0) # x is now [1, 2, 3, 0]

#### Unpack a list into variables

In [None]:
x, y = [1, 2] # now x is 1, y is 2

In [None]:
_, y = [1, 2] # now y == 2, didn't care about the first element
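Unpacking only works when the number of names matches the number of elements; otherwise Python raises a `ValueError`. The sketch below shows that failure and, as one possible alternative, a starred name that collects the remaining elements (the names `first` and `rest` are illustrative).

```python
try:
    x, y = [1, 2, 3]      # three values, but only two names on the left
except ValueError as error:
    print("Unpacking failed:", error)

first, *rest = [1, 2, 3]  # the starred name collects whatever is left over
print(first, rest)
```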

### Tuples

In [None]:
my_list = [1, 2]
my_tuple = (1, 2)
other_tuple = 3, 4
my_list[1] = 3 # my_list is now [1, 3]

In [None]:
try:
    my_tuple[1] = 3
except TypeError:
    print("Cannot modify a tuple")

Cannot modify a tuple
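Tuples are a convenient way to return more than one value from a function, and they combine with unpacking to allow multiple assignment. A minimal sketch, where `sum_and_product` is an illustrative function and not part of the exercise cells above:

```python
def sum_and_product(x, y):
    return (x + y), (x * y)    # returns a single tuple with two elements

sp = sum_and_product(2, 3)     # sp is (5, 6)
s, p = sum_and_product(5, 10)  # s is 15, p is 50

x, y = 1, 2
x, y = y, x                    # swap two variables; now x is 2 and y is 1
print(sp, s, p, x, y)
```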


## **Book1 - exercise 2 part4**
* Dictionaries
* Defaultdict

### Dictionaries

#### Declare dictionary

In [1]:
empty_dict = {} # Pythonic
empty_dict2 = dict() # less Pythonic
grades = {"Joel": 80, "Tim": 95} # dictionary literal

In [3]:
joels_grade = grades["Joel"] # equals 80
joels_grade

80

#### Handle error for key that does not exist

In [5]:
try:
    kates_grade = grades["Kate"]
except KeyError:
    print("No grade for Kate!")

No grade for Kate!


#### Know if the key exists

In [7]:
joel_has_grade = "Joel" in grades # True
joel_has_grade

True

In [6]:
kate_has_grade = "Kate" in grades # False
kate_has_grade

False

#### Query with `get`: returns a default value (`None` unless you specify one) when the key does not exist

In [8]:
joels_grade = grades.get("Joel", 0) # equals 80
joels_grade

80

In [9]:
kates_grade = grades.get("Kate", 0) # equals 0
kates_grade

0

In [10]:
no_ones_grade = grades.get("No One") # default is None
no_ones_grade

#### Add new values in a dict

In [11]:
grades["Tim"] = 99 # replaces the old value
grades["Kate"] = 100 # adds a third entry
num_students = len(grades) # equals 3

#### Exercise


In [12]:
tweet = {
"user" : "joelgrus",
"text" : "Data Science is Awesome",
"retweet_count" : 100,
"hashtags" : ["#data", "#science", "#datascience", "#awesome", "#yolo"]
}

#### Get data from dict


In [13]:
tweet_keys = tweet.keys() # iterable for the keys
tweet_values = tweet.values() # iterable for the values
tweet_items = tweet.items() # iterable for the (key, value) tuples

In [14]:
"user" in tweet_keys # True, but not Pythonic

True

In [15]:
"user" in tweet # Pythonic way of checking for keys

True

In [16]:
"joelgrus" in tweet_values # True (slow but the only way to check)

True
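The `items()` view is typically used to loop over keys and values together. The sketch below uses a shortened copy of the `tweet` dictionary so that it runs on its own; the `"language"` field is a hypothetical addition used to show that views reflect later changes.

```python
tweet = {
    "user": "joelgrus",
    "text": "Data Science is Awesome",
    "retweet_count": 100,
}

# Loop over (key, value) pairs
for key, value in tweet.items():
    print(key, "->", value)

# keys(), values(), and items() return views that track the dictionary
keys_view = tweet.keys()
tweet["language"] = "en"       # hypothetical extra field, added for illustration
print("language" in keys_view)
```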

### Defaultdict

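A `defaultdict` behaves like a regular dictionary, except that looking up a missing key first adds that key with a value produced by the zero-argument function you supplied when creating it. This is handy for tasks such as counting words.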

In [25]:
from collections import defaultdict

In [27]:
dd_list = defaultdict(list) # list() produces an empty list
dd_list[2].append(1) # now dd_list contains {2: [1]}
dd_list

defaultdict(list, {2: [1]})

In [28]:
dd_dict = defaultdict(dict) # dict() produces an empty dict
dd_dict["Joel"]["City"] = "Seattle" # {"Joel" : {"City": "Seattle"}}
dd_dict

defaultdict(dict, {'Joel': {'City': 'Seattle'}})

In [29]:
dd_pair = defaultdict(lambda: [0, 0])
dd_pair[2][1] = 1 # now dd_pair contains {2: [0, 1]}
dd_pair

defaultdict(<function __main__.<lambda>()>, {2: [0, 1]})
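Word counting is a typical use for `defaultdict(int)`. The sketch below is one possible version, with a made-up sample sentence standing in for a real document.

```python
from collections import defaultdict

# Made-up sample text, split into words
document = "the quick brown fox jumps over the lazy dog the end".split()

word_counts = defaultdict(int)   # int() produces 0, so missing words start at zero
for word in document:
    word_counts[word] += 1       # no KeyError and no explicit membership check needed

print(word_counts["the"])
print(word_counts["fox"])
```

With a plain `dict`, the same loop would need either a `try`/`except KeyError` block or a call to `get` with a default of 0.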