<h1>Chapter 2 A Crash Course in Python</h1>

<h2>The Zen of Python</h2>

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


<h2>Whitespace Formatting</h2>

Many languages use curly braces to delimit blocks of code. Python uses indentation instead

In [2]:
for i in [1,2,3,4,5]:
    print(i)
    for j in [1,2,3,4,5]:
        print(j)
        print(i+j)
    print(i)
print("Done Looping")

1
1
2
2
3
3
4
4
5
5
6
1
2
1
3
2
4
3
5
4
6
5
7
2
3
1
4
2
5
3
6
4
7
5
8
3
4
1
5
2
6
3
7
4
8
5
9
4
5
1
6
2
7
3
8
4
9
5
10
5
Done Looping
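
One related detail: whitespace is ignored inside parentheses and brackets, which helps with long expressions and nested lists. A minimal sketch (the variable names are illustrative and do not come from the cells above):

```python
# Whitespace inside parentheses is ignored, so a long expression can span lines.
long_winded_computation = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 +
                           11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20)

# The same holds inside brackets, which makes nested lists easier to read.
easier_to_read_list_of_lists = [[1, 2, 3],
                                [4, 5, 6],
                                [7, 8, 9]]

# Outside brackets, a backslash continues a statement onto the next line.
two_plus_three = 2 + \
                 3
```

The parenthesized form is usually preferred over the backslash, since a stray space after the backslash is a syntax error.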


<h2>Module</h2>

Certain features of Python are not loaded by default. To use them, you need to import the modules that contain them

In [3]:
import re
my_regex = re.compile("[0-9]",re.I)

If you already have a different re in your code, you can import the module under an alias

In [4]:
import re as regex
my_regex = regex.compile("[0-9]",regex.I)

An alias is also handy when the module name is long and you will be typing it again and again

In [5]:
import matplotlib.pyplot as plt

If you need only a few specific values from a module, you can import them explicitly and use them without qualification

In [6]:
from collections import defaultdict, Counter

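You can also import the entire contents of a module into your namespace. This may silently overwrite names you have already defined, so it is best avoided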

In [7]:
# from re import *

<h2>Arithmetic</h2>

Python 2.7 uses integer division by default, so 5 / 2 equals 2

In [8]:
from __future__ import division

After the import above, 5 / 2 equals 2.5
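
The print calls elsewhere in this notebook suggest a Python 3 kernel, where the import has no effect because true division is already the default. A minimal sketch, assuming Python 3 (the variable names are illustrative):

```python
# In Python 3, / is always true division and // is floor division.
true_quotient = 5 / 2        # 2.5
floor_quotient = 5 // 2      # 2
negative_floor = -5 // 2     # -3, because floor division rounds toward negative infinity
```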

<h2>Function</h2>

A function is a rule for taking zero or more inputs and returning a corresponding output. In Python, we define functions with the "def" keyword

In [9]:
def double(x):
    """this is where you put optional docstring
    means what this function does"""
    return x * 2

Python functions are first-class, which means we can assign them to variables and pass them to other functions as arguments

In [10]:
def apply_to_one(f):
    """Calls the function f with 1 as its argument"""
    return f(1)
my_double = double #refers to previously defined function
x = apply_to_one(my_double)
x

2

You can also create short anonymous functions, or lambdas

In [11]:
y = apply_to_one(lambda x:x+4)
y

5

Function parameters can also be given default arguments, which only need to be specified when you want a value other than the default

In [12]:
def my_print(message="my default message"):
    print(message)
my_print('Hello')
my_print()

Hello
my default message


It is sometimes useful to specify arguments by name

In [13]:
def subtract(a=0,b=0):
    return a - b

In [14]:
subtract(10,5)

5

In [15]:
subtract(0,5)

-5

In [16]:
subtract(b=5)

-5
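
Positional and keyword arguments can also be mixed, as long as the positional ones come first. A small sketch that repeats the subtract definition from above so it runs on its own (the result names are illustrative):

```python
def subtract(a=0, b=0):
    """Return a minus b; both parameters default to 0."""
    return a - b

mixed = subtract(10, b=5)        # positional a, keyword b: 10 - 5
reordered = subtract(b=10, a=5)  # keyword order does not matter: 5 - 10
defaults_only = subtract()       # both defaults: 0 - 0
```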

<h2>Strings</h2>

Strings can be delimited by single or double quotation marks

In [17]:
single_quoted_string = 'data science'
double_quoted_string = "data science"

Python uses backslashes to encode special characters

In [18]:
tab_string = "\t"
len(tab_string)

1

If you want backslashes as backslashes, you can create raw strings using r""

In [19]:
not_tab_string = r"\t" # the two characters '\' and 't'
len(not_tab_string)

2
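
To make the difference concrete, here is a short sketch comparing a normal string, a raw string, and an explicitly escaped backslash (the variable names are illustrative):

```python
tab_string = "\t"        # one character: a tab
raw_string = r"\t"       # two characters: a backslash and the letter t
escaped_string = "\\t"   # the same two characters, written with an escaped backslash

lengths = (len(tab_string), len(raw_string))   # (1, 2)
same_text = raw_string == escaped_string       # True
```

Raw strings are most useful for regular expressions and Windows-style paths, where backslashes are common.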

You can create multiline strings using triple quotes

In [20]:
multi_line_string = """this is first Line.
and this is the second line
and this is the third line"""

<h2>Exceptions</h2>

When something goes wrong, Python raises an exception. Unhandled exceptions crash the program, so we handle them with try and except

In [21]:
try:
    print(0/0)
except ZeroDivisionError:
    print("Cannot divide by zero")

Cannot divide by zero
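
The same pattern is often wrapped in a small helper. The sketch below is one possible shape; safe_divide and its default parameter are hypothetical names, not something defined elsewhere in this notebook:

```python
def safe_divide(numerator, denominator, default=None):
    """Return numerator / denominator, or default when dividing by zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Only the expected error is caught; anything else still propagates.
        return default

ok_result = safe_divide(6, 3)          # 2.0
fallback_result = safe_divide(1, 0)    # None
custom_fallback = safe_divide(1, 0, default=0.0)
```

Catching the specific exception type, rather than using a bare except, keeps unrelated bugs from being silently hidden.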


<h2>Lists</h2>

The most fundamental data structure in Python is the list, which is an ordered collection. Other languages might call it an array

In [22]:
integer_list = [1,2,3]
heterogeneous_list = ["string",0.1,True]
list_of_lists = [ integer_list, heterogeneous_list, []]

list_length = len(integer_list)
list_sum = sum(integer_list)

print(list_length)
print(list_sum)

3
6


You can get or set the nth element of a list with square brackets

In [23]:
x = list(range(10)) # is the list [0,1,...,9]
zero = x[0] # equals 0, lists are 0-indexed
one = x[1] # equals 1
nine = x[-1] # equals 9, 'Pythonic' for last element
eight = x[-2] # equals 8, 'Pythonic' for next-to-last element
x[0] = -1 # now x is [-1,1,2,3, ..., 9]

You can also use square brackets to slice lists

In [24]:
first_three = x[:3] #[-1,1,2]
three_to_end = x[3:] # [3,4,...,9]
one_to_four = x[1:5] #[1,2,3,4]
last_three = x[-3:] #[7,8,9]
without_first_and_last = x[1:-1] #[1,2,...,8]
copy_of_x = x[:] #[-1,1,2,... ,9]
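
A slice can also take a third number, the stride, which may be negative. This goes slightly beyond the cell above, so treat it as a supplementary sketch (numbers and the result names are illustrative):

```python
numbers = list(range(10))          # [0, 1, ..., 9]

every_third = numbers[::3]         # [0, 3, 6, 9]
five_to_three = numbers[5:2:-1]    # [5, 4, 3]
reversed_copy = numbers[::-1]      # [9, 8, ..., 0]
```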

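Python has an 'in' operator to check for list membership. The check examines the elements one at a time, so it is only appropriate for small lists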

In [25]:
print(1 in [1,2,3])
print(0 in [1,2,3])

True
False


It is easy to concatenate lists. To modify a list in place, use extend to add items from another collection

In [26]:
x = [1,2,3]
x.extend([4,5,6])
x

[1, 2, 3, 4, 5, 6]

If you don't want to modify x, you can use list addition, which returns a new list and leaves x unchanged

In [27]:
x = [1,2,3]
y = x + [4,5,6]
y

[1, 2, 3, 4, 5, 6]

More frequently, we append to lists one item at a time

In [28]:
x = [1,2,3]
x.append(0) # x is now [1,2,3,0]
y = x[-1] 
z = len(x)
print(y)
print(z)

0
4


It is often convenient to unpack a list when you know how many elements it contains. You will get a ValueError if the number of elements is not the same on both sides

In [29]:
x, y  = [1, 2]
print(x)
print(y)

1
2
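
A short sketch of the failure case, using hypothetical names (first, second, error_message) and catching the error so the cell does not stop:

```python
try:
    first, second = [1, 2, 3]   # three values, but only two names
    unpacked = True
except ValueError as error:
    unpacked = False
    error_message = str(error)  # describes the mismatch in element counts
```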


It's common to use an underscore for a value you're going to throw away

In [30]:
_, y = [1,2] #now y==2, didn't care about the first element
print(y)

2


<h2>Tuple</h2>

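Tuples are lists' immutable cousins. Pretty much anything you can do to a list that doesn't involve modifying it, you can do to a tuple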

In [31]:
my_list = [1, 2]
my_tuple = (1, 2)
other_tuple = 3, 4
my_list[1]  = 3 # my list is now [1, 3]
try:
    my_tuple[1] = 3
except TypeError:
    print("Cannot modify a tuple")

Cannot modify a tuple


Tuples are a convenient way to return multiple values from functions

In [32]:
def sum_and_product(x, y):
    return (x + y), (x * y)

sp = sum_and_product(2,3)
s, p = sum_and_product(5,10)
print(sp)
print(s)
print(p)

(5, 6)
15
50


Tuples (and lists) can also be used for multiple assignment

In [33]:
x, y = 1,2 # now x is 1 , y is 2
x, y = y, x #Pythonic way to swap variables; now x is 2, y is 1
print(x)
print(y)

2
1


<h2>Dictionaries</h2>

Another fundamental data structure is a dictionary, which associates values with keys

In [34]:
empty_dict = {} # Pythonic
empty_dict2 = dict() # less Pythonic
grades = {"Joel":80,"Tim":95} # dictionary literal
grades

{'Joel': 80, 'Tim': 95}

In [35]:
joels_grade = grades["Joel"]
joels_grade

80

You will get a KeyError if you ask for a key that's not in the dictionary

In [36]:
try:
    kates_grades = grades["kate"]
except KeyError:
    print("no grade for kate !")

no grade for kate !


You can check for the existence of a key using "in"

In [37]:
joel_has_grade = "Joel" in grades # True
kate_has_grade = "kate" in grades # False
print(joel_has_grade)
print(kate_has_grade)

True
False


Dictionaries have a "get" method that returns a default value (instead of raising an exception) when you look up a key that's not in the dictionary

In [38]:
joels_grade = grades.get("Joel",0) # equals to 80
kates_grade = grades.get("kate",0) # equals to 0
no_ones_grade = grades.get("No One") # default default is None
print(joels_grade)
print(kates_grade)
print(no_ones_grade)

80
0
None


You can assign key-value pairs

In [39]:
grades["Tim"] = 99 #replaces the old value
grades["kate"] = 100 # adds a third entry
num_students  = len(grades)
print(grades)
print(num_students)

{'Joel': 80, 'Tim': 99, 'kate': 100}
3


We will frequently use dictionaries as a simple way to represent structured data

In [40]:
tweet = {
    "user":"joelgrus",
    "text":"Data Science is Awesome",
    "retweet_count":100,
    "hashtags":["#data","#science","#datascience","#awesome","#yolo"]
}

Besides looking for specific keys we can look at all of them

In [44]:
tweet_keys = tweet.keys() # view of the keys (a list in Python 2)
tweet_values = tweet.values() # view of the values
tweet_items = tweet.items() # view of (key, value) tuples


In [45]:
"user" in tweet_keys  # True; slow on a Python 2 list, but a fast hash lookup on a Python 3 keys view

True

In [46]:
"user" in tweet # more Pythonic, uses the fast dict in

True

In [47]:
"joelgrus" in tweet_values # True

True
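
In Python 3, keys, values, and items return view objects rather than lists. The sketch below, which uses a separate illustrative dictionary named scores, shows that a view tracks later changes and how to take a real list when one is needed:

```python
scores = {"Joel": 80, "Tim": 95}

keys_view = scores.keys()          # a dynamic view, not a list
scores["Kate"] = 100               # the existing view reflects this new key
kate_in_view = "Kate" in keys_view

keys_snapshot = list(scores)       # an actual list of keys, in insertion order
has_perfect_score = 100 in scores.values()   # found by scanning the values
```

Membership tests on the dictionary or its keys view are hash lookups, while a test on values has to scan every value.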

<h2>defaultdict</h2>

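Imagine that you are trying to count the words in a document. An obvious approach is to create a dictionary in which the keys are words and the values are counts. As you check each word, you increment its count if it is already in the dictionary and add it if it is not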

In [49]:
# word_counts = {}
# for word in document:
#     if word in word_counts:
#         word_counts[word] += 1
#     else:
#         word_counts[word] = 1

You could also use the "forgiveness is better than permission" approach and just handle the exception from trying to look up a missing key

In [50]:
# word_counts = {}
# for word in document:
#     try:
#         word_counts[word] += 1
#     except KeyError:
#         word_counts[word] = 1

A third approach is to use get, which behaves gracefully for missing keys

In [51]:
# word_counts = {}
# for word in document:
#     previous_count = word_counts.get(word,0)
#     word_counts[word] = previous_count + 1

All of these approaches are slightly unwieldy, which is why defaultdict is useful. A defaultdict is like a regular dictionary, except that when you look up a key it doesn't contain, it first adds a value for that key using a zero-argument function you provided when you created it.
To use defaultdict, you have to import it from the collections module

In [56]:
from collections import defaultdict

# word_counts = defaultdict(int) # int() produces 0 
# for word in document:
#     word_counts[word] += 1
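
The cells above are commented out because no document is defined at this point. Here is a runnable sketch on a made-up word list; sample_words and sample_counts are illustrative names, not part of the notebook:

```python
from collections import defaultdict

# Hypothetical input: a short sentence split into words.
sample_words = "the quick brown fox jumps over the lazy dog".split()

sample_counts = defaultdict(int)   # int() produces 0 for missing keys
for word in sample_words:
    sample_counts[word] += 1

the_count = sample_counts["the"]   # 2
cat_count = sample_counts["cat"]   # 0, and "cat" is now a key as a side effect
```

Note the side effect on the last line: merely looking up a missing key inserts it, so use "in" or get when you only want to check.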

They can also be useful with list or dict or even your own functions

In [57]:
dd_list = defaultdict(list) # list() produces an empty list
dd_list[2].append(1)
dd_list

defaultdict(list, {2: [1]})

In [59]:
dd_dict = defaultdict(dict) # dict() produces an empty dict
dd_dict["Joel"]["City"] = "seattle" # {"Joel": {"City": "seattle"}}
dd_dict

defaultdict(dict, {'Joel': {'City': 'seattle'}})

In [60]:
dd_pair = defaultdict(lambda: [0, 0])
dd_pair[2][1] = 1
dd_pair

defaultdict(<function __main__.<lambda>()>, {2: [0, 1]})

<h2>Counter</h2>

A Counter turns a sequence of values into a defaultdict(int)-like object mapping keys to counts. We will primarily use it to create histograms

In [61]:
from collections import Counter
c = Counter([0,1,2,0])
c

Counter({0: 2, 1: 1, 2: 1})

This gives us a very simple way to solve the word_counts problem. Counter needs a sequence of words, so we split the string first (passing the string itself would count characters)

In [65]:
document = "this is for counter example"
word_counts = Counter(document.split())
word_counts

Counter({'this': 1, 'is': 1, 'for': 1, 'counter': 1, 'example': 1})

A Counter instance has a most_common method that is frequently useful

In [67]:
# print the 2 most common words and their counts
for word, count in word_counts.most_common(2):
    print(word, count)

this 1
is 1


<h2>Sets</h2>

Another data structure is the set, which represents a collection of distinct elements

In [2]:
s = set()
s.add(1)
s.add(2)
s.add(2)
x = len(s)
y = 2 in s 
z = 3 in s

We will use sets for two main reasons. The first is that in is a very fast operation on sets. If we have a large collection of items that we want to use for a membership test, a set is more appropriate than a list

In [8]:
hundreds_of_other_words = ["otherwords"]
stopwords_list = ["a","an","at"] + hundreds_of_other_words + ["yet","you"]
"zip" in stopwords_list #False, but have to check every element

False

In [9]:
stopwords_set = set(stopwords_list)
"zip" in stopwords_set # very fast to check

False
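
A typical use of this is filtering stopwords out of a list of words. The sketch below uses a small made-up set and word list (stopwords, words, and content_words are illustrative names):

```python
stopwords = {"a", "an", "at", "yet", "you"}      # set literal
words = ["you", "look", "at", "an", "apple"]

# Each membership test against the set is a hash lookup, not a scan.
content_words = [word for word in words if word not in stopwords]
```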

The second reason is to find the distinct items in a collection

In [10]:
item_list = [1,2,3,1,2,3]
num_items = len(item_list) # 6
item_set = set(item_list) # {1,2,3}
num_distinct_items = len(item_set) # 3
distinct_item_list = list(item_set) # [1,2,3]


<h2>Control Flow</h2>

As in most programming languages, you can perform an action conditionally using if

In [11]:
if 1 > 2:
    message = "if only 1 were greater than two.."
elif 1 > 3:
    message = "elif stands for 'else if'"
else:
    message = "when all else fails use else (if you want to)"

You can also write a ternary if-then-else on one line

In [12]:
parity = "even" if x % 2 == 0 else "odd"

Python has a while loop

In [13]:
x = 0 
while x < 10:
    print(x, "is less than 10")
    x += 1

0 is less than 10
1 is less than 10
2 is less than 10
3 is less than 10
4 is less than 10
5 is less than 10
6 is less than 10
7 is less than 10
8 is less than 10
9 is less than 10


Although more often we'll use for and in

In [14]:
for x in range(10):
    print(x, "is less than 10")

0 is less than 10
1 is less than 10
2 is less than 10
3 is less than 10
4 is less than 10
5 is less than 10
6 is less than 10
7 is less than 10
8 is less than 10
9 is less than 10


If you need more complex logic, you can use continue and break

In [15]:
for x in range(10):
    if x == 3:
        continue # go immediately to the next iteration
    if x == 5:
        break # quit the loop entirely
    print(x)

0
1
2
4


<h2>Truthiness</h2>

Booleans in Python work as in most other languages, except that they're capitalized

In [16]:
one_is_less_than_two = 1 < 2 # equals True
true_equals_false = True == False # equals False

Python uses the value None to indicate a nonexistent value. It is similar to other languages' null

In [17]:
x = None
print(x == None) # prints True, but is not Pythonic
print(x is None) # prints True, and is Pythonic

True
True


Python lets you use any value where it expects a Boolean. The following are all "falsy"
<ul>
    <li>False</li>
    <li>None</li>
    <li>[] (an empty list)</li>
    <li>{} (an empty dict)</li>
    <li>""</li>
    <li>set()</li>
    <li>0</li>
    <li>0.0</li>
</ul>

In [18]:
def some_function_that_returns_a_string():
    return "some string"
s = some_function_that_returns_a_string()
if s:
    first_char = s[0]
else:
    first_char = ""

A simpler way of doing the same is

In [23]:
first_char = s and s[0]
first_char

's'

This works because and returns its second value when the first is "truthy" and the first value when it's not.
Similarly, if x is either a number or possibly None, the following is definitely a number

In [25]:
safe_x = x or 0
safe_x

0
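
One caveat with the or idiom: it replaces every falsy value, not only None. The following sketch, with illustrative names, contrasts it with an explicit None check:

```python
missing = None
safe_missing = missing or 0     # 0, because None is falsy

measurement = 0.0               # a legitimate value that happens to be falsy
replaced = measurement or 1.5   # 1.5: the real value 0.0 was discarded
kept = measurement if measurement is not None else 1.5   # 0.0 is preserved
```

When zero or an empty string is a meaningful value, the explicit "is not None" form is the safer choice.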

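Python has an "all" function, which takes a list and returns True precisely when every element is truthy, and an "any" function, which returns True when at least one element is truthy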

In [31]:
print(all([True, 1,{3}]))
print(all([True, 1,{}]))     
print(any([True, 1,{}]))
print(all([]))
print(any([]))

True
False
True
True
False
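
Both functions accept any iterable, so they are often combined with a generator expression to test a condition across a collection. A minimal sketch with made-up data (numbers, all_even, and any_odd are illustrative names):

```python
numbers = [2, 4, 6, 7]

all_even = all(n % 2 == 0 for n in numbers)   # False, because 7 is odd
any_odd = any(n % 2 == 1 for n in numbers)    # True, because 7 is odd
```

Both short-circuit: all stops at the first falsy element and any stops at the first truthy one.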
