# A Crash Course in Python

## The Basics

### Whitespace Formatting
Many languages use curly braces to delimit blocks of code. Python uses **indentation**:

In [1]:
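for i in [1, 2, 3, 4, 5]:
    print(i)                    # first line in "for i" block
    for j in [1, 2, 3, 4, 5]:
        print(j)                # first line in "for j" block
        print(i + j)            # last line in "for j" block
    print(i)                    # last line in "for i" block
print("done looping")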

1
1
2
2
3
3
4
4
5
5
6
1
2
1
3
2
4
3
5
4
6
5
7
2
3
1
4
2
5
3
6
4
7
5
8
3
4
1
5
2
6
3
7
4
8
5
9
4
5
1
6
2
7
3
8
4
9
5
10
5
done looping


Whitespace is **ignored** inside parentheses and brackets, which helps with long-winded computations:

In [2]:
long_winded_computation=(1+2+3+4+5+6+7+8+9+10
                        +11+12+
                        13+14+15+16+17+18+19+20)

It is also useful for making code easier to read:

In [3]:
list_of_lists = [[1,2,3], [4,5,6], [7,8,9]]
easier_to_read_list_of_lists = [ [1, 2, 3],
                                 [4, 5, 6],
                                 [7, 8, 9] ]

Use a **backslash** to indicate that a statement continues onto the next line

In [4]:
two_plus_three = 2 + \
                 3

### Modules

**Import** the modules that contain the features you need.
- For example, re is the module containing functions and constants for working with regular expressions.

In [5]:
import re
my_regex = re.compile("[0-9]+", re.I)

In [6]:
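# help(re.compile)
# Are cells independent? No: all cells share one namespace,
# so the import from an earlier cell is still available here.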

In [7]:
help(compile)

Help on built-in function compile in module builtins:

compile(source, filename, mode, flags=0, dont_inherit=False, optimize=-1)
    Compile source into a code object that can be executed by exec() or eval().
    
    The source code may represent a Python module, statement or expression.
    The filename will be used for run-time error messages.
    The mode must be 'exec' to compile a module, 'single' to compile a
    single (interactive) statement, or 'eval' to compile an expression.
    The flags argument, if present, controls which future statements influence
    the compilation of the code.
    The dont_inherit argument, if true, stops the compilation inheriting
    the effects of any future statements in effect in the code calling
    compile; if absent or false these statements do influence the compilation,
    in addition to any features explicitly specified.



You may use an **alias**

In [8]:
import re as regex
my_regex = regex.compile("[0-9]+",regex.I)

In [9]:
import matplotlib.pyplot as plt

You can import them explicitly and use them **without qualification**

In [10]:
from collections import defaultdict, Counter
lookup = defaultdict(int)
my_counter = Counter()

You could import the entire contents of a module into your
namespace, which might inadvertently overwrite variables you’ve already defined:

In [11]:
match = 10
from re import *   # uh oh, re has a match function
print(match)       # now prints the re.match function instead of 10

<function match at 0x000001F94147A5E8>


In [12]:
# What does * mean? It imports every public name the module exports.

### Arithmetic

Remember the **quotient-remainder theorem**:

$$n = d \cdot q + r$$

- $n$ is the dividend, $d$ is the divisor, $q$ is the quotient, $r$ is the remainder,
- In Python, $0 \leq r \lt d$ when $d$ is positive and $d \lt r \leq 0$ when $d$ is negative
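As a quick check of the theorem, the sketch below uses the built-in divmod, which returns the pair (n // d, n % d). The helper name check_division and the sample pairs are made up for this illustration. It confirms that the identity holds and that the remainder is zero or takes the sign of the divisor.

```python
def check_division(n, d):
    """Return (q, r) for n divided by d and verify n == d*q + r."""
    q, r = divmod(n, d)          # same as (n // d, n % d)
    assert n == d * q + r
    # The remainder is zero or has the same sign as the divisor.
    assert (0 <= r < d) if d > 0 else (d < r <= 0)
    return q, r

# Try every sign combination of dividend and divisor.
for n, d in [(5, 3), (-5, 3), (5, -3), (-5, -3)]:
    q, r = check_division(n, d)
    print(f"{n} = {d} * {q} + {r}")
```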

In [13]:
print(2**10)
print(2** 0.5)
print(2** -0.5)
print(5 / 2)
print(5 % 3)
print(5 // 3)
print((-5) % 3)
print((-5) // 3)
print(5 % (-3))
print((-5) // (-3))
print((-5) % (-3))
print(7.2 // 3.5)
print(7.2 % 3.5)

1024
1.4142135623730951
0.7071067811865476
2.5
2
1
1
-2
-1
1
-2
2.0
0.20000000000000018


### Functions

- A function is a rule for taking zero or more inputs and returning a corresponding output

In [14]:
# see PEP 257 for docstring conventions
def double(x):
    """this is where you put an optional docstring
    that explains what the function does.
    for example, this function multiplies its input by 2"""
    return x * 2

double(2)
    

4

In [15]:
help(double)

Help on function double in module __main__:

double(x)
    this is where you put an optional docstring
    that explains what the function does.
    for example, this function multiplies its input by 2



Python functions are **first-class**, which means that we can assign them to variables and
pass them into functions just like any other arguments:

In [16]:
def apply_to_one(f):
    """calls the function f with 1 as its argument"""
    return f(1)

my_double = double
x = apply_to_one(my_double)

print(x)

2


**Lambda function**: short anonymous functions

In [17]:
# help("lambda")   # lambda is a keyword, so it must be passed to help as a string

In [18]:
y = apply_to_one(lambda x:x +4)
print(y)

5


In [19]:
another_double = lambda x: 2 * x         # works, but don't do this
def another_double(x): return 2 * x      # do this instead

In [20]:
add=lambda x,y:x+y
add(1,2)

3

Function parameters can also be given **default arguments**

In [21]:
def my_print(message="my default message"): #arguments
    print(message)
    
my_print("hello")
my_print() # default argument

hello
my default message


In [22]:
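def subtract(a=0, b=0):
    return a - b

subtract(10, 5)        # returns 5
subtract(0, 5)         # returns -5
subtract(b=5)          # same as the previous call
subtract(b=5, a=20)    # returns 15
# A notebook displays the value of a cell's last expression, so no print() is needed.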

15

### Strings

- Strings can be delimited by single or double quotation marks

In [23]:
single_quoted_string='data science'
double_quoted_string="data science"

In [24]:
tab_string= "\t"
len(tab_string)

1

You can create multiline strings using triple double quotes:

In [25]:
multi_line_string="""This is the first line.
and this is the second line 
and this is the third line"""

### Exceptions

- When something goes wrong, Python raises an exception.

In [26]:
try:
    print(0 / 0)
except ZeroDivisionError:
    print("cannot divide by zero")

cannot divide by zero


## Lists

- the most fundamental data structure in Python

In [27]:
integer_list =[1,2,3]
heterogeneous_list = ["string", 0.1, True]   # elements of different types are possible
list_of_lists=[integer_list,heterogeneous_list,[]]

list_length =len(integer_list)
list_sum=sum(integer_list)

You can get or set the nth element of a list with square brackets

In [28]:
x=list(range(10))
zero=x[0]
one=x[1]
nine=x[-1]
eight=x[-2]
x[0]=-1


You can also use square brackets to “slice” lists:

In [29]:
first_three=x[:3]
three_to_end=x[3:]
one_to_four=x[1:5]
last_three=x[-3:]
without_first_and_last=x[1:-1]
copy_of_x=x[:]
#make a copy

In [30]:
print(x[::2])    # every second element (a step of 2)
print(x[::-1])   # all elements in reverse order (a step of -1)

[-1, 2, 4, 6, 8]
[9, 8, 7, 6, 5, 4, 3, 2, 1, -1]


**in** operator to check for list membership

In [31]:
1 in [1, 2, 3]   # True
0 in [1, 2, 3]   # False

False

To concatenate lists together:

In [32]:
# extend chains another list onto the end of x, modifying x in place
x=[1,2,3]
x.extend([4,5,6])
print(x)

[1, 2, 3, 4, 5, 6]


In [33]:
x=[1,2,3]
y=x+[4,5,6]
print(x,y)

[1, 2, 3] [1, 2, 3, 4, 5, 6]


To append to lists one item at a time:

In [34]:
x = [1, 2, 3]
x.append(0)      # x is now [1, 2, 3, 0]
y = x[-1]        # equals 0
z = len(x)       # equals 4

It is convenient to unpack lists:

In [35]:
x,y=[1,2]

In [36]:
_, y = [1, 2]    # now y == 2; the underscore marks a value we don't care about

### Tuples

- Tuples are lists’ **immutable** cousins.

In [37]:
# The comma is what makes a tuple; the parentheses are optional.
my_list=[1,2]
my_tuple=(1,2)
other_tuple=3,4
my_list[1]=3
try:
    my_tuple[1]=3
except TypeError:
    print("cannot modify a tuple")

cannot modify a tuple


Tuples are a convenient way to **return multiple values** from functions:

In [38]:
def sum_and_product(x, y):
    return (x + y), (x * y)   # returns a single tuple holding both values

sp = sum_and_product(2, 3)       # equals (5, 6)
s, p = sum_and_product(5, 10)    # s is 15, p is 50

Tuples (and lists) can also be used for **multiple assignment**:

In [39]:
x,y=1,2
x, y = y, x   # swaps the variables: a new tuple (y, x) is built and unpacked, so no tuple is mutated

### Dictionaries

- Another fundamental data structure which associates **values with keys**
- It allows you to quickly retrieve the value corresponding to a given key:

In [40]:
# A dict resembles a lookup table in Excel or an object in a JSON tree.
empty_dict = {}                       # Pythonic
empty_dict2 = dict()                  # less Pythonic
grades = {"Joel": 80, "Tim": 95}      # dictionary literal

You can look up the value for a **key using square brackets**:

In [41]:
joels_grade=grades["Joel"]

In [42]:
grades = {"Joel": 80, "Tim": 95, "Tim": 94}   # a repeated key keeps only the last value
grades

{'Joel': 80, 'Tim': 94}

In [43]:
len(grades)

2

In [44]:
grades.keys()

dict_keys(['Joel', 'Tim'])

In [45]:
grades.values()

dict_values([80, 94])

In [46]:
grades["Tim"]

94

**KeyError** if you ask for a key that’s not in the dictionary:

In [47]:
try:
    kates_grade=grades["Kate"]
except KeyError:
    print("no grade for Kate!")

no grade for Kate!


You can **check for the existence of a key** using in:

In [48]:
joel_has_grade = "Joel" in grades   # True
kate_has_grade = "Kate" in grades   # False

Dictionaries have a get method that returns a default value (**instead of raising an
exception**) when you look up a key that’s not in the dictionary:

In [49]:
joels_grade=grades.get("Joel",0)
kates_grade=grades.get("Kate",0)
no_ones_grade=grades.get("No One")
no_ones_grade is None   # with no default given, get returns None

True

You assign key-value pairs using the same square brackets:

In [50]:
grades["Tim"]=99
grades["Kate"] =100
num_students=len(grades)

We will frequently use dictionaries as a simple way to represent **structured data**:

In [51]:
tweet={
    "user":"joelgrus",
    "text": "data Science is Awesome",
    "retweet_count": 100,
    "hashtags": ["#data", "#science","#datascience","#awesome","#yolo"]
    
}

**Iteration**: besides looking up specific keys, we can look at all of the keys, values, or key-value pairs:

In [52]:
tweet_keys= tweet.keys()
tweet_values = tweet.values()
tweet_items=tweet.items()

"user" in tweet_keys           # True
"user" in tweet                # True, and the more Pythonic way to ask

"joelgrus" in tweet_values     # True, but has to scan the values one by one

True

**WordCount Example**: Create a dictionary in which the keys are words and the values are counts.

In [53]:
document=['I','am','a','boy','I','love','you']

**First Approach:**

In [54]:
word_counts={}
for word in document:
    if word in word_counts:
        word_counts[word]+=1
    else: 
        word_counts[word]=1
        

**Second Approach:**

In [55]:
word_counts={}
for word in document:
    try:
        word_counts[word]+=1
    except KeyError:
        word_counts[word]=1
    

**Third Approach:**

In [56]:
word_counts={}
for word in document:
    previous_count=word_counts.get(word,0)
    word_counts[word]=previous_count+1

In [57]:
word_counts

{'I': 2, 'am': 1, 'a': 1, 'boy': 1, 'love': 1, 'you': 1}

### defaultdict

- A defaultdict is like a regular dictionary, except that when you try to look up a key it doesn’t contain, it first adds a value for it using a **zero-argument function** you provided when you created it.
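One consequence of the description above is easy to miss: a lookup with square brackets on a missing key stores the default value in the dictionary. The sketch below illustrates this with made-up keys. A membership test with in and the get method do not add anything.

```python
from collections import defaultdict

counts = defaultdict(int)     # int() returns 0, so missing keys start at 0

print("apple" in counts)      # a membership test does not create the key
value = counts["apple"]       # a lookup with [] does: int() is called and stored
print(value, dict(counts))

# get behaves as on a plain dict and leaves the dictionary untouched.
print(counts.get("pear"), len(counts))
```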

In [58]:
from collections import defaultdict

word_counts = defaultdict(int)   # int() called with no arguments returns 0

for word in document:
    word_counts[word]+=1

print(word_counts)

defaultdict(<class 'int'>, {'I': 2, 'am': 1, 'a': 1, 'boy': 1, 'love': 1, 'you': 1})


In [59]:
int()

0

In [60]:
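dd_list = defaultdict(list)            # list() produces an empty list
dd_list[2].append(1)                   # now dd_list contains {2: [1]}

dd_dict = defaultdict(dict)            # dict() produces an empty dict
dd_dict["Joel"]["City"] = "Seattle"    # {"Joel": {"City": "Seattle"}}
dd_pair = defaultdict(lambda: [0, 0])
dd_pair[2][1] = 1                      # now dd_pair contains {2: [0, 1]}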

### Counter

- A Counter turns a sequence of values into a defaultdict(int)-like object mapping keys to counts.
- We will primarily use it to create **histograms**
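To make the histogram use case concrete, here is a minimal sketch. The grades list is invented for this illustration, and the text rendering with # characters is just one simple way to display the counts.

```python
from collections import Counter

# Hypothetical sample data: exam grades, invented for this sketch.
grades = [83, 95, 91, 87, 70, 0, 85, 82, 100, 67, 73, 77, 0]

# Bucket each grade into its decile (83 -> 80), then count the buckets.
histogram = Counter(grade // 10 * 10 for grade in grades)

# Print one row per bucket, with one # per grade in that bucket.
for bucket in sorted(histogram):
    print(f"{bucket:3d} | {'#' * histogram[bucket]}")
```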

In [61]:
from collections import Counter
c= Counter([0,1,2,0])

In [62]:
word_counts=Counter(document)
word_counts

Counter({'I': 2, 'am': 1, 'a': 1, 'boy': 1, 'love': 1, 'you': 1})

Use the **help** function to see an object's documentation (similar to a man page):

In [63]:
help(word_counts)

Help on Counter in module collections object:

class Counter(builtins.dict)
 |  Counter(*args, **kwds)
 |  
 |  Dict subclass for counting hashable items.  Sometimes called a bag
 |  or multiset.  Elements are stored as dictionary keys and their counts
 |  are stored as dictionary values.
 |  
 |  >>> c = Counter('abcdeabcdabcaba')  # count elements from a string
 |  
 |  >>> c.most_common(3)                # three most common elements
 |  [('a', 5), ('b', 4), ('c', 3)]
 |  >>> sorted(c)                       # list all unique elements
 |  ['a', 'b', 'c', 'd', 'e']
 |  >>> ''.join(sorted(c.elements()))   # list elements with repetitions
 |  'aaaaabbbbcccdde'
 |  >>> sum(c.values())                 # total of all counts
 |  15
 |  
 |  >>> c['a']                          # count of letter 'a'
 |  5
 |  >>> for elem in 'shazam':           # update counts from an iterable
 |  ...     c[elem] += 1                # by adding 1 to each element's count
 |  >>> c['a']                          

In [64]:
for word, count in word_counts.most_common(10):
    print(word,count)

I 2
am 1
a 1
boy 1
love 1
you 1


### Sets

- Another data structure is **set**, which represents a collection of **distinct** elements:

In [65]:
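s = set()
s.add(1)       # s is now {1}
s.add(2)       # s is now {1, 2}
s.add(2)       # s is still {1, 2}
x = len(s)     # equals 2
y = 2 in s     # equals True
z = 3 in s     # equals False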

- For a membership test, a set is more appropriate than a list
- **in** is a very fast operation on sets.
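A rough way to see the difference is to time the same membership test on a list and on a set. This is only a sketch: the data is made up, and the absolute timings depend on the machine, so no particular numbers are claimed here.

```python
import timeit

# Hypothetical data: 100,000 integers, stored both ways.
numbers_list = list(range(100_000))
numbers_set = set(numbers_list)

missing = -1  # worst case for the list: every element must be checked

# Run each membership test 100 times and record the total time in seconds.
list_time = timeit.timeit(lambda: missing in numbers_list, number=100)
set_time = timeit.timeit(lambda: missing in numbers_set, number=100)

print(f"list: {list_time:.5f} s, set: {set_time:.5f} s")
```

The list has to compare the value against its elements one by one, while the set uses hashing to jump almost directly to the answer.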

In [66]:
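hundreds_of_other_words = []   # placeholder for the rest of the stopwords
stopwords_list = ["a", "an", "at"] + hundreds_of_other_words + ["yet", "you"]
"zip" in stopwords_list        # False, but has to check every element

stopwords_set = set(stopwords_list)
"zip" in stopwords_set         # very fast to check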

False

To find the **distinct** items in a collection:

In [67]:
item_list = [1, 2, 3, 1, 2, 3]
num_items = len(item_list)              # 6
item_set = set(item_list)               # {1, 2, 3}
num_distinct_items = len(item_set)      # 3
distinct_item_list = list(item_set)     # [1, 2, 3]

### Control Flow

**if** statement:

In [68]:
if 1>2:
    message="if only I were greater than two..."
elif 1>3:
    message = "elif stands for 'else if'"
else:
    message="when all else fails use else (if you want to)"
    

a **ternary** if-then-else on one line

In [69]:
# "ternary" means "made of three parts": a value, a condition, and an alternative
parity= "even" if x%2==0 else "odd"

**while** statement:

In [70]:
x=0
while x<10:
    print(x, "is less than 10")
    x+=1

0 is less than 10
1 is less than 10
2 is less than 10
3 is less than 10
4 is less than 10
5 is less than 10
6 is less than 10
7 is less than 10
8 is less than 10
9 is less than 10


**for** statement

In [71]:
for x in range(10):  # range(10) is an iterable
    # range is a lazy sequence rather than a generator, but it also produces values on demand
    print(x,"is less than 10")

0 is less than 10
1 is less than 10
2 is less than 10
3 is less than 10
4 is less than 10
5 is less than 10
6 is less than 10
7 is less than 10
8 is less than 10
9 is less than 10


**continue** and **break** statements:

In [72]:
for x in range(10):
    if x==3:
        continue
    if x==5:
        break
    print(x)

0
1
2
4


### Truthiness

In [73]:
one_is_less_than_two=1<2
true_equals_false=True==False

Python uses the value **None** to indicate a nonexistent value

In [74]:
x=None
print(x == None)   # prints True, but is not Pythonic
print(x is None)   # prints True, and is Pythonic


True
True


The following are all “Falsy”:

```python
False
None
[]       # an empty list
{}       # an empty dict
""
set()
0
0.0
```

In [77]:
s='abc'
if s:
    first_char =s[0]
else:
    first_char = ""

In [78]:
first_char = s and s[0]   # "and" returns s itself if s is falsy, otherwise s[0]

In [79]:
x=None
safe_x =x or 0
safe_x

0

In [None]:
# Both forms rely on short-circuit evaluation: "and" and "or" stop at the first operand that decides the result.

- Python has an **all** function, which takes an iterable (such as a list) and returns True precisely when every element is truthy.
- It also has an **any** function, which returns True when at least one element is truthy:

In [80]:
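all([True, 1, {3}])   # True
all([True, 1, {}])    # False, {} is falsy
any([True, 1, {}])    # True, True is truthy
all([])               # True, there are no falsy elements in the list
any([])               # False, there are no truthy elements in the list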

False

## The Not-So-Basics

### Sorting

In [None]:
x = [4, 1, 2, 3]
y = sorted(x)    # y is [1, 2, 3, 4], x is unchanged
x.sort()         # now x is [1, 2, 3, 4]

In [81]:
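# sort by absolute value, from largest to smallest
x = sorted([-4, 1, -2, 3], key=abs, reverse=True)   # [-4, 3, -2, 1]
# sort the words and counts from highest count to lowest
wc = sorted(word_counts.items(),
            key=lambda x: x[1],
            reverse=True)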

wc

[('I', 2), ('am', 1), ('a', 1), ('boy', 1), ('love', 1), ('you', 1)]

### List Comprehensions

- Frequently, you'll want to transform a list into another list, by choosing only certain elements, or by transforming elements, or both. The Pythonic way of doing this is with list comprehensions.
- Prefer a list comprehension over an explicit loop whenever it keeps the code readable.

In [2]:
even_numbers=[x for x in range(5) if x%2==0]
squares = [x*x for x in range(5)]
even_squares= [x*x for x in even_numbers]

You can similarly turn lists into dictionaries or sets:

In [4]:
square_dict={x:x *x for x in range(5)}
square_set={x*x for x in [1,-1]}

- If you don't need the value from the list, it's conventional to use an underscore as the variable:

In [6]:
zeroes=[0 for _ in even_numbers]

A list comprehension can include multiple **for**s:

In [8]:
pairs=[(x,y)
      for x in range(10)
      for y in range(10)]

This makes it easy to enumerate every pair, for example when computing a distance matrix.
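As a rough illustration of the distance-matrix remark, the sketch below uses a dict comprehension with two fors to store the Euclidean distance for every pair of indexes. The points and the helper name distance are made up for this example.

```python
import math

# Hypothetical 2-D points, invented for this sketch.
points = [(0, 0), (3, 4), (6, 8)]

def distance(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

n = len(points)

# Two fors in one comprehension visit every (i, j) pair of indexes.
distances = {(i, j): distance(points[i], points[j])
             for i in range(n)
             for j in range(n)}

print(distances[(0, 1)])   # distance between points[0] and points[1]
```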

Later **for**s can use the results of earlier ones:

In [9]:
increasing_pairs=[(x,y)
                 for x in range(10)
                 for y in range(x+1,10)]

### Generators and Iterators

- A problem with lists is that they can easily grow very big. list(range(1000000)) creates an actual list of 1 million elements (in Python 3, range on its own is already lazy). If you only need to deal with them one at a time, this can be a huge source of inefficiency (or of running out of memory). If you potentially only need the **first few** values, then calculating them all is a waste.
- A generator is something that you can iterate over (for us, usually using for) but whose values are produced only as needed (lazily).
- One way to create generators is with functions and the **yield** keyword:

In [12]:
def lazy_range(n):
    """a lazy version of range"""
    i=0
    while i<n:
        yield i
        i+=1

In [13]:
for i in lazy_range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


In [15]:
for i in lazy_range(10000):
    if i == 3:
        break   # stop early; the remaining values are never computed
    print(i)

0
1
2


In [16]:
t = lazy_range(3)
next(t)   # 0
next(t)   # 1
next(t)   # 2; one more call would raise StopIteration

2

In [17]:
def lazy_inf_range():
    i=0
    while True:
        yield i
        i+=1
t= lazy_inf_range()
next(t)
next(t)
next(t)

2

A second way to create generators is by using for comprehensions wrapped in parentheses:

In [18]:
lazy_evens_below_20=(i for i in lazy_range(20) if i %2==0)

In [20]:
lazy_evens_below_20

<generator object <genexpr> at 0x00000201876F92C8>
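The output above shows a generator object rather than values, because nothing has been computed yet. The sketch below repeats lazy_range so that it is self-contained and illustrates a related point: a generator can be consumed only once. The names first_pass and second_pass are introduced for this example.

```python
def lazy_range(n):
    """a lazy version of range (repeated here so the sketch is self-contained)"""
    i = 0
    while i < n:
        yield i
        i += 1

lazy_evens_below_20 = (i for i in lazy_range(20) if i % 2 == 0)

first_pass = list(lazy_evens_below_20)    # consumes the generator
second_pass = list(lazy_evens_below_20)   # nothing is left to produce

print(first_pass)
print(second_pass)
```

If the values are needed more than once, either store them in a list or create the generator again.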

### Randomness

- To generate random numbers, we can use the random module
- random.random() produces numbers uniformly between 0 and 1

In [21]:
import random
four_uniform_randoms= [random.random() for _ in range(4)]
four_uniform_randoms

[0.662954355751201, 0.04743388701186346, 0.4524836077062905, 0.979524993134663]

If you want to get reproducible results, set the seed with random.seed:

In [22]:
random.seed(10)
print(random.random())
random.seed(10)
print(random.random())

0.5714025946899135
0.5714025946899135


random.randrange takes either 1 or 2 arguments and returns
an element chosen randomly from the corresponding range()

In [23]:
random.randrange(10)      # choose randomly from range(10) = [0, 1, ..., 9]
random.randrange(3, 6)    # choose randomly from range(3, 6) = [3, 4, 5]

4

random.shuffle randomly reorders the elements of a list:

In [24]:
up_to_ten=list(range(10))
random.shuffle(up_to_ten)
print(up_to_ten)

[4, 5, 8, 1, 2, 6, 7, 3, 0, 9]


To randomly pick one element from a list:

In [25]:
my_best_friend = random.choice(["Alice","Bob","Charlie"])

To randomly choose a sample of elements without replacement (i.e., with
no duplicates)

In [27]:
lottery_numbers=range(60)
winning_numbers= random.sample(lottery_numbers,6)

To choose a sample of elements with replacement (i.e., allowing duplicates)

In [28]:
four_with_replacement=[random.choice(range(10)) for _ in range(4)]
four_with_replacement

[2, 9, 5, 6]

### Regular Expressions

- Regular expressions provide a way of searching text.
- They are incredibly useful but also fairly complicated, so much so that there are entire books written about them.

In [29]:
import re
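print(all([
    not re.match("a", "cat"),                   # "cat" doesn't start with "a"
    re.search("a", "cat"),                      # "cat" has an "a" in it
    not re.search("c", "dog"),                  # "dog" doesn't have a "c" in it
    3 == len(re.split("[ab]", "carbs")),        # split on a or b to ["c", "r", "s"]
    "R-D-" == re.sub("[0-9]", "-", "R2D2")      # replace digits with dashes
]))  # prints True because every condition holds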

True


### Object-Oriented Programming

In [32]:
# By convention, class names use PascalCase (each word capitalized, no underscores).
class Set:
    def __init__(self,values=None):
        
        self.dict={}
        
        if values is not None:
            for value in values:
                self.add(value)
                
    def __repr__(self):
        return 'set: '+ str(self.dict.keys())
    
    def add(self,value):
        self.dict[value]=True
        
    def contains(self,value):
        return value in self.dict
    
    def remove(self,value):
        del self.dict[value]

In [33]:
s=Set([1,2,3])
s.add(4)
print(s.contains(4))
s.remove(3)
print(s.contains(3))

True
False


### Functional Tools

- When passing functions around, sometimes we’ll want to **partially apply (or curry)** functions to create new functions.

In [2]:
def exp(base,power):
    return base**power

def two_to_the(power):
    return exp(2,power)

In [12]:
two_to_the(3)

8

A different approach is to use functools.partial:

In [4]:
from functools import partial
two_to_the =partial(exp,2)
print(two_to_the(3))

8


In [13]:
square_of= partial(exp,power=2)
print(square_of(3))

9


We will also occasionally use **map, reduce, and filter**, which provide functional alternatives to list comprehensions:

- Use map, reduce, and filter where they read more clearly than the equivalent list comprehension.

#### Map

In [14]:
def double(x):
    return 2*x

xs=[1,2,3,4]
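twice_xs = [double(x) for x in xs]    # [2, 4, 6, 8]
twice_xs = map(double, xs)            # same values, as a lazy iterator

list_doubler = partial(map, double)   # function that doubles every element
twice_xs = list_doubler(xs)           # again a lazy iterator; wrap it in list() to see the values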

In [16]:
def multiply(x,y): return x*y

products=map(multiply,[1,2],[4,5])
list(products)

[4, 10]

In [17]:
def multiply(x,y,z): return x*y*z

products=map(multiply,[1,2],[4,5],[10,20])
list(products)

[40, 200]

#### Filter

In [19]:
def is_even(x):
    """True if x is even, False if x is odd"""
    return x%2==0

x_evens = [x for x in xs if is_even(x)]
x_evens=filter(is_even,xs)
# filter keeps the elements for which is_even returns True
print(list(x_evens))
list_evener=partial(filter,is_even)
x_evens=list_evener(xs)
print(list(x_evens))

[2, 4]
[2, 4]


#### Reduce

In [22]:
from functools import reduce

def multiply(x,y): return x*y

xs=[1,2,3]
x_product=reduce(multiply,xs)
print(x_product)
list_product = partial(reduce, multiply)
x_product=list_product(xs)
print(x_product)

6
6


### enumerate 

- To iterate over a list and use both its elements and their indexes:

In [26]:
documents=['I','am','a','boy']

# not Pythonic
for i in range(len(documents)):
    document = documents[i]
    print(i, document)

# also not Pythonic
i = 0
for document in documents:
    print(i, document)
    i += 1

0 I
1 am
2 a
3 boy
0 I
1 am
2 a
3 boy


The Pythonic solution is enumerate, which produces tuples (index, element):

In [27]:
for i,document in enumerate(documents):
    print(i,document)

0 I
1 am
2 a
3 boy


In [28]:
for i in range(len(documents)): print(i)
for i,_ in enumerate(documents): print(i)

0
1
2
3
0
1
2
3


### zip and unzip

- Often we will need to zip two or more lists together.
- zip transforms multiple iterables into a single iterator of tuples of corresponding elements (wrap it in list() to see them):

In [8]:
list1=['a','b','c']
list2=[1,2,3]
list(zip(list1,list2))

[('a', 1), ('b', 2), ('c', 3)]

You can also “unzip” a list using a strange trick:

In [9]:
pairs=[('a',1),('b',2),('c',3)]
letters, numbers = zip(*pairs)   # * unpacks pairs so that each tuple becomes a separate argument to zip
print(letters,numbers)

('a', 'b', 'c') (1, 2, 3)


In [10]:
pairs=[('a',1),('b',2),('c',3)]
letters, numbers = zip(('a', 1), ('b', 2), ('c', 3))   # the same call as zip(*pairs), written out by hand
print(letters,numbers)

('a', 'b', 'c') (1, 2, 3)
