# Python Review

By default, a notebook cell displays only the value of its last expression. The following cell makes the notebook display the value of every expression in a cell. We will start all of our notebooks with this code.

In [1]:
# print all the outputs in a cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [2]:
%autosave 0

Autosave disabled


## Basics

### Variables are dynamically typed

We don't need to specify variable types: the interpreter infers the type from the assigned value. A variable <b>a</b> can later be rebound to a value of a different type.

In [3]:
a = 3
type(a)

int

In [4]:
a ='Hello'
type(a)

str

### Mathematical operations

Here are examples of addition (+), subtraction (-), multiplication (\*), division (/), and exponentiation (\*\*).

In [5]:
a = 2
b = 3
a + b
a - b
a * b
a / b
a ** b

5

-1

6

0.6666666666666666

8

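In Python 2, dividing two integers with / performed integer division, so the old way to get a floating-point result was to convert one operand to a float first. In Python 3, / always returns a float, so these conversions are no longer needed: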

In [6]:
(a+0.0) / b

0.6666666666666666

In [7]:
float(a) / b

0.6666666666666666
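The difference between true division and integer division can be sketched with the `//` (floor division) and `%` (remainder) operators. The variable names below are illustrative and do not come from the notebook.

```python
# True division, floor division, and remainder in Python 3
a = 2
b = 3

true_div = a / b      # always a float in Python 3
floor_div = a // b    # floor division: what / did for two ints in Python 2
remainder = a % b     # remainder of the floor division

print(true_div)
print(floor_div)
print(remainder)

# Floor division rounds toward negative infinity, not toward zero
neg_floor = -7 // 2
print(neg_floor)
```

Use `//` whenever an integer result is wanted, for example when computing an index.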

Printing numbers: we can convert them to strings

In [8]:
print ('a equals to ' + str(a))

a equals to 2


or ...

In [9]:
print ("a equals to %d " % a)

a equals to 2 
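Two other common ways to build the same message are `str.format` and f-strings (available since Python 3.6). This is a small sketch; `pi_approx` is an illustrative value that is not part of the notebook.

```python
a = 2
pi_approx = 3.14159  # illustrative value, not from the notebook

# str.format fills the {} placeholders in order
msg_format = 'a equals {}'.format(a)

# f-strings evaluate the expression inside the braces
msg_fstring = f'a equals {a}'

# A format specifier after the colon controls the number of decimals
msg_rounded = f'pi is roughly {pi_approx:.2f}'

print(msg_format)
print(msg_fstring)
print(msg_rounded)
```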


### Conditions

Check whether a is greater than b <br/>
a = 3 <br/>
b = 5

In [10]:
a = 3
b =5
if a> b:
    print ('a is greater than b')
else:
    print (' a is smaller or equal to b')
print (' I compare a and b')

 a is smaller or equal to b
 I compare a and b


More concisely:

In [11]:
result = 'a is greater than b' if a>b else 'a is smaller or equal to b'

In [12]:
result

'a is smaller or equal to b'

### Functions

In Python,
<ul>
<li>Functions do not need to declare a return type</li>
<li>Parameters do not need to declare a type</li>
<li>Parameters may have a default value</li>
<li>Indentation (not braces) defines the body of the function</li>
</ul>

Here is a function that sums two numbers <i>a</i> and <i>b</i> after writing a message. The default value of <i>b</i> is 0.

In [13]:
def mysum(a, b=0):
    print ('mysum is starting')
    return a+b

In [14]:
mysum(4,5)

mysum is starting


9

In [15]:
mysum(4)

mysum is starting


4

In [16]:
mysum('hello')

mysum is starting


TypeError: can only concatenate str (not "int") to str

In [None]:
mysum('Hello','World')
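Because parameters are untyped, the same function works for any pair of values that supports `+`. The sketch below uses a variant of `mysum` without the message, and shows positional arguments, keyword arguments, and the error raised when the default value does not fit. The result names are illustrative.

```python
def mysum(a, b=0):
    """Return a + b; b defaults to 0."""
    return a + b

total_positional = mysum(4, 5)       # arguments matched by position
total_keyword = mysum(b=5, a=4)      # arguments matched by name
joined = mysum('Hello', 'World')     # + concatenates strings
merged = mysum([1, 2], [3])          # + concatenates lists

# With a string and the default b=0, + is not defined, so Python raises TypeError
error_message = None
try:
    mysum('hello')
except TypeError as err:
    error_message = str(err)

print(total_positional, total_keyword, joined, merged)
print(error_message)
```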

## Lists

Ordered and mutable sequences of items (of possibly different types). Denoted by square brackets.

In [17]:
a = [3, -4.5, 'hello', [-1,1], -1, 2]

In [18]:
a

[3, -4.5, 'hello', [-1, 1], -1, 2]

Indexing starts at 0 (first item). Negative indices count from the end of the list, so -1 is the last item.

In [19]:
a[0]

3

In [20]:
a[3]

[-1, 1]

In [21]:
a[-1]

2

The function <i>len</i> returns the length of the list

In [22]:
len(a)

6

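In Python 2, the **range** function returns a list. In Python 3, it returns a *range object*: a lazy, immutable sequence that computes its items on demand, which allows fast iteration with minimal memory. A range is an iterable, not an iterator, so it can be looped over more than once.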

In [23]:
range(5)

range(0, 5)

In [24]:
type(range(5))

range

In [25]:
for item in range(5):
    print (item)

0
1
2
3
4
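A range object behaves like a read-only sequence. The sketch below shows the three-argument form `range(start, stop, step)` together with `len`, indexing, and membership tests; the variable names are illustrative.

```python
r = range(2, 11, 3)       # start=2, stop=11 (excluded), step=3

as_list = list(r)         # expand all values at once
length = len(r)           # number of items, computed without expanding
second = r[1]             # indexing works like on a list
has_eight = 8 in r        # membership test

# Unlike an iterator, a range can be traversed more than once
first_pass = [x for x in r]
second_pass = [x for x in r]

print(as_list, length, second, has_eight)
```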


To modify the values of a *range* object, or to expand them all at once, convert it to a *list* first. A list can then be changed with these methods and statements:
<ul>
<li><b>append</b>: add an item at the end</li>
<li><b>insert</b>: insert an item at a specific index</li>
<li><b>del</b>: delete the item at a specific index (a statement, not a list method)</li>
<li><b>remove</b>: remove the first instance of an item</li>
<li><b>pop</b>: remove and return the item at a specific index</li>
</ul>

In [26]:
l = range(5)

In [27]:
l = list(range(5))

In [28]:
type(l)

list

In [29]:
l.append(10)
l

[0, 1, 2, 3, 4, 10]

In [30]:
l.insert(5,2)
l

[0, 1, 2, 3, 4, 2, 10]

In [31]:
l.remove(2)
l

[0, 1, 3, 4, 2, 10]

In [32]:
l.pop(3)
l

4

[0, 1, 3, 2, 10]

In [33]:
del(l[3])
l

[0, 1, 3, 10]

In [34]:
del(l)

In [35]:
l

NameError: name 'l' is not defined

### Slicing

We can select part of a list with the slice syntax **[start:stop:step]**, where the item at index *stop* is excluded.
a[1:4] returns a list containing the items from index 1 (included) to index 4 (excluded). a[:3] returns a list containing all elements up to index 3 (excluded).

In [36]:
a

[3, -4.5, 'hello', [-1, 1], -1, 2]

In [37]:
a[1:4]

[-4.5, 'hello', [-1, 1]]

In [38]:
a[:3]

[3, -4.5, 'hello']

In [39]:
a[2:]

['hello', [-1, 1], -1, 2]

In [40]:
a[-3:]

[[-1, 1], -1, 2]

In [41]:
a[0:5:2]

[3, 'hello', -1]

Slicing a list returns a new list (a shallow <b>copy</b>), not a view. Assigning to a slice, however, modifies the original list in place.

In [42]:
a

[3, -4.5, 'hello', [-1, 1], -1, 2]

In [43]:
a[:2]

[3, -4.5]

In [44]:
a[:2] = [0, 0]

In [45]:
a

[0, 0, 'hello', [-1, 1], -1, 2]
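The following sketch separates the two cases: reading a slice gives a new list, while assigning to a slice changes the original. It also shows that the copy is shallow, so nested lists are shared. The names `part` and `nested` are illustrative.

```python
a = [3, -4.5, 'hello', [-1, 1], -1, 2]

# Reading a slice creates a new list
part = a[:2]
part[0] = 99             # changes only the copy
first_item = a[0]        # the original list still starts with 3

# Assigning to a slice modifies the original list in place
a[:2] = [0, 0]

# The copy is shallow: the inner list is the same object in both lists
nested = a[3:4]          # a new outer list holding the same inner list
nested[0].append(7)      # visible through a[3] as well

print(part)
print(a)
```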

### Concatenate lists

In [46]:
a = [1, 2, 3]
b = [10, 11, 12]
a + b

[1, 2, 3, 10, 11, 12]

### Extend - append the items of a sequence to a list

In [47]:
a.extend(b)
a

[1, 2, 3, 10, 11, 12]
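The three ways of combining lists differ in what they change. A short sketch, using copies of `a` so that each operation starts from the same list (the names `combined`, `a_extended`, and `a_appended` are illustrative):

```python
a = [1, 2, 3]
b = [10, 11, 12]

combined = a + b          # builds a new list; a and b are unchanged

a_extended = list(a)      # copy of a
a_extended.extend(b)      # adds each item of b, in place

a_appended = list(a)      # another copy of a
a_appended.append(b)      # adds b itself as a single nested item

print(combined)
print(a_extended)
print(a_appended)
```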

### Iterating

In [48]:
for item in a:
    print(item * 2)

2
4
6
20
22
24


### Check presence

We can check whether a list contains an item with **in**, **not in**

In [49]:
a

[1, 2, 3, 10, 11, 12]

In [50]:
2 in a

True

In [51]:
4 in a

False

In [52]:
4 not in a

True

### Reverse

In [53]:
a=[5,3,8,6]

In [54]:
a.reverse()
a

[6, 8, 3, 5]

### Sort - sort the list in place

In [55]:
a.sort()
a

[3, 5, 6, 8]

In [56]:
a=[5,3,8,6]

### sorted(x) returns a new sorted list without changing the original x

In [57]:
b=sorted(a)

In [58]:
b

[3, 5, 6, 8]
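Both `sort` and `sorted` accept the optional `reverse` and `key` parameters. The sketch below also shows that `sort` returns `None`, a common source of bugs; the `words` list is illustrative data.

```python
a = [5, 3, 8, 6]
words = ['pear', 'fig', 'banana', 'kiwi']   # illustrative data

# reverse=True sorts in descending order; a itself is unchanged
descending = sorted(a, reverse=True)

# key is a function applied to each item to obtain its sort value
by_length = sorted(words, key=len)

# list.sort works in place and returns None
returned = a.sort()

print(descending)
print(by_length)
print(returned, a)
```

Items with equal keys keep their original relative order, which is why 'pear' stays ahead of 'kiwi'.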

## Tuples

* Denoted by parentheses ().
* Just like lists, but immutable (although member objects may be mutable).  
* Support all sequence operations that do not modify the sequence.
* If the content of a list shouldn't change, use a tuple to prevent items from accidentally being added, changed, or deleted.

In [59]:
t = (4, 5, 6)

In [60]:
t

(4, 5, 6)

In [61]:
t[0]

4

In [62]:
t[:2]

(4, 5)

In [63]:
del(t[1])

TypeError: 'tuple' object doesn't support item deletion

In [64]:
t[0]=0

TypeError: 'tuple' object does not support item assignment

### Create a tuple from a list

In [65]:
a=[5,3,8,6]

In [66]:
t1=tuple(a)

In [67]:
t1

(5, 3, 8, 6)

### Two-item tuple containing a list and an int

In [68]:
t2=([1,2],3)

In [69]:
del(t2[0][1])

In [70]:
t2

([1], 3)
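One practical consequence, sketched here as an aside: a tuple whose members are all immutable can be used as a dictionary key, while a tuple that contains a list cannot be hashed. The names `point`, `labels`, and `hashable` are illustrative.

```python
# A tuple of immutable items can serve as a dictionary key
point = (4, 5)
labels = {point: 'corner'}

# A tuple containing a list still allows the inner list to change
t2 = ([1, 2], 3)
t2[0].append(9)

# ... and for that reason it cannot be hashed
try:
    hash(t2)
except TypeError:
    hashable = False
else:
    hashable = True

print(labels[(4, 5)])
print(t2, hashable)
```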

In [71]:
%timeit t[:2]

96.1 ns ± 0.949 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [72]:
a

[5, 3, 8, 6]

In [73]:
%timeit a[:2]

95.1 ns ± 0.479 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


## Set

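A set is an unordered collection of items. Every element is unique (no duplicates) and must be hashable, which in practice means immutable: numbers, strings, and tuples are allowed, but lists are not. Sets are written with curly braces {}; an empty set is created with set(), because {} is an empty dictionary.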

However, the set itself is mutable. We can add or remove items from it.

In [74]:
my_set = {1,2,3,4,3,2}
my_set

{1, 2, 3, 4}

In [75]:
my_set = {1.0, "Hello", (1, 2, 3)}

In [76]:
my_set

{(1, 2, 3), 1.0, 'Hello'}

In [77]:
a=[1,2,3,4,3,2]
type(a)
a

list

[1, 2, 3, 4, 3, 2]

In [78]:
my_set=set(a)
type(my_set)

set

In [79]:
my_set

{1, 2, 3, 4}

In [80]:
my_set.add(5)
my_set

{1, 2, 3, 4, 5}

In [81]:
my_set.remove(2)
my_set

{1, 3, 4, 5}
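Sets also support the usual mathematical operations. The sketch below shows union, intersection, and difference, the difference between `remove` and `discard`, and what happens when a mutable item is added. The set names are illustrative.

```python
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

union = evens | small          # items in either set
common = evens & small         # items in both sets
only_evens = evens - small     # items in evens but not in small

# discard ignores a missing item; remove raises KeyError
small.discard(10)
remove_failed = False
try:
    small.remove(10)
except KeyError:
    remove_failed = True

# A list is mutable (unhashable), so it cannot be a set element
list_allowed = True
try:
    bad = {[1, 2]}
except TypeError:
    list_allowed = False

print(union, common, only_evens)
```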

## Dictionaries

Dictionaries are data structures that contain (key, value) pairs, where the keys are unique. They are denoted by curly brackets { }. 

<b>Example</b>: Let us make a dictionary where the keys are letters and the values are booleans indicating which letters are vowels.

In [82]:
v = {'a':True, 'b':False, 'c':False, 'd':False, 'e':True}

Values are accessed using the keys.

In [83]:
v['b']

False

In [84]:
v['e']

True

In [85]:
v['z']

KeyError: 'z'

You can use the operators **in** and **not in** to check whether a dictionary contains a key.

In [87]:
'z' in v

False

In [88]:
'z' not in v

True

In [89]:
'c' in v

True
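When a missing key should not raise a `KeyError`, the `get` method returns a default value instead. A minimal sketch, reusing the vowel dictionary:

```python
v = {'a': True, 'b': False, 'c': False, 'd': False, 'e': True}

z_default = v.get('z')           # missing key: returns None
z_false = v.get('z', False)      # missing key: returns the given default
e_value = v.get('e', False)      # existing key: returns the stored value

print(z_default, z_false, e_value)
```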

<b>In-class exercise</b>: make a dictionary that reports the number of students in each course. Assume that there are three courses (data science, linux, and statistics) with 28, 23, and 29 students, respectively.

In [90]:
d = {'data science': 28, 'linux': 23, 'statistics': 29}

In [91]:
d

{'data science': 28, 'linux': 23, 'statistics': 29}

Equivalently, we can add elements one by one:

In [92]:
d1 ={}
d1['data science'] = 28
d1['linux'] = 23
d1['statistics'] = 29

In [93]:
d1

{'data science': 28, 'linux': 23, 'statistics': 29}

Testing equality:
<ul>
<li><b>==</b> tests component-wise equality</li>
<li><b>is</b> tests whether the instance is the same</li>
</ul>

In [94]:
d == d1

True

In [95]:
d is d1

False

In [96]:
d2 = d

In [97]:
d2 is d

True
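The distinction matters when a dictionary is modified. An assignment such as `d2 = d` creates a second name for the same object, while `copy` creates an independent (shallow) copy. A sketch with the illustrative names `alias` and `snapshot`:

```python
d = {'data science': 28, 'linux': 23, 'statistics': 29}

alias = d             # same object, second name
snapshot = d.copy()   # new dictionary with the same content

alias['linux'] = 24   # the change is visible through d as well

print(d['linux'])          # changed
print(snapshot['linux'])   # unchanged
print(alias is d, snapshot is d, snapshot == d)
```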

## <i>For...in</i> loops

The for loops in Python are more like "foreach" loops

In [98]:
for i in [1, 'hello', 9.98]:
    print ('The current value is ' + str(i))

The current value is 1
The current value is hello
The current value is 9.98


A very typical loop:

In [99]:
for i in range(10):
    print ('i^2 = ' + str(i**2))

i^2 = 0
i^2 = 1
i^2 = 4
i^2 = 9
i^2 = 16
i^2 = 25
i^2 = 36
i^2 = 49
i^2 = 64
i^2 = 81
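When the position of each item is needed, or when two lists must be traversed together, the built-in functions `enumerate` and `zip` avoid manual indexing. A small sketch; the lists reuse the course data from the dictionary exercise.

```python
courses = ['data science', 'linux', 'statistics']
sizes = [28, 23, 29]

# enumerate yields (index, item) pairs
lines = []
for position, course in enumerate(courses):
    lines.append(f'{position}: {course}')

# zip pairs up the items of several sequences
pairs = []
for course, size in zip(courses, sizes):
    pairs.append((course, size))

print(lines)
print(pairs)
```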


### Lambda (Anonymous) Function

In Python, an anonymous function is a function that is defined without a name.

While normal functions are defined using the def keyword, in Python anonymous functions are defined using the lambda keyword.

Hence, anonymous functions are also called lambda functions.

Syntax: **lambda arguments: expression**

With a named function defined using def:

In [100]:
def double(x):
    return x*2

In [101]:
double(5)

10

With lambda function:

In [102]:
double_lambda = lambda x: x * 2

In [103]:
double_lambda(5)

10

### Lambda function usage example
* We want to replace the values of the column "Job", as follows:
>* No, I'm not working at the moment --> 0  
>* Yes, I have a part-time job --> 0.5  
>* Yes, I have a full-time job --> 1

In [104]:
# The following code is from the Module 5 lab notes.
# It cannot be executed in this notebook; it is listed here for reference only.
#
#df['Job3'] = df['Job'].apply(lambda x: 0 if x.startswith('No') \
#                                   else 0.5 if 'part-time' in x \
#                                   else 1)

We use lambda functions when we require a nameless function for a short period of time.
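The same mapping can be tried without pandas by applying the lambda to a plain list with `map`. This is only a sketch: the `answers` list stands in for the "Job" column, and the name `encode` is illustrative.

```python
# Stand-in for the values of the "Job" column
answers = [
    "No, I'm not working at the moment",
    'Yes, I have a part-time job',
    'Yes, I have a full-time job',
]

# Same chained conditional expression as in the lab code
encode = lambda x: 0 if x.startswith('No') else 0.5 if 'part-time' in x else 1

# map applies the function to every item; list expands the result
encoded = list(map(encode, answers))

print(encoded)
```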

### List comprehension

We can create lists (or sets, or dictionaries) through loops using a very dense syntax called "list comprehension". There is no tuple comprehension: to build a tuple, pass a generator expression to tuple().

<b>Example</b>: create a list with these values: 0, 1, 4, 9, 16, ..., 81

In [105]:
l = []
for i in range(10):
    l.append(i**2)

In [106]:
l

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [107]:
l = [i**2 for i in range(10)]
l

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Build the same list, but with 0s as the first three elements:

In [108]:
[0 if i<3 else i**2 for i in range(10)]

[0, 0, 0, 9, 16, 25, 36, 49, 64, 81]

In [109]:
[i**2 if i>=3 else 0 for i in range(10)]

[0, 0, 0, 9, 16, 25, 36, 49, 64, 81]

Creating a Dictionary {0:0, 1:1, 2:4, 3:9, 4:16, ..., 9:81}

In [110]:
{i:i**2 for i in range(10)}

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
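A comprehension can also filter items with a trailing `if`, and the same syntax builds sets. The sketch below adds a tuple built from a generator expression; all variable names are illustrative.

```python
# A trailing if keeps only the items that satisfy the condition
even_squares = [i**2 for i in range(10) if i % 2 == 0]

# Curly braces without key: value pairs build a set (duplicates collapse)
last_digits = {i**2 % 10 for i in range(10)}

# Parentheses alone create a generator; tuple() turns it into a tuple
squares_tuple = tuple(i**2 for i in range(5))

print(even_squares)
print(sorted(last_digits))
print(squares_tuple)
```

Note the difference in position: a filtering `if` goes after the `for`, while a conditional expression (`x if cond else y`) goes before it.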

### Looping through dictionary entries

We can loop through dictionary keys

In [111]:
d

{'data science': 28, 'linux': 23, 'statistics': 29}

In [112]:
d.keys()

dict_keys(['data science', 'linux', 'statistics'])

In [113]:
type(d.keys())

dict_keys

In [114]:
list(d.keys())

['data science', 'linux', 'statistics']

In [115]:
d.values()

dict_values([28, 23, 29])

We can loop through dictionary values:

In [116]:
d.values()

dict_values([28, 23, 29])

In [117]:
for v in d.values():
    print (str(v))

28
23
29


In [118]:
list(d.items())

[('data science', 28), ('linux', 23), ('statistics', 29)]

We can loop through dictionary (key,value) pairs

In [119]:
for k, v in d.items():
    print ('the key is ' + str(k) + '.  The value is ' + str(v))

the key is data science.  The value is 28
the key is linux.  The value is 23
the key is statistics.  The value is 29
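The views returned by `values` and `items` can be passed directly to built-in functions. For example, the courses can be listed from largest to smallest, a sketch that uses a lambda as the sort key (the result names are illustrative):

```python
d = {'data science': 28, 'linux': 23, 'statistics': 29}

# Sort the (key, value) pairs by value, largest first
by_size = sorted(d.items(), key=lambda kv: kv[1], reverse=True)

total = sum(d.values())        # total number of students
largest = max(d, key=d.get)    # key with the largest value

for course, size in by_size:
    print(course + ': ' + str(size))
print(total, largest)
```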


## Strings

Strings are another type of sequence, and they can be sliced and accessed just like any sequence.

In [120]:
s = 'Welcome to the best Data Science course ever!!!!'

In [121]:
s

'Welcome to the best Data Science course ever!!!!'

Some important string methods:
<ul>
<li><b>replace</b>: replaces the occurrences of a string with another</li>
<li><b>lower/upper</b>: makes the string lower/upper case</li>
<li><b>rstrip/lstrip/strip</b>: remove the given characters (whitespace by default) from the right end, the left end, or both ends of the string</li>
<li><b>split</b>: splits the string in substrings</li>
</ul>

In [122]:
s.replace('best', 'worst')

'Welcome to the worst Data Science course ever!!!!'

In [123]:
s

'Welcome to the best Data Science course ever!!!!'

In [124]:
s.lower()

'welcome to the best data science course ever!!!!'

In [125]:
s.upper()

'WELCOME TO THE BEST DATA SCIENCE COURSE EVER!!!!'

In [126]:
s

'Welcome to the best Data Science course ever!!!!'

In [127]:
s.rstrip('!?,. ')

'Welcome to the best Data Science course ever'

In [128]:
s.split()

['Welcome', 'to', 'the', 'best', 'Data', 'Science', 'course', 'ever!!!!']

<b>In class exercise:</b> In one line of code, create a list containing all words in the string s after making them lower case and eliminating the punctuation (?,.:;!) at the end. <i>Hint</i>: use list comprehension.

In [129]:
[w.lower() for w in s.rstrip('!?,. ').split()]

['welcome', 'to', 'the', 'best', 'data', 'science', 'course', 'ever']

In [130]:
list(w.lower() for w in s.rstrip('!?,. ').split())

['welcome', 'to', 'the', 'best', 'data', 'science', 'course', 'ever']

In [131]:
s.lower().rstrip('!?,.').split()

['welcome', 'to', 'the', 'best', 'data', 'science', 'course', 'ever']
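The solutions above strip punctuation only at the end of the whole string, which is enough for s. If punctuation may follow any word, one possible variant strips each word separately. The sentence `s2` is a made-up example, not part of the notebook.

```python
# Made-up sentence with punctuation after several words
s2 = 'Hello, world: is this the best course? Yes; the best!'

# Split first, then strip punctuation from both ends of each word
words = [w.strip('?,.:;!').lower() for w in s2.split()]

print(words)
```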

## File I/O

Files can be opened in these modes:
<ul>
<li><b>w</b>: write (overwrite the file; create the file if it does not exist)</li>
<li><b>a</b>: append (do not overwrite the file)</li>
<li><b>r</b>: read</li>
</ul>

Let us now create a file with two lines

In [132]:
f = open('myfile.txt', 'w')

In [133]:
f.write('This is the first line\n')
f.write('This is the 2nd line\n')

23

21

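Files should be closed once we are done with them: closing flushes buffered writes to disk and releases the file handle, which may otherwise stay locked.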

In [134]:
f.close()

Let us now read from that file

In [135]:
f = open('myfile.txt', 'r')

In [136]:
for line in f:
    print (line)

This is the first line

This is the 2nd line



In [137]:
f.close()
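A common alternative is the `with` statement, which closes the file automatically when the block ends, even if an error occurs. The sketch below writes to a temporary folder (an assumption made here to avoid overwriting real files) and strips the trailing newline from each line, which avoids the blank lines seen in the output above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'myfile.txt')

    # 'w' creates (or overwrites) the file; it is closed at the end of the block
    with open(path, 'w') as f:
        f.write('This is the first line\n')
        f.write('This is the 2nd line\n')

    # Each line read from a file keeps its trailing newline, so strip it
    with open(path, 'r') as f:
        lines = [line.rstrip('\n') for line in f]

    # 'a' appends to the end without overwriting
    with open(path, 'a') as f:
        f.write('This is an appended line\n')

    with open(path, 'r') as f:
        line_count = len(f.readlines())

print(lines)
print(line_count, f.closed)
```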