# Module 1: Introduction to Python

Python is a high-level programming language with extensive libraries available to perform various data analysis tasks. 

The following tutorial contains examples of using various data types, functions, and library modules available in the standard Python library. Read the step-by-step instructions below carefully. To execute the code, click on each cell below and press the `SHIFT-ENTER` keys simultaneously.

<!-- 
This tutorial consists of the following:
1. Assignment
2. Print
3. Comment
...
 -->

In this tutorial, we will go over the basics of Python:
1. Data Types
2. Operators
3. Control Flow
4. Functions
5. File I/O

## A Code Sample
Readability: Python code is readable enough to be understood at a glance.

In [1]:
name = 'Eunji Kim'     # Your name
x = 34 - 23            # A comment.
y = "Hello"            # Another one.
z, w = 3.45, -1        # Assign two variables at once
if z == 1.23 or y == "Hello":
    x = x + 1
    y = y + ", " + name + '!'   # String concat.
print('x = {}'.format(x))
print(y)

x = 12
Hello, Eunji Kim!


In the example above, we can find some characteristics of Python syntax.

- Indentation matters to code meaning
    - Block structure indicated by indentation
- `#` for comments
    - Start comments with `#`, rest of line is ignored.
- First assignment to a variable creates it
    - Variable types don’t need to be declared.
    - Python figures out the variable types on its own. 
- Assignment is `=` and comparison is `==`
    - Multiple variables can be assigned at once
- For numeric operations `+` `-` `*` `/` `%` are as expected
    - Special use of `+` for string concatenation
- Logical operators are words (`and`, `or`, `not`) not symbols
- The basic printing command is `print`
     - `{}` for formatting

---

## 1.1 Elementary Data Types

Every value in Python has a datatype. 

There are various data types in Python. Some of the important types are listed below.

| &nbsp;  |  Data Type | Class | Example | Mutable | Iterable |
|:------:|:-------|:----------:|:------------|:-------:|:---:|
|Atomic   | Integer | `int`    | x = 4       | &nbsp;  | &nbsp;  |
| &nbsp;  | Float | `float`    | x = 3.142   | &nbsp;  | &nbsp;  |
| &nbsp;  | Boolean | `bool`   | x = True    | &nbsp;  | &nbsp;  |
| Sequence    | String | `str`     | x = "this" or x = 'this' | &nbsp; | O |
| &nbsp;  | List | `list`  | x = [1, 3.3, 'python'] | O | O |
| &nbsp;  | Tuple | `tuple`    | x = (6, 'program', False) | &nbsp;  | O |
| Collection  | Set | `set`        | x = {7,1,3,6,9} | O | O |
| &nbsp;  | Dictionary | `dict` | x = {'a': 1, 'b': 2, 'c': 3} | O | O |

> #### Mutable?
> Mutable types have methods for in-place modification, but immutable types do not.
> - list (`list.append()`)
> - set (`set.pop()`)
> - dictionary (`dict.pop()`)
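
As a small illustration of the note above, the sketch below modifies a list in place and then "modifies" a tuple, which in fact builds a new object. The variable names are made up for this example.

```python
# In-place modification keeps the same object; an immutable value must be rebuilt.
nums_list = [1, 2, 3]
list_id_before = id(nums_list)
nums_list.append(4)                  # modifies the existing list
same_list = id(nums_list) == list_id_before

nums_tuple = (1, 2, 3)
original_tuple = nums_tuple
nums_tuple = nums_tuple + (4,)       # builds a brand-new tuple
same_tuple = nums_tuple is original_tuple

print(nums_list, same_list)          # [1, 2, 3, 4] True
print(nums_tuple, same_tuple)        # (1, 2, 3, 4) False
```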

In [2]:
"""Atomic types"""
a = 6    # integer
print(a, "is of type", type(a))
a = 3.0  # float
print(a, "is of type", type(a))
a = True  # boolean, True or False
print(a, "is of type", type(a))

6 is of type <class 'int'>
3.0 is of type <class 'float'>
True is of type <class 'bool'>


In [3]:
"""Sequence types"""
mystr = "this is coffee" # string
print(mystr, "is of type", type(mystr))
mylist = [6, 99, 42, 'Apple'] # list
print(mylist, "is of type", type(mylist))
mytuple = (6, 'program', False) # tuple
print(mytuple, "is of type", type(mytuple))

this is coffee is of type <class 'str'>
[6, 99, 42, 'Apple'] is of type <class 'list'>
(6, 'program', False) is of type <class 'tuple'>


In [4]:
"""Collection types"""
myset = {7, 1, 3, 6, 9} # set
print(myset, "is of type", type(myset))
mydict = {'a': 1, 'b': 2, 'c': 3} # dictionary
print(mydict, "is of type", type(mydict))

{1, 3, 6, 7, 9} is of type <class 'set'>
{'a': 1, 'b': 2, 'c': 3} is of type <class 'dict'>


### List and Tuple
Lists and tuples are the most commonly used data types for holding multiple items in sequence.
- A list is enclosed in brackets `[]` and a tuple in parentheses `()`.
- Lists are mutable, while tuples are immutable.
    - This means a tuple cannot be changed after it is created, while a list can be modified.

Indexing of list and tuple
- Index starts with `0`. 
- Items can be accessed by indexing with brackets `[]`.

![](./figs/prime.png)

In [5]:
# create a list of primes
primes_list = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(primes_list)
# you can access items in list
print(primes_list[3])
print(primes_list[-3]) # reverse indexing
print(primes_list[7:9])  # slicing
print(primes_list[1:-3]) # slicing
print(primes_list[:5])   # slicing

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
7
23
[19, 23]
[3, 5, 7, 11, 13, 17, 19]
[2, 3, 5, 7, 11]
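
Slices also accept an optional third value, the step. The following sketch shows a few common uses; the shorter name `primes` is used here only for illustration.

```python
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]

every_second = primes[::2]     # start to end, taking every 2nd item
stepped = primes[1:8:3]        # indices 1, 4 and 7
reversed_copy = primes[::-1]   # a negative step walks backwards

print(every_second)            # [2, 5, 11, 17, 23, 31]
print(stepped)                 # [3, 11, 19]
print(reversed_copy[:3])       # [31, 29, 23]
```

Slicing always returns a new list, so `primes` itself is left unchanged.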


In [6]:
# create a tuple of primes
primes_tuple_manually = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31)
print(primes_tuple_manually)
# or, you can build it with the `tuple()` constructor
primes_tuple = tuple(primes_list)
print(primes_tuple)

# you can access items in tuple in the same way
print(primes_tuple[3])
print(primes_tuple[-3]) # reverse indexing
print(primes_tuple[7:9])  # slicing
print(primes_tuple[1:-3]) # slicing
print(primes_tuple[:5])   # slicing

(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31)
(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31)
7
23
(19, 23)
(3, 5, 7, 11, 13, 17, 19)
(2, 3, 5, 7, 11)


In [7]:
# add new elements to list
primes_list.append(36) # add single item
print(primes_list)

primes_list.extend([41, 43]) # add multiple items
print(primes_list)

primes_list = primes_list + [47, 51, 53] # add multiple items (builds a new list)
# primes_list += [47, 51, 53] # same result, but extends the list in place
print(primes_list)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 36]
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 36, 41, 43]
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 36, 41, 43, 47, 51, 53]


In [8]:
# add new elements to tuple
primes_tuple = primes_tuple + (37, 41, 43)
# primes_tuple += (37, 41, 43) # the identical expression
print(primes_tuple)

(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)


> Note: tuples have no methods like `append` and `extend`

In [9]:
# lists are mutable
primes_list[-6] = 37   # replace 36 (sixth from the end) with 37
print(primes_list)

# tuples are immutable
primes_tuple[-3] = 37  # raises TypeError
print(primes_tuple)    # never reached

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 51, 53]


TypeError: 'tuple' object does not support item assignment

In [10]:
# tuples are immutable, so they have no `append` method
primes_tuple.append(37)

AttributeError: 'tuple' object has no attribute 'append'

### String

![](./figs/sample.png)
- A sequence of characters
- Just like a list and tuple, the slicing operator **`[ ]`** can be used with strings. 
- Strings, however, are **immutable** (characters cannot be modified).

In [12]:
s = '''Apple'''
print(s)
s = """Apple"""
print(s)
s = 'Apple'
print(s)
s = "Apple"
print(s)
s = Apple   # a string cannot be written without quotes ('', " ", """ """, ''' ''')
print(s)

Apple
Apple
Apple
Apple


NameError: name 'Apple' is not defined

In [13]:
s = "This is a string"  # s is my variable
print(s)
print() # empty line
s = '''A multiline
string'''
print(s)

This is a string

A multiline
string


In [14]:
# escape character code using backslash
tab = 'a\tb'
print(tab)
enter = 'a\nb'
print(enter)
backslash = 'a\\b'
print(backslash)

a	b
a
b
a\b


In [15]:
# Concatenation
head = 'Python'
tail = ' is fun!'
print(head+tail)

Python is fun!


In [16]:
s = 'Hello world!' # 12 characters in total; indices run from 0 to 11

print("s[4] = ", s[4]) # s[4] = 'o'
print("s[6:11] = ", s[6:11]) # s[6:11] = 'world'; the slice 6:11 covers indices 6 to 10

s[4] =  o
s[6:11] =  world


In [17]:
# Similar to a tuple, a string is immutable
s = "I'm not mutable"
s[1:7] = " am"

TypeError: 'str' object does not support item assignment

String formatting

In [18]:
adj = "Red"
noun = "Alert"

cheese = "%s %s" % (adj, noun) # printf-style formatting (older style, still supported)
print(cheese)
cheese = "{} {}".format(adj, noun) # str.format(), introduced by PEP 3101
print(cheese)
cheese = "{0} {1} {1} {0}".format(adj, noun) # Numbers can also be reused
print(cheese)
cheese = "{adj} {noun}".format(adj=adj, noun=noun) # using keyword arguments
print(cheese)

Red Alert
Red Alert
Red Alert Alert Red
Red Alert
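
Since Python 3.6, formatted string literals (f-strings) offer a third way to format strings. The sketch below reuses `adj` and `noun` from the cell above; `price` is a made-up value added only to show a format specifier.

```python
adj = "Red"
noun = "Alert"
price = 3.14159                  # hypothetical value, for illustration only

message = f"{adj} {noun}"        # expressions go directly inside the braces
rounded = f"{price:.2f}"         # format spec: two decimal places
padded = f"{noun:>8}|"           # right-align in a field of width 8

print(message)                   # Red Alert
print(rounded)                   # 3.14
print(padded)                    #    Alert|
```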


### Set
Set is an **unordered collection** of **unique** items.
- Sets have unique values. They eliminate duplicates.
- Since a set is an unordered collection, indexing has no meaning. Hence, indexing and slicing with **`[]`** do not work.

In [19]:
a = {1,2,2,3,3,3} # six values are written, but duplicates are dropped
print(a) # only the three unique values remain

{1, 2, 3}


In [20]:
a[1]  # Index [1] means 2nd element

TypeError: 'set' object is not subscriptable

We can add new elements using the `add` method.

In [21]:
a.add(4)
print(a)

{1, 2, 3, 4}


We can perform set operations like union, intersection on two sets. 

In [22]:
x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

z = x.union(y)
print(z)
u = x.intersection(y)
print(u)

{'microsoft', 'google', 'cherry', 'banana', 'apple'}
{'apple'}
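
The same operations can also be written with operators. The sketch below reuses `x` and `y` from the cell above and adds difference and symmetric difference; the results are sorted before printing because a set has no guaranteed order.

```python
x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

union = x | y              # same as x.union(y)
common = x & y             # same as x.intersection(y)
only_in_x = x - y          # same as x.difference(y)
in_exactly_one = x ^ y     # same as x.symmetric_difference(y)

print(sorted(common))          # ['apple']
print(sorted(only_in_x))       # ['banana', 'cherry']
print(sorted(in_exactly_one))  # ['banana', 'cherry', 'google', 'microsoft']
```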


### Dictionary
![](./figs/dict.png)

**Dictionary** is a **collection** of **key-value pairs**. It has no positional index, although since Python 3.7 it preserves the order in which keys were inserted.
- Dictionaries are optimized for retrieving data. 
- We must know the key to retrieve the value.
- Duplicate keys cannot exist.

In [23]:
d = {'a': 'alpha', 'o': 'omega', 'g': 'gamma'}  # 'a' is a key and 'alpha' is its value
print("d['a'] = ", d['a']) # look up the value by its key

d['a'] =  alpha
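
Beyond lookup, a few other dictionary operations come up constantly. The following is a minimal sketch using the same `d`; the extra key `'b'` and the default `'n/a'` are invented for the example.

```python
d = {'a': 'alpha', 'o': 'omega', 'g': 'gamma'}

d['b'] = 'beta'                # add a new key-value pair
d['a'] = 'ALPHA'               # overwrite the value stored under an existing key
missing = d.get('z', 'n/a')    # .get() returns a default instead of raising KeyError
removed = d.pop('o')           # remove a key and return its value

for key, value in d.items():   # iterate over key-value pairs
    print(key, '->', value)
print(missing, removed)        # n/a omega
```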


> Set elements and dictionary keys must be hashable, so mutable types such as lists cannot be used.

In [24]:
x = set()  #initialization
x.add(3)   #int is immutable
print(x)
x.add([3]) #list is mutable
print(x)

{3}


TypeError: unhashable type: 'list'

> Iterable data types can be converted into list
> - iterable types: string, list, tuple, set, dictionary

> Using `len()`, we can get the length of iterable.

In [25]:
list("hello")

['h', 'e', 'l', 'l', 'o']

In [26]:
list(myset)

[1, 3, 6, 7, 9]

In [27]:
list(mydict) # only keys, not values

['a', 'b', 'c']

In [28]:
len('this is a string')

16

In [29]:
len({1, 2, 3})

3

---

## 1.2 Operators

Python can be used like a calculator. Simply type in expressions to get them evaluated.

**What are operators in python?**

Operators are special **symbols** in Python that carry out **arithmetic** or **logical computation**. The value that the operator operates on is called the **operand**.

For example:

```python
>>>6+3
9
```

Here, **`+`** is the operator that performs addition. **`6`** and **`3`** are the operands and **`9`** is the output of the **operation**.

### 1.2.1 Arithmetic Operators

Arithmetic operators are used to perform **mathematical operations** like **addition**, **subtraction**, **multiplication** etc.

| Symbol | Task Performed | Meaning | Example | 
|:------:|:---------------| :------: |:--------:|
| **`+`**      | Addition | add two operands or unary plus | **x + y** or **+2** | 
| **`-`**      | Subtraction | subtract right operand from the left or unary minus | **x - y** or **-2** | 
| **`*`**      | Multiplication | multiply two operands | **x \* y** |
| **`/`**      | Division | divide left operand by the right one (always results in a float) | **x / y** | 
| **`//`**     | Integer/Floor division | division rounded down to the nearest whole number (toward the left on the number line) | **x // y** | 
| **`%`**      | Modulus (remainder) | remainder of the division of left operand by the right | **x % y** (remainder of **x/y**) | 
| <b>`**`</b>     | Exponentiation (power) | left operand raised to the power of right | **x \*\* y** (**x** to the power **y**) |

In [30]:
x = 16
y = 3

print('x + y =', x+y)
print('x - y =', x-y)
print('x * y =', x*y)
print('x / y =', x/y)
print('x // y =', x//y)
print('x % y =', x%y)
print('x ** y =', x**y)

x + y = 19
x - y = 13
x * y = 48
x / y = 5.333333333333333
x // y = 5
x % y = 1
x ** y = 4096
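
The "toward the left on the number line" behaviour of `//` only becomes visible with negative operands. The sketch below uses values chosen for illustration and also shows the built-in `divmod`, which returns the quotient and the remainder together.

```python
x, y = -7, 2

quotient = x // y        # floor division rounds toward negative infinity
remainder = x % y        # the remainder takes the sign of the divisor
both = divmod(x, y)      # (quotient, remainder) in one call

print(quotient, remainder, both)        # -4 1 (-4, 1)
print(quotient * y + remainder == x)    # True
```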


### 1.2.2 Comparison/Relational operators

Comparison operators are used to **compare values**. They return either **True** or **False** according to the **condition**.

| Symbol | Task Performed | Meaning | Example | 
|:----:| :--- |:--- |:---: |
| **`>`** | greater than | True if left operand is greater than the right | **x > y** | 
| **`<`** | less than | True if left operand is less than the right | **x < y** | 
| **`==`** | equal to | True if both operands are equal | **x == y** | 
| **`!=`**  | not equal to | True if both operands are not equal | **x != y** | 
| **`>=`**  | greater than or equal to | True if left operand is greater than or equal to the right | **x >= y** | 
| **`<=`**  | less than or equal to | True if left operand is less than or equal to the right | **x <= y** | 

Note the difference between **`==`** (equality test) and **`=`** (assignment)

In [31]:
x = 16
y = 3

print('x > y', x>y)
print('x < y', x<y)
print('x == y', x==y)
print('x != y', x!=y)
print('x >= y', x>=y)
print('x <= y', x<=y)

x > y True
x < y False
x == y False
x != y True
x >= y True
x <= y False


Comparisons can also be chained in the mathematically obvious way. The following works as expected in Python (unlike in C/C++, where the same expression is evaluated left to right as `(0.5 < z) <= 1`):

In [32]:
z = 3
0.5 < z <= 1

False
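
To make the chaining rule concrete, the sketch below compares the chained form with its expanded equivalent and with the left-to-right grouping a C-like language would apply. The variable names are illustrative.

```python
z = 3

chained = 0.5 < z <= 1
expanded = (0.5 < z) and (z <= 1)   # what the chain means in Python
c_style = (0.5 < z) <= 1            # left-to-right grouping: True <= 1

print(chained, expanded, c_style)   # False False True
```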

### 1.2.3 Logical/Boolean operators

Logical operators are the **`and`**, **`or`**, **`not`** operators.

| Symbol | Meaning | Example | 
|:----:| :---: |:---:|
| **`and`** |  True if both the operands are true | **x and y** | 
| **`or`** |  True if either operand is true | **x or y** | 
| **`not`** |  True if the operand is false (complements the operand) | **not x** | 


#### Example : Logical operators in Python

In [33]:
x = True  # e.g., (7 > 3)
y = False # e.g., (7 < 3)

print('x and y is', x and y) # False
print('x or y is', x or y) # True
print('not x is', not x) # False
print('not not x is', not not x) # True

x and y is False
x or y is True
not x is False
not not x is True


In [34]:
True and (not(not False)) or (True and (not True))  # What will be output?

# True and (not(True)) or (True and (False))
# True and False or (False)
# False or False
# False

False

### 1.2.4 Assignment operators

Assignment operators are used in Python to **assign values** to **variables**.

- **`a = 5`** is a simple assignment operator that assigns the value 5 on the right to the variable **`a`** on the left.

- There are various compound operators in Python, like **`a += 5`**, which adds to the variable and then assigns the result back to it. It is equivalent to **`a = a + 5`**.

| Symbol | Example | Equivalent to | 
|:---:|:---:|:---:|
| **`=`** | **x = 5** | **x = 5** | 
| **`+=`** | **x += 5** | **x = x + 5** | 
| **`-=`** | **x -= 5** | **x = x - 5** | 
| **`*=`** | **x \*= 5** | **x = x \* 5** | 
| **`/=`** | **x /= 5** | **x = x / 5** | 
| **`%=`** | **x %= 5** | **x = x % 5** | 
| **`//=`** | **x //= 5** | **x = x // 5** | 
| <b>`**=`</b> | **x \*\*= 5** | **x = x \*\* 5** | 

The binary operators can be combined with assignment to modify a variable value. For example:

In [35]:
x = 1
x += 2 # add 2 to x
print("x is", x)
x **= 2 # x := x^2
print('x is', x)

x is 3
x is 9


### 1.2.5 Special operators

Python language offers some special types of operators like the identity operator or the membership operator. They are described below with examples.

#### 1.2.5.1. Identity operators

**`is`** and **`is not`** are the identity operators in Python. They are used to check whether two values (or variables) refer to the same object in **memory**. Two variables being equal does not imply that they are **identical**.

| Symbol | Meaning | Example | 
|:---:| :---: |:---:|
| **`is`** |  True if the operands are identical (refer to the same object) | **x is True**  | 
| **`is not`** |  True if the operands are not identical (do not refer to the same object)  | **x is not True** | 

In [36]:
x1 = 6
y1 = 6
x2 = 'Hello'
y2 = 'Hello'
x3 = [1,2,3] # list
y3 = [1,2,3] # list

# Output: False
print(x1 is not y1)

# Output: True
print(x2 is y2)

# Output: False, because the two lists are equal but are separate objects
print(x3 is y3)

False
True
False


**Explanation:**

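Here, we see that **`x1`** and **`y1`** are integers with the same value, so they are equal. They are also identical here because CPython reuses a single object for small integers, and likewise for short string literals such as **`x2`** and **`y2`**. This is an implementation detail, so use **`==`** rather than **`is`** to compare values.

But **`x3`** and **`y3`** are lists. They are equal but not identical, because the interpreter stores them **separately in memory** even though their contents are equal.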

> check memory address using `id()`.
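
The difference between equality and identity can be checked directly. In this sketch `z3` is an extra name introduced for illustration; it refers to the same list as `x3`, whereas `y3` is a separate list with equal contents.

```python
x3 = [1, 2, 3]
y3 = [1, 2, 3]      # a separate list with the same contents
z3 = x3             # a second name for the very same list

print(x3 == y3, x3 is y3)                   # True False
print(x3 == z3, x3 is z3)                   # True True
print(id(x3) == id(z3), id(x3) == id(y3))   # True False

z3.append(4)        # visible through x3 too, since both names share one object
print(x3)           # [1, 2, 3, 4]
```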

#### 1.2.5.2. Membership operators

**`in`** and **`not in`** are the membership operators in Python. They are used to test whether a value or variable is found in a **container** (**string**, **list**, **tuple**, **set** or **dictionary**).

In a dictionary we can only test for presence of **key, not the value**.

| Symbol | Meaning | Example | 
|:---:| :---: |:---:|
| **`in`** |  True if value/variable is found in sequence | **5 in x**  | 
| **`not in`** |  True if value/variable is not found in sequence | **5 not in x**  | 


In [37]:
x = 'Hello world'
y = {1:'a', 2:'b'} # dictionary: 1 is a key and 'a' is its value

print('H' in x)  # Is 'H' in 'Hello world'?
print('hello' not in x)  # Is 'hello' absent from 'Hello world'?
print(1 in y) # Is 1 among the keys of y?
print('a' in y) # False: 'a' is a value, and `in` checks only the keys

True
True
True
False


**Explanation:**

Here, **'`H`'** is in **`x`** but **'`hello`'** is not present in **`x`** (remember, Python is case sensitive). Similarly, **`1`** is a key and **'`a`'** is a value in dictionary **`y`**. Hence, **`'a' in y`** returns **`False`**.
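
If the goal is to test for a value rather than a key, the dictionary views can be searched explicitly. A brief sketch using the same `y`:

```python
y = {1: 'a', 2: 'b'}

key_hit = 'a' in y                  # keys only
value_hit = 'a' in y.values()       # search the values explicitly
pair_hit = (1, 'a') in y.items()    # test for a specific key-value pair

print(key_hit, value_hit, pair_hit)   # False True True
```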

## 1.3 Control Flow Statements

Similar to other programming languages, the control flow statements in Python include `if`, `for`, and `while` statements. Examples of how to use these statements are shown below. 

### 1.3.1 Conditional Statements
`if-else`, `if-elif-else`

In [38]:
# using if-else statement
# check the properties of a given number

x = 10

if x % 2 == 0:
    print("x =", x, "is even")
else:
    print("x =", x, "is odd")

if x > 0:
    print("x =", x, "is positive")
elif x < 0:
    print("x =", x, "is negative")
else:
    print("x =", x, "is neither positive nor negative")

x = 10 is even
x = 10 is positive


### 1.3.2 Loops
Loops are a way to repeat the same code multiple times.
- `for`: do something for each item of an iterable
- `while condition:` repeat something while the `condition` is truthy

In [39]:
# using for loop
for i in range(10):
    if i % 2 == 0: # even
        print(i)

0
2
4
6
8


In [40]:
# using for loop with a list
mylist = ['this', 'is', 'a', 'list']
for word in mylist:
    print(word, '==>', word.replace("is", "at"))

mylist2 = [len(word) for word in mylist]  # list comprehension
print(mylist2)

this ==> that
is ==> at
a ==> a
list ==> latt
[4, 2, 1, 4]


In [41]:
# using for loop with list of tuples
states = [('MI', 'Michigan', 'Lansing'),
          ('CA', 'California', 'Sacramento'),
          ('TX', 'Texas', 'Austin')]

sorted_capitals = [state[2] for state in states]
sorted_capitals.sort()
print(sorted_capitals)

['Austin', 'Lansing', 'Sacramento']


In [42]:
# using for loop with dictionary
fruits = {'apples': 3, 'oranges': 4, 'bananas': 2, 'cherries': 10}
fruitnames = [k for (k,v) in fruits.items()]
print(fruitnames)

['apples', 'oranges', 'bananas', 'cherries']


In [43]:
# using while loop
mylist = list(range(-10, 10))
print(mylist)

i = 0
while (mylist[i] < 0):
    i = i + 1
    
print("First non-negative number:", mylist[i])

[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
First non-negative number: 0
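
The loop above relies on the list containing at least one non-negative number; otherwise the index would run past the end and raise an `IndexError`. One way to guard against that is sketched below. The function name `first_non_negative` is hypothetical.

```python
def first_non_negative(values):
    """Return the first non-negative item, or None if there is none."""
    i = 0
    while i < len(values) and values[i] < 0:   # check the bound before indexing
        i += 1
    return values[i] if i < len(values) else None

print(first_non_negative(list(range(-10, 10))))   # 0
print(first_non_negative([-3, -2, -1]))           # None
```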


## 1.4 User-Defined Functions

You can create your own functions in Python, which can be named or unnamed. Unnamed functions are defined using the `lambda` keyword and are often passed as the `key` argument when sorting, for example to sort a list of tuples.
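
As a minimal sketch of an unnamed function, the example below sorts the `states` list from Section 1.3.2 by capital using a `lambda`, and then repeats the sort with an equivalent named function. The name `capital_of` is made up for the comparison.

```python
states = [('MI', 'Michigan', 'Lansing'),
          ('CA', 'California', 'Sacramento'),
          ('TX', 'Texas', 'Austin')]

# sort the tuples by capital (the third field) using an unnamed function
by_capital = sorted(states, key=lambda state: state[2])
print([state[0] for state in by_capital])   # ['TX', 'MI', 'CA']

# the same key written as a named function
def capital_of(state):
    return state[2]

print(sorted(states, key=capital_of) == by_capital)   # True
```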

In [44]:
def get_primes(n):
    """return a list of primes less than or equal to n"""
    from math import sqrt # import sqrt function from math library
    primes = [] # Initialization
    # for loop to find primes less or equal to `n`
    for k in range(2, n+1):
        remainder_list = [k%d > 0 for d in primes if d <= sqrt(k)]
        if not remainder_list or all(remainder_list):
            primes.append(k)
    return primes

primes = get_primes(31)
print(primes)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]


In [45]:
def get_properties(x):
    """print the properties of a given number x"""
    if x % 2 == 0:
        print("x =", x, "is even")
    else:
        print("x =", x, "is odd")

    if x > 0:
        print("x =", x, "is positive")
    elif x < 0:
        print("x =", x, "is negative")
    else:
        print("x =", x, "is neither positive nor negative")
        
get_properties(20)

x = 20 is even
x = 20 is positive


In [46]:
import math

def discard(inlist, sorting=False):    # default value of `sorting` is False 
    """discard missing values from a list"""
    outlist = []
    for item in inlist:
        if not math.isnan(item):
            outlist.append(item)
            
    if sorting:
        outlist.sort()
    return outlist

mylist = [12, math.nan, 23, -11, 45, math.nan, 71]

print(discard(mylist, sorting=True))

[-11, 12, 23, 45, 71]


## 1.5 File I/O

You can write data from a list or other objects to a file and read it back.

In [47]:
states = [('MI', 'Michigan', 'Lansing'),
          ('CA', 'California', 'Sacramento'),
          ('TX', 'Texas', 'Austin'), 
          ('MN', 'Minnesota', 'St Paul')]

with open('states.txt', 'w') as f:
    f.write('\n'.join('%s,%s,%s' % state for state in states))
    
with open('states.txt', 'r') as f:
    for line in f:
        line = line.strip() # remove `\n`
        fields = line.split(sep=',')    # split each line into its respective fields
        print('State= {} ({}) Capital: {}'.format(fields[1], fields[0], fields[2]))

State= Michigan (MI) Capital: Lansing
State= California (CA) Capital: Sacramento
State= Texas (TX) Capital: Austin
State= Minnesota (MN) Capital: St Paul
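
For comma-separated data, the standard library's `csv` module can take care of joining and splitting the fields. The sketch below is one possible variant of the cell above, not a replacement for it. It writes to a temporary directory so that no existing file is overwritten, and the file name `states.csv` is an assumption.

```python
import csv
import os
import tempfile

states = [('MI', 'Michigan', 'Lansing'),
          ('CA', 'California', 'Sacramento'),
          ('TX', 'Texas', 'Austin'),
          ('MN', 'Minnesota', 'St Paul')]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'states.csv')   # hypothetical file name

    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(states)         # one row per tuple

    with open(path, 'r', newline='') as f:
        loaded = [tuple(row) for row in csv.reader(f)]

print(loaded == states)   # True
```

Opening the file with `newline=''` is what the `csv` documentation recommends, so that line endings are handled by the module itself.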


### Advice

If you run into difficulties while coding, just google it.

![](./figs/googling.gif)

For debugging,
1. Read the error message carefully! You will often find a hint for debugging.
2. Google the error message. Among the more than 10 million Python users, someone has probably already encountered and fixed the same error.

