<div class="alert alert-block alert-info">
Singapore Management University<br>
CS105 Statistical Thinking for Data Science
</div>

# Lab 0: Introduction to Python

>#### Table of Contents
>
>- Python Basics
>    - Mathematical operations
>    - Logical operations
>    - Control flow
>    - List
>    - Dictionaries
>    - Function
>- The 'math' and 'random' modules
>

## 1 Python Basics

Python has gained popularity in recent years thanks to its expressiveness and ease of learning.  Moreover, the ecosystem around Python has grown rapidly: many open-source libraries have been developed and have matured over the years.  This is especially true for data science, machine learning and deep learning.  Python-based packages like NumPy, pandas, Matplotlib, scikit-learn and TensorFlow have become the de facto tools of the data science community.  In these labs, you will gradually learn these tools to complement the theory you learn in class.

Having a background in C programming will help you pick up Python quickly.  Here we would like to highlight a few differences: <br>
- **Indentation** matters in Python.  In C you structure your code using `{ }` and `;`, and whitespace doesn't matter.  In Python, indentation defines blocks of code.  Get the indentation wrong, and your code won't run.
- C is a compiled language while Python is **interpreted**.  This allows you to code, run, and iterate quickly to get results.
- C is statically typed, while Python is **dynamically typed**.  In C, you have to declare the variable type before you can assign a value to it.  In Python you simply assign a value to a variable and the type is determined there and then.  You can re-assign a value of a different type, which changes the type of the variable at the same time.

We have two quick examples to illustrate the points above, before we dive into the details of Python. <br>

Here you see that your code won't run if the indentation is wrong.

In [1]:
print("hello python")
 print("hello world") # extra leading space (anything after # is comment)

IndentationError: unexpected indent (3267430682.py, line 2)

And here is an example which shows that the type of `a` can change dynamically.

In [3]:
a = 100
print("a is", type(a))
a = "one hundred"
print("a is now", type(a))

a is <class 'int'>
a is now <class 'str'>


Here is the official Python 3.10 documentation reference: https://docs.python.org/3.10/

### 1.1 Mathematical operations

Let's now dive into Python proper.  Because Python is interpreted, you can run a few lines of code and see the results immediately.  Often you can use it like a calculator.  In Python, mathematical operations are straightforward.  You can do addition, subtraction and multiplication in a way similar to the C language.

In [4]:
a = 17
b = 5
a, b

(17, 5)

In [5]:
a + b

22

In [6]:
a - b

12

In [7]:
a * b

85

For floating-point division, use a single forward slash `/`.

In [8]:
a / b

3.4

For integer (floor) division, use a double forward slash `//`.  This divides `a` by `b` and rounds the result down to the nearest integer; for positive numbers this is simply the integer part of the quotient.

In [9]:
a // b

3

For modulus operation, you use `%`.  Here you will get the remainder after dividing `a` by `b`.

In [10]:
a % b

2

For exponentiation, note that in Python you use `**` and **not** `^`.  In Python, the operator `^` is reserved for XOR bitwise operation.

In [11]:
a ** b 

1419857
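
The operators above can be cross-checked with a short sketch.  It is only an illustration: the negative operand `-17` is our own choice (it is not part of the examples above) and is there to show that `//` rounds down rather than toward zero, and that `^` is not exponentiation.

```python
a, b = 17, 5

# divmod returns the floor quotient and the remainder in one call,
# i.e. the same pair as (a // b, a % b).
q, r = divmod(a, b)
print(q, r)                      # 3 2

# With a negative operand, // rounds down (toward negative infinity).
neg_q = -17 // 5
neg_r = -17 % 5
print(neg_q, neg_r)              # -4 3

# The quotient and remainder always satisfy: q * b + r == a
print(neg_q * 5 + neg_r == -17)  # True

# ^ is bitwise XOR, not exponentiation.
xor_result = a ^ b
print(xor_result)                # 20
```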

### 1.2 Logical operations

You can also assign boolean values to your variables.  Use the keywords `True` and `False`, and note the capitalization, as Python is case-sensitive.

In [12]:
a = True
b = False
c = not b
a, b, c

(True, False, True)

Here is a summary of comparison and logical operators in Python with some examples.

| Operation | Operator |
| --- | --- |
| equal | == |
| not-equal | != |
| and | and |
| or | or |
| not | not |

In [13]:
a, b, c = True, True, False
(a == b) and (b == c)

False

In [14]:
(a != b) or (b != c)

True
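
As a further illustration, logical operators are most often combined with numeric comparisons.  The sketch below uses a variable `x = 7` chosen purely for the example; the ordering comparisons `<` and `>` are standard Python operators.

```python
x = 7  # illustrative value

# Combine two comparisons with `and`.
in_range = (x > 0) and (x < 10)

# Python also allows comparisons to be chained, which reads like maths.
chained = 0 < x < 10

# `or` is True when at least one side is True.
either = (x < 0) or (x % 2 == 1)

# `not` flips a boolean value.
negated = not in_range

print(in_range, chained, either, negated)  # True True True False
```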

### 1.3 Control flow

`if-else` constructs in Python are intuitive.  The syntax points to watch out for are the following:
- the colon `:` at the end of the condition statement
- indentation - this is critical, because indentation in Python is what separates blocks of code

In [15]:
x = 10
if x > 0:
    print("positive")
elif x < 0:
    print("negative")
else:
    print("zero")

positive


For a simple `if` condition you can also write it succinctly as a one-liner.

In [16]:
x = 17
even = True if x % 2 == 0 else False
even

False
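
A small sketch of two related forms.  When the two outcomes are just `True` and `False`, the comparison already produces the boolean on its own; the one-liner is more useful when choosing between two other values.  The variable name `parity` is introduced here only for illustration.

```python
x = 17

# The comparison itself evaluates to a boolean.
even = (x % 2 == 0)

# The one-line conditional expression is handy for picking between two values.
parity = "even" if x % 2 == 0 else "odd"

print(even, parity)  # False odd
```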

Python provides a `range` object, which is a compact representation of a sequence of evenly spaced integers.  Say you want the first $5$ whole numbers.  You can define the range like this.

In [17]:
rng = range(5)

Let's unwrap this to see what this range represents.

In [18]:
for x in range(5):
    print(x)

0
1
2
3
4


We can also specify a step in order to skip elements.  For example, if we want to keep one, skip two, keep one, skip two, and so on, we can do the following.  You pass the `start` number, the `stop` number and the `step`, but note that `stop` is excluded - in the example below, 10 is excluded.

In [19]:
for i in range(1, 10, 3):
    print(i)

1
4
7
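
A quick way to inspect a range without a loop is to convert it to a list.  The sketch below repeats the two ranges above and adds one with a negative step, which is our own extra example and counts downwards.

```python
first_five = list(range(5))
every_third = list(range(1, 10, 3))
countdown = list(range(10, 0, -2))   # negative step: count down, 0 is excluded

print(first_five)    # [0, 1, 2, 3, 4]
print(every_third)   # [1, 4, 7]
print(countdown)     # [10, 8, 6, 4, 2]
```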


The `while` construct should look familiar as well.

In [20]:
c = 0
n = 1024
while n > 1:
    n /= 2
    c += 1
c

10
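
One detail worth noting about the loop above: because `/` is floating-point division, `n /= 2` turns `n` into a float.  A variant that keeps `n` an integer, sketched here as an alternative rather than a correction, uses floor division instead.

```python
c = 0
n = 1024
while n > 1:
    n //= 2   # floor division keeps n an int
    c += 1

print(c, n, type(n))  # 10 1 <class 'int'>
```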

### 1.4 List

One of the most fundamental data structures in Python is the `list`.  It is similar to an array in C, in the sense that it holds a collection of values and lets you quickly retrieve a particular value based on its index.  In C, you have to first declare the data type and specify the size, so that the compiler can allocate sufficient memory.  A Python `list`, however, can hold multiple data types.  Moreover, it is dynamically sized, which means you can flexibly add or remove elements.

To create a list, use square brackets `[` `]`, separate the elements with commas `,` and assign it to a variable.  Often you might not know yet what elements are available.  In that case, you can declare an empty list first.

In [21]:
a = [1, 2, 3, 4]
a

[1, 2, 3, 4]

In [22]:
b = []  ## empty list
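
To make the points about indexing and mixed data types concrete, here is a minimal sketch.  The list `mixed` and its contents are made up for illustration.  Indices start at 0, as in C; negative indices and slices are standard Python features.

```python
mixed = [1, "two", 3.0, True]   # a list can hold different data types

first = mixed[0]       # indices start at 0
last = mixed[-1]       # negative indices count from the back
middle = mixed[1:3]    # a slice: index 1 up to, but excluding, index 3

mixed[1] = 2           # elements can be replaced in place

print(first, last, middle)   # 1 True ['two', 3.0]
print(mixed)                 # [1, 2, 3.0, True]
```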

If you have two lists and you want to 'combine' them into one single list, you can use the `+` operator.  This is similar to the way `+` is used to concatenate two strings.  Note that the order of the elements is maintained.

In [23]:
a = [1, 3, 5, 7]
b = [2, 4, 6]
c = a + b
print(c)

[1, 3, 5, 7, 2, 4, 6]


You can also just `append` an element to a list.  This puts a new element at the back of the list, which is useful when you are dynamically growing a list.

In [24]:
c.append(24)
c

[1, 3, 5, 7, 2, 4, 6, 24]

Many common and useful built-in functions in Python can be applied across different data objects.  For example, you can use `len` to determine the length of your list, i.e. the number of elements.

In [25]:
n = len(a)
n

4

Membership checking is also provided by Python.  You can use the keyword `in` to check whether an element is found in a given list.  The result is a boolean value.

In [26]:
A = [2, 4, 6, 8]
a = 2
a_in_A = a in A
a_in_A

True

Suppose we have a list of numbers, and we want to square each number and put the results into a new list.  One way to do this is as follows, where we go through each element, square it and append it to the new list.

In [27]:
x = [1, 2, 3, 4, 5]
y = []
for i in range(len(x)):
    y.append(x[i]**2)
print(y)

[1, 4, 9, 16, 25]


The above approach is similar to how one would do it in C.  A list comprehension, however, is a more 'pythonic' way of achieving the same effect, allowing one to elegantly express a new list based on an old list.  With a list comprehension, you can almost read the line as "square x for each x found in the list".

In [28]:
x_list = [1, 2, 3, 4, 5]
y_list = [x**2 for x in x_list]
y_list

[1, 4, 9, 16, 25]

You can further impose conditions to act as filters.  If you want to square only the odd numbers, you can express it as follows: "square x if x gives remainder 1 when divided by 2".

In [29]:
x_list = [1, 2, 3, 4, 5]
y = [x**2 for x in x_list if x%2==1]
y

[1, 9, 25]
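
For comparison, the filtered comprehension can be spelled out as an explicit loop.  The sketch below builds the list both ways and checks that they agree; the names `y_loop` and `y_comp` are ours.

```python
x_list = [1, 2, 3, 4, 5]

# Explicit loop: test the condition, then append.
y_loop = []
for x in x_list:
    if x % 2 == 1:
        y_loop.append(x**2)

# Equivalent list comprehension with a filter.
y_comp = [x**2 for x in x_list if x % 2 == 1]

print(y_loop)            # [1, 9, 25]
print(y_loop == y_comp)  # True
```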

### 1.5 Dictionaries

Consider a situation where you want to store the CS105 grades of students Andy, Becky and Charlie, who obtained A, B+ and A- respectively.  How can we go about doing that?  One way might be to create two lists - one to store the names of the students, one to store their grades.  To do that we need to be careful and ensure that the same index in both lists refers to the same student.

In [30]:
students = ["Andy", "Becky", "Charlie"]
grades = ["A", "B+", "A-"]

The above method is perfectly valid, but it is inconvenient if you want to query the grade for Becky.  To do that you first need to find the index of Becky in the `students` list and then use that index to retrieve the grade from the `grades` list.
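
The two-step lookup just described might look like the following sketch, which uses the built-in list method `index` to find the position of a name.

```python
students = ["Andy", "Becky", "Charlie"]
grades = ["A", "B+", "A-"]

# Step 1: find where Becky sits in the students list.
idx = students.index("Becky")

# Step 2: use the same index to look up her grade.
becky_grade = grades[idx]

print(idx, becky_grade)  # 1 B+
```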

A dictionary, on the other hand, allows you to create *key-value* pairs.  Using the example above, a dictionary lets you map each student's name (the key) to his or her grade (the value).  After the mapping, one can easily query for the value based on the key.

We create a dictionary using `{` `}`, and the syntax for each item is *key`:`value*.

In [31]:
grade_dict = {"Andy" : "A", "Becky" : "B+", "Charlie":"A-"}
grade_dict

{'Andy': 'A', 'Becky': 'B+', 'Charlie': 'A-'}

Once defined, we can retrieve Becky's grade

In [32]:
grade_dict["Becky"]

'B+'
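
Dictionaries can also be updated after creation.  The sketch below adds a new entry, overwrites an existing one, and looks up a key that may be missing.  The students "Dana" and "Eve" and the changed grade are hypothetical, added only for illustration.

```python
grade_dict = {"Andy": "A", "Becky": "B+", "Charlie": "A-"}

grade_dict["Dana"] = "B"     # hypothetical new student: adds a new key-value pair
grade_dict["Becky"] = "A-"   # hypothetical regrade: overwrites the existing value

# Indexing a missing key with [] raises KeyError; .get returns a default instead.
missing = grade_dict.get("Eve", "no record")

# `in` checks membership among the keys.
has_andy = "Andy" in grade_dict

print(grade_dict)
print(missing, has_andy)     # no record True
```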

We can get all the keys as follows.  Note that we get a `dict_keys` object, not a list.  Often we need to convert it to a list for further processing.

In [33]:
keys = grade_dict.keys()
print(type(keys))
list(keys)

<class 'dict_keys'>


['Andy', 'Becky', 'Charlie']

To get the values we can do the following

In [34]:
grade_dict.values()

dict_values(['A', 'B+', 'A-'])

One of the common operations on a dictionary is to iterate through each key-value pair, which we can do using the `.items()` method.

In [35]:
for key, value in grade_dict.items():
    print(f"{key} : {value}")

Andy : A
Becky : B+
Charlie : A-
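
If the data already lives in two parallel lists, as in the earlier `students` and `grades` example, one possible way to build the dictionary is to pair the lists up with the built-in `zip` and pass the pairs to `dict`.  This is a sketch of one option, not the only approach; the name `grade_dict_from_lists` is ours.

```python
students = ["Andy", "Becky", "Charlie"]
grades = ["A", "B+", "A-"]

# zip pairs up elements at the same index; dict turns the pairs into key-value items.
grade_dict_from_lists = dict(zip(students, grades))

for key, value in grade_dict_from_lists.items():
    print(f"{key} : {value}")
```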


### 1.6 Function

Python functions start with keyword `def`.  Here is a simple function to calculate factorial

In [36]:
def fact(n):
    result = 1  # use a separate name so the local variable does not shadow the function name
    for x in range(1, n + 1):
        result = result * x
    return result

fact(5)

120

Python functions allow you to return multiple values at once (in fact, the multiple values are stored in an object called a *tuple*).  We write a simple function here to provide summary statistics given a list of data points.

In [37]:
def summary(data):
    high = max(data)
    low = min(data)
    rng = high - low 
    avg = sum(data)/len(data)
    return high, low, rng, avg
    
data = [2,5,1,7,4]
summary(data)

(7, 1, 6, 3.8)
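
Since the returned values come back as a tuple, they can be unpacked straight into separate variables.  The sketch below repeats the `summary` function so that it runs on its own; the variable name `result` is ours.

```python
def summary(data):
    high = max(data)
    low = min(data)
    rng = high - low
    avg = sum(data) / len(data)
    return high, low, rng, avg

data = [2, 5, 1, 7, 4]

# Keep the result as a single tuple...
result = summary(data)
print(type(result))          # <class 'tuple'>

# ...or unpack it into one variable per returned value.
high, low, rng, avg = summary(data)
print(high, low, rng, avg)   # 7 1 6 3.8
```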

## 2 The 'math' and 'random' modules

Python also has a `math` library which is furnished with common functions and constants.  Before invoking the functions you need to first import the library as follows. <br>
Reference: https://docs.python.org/3.10/library/math.html

In [38]:
import math

Here are some common mathematical functions that are supported by `math`.  Do note the way the functions are called, with the module name as a prefix.

In [39]:
x = 1.2
k = 3.0
print(f"x = {x}")
print(f"k = {k}")
print("ke^x:", k * math.exp(x))
print("ln k:", math.log(k))

x = 1.2
k = 3.0
ke^x: 9.960350768209642
ln k: 1.0986122886681098


Here `math.log`, when called with a single argument, computes the natural logarithm.  In addition, `math` provides several constants.  In particular, `inf` can be quite useful, especially when you need to initialize a value before a search for a minimum or maximum.

In [40]:
print("𝛑 is", math.pi)
print("-Infinity is", -math.inf)

𝛑 is 3.141592653589793
-Infinity is -inf
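
As an example of using `inf` for initialization, here is a minimal sketch of a hand-written search for the smallest value in a list.  Starting from `math.inf` guarantees that the first data point is always smaller than the initial value.  The data list is made up for illustration, and in practice the built-in `min` does the same job.

```python
import math

data = [2, 5, 1, 7, 4]   # illustrative data

smallest = math.inf      # every real number is smaller than +infinity
for value in data:
    if value < smallest:
        smallest = value

print(smallest)          # 1
```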


Another useful module provided by Python is the `random` module, which allows us to generate random numbers.<br>
Reference: https://docs.python.org/3.10/library/random.html

In [41]:
import random

The most commonly used function is `random.random()`, which generates a floating-point number in the half-open interval from 0 (inclusive) to 1 (exclusive).

In [42]:
random.random()

0.3872103335242221

You can also generate a random integer, say between 0 and 10 (both inclusive), as follows.

In [43]:
print(random.randint(0, 10))

5


Using `random.choice`, you can also pass a list and pick a random item from it.

In [44]:
random.choice(["a", "b", "c"])

'c'
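
Because the numbers are random, your outputs will differ from the ones shown above.  When reproducible results are needed, one option is to fix the starting point of the generator with `random.seed`.  The sketch below uses an arbitrary seed value of 105 and shows that the same seed gives the same sequence.

```python
import random

random.seed(105)   # arbitrary seed, chosen only for illustration
first_run = [random.randint(0, 10) for _ in range(5)]

random.seed(105)   # resetting the same seed replays the same sequence
second_run = [random.randint(0, 10) for _ in range(5)]

print(first_run == second_run)   # True

# random.random() always falls in the interval [0, 1).
u = random.random()
print(0 <= u < 1)                # True
```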