# Python Crash Course I

This material was written by **Arman Seyed-Ahmadi**; the Jupyter notebooks were prepared by **Danie Benetton**.

This project was financially supported by the University of British Columbia, Department of Chemical and Biological Engineering (CHBE).

July 2020

**Preferred method:**

To use this notebook, make sure Anaconda is installed: [Link to Download](https://www.anaconda.com/products/individual)

This tutorial is best run in **JupyterLab.** [JupyterLab Introduction Video](https://www.youtube.com/watch?v=A5YyoCKxEOU). 

**Alternative method:**

UBC students and professors can also access this notebook through [Syzygy](https://ubc.syzygy.ca/). Log in with your CWL, upload this notebook to your files, and run the notebook.

## Topics covered in this crash course:
- The Python language and JupyterLab
- Basic math operations
- Variables and data types
- Basic output with formatted strings
- Lists and tuples, as well as indexing and slicing them
- The Numpy package and how to work with arrays

## Matplotlib settings

In [1]:
%matplotlib inline
import matplotlib as mpl
mpl.rcParams.update({'mathtext.fontset': 'cm'})
mpl.rcParams.update({'axes.labelsize': 22})
mpl.rcParams.update({'axes.titlesize': 16})
mpl.rcParams.update({'axes.linewidth': 0.5})
mpl.rcParams.update({'xtick.labelsize': 10})
mpl.rcParams.update({'ytick.labelsize': 10})
%config InlineBackend.figure_formats = ['svg']

## The Basics - Hello World!

The print function will print text to the output.
**To run the code**, select the cell below and either hit the Run button or use <kbd>Ctrl</kbd>+<kbd>Enter</kbd> (run cell) or <kbd>Shift</kbd>+<kbd>Enter</kbd> (run and advance):


In [2]:
print('Hello World!')

Hello World!


You can edit the text inside the single quotes to change the text output.
To get the new results in the output, you have to run the cell again.

## Aside about JupyterLab format

The Jupyter Notebook format uses containers called *cells* to hold content.

There are two main types:
- *Markdown* for text
- *Code* for executable script

This is a Markdown cell. To see its content and how formatting is applied, double click on the cell. It contains text that can be formatted to be:

# Header1
## Header2
...

**bold**

*italics*

> Blockquotes

The content can also include links, images, tables, and equations. To learn more, please visit this [link](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html).

*If you have any formatting questions, don't hesitate to search the web. The chances of finding your answer are close to 100%, if not more!*

In [3]:
# This is a Code cell
# Hash marks are used to indicate comments so the line will not be executed

Each Code cell has empty square brackets [  ]: to the left of the cell before it is first run.
After you run the code in a cell, the brackets will be filled with a number indicating the order in which cells have been run. This helps to keep track of order when running multiple cells.

## Numbers and Math

The *IPython console* is essentially an advanced calculator. 

Operations such as `+` (addition), `-` (subtraction), `/` (division), `*` (multiplication) and `**` (exponentiation) can be done right away in the console.

The order of operations is **PE(M&D)(A&S)** = *Parentheses, Exponentiation, Multiplication and Division, Addition and Subtraction.*

In [4]:
5+10

15

In [5]:
(1+1)**2

4
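
The order of operations described above can be checked with a short sketch. The numbers here are arbitrary and chosen only for illustration:

```python
# Exponentiation runs first, then multiplication, then addition
no_parentheses = 2 + 3 * 4 ** 2       # 4**2 = 16, then 3*16 = 48, then 2+48
# Parentheses are evaluated before everything else
with_parentheses = (2 + 3) * 4 ** 2   # (2+3) = 5, then 5*16
# Division with / gives a decimal result, even for whole numbers
division = 10 / 4

print(no_parentheses, with_parentheses, division)
```

When in doubt about the order, adding parentheses makes the intent explicit.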

The `%` sign gives you the remainder of the division of two numbers.
> e.g. `a % b` gives you the remainder of `a / b`.

We can also compare two values with `==`,`!=`,`<`,`>`,`<=`,`>=`. The result of such operations is either *True* or *False*.

In [6]:
# What is the remainder of 5/2?
# Run this cell to see the result! You should get 1!
5%2

1

In [7]:
# Is 7.1 greater than or equal to 6?
# Run this cell to see the result! You should get True!
7.1>=6

True

> Note: Python is case sensitive. When writing code, "true" is different than "True".

## Variables

You can assign values to variables using the assignment operator `=`


In [8]:
x = 2.5
print(x)

2.5


You can store many types of data in variables including *numbers, characters and lists of numbers and characters*

In [9]:
my_name = 'Student'
print(my_name)

Student


Note that printing a variable is different than printing a string. **This is a common mistake** to be aware of when performing operations with strings

In [10]:
print(my_name)
print('my_name')

Student
my_name


You can join strings by using the `+` sign

In [11]:
uni_name = 'University ' + 'of ' + 'British Columbia'
print(uni_name)

University of British Columbia


### Variable naming rules:
- Variable names can contain **letters**, **numbers** and **underscores ( _ )**
- Variable names CANNOT start with a number
- Special characters (#, %, !, ^, &, etc.) are not allowed in variable names
- Variable names cannot be any of the **reserved Python keywords**, which can be found in the [documentation](https://www.tutorialspoint.com/What-are-Reserved-Keywords-in-Python)
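
One possible way to check these rules from code is sketched below. It uses the standard `keyword` module and the string method `isidentifier()`; the helper name `is_valid_name` and the candidate names are made up for this illustration and are not part of the original notebook:

```python
import keyword

def is_valid_name(name):
    # isidentifier() checks the character rules;
    # keyword.iskeyword() checks the reserved keywords
    return name.isidentifier() and not keyword.iskeyword(name)

# Hypothetical candidate names, for illustration only
candidates = ['my_name', 'x2', '_total', '2nd_value', 'flow-rate', 'for']

for name in candidates:
    print(name, is_valid_name(name))
```

The first three names pass the rules; the last three break one rule each (leading number, special character, reserved keyword).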

## Variable or Data Types

- **Booleans**: True or False
- **Numbers**:
 - Integers (1, 108, -12)
 - Floating-point numbers (0.2, 3.14)
 - Complex numbers (1+3j, 3.5+12j)
- **Strings**: Any type of text
 - 'John', 'Python', 'Morning'
 
We can see the data type of any variable or value by using the `type( )` function in Python.

Let's try this with a couple of examples - you can replace the argument of the function to try out different data types

In [12]:
type(True)

bool

In [13]:
type('Python')

str
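
As a small additional sketch, `type()` can be applied to each of the number types listed above, and to a variable. The values are the same kind of examples used in the list and carry no special meaning:

```python
kind_of_int = type(108)          # an integer
kind_of_float = type(3.14)       # a floating-point number
kind_of_complex = type(1 + 3j)   # a complex number

x = 2.5
kind_of_x = type(x)              # type() also works on variables

print(kind_of_int, kind_of_float, kind_of_complex, kind_of_x)
```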

## Manipulating Text

Looking back on the `uni_name` variable that was defined in the Variables section above, we can see that variables remain available across cells
> Hint: If running the cell below returns an error, remember that you must first assign a value to a variable before you can use it. Try running the cell where `uni_name` is defined

In [14]:
print(uni_name)

University of British Columbia


We can manipulate the variable using tools called *Methods*

To capitalize the entire string, we use the `upper()` method. Similarly, `lower()` is used to make the entire string lowercase.

In [15]:
uni_name.upper()

'UNIVERSITY OF BRITISH COLUMBIA'

In [16]:
uni_name.lower()

'university of british columbia'

Methods exist for other variable types as well. These will be covered more [later](#Methods). 

## It's Coding Time!

A basic program takes some inputs, performs one or more operations, and generates output. 

Here is what we'll do for our first program:

- Create two variables **a** and **b** and assign two numbers to them.
- Using the `print()` function we learned before, display the phrase 'The sum of a and b is =' followed by the value of **a+b**

In [17]:
# Try it in this code cell




The program probably looks something like this:
> `a = 65.2`  
`b = 72`  
`print('The sum of a and b is =', a + b)`

There are other ways to complete the same task as this program does.  
Feel free to experiment with the little program above and see how many ways you can complete the same objective.

## Formatted Strings

Try to display this in the output:
> a = 65.2, b = 72, and the sum of a and b is 137.2

You can do this by joining many pieces of text, but you still don’t have any control over how the numbers are shown.

In [18]:
# Try it in this code cell




*Formatted Strings* are an efficient way to embed variables (numbers or other strings) inside of a string.
> The `f` before the quotation marks tells Python that this is a formatted string

In [19]:
a = 65.2
print(f'a is equal to {a}')

a is equal to 65.2


With formatted strings, not only can you easily place variables inside text, but you can also control the **format** of the displayed value.  
  The pattern is: `f'{(variable_name):(width).(precision)(type)}'`

In [20]:
a = 3.1415926
print(f'a is {a}')
print(f'a is {a:2.3f}')

a is 3.1415926
a is 3.142
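
To see what each part of the pattern does, here is a minimal sketch with a few common combinations. The variable names and values are chosen only for illustration; `f`, `e` and `d` are standard format types for fixed-point, scientific and integer output:

```python
a = 3.1415926
n = 42

fixed = f'{a:8.2f}'            # width 8, 2 decimals, fixed-point
scientific = f'{a * 1000:.3e}' # 3 decimals, scientific notation
integer = f'{n:5d}'            # width 5, integer

# Square brackets make the padding spaces visible
print(f'[{fixed}]')
print(f'[{scientific}]')
print(f'[{integer}]')
```

The width is a minimum: shorter values are padded with spaces, while longer values are printed in full.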


## Comments in Python

Comments in Python usually start with a `#` sign, like the example below:

In [21]:
# This line is a comment, Python ignores this.
a = 65.2
print(f'a is equal to {a}')

a is equal to 65.2


The comment will be ignored by the interpreter and not displayed in the output. It is useful for documenting code and noting the intended purpose of the program. 

## Bags of Data in Python

Let’s assume that we want to save the x coordinates of a number of points.  
How would you do it? Store each x coordinate in a separate variable?

We can store a bunch of numbers or strings inside an *object*, which is like a bag:

In [22]:
numbers = [2.5, 1, 2, -110, 2]
names = ['Niagara', 'Rocky', 'Mercedes-Benz']
names_nums= ['Niagara', 'Rocky', 3.14, 9]

Each of the above “bags” of data is called a **List**. Lists are ordered collections of numbers, strings, etc. 

The `len()` function gives us the number of items in a list.

In [23]:
len(numbers)

5

## Indexing in Python

So, how can we access different items in a list?

Each item in a list has a unique **index** or simply a **number**, which shows the location of that item in the list.

In [24]:
numbers = [2.5, 1, 2, -110, 2]
numbers[3]

-110

In [25]:
numbers[0]

2.5

> **Note**: As seen above, indices in Python start from **0**. This is important to remember when calling items from a list or array.

## Slicing Lists

We can extract certain elements from a list by *slicing*

Slicing follows the syntax `list_name[start:stop:step]`. The item at index `start` is included, while the item at index `stop` is not.

In [26]:
#Play around with the values in the slice to see what the output is
nums = [10, 20, 30, 40, 50, 60, 70, 80, 90]
some_nums = nums[2:7:2]
some_nums

[30, 50, 70]
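
The sketch below shows a few more slicing variations on the same list. It relies on two standard Python behaviours that are not covered above: parts of the slice can be left out, and negative numbers count from the end of the list.

```python
nums = [10, 20, 30, 40, 50, 60, 70, 80, 90]

first_three = nums[:3]       # start omitted: begin at index 0
last_two = nums[-2:]         # negative index: count from the end
every_third = nums[1:8:3]    # indices 1, 4 and 7
backwards = nums[::-1]       # negative step: walk through the list in reverse

print(first_three, last_two, every_third, backwards)
```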

## Two Dimensional Lists

Just as we can have lists of numbers or words, we can also define lists of other lists:

In [27]:
ones = [1, 1, 1, 1, 1, 1]
twos = [2, 2, 2, 2]
threes = [3, 3, 3, 3, 3]
numbers = [ones, twos, threes]
numbers

[[1, 1, 1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3, 3]]

Now, the **numbers** list is actually a **list of lists**, meaning that it contains three other lists.
The sub-lists don’t necessarily have to be the same size as each other.

To access items in each sub-list, we just need to use the indexing notation twice:

In [28]:
numbers[0][3]

1

## Some List Methods

<a id='Methods'></a>
Let’s consider the list `numbers = [2.5, 1, 2, -110, 2]`:
- `numbers.append(object)`: Adds an item to the end of the list.
- `numbers.sort()`: Sorts the items numerically or alphabetically (this only works if all items are numbers or all items are strings).
- `numbers.reverse()`: Reverses the order of the list items.
- `numbers.copy()`: Creates a copy of the list (Why does this method exist? hmmm...)
- `numbers.insert(index, object)`: Inserts an object at a specific index in the list.
- `numbers.count(object)`: Returns the number of times `object` appears in the list.

In [29]:
# Try some of the methods in this cell
numbers = [2.5, 1, 2, -110, 2]
numbers.append(3)
numbers

[2.5, 1, 2, -110, 2, 3]
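
Here is one possible walk through several of these methods in sequence. The inserted value 99 and the variable names are arbitrary choices for this sketch:

```python
numbers = [2.5, 1, 2, -110, 2]

twos = numbers.count(2)         # how many times 2 appears in the list
numbers.insert(1, 99)           # place 99 at index 1
after_insert = numbers.copy()   # a snapshot that later changes will not touch

numbers.sort()                  # sorts the list in place, smallest first
after_sort = numbers.copy()

numbers.reverse()               # reverses the list in place

print(twos)
print(after_insert)
print(after_sort)
print(numbers)
```

Note that `sort()` and `reverse()` change the list itself instead of returning a new one, which is why `copy()` is used here to keep the intermediate results.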

## Removing Items from a List

How do we remove an item from a list?

In [30]:
numbers = [2.5, 1, 2, -110, 2]
del numbers[3]
print(numbers)

[2.5, 1, 2, 2]


There are other ways to remove items:
- `numbers.remove(object)`: Removes the first occurrence of `object`.
- `numbers.clear()`: Removes every item in the list, leaving an empty list (similar to `numbers = []`).
- `numbers.pop()`: Removes the last item in the list and returns it.

In [31]:
# Try some of the methods here
numbers = [2.5, 1, 2, -110, 2]
numbers.remove(2)
print(numbers)

[2.5, 1, -110, 2]
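
A short sketch of `pop()` and `clear()` on the same list follows. Passing an index to `pop()` is standard list behaviour that is not mentioned above; the variable names are illustrative:

```python
numbers = [2.5, 1, 2, -110, 2]

last = numbers.pop()       # removes and returns the last item
first = numbers.pop(0)     # with an index, pop removes that specific item
remaining = numbers.copy() # what is left at this point

numbers.clear()            # empties the list in place

print(last, first, remaining, numbers)
```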


## List Concatenation

How can we join two lists together?

In [32]:
numbers = [2.5, 1, 2, -110, 2]
names = ['Niagara', 'Rocky']
new_list= numbers + names
print(new_list)

[2.5, 1, 2, -110, 2, 'Niagara', 'Rocky']


Did you notice that we can use this method to add items to a list, just like the `numbers.append(object)` method?

You have to be careful, though! The following will not work *(why?)*:

In [33]:
# This code should produce an error
new_list= numbers + 'Niagara'

TypeError: can only concatenate list (not "str") to list

> Hint: Look at the error statement to find out what went wrong

The `*` operator can also be used with lists.

In [34]:
# Avoid naming a variable "list": that would hide Python's built-in list type
list1 = [1, 2, 3]
list2 = 2*list1
print(list2)

[1, 2, 3, 1, 2, 3]


## Tuples

Tuples in Python are collection objects, just like lists. To define a tuple, we use `()` instead of `[]`:

In [35]:
a_tuple= (2.5, 1, 2, 'Niagara', 'Rocky')

The main difference between a list and a tuple is that you can change the items in a list, remove items and change the order. This is not possible for tuples. In other words, lists are **mutable** and tuples are **immutable**:

In [36]:
# This cell should produce an error
del a_tuple[2]

TypeError: 'tuple' object doesn't support item deletion

Since tuples cannot be changed, Python can generally go through their items faster. They also occupy less memory.
Tuples have only two methods: `a_tuple.count()` and `a_tuple.index()`.
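
The two tuple methods, and the immutability described above, can be sketched as follows. The tuple here has one extra item compared with the earlier example so that `count()` has something to count; the `try`/`except` block is only a way to show the error without stopping the program:

```python
a_tuple = (2.5, 1, 2, 'Niagara', 'Rocky', 2)

how_many_twos = a_tuple.count(2)          # number of times 2 appears
where_is_rocky = a_tuple.index('Rocky')   # index of the first match

try:
    a_tuple[0] = 100                      # tuples do not allow item assignment
except TypeError as error:
    message = str(error)

print(how_many_twos, where_is_rocky)
print(message)
```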

## Dictionaries

Dictionaries in Python are collections of key-value pairs that are changeable and indexed by key. Since Python 3.7, they also keep their items in insertion order. Dictionaries are written with curly brackets `{ }`.

In [37]:
# Defining a dictionary
my_dictionary = {
  "school": "UBC",
  "faculty": "APSC",
  "department": "CHBE",
  "grad year": 2022
}
print(my_dictionary)

{'school': 'UBC', 'faculty': 'APSC', 'department': 'CHBE', 'grad year': 2022}


In the example above, "school" is a key and "UBC" is a value.

To access a value from a dictionary, call its key in square brackets.

In [38]:
# Accessing items from a dictionary
x = my_dictionary["department"]
print(x)

CHBE
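
Because dictionaries are changeable, values can be updated and new keys added after creation. The sketch below assumes the same dictionary as above; the `"program"` key, its value and the `"campus"` lookup are hypothetical and used only for illustration:

```python
my_dictionary = {
    "school": "UBC",
    "faculty": "APSC",
    "department": "CHBE",
    "grad year": 2022,
}

my_dictionary["grad year"] = 2023                 # change an existing value
my_dictionary["program"] = "MASc"                 # add a new (hypothetical) key
campus = my_dictionary.get("campus", "unknown")   # get() returns a default if the key is missing

# Loop over all key-value pairs
for key, value in my_dictionary.items():
    print(f'{key}: {value}')
```

Using `get()` with a default is one way to avoid a `KeyError` when a key may not exist.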


## Packages and Modules

We have seen functions such as `print()`, `type()` or `len()` so far. These are called Python’s built-in functions.

Python has many other functions that are not loaded by default. Also, there are a lot of external libraries we may want to use.

In order to use other functions, we have to import them in our code

<p style="text-align:center;">
<img src="images/crash_course_1.png" width=600>
</p>

The `import` command can take the following forms:
- `import package.subpackage1.module2`
> The function is then called as `package.subpackage1.module2.function1()`
- `from package.subpackage1.module2 import function1`
- `from package.subpackage1.module2 import *`
> Both of the forms above make the function available directly as `function1()`

Also, we can create a nickname for an imported package, module or function:
- `import package.subpackage1.module2 as mod2`
> The function is then called as `mod2.function1()`
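
As a concrete sketch of these forms, the example below uses `os.path` from Python's standard library in place of the generic `package.subpackage1.module2` names. The file path is hypothetical and the file does not need to exist:

```python
# Form 1: import the module and call the function by its full name
import os.path
name_1 = os.path.basename('data/results.csv')

# Form 2: import one function directly
from os.path import basename
name_2 = basename('data/results.csv')

# Form 3: give the module a nickname
import os.path as osp
name_3 = osp.basename('data/results.csv')

print(name_1, name_2, name_3)
```

All three forms call the same function; they differ only in how much has to be typed at the point of use.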

## The Numpy Package

> We can import the Numpy package with the following line: `import numpy as np` 

- The Numpy (or Numerical Python) package is the fundamental Python library for scientific computing.
- It provides the powerful **array** object, and a vast collection of mathematical tools to work with arrays (linear algebra, basic statistics, etc.)
- An array is basically a (multi-dimensional) grid of data of the same type.
- Most of the higher level packages (e.g. Pandas, Scikit-learn, Scipy, etc.) are built on top of Numpy.

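Recall that pure Python is much slower than compiled languages such as C++ or Fortran. So, how do we deal with that? Numpy runs its array operations in fast, compiled code.

Why not use lists instead?

**ndarray** in Numpy supports vectorized operations, whereas lists do not: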

In [39]:
# This cell should produce an error
my_list= [2.5, 1, 2, -110, 2]
my_list + 25

TypeError: can only concatenate list (not "int") to list

In [40]:
import numpy as np
my_array= np.array([2.5, 1, 2, -110, 2])
my_array+ 25

array([ 27.5,  26. ,  27. , -85. ,  27. ])

**ndarray** has consistent dimensions. You cannot have an array with columns or rows of different sizes.

**ndarray** (most often) has consistent data-type, whereas you can store data of different types in a list.

An array object typically occupies much less space in memory than a list holding the same numbers.
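
To make the comparison with lists concrete, here is a minimal sketch that adds 25 to every value in both ways. The `for` loop is a Python construct not covered in this session; it is used here only to show the extra work a list requires:

```python
import numpy as np

my_list = [2.5, 1, 2, -110, 2]
my_array = np.array(my_list)

# With a list, each item has to be handled one at a time
list_result = []
for item in my_list:
    list_result.append(item + 25)

# With an array, the operation is applied to every element at once
array_result = my_array + 25

print(list_result)
print(array_result)
```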

## Basics of Numpy Arrays

Creating a one dimensional array (i.e., a vector) with custom values:
> Don't forget to import numpy if you haven't already

In [41]:
#import numpy as np
array_1d = np.array([2.5, 1, 2, -110, 2])
array_1d

array([   2.5,    1. ,    2. , -110. ,    2. ])

Creating a two dimensional array (i.e., a matrix) with custom values:

In [42]:
array_2d = np.array([[2.5, 1, 2, -110, 2],[0, 1, 0, 1, 1]])
array_2d

array([[   2.5,    1. ,    2. , -110. ,    2. ],
       [   0. ,    1. ,    0. ,    1. ,    1. ]])

Size and shape of an array:

In [43]:
array_1d.size

5

In [44]:
array_1d.shape

(5,)

In [45]:
array_2d.size

10

In [46]:
array_2d.shape

(2, 5)

## Array Indexing and Slicing

**Array indexing** is similar to list indexing.  
A visual guide to *how* this data is stored can be found [here](http://jalammar.github.io/visual-numpy/) to help you better understand single and multi-dimensional arrays.

When an array has more than one dimension, the preferred way is to use a single pair of `[ ]` with comma-separated indices, instead of the chained `[][]` notation used for nested lists:

In [47]:
# Preferred method
array_2d[0, 3]

-110.0

In [48]:
array_2d[0][3]

-110.0

Slicing arrays is also done in the same way:

In [49]:
array_2d[0, 2:4]

array([   2., -110.])
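
A few more slices of the same two-dimensional array are sketched below. A lone `:` selects everything along that dimension; the variable names are illustrative:

```python
import numpy as np

array_2d = np.array([[2.5, 1, 2, -110, 2], [0, 1, 0, 1, 1]])

first_row = array_2d[0, :]       # every column of row 0
second_column = array_2d[:, 1]   # every row of column 1
sub_block = array_2d[:, 2:4]     # both rows, columns 2 and 3

print(first_row)
print(second_column)
print(sub_block)
```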

## Functions for Creating Arrays

It’s neither easy nor efficient (and for large arrays, practically impossible!) to create arrays manually. Numpy has many built-in functions for array generation.
- The two Numpy functions `np.arange(start, stop, step)` and `np.linspace(start, stop, num)` are used to create arrays with either a specified **step size** or a specified **number of values**. Note that `np.arange` excludes `stop`, while `np.linspace` includes it by default.
- `np.zeros((N, M))` and `np.ones((N, M))` create arrays with shape *(N, M)* filled with zeros or ones.
- `np.eye(N)` creates the identity matrix of size *N*.
- `np.random.rand(N, M)` creates an array of shape *(N, M)* filled with random **real** numbers between 0 (inclusive) and 1 (exclusive).
- `np.random.randint(a, b, size=[N,M])` creates an array of shape *(N, M)* filled with random **integer** numbers from *a* (inclusive) up to *b* (exclusive).

There are also similar functions to generate random numbers with a certain probability distribution!
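
The functions listed above can be tried out with a sketch like the following. All sizes and ranges are arbitrary choices for illustration, and the random arrays will differ on every run:

```python
import numpy as np

stepped = np.arange(0, 1, 0.25)    # step size 0.25; the stop value 1 is excluded
spaced = np.linspace(0, 1, 5)      # 5 evenly spaced values; the stop value 1 is included
zeros = np.zeros((2, 3))           # 2 rows, 3 columns of zeros
ones = np.ones((2, 3))             # 2 rows, 3 columns of ones
identity = np.eye(3)               # 3x3 identity matrix
random_reals = np.random.rand(2, 3)                  # values in [0, 1)
random_ints = np.random.randint(1, 7, size=[2, 3])   # integers 1 to 6, like dice rolls

print(stepped)
print(spaced)
print(identity)
```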

> **Exercise:** Create the array [0, 10, 20, 30, ... 100] using one of the methods described above.

In [53]:
# Ensure numpy has been imported

## Basic Array Statistics

- `np.mean(array, axis=None)` computes the mean of all values in an array, or along a particular *axis*.
- `np.median(array, axis=None)` computes the median of all values in an array, or along a particular *axis*.
- `np.std(array, axis=None)` computes the standard deviation (a measure of the scatter in data) of all values in an array, or along a particular *axis*.
- `np.max(array, axis=None)` or `np.min(array, axis=None)` computes the maximum or minimum of all values in an array, or along a particular *axis*.
- We can also compute these values with **array methods** such as `array_2d.mean()`, `array_2d.max()`, `array_2d.min()` and `array_2d.std()`.
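
The effect of the *axis* argument is easiest to see on a small example. The array below is made up for this sketch: `axis=0` works down the rows (one result per column) and `axis=1` works across the columns (one result per row).

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

overall_mean = np.mean(data)           # mean of all six values
column_means = np.mean(data, axis=0)   # one mean per column
row_means = data.mean(axis=1)          # one mean per row, using the array method
row_max = np.max(data, axis=1)         # largest value in each row
overall_std = data.std()               # standard deviation of all values

print(overall_mean, column_means, row_means, row_max, overall_std)
```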

## Reshaping Arrays

Array elements are stored **sequentially** in memory. The shape of an array is just a way that it is viewed.

The reshape function provides a way to give a new shape to an array, as long as the total number of elements stays the same:
`np.reshape(array, newshape)`

We can also use the reshape method directly on an array: `array.reshape(newshape)`

Again, you can find a visual guide to how data is stored [here](http://jalammar.github.io/visual-numpy/) to help you understand the different array shapes. 


In [51]:
# Try rearranging the following 2D array
array = np.array([[1,1,1,1],[2,2,2,2],[3,3,3,3]])
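
As a separate sketch on a different array (so the exercise above is left for you), here are a few reshapes of 12 values. Using `-1` for one dimension is a standard Numpy feature that asks Numpy to work out that dimension from the total number of elements:

```python
import numpy as np

values = np.arange(12)                    # 12 elements: 0, 1, ..., 11

three_by_four = values.reshape((3, 4))    # method form
two_by_six = np.reshape(values, (2, 6))   # function form
auto_columns = values.reshape((4, -1))    # -1: Numpy computes the missing dimension
flat_again = three_by_four.reshape(-1)    # back to one dimension

print(three_by_four)
print(two_by_six.shape, auto_columns.shape, flat_again.shape)
```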

## Combining Arrays

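- `np.concatenate((array1, array2), axis=0)` joins arrays along an existing axis.
- `np.hstack((array1, array2))` stacks arrays horizontally (column-wise).
- `np.vstack((array1, array2))` stacks arrays vertically (row-wise).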

<p style="text-align:center;">
<img src="images/crash_course_2.png" width=600>
</p>

In [52]:
# Try combining the following two arrays using different functions
array1 = np.array([3,4,5])
array2 = np.array([6,7,8])

np.hstack((array1,array2))

array([3, 4, 5, 6, 7, 8])
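
For comparison with `np.hstack`, the sketch below applies the other two functions to the same arrays. The variable names are illustrative:

```python
import numpy as np

array1 = np.array([3, 4, 5])
array2 = np.array([6, 7, 8])

stacked_rows = np.vstack((array1, array2))          # each array becomes a row: shape (2, 3)
joined = np.concatenate((array1, array2), axis=0)   # for 1D arrays, same result as hstack

# For 2D arrays, the axis decides the direction of the join
side_by_side = np.concatenate((stacked_rows, stacked_rows), axis=1)   # shape (2, 6)

print(stacked_rows)
print(joined)
print(side_by_side)
```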

## More on Numpy

Numpy provides **vectorized** versions of most math functions such as `np.sin()`, `np.cos()`, `np.tan()`, `np.exp()`, `np.log()`, `np.sqrt()` and many others. Vectorized functions operate directly on each element of an array. These are called **element-wise** operations.

With Numpy arrays, operations such as `+`, `-`, `*`, `/` and `**` are all **element-wise** operations by default.
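
A minimal sketch of element-wise behaviour, using two small arrays invented for this example:

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])
y = np.array([2.0, 2.0, 2.0])

roots = np.sqrt(x)   # square root of each element
product = x * y      # element-by-element product, not a matrix product
powers = x ** y      # each element of x raised to the matching element of y

print(roots, product, powers)
```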

Slicing in Numpy returns a view of the original array, so changing the slice also changes the original. In order to explicitly copy an array, we have to use `array2 = array1.copy()`.
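
The difference between a view and a copy can be seen in this short sketch. The values are arbitrary:

```python
import numpy as np

array1 = np.array([1, 2, 3, 4, 5])

view = array1[1:4]       # a view: shares its data with array1
view[0] = 100            # this also changes array1[1]

array2 = array1.copy()   # an independent copy
array2[0] = -1           # array1 is not affected

print(array1)
print(array2)
```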

To delete elements, rows or columns we can use `np.delete(arr, index, axis=None)`.

To add elements to an array, we can use `np.append(arr, values, axis=None)` to add to the end of an array, or `np.insert(arr, index, values, axis=None)` to insert values before a given index.
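
These three functions can be sketched on a small array as follows. Unlike the list methods seen earlier, they return a new array and leave the original unchanged; the values are made up for illustration:

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

without_second = np.delete(arr, 1)      # removes the item at index 1
with_extra = np.append(arr, [50, 60])   # adds values at the end
with_middle = np.insert(arr, 2, 25)     # inserts 25 before index 2

print(without_second)
print(with_extra)
print(with_middle)
print(arr)                              # the original array is unchanged
```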

## Summary

Today, we have started to use **IPython** and **JupyterLab** to write our programs. We have learned a number of core concepts such as variables and how to work with them in Python. Many of the concepts you’ve learned today, such as variables and their different types, math operations, and many others that you will learn in the coming sessions, are **common** to most programming languages. The differences are usually about syntax, plus some other subtle things here and there that we will talk about. There are also some other concepts that are exclusive to Python.
