# Python tutorial

Since 2021 Q4, Python has been the [most popular programming language](https://madnight.github.io/githut/#/pull_requests/2022/1) on GitHub as measured by pull requests. Python code is often said to read almost like pseudocode, as it allows you to express powerful ideas in very few lines of code while remaining readable. As an example, here is an implementation of the classic quicksort algorithm in Python:

In [1]:
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3, 6, 8, 10, 1, 2, 1]))

[1, 1, 2, 3, 6, 8, 10]


If you are already familiar with Python, you may skip this tutorial. Otherwise, it serves as a quick crash course in Python programming and as the basis for the subsequent tutorials on data analytics and visualization. In this tutorial, we will cover:

* Coding environment of Python
* Basic data types and Containers
* Control flows and Functions 
* Classes and Module imports
* How to look for help

## Jupyter Notebooks

Before we dive into Python, we'd like to briefly talk about *notebooks*.
A [Jupyter notebook](https://jupyter.org/try) lets you write and execute Python code in your web browser alongside your own documentation. The basic unit of a notebook is the block (Jupyter calls it a *cell*), which comes in two types:
1. **Code blocks**, where we write and execute Python code, and
2. **Markdown blocks**, where we write our thoughts as text.

Jupyter is an excellent place to test code piece by piece and record your ideas at the same time; for this reason, it is widely used in data analytics. What's more 🤔? The [Binder](https://mybinder.org/) service allows us to run Python code entirely in the *cloud*. Binder is basically Jupyter notebook on steroids: it's free, requires no setup, comes preinstalled with many packages, and is easy to share with the world.

We will use Jupyter notebooks throughout this module and all the tutorials:
+ **Run Tutorials in Binder (recommended)**.
Just click the rocket logo 🚀 and then Binder at the very top of each tutorial.
+ **Run Tutorials in Jupyter Notebook**.
If you wish to run the notebooks locally, we recommend installing [Anaconda](https://www.anaconda.com/products/individual) to manage your Python environment. After installation, open `Anaconda Navigator` and launch Jupyter Notebook from the 'Home' page; you will get the same interface as in Binder. On the 'Environments' page, you can install many third-party packages.

## Basic data types

### Numbers

Integers and floats work as you would expect from other languages. Python infers the data type of a variable from the value assigned to it, and the built-in function `type()` reports that type.

In [2]:
x = 3  # x is a variable assigned the integer value 3
y = 1.0

print(x, type(x))  # print() is a built-in function for printing
print(y, type(y))

3 <class 'int'>
1.0 <class 'float'>


Python also supports the common arithmetic operators for numbers, as well as augmented assignment operators such as `+=` and `*=`.

In [3]:
print(x + 1)  # Addition
print(x * 2)  # Multiplication
print(x ** 2) # Exponentiation
print(x // 2) # Floor division

4
6
9
1


In [4]:
print(y)
y += 1  # Same as y = y + 1
print(y)
y *= 2  # Same as y = y * 2
print(y)

1.0
2.0
4.0
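
The operators above leave out ordinary division and type conversion. The sketch below is an illustrative addition, and the variable names in it are made up for this example. It shows that `/` always produces a float, while `//` and `%` give the floored quotient and the remainder.

```python
a, b = 7, 2

true_div = a / b     # True division always returns a float
floor_div = a // b   # Floor division rounds the quotient down
remainder = a % b    # Modulo gives the remainder of the division
as_float = float(a)  # Explicit conversion from int to float
as_int = int(3.9)    # Conversion to int truncates toward zero

print(true_div, floor_div, remainder)
print(as_float, as_int)
```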


### Booleans

In Python, the two Boolean constants are written as `True` and `False`.

In [5]:
t, f = True, False  # Python can do multiple assignments in one line
print(type(t), type(f))

<class 'bool'> <class 'bool'>


Now let's look at the logical operators for Booleans: `and`, `or` and `not`.

In [6]:
print(t and f) # Logical AND
print(t or f)  # Logical OR
print(not t)   # Logical NOT

False
True
False


Comparison operators applied to a pair of numbers produce Boolean results.

In [7]:
x, y, z = 3, 1.0, 3.0
print(x < y)   # Return True if x is LESS than y
print(x == z)  # Return True if x is EQUAL to z

False
True
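
Beyond `<` and `==`, Python has the other usual comparison operators and also allows comparisons to be chained. The following is a small sketch with illustrative variable names, not part of the original notebook.

```python
x = 3

not_equal = x != 4     # True when the two values differ
at_least = x >= 3      # Greater than or equal to
in_range = 1 < x < 5   # Chained comparison: same as (1 < x) and (x < 5)

print(not_equal, at_least, in_range)
```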


### Strings

In [8]:
h = 'hello'   # String literals can use single quotes
w = "world"   # or double quotes
print(h, len(h))

hello 5


When applied to strings, the `+` operator performs concatenation; however, it cannot join a string and a number directly.

In [9]:
hw = h + ' ' + w  # String concatenation
print(hw)

hello world


In [10]:
hw1 = h + ' ' + w + 1  # Cannot concatenate with numbers

TypeError: can only concatenate str (not "int") to str

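However, we can format strings using the values of other variables. One common way is to place `%` specifiers in the string and supply the values after it, quite similar to C and MATLAB. Here `%s` inserts a string, `%d` inserts an integer, and `%-10s` inserts a string left-aligned in a field 10 characters wide.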

In [11]:
hw1 = "%s %-10s! Number: %d"%(h, w, 1) # String formatting
print(hw1)

hello world     ! Number: 1


The `%` approach above inserts values in the order they appear. Another way to format strings is to place `{}` placeholders within the string and call its `format()` method to insert values. This allows us to insert values in a different order or even use a value multiple times.

In [12]:
hw2 = '{} {}! Number: {}'.format(h, w, 2)  # String formatting by sequence
print(hw2)
hw3 = '{1} {0}! Number: {2:.2f} {0}'.format(h, w, 3)  # String formatting by specifying orders and formats
print(hw3)

hello world! Number: 2
world hello! Number: 3.00 hello
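
Two further options are worth a quick sketch, added here as an illustration rather than taken from the original notebook: converting a number with `str()` so that `+` can concatenate it, and f-strings (available since Python 3.6), which evaluate the expressions inside `{}` in place.

```python
h, w = 'hello', 'world'

# str() converts a number to a string so that + can concatenate it
joined = h + ' ' + w + ' ' + str(1)
print(joined)

# An f-string evaluates the expressions inside {} in place;
# the format specifier after the colon works as in format()
number = 3
formatted = f'{h} {w}! Number: {number:.2f}'
print(formatted)
```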


A string in Python is also an object, which comes with many useful methods; for example:

In [13]:
print(h)
print(h.upper())       # Convert a string to uppercase; prints "HELLO"
print(h.replace('l', '(ell)'))  # Replace all instances of one substring with another

hello
HELLO
he(ell)(ell)o


You can find more information about Python's basic data types in the official [documentation](https://docs.python.org/3.7/library/stdtypes.html), such as a list of [all string methods](https://docs.python.org/3.7/library/stdtypes.html#string-methods).

## Containers

It would be really cumbersome to manage every piece of data with a separate variable. Python includes four built-in container types to store collections of data: **lists**, **dictionaries**, **sets**, and **tuples**.

### Lists

Lists are used to store multiple items in a single variable. In Python syntax, they are enclosed in square brackets `[]` with items separated by commas `,`. Note that the elements of a Python list can have different data types.

In [14]:
ls = [3, 1, 'foo']  # This list contains three elements with different types
print(ls, len(ls))  # Built-in function len() returns the number of elements

[3, 1, 'foo'] 3


After creation, we can use the `append` method to add an element to the end of a list, and the `pop` method to remove and return the last element (or the element at a given index). Other methods of list objects can be found [here](https://docs.python.org/3/tutorial/datastructures.html).

In [15]:
ls.append('bar') # Add a new element to the end of the list
print(ls)
ls.pop()         # Remove and return the last element of the list
print(ls)

[3, 1, 'foo', 'bar']
[3, 1, 'foo']
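
As a further sketch of list methods (an illustrative addition that rebuilds the list so it runs on its own), `pop` can also take an index, `insert` adds an element at a given position, and `remove` deletes the first element equal to a given value.

```python
ls = [3, 1, 'foo', 'bar']

removed = ls.pop(1)     # Remove and return the element at index 1
print(removed, ls)

ls.insert(0, 'first')   # Insert a new element before index 0
print(ls)

ls.remove('foo')        # Remove the first element equal to 'foo'
print(ls)
```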


There are two ways to retrieve value(s) from lists:

1. **Index** one item:
Just put the index number within square brackets `[]`. Note that in Python, **indexing starts from 0**. Indexing can also go **in reverse order using negative values**, as follows.
<div>
<img src="attachment:197c0d76-bd8d-47fb-a946-e1706393a447.png" width="500"/>
</div>

In [16]:
print(ls[2])     # Indexing 3rd element; list indexing starts from 0
print(ls[-1])    # Negative indices count from the end of the list

foo
foo


2. **Slice** a part: 
Slicing is done by giving the index of the first element (a) and the index just past the last element (b) in the form `parentlist[a:b]`. **Note that the element at index b is not included in the resulting slice**. If a (or b) is omitted, the slice starts from the first element (or runs through the last).

In [17]:
nums = [0, 1, 2, 3, 4, 5, 6]
print(nums)
print(nums[2:4])    # Get a slice from index 2 to 4 (exclusive)
print(nums[2:])     # Get a slice from index 2 to the end
print(nums[:-1])    # Slice indices can also be negative
nums[2:4] = [8, 9]  # Assign a new sublist to a slice
print(nums)

[0, 1, 2, 3, 4, 5, 6]
[2, 3]
[2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5]
[0, 1, 8, 9, 4, 5, 6]


You can also slice with a fixed step length (c): `[a:b:c]`.

In [18]:
print(nums[:-1:2])  # Get a slice from index 0 to -1 (exclusive) in a step length of 2
print(nums[::-1])   # Get a slice of whole list in reverse order

[0, 8, 4]
[6, 5, 4, 9, 8, 1, 0]


We will meet slicing again in the `NumPy` [tutorial](./numpy-basic.ipynb).

### Dictionaries

A dictionary stores pairs of `key` and `value`, written within braces as `{key: value}`. A dictionary works like a small lookup table: you retrieve a value by a key that you define, such as a string, rather than by its position.

In [19]:
d = {'cat': 'cute', 'dog': 'furry'}  # Create a new dictionary with some data
print(d['cat'])       # Get a value from a dictionary

cute


In [20]:
print('fish' in d)  # `in` is the membership operator to check the presence
d['fish'] = 'wet'   # Set a new entry in a dictionary
print('fish' in d)

False
True


One useful built-in method of dictionaries is `get`, which returns the value for a key, or a default that you supply when the key does not exist.

In [21]:
print(d.get('monkey', 'N/A'))  # Get a value with a default
print(d.get('fish', 'N/A'))    # Get a value with a default

N/A
wet
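
Entries can also be overwritten and removed. The sketch below is an illustrative addition that uses the same example data; it shows assignment to an existing key, the `del` statement, and the dictionary's `pop` method.

```python
d = {'cat': 'cute', 'dog': 'furry', 'fish': 'wet'}

d['cat'] = 'sleepy'      # Assigning to an existing key overwrites its value
del d['dog']             # Remove an entry by key
popped = d.pop('fish')   # Remove an entry and return its value

print(d)
print(popped)
print(list(d.keys()), list(d.values()))
```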


### Tuples

A tuple is an **immutable**, ordered counterpart of a list, written with parentheses `()`. In many ways it behaves like a list; one of the most important differences is that tuples can be used as keys in dictionaries, while lists cannot.

In [22]:
t1 = (5, 6)  # Create a tuple
t2 = (7, 8)
d = {t1:'Group A', t2:'Group B'}

print(d, type(t1))
print(d[t1], d[(7, 8)])       

{(5, 6): 'Group A', (7, 8): 'Group B'} <class 'tuple'>
Group A Group B


In [23]:
t1[0] = 1  # Tuples are immutable after creation, so this raises an error

TypeError: 'tuple' object does not support item assignment
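
Sets, the fourth container type named at the start of this section, are not demonstrated above. As a minimal sketch with made-up example data: a set is an unordered collection of distinct elements, written with braces `{}`.

```python
animals = {'cat', 'dog', 'cat'}   # Duplicate elements are dropped
animals.add('fish')               # Add a new element

print('cat' in animals)           # Membership test
print(len(animals))               # Number of distinct elements

pets = {'dog', 'bird'}
common = animals & pets           # Intersection: elements in both sets
either = animals | pets           # Union: elements in at least one set
print(common)
```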

## Control Flow

### Conditions: `if-elif-else`
Conditional statements run different blocks of code under different conditions, as in the example below. **Note that each block must be indented consistently; the convention is four spaces per level.**

In [24]:
x, y = 10, 12

if x > y:
    print("x>y")  # Indented by four spaces
elif x < y:
    print("x<y")  # Indented by four spaces
else:
    print("x=y")  # Indented by four spaces

x<y


### Loops:
Loops repeat a block of code for each element of a container or for as long as a condition holds.

+ **`for` loops** across an <u>iterable object</u>

A list is a typical iterable object. Here is an example that iterates over a list's elements. The built-in function `range(a, b)` also returns an iterable sequence from a to b (b not included) in increments of 1 (by default), and is very common in `for` loops.

In [25]:
list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

for list1 in list_of_lists:  # Iterate over elements in list_of_lists
    print(list1)             # Indented by four spaces: part of the loop body
print('Bye')    # Not indented, so this is not part of the loop

[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
Bye


In [26]:
for i in range(0, 2):
    print(list_of_lists[i])

[1, 2, 3]
[4, 5, 6]
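
`range` also accepts a third argument for the step, and the built-in `enumerate` yields each element together with its index. The sketch below is an illustrative addition with made-up variable names.

```python
evens = list(range(0, 10, 2))       # Start at 0, stop before 10, step by 2
countdown = list(range(3, 0, -1))   # A negative step counts downwards
print(evens)
print(countdown)

letters = ['a', 'b', 'c']
labelled = []
for index, letter in enumerate(letters):  # Yields (index, element) pairs
    labelled.append((index, letter))
    print(index, letter)
```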


For a dictionary, the `items()` method returns an iterable view of its key-value pairs, which can also be used in `for` loops. In general, any iterable object can be used in a `for` loop.

In [27]:
d = {'person': 2, 'cat': 4, 'spider': 8}

for animal, legs in d.items():
    print('A {} has {} legs'.format(animal, legs))

A person has 2 legs
A cat has 4 legs
A spider has 8 legs


+ **`while` loops** under a specific <u>condition</u>

In [28]:
i = 1
while i < 3:    # Repeat while i is smaller than 3
    print(i**2) # Indented by four spaces: part of the loop body
    i += 1      # Also part of the loop body

1
4
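
Two statements give finer control inside a loop: `continue` skips to the next iteration and `break` leaves the loop entirely. The following sketch is an illustrative addition; it sums the odd numbers up to 7.

```python
total = 0
n = 0
while True:          # Loop until a break statement is reached
    n += 1
    if n % 2 == 0:
        continue     # Skip even numbers and start the next iteration
    if n > 7:
        break        # Leave the loop once n exceeds 7
    total += n       # Only odd numbers up to 7 reach this line

print(total)
```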


### List comprehension and dictionary comprehension
As a special feature, a comprehension offers a compact, single-line loop syntax that builds a new list from the elements of an existing iterable, optionally filtered by a condition.

In [29]:
nums = [0, 1, 2, 3, 4]
squares = [x**2 for x in nums]
print(squares)

even_squares = [x**2 for x in nums if x % 2 == 0]
print(even_squares)

[0, 1, 4, 9, 16]
[0, 4, 16]


Similarly, we can use a dictionary comprehension to create a new dictionary from an existing list.

In [30]:
even_num_to_square = {x: x**2 for x in nums if x % 2 == 0}
print(even_num_to_square)

{0: 0, 2: 4, 4: 16}
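
To make the shorthand concrete, this sketch (an illustrative addition) writes the filtered list comprehension as an explicit `for` loop and checks that both give the same result.

```python
nums = [0, 1, 2, 3, 4]

# Explicit loop: start from an empty list and append matching items
even_squares_loop = []
for x in nums:
    if x % 2 == 0:
        even_squares_loop.append(x**2)

# Equivalent list comprehension in a single line
even_squares = [x**2 for x in nums if x % 2 == 0]

print(even_squares_loop == even_squares)
```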


## Functions

Python functions are defined using the `def` keyword. Here is an example function `sign(x)`, which returns the sign of `x`. To call a function, use the function name followed by parentheses.

In [31]:
def sign(x):  # Define a function with one argument x
    '''determine the sign of a single value'''
    if x > 0:  # four blanks before each line within the function body
        return 'positive'  # another four blanks within `if` expressions
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

for x in [-1, 0, 1]: 
    print("{} is {}".format(x, sign(x)))  # Use parentheses to call a function

-1 is negative
0 is zero
1 is positive


The above function can be read as follows: a function named `sign` is defined, which accepts one argument `x`. The body of the function is an if-elif-else statement that returns the string `positive` if `x > 0`, `negative` if `x < 0`, or `zero` when `x == 0`.

We often define functions to take optional keyword arguments, like this:

In [32]:
def hello(name, loud=False):
    if loud:
        print('HELLO, {}'.format(name.upper()))
    else:
        print('Hello, {}!'.format(name))

hello('Bob')  # Without the second argument, the function uses its default value
hello('Fred', loud=True)

Hello, Bob!
HELLO, FRED
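
A function can also return several values at once, which Python packs into a tuple. The function `describe` below is a hypothetical example written for this sketch; it combines a default argument with multiple return values.

```python
def describe(values, precision=1):
    """Return the minimum, maximum, and rounded mean of a list of numbers."""
    mean = sum(values) / len(values)
    return min(values), max(values), round(mean, precision)

# The returned tuple can be unpacked into separate variables
low, high, mean = describe([1, 2, 4])
print(low, high, mean)

# A keyword argument overrides the default precision
print(describe([1, 2, 4], precision=2))
```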


When a function body is a single expression, Python offers a short `lambda` syntax of the form `lambda arguments: expression`. The following example calculates $y=x^3+x^2+x$.

In [33]:
y = lambda x: x**3 + x**2 + x  # define a simple lambda function
print(y(-1))  # call it

-1
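
A common use of `lambda` is to pass a small function as an argument to another function. As an illustrative sketch with made-up data, the built-in `sorted` accepts a `key` function that decides what each element is sorted by.

```python
pairs = [('cat', 4), ('spider', 8), ('person', 2)]

# Sort the tuples by their second item (the number of legs)
by_legs = sorted(pairs, key=lambda pair: pair[1])
print(by_legs)

# Sort by the length of the name instead, longest first
by_name_length = sorted(pairs, key=lambda pair: len(pair[0]), reverse=True)
print(by_name_length)
```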


## Classes and Objects

With the above knowledge, we can already code the algorithms we want. However, the real magic and power of Python lie in its numerous packages, supported by active and evolving communities. Before diving into them, let's take a very brief look at the object-oriented programming paradigm in Python, as almost all packages use this paradigm to organize their code. In fact, almost everything in Python is an object.

Object-oriented programming first defines a `class`, a "blueprint" for creating objects; we then create an instance (**object**) of it, which carries the properties and methods defined by the `class`. Properties are the **variables** reflecting the state of an instance, and methods are the **functions** we can call on an instance. The syntax for defining classes in Python is straightforward. Note that `.` is used to access properties and methods, and that `__init__` is the method Python calls to initialize each new instance.

**Let's try the following Car 🚗 class example:**

In [34]:
class Car():
    
    def __init__(self, company, model, year):  
        """initialize the properties of a car"""
        self.company = company  # set a property on the object
        self.model = model
        self.year = year
        self.odometer = 0
        
    def get_info(self):  # functions defined in a class are the object's methods
        """return car information in a string"""
        car_info = "{} {} {}".format(self.year, self.company, self.model)
        return car_info
        
    def run(self, distance):
        """run car for a distance"""
        self.odometer += distance
    
    def read_odometer(self):
        """return the distances the car has run through"""
        odo_info = "This car has run {} km.".format(self.odometer)
        return odo_info

<div>
<img src="attachment:d9bad949-0fee-4840-9f8e-dceb96dcb228.png" width="700"/>
</div>

Let's say now I buy a new car:

In [35]:
my_lovely_car = Car("Tesla", "Model 3", 2022)  # calling class name would trigger __init__ to create an instance
print(my_lovely_car.company)     # retrieve a property
print(my_lovely_car.get_info())  # call a method

Tesla
2022 Tesla Model 3


Today I take this car out for a drive and wish to check the distance it has covered.

In [36]:
print(my_lovely_car.read_odometer())
my_lovely_car.run(5.5)
print(my_lovely_car.read_odometer())

This car has run 0 km.
This car has run 5.5 km.


In [37]:
Car.run(my_lovely_car, 3)  # methods can also be called through the class, passing the instance explicitly
print(Car.read_odometer(my_lovely_car))

This car has run 8.5 km.
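
Each instance keeps its own copy of the properties set in `__init__`. The sketch below uses a trimmed-down `Car` class, redefined here only so that the example runs on its own, to show that driving one car does not change the odometer of another.

```python
class Car:
    def __init__(self, model):
        """Initialize a car with a model name and a zeroed odometer."""
        self.model = model
        self.odometer = 0

    def run(self, distance):
        """Add the driven distance to this car's odometer."""
        self.odometer += distance

car_a = Car("A")   # Two separate instances of the same class
car_b = Car("B")
car_a.run(10)      # Only car_a is driven

print(car_a.odometer, car_b.odometer)
```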


## Import modules

It's time to leverage the numerous Python packages to empower our code. We import a module with an `import` statement. To access one of its classes or functions, we write the module name and the class or function name joined by a dot `.`.

In [38]:
import numpy  # import third-party NumPy module

print(numpy)
print(numpy.arange(1, 5))  # arange function generates an array with evenly spaced values

<module 'numpy' from '/Users/Shared/anaconda3/lib/python3.9/site-packages/numpy/__init__.py'>
[1 2 3 4]


Sometimes, for convenience, we assign a short alias to the module name; we may also import specific functions or submodules directly so that we can use them without the module name prefix.

In [39]:
# Assign a short alias to make it easier for us to use it
import numpy as np
print(np.arange(1, 4))

# Import a submodule from a module
from numpy import random  # random is a submodule of numpy for random sampling
print(random.random())    # random.random() function generates a random value

[1 2 3]
0.17016023155694282
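
The same import forms work for modules in Python's standard library, which need no installation. This sketch is an illustrative addition; the modules `math` and `statistics` were chosen only as examples.

```python
import math                  # Import a whole module
from math import sqrt        # Import one function directly
import statistics as stats   # Import a module under a short alias

print(math.pi)                     # Access a constant through the module name
print(sqrt(16))                    # Call the imported function without a prefix
print(stats.mean([1, 2, 3, 4]))    # Call a function through the alias
```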


In [40]:
# Try this!!!
import antigravity

## One more thing: Looking for help
As you may have observed, coding is also a journey of debugging 🐞. Learning how to solve problems is an important skill for programmers. Here are some suggestions for when you feel stuck or confused.

### Print documentation
It is almost impossible to memorize the tremendous number of packages and functions. The built-in function `help()` quickly prints a description of what a function does, as well as its inputs and outputs:

In [41]:
import numpy as np

help(np.ones)  # Look up the documentation of the ones function in NumPy

Help on function ones in module numpy:

ones(shape, dtype=None, order='C', *, like=None)
    Return a new array of given shape and type, filled with ones.
    
    Parameters
    ----------
    shape : int or sequence of ints
        Shape of the new array, e.g., ``(2, 3)`` or ``2``.
    dtype : data-type, optional
        The desired data-type for the array, e.g., `numpy.int8`.  Default is
        `numpy.float64`.
    order : {'C', 'F'}, optional, default: C
        Whether to store multi-dimensional data in row-major
        (C-style) or column-major (Fortran-style) order in
        memory.
    like : array_like
        Reference object to allow the creation of arrays which are not
        NumPy arrays. If an array-like passed in as ``like`` supports
        the ``__array_function__`` protocol, the result will be defined
        by it. In this case, it ensures the creation of an array object
        compatible with that passed in via this argument.
    
        .. versionadded:: 1.20
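
The text that `help()` prints comes from the function's docstring, so your own functions can be documented in the same way. As a small sketch (an illustrative addition that reuses the earlier `sign` example), the docstring is available as `__doc__`, and the built-in `dir()` lists the names an object provides.

```python
def sign(x):
    """Determine the sign of a single value."""
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    return 'zero'

print(sign.__doc__)   # The docstring that help(sign) would display

# dir() lists attribute names; keep only the public string methods
methods = [name for name in dir(str) if not name.startswith('_')]
print('upper' in methods, 'replace' in methods)
```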

### Read online official documentation
When learning a new package, it's always good to skim its official documentation. A typical well-documented package offers a ***User Guide*** (introducing how to work with the package), ***API references*** (listing the details of each entry in the package), and a ***gallery*** (showing off good examples).

Try browsing the [official website of the machine learning package](https://scikit-learn.org/stable/index.html) `scikit-learn`! Its documentation combines theory, code, and visualization to deliver ideas.

### Search in community: Stack Overflow
Python could also be called an internet-based programming language, not only because so many open-source third-party packages are available online, but also because of its actively engaged communities. [Stack Overflow](https://stackoverflow.com) is a great Q&A website for finding questions similar to yours asked by other members of the community. The answers with the most community support are shown first below the question.

## References
+ This tutorial was adapted from the [Python Numpy Tutorial](https://cs231n.github.io/python-numpy-tutorial), [Andreas Ernst's Python4Maths](https://gitlab.erc.monash.edu.au/andrease/Python4Maths/tree/master), and [W3Schools](https://www.w3schools.com/python/default.asp).
+ This tutorial only touches on the Python basics that you need to know. You may refer to the official documentation of the [Python Standard Library](https://docs.python.org/3.7/library/index.html) for detailed information when necessary.