# Introduction to Python

Here we provide a brief introduction to Python. Python is the language we will use most in Practical AI, with C++ appearing in some exercises.

If you already work in Python, this will be a basic review. If you are coming from another language or scientific computing environment (e.g., MATLAB or R) then this tutorial will help ease the migration to Python syntax and language idioms.


<a id='what-is-python'></a>
## What is Python

Python is a high-level programming language that is accessible to both new and experienced software developers. Advantages of Python include:

* easy to write
* highly extensible
* highly portable

These aspects of Python make it great for data scientists who may be less versed in coding. However, Python is *slow* and is not suitable for developing safety-critical or life-critical systems. For example, using Python to develop code that runs inside a car (such as power steering and brake control systems) is effectively ruled out by the safety standards and regulations that govern such systems. This is driven not only by the speed of Python, but also by the way Python handles typing, memory allocation, and dependency management, and by the fact that it is an interpreted language rather than a compiled one. We will talk about these issues more later on when we discuss why we don't usually deploy AI systems in Python for critical applications, even if it is easy to design them in Python.

Fortunately, in many cases the slowness of Python does not negate its benefits for applications that are not real-time, safety-critical, or mission-critical. There have also been many efforts to improve the performance of Python, including many *packages* geared toward high-performance computing. Some of these packages are:

* `numpy`
* `numba`
* `dask`

These can help us write Python code that runs faster and better exploits the resources of our hardware, but it is very important to note that they do not address all the issues of employing Python for critical applications. So, even with these libraries and improvements that have been made to the Python language, we still opt to use (and sometimes are mandated to use per regulations) other languages for critical applications.

We will learn about all these considerations soon. Python is an imperfect language, but it has a tremendously large community behind it, and many years of development have made it possible to do almost *anything* in Python (even if not always well!). As with any programming language, there are rules and expected syntax that must be followed. Breaking these rules results in code that does not work and, most of the time, in an error. The goal of this lesson is simply to provide a feel for the syntax of Python and help those who may be migrating from other languages and environments.

<a id='jupyter'></a>
### Jupyter

Python and the packages we will use provide many tools for designing and building robust machine learning pipelines, including exploring data, transforming data, engineering features, training models, testing models, explaining the outputs produced by models, optimizing the performance of models and exporting them to standard formats.

*Jupyter*, the tool we are using right now, allows us to combine code, results, and explanation all in one document. These documents are called *notebooks*, and they allow us to write both *markdown* and *Python* side by side. The Python code can be interactively executed in a *kernel*. This means that the code is actually executed on some computer (locally or remotely), and that computer sends the results back to Jupyter. Notebooks are composed of *cells*. You can execute cells in any order, and re-execute them as many times as you like!


<a id='variables-and-primitives'></a>
## Variables & Primitives

Variables are named objects in our code that hold onto data. There are many *types* of variables. A *type* refers to the kind of data a variable represents and what we can do with it. The four basic (*primitive*) types in Python are:

* boolean (`bool`)
* integer (`int`)
* floating point number (`float`)
* string (`str`)

A boolean is a type that can only have one of two values: `True` or `False`. These are *logical* values to be used when computing if something is actually true or false.

An integer is a whole number, with no fractional part.

A floating point number is a number with a decimal point.

A string is a sequence of characters. Here are a few examples:

In [None]:
True

In [None]:
1337

In [None]:
3.141592653589

In [None]:
"Hello World!"

Each of these types of data can be used in specific contexts; these are usually intuitive and familiar already.

In [None]:
True and False

In [None]:
1337 // 7

In [None]:
3.141592653589 * 2.0 * 2.0

Everything above, though, is just data, not variables! We can *assign* a value to a variable.

In [None]:
b = True
i = 1337
f = 3.141592653589
s = "Hello"
print(b, i, f, s)
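The built-in `type` function reports the type of the value a variable currently holds, which can be a quick way to confirm which of the four primitives we are looking at. A minimal sketch, redefining the same variables so that it stands alone:

```python
b = True
i = 1337
f = 3.141592653589
s = "Hello"

# type(...) returns the type object; __name__ gives its short name
type_names = [type(value).__name__ for value in (b, i, f, s)]
print(type_names)  # ['bool', 'int', 'float', 'str']
```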

In [None]:
bb = b or False
ii = i // 191
ff = f + 0.858407346411
ss = s + " World!"
print(bb, ii, ff, ss)
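Beyond the operations shown so far, here is a short sketch of a few more operators on each type (the variable names are made up for this example). Note in particular the difference between `/` (true division, which always produces a float) and `//` (floor division):

```python
# Logical operators on booleans
is_raining = True
has_umbrella = False
stays_dry = (not is_raining) or has_umbrella  # False

# Arithmetic on integers
total_items = 1337
full_boxes = total_items // 7  # floor division -> 191
left_over = total_items % 7    # remainder -> 0
average = total_items / 7      # true division -> 191.0 (a float)

# String concatenation and a simple string method
greeting = "Hello" + " " + "World!"
shout = greeting.upper()       # 'HELLO WORLD!'

print(stays_dry, full_boxes, left_over, average, greeting, shout)
```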

We can use our variables in *expressions* and use operators to perform actions with them. With booleans we can evaluate logical statements; with integers and floating point numbers we can perform arithmetic; and with strings we can perform concatenations and other string manipulations. By convention, we use descriptive variable names to explain to the readers of our code what our variables mean.

In [None]:
d = 8.2387  # bad
distance = 8.2387  # better
distance_m = 8.2387  # best!
distance_m

It is important that we choose variable names that are descriptive and make sense. In the above example, `d` is probably ok for representing distance, but perhaps `d` means something different to someone else. In this case `d` is not descriptive enough. So you may change it and use `distance`. This is much better, as it is clear that you are intending to represent a distance. But what scale are we talking about? A distance can be micro or macro in scale. We can name our variable with the intended units to help guide the reader. Therefore, naming a variable for distance as `distance_m` indicates to the reader that the variable is a distance *and* its value is in meters.

Note that it is very possible to use variables in an incorrect way. Consider trying to add a float and a string:

In [None]:
distance_string_km = distance_m / 1000.0 + "km"

This operation is not allowed - while `distance_m / 1000.0` is totally valid and results in another float, we then try to add a string to that new float. What would it even mean to add a float and a string? Python prevents this from happening and instead raises an *error* explaining what went wrong. Note here that because of the error, `distance_string_km` is never created!
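One way to get the intended result is to convert the float to a string explicitly, or to use an f-string, before combining it with the unit. A small sketch (the names `distance_label_km` and `distance_label_rounded_km` are only for illustration, and `distance_m` is redefined so that the snippet stands alone):

```python
distance_m = 8.2387

# Explicit conversion: str() turns the float into a string first
distance_label_km = str(distance_m / 1000.0) + " km"

# An f-string converts for us and lets us control the number of decimals
distance_label_rounded_km = f"{distance_m / 1000.0:.4f} km"

print(distance_label_km)
print(distance_label_rounded_km)  # 0.0082 km
```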

Note that because Python is a dynamically typed language, variables can hold whatever type of data we assign to them; a variable takes on the type of the data assigned to it. Python is also often described as "duck typed", meaning that what matters is how an object behaves rather than what type it is declared to be. (The origin of the nomenclature "duck typed" is the saying "if it looks like a duck and quacks like a duck then it must be a duck.") This makes Python easy to program in since we do not necessarily have to keep track of variable types. However, if we are not careful, it can also lead to confusing and hard-to-read code where a variable holds one type at one time and a different type at another. Tools like `mypy` can help us with this, and we will review these later.

In [None]:
distance_string_km
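As a small sketch of the typing behavior discussed in the paragraph above, the same variable name can refer to values of different types over time (the name `measurement` is hypothetical):

```python
measurement = 8.2387        # starts out as a float
first_type = type(measurement).__name__

measurement = "8.2387 m"    # the same name now refers to a string
second_type = type(measurement).__name__

print(first_type, second_type)  # float str
```

Nothing stops this reassignment, which is exactly why it can make code hard to follow.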

<a id='function-basics'></a>
## Function Basics

As expected, we can also define *functions*. Functions are pieces of code that we can invoke to perform some defined set of actions. Functions can take in *arguments* (i.e., inputs) and can return data (i.e., outputs). Every function is defined using the following format:

```python
def function_name(a, b):
    # do stuff
    pass  # placeholder so that the definition is valid Python
```

Maybe we have a function that converts a temperature from Fahrenheit to Celsius!

In [12]:
def fahrenheit_to_celsius(temp_f):
    temp_c = (temp_f - 32.0) * 5.0 / 9.0
    return temp_c

This is only a *definition*. We need to *call* the function to use it.

In [None]:
fahrenheit_to_celsius(212.0)

Functions can take zero, one, or many arguments. They do not need to return anything, but they can also return one or many values at once! Consider the following function that takes in a speed and a time and computes the distance traveled:

In [None]:
def compute_distance_m(speed_mps, time_s):
    distance_m = speed_mps * time_s
    return distance_m


compute_distance_m(100.0, 2.0)

While this function is trivial, it is clear from the *function name* what this function is doing. As you may suspect, there is nothing stopping someone from using this function incorrectly. Someone might try to use a string here...

In [None]:
compute_distance_m("100.0 m/s", 10.0)

We can add *type hints* to hint to the coder how they should try to use a function. Let's rewrite our function above with type hints.

In [None]:
def compute_distance_m(speed_mps: float, time_s: float) -> float:
    distance_m = speed_mps * time_s
    return distance_m


compute_distance_m(100.0, 2.0)

Above we have just added *hints* to the function that state that the input variables are each a float, and that the function returns a float. **Note that this still does not prevent someone from using the function incorrectly.** Below, we pass a bool instead of a float, *and it works*. Bool values can be used in arithmetic expressions (`True` behaves like `1` and `False` like `0`), so the function executes without complaint. The type hints are simply hints, not hard rules, so anyone is free to ignore them! When we review `mypy` later, we will see how we can add automatic type hint checking to our projects.

In [None]:
compute_distance_m(True, 10.0)
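If a function really must reject the wrong types at runtime, one option is an explicit check inside the function. The following is only a sketch: the names `_is_real_number` and `compute_distance_checked_m`, and the choice to raise `TypeError`, are assumptions made for illustration. Note that `bool` has to be excluded separately because `bool` is a subclass of `int` in Python.

```python
def _is_real_number(value) -> bool:
    # bool is a subclass of int, so it must be excluded explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def compute_distance_checked_m(speed_mps: float, time_s: float) -> float:
    # Reject anything that is not a plain int or float before computing
    if not _is_real_number(speed_mps) or not _is_real_number(time_s):
        raise TypeError("speed_mps and time_s must be int or float")
    return speed_mps * time_s


print(compute_distance_checked_m(100.0, 2.0))  # 200.0

try:
    compute_distance_checked_m(True, 10.0)
except TypeError as error:
    print(f"rejected: {error}")
```

Runtime checks like this add clutter, so they are usually reserved for the boundaries of a program where untrusted input arrives.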

We can also define functions with *default values* so that additional flexibility can be given to those that need it, but is otherwise hidden from the users that do not need it!

In [None]:
def compute_position_m(
    speed_mps: float, time_s: float, starting_position_m: float = 0.0
) -> float:
    position_m = starting_position_m + speed_mps * time_s
    return position_m


compute_position_m(100.0, 2.0)

In [None]:
compute_position_m(100.0, 2.0, 300.0)

We can *omit* the final input to the function, because it is given a default value of *0.0* when one is not supplied. We can use this to great effect for creating configurable functions!
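Two related features are worth a quick sketch: arguments can be passed by name (keyword arguments), and a function can return several values at once as a tuple, or nothing at all (in which case the call evaluates to `None`). The functions `compute_position_and_distance_m` and `log_position` below are hypothetical examples, not part of the material above.

```python
from typing import Tuple


def compute_position_and_distance_m(
    speed_mps: float, time_s: float, starting_position_m: float = 0.0
) -> Tuple[float, float]:
    distance_m = speed_mps * time_s
    position_m = starting_position_m + distance_m
    # Returning two values separated by a comma packs them into a tuple
    return position_m, distance_m


def log_position(position_m: float) -> None:
    # No return statement, so calling this function evaluates to None
    print(f"position: {position_m} m")


# Keyword argument for the optional parameter; the tuple is unpacked on return
position_m, distance_m = compute_position_and_distance_m(
    100.0, 2.0, starting_position_m=300.0
)
result = log_position(position_m)
print(position_m, distance_m, result)  # 500.0 200.0 None
```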

<a id='conditional-statements'></a>
### Conditional Statements

Python also provides the typical control flow mechanisms - let's start with conditional statements. In a conditional statement, we are checking if some value is logically true or false (using boolean algebra). If the statement is true, then we proceed to execute the code contained by the conditional statement; otherwise if the value is false then we do nothing. Simple conditional statements are of the form:

```python
if statement:
    # do stuff
```

In this simple setup if `statement` is `True` then we proceed to "do stuff". Otherwise we would do nothing! Here `statement` is a bool - `True` or `False`. We can generate boolean values from other data by using *comparisons*. Comparisons simply check how one value compares to another. We can check if two values are equal, or if one is greater than another. We have a number of operators for comparing values.

* `==` - checks if two values are equal
* `!=` - checks if two values are not equal
* `>` - checks if one value is greater than the other
* `>=` - checks if one value is greater than or equal to the other
* `<` - checks if one value is less than the other
* `<=` - checks if one value is less than or equal to the other

We can even build compound conditions using the `and` and `or` keywords.

Here are a few examples:

```python
1 == 2              # False
'hello' != 'world'  # True
5 <= x <= 10        # depends on what x is!
0.0 > 0.0           # False
```
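Compound conditions and chained comparisons can be tried out directly once `x` has a value. A small sketch with an arbitrarily chosen `x`:

```python
x = 7

in_range = 5 <= x <= 10                  # chained comparison
in_range_explicit = x >= 5 and x <= 10   # the same check written with `and`
outside = x < 5 or x > 10                # true if either comparison is true
even_and_in_range = x % 2 == 0 and in_range

print(in_range, in_range_explicit, outside, even_and_in_range)
# True True False False
```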

Let's generate a random integer between 1 and 6 inclusive (like rolling a die) and check if the value is equal to 6. Let's try running it a few times!

In [43]:
import random

x = random.randint(1, 6)
if x == 6:
    print("Woo!")

In the code above, we are checking if a random integer between 1 and 6 is equal to 6. Other than the import and randomness shown, there are a few things to point out.

```python
if x == 6:
```

This is the conditional statement. We are using the equality operation `==` to check if `x` is equal to the value `6`. This operation returns a bool, either `True` or `False`. If the value of that bool is `True` then the *body* of the conditional statement is executed. Here the body is

```python
    print('Woo!')
```

This just prints the string `'Woo!'`, but only if `x` has a value of `6`.

We can also check several conditions in sequence to set up if-else logic. The Python keywords `elif` and `else` let us test further conditions, or fall back to a default, when the preceding condition(s) are false.

In [None]:
x = random.randint(1, 6)
y = random.randint(1, 6)
if x == y:
    print(f"double {x}s")
else:
    print("try again!")

In [None]:
x = random.randint(1, 6)
y = random.randint(1, 6)
if x > y:
    z = 1
elif y > x:
    z = 2
else:
    z = 3
(x, y, z)
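Because the dice rolls are random, the result changes on every run. Wrapping the same `if`/`elif`/`else` logic in a function makes it easy to exercise each branch with fixed inputs. The function name `compare_rolls` is an assumption made for this sketch.

```python
def compare_rolls(x: int, y: int) -> int:
    """Return 1 if x is larger, 2 if y is larger, and 3 if they are equal."""
    if x > y:
        z = 1
    elif y > x:
        z = 2
    else:
        z = 3
    return z


# One call per branch: first roll larger, second roll larger, equal rolls
print(compare_rolls(6, 2), compare_rolls(1, 4), compare_rolls(3, 3))  # 1 2 3
```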