<a href="https://colab.research.google.com/github/edoardochiarotti/class_datascience/blob/main/2024/00_Python-Basics/00_Python-Basics_Cheat-Sheet.ipynb" target="_blank" rel="noopener"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Statistics and Data Science: Python Cheatsheet

<img src='https://www.agent-x.com.au/wp-content/uploads/2011/06/Perfect-Programmer-dfe194b-e8d3b11-b960bd5.jpg' width="400">

Source: [Agent-X Comics - Perfect Programming](https://www.agent-x.com.au/comic/perfect-programming/)

## Content

This notebook summarizes the key concepts for programming in Python. More details and illustrations are available in the "Python-Basics" notebooks.

- [Introduction](#intro)
- [Variables and data type](#variables)
- [Operators](#operators)
- [Lists and Tuples](#list)
- [Dictionary](#dict)
- [String Methods](#string)
- [Conditionals](#condition)
- [Iteration](#iteration)
- [List Comprehension](#comprehension)
- [Function](#function)

## Introduction <a name="intro"></a>

- A programming language is a structured subset of natural language (words) and special characters that allows humans to describe the operations they would like their computer to perform on their behalf
- The language's interpreter or compiler translates these words and symbols into instructions the computer can execute
- [Python](https://www.python.org) is a general-purpose programming language conceived in 1989 by Dutch programmer [Guido van Rossum](https://en.wikipedia.org/wiki/Guido_van_Rossum)
- Python is free and open source, with development coordinated through the [Python Software Foundation](https://www.python.org/psf/)

## Variables and data type <a name="variables"></a>

- **Variables** are containers for storing data values. 
- A variable is created the moment you first assign a value to it, using `=` 
- Variables can store data of different types. Here are some common data types:

|Data type| Definition |Example|
|--|--|-- |
|`str`| String: Text data| `"Hello, world"` |
|`int` | Integer: whole number, positive or negative, without decimals, of unlimited length | `42`|
|`float` | Floating point number: decimal number |`-7.39`|
|`complex` | Complex number|`1+2j`|
| `bool`| Boolean | `True`, `False` |

- Booleans behave like integers: `True` has the value `1`, and `False` has the value `0`
- Determine the type of a value or variable with the built-in function `type()`
- **Tips:**
    1. **Use explicit names for your variables**: you and anyone reading your code should be able to understand what each variable holds
    2. **Be careful not to overwrite the variables you are using**
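
As a minimal sketch of the points above, the snippet below creates one variable per data type from the table and inspects each with `type()`. The variable names and values are illustrative assumptions, not taken from the course material.

```python
# A variable is created the first time a value is assigned to it.
city_name = "Lausanne"        # str (hypothetical example value)
student_count = 42            # int
temperature = -7.39           # float
impedance = 1 + 2j            # complex
is_enrolled = True            # bool

# type() reports the data type of each value.
for value in (city_name, student_count, temperature, impedance, is_enrolled):
    print(value, type(value))

# Booleans behave like the integers 1 and 0 in arithmetic.
true_plus_true = True + True          # 2
attendance = [True, False, True]
present_count = sum(attendance)       # 2, the number of True values
print(true_plus_true, present_count)
```

Summing a list of Booleans is a common way to count how many entries satisfy a condition.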

## Operators <a name="operators"></a>

### Arithmetic operators

- **Operators** let you perform operations on variables, such as adding them.

|action|operator|
|:-------|:----------:|
|addition | `+`|
|subtraction | `-`|
|multiplication | `*`|
|division | `/`|
|raise to power | `**`|
|modulo | `%`|
|floor division | `//`|

- Do not use the `^` operator to raise to a power. That is actually the operator for bitwise XOR.
- Dividing by zero raises a `ZeroDivisionError`.
- Adding strings concatenates them. Multiplying a string by an integer *n* repeats the string *n* times. Other arithmetic operators, such as `-` and `/`, are not supported for strings.
- Here is the order of operations, from highest to lowest precedence:

|precedence|operators|
|:-------:|:----------:|
|1 | `**`|
|2 | `*`, `/`, `//`, `%`|
|3 | `+`, `-`|

- You can use parentheses to group operations
- *Do not* use excessive parentheses. Excessive parentheses make your code less readable and can lead to mistakes. Trust the order of operations!
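
The short sketch below exercises each arithmetic operator, the `^` pitfall, the order of operations, and the string behavior described above. All names and numbers are illustrative.

```python
total = 7 + 3          # addition: 10
quotient = 7 / 2       # division always returns a float: 3.5
floor_q = 7 // 2       # floor division: 3
remainder = 7 % 2      # modulo: 1
power = 2 ** 3         # raise to power: 8
xor = 2 ^ 3            # bitwise XOR, NOT a power: 1

# ** is applied first, then *, then +
result = 1 + 2 * 3 ** 2          # 1 + (2 * 9) = 19

greeting = "Hello, " + "world"   # concatenation: "Hello, world"
echo = "ab" * 3                  # repetition: "ababab"

# Dividing by zero raises an error that can be caught.
try:
    1 / 0
except ZeroDivisionError as error:
    print("Division by zero fails:", error)

print(total, quotient, floor_q, remainder, power, xor, result)
print(greeting, echo)
```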

### Assignment operators

Assignment operators are used to assign values to variables:

|Operator|Example|Same as|
|:-------:|:----------:|:----------:|
|`=` | `var = 7` | `var = 7` |  
|`+=` | `var += 7` | `var = var + 7`| 
|`-=` | `var -= 7` | `var = var - 7`|
|`*=` | `var *= 7` | `var = var * 7`|
|`/=` | `var /= 7` | `var = var / 7`|
|`**=` | `var **= 7` | `var = var ** 7`|
|`%=` | `var %= 7` | `var = var % 7`| 
|`//=` | `var //= 7` | `var = var // 7`|
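
To see the table in action, here is a small trace that applies each assignment operator in turn to a hypothetical variable `balance`. Note that `/=` turns the value into a `float`, and it stays a `float` afterwards.

```python
balance = 100      # plain assignment
balance += 50      # 150
balance -= 30      # 120
balance *= 2       # 240
balance /= 8       # 30.0 (division always yields a float)
balance **= 2      # 900.0
balance %= 7       # 4.0 (900 = 7 * 128 + 4)
balance //= 3      # 1.0
print(balance, type(balance))
```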

### Comparison operators

**Comparison operators** (also called **relational operators**) are used to compare two values:

|English|Python|
|:-------|:----------:|
|is equal to | `==`|
|is not equal to | `!=`|
|is greater than | `>`|
|is less than | `<`|
|is greater than or equal to | `>=`|
|is less than or equal to | `<=`|

- The result of the operation will be a Boolean: `True` or `False`
- Avoid testing `float` values for equality with `==`. Floating point numbers are stored with a finite number of binary bits, so operations on them can introduce small rounding errors. For instance, `2.1 + 3.2 == 5.3` evaluates to `False`. Compare floats within a tolerance instead, for example with `math.isclose()`.
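
The float pitfall can be reproduced with the example above. The sketch also shows one common workaround, `math.isclose()` from the standard library, which compares two numbers within a tolerance (using its default tolerance here is an assumption that suits values of ordinary magnitude).

```python
import math

float_sum = 2.1 + 3.2

naive_equal = (float_sum == 5.3)                 # False because of a rounding error
tolerant_equal = math.isclose(float_sum, 5.3)    # True: equal within a small tolerance

print(float_sum)
print(naive_equal, tolerant_equal)
```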

### Identity operators

**Identity operators** check whether two names refer to the very same object, stored at the same memory location, rather than whether their values are equal. The two identity operators are:

|English|Python|
|:-------|:----------:|
|is the same object | **`is`**|
|is not the same object | **`is not`**|
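
The difference between equality and identity is easiest to see with lists. In this sketch (names are illustrative), `alias` is a second name for the same list, while `clone` is a separate list holding equal values.

```python
first = [1, 2, 3]
alias = first            # a second name for the SAME object
clone = first.copy()     # a NEW object with equal values

same_values = (first == clone)       # True: the contents are equal
same_object = (first is clone)       # False: two distinct objects
alias_is_first = (alias is first)    # True: one object, two names

# Changing the object through one name is visible through the other.
alias.append(4)
print(first)     # [1, 2, 3, 4]
print(clone)     # [1, 2, 3]
```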

### Logical operators

**Logical operators** can be used to connect relational and identity operators. Python has three logical operators.

|Logic|Python|Meaning |
|:-------|:----------:|:----------|
|AND | `and`| Gives `True` if *both* operands are `True` |
|OR | `or`| Gives `True` if *either* of the operands is `True`|
|NOT | `not`| Negates the logical result |

### Operator precedence

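|precedence|operators|
|:-------|:----------:|
|1 | `**`|
|2 | `*`, `/`, `//`, `%`|
|3 | `+`, `-`|
|4 | `<`, `>`, `<=`, `>=`, `==`, `!=`, `is`, `is not`|
|5 | `not`|
|6 | `and`|
|7 | `or`|

Level 1 is applied first. All comparison and identity operators share the same level. Assignments (`=`, `+=`, `-=`, `*=`, `/=`, `**=`, `%=`, `//=`) are statements rather than operators: the expression on the right-hand side is fully evaluated before the assignment takes place.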
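
A minimal sketch of how comparisons and logical operators combine. In Python, comparisons are evaluated before the logical operators, and among the logical operators `not` is applied first, then `and`, then `or`. The variable names are hypothetical.

```python
age = 25
has_ticket = True
is_banned = False

# Read as: (age >= 18) and has_ticket and (not is_banned)
can_enter = age >= 18 and has_ticket and not is_banned     # True

# Read as: True or (False and (not True))
mixed = True or False and not True                         # True

# Parentheses change the grouping, and here the result.
grouped = (True or False) and not True                     # False

print(can_enter, mixed, grouped)
```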

## Lists and Tuples <a name="list"></a>

### List

- **Lists** (`list`) are used to store multiple items in a single variable. 
- We create lists by putting Python values or expressions inside **square brackets**, separated by **commas**: `[1,2,3]`
- Any Python expression can be part of a list, including another list
- A list that contains another list is called a **nested list**
- Adding lists concatenates them
- Lists are **mutable** objects: you can change their values without creating a new list

### Tuple

- **Tuples** (`tuple`) are used to store multiple items in a single variable. 
- We create tuples by putting Python values or expressions inside **parentheses**, separated by **commas**: `(1,2,3)`
- Any Python expression can be part of a tuple, including another tuple
- Adding tuples concatenates them
- Tuples are **immutable** objects: once a tuple is created, its values cannot be changed. Use tuples instead of lists unless you need mutability
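
The contrast between mutable lists and immutable tuples can be sketched as follows. The names and values are illustrative.

```python
measurements = [1.5, 2.0, 3.5]          # a list
measurements[0] = 1.0                   # allowed: lists are mutable
combined = measurements + [4.0]         # concatenation creates a new list
nested = [1, [2, 3], "text"]            # a nested list with mixed types

point = (3, 4)                          # a tuple
shifted = point + (5,)                  # a one-element tuple needs a trailing comma

# Trying to change a tuple in place raises a TypeError.
try:
    point[0] = 10
except TypeError as error:
    print("Tuples are immutable:", error)

print(measurements, combined, shifted)
```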

### Indexing and slicing

- Lists and tuples are **ordered**, i.e., the items have a defined order. 
- Access a given item of a list (or a tuple) using **square brackets**: write the name of the list, followed by the location (**index**) of the desired element in square brackets, `list[index]`
- <span style='color:red'> **Indexing in Python starts at zero** </span>: the first element is at location 0
- We can index a nested list by adding another set of brackets: `list[index_level_1][index_level_2]`
- We can use negative indexing: the last element is at location -1

|Element|1|2|3|4|5|6|7|8|9|10|
|------|-:|-:|-:|-:|-:|-:|-:|-:|-:|-:|
|Forward indices|0|1|2|3|4|5|6|7|8|9|
|Reverse indices|-10|-9|-8|-7|-6|-5|-4|-3|-2|-1|

- **Slicing** extracts several elements at once, using colons: `[start:end:stride]`
    - The range is **inclusive of the first index** (`start` is included) and **exclusive of the last** (`end` is not included)
    - If there are no colons, a single element is returned (indexing)
    - If there are colons, we are slicing the list/tuple, and a list/tuple is returned.
    - If `stride` is not specified, for example when there is only one colon, it is assumed to be 1.
    - If `start` is not specified, the slice begins at the first element (index 0).
    - If `end` is not specified, the slice runs through the last element.
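
The indexing and slicing rules above can be checked on a ten-element list that mirrors the index table. The list `letters` and the nested list `grid` are illustrative.

```python
letters = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]

first = letters[0]            # "a": indexing starts at zero
last = letters[-1]            # "j": negative indices count from the end
middle = letters[2:5]         # ["c", "d", "e"]: index 5 is excluded
head = letters[:3]            # ["a", "b", "c"]: start defaults to 0
tail = letters[7:]            # ["h", "i", "j"]: runs through the last element
every_other = letters[::2]    # ["a", "c", "e", "g", "i"]: stride of 2

grid = [[1, 2, 3], [4, 5, 6]]
cell = grid[1][0]             # 4: second inner list, first element

print(first, last, middle, head, tail, every_other, cell)
```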

### List and tuple methods

- Python objects contain: 1) data; 2) built-in functions that can operate on the data, called methods
- Built-in methods for lists:

|Method|Description|
|:-------|:----------|
|`lis.append(val)` | Adds an element with value `val` at the end of the list `lis`|
|`lis.clear()` | Removes all the elements from the list `lis`|
|`lis.copy()` | Returns a copy of the list `lis`|
|`lis.count(val)` | Returns the number of elements with the specified value `val`|
|`lis.extend(iterable)` | Adds the elements of a list (or any iterable) to the end of the list `lis`|
|`lis.index(val)` | Returns the index of the first element with the value `val`|
|`lis.insert(pos, val)` | Inserts an element with value `val` at the position `pos`|
|`lis.pop(pos)` | Removes and returns the element at the position `pos` (the last element if `pos` is omitted)|
|`lis.remove(val)` | Removes the first item with the value `val`|
|`lis.reverse()` | Reverses the order of the list `lis` in place|
|`lis.sort()` | Sorts the list `lis` in place|

- Another useful function (not a method) is `len(lis)`. It returns the total number of items in the list `lis`
- `.count()`, `.index()`, and `len()` work the same with `tuple`, but the other methods do not, since tuples are immutable. A useful trick is to convert your tuple into a list, apply the list methods, and convert back to a tuple: 
    - Convert `tuple` into `list` using the function `list()`
    - Convert `list` into `tuple` using the function `tuple()`
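
A short walk through the list methods from the table, followed by the tuple-to-list round trip described above. The list contents are illustrative, and the comments track the state of the list after each call.

```python
scores = [3, 1, 2]
scores.append(5)                   # [3, 1, 2, 5]
scores.extend([1, 4])              # [3, 1, 2, 5, 1, 4]
scores.insert(0, 9)                # [9, 3, 1, 2, 5, 1, 4]
ones = scores.count(1)             # 2
where_is_five = scores.index(5)    # 4
removed = scores.pop(0)            # returns 9, leaving [3, 1, 2, 5, 1, 4]
scores.remove(1)                   # drops the first 1: [3, 2, 5, 1, 4]
scores.sort()                      # [1, 2, 3, 4, 5]
scores.reverse()                   # [5, 4, 3, 2, 1]
length = len(scores)               # 5

# Tuples have no sort() method: convert, sort, convert back.
frozen = (3, 1, 2)
as_list = list(frozen)
as_list.sort()
frozen_sorted = tuple(as_list)     # (1, 2, 3)

print(scores, ones, where_is_five, removed, length, frozen_sorted)
```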

## Dictionary <a name="dict"></a>

- In a nutshell:
    - Dictionaries (`dict`) are used to store multiple items in a single variable
    - A dictionary maps a **key** with an associated **value**
    - Any *immutable* object (e.g., `int`, `float`, `str`, `tuple`) can serve as a key, while the values can be of any type. You can even mix different types of keys and values in the same dictionary
    - Dictionaries are *mutable* objects
- Create dictionaries using curly braces `{}`: `d={key_1: value_1, key_2: value_2, ...}`
- Extract the value associated with a specified `key` in the dictionary `d` using square brackets: `d[key]`
- Add a new entry to dictionary `d`, or change the value associated with a key: `d[key] = value`
- Dictionary methods:

|Method|Description|
|:-------|:----------|
|`d.clear()` | Removes all the elements from dictionary `d`|
|`d.copy()` | Returns a copy of `d`|
|`d.fromkeys(keys, value)` | Returns a new dictionary with the specified keys, each mapped to `value` (optional, `None` by default)|
|`d.get(key, default = None)` | Returns the value associated with `key`. The second argument specifies what should be returned if the key is absent|
|`d.items()` | Returns a view of the key-value pairs, each as a tuple (use `list()` to get a list)|
|`d.keys()` | Returns a view of the dictionary's keys|
|`d.pop(key)` | Removes the element with the specified `key` and returns its value|
|`d.popitem()` | Removes and returns the last inserted key-value pair|
|`d.setdefault(key, value)` | Returns the value of the specified `key`. If the key does not exist: insert the key, with the specified `value` (optional)|
|`d.update({key: value})` | Updates `d` with the specified key-value pairs|
|`d.values()` | Returns a view of all the values in `d`|

- Other built-in functions useful for dictionaries:

|Function|Description|
|:-------|:----------|
|`len(d)` | Gives the number of entries of the dictionary `d`|
|`list(d)` | Extract a list of the keys of `d`|
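
The sketch below builds a small dictionary and applies the most common operations from the tables above. The dictionary `capitals` and its entries are illustrative.

```python
capitals = {"France": "Paris", "Italy": "Rome"}

capitals["Spain"] = "Madrid"                    # add a new entry
capitals["Italy"] = "Roma"                      # change an existing value
paris = capitals["France"]                      # look up by key
missing = capitals.get("Norway", "unknown")     # default instead of a KeyError
removed = capitals.pop("Spain")                 # removes the entry, returns "Madrid"
capitals.update({"Norway": "Oslo"})             # add or overwrite several entries
kept = capitals.setdefault("Italy", "Milan")    # key exists, so "Roma" is kept

countries = list(capitals)                      # the keys, in insertion order
pairs = list(capitals.items())                  # (key, value) tuples
entry_count = len(capitals)                     # 3

# Keys and values of different types can be mixed.
mixed = {1: "int key", (0, 1): "tuple key", "name": [1, 2]}

print(capitals, countries, entry_count)
```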

## String Methods <a name="string"></a>

- We can use string methods to manipulate text data. Here are some of the common ones:

|Method|Description|
|:-------|:----------|
|`s.capitalize()` | Converts the first character to upper case|
|`s.casefold()` | Converts the string into lower case, more aggressively than `s.lower()` (intended for caseless comparison) |
|`s.count(value)` | Returns the number of times a specified value occurs in a string|
|`s.endswith(value)` |Returns `True` if the string ends with the specified value|
|`s.find(value)` | Searches the string for a specified value and returns the position of its first occurrence, or `-1` if it is not found|
|`s.format()` | Formats specified values in a string|
|`s.index(value)` | Same as `s.find(value)`, but raises a `ValueError` if the value is not found|
|`s.isalnum()` | Returns True if all characters in the string are alphanumeric|
|`s.isalpha()` | Returns True if all characters in the string are in the alphabet|
|`s.isdigit()` | Returns True if all characters in the string are digits|
|`s.islower()`|Returns True if all characters in the string are lower case|
|`s.isnumeric()` | Returns True if all characters in the string are numeric|
|`s.isupper()`|Returns True if all characters in the string are upper case|
|`s.join(iterable)`|Joins the elements of an iterable of strings into a single string, using `s` as the separator|
|`s.lower()`|Converts a string into lower case|
|`s.replace(old_value, new_value)`|Returns a string where a specified value is replaced with another value |
|`s.rfind(value)`|Searches the string for a specified value and returns the position of its last occurrence, or `-1` if it is not found |
|`s.rindex(value)`|Same as `s.rfind(value)`, but raises a `ValueError` if the value is not found |
|`s.rsplit(separator)`|Splits the string at the specified separator, starting from the right, and returns a list |
|`s.split(separator)`|Splits the string at the specified separator, and returns a list |
|`s.splitlines()`|Splits the string at line breaks and returns a list |
|`s.startswith(value)`|Returns `True` if the string starts with the specified value |
|`s.swapcase()`|Swaps cases, lower case becomes upper case and vice versa |
|`s.title()`|Converts the first character of each word to upper case|
|`s.translate()`|Returns a translated string using a dictionary or mapping table|
|`s.upper()`|Converts a string into upper case|

- A complete list of string methods is available in the [Python documentation](https://docs.python.org/3/library/stdtypes.html#string-methods)
- Since strings are sequences of characters, indexing and slicing work the same as they do for lists and tuples! Similarly, built-in functions such as `len()` also apply to strings.
- **f-strings** let you insert variables directly into strings
    - Definition: put `f` in front of a string and write the name of the variable to be inserted between braces
    - `f"Here is an f-string with {variable:format}"` 
    - The format is optional but lets you adjust, for example, how a number should be displayed: 

|format|description|
|:----------:|-----------|
|`d`| integer|
|`04d`| integer at least four digits wide, padded with leading zeros if needed|
|`f`| float, default to six digits after decimal|
|`.8f`| float with 8 digits after the decimal|
|`e`| scientific notation, default to six digits after decimal|
|`.16e`| scientific notation with 16 digits after the decimal|
|`s`| display as a string|

## Conditionals <a name="condition"></a>

- **Conditionals** tell your computer to execute a set of instructions depending on whether or not a Boolean expression is `True`.
- Syntax:
```python
if <condition 1>:
    <statement 1>
elif <condition 2>:
    <statement 2>
else:
    <statement 3>
```
- In the above: 
    - statement 1 is only evaluated if condition 1 is `True`. 
    - `elif` (else if) is optional, and statement 2 is only evaluated if condition 1 is `False` and condition 2 is `True` 
    - `else` is optional, and statement 3 is only evaluated if conditions 1 and 2 are `False`.
- **Indentation matters**: any lines with the same level of indentation will be evaluated together.
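
A concrete instance of the syntax above. The thresholds and labels are arbitrary choices for illustration.

```python
temperature = 12

if temperature < 0:
    label = "freezing"      # runs only if the first condition is True
elif temperature < 20:
    label = "mild"          # runs only if the first is False and this one is True
else:
    label = "warm"          # runs only if all conditions above are False

print(label)    # mild
```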

## Iteration <a name="iteration"></a>

### For loop

- A `for` loop is used for iterating over a sequence, such as a list or a tuple.
- Syntax:
```python
for i in sequence:
    <statement>
```
- **Indentation matters**
- Note that a string is an ordered collection of characters, and we can thus iterate over strings
- Built-in functions useful for iterations:

|Function|Description|
|:-------|:----------|
|`enumerate(seq)` | Provides both the index and the associated item/value of sequence `seq`|
|`zip(seq_1,seq_2)` | Pairs the items of sequences `seq_1` and `seq_2` into tuples |
|`reversed(seq)`|Iterates over `seq` in reverse order |
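
The loop helpers from the table can be sketched as follows. The sequences `names` and `years` are illustrative.

```python
names = ["Ada", "Grace", "Alan"]
years = (1815, 1906, 1912)

# enumerate() yields the index together with the item.
numbered = []
for index, name in enumerate(names):
    numbered.append(f"{index}: {name}")

# zip() pairs up the items of two sequences.
paired = []
for name, year in zip(names, years):
    paired.append((name, year))

# reversed() walks the sequence from the last item to the first.
backwards = []
for name in reversed(names):
    backwards.append(name)

# Strings are sequences of characters, so they can be iterated too.
characters = []
for character in "abc":
    characters.append(character)

print(numbered, paired, backwards, characters)
```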

### While loop

- A `while` loop keeps iterating until a conditional expression evaluates to `False`. 
- Syntax:
```python
k = 0                           # Initialize sequence index
while condition(k):             # Condition that depends on k
    statement                   # Statement evaluation if condition is true
    k += 1                      # Update k for next iteration
```
- **Be careful not to get stuck in an infinite loop**, which happens if the condition is always `True`:
    - Check your condition with a few values beforehand to make sure it can return `False`
    - Check outside of the loop that your increment works as expected
    - If unsure, add a second condition that is guaranteed to return `False` after a given number of iterations (e.g., a cap on the maximum number of iterations)
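
A minimal sketch of a `while` loop with a safety cap, as suggested in the last tip. The task (halving a number until it drops to 1 or below) and the cap of 1000 are arbitrary choices for illustration.

```python
value = 100.0
iterations = 0
max_iterations = 1000     # safety cap: the loop stops even if the first condition never fails

while value > 1 and iterations < max_iterations:
    value /= 2            # halve the value
    iterations += 1       # count the iteration

print(value, iterations)  # 0.78125 7
```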

### Break, continue, else

- The `break` statement can stop a `for` or `while` loop before it has looped through all the items.
- The `continue` statement can stop the current iteration of the loop, and continue with the next.
- The `else` keyword specifies a block of code to be executed when the loop finishes normally, i.e., without being stopped by a `break`.
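
The three keywords can be seen together in the sketch below. The list `numbers` and the search target are illustrative. In Python, the `else` block of a loop runs only when the loop was not stopped by `break`.

```python
numbers = [4, 7, 10, 13, 16]

# continue: skip the even numbers and keep only the odd ones.
odd_numbers = []
for number in numbers:
    if number % 2 == 0:
        continue                  # jump straight to the next iteration
    odd_numbers.append(number)

# break and else: search for a value that is not in the list.
target = 5
for number in numbers:
    if number == target:
        search_result = "found"
        break                     # stop the loop early
else:
    search_result = "not found"   # runs because break was never reached

print(odd_numbers, search_result)
```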

## List Comprehension <a name="comprehension"></a>

- List comprehension offers a shorter and more efficient syntax when you want to create a new list based on the values of an existing list
- Instead of writing a loop, you can put the `for` and `if` clauses directly inside the square brackets:

```python
newlist = [expression_to_put_in_list for item in iterable if condition]
```
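
For comparison, here is the same list built first with an ordinary loop and then with a list comprehension. The input list is illustrative.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Loop version: squares of the even numbers
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n ** 2)

# Equivalent list comprehension
even_squares = [n ** 2 for n in numbers if n % 2 == 0]

print(even_squares)    # [4, 16, 36]
```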

## Function <a name="function"></a>

- A function takes in arguments, performs some operation based on the values of the arguments, and then returns a result
- Syntax:
    - A function is **defined** using the `def` keyword
    - Following the `def` keyword is a **function signature** which indicates the function's name and its arguments. The arguments are separated by commas and enclosed in parentheses
        - `arg_2` is a *named keyword argument*, also known as a *named kwarg*. If `arg_2` is not specified when the function is called, then the default value will be used
        - The unpacking operator `*` lets you define functions that accept an arbitrary number of positional arguments, collected into a tuple
        - The operator `**` lets you define functions that accept *arbitrary keyword arguments* (*kwargs*), collected into a dictionary
    - The indentation following the `def` line specifies what is part of the function. As soon as the indentation goes to the left again, aligned with `def`, the body of the function is complete
    - Immediately following the function signature is the **doc string** (short for documentation string), a brief description of the function. You can display the doc string using `help(function_name)`
    - `return` specifies what the function will return when called
```python
def function_name(arg_1, arg_2 = default_value, *args, **kwargs):
    """Description"""
    <operations>
    return <statement>
```
- A function that calls itself is said to be **recursive**, and the technique of employing a recursive function is called **recursion**.
- The Python programming language has several built-in functions. The complete set of **built-in functions** can be found [here](https://docs.python.org/3/library/functions.html)
- <span style="color: red; font-weight: bold;"> Never define a function or a variable with the same name as a built-in function. </span>
- **Always test your functions!**
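
As a sketch of the syntax above, the first function below uses a default value, `*extras`, and `**options`, and the second one is recursive. Both `describe_order` and `factorial` are hypothetical examples written for this cheat sheet, and the final `assert` lines show one simple way to test them.

```python
def describe_order(item, quantity=1, *extras, **options):
    """Return a one-line summary of an order (hypothetical example)."""
    summary = f"{quantity} x {item}"
    if extras:                                  # extras is a tuple of extra positional arguments
        summary += " with " + ", ".join(extras)
    for key, value in options.items():          # options is a dict of extra keyword arguments
        summary += f" [{key}={value}]"
    return summary


def factorial(n):
    """Return n! for a non-negative integer n, computed recursively."""
    if n <= 1:
        return 1                    # base case stops the recursion
    return n * factorial(n - 1)     # the function calls itself


print(describe_order("coffee"))
print(describe_order("coffee", 2, "milk", "sugar", size="large"))
print(factorial.__doc__)            # the doc string is stored on the function

# Simple tests: each assert fails loudly if the result is not the expected one.
assert describe_order("coffee") == "1 x coffee"
assert factorial(5) == 120
```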