# Python 101
Python is a high-level, interpreted, general-purpose, open-source programming language that can be used to build almost any kind of application. It is dynamically typed and garbage-collected, and it supports multiple programming paradigms.
Python was first released in 1991. Its strong community has produced a large number of packages for working with data, which makes Python a popular choice for data science and data engineering tasks.

This notebook provides a short, fast overview of the Python language with small practical exercises. The focus is on syntax: the goal is to make you familiar enough with Python to work through the DE Accelerated content without problems. To test your knowledge directly, you can skip ahead to the exercises. For more details about Python, please refer to the [official documentation](https://www.python.org/).

## Python for calculations
Short overview of common arithmetic operators.

In [None]:
# Addition
6 + 2

In [None]:
# Subtraction
3 - 4

In [None]:
# Multiplication
3 * 4

In [None]:
# Division
3 / 4

In [None]:
# Modulo
9 % 2

In [None]:
# Exponentiation
9 ** 2
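
As a small addition to the operators above, the following sketch contrasts division with floor division and shows how operator precedence affects a combined expression. The variable names are chosen for illustration only.

```python
# Division always returns a float, floor division rounds down to a whole number
true_quotient = 7 / 2       # 3.5
floor_quotient = 7 // 2     # 3
remainder = 7 % 2           # 1

# divmod returns the floor quotient and the remainder in one call
quotient_and_remainder = divmod(7, 2)   # (3, 1)

# Exponentiation binds tighter than multiplication,
# which binds tighter than addition: 2 + (3 * (2 ** 2))
precedence_result = 2 + 3 * 2 ** 2      # 14

print(true_quotient, floor_quotient, remainder)
print(quotient_and_remainder, precedence_result)
```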

## Variables
Variable names in Python (1) start with a letter or an underscore, (2) cannot start with a number, (3) contain only alphanumeric characters and underscores, and (4) are case-sensitive.
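
The naming rules can be checked programmatically. The sketch below uses the string method isidentifier together with the keyword module; the candidate names are made up for illustration. Note that reserved keywords such as class satisfy the character rules but still cannot be used as variable names.

```python
import keyword

# Hypothetical candidate names, chosen to illustrate each rule
candidates = ['total', '_total', 'total_2', '2total', 'my-total', 'class']

results = {}
for name in candidates:
    # Valid if the characters follow the rules and the name is not a reserved keyword
    results[name] = name.isidentifier() and not keyword.iskeyword(name)

for name, is_valid in results.items():
    print(f'{name!r}: {is_valid}')

# Names are case-sensitive: these are two different variables
Total = 1
total = 2
print(Total == total)
```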

In [None]:
# Assign a value to a variable
count = 0
count

In [None]:
# Use of assignment operators
count += 1 # same as count = count + 1
count

In [None]:
# Multiple assignments in one line, calculating with variables
triangle_base, triangle_height = 5, 6
triangle_area = 0.5 * triangle_base * triangle_height
triangle_area

In [None]:
# Introducing the f string
first_name = 'Kim'
print(f'Hello {first_name}, how are you?')

### Exercises
Replace the NotImplemented placeholder in the following function so that it returns the result of 3 + 4.
Define a variable and calculate

In [None]:
def calculate_3_plus_4():
    return NotImplemented

In [None]:
def test_calculate_3_plus_4():
    result = calculate_3_plus_4()
    if result is NotImplemented:
        raise NotImplementedError("Implementation missing")
    assert result == 7
    print("Test passed")

# test_calculate_3_plus_4()

## Types and type conversion
Python determines the type of a value automatically at runtime, so variables do not need a type declaration. Common data types include integer (int), float (float), string (str) and Boolean (bool).

In [None]:
# Integer
participants = 90
type(participants)

In [None]:
# Float
distance = 3.5
type(distance)

In [None]:
# String
text = "text"
another_text = 'another text'
type(text)

In [None]:
# Boolean
condition = True
type(condition)

In [None]:
# Convert a value to a preferred data type with e.g. int(), float(), str(), bool()
distance_as_string = str(distance)
distance_as_string
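
A few more conversions, as a minimal sketch with illustrative variable names. It also shows two details that often surprise beginners: int() truncates floats instead of rounding, and bool() of any non-empty string is True.

```python
whole_number = int('42')         # string to integer
truncated = int(3.9)             # float to integer drops the fraction
decimal_number = float('3.5')    # string to float
label = str(90) + ' km'          # integer to string, then concatenation

# bool() treats empty and zero values as False, everything else as True
empty_is_false = bool('')
text_is_true = bool('False')     # True, because the string is not empty

# A string that does not represent a number cannot be converted
try:
    int('three')
    conversion_failed = False
except ValueError as error:
    conversion_failed = True
    print(error)
```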

### Exercises
Description

## Lists
[Lists](https://docs.python.org/3/library/stdtypes.html#lists) are mutable sequences of values that can be of a single type (typical) or different types. A list itself is a type. The next sections cover the following points:
- Create a list
- Access a list
- Manipulate a list
- Copy a list
- Join lists
- Sort and reverse a list

### Create a list
A list can hold values of (1) the same type or (2) different types (including e.g. other lists).

In [None]:
# Create a list with values of the same type
items = ['avocado', 'mushrooms', 'pasta', 'tomato', 'spinach'] # or use type constructor e.g. list(('avocado','pasta')) returns ['avocado', 'pasta']
type(items)

In [None]:
# Create a list with lists
items_with_price = [['avocado', 2],
                    ['mushrooms', 2.5],
                    ['pasta', 1.56]]

print(type(items_with_price))
print(items_with_price)

In [None]:
# Create a list with different types
some_value = 44
another_list = [3, 'hello', ['i', 5555, True], False, some_value, 4.555]

print(type(another_list))
print(another_list)

### Access a list
Use indexing and slicing to extract a subset of values from a list.

In [None]:
# Indexing second element
items[1]

In [None]:
# Indexing last element
items[-1]

In [None]:
# List slicing with defining start and end index
# Start index included, end index excluded
items[3:5]

In [None]:
# List slicing with defining either start or end index
items[:3]
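
Slices also accept negative indices and an optional step. The following sketch uses a separate example list (letters) so that it does not interfere with the items list used in this section.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

first_two = letters[:2]         # ['a', 'b']
from_third = letters[2:]        # ['c', 'd', 'e', 'f']
last_two = letters[-2:]         # ['e', 'f']
every_second = letters[::2]     # ['a', 'c', 'e']
reversed_copy = letters[::-1]   # ['f', 'e', 'd', 'c', 'b', 'a']

# Slicing returns a new list, the original is unchanged
print(letters)
```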

### Manipulate a list
There are many ways to manipulate a list. Here are some common ones.

In [None]:
# Look at current list elements
items

In [None]:
# Changing an element
items[3] = 'apple'
items

In [None]:
# Changing elements
items[:2] = ['strawberry', 'raspberry']
items

In [None]:
# Adding elements
items += ['cherry', 'kale']
print(items)

items.insert(2, 'watermelon')
print(items)

items.append('pizza')
print(items)

In [None]:
# Deleting elements
del items[0] # or use items.remove('strawberry') or items.pop(0)
print(items)

items.remove('pizza')
print(items)

In [None]:
# Clear all elements from a list
items.clear()
items

### Copy a list

In [None]:
# Assigning a list to a second variable only copies the reference; use the list() function to create a new list with the same values
original_items = ['apple', 'orange']
new_items = list(original_items) # or use original_items[:] or original_items.copy()
new_items
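
The difference between assigning a reference and creating a copy can be made visible by changing the original list afterwards. This is a minimal sketch with illustrative names.

```python
original = ['apple', 'orange']

alias = original               # second name for the same list object
independent = list(original)   # new list with the same values

original.append('banana')

print(alias)         # the alias sees the change
print(independent)   # the copy does not

# The is operator checks whether two names refer to the same object
same_object = alias is original
copy_is_same_object = independent is original
```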

### Join lists

In [None]:
# Remove item from original items
del(original_items[0])

# Join two lists
joined_list = new_items + original_items
joined_list

### Sort and reverse a list

In [None]:
# Sort lists
number_list = [10,100,1,0,1000]
number_list.sort()
print(number_list)

# Reverse sort
number_list.sort(reverse = True)
print(number_list)

In [None]:
# Reverse order of list
number_list.reverse()
number_list

### Exercises
Description

## Tuples, dictionaries and sets
Other common data types include the tuple, dictionary and set.

### Tuple
[Tuples](https://docs.python.org/3/library/stdtypes.html#tuples) are immutable, ordered sequences of values. Duplicates are allowed. Values are indexed and can be of any type.

In [None]:
tuple_example = (12, 'word', True)
type(tuple_example)
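
The following sketch shows indexing, unpacking and the effect of immutability. The tuple and its variable names are illustrative.

```python
point = (3, 4, 'offset')

# Indexing works as for lists
first_value = point[0]

# Unpacking assigns each value to its own variable
x_coord, y_coord, description = point

# Tuples are immutable, so item assignment raises a TypeError
try:
    point[0] = 10
    assignment_failed = False
except TypeError as error:
    assignment_failed = True
    print(error)
```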

### Dictionary
[Dictionaries](https://docs.python.org/3/library/stdtypes.html#dict) are mutable collections that store key-value pairs. Each key is mapped to a value. Keys must be hashable (e.g. strings, numbers or tuples) and do not have to share one type, while values can be objects of any type. Duplicate keys are not allowed: assigning to an existing key overwrites its value.

In [None]:
dict_example = {'firstKey': 1, 'secondKey': 2, 'thirdKey': 3} # or dict(firstKey=1, secondKey=2, thirdKey=3)
type(dict_example)
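
Common dictionary operations, sketched with an illustrative price list: reading, adding, overwriting, deleting and iterating.

```python
prices = {'avocado': 2, 'mushrooms': 2.5}

avocado_price = prices['avocado']         # access by key
missing_price = prices.get('pasta', 0)    # default value instead of a KeyError

prices['pasta'] = 1.56       # add a new key
prices['avocado'] = 2.2      # overwrite the value of an existing key
del prices['mushrooms']      # delete a key

# Iterate over key-value pairs
for item, price in prices.items():
    print(item, price)

# Keys only need to be hashable, they can have different types
mixed_keys = {1: 'int key', 'one': 'str key', (1, 1): 'tuple key'}
```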

### Set
[Sets](https://docs.python.org/3/library/stdtypes.html#set) are unordered and un-indexed collections. Duplicates are not allowed. The set itself is mutable, so items can be added or removed, but each item must be hashable (e.g. numbers, strings or tuples, but not lists).

In [None]:
set_example = {'word1', True, 34}
type(set_example)
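
Sets are often used to remove duplicates and to compare collections. A minimal sketch with made-up names:

```python
visits = ['kim', 'alex', 'kim', 'sam', 'alex']

# Converting a list to a set removes duplicates
unique_visitors = set(visits)

unique_visitors.add('jo')         # add an item
unique_visitors.discard('sam')    # remove an item if it is present

team_a = {'kim', 'alex'}
team_b = {'alex', 'jo'}

in_both = team_a & team_b      # intersection
in_either = team_a | team_b    # union
only_in_a = team_a - team_b    # difference

print(unique_visitors, in_both, in_either, only_in_a)
```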

### Exercises
Description

## Indentation
Python uses [indentation](https://docs.python.org/3.11/reference/lexical_analysis.html#indentation) in the form of whitespace to indicate blocks of code, so correct indentation is mandatory. Mixing tabs and spaces inconsistently raises an error; spaces are usually preferred.
[Explicit line joining](https://docs.python.org/3.11/reference/lexical_analysis.html#explicit-line-joining) (using the backslash) and [Implicit line joining](https://docs.python.org/3.11/reference/lexical_analysis.html#implicit-line-joining) (for content within parentheses, square brackets or curly braces) are additional ways to structure code (e.g. for better readability).

In [None]:
# Explicit line joining
full_list = ['a', 'b'] \
    + ['c'] \
    + ['d', 'e']

full_list

In [None]:
# Implicit line joining
another_list = ['Monday', 'Tuesday', 'Wednesday',
                'Thursday', 'Friday', 'Saturday',
                'Sunday']

another_list

## Functions and methods
[Functions](https://docs.python.org/3/library/stdtypes.html#functions) are reusable blocks of code that run only when they are invoked. They are defined in Python with the <b>def</b> keyword.
[Methods](https://docs.python.org/3/library/stdtypes.html#methods) are functions that belong to an object, either built-in methods (e.g. string methods) or class instance methods.

In [None]:
# Defining and invoking a function
def print_hello():
    print('Hello')

print_hello()

In [None]:
# Define a function with arguments/parameters and return statement
def calculate_triangle_area(base, height):
    return 0.5 * base * height

calculate_triangle_area(base = 5, height = 6)

In [None]:
# Use the help function to get more context
help(print)

### Lambda functions
Lambdas are small anonymous functions. Their body is limited to a single expression, which makes them convenient for simple logic and for functions that are used only once.

In [None]:
# A function defined by the def keyword
def add_two_numbers(val1, val2):
    return val1 + val2

# Equivalent lambda function
lambda_add_two_numbers = lambda a, b: a + b
lambda_add_two_numbers(4,5)

In [None]:
# Passing in a function as arguments
example_function = lambda a, b, func: a * func(a,b)
example_function(2, 3, lambda a, b: a + b)
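
A typical place for a one-off lambda is the key argument of built-in functions such as sorted and min. The sketch below reuses the shape of the items_with_price list from the Lists section; the variable names are illustrative.

```python
items_with_price = [['avocado', 2], ['mushrooms', 2.5], ['pasta', 1.56]]

# Sort by the second element of each inner list (the price)
by_price = sorted(items_with_price, key=lambda pair: pair[1])

# Find the name of the cheapest item
cheapest_name = min(items_with_price, key=lambda pair: pair[1])[0]

print(by_price)
print(cheapest_name)
```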

### Map, filter and reduce
Map, filter and reduce are higher-order functions. Map and filter are built-in functions. Reduce has to be imported from the functools module.
- The [map](https://docs.python.org/3/library/functions.html?highlight=map#map) function takes a function and one or more iterables as arguments. The function is applied to every element of the iterable (e.g. to each element of a list).
- The [filter](https://docs.python.org/3/library/functions.html?highlight=filter#filter) function takes a function and an iterable. The output contains the elements for which the passed-in function returned a true value.
- The [reduce](https://docs.python.org/3/library/functools.html?highlight=reduce#functools.reduce) function applies a function of two arguments cumulatively to the elements of an iterable, so that a single value is returned.

In [None]:
# Map with map(function, iterables)
example_list = [1,2,3,4]

list(map(lambda a: a-1, example_list))

In [None]:
# Filter with filter(function, iterable)
example_list = [1,2,3,4]

list(filter(lambda a: a>=2, example_list))

In [None]:
# Reduce with reduce(function, iterable)
from functools import reduce

example_list = [1,2,3,4]

reduce(lambda a,b: a+b, example_list)
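
For comparison, the sketch below places each higher-order function next to a common alternative: a list comprehension (covered in the Loops section) for map and filter, and the built-in sum for an additive reduce. It also shows the optional initial value of reduce. Variable names are illustrative.

```python
from functools import reduce

values = [1, 2, 3, 4]

# map and the equivalent list comprehension
mapped = list(map(lambda a: a - 1, values))
mapped_comprehension = [a - 1 for a in values]

# filter and the equivalent list comprehension
filtered = list(filter(lambda a: a >= 2, values))
filtered_comprehension = [a for a in values if a >= 2]

# reduce and the built-in sum
reduced = reduce(lambda a, b: a + b, values)
reduced_builtin = sum(values)

# reduce with an initial value as third argument
product = reduce(lambda a, b: a * b, values, 1)

print(mapped, filtered, reduced, product)
```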

### Exercises
Description

## Modules and packages
[Modules](https://docs.python.org/3/tutorial/modules.html) are .py files that contain functions, variables and other definitions.
[Packages](https://docs.python.org/3/tutorial/modules.html#packages) are collections of modules with an init file. Packages can be installed with [pip](https://pypi.org/project/pip/).

In [None]:
# Import a module
import sys

# Use dir function to find out what is defined within a module
dir(sys)

In [None]:
# Install a package in the notebook
!{sys.executable} -m pip install numpy

In [None]:
# Import with the import keyword
import numpy

# Make use of the module
numpy.array([11,22])

In [None]:
# Import with different name
import numpy as np

np.array([11,22])

# Or import only a part
# from numpy import array
# array([11,22])

# Submodules can be referenced with a dot (.)
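
The import styles shown above also work with modules from the standard library, which need no installation. This sketch uses math and os.path as examples; the file path is made up for illustration.

```python
import math                  # import the whole module
from math import sqrt        # import only one name from a module
import os.path               # import a submodule, referenced with a dot

# Access a name through the module
circle_area = math.pi * 2 ** 2

# Use a directly imported name without the module prefix
hypotenuse = sqrt(3 ** 2 + 4 ** 2)

# Use a function of a submodule
file_name = os.path.basename('data/raw/sample.txt')

print(circle_area, hypotenuse, file_name)
```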

### Exercises
Description

## Conditions and if-else statement
This section introduces conditions and the [if-else statement](https://docs.python.org/3/reference/compound_stmts.html#the-if-statement).

In [None]:
# Define variables
x = 3
y = 5
z = 7

### Conditions

In [None]:
# Equals
x == y

In [None]:
# Not equals
condition = (x != y)
print(condition)

In [None]:
# Less/Greater than AND less/greater than or equal to
print(x < y)
print(x <= y - 2)
print(x > y)
print(x >= y - 2)

In [None]:
# and, or, not - keywords
print(x<y and x<z)
print(x<y or x>z)
print(not x<y)

### If-else statement

In [None]:
# If-else statement
if x > y:
    print('x is greater than y')
elif x > z:
    print('x is greater than z but not greater than y')
else:
    print('y and z are both greater than or equal to x')

In [None]:
# Short forms

# Short if on a single line
if x < y: print('x is smaller than y')

# Short if-else with a conditional expression (ternary operator)
print('x is greater than y' if x > y else 'y is greater than or equal to x')

### Exercises
Description

## Loops
There are two common loops in Python, the [for loop](https://docs.python.org/3/reference/compound_stmts.html#the-for-statement) and the [while loop](https://docs.python.org/3/reference/compound_stmts.html#the-while-statement). The for statement is used to iterate over iterables (e.g. strings, lists...). The while statement is used for repeating code blocks until a defined condition turns false.

This section also introduces the [range type](https://docs.python.org/3/library/stdtypes.html?highlight=range#range) and [list comprehensions](https://docs.python.org/3/tutorial/datastructures.html?highlight=list%20comprehension#list-comprehensions).

### Range function

In [None]:
# Define range that starts at 3 and stops before 20 with a step of 2
example_range = range(3,20,2)

# Print the type
print(type(example_range))

# Convert the range into a list
list(example_range)

### for statement

In [None]:
# Iterate through a list
example_list = ['flower', 'tree', 'stone']
for element in example_list:
    print(element)

In [None]:
# Iterate using a range and print the odd numbers
for element in range(6):
    if element % 2 == 1: print(element)

### while statement

In [None]:
number = 4
while number >= 0:
    print(number)
    number -= 2

### List comprehensions
List comprehensions are used to create a new list from an existing iterable (e.g. a list). An expression and an optional condition define the new list. The original list remains unchanged.

new_list = [expression for element in iterable if condition]

In [None]:
old_list = list(range(3))
print(old_list)

new_list = [element + 100 for element in old_list if element < 2]
print(new_list)
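
To make the structure of a comprehension explicit, the sketch below builds the same list twice: once with a for loop and once with a list comprehension. Variable names are illustrative.

```python
old_list = list(range(3))

# Version with a for loop
new_list_loop = []
for element in old_list:
    if element < 2:                          # condition
        new_list_loop.append(element + 100)  # expression

# Equivalent list comprehension
new_list_comprehension = [element + 100 for element in old_list if element < 2]

print(new_list_loop)
print(new_list_comprehension)
```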

### Exercises
Description

## Pattern matching
Since version 3.10, Python supports structural pattern matching with the [match statement](https://docs.python.org/3/reference/compound_stmts.html#the-match-statement).

In [None]:
# status = 200
#
# match status:
#     case 200:
#         print('OK')
#     case 500:
#         print("Internal Server Error")
#     case _:
#         print("Status code not known")

## File handling
Python supports file handling operations such as reading and writing.
- Before any operation can be performed, the [open function](https://docs.python.org/3.11/library/functions.html?highlight=open#open) is used to open the file. A mode specifies how the file will be used (e.g. r for read, w for write and a for append).
- Other useful file methods include read and write. To append, open the file in mode a and use write.
- Use the close method to free up resources immediately if you are not using the with keyword.
- After importing the os module, the rename (renaming a file) and remove (deleting a file) functions can be used.

More on [reading and writing files](https://docs.python.org/3.11/tutorial/inputoutput.html#reading-and-writing-files).

In [None]:
# Open a txt file for reading operations
file = open(file='sample.txt', mode='r')

# Print each line of the file (lines keep their trailing newline, so print should not add another one)
for line in file:
    print(line, end='')

# Close the file to release its resources when not using the with keyword (see below)
file.close()

In [None]:
# Create a file
f = open('file.txt','w')

f.write("Write something. ")
f.write("Write more into the file. ")

f.close()

In [None]:
# Append to a file
f = open('file.txt','a')

f.write("Append something.")

f.close()

In [None]:
# Rename a file
import os

os.rename('file.txt', 'fancy.txt')

In [None]:
# Delete a file
os.remove('fancy.txt')

### With statement
Using the [with statement](https://docs.python.org/3/reference/compound_stmts.html#the-with-statement) is considered good practice for file handling. The file is closed automatically and its resources are released when the code block ends, even if an exception is raised.

In [None]:
with open("sample.txt") as file:
    text = file.read()

print(text)

In [None]:
# Check if file was closed automatically (no execution of the close function required)
file.closed
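
The write, append and read operations from the previous section can all be combined with the with statement. The following sketch works inside a temporary directory (an assumption made here so that no files are left behind); the file name notes.txt is made up for illustration.

```python
import os
import tempfile

# The temporary directory and its content are removed when the block ends
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'notes.txt')

    # Create the file and write a first line
    with open(path, 'w') as file:
        file.write('first line\n')

    # Append a second line
    with open(path, 'a') as file:
        file.write('second line\n')

    # Read the content back as a list of lines
    with open(path, 'r') as file:
        lines = file.read().splitlines()

    # The file object is closed as soon as its with block ends
    closed_after_block = file.closed

print(lines)
print(closed_after_block)
```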

### Try statement
The [try statement](https://docs.python.org/3/reference/compound_stmts.html#the-try-statement) is used to define how exceptions are handled for a group of statements.

In [None]:
# Try to read a file and print the error if that is not possible
try:
    with open('content.txt', 'r') as file:
        contents = file.read()
except OSError as e:  # IOError is an alias of OSError in Python 3
    print(e)
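
A try statement can also have an else clause (runs only if no exception occurred) and a finally clause (always runs). The sketch below wraps this in a hypothetical helper function read_text and assumes that the file name used in the call does not exist in the working directory.

```python
def read_text(path):
    """Return the content of a text file, or None if the file does not exist."""
    try:
        with open(path, 'r') as file:
            contents = file.read()
    except FileNotFoundError:
        # Handle only the specific error that is expected here
        return None
    else:
        # Runs only if the try block raised no exception
        return contents
    finally:
        # Runs in every case, before the function returns
        print(f'Finished attempt to read {path}')

# Assumption: no file with this name exists
missing = read_text('this_file_does_not_exist_12345.txt')
print(missing)
```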

### Exercises
Description
create a function that creates a txt file with xx name and xx content + another function that appends more text to the already existing file. Create a function that deletes the file. Use the with keyword and handle exceptions with following message xxx

## Optional type system
Python is a dynamically typed language: types are determined during code execution. The [typing](https://docs.python.org/3/library/typing.html) module provides runtime support for type hints. See also this [typing cheat sheet](https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html).

Dynamic typing gives a lot of flexibility, but it also comes with some drawbacks. Type annotations are becoming more common because they serve as documentation and allow tools to detect errors earlier. Read more about it [here](https://cerfacs.fr/coop/python-typing). User-defined types are possible.

It is important to know that type annotations are not enforced by the Python interpreter at runtime. Their potential is realised through tools such as type checkers and linters.

In [None]:
# Some example usage of the optional type system
my_number: int = 3
my_string: str = 'hello'
my_list: list[float] = [1.3, 2.4]

def say_hello(name: str) -> str:
    return 'hello ' + name

def print_hello() -> None:
    print('hello')
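
That annotations are not enforced at runtime can be seen in a short sketch. The function repeat_twice is hypothetical: it is annotated for integers, yet a call with a string runs without error. A type checker would flag that call, the interpreter does not.

```python
def repeat_twice(value: int) -> int:
    return value * 2

number_result = repeat_twice(4)

# The annotation says int, but the interpreter does not check it
text_result = repeat_twice('ab')

# The annotations are stored on the function and can be inspected
annotations = repeat_twice.__annotations__

print(number_result, text_result, annotations)
```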

### Exercises
Description

## [Optional] Classes
Python classes bundle data and functionality. "Creating a new class creates a new type of object, allowing new instances of that type to be made. Each class instance can have attributes attached to it for maintaining its state." Read [more about classes](https://docs.python.org/3.11/tutorial/classes.html?highlight=object%20oriented%20programming) (e.g. inheritance).

In [None]:
# Create a class object with two attributes (data, method)
class ExampleClass:
    class_value = 0
    def add_one(self):
        self.class_value += 1

In [None]:
# Reference class attributes
print(ExampleClass.class_value)
print(ExampleClass.add_one)

In [None]:
# Instantiate class
exampleClass = ExampleClass()

# Set the value on the instance (this creates an instance attribute that shadows the class attribute)
exampleClass.class_value = 5
exampleClass.add_one()

print(exampleClass.class_value)

In [None]:
# class_value is a class variable shared by all instances
# Defining an init function for a class introduces variables that are unique to each instance
class AnotherClass:
    class_value = 0
    def __init__(self, instance_value):
        self.instance_value = instance_value

# Instantiate classes with init function
first_class = AnotherClass('instance value 1')
second_class = AnotherClass('instance value 2')

print(first_class.class_value == second_class.class_value) # shared class value
print(first_class.instance_value == second_class.instance_value) # unique instance value
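
The interplay of class variables and instance variables is sketched below with a hypothetical Counter class. Changing the class variable affects all instances, while assigning the same name on one instance only shadows it for that instance.

```python
class Counter:
    step = 1  # class variable, shared by all instances

    def __init__(self, start):
        self.value = start  # instance variable, unique to each instance

    def increment(self):
        self.value += self.step

first = Counter(0)
second = Counter(10)

first.increment()    # first.value is now 1
second.increment()   # second.value is now 11

# Changing the class variable affects every instance
Counter.step = 5
first.increment()    # first.value is now 6
second.increment()   # second.value is now 16

# Assigning on an instance creates an instance attribute that shadows the class variable
first.step = 100
first.increment()    # first.value is now 106
second.increment()   # second.value is now 21, still using Counter.step

print(first.value, second.value, Counter.step)
```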

### Exercises
Description

## [Optional] Concurrent execution
[Concurrent execution](https://docs.python.org/3/library/concurrency.html) means making progress on multiple tasks during the same period of time.

[Threading](https://docs.python.org/3/library/threading.html) enables concurrent execution by running multiple threads and switching between them, without running them in parallel.
In the standard CPython implementation, the [global interpreter lock](https://docs.python.org/3/glossary.html#term-global-interpreter-lock) allows only one thread to execute Python code at a time. Code runs by default on a thread called the main thread. Threads run in a shared memory space and are a good option for I/O bound tasks, but they do not speed up CPU bound processing.
Short video introduction to [Python threading](https://www.youtube.com/watch?v=A_Z1lgZLSNc).

[Multiprocessing](https://docs.python.org/3/library/multiprocessing.html) makes better use of the computational resources of multicore machines and sidesteps the global interpreter lock, because each process has its own interpreter. Multiprocessing enables true parallelism. Processes have separate memory spaces and are a good option for CPU bound applications.
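
As a minimal sketch of threading for I/O bound work, the example below uses ThreadPoolExecutor from the standard library module concurrent.futures. The function simulated_download is hypothetical and only sleeps to stand in for waiting on a network or disk. Because the threads wait at the same time, the total run time should be close to one waiting period rather than four.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def simulated_download(task_id):
    # Stand-in for an I/O bound task: the thread waits without using the CPU
    time.sleep(0.2)
    return task_id * 10

task_ids = [1, 2, 3, 4]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
    # map returns the results in the order of the inputs
    results = list(executor.map(simulated_download, task_ids))
elapsed = time.perf_counter() - start

print(results)
print(f'{elapsed:.2f} seconds')
```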

## [Optional] Other useful modules and packages
To find packages for Python, browse https://pypi.org

Useful modules for Python in the data space include the following (good to have heard about them):
  - [Regex](https://docs.python.org/3.11/library/re.html?highlight=rx) (the re module) for regular expression matching operations.
  - [Math](https://docs.python.org/3.11/library/math.html?highlight=math#module-math) for mathematical functions.
  - [NumPy](https://numpy.org/devdocs/index.html) is a fundamental package for scientific computing. It provides multidimensional arrays and the capability to perform fast operations (mathematical, statistical, algebraic etc.) on them.
  - [Pandas](https://pandas.pydata.org/) is used for data analysis and manipulation. It is fast and easy to use, and exploratory analysis is a very good use case.
  - [Matplotlib](https://matplotlib.org/) and [Seaborn](https://seaborn.pydata.org/) are both excellent visualisation tools.
  - [scikit-learn](https://scikit-learn.org/stable/) is a machine learning library providing implementations for common algorithms for predictive data analysis.