<a href="https://colab.research.google.com/github/simon-mellergaard/RL/blob/main/Tutorials/04_python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
#@title Notebook pre steps

# ALWAYS SAVE YOUR OWN COPY OF THIS NOTEBOOK: File > Save a copy in Drive
# If the menus are in Danish, switch to English: Hjælp > Engelsk version
# To clear output do: Edit > Clear all outputs

## Useful shortcuts
# Run current cell: Cmd+Enter
# Run current cell and go to next: Shift+Enter
# Run selection: Cmd+Shift+Enter
# Comment/uncomment: Cmd+/

## Debugging (turn off when you have finished coding)
# Use ipdb.set_trace() to set breakpoint
# Useful ipdb commands
    # p <variable>: Print the value of a variable.
    # n: Execute the next line of code.
    # c: Continue execution until the next breakpoint or the end of the program.
    # q: Quit the debugger.
# %debug # Can be run in a code cell after the cell with error
# !pip install ipdb
# import ipdb
# %pdb on
# %pdb off


## Missing packages
!pip install dfply
!pip install plotnine

# A short introduction to Python and packages

This notebook gives an introduction to Python and some of the packages we will use during the course.

## Formatted Strings with `f'...'`

Formatted strings allow you to embed expressions inside string literals.
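
Any expression can go inside the braces, and an optional format specifier after a colon controls how the value is displayed. A minimal sketch, where the variables `price` and `quantity` are illustrative and not part of the course material:

```python
# Hypothetical values used only for illustration
price = 4.5
quantity = 3

# Any expression can be evaluated inside the braces
total_line = f'Total: {price * quantity}'
# A format specifier after ':' controls the display (two decimals here)
rounded_line = f'Price: {price:.2f}'

print(total_line)
print(rounded_line)
```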

In [None]:
name = 'Alice'
age = 30
print(f'{name} is {age} years old.')

# Your turn: Create variables for your name, a hobby, and a favorite food.
# Then use an `f-string` to print a sentence including all three.

## Lists, Tuples, and Sets

Lists, tuples, and sets are all built-in data structures in Python, each with different properties:


| Feature         | List ([])                | Tuple (())                 | Set ({})                   |
|-----------------|---------------------------|-----------------------------|-----------------------------|
| Ordered         | Yes                       | Yes                         | No                          |
| Duplicates      | Allowed                   | Allowed                     | Not allowed                 |
| Mutability      | Mutable                   | Immutable                   | Mutable                     |
| Syntax          | my_list = [1, 2, 3]       | my_tuple = (1, 2, 3)        | my_set = {1, 2, 3}          |
| Use case        | General purpose           | Fixed group of values       | Unique, unordered values    |

Lists are more flexible, while tuples are often used for fixed data, like RGB values or coordinate points. Tuples are useful for data you want to protect from modification, or when using sequences as dictionary keys or set elements. Sets are useful when you need to eliminate duplicates or check membership quickly.
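
The points above can be sketched in a few lines. The data below is made up for illustration: a tuple serves as a dictionary key (a list would not be accepted), and a set removes duplicates and supports fast membership tests.

```python
# Hypothetical data used only for illustration
point_names = {(0, 0): 'origin', (1, 2): 'target'}  # tuples work as dictionary keys
origin_name = point_names[(0, 0)]

# A list is mutable and therefore cannot be used as a key
list_key_allowed = True
try:
    bad = {[0, 0]: 'origin'}
except TypeError:
    list_key_allowed = False

visits = ['red', 'green', 'red', 'blue', 'green']
unique_visits = set(visits)         # duplicates are dropped
has_blue = 'blue' in unique_visits  # membership test

print(origin_name, list_key_allowed)
print(len(unique_visits), has_blue)
```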

In [None]:
# Creating empty structures
empty_list = []
empty_tuple = ()
empty_set = set()  # Use set() instead of {} to avoid creating an empty dictionary
print("Empty list:", empty_list)
print("Empty tuple:", empty_tuple)
print("Empty set:", empty_set)

# List example
colors = ['red', 'green', 'red', 'blue']
colors.append('yellow')
print("List:", colors)
# Your turn: Get the index of green (type colors. to see the possibilities)
# Your turn: Get the first index of red
# Your turn: Remove the first red element
# Your turn: Remove the second element

# Tuple example
rgb = ('red', 'green', 'blue', 'red')
# rgb.append('yellow')  # This will raise an AttributeError
print("Tuple:", rgb)
# Your turn: Get the length of the tuple
# Your turn: Count the number of red

# Set example
unique_colors = {'red', 'green', 'blue', 'red'}  # duplicates removed
unique_colors.add('yellow')
print("Set:", unique_colors)
# Your turn: Add black to the set
unique_colors.add('red')
# Your turn: Try adding a duplicate to the set. What happens?

# Your turn: Convert the list to a tuple, the tuple to a set, and the set to a list.

## The NumPy package

NumPy is a library for numerical computing in Python. It provides the ndarray type, which is more efficient than lists for numerical operations.
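
One practical difference is that arithmetic on an ndarray applies to every element at once, while a list needs an explicit loop or comprehension. A small sketch with made-up numbers:

```python
import numpy as np

values = [1, 2, 3, 4]          # plain Python list (hypothetical data)
arr_values = np.array(values)  # the same numbers as an ndarray

# With a list, element-wise arithmetic needs a loop or comprehension
doubled_list = [v * 2 for v in values]
# With an ndarray, the operation is applied to every element
doubled_arr = arr_values * 2
# Note: multiplying a list repeats it instead of scaling it
repeated_list = values * 2

print(doubled_list)
print(doubled_arr)
print(repeated_list)
```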


In [None]:
import numpy as np
import random  # for generating random numbers

# Creating arrays
arr = np.array([1, 3, 7, 7, 2, 1])
print("1D array:", arr)
# Your turn: Create a NumPy array of the numbers 10 through 20.

# 2D array
matrix = np.array([[1, 2], [3, 4]])
print("2D array:\n", matrix)
# Your turn: Create a 3x3 matrix with values from 1 to 9.

# Array operations
print("Element-wise addition:", arr + 10)
print("Sum:", np.sum(arr))
print("Mean:", np.mean(arr))
print("Shape of matrix:", matrix.shape)
# Your turn: Compute the row-wise and column-wise sums of the matrix.

# Array slicing
print("First two elements:", arr[:2])

# Creating arrays with special values
zeros_array = np.zeros((2, 3))  # 2x3 array of zeros
print("Zeros array:\n", zeros_array)
full_array = np.full((3, 3), 7)  # 3x3 array filled with the value 7
print("Full array:\n", full_array)
# Your turn: Create a 4x4 array filled with the number 5 using np.full.

# Max values
print("Max:", np.max(arr))
print("Index of first max value in arr (using argmax):", np.argmax(arr))

# Choosing randomly among multiple maximums
max_value = np.max(arr)
max_indices = np.where(arr == max_value)[0]
print('Max index', random.choice(max_indices))
# alternative (note enumerate([2,8]) would produce: (0, 2) (1, 8))
max_indices = [i for i, val in enumerate(arr) if val == max_value]
print('Max index', random.choice(max_indices))
# Your turn: Find the index of the minimum value, and select one at random.

# Converting lists and tuples to NumPy arrays and back
lst = [1, 2, 3]
tup = (4, 5, 6)
print("Array from list:", np.array(lst)) # list to array
print("Array from tuple:", np.array(tup)) # tuple to array
print("List from array:", arr.tolist()) # array to list
print("Tuple from array:", tuple(arr)) # array to tuple

## Dictionaries

Dictionaries store key-value pairs. They are mutable, keep insertion order (guaranteed since Python 3.7; before that they were unordered), and are very fast for lookups.

Keys must be unique and immutable (e.g., strings, numbers, tuples). Values can be of any type. Use .get(key) if you're not sure the key exists.
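
The difference between `.get(key)` and square-bracket access can be sketched as follows. The dictionary `inventory` is a made-up example:

```python
# Hypothetical dictionary with a string key and a tuple key
inventory = {'apples': 3, ('aisle', 4): 'fruit'}

apples = inventory.get('apples')           # key exists: returns the value
pears = inventory.get('pears')             # key missing: returns None
pears_default = inventory.get('pears', 0)  # key missing: returns the given default

# Square brackets raise a KeyError when the key is missing
missing_raises = False
try:
    inventory['pears']
except KeyError:
    missing_raises = True

print(apples, pears, pears_default, missing_raises)
```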

In [None]:
# Creating dictionaries
person1 = {"name": "Alice", "age": 30, "city": "Paris"}
print("Dictionary:", person1)

# Accessing values
print("Name:", person1["name"])

# Adding or updating entries
person1["height"] = 165
person1["age"] = 31
print("Updated dictionary:", person1)
# Your turn: Remove city from person1

# Dictionary methods
print("Keys:", person1.keys())
print("Values:", person1.values())
print("Items:", person1.items())

# Creating an empty dictionary
person2 = {}
print("Empty dictionary:", person2)
# Your turn: Add info about city and height to person2
# Your turn: Add person1 and person2 to a new dictionary people using their name as keys.
# Your turn: Print person2's height using people

## Advanced Loops & Conditional Statements

Python allows advanced use of loops and conditional statements to write expressive and efficient code.
These features are useful when filtering data, checking multiple conditions, or evaluating nested structures.
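
As a small warm-up, the built-in functions `all()` and `any()` combine many conditions into one result, and a comprehension with an `if` clause filters a list. The temperature readings below are made up for illustration:

```python
temperatures = [18, 21, 25, 30]  # hypothetical readings

all_positive = all(t > 0 for t in temperatures)   # True only if every test passes
any_hot = any(t >= 30 for t in temperatures)      # True if at least one test passes
warm = [t for t in temperatures if 20 <= t < 30]  # keep values in the range [20, 30)

print(all_positive, any_hot, warm)
```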

In [None]:
# Nested conditions
age = 20
gender = "female"
if age >= 18:
    if gender == "female":
        print("Adult woman")
    else:
        print("Adult man")
else:
    print("Minor")

# Using `for` loops with `if` conditions
numbers = [1, 2, 3, 4, 5, 6]
even_numbers = []
for num in numbers:
    if num % 2 == 0: # modulus operator (the remainder of a division)
        even_numbers.append(num)
print("Even numbers:", even_numbers)
# Your turn: Use a loop to filter out odd numbers from a list and print the result.

# List comprehension with conditionals
squared_evens = [num**2 for num in numbers if num % 2 == 0]
print("Squared even numbers:", squared_evens)
# Your turn: Create a list of squares for numbers 1 through 10, but only include those that are divisible by 3.
# Your turn: Write a function that checks if all elements in a list are positive using all().

# Complex expressions using all() and any() using tic-tac-toe example
board = ['X', 'X', 'X', 'O', '', 'O', '', '', ''] # state of the board read from left to right and down
wins = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]  # a list of tuples with indices giving a win if 3 equal symbols
mark = "X"
winner = any(all(board[i] == mark for i in win) for win in wins) # check if X wins
print("Player X wins:", winner)
# Your turn: Add the check to a function check_win with input board and mark
def check_win(board, mark):
    return any(all(board[i] == mark for i in win) for win in wins)
# Your turn: Check if O wins
# Your turn: Modify the check_win function to print each win combination as it checks

## Classes in Python

A class is a blueprint for creating objects. In Python classes, `self` is a convention used as the first parameter in instance methods. It refers to the instance of the class itself.
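
To see what `self` does, consider a small made-up class `Counter` (not part of the course material). Each object keeps its own `count`, and calling `a.increment()` is shorthand for `Counter.increment(a)`:

```python
class Counter:
    """Hypothetical class used only to illustrate `self`."""

    def __init__(self, start=0):
        self.count = start  # stored on this particular instance

    def increment(self):
        self.count += 1     # `self` is the object the method was called on
        return self.count

a = Counter()
b = Counter(10)
a.increment()
a.increment()
b.increment()
Counter.increment(a)  # same as a.increment()

print(a.count, b.count)
```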

In [None]:
class Person:
    def __init__(self, name, age = 34):
        self.name = name
        self.age = age
        self.info = None

    def say_hello(self):
        print(f'Hello my name is {self.name} :-)')

# Test class
person1 = Person("Alice", 25)
person2 = Person("Bob")
person1.say_hello()
person2.say_hello()

# Your turn: Select the class text, right click and choose Explain code.
# Your turn: Add a function teen that returns True if the person is between 13 and 19 otherwise False.
# Your turn: Add a function set_info that takes input join (default False) and str.
#  If join = False then set info to str, else set info to a string with name, age and str.



## The `pandas` and `dfply` Packages

Pandas is a powerful Python library for data analysis and manipulation. It provides data structures like DataFrame and Series.
dfply is a library that mimics R's dplyr syntax to transform pandas DataFrames using a pipeline style.
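
The dfply verbs have counterparts in plain pandas. As a rough sketch, assuming a small made-up DataFrame `toy` (not the diamonds dataset), boolean indexing plays the role of `mask`, and `groupby` followed by `agg` plays the role of `group_by` followed by `summarize`:

```python
import pandas as pd

# Hypothetical toy data; not the diamonds dataset
toy = pd.DataFrame({
    'cut': ['Ideal', 'Good', 'Ideal', 'Good'],
    'price': [300, 200, 500, 400],
})

# Keep only the rows where the condition holds (similar to mask)
expensive = toy[toy['price'] > 250]

# Average price per cut (similar to group_by followed by summarize)
avg_price = toy.groupby('cut', as_index=False).agg(avg_price=('price', 'mean'))

print(expensive)
print(avg_price)
```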

In [None]:
# Importing libraries
import pandas as pd
from dfply import *

print("Original DataFrame:")
display(diamonds) # The 'diamonds' dataset is bundled with dfply and made available by the import above
print("First 3 rows:")
df = diamonds >> head(3) # note the pipe operator >> takes the dataframe before as input to the function after
display(df)
print("Selected columns:")
df = diamonds >> select(X.carat, X.price) >> head(5)
display(df)
print("Diamonds with price > 18000:")
df = diamonds >> mask(X.price > 18000)
display(df)
print("Diamonds with cut equal Ideal:")
# Your turn: Filter so that only Ideal diamonds are kept
print("Add new column:")
df = diamonds >> mutate(price_str = X.price.astype(str) + ' dollars')
display(df)
print("Volume:")
# Your turn: Add column volume = x * y * z
print("Average price:")
df = diamonds >> group_by(X.cut) >> summarize(avg_price = X.price.mean())
display(df)
print("Given cut and clarity, the maximum, average and minimum depth is:")
# Your turn: Consider the first 50 rows. Given cut and clarity, calculate the maximum, average and minimum depth

## The `plotnine` Package


`plotnine` is a grammar of graphics library similar to `ggplot2` in R.

In [None]:
from plotnine import *
from vega_datasets import local_data

anscombes_quartet = local_data.anscombe()
display(anscombes_quartet >> head(2))

pt = (
    ggplot(anscombes_quartet, aes("X", "Y", fill="Series"))
    + geom_point()
    + labs(title = "Points")
)
pt.show()
# Your turn: Use geom_col instead of geom_point and let fill be based on Series

pt = (
    ggplot(anscombes_quartet, aes("X", "Y", color="Series"))
    + geom_point()
    + geom_smooth(method="lm", se=False, fullrange=True)
    + theme(legend_position='bottom')
    + labs(title = "Points and linear fit")
)
pt.show()

pt = (
    ggplot(anscombes_quartet, aes("X", "Y", color="Series"))
    + geom_point()
    + geom_smooth(method="lm", se=False, fullrange=True)
    + facet_wrap("~Series")
    + theme(legend_position='bottom')
    + labs(title = "Subplots")
)
pt.show()

pt = (
    ggplot(anscombes_quartet, aes("X", "Y", color="Series"))
    + geom_point()
    + geom_line()
    + geom_smooth(method="lm", se=False, fullrange=True)
    + facet_wrap("~Series")
    + theme(legend_position='bottom')
    + labs(title = "Subplots with points connected")
)
pt.show()