## Code-alongs - Python fundamentals 2

## Error handling

- syntax errors
- runtime errors
- logical errors

In [1]:
# misspelled function name: Python reports this as a NameError when the cell runs, not as a SyntaxError

prin("hej")

NameError: name 'prin' is not defined

In [2]:
print("hej")

hej
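
A syntax error is reported while Python parses the code, before anything runs, whereas a misspelled name such as `prin` only fails once the line executes. A minimal sketch of that difference, using the built-in `compile` and `exec`; the helper name `classify_error` is made up for this example:

```python
def classify_error(source):
    """Return the name of the exception a snippet raises, or None if it runs fine."""
    try:
        code = compile(source, "<example>", "exec")  # parsing happens here
    except SyntaxError:
        return "SyntaxError"
    try:
        exec(code, {})  # running happens here
    except Exception as err:
        return type(err).__name__
    return None

print(classify_error('print("hej"'))  # missing closing parenthesis
print(classify_error('prin("hej")'))  # misspelled function name
```

The first call should report a `SyntaxError` and the second a `NameError`.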


In [3]:
# runtime errors are discovered when you run the cell
numbers = list(range(19))
numbers[19]

IndexError: list index out of range

In [6]:
import numpy as np

radius = 10
# logical error, wrong formula used
area_circle = radius * np.pi
# correct formula is np.pi * radius**2 
print(f"{area_circle = :.2f} a.u.")

area_circle = 31.42 a.u.
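
Logical errors raise no exception, so one way to catch them is to check the result against values that are known in advance. A minimal sketch, assuming the standard library `math` module instead of numpy to keep it self-contained; the function name `area_circle` and the chosen check values are illustrative:

```python
import math

def area_circle(radius):
    """Area of a circle: pi * r^2."""
    return math.pi * radius**2

# known cases: radius 1 gives pi, radius 2 gives 4*pi
assert math.isclose(area_circle(1), math.pi)
assert math.isclose(area_circle(2), 4 * math.pi)

print(f"{area_circle(10) = :.2f} a.u.")
```

With the wrong formula `radius * math.pi`, the second assertion would fail and point directly at the mistake.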


In [7]:
# input always returns a string, even when the user types a number
age = input("Enter your age")
age

'-42'

In [14]:
# type casting might give a ValueError
# to stop the program from crashing we use try-except
while True:
    try:
        age = int(input("Enter your age"))
        if not 0 <= age <= 125:
            raise ValueError(f"Age must be between 0 and 125 not {age}")
        break
    except ValueError as err:
        print(err)

age

Age must be between 0 and 125 not -5


6
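
The validation in the loop above can also be moved into a function so that it can be tried out without typing into `input`. A minimal sketch; the name `parse_age` and the parameters `lowest` and `highest` are assumptions made for this example:

```python
def parse_age(text, lowest=0, highest=125):
    """Convert text to an int and check that it is a plausible age."""
    age = int(text)  # raises ValueError for text such as "abc"
    if not lowest <= age <= highest:
        raise ValueError(f"Age must be between {lowest} and {highest} not {age}")
    return age

for text in ["abc", "-5", "6"]:
    try:
        print(parse_age(text))
    except ValueError as err:
        print(err)
```

Both kinds of failure, a non-numeric string and a number out of range, end up in the same `except ValueError` branch.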

## Functions

- avoid spaghetti code
- change code in one place instead of in several places at once
- DRY - don't repeat yourself
- organize code
- make code modular
- break down complex programs



In [19]:
def smallest_of_two(number1, number2):
    return number1 if number1 < number2 else number2 # oneline if-else

# they are called parameters when we define a function, but arguments when we call it

# positional arguments: number1=2, number2=-5
smallest_of_two(2, -5)


-5

In [20]:
# keyword arguments
smallest_of_two(number1=2, number2=-5)

-5

In [21]:
# positional arguments may come first, followed by keyword arguments
smallest_of_two(-5, number2=9)

-5
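
Two consequences of these rules, as a small sketch that repeats the function definition so it runs on its own: keyword arguments can be given in any order, and a parameter that has already been filled positionally cannot be given again by keyword.

```python
def smallest_of_two(number1, number2):
    return number1 if number1 < number2 else number2

# keyword arguments can be given in any order
print(smallest_of_two(number2=9, number1=-5))

# 9 is bound to number1 by position, so number1=-5 assigns it a second time
try:
    smallest_of_two(9, number1=-5)
except TypeError as err:
    print(err)
```

The second call raises a `TypeError` rather than silently picking one of the two values.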

## Default value

In [2]:
# x o o o o
# x x o o o
# x x x o o
# x x x x o
# x x x x x 
# operator overloading (polymorphism): an operator behaves differently depending on which data it works with
def draw_ascii_pattern(number_rows = 5):
    print(number_rows*"o")

# note default value: number_rows = 5
draw_ascii_pattern()
draw_ascii_pattern(3)

ooooo
ooo


In [10]:
def draw_ascii_pattern(number_rows = 5):
    for i in range(number_rows):
        print(f'{(i+1)*"x " + (number_rows-i-1)*"o "}')

draw_ascii_pattern()
draw_ascii_pattern(3)


x o o o o 
x x o o o 
x x x o o 
x x x x o 
x x x x x 
x o o 
x x o 
x x x 
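
A function can have several default values, and returning the pattern as a string instead of printing it makes the result easy to check. A minimal sketch under those assumptions; the name `ascii_pattern` and the parameters `filled` and `empty` are made up for this example:

```python
def ascii_pattern(number_rows=5, filled="x", empty="o"):
    """Build the triangle pattern and return it as one string."""
    rows = []
    for i in range(number_rows):
        # list repetition: (i + 1) filled symbols, then the remaining empty ones
        symbols = (i + 1) * [filled] + (number_rows - i - 1) * [empty]
        rows.append(" ".join(symbols))
    return "\n".join(rows)

print(ascii_pattern(3))
print(ascii_pattern(2, filled="#", empty="."))
```

Any parameter that is left out falls back to its default, so `ascii_pattern()` gives the same five-row pattern as before, without the trailing space on each row.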


## *args
An arbitrary number of positional arguments.

In [11]:
def mean_(*args):
    print(args)

# args is a tuple: immutable, and order matters!
# since args is a tuple we can use it like any other tuple
mean_(1,2,3,4)

(1, 2, 3, 4)


In [12]:
mean_(1,2)

(1, 2)


In [13]:
def mean_(*args): # trailing underscore avoids clashing with existing functions named mean, e.g. statistics.mean or np.mean
    sum_= 0

    for arg in args:
        sum_+= arg
    return sum_/len(args)

mean_(1,2,3)

2.0
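
Calling `mean_()` without arguments would divide by zero, because `len(args)` is 0. A minimal sketch of one way to guard against that, using the built-in `sum`; raising `ValueError` is a design choice made for this example, not the only option:

```python
def mean_(*args):
    """Arithmetic mean of the positional arguments."""
    if not args:
        raise ValueError("mean_ needs at least one number")
    return sum(args) / len(args)

numbers = [1, 2, 3]
print(mean_(*numbers))  # the * unpacks the list into positional arguments
```

The `*` in the call is the mirror image of `*args` in the definition: it spreads a sequence out into separate arguments.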

## **kwargs
An arbitrary number of keyword arguments, collected into a dictionary.


In [16]:
def print_kwargs(**options):
    print(options)
    print(f"{options.keys() = }")
    print(f"{options.values() = }")

print_kwargs(a = 1, is_active = True, age = 33)

{'a': 1, 'is_active': True, 'age': 33}
options.keys() = dict_keys(['a', 'is_active', 'age'])
options.values() = dict_values([1, True, 33])
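
Ordinary parameters, `*args` and `**kwargs` can be combined in one function, in that order, and an existing dictionary can be unpacked into keyword arguments with `**`. A minimal sketch; the function `describe` and the dictionary `settings` are made up for this example:

```python
def describe(name, *args, **kwargs):
    """Join the name, the extra positional values and the key=value pairs."""
    parts = [name]
    parts += [str(arg) for arg in args]
    parts += [f"{key}={value}" for key, value in kwargs.items()]
    return ", ".join(parts)

settings = {"is_active": True, "age": 33}
print(describe("user", 1, 2, **settings))  # ** unpacks the dict into keyword arguments
```

Here `1` and `2` end up in the tuple `args`, while the entries of `settings` end up in the dictionary `kwargs`.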


## File handling

In [28]:
import re 

with open("data/ml_text_raw.txt", 'r') as file:
    raw_text = file.read()

text_fixed_spacing = re.sub(r"\s+"," ",raw_text)

text_fixed_spacing.split(". ")

['SUperViseD LEARNinG IS a PaRt oF MaCHinE LEARniNG, wheRE aLgORithms LEARn FRom a tRAINIng DaTa Set',
 'THese aLgORithms TRY TO MaKE SeNSe Of ThE DaTa BY MaTChiNG INpUtS TO CoRResPonDInG OutpUTs',
 'In suPERviseD LEARNing, EACH DaTa PoINt in ThE tRAINIng Set IS LaBELEd WiTH ThE CoRReCT OutpUT, WHich aLLOWS thE ALgORithM To LEARn FRom ThE ExAMPles',
 'THis alLOWS thE ALgORithM To MaKe PREDIcTions On UnSEEN DaTa, BaSED On ITs TRaiNIng',
 'iT IS USEd FoR taSKS SuCH AS CLaSSIFICaTion, WheRE ThE GoAL IS To aSSIGn a LaBEL To InpUt DaTa, anD REGrESsIoN, WheRE ThE GoAL IS To PREDIcT a CoNtINuoUS OutpUT VaRIabLE',
 'SuPERviseD LEARNing HaS MaNY APPLIcatIoNS In ArEas LIke Image ReCOGNitiON, NatuRaL LaNGuaGE PRoCESSINg, anD FiNaNCiaL FoRECasting.']

In [32]:
sentences = [text.strip().capitalize() for text in text_fixed_spacing.split(".")]
sentences = sentences[:-1]
sentences

['Supervised learning is a part of machine learning, where algorithms learn from a training data set',
 'These algorithms try to make sense of the data by matching inputs to corresponding outputs',
 'In supervised learning, each data point in the training set is labeled with the correct output, which allows the algorithm to learn from the examples',
 'This allows the algorithm to make predictions on unseen data, based on its training',
 'It is used for tasks such as classification, where the goal is to assign a label to input data, and regression, where the goal is to predict a continuous output variable',
 'Supervised learning has many applications in areas like image recognition, natural language processing, and financial forecasting']

In [34]:
cleaned_text = ".\n".join(sentences) + "."  # join only puts periods between sentences, so add the final one
print(cleaned_text)

Supervised learning is a part of machine learning, where algorithms learn from a training data set.
These algorithms try to make sense of the data by matching inputs to corresponding outputs.
In supervised learning, each data point in the training set is labeled with the correct output, which allows the algorithm to learn from the examples.
This allows the algorithm to make predictions on unseen data, based on its training.
It is used for tasks such as classification, where the goal is to assign a label to input data, and regression, where the goal is to predict a continuous output variable.
Supervised learning has many applications in areas like image recognition, natural language processing, and financial forecasting.


In [35]:
with open("data/cleaned_ml_text.txt", "w") as file:
    file.write(cleaned_text)
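
The cleaning steps above can be collected into one function and tried on a small sample, writing to a temporary folder so no real data file is touched. A minimal sketch, assuming `pathlib` and `tempfile` from the standard library; the function name `clean_text`, the sample string and the file name `cleaned.txt` are made up for this example:

```python
import re
import tempfile
from pathlib import Path

def clean_text(raw_text):
    """Collapse whitespace, fix capitalization and put one sentence per line."""
    text = re.sub(r"\s+", " ", raw_text)
    sentences = [part.strip().capitalize() for part in text.split(".")]
    sentences = [sentence for sentence in sentences if sentence]  # drop empty pieces
    if not sentences:
        return ""
    # splitting on "." removes the periods, so they are added back here
    return ".\n".join(sentences) + "."

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "cleaned.txt"
    path.write_text(clean_text("fIRst   senTENce.\n\n seCOnd\tone. "), encoding="utf-8")
    print(path.read_text(encoding="utf-8"))
```

Passing `encoding="utf-8"` explicitly makes the result independent of the platform's default encoding.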
