## Code alongs - fundamentals part 2

## Error handling

- syntax errors
- runtime errors
- logical errors

In [1]:
# misspelled name -> NameError, which is raised at runtime (not a syntax error)
prin("hej")

NameError: name 'prin' is not defined
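
The misspelled `prin` is only detected when the line runs, which is why Python reports a `NameError`. A syntax error is detected earlier, when the code is parsed. A minimal sketch of the difference, assuming we parse a source string with the standard library `ast` module instead of typing the broken code directly (the names `source` and `is_valid` are only for illustration):

```python
import ast

# missing closing parenthesis -> cannot be parsed at all
source = 'print("hej"'

try:
    ast.parse(source)
    is_valid = True
except SyntaxError as err:
    is_valid = False
    # the exact message differs between Python versions
    print(f"SyntaxError: {err.msg}")
```

Nothing in `source` is executed here. Parsing fails before any code runs, which is the difference from the `NameError` above.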

In [2]:
print("hej")

hej


In [4]:
## runtime error
numbers = list(range(19))
numbers[19]

IndexError: list index out of range

In [7]:
import numpy as np

radius = 10
# logical error: the code runs, but the area of a circle is np.pi*radius**2
area_circle = radius * np.pi
print(f"{area_circle = :.2f} a.u.")

area_circle = 31.42 a.u.
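
For comparison, a sketch of the same calculation with the correct formula. It uses `math.pi` from the standard library instead of `np.pi` only to keep the example free of dependencies; both hold the same value.

```python
import math

radius = 10
# area of a circle: pi * r^2
area_circle = math.pi * radius**2
print(f"{area_circle = :.2f} a.u.")  # area_circle = 314.16 a.u.
```

Python raises no error in either version, so a logical error like this is only found by checking the result against what you expect.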


## Handle errors

try - except 

In [9]:
age = input("Enter your age")
age


'-42'

In [14]:
while True:
    try:
        # type casting might give ValueError
        age = int(input("Enter your age"))
        if not 0 <= age <= 125:
            raise ValueError(f"Age must be between 0 and 125 not {age}")
        break
    except ValueError as err:
        print(err)

age

Age must be between 0 and 125 not -5


0
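
The loop above can also be written with an `else` branch, which runs only when the `try` block raised nothing. A sketch under the assumption that we replace `input()` with a list of simulated answers (`simulated_inputs` is made up for illustration) so the example runs without typing:

```python
# hypothetical user answers, used instead of input()
simulated_inputs = ["abc", "-5", "42"]
messages = []
age = None

for raw in simulated_inputs:
    try:
        # type casting might give ValueError
        candidate = int(raw)
        if not 0 <= candidate <= 125:
            raise ValueError(f"Age must be between 0 and 125 not {candidate}")
    except ValueError as err:
        messages.append(str(err))
    else:
        # only reached when no exception was raised
        age = candidate
        break

print(messages)
print(age)
```

Both the failed type cast (`"abc"`) and the out-of-range value (`"-5"`) end up in the same `except ValueError` branch, and the loop stops at the first valid answer.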

## Functions

- avoid spaghetti code
- change code in one place instead of many
- DRY - Don't Repeat Yourself
- organize code
- make code modular
- break down complex programs 

In [16]:
# number1 and number2 are parameters
def smallest_of_two(number1, number2):
    return number1 if number1 < number2 else number2 # one-line if-else (conditional expression)

# positional arguments
smallest_of_two(2, -5)

-5

In [19]:
# keyword arguments
smallest_of_two(number1=-5, number2=-5)

-5

In [21]:
smallest_of_two(-5, number2=-9)

-9
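
Positional and keyword arguments can be mixed as above, but a parameter may only receive one value. A small sketch of what happens otherwise; the function is repeated so the example is self-contained, and `error_message` is just an illustrative name:

```python
def smallest_of_two(number1, number2):
    return number1 if number1 < number2 else number2

try:
    # -5 is already bound to number1 by position, so number1=-9 is a duplicate
    smallest_of_two(-5, number1=-9)
except TypeError as err:
    error_message = str(err)
    print(error_message)
```

The other direction, a positional argument after a keyword argument such as `smallest_of_two(number1=-5, -9)`, is rejected as a syntax error before the call is made.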

**default value**

In [3]:
# x o o o o
# x x o o o
# x x x o o
# x x x x o
# x x x x x 
def draw_ascii_pattern(number_rows = 5):
    print(number_rows*"o")

# note default value: number_rows = 5 
draw_ascii_pattern()

draw_ascii_pattern(3)

ooooo
ooo


In [11]:
def draw_ascii_pattern(number_rows=5):
    # start at 1 so the first row has one x and the last row is all x
    for i in range(1, number_rows + 1):
        print(i * "x " + (number_rows - i) * "o ")


draw_ascii_pattern()

x o o o o 
x x o o o 
x x x o o 
x x x x o 
x x x x x 


In [13]:
draw_ascii_pattern(3)

x o o 
x x o 
x x x 


\*args

arbitrary number of positional arguments 

In [15]:
def mean_(*args):
    print(args)

# it prints out a tuple 
mean_(1,2,3,4)

(1, 2, 3, 4)


In [16]:
mean_(1,2)

(1, 2)


In [19]:
def mean_(*args):
    sum_ = 0

    for arg in args:
        sum_ += arg
    return sum_/len(args)

mean_(1,2,3)

2.0
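
Two things are worth trying with `mean_`: calling it with no arguments, and passing numbers that already sit in a list. A sketch of one possible variant, where the guard against empty input and the built-in `sum` are additions that are not part of the code-along:

```python
def mean_(*args):
    # without this check, mean_() would raise ZeroDivisionError
    if not args:
        raise ValueError("mean_() needs at least one number")
    return sum(args) / len(args)

numbers = [1, 2, 3]
# the * unpacks the list into separate positional arguments
print(mean_(*numbers))  # 2.0
```

Writing `mean_(numbers)` without the star would pass the whole list as one argument, and the sum would fail with a `TypeError`.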

\*\*kwargs

arbitrary number of keyword arguments

In [22]:
def print_kwargs(**options):
    print(options)
    print(f"{options.keys() = }")
    print(f"{options.values() = }")


print_kwargs(a = 5, is_active = True, age = 33)

{'a': 5, 'is_active': True, 'age': 33}
options.keys() = dict_keys(['a', 'is_active', 'age'])
options.values() = dict_values([5, True, 33])
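
`*args` and `**kwargs` can be combined in one function, and a dictionary can be unpacked into keyword arguments with `**` at the call site. A minimal sketch, where `describe` and `settings` are made-up names for illustration:

```python
def describe(*args, **kwargs):
    # args is a tuple, kwargs is a dict
    return f"{len(args)} positional, keywords: {sorted(kwargs)}"

settings = {"is_active": True, "age": 33}
# ** unpacks the dict into is_active=True, age=33
summary = describe(1, 2, **settings)
print(summary)  # 2 positional, keywords: ['age', 'is_active']
```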


## File handling

In [27]:
import re 

with open("data/ml_text_raw.txt", 'r') as file:
    raw_text = file.read()


text_fixed_spacing = re.sub(r"\s+", " ", raw_text)
text_fixed_spacing

'SUperViseD LEARNinG IS a PaRt oF MaCHinE LEARniNG, wheRE aLgORithms LEARn FRom a tRAINIng DaTa Set. THese aLgORithms TRY TO MaKE SeNSe Of ThE DaTa BY MaTChiNG INpUtS TO CoRResPonDInG OutpUTs. In suPERviseD LEARNing, EACH DaTa PoINt in ThE tRAINIng Set IS LaBELEd WiTH ThE CoRReCT OutpUT, WHich aLLOWS thE ALgORithM To LEARn FRom ThE ExAMPles. THis alLOWS thE ALgORithM To MaKe PREDIcTions On UnSEEN DaTa, BaSED On ITs TRaiNIng. iT IS USEd FoR taSKS SuCH AS CLaSSIFICaTion, WheRE ThE GoAL IS To aSSIGn a LaBEL To InpUt DaTa, anD REGrESsIoN, WheRE ThE GoAL IS To PREDIcT a CoNtINuoUS OutpUT VaRIabLE. SuPERviseD LEARNing HaS MaNY APPLIcatIoNS In ArEas LIke Image ReCOGNitiON, NatuRaL LaNGuaGE PRoCESSINg, anD FiNaNCiaL FoRECasting.'

In [30]:
text_fixed_spacing.split(". ")

['SUperViseD LEARNinG IS a PaRt oF MaCHinE LEARniNG, wheRE aLgORithms LEARn FRom a tRAINIng DaTa Set',
 'THese aLgORithms TRY TO MaKE SeNSe Of ThE DaTa BY MaTChiNG INpUtS TO CoRResPonDInG OutpUTs',
 'In suPERviseD LEARNing, EACH DaTa PoINt in ThE tRAINIng Set IS LaBELEd WiTH ThE CoRReCT OutpUT, WHich aLLOWS thE ALgORithM To LEARn FRom ThE ExAMPles',
 'THis alLOWS thE ALgORithM To MaKe PREDIcTions On UnSEEN DaTa, BaSED On ITs TRaiNIng',
 'iT IS USEd FoR taSKS SuCH AS CLaSSIFICaTion, WheRE ThE GoAL IS To aSSIGn a LaBEL To InpUt DaTa, anD REGrESsIoN, WheRE ThE GoAL IS To PREDIcT a CoNtINuoUS OutpUT VaRIabLE',
 'SuPERviseD LEARNing HaS MaNY APPLIcatIoNS In ArEas LIke Image ReCOGNitiON, NatuRaL LaNGuaGE PRoCESSINg, anD FiNaNCiaL FoRECasting.']

In [37]:
sentences = [text.strip().capitalize() for text in text_fixed_spacing.split(".")]
sentences = sentences[:-1]
sentences

['Supervised learning is a part of machine learning, where algorithms learn from a training data set',
 'These algorithms try to make sense of the data by matching inputs to corresponding outputs',
 'In supervised learning, each data point in the training set is labeled with the correct output, which allows the algorithm to learn from the examples',
 'This allows the algorithm to make predictions on unseen data, based on its training',
 'It is used for tasks such as classification, where the goal is to assign a label to input data, and regression, where the goal is to predict a continuous output variable',
 'Supervised learning has many applications in areas like image recognition, natural language processing, and financial forecasting']

In [45]:
# join puts the separator only between sentences, so add the final period
cleaned_text = ".\n\n".join(sentences) + "."
print(cleaned_text)

Supervised learning is a part of machine learning, where algorithms learn from a training data set.

These algorithms try to make sense of the data by matching inputs to corresponding outputs.

In supervised learning, each data point in the training set is labeled with the correct output, which allows the algorithm to learn from the examples.

This allows the algorithm to make predictions on unseen data, based on its training.

It is used for tasks such as classification, where the goal is to assign a label to input data, and regression, where the goal is to predict a continuous output variable.

Supervised learning has many applications in areas like image recognition, natural language processing, and financial forecasting.


In [55]:
with open("data/cleaned_ml_text.txt", "w") as file:
    file.write(cleaned_text)
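
The cleaning steps above can be collected into one function, in line with the DRY idea from the functions section. This is a sketch, not the code-along's own solution: `clean_text` and `sample` are made-up names, a short inline string stands in for the data file, and the function appends a final period, which a plain `join` leaves out.

```python
import re

def clean_text(raw_text):
    # collapse all runs of whitespace (spaces, tabs, newlines) into one space
    text = re.sub(r"\s+", " ", raw_text).strip()
    # split into sentences, skip empty pieces, fix the casing
    sentences = [s.strip().capitalize() for s in text.split(".") if s.strip()]
    # one sentence per paragraph, each ending with a period
    return ".\n\n".join(sentences) + "."

sample = "SUperViseD   LEARNinG IS\n a PaRt oF MaCHinE LEARniNG.  THese aLgORithms TRY."
cleaned_sample = clean_text(sample)
print(cleaned_sample)
```

Note that `capitalize()` lowercases everything after the first character, so proper nouns and abbreviations inside a sentence lose their capitals.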