<div style="max-width:66ch;">


# Lecture notes - Python summary part 1

This lecture note is a **Python summary**. It covers

- output
- data types
- collections (list, set, tuple, dict)
- strings
- if-statements
- for-statements
- while-statements

Note that this is only an introduction to the subject; you are encouraged to read further. We cover just enough of the fundamentals to be useful for this course and the next.

---

</div>


<div style="max-width:66ch;">

## Data types

Python is a <b>dynamically typed</b> language: data types are determined at <b>runtime</b>, not at compile time. This means that you don't declare the data type when defining a variable, in contrast to statically typed languages such as C#, Java and C++.
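
As a minimal sketch of what dynamic typing means in practice, the same name can be rebound to values of different types. The variable names below are only illustrative and do not come from the lecture.

```python
# The type belongs to the value, not to the variable name,
# so the same name can refer to an int first and a str later.
value = 42
type_before = type(value)  # int at this point

value = "forty-two"
type_after = type(value)  # str after rebinding

print(type_before, type_after)
```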


<table border="1" style="text-transform: lowercase; display:inline-block; text-align:left;">
    <tr style="background-color: #174A7E; color: white;">
        <th>Data Type</th>
        <th>Examples</th>
        <th>Comment</th>
    </tr>
    <tr>
        <td>int</td>
        <td>42, -3, 0</td>
        <td></td>
    </tr>
    <tr>
        <td>float</td>
        <td>3.14, -0.5, 2.0</td>
        <td></td>
    </tr>
    <tr>
        <td>str</td>
        <td>'Hello', "Python", '123'</td>
        <td>
            Store Text
            <ul>
                <li>Immutable</li>
                <li>Indexing</li>
                <li>Slicing</li>
                <li>Concatenation</li>
                <li>String Formatting</li>
                <li>Length</li>
                <li>String Methods</li>
                <li>Iteration</li>
                <li>String Comparison</li>
                <li>String Conversion</li>
            </ul>
        </td>
    </tr>
    <tr>
        <td>list</td>
        <td>[1, 2, 3]<br>['apple', 'banana', 'cherry']</td>
        <td>storing elements, can have different types
            <ul>
                <li>index, slice, append, insert, delete </li>
                <li>sorting, reversing </li>
                <li>check length, iterate over</li>
                <li>list comprehension</li>
                <li>mutable</li>
            </ul>
        </td>
    </tr>
    <tr>
        <td>tuple</td>
        <td>(1, 2, 3)<br>('apple', 'banana', 'cherry')</td>
        <td>
            storing elements, can have different data types
            <ul>
                <li>immutable</li>
                <li>index, slice</li>
                <li>iteration</li>
            </ul>
        </td>
    </tr>
    <tr>
        <td>dict</td>
        <td>{'name': 'John', 'age': 30}<br>{'fruit': 'apple', 'color': 'red'}</td>
        <td>
            store key-value pairs
            <ul>
                <li>Fast Lookup</li>
                <li>Mutable</li>
                <li>Keeps insertion order (Python 3.7+)</li>
                <li>Key-Based Access</li>
                <li>Iteration</li>
                <li>Checking for Key Existence</li>
                <li>Dictionary Comprehensions</li>
            </ul>
        </td>
    </tr>
    <tr>
        <td>bool</td>
        <td>True, False</td>
        <td>represent binary logic</td>
    </tr>
    <tr>
        <td>set</td>
        <td>{1,"a", True, 3.3, 3.2}</td>
        <td>store unique elements
            <ul>
                <li>Duplicate Removal</li>
                <li>Set Operations</li>
                <li>Membership Testing</li>
                <li>Iteration</li>
                <li>Mathematical Modeling</li>
                <li>Removing Items</li>
                <li>Frozensets</li>
                <li>Deduplicate Lists</li>
                <li>Checking for Subsets</li>
            </ul> 
        </td>
    </tr>
</table>

</div>


In [1]:
name = "Kokchun"
age = 32.8
number_of_children = 1
loves_math = True

"""
this gives a tuple as output
note that the value of the last expression in a Jupyter notebook cell is displayed automatically,
this is not the case for Python scripts
"""

name, age, number_of_children, loves_math

('Kokchun', 32.8, 1, True)

### f-strings


In [7]:
# type(variable) is read out as "type of variable"
# here we use f-strings or formatted strings to incorporate strings and variables
print(f"type(name) is {type(name)}")
print(f"type(age) is {type(age)}")
print(f"type(number_of_children) is {type(number_of_children)}")
print(f"type(loves_math) is {type(loves_math)}")

type(name) is <class 'str'>
type(age) is <class 'float'>
type(number_of_children) is <class 'int'>
type(loves_math) is <class 'bool'>
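
The f-string features used throughout these notes can be sketched as follows. The `accuracy` value is a made-up number for illustration, and the `=` specifier assumes Python 3.8 or later.

```python
accuracy = 0.87654  # hypothetical value, not from the lecture

plain = f"accuracy is {accuracy}"  # value inserted as-is
rounded = f"accuracy is {accuracy:.2f}"  # format spec: two decimals
debug = f"{accuracy = }"  # expression text followed by its repr
percent = f"{accuracy:.1%}"  # shown as a percentage with one decimal

for text in (plain, rounded, debug, percent):
    print(text)
```

The `{expression = }` form is handy while exploring data, since it labels each printed value with the expression that produced it.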


<div style="max-width:66ch;">

---
### Collection types

- dictionary
- list
- tuple
- set

</div>

#### list

- list creation
- append to a list
- access elements
- list comprehension


In [16]:
interests = ["badminton", "yoga", "math", "programming"]
# can also have elements with different data types
person = ["Kokchun", 32.8, 1, True]

person

['Kokchun', 32.8, 1, True]

In [17]:
# we have mutated the list by appending interests list to it
person.append(interests)
person

['Kokchun', 32.8, 1, True, ['badminton', 'yoga', 'math', 'programming']]

**Accessing elements in a list**


In [24]:
# access elements, note that indexing starts at 0
print(f"{person[0] = }")
print(f"{person[2] = }")
print(f"{person[-1] = }")  # negative indices count from the end
print(f"{person[-1][-2] = }")  # second-to-last element of the nested list

person[0] = 'Kokchun'
person[2] = 1
person[-1] = ['badminton', 'yoga', 'math', 'programming']
person[-1][-2] = 'math'


**Slicing a list**


In [42]:
print("original list")
print(f"{person = }\n")

print("slice from 0 to 2 (exclusive)")
print(f"{person[:2] = }\n")

print("every second element")
print(f"{person[0::2] = }\n")

print("reverse")
# a step of -1 traverses the whole list from the last element to the first
print(f"{person[::-1] = }")

original list
person = ['Kokchun', 32.8, 1, True, ['badminton', 'yoga', 'math', 'programming']]

slice from 0 to 2 (exclusive)
person[:2] = ['Kokchun', 32.8]

every second element
person[0::2] = ['Kokchun', 1, ['badminton', 'yoga', 'math', 'programming']]

reverse
person[::-1] = [['badminton', 'yoga', 'math', 'programming'], True, 1, 32.8, 'Kokchun']


<div style="max-width:66ch;">

**Iterating over lists**

Use a for statement when

- we know the number of iterations in advance
- we want to traverse each element in a collection

Use a while statement when

- we don't know the number of iterations in advance
- we have a condition that tells us when to stop; colloquially it can be seen as an "if-loop"

</div>

In [67]:
numbers = []
for i in range(10):
    numbers.append(i)

print(f"{numbers = }")

squares = []
# looping through a list with for
for number in numbers:
    squares.append(number**2)

print(f"{squares = }")

# cube as long as the result is lower than 500
i, cubes = 0, [0]  # multiple assignment
while (i + 1) ** 3 < 500:  # to avoid off-by-one error
    i += 1
    cubes.append(i**3)

print(f"{cubes = }")

print(f"{len(cubes)} cubes before we go over 500")

numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
squares = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
cubes = [0, 1, 8, 27, 64, 125, 216, 343]
8 cubes before we go over 500


**list comprehensions for list creations**


In [69]:
numbers = [number for number in range(10)]
squares = [number**2 for number in range(10)]

# if-statement in list comprehension
cubes = [number**3 for number in range(10) if number ** 3 < 500]

print(f"{numbers = }")
print(f"{squares = }")
print(f"{cubes = }")

numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
squares = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
cubes = [0, 1, 8, 27, 64, 125, 216, 343]
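
One difference from the while loop is that the comprehension above tests every number in `range(10)`, whereas the while loop stops at the first cube that reaches 500. A possible way to get the early stop without choosing an upper bound is `itertools.takewhile`; this is a sketch of an alternative and not something the lecture relies on.

```python
from itertools import count, takewhile

# count(0) yields 0, 1, 2, ... with no upper bound;
# takewhile stops consuming as soon as the condition is False.
all_cubes = (number**3 for number in count(0))
cubes = list(takewhile(lambda cube: cube < 500, all_cubes))

print(f"{cubes = }")
```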


#### tuple - an immutable collection

- useful when you don't want the collection to change


In [114]:
person = ("Kokchun", 32.8, 1, True)
person

('Kokchun', 32.8, 1, True)

In [117]:
# doesn't support item assignment
person[2] = 12

TypeError: 'tuple' object does not support item assignment
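
A short sketch of how to work with a tuple despite its immutability: unpack it into variables, catch the error from an attempted assignment, and build a new tuple when a changed copy is needed. The name `updated_person` is introduced here only for illustration.

```python
person = ("Kokchun", 32.8, 1, True)

# tuple unpacking: one variable per element
name, age, number_of_children, loves_math = person

# item assignment raises TypeError, which we can catch
try:
    person[2] = 12
except TypeError as error:
    message = str(error)
    print(f"TypeError: {message}")

# to "change" a tuple we build a new one from slices of the old one
updated_person = person[:2] + (2,) + person[3:]
print(f"{person = }")
print(f"{updated_person = }")
```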

#### set - a unique collection


In [89]:
import random as rnd

dices = [
    rnd.randint(1, 6) for _ in range(20)
]  # convention: underscore _ when not using the variable in the loop

print(f"{dices = }")

unique_dices = set(dices)
print(f"{unique_dices = }")

dices = [3, 1, 2, 4, 4, 4, 3, 4, 6, 4, 1, 4, 4, 4, 3, 5, 5, 1, 2, 5]
unique_dices = {1, 2, 3, 4, 5, 6}


In [91]:
# set doesn't have any order, and can't be accessed using bracket notation
unique_dices[1]

TypeError: 'set' object is not subscriptable
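
Since a set cannot be indexed, the usual alternatives are membership tests with `in`, or converting to a sorted list when positions matter. A minimal sketch, with the set written out by hand so that the example is self-contained:

```python
unique_dices = {1, 2, 3, 4, 5, 6}

# membership testing works directly on the set
has_six = 6 in unique_dices
has_seven = 7 in unique_dices

# sorted() returns a list, which does support indexing
sorted_dices = sorted(unique_dices)
smallest = sorted_dices[0]

print(f"{has_six = }, {has_seven = }, {smallest = }")
```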

**set logic**


In [110]:
set2 = {1, 3, 10, 15}

print(f"{unique_dices = }")
print(f"{set2 = }\n")

print(f"{unique_dices.union(set2) = }")
print(f"{unique_dices | set2 = }\n")  # union

print(f"{unique_dices.intersection(set2) = }")
print(f"{unique_dices & set2 = }\n")

print(f"{unique_dices.difference(set2) = }")
print(f"{unique_dices - set2 = }\n")

set3 = {1, 2}
print(f"{unique_dices.issubset(set2) = }")
print(f"{set3.issubset(unique_dices) = }")

unique_dices = {1, 2, 3, 4, 5, 6}
set2 = {1, 10, 3, 15}

unique_dices.union(set2) = {1, 2, 3, 4, 5, 6, 10, 15}
unique_dices | set2 = {1, 2, 3, 4, 5, 6, 10, 15}

unique_dices.intersection(set2) = {1, 3}
unique_dices & set2 = {1, 3}

unique_dices.difference(set2) = {2, 4, 5, 6}
unique_dices - set2 = {2, 4, 5, 6}

unique_dices.issubset(set2) = False
set3.issubset(unique_dices) = True


<div style="max-width:66ch;">

---

#### Dictionary - two different syntaxes

The two different syntaxes for creating a dictionary in Python are

- using the built-in `dict()` constructor
- using curly-brace literal syntax

</div>

In [12]:
person_dict_syntax = dict(
    name="Kokchun", age=32.8, number_of_children=1, loves_math=True
)
person_dict_bracket_syntax = {
    "name": "Kokchun",
    "age": 32.8,
    "number_of_children": 1,
    "loves_math": True,
}

print(f"both syntax are equal? {person_dict_syntax == person_dict_bracket_syntax}")

# note equal in the sense of containing same content, but not pointing to same object
# they have different memory addresses, so changing one won't affect the other
print("but they have different memory addresses")
hex(id(person_dict_bracket_syntax)), hex(id(person_dict_syntax))

both syntax are equal? True
but they have different memory addresses


('0x108880f40', '0x1089dad00')
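
The distinction between equal content and same object can be sketched with `==` and `is`. The names `first`, `second` and `alias` are illustrative only.

```python
first = {"name": "Kokchun", "age": 32.8}
second = dict(name="Kokchun", age=32.8)
alias = first  # no copy is made, both names point to the same dict

same_content = first == second  # compares keys and values
same_object = first is second  # compares identity

# mutating through the alias changes `first`, but not `second`
alias["age"] = 33

print(f"{same_content = }, {same_object = }")
print(f"{first['age'] = }, {second['age'] = }")
```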

**Iterating over dictionary**


In [71]:
for key in person_dict_syntax:
    print(key)

name
age
number_of_children
loves_math


In [78]:
print("key, value")
for key, value in person_dict_syntax.items():
    print(f"{key, value}")

key, value
('name', 'Kokchun')
('age', 32.8)
('number_of_children', 1)
('loves_math', True)
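
Besides iterating, it is common to check whether a key exists before using it. A small sketch, where the key `"city"` and the default `"unknown"` are made up for the example:

```python
person = dict(name="Kokchun", age=32.8)

# `in` checks the keys of a dictionary
has_age = "age" in person
has_city = "city" in person

# .get returns a default instead of raising KeyError for a missing key
city = person.get("city", "unknown")

print(f"{has_age = }, {has_city = }, {city = }")
```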


**Dictionary comprehension**


In [133]:
ml_terms = [
    "Supervised Learning",
    "Unsupervised Learning",
    "Feature Engineering",
    "Overfitting",
    "Cross-Validation"
]


ml_explanations = [
    "Learns from labeled data.",
    "Finds patterns in unlabeled data.",
    "Enhances input features for models.",
    "Model fits training data too closely.",
    "Evaluates model performance robustly."
]

glossary = {term.lower(): explanation.lower()  for term, explanation in zip(ml_terms, ml_explanations)}
glossary

{'supervised learning': 'learns from labeled data.',
 'unsupervised learning': 'finds patterns in unlabeled data.',
 'feature engineering': 'enhances input features for models.',
 'overfitting': 'model fits training data too closely.',
 'cross-validation': 'evaluates model performance robustly.'}

<div style="max-width:66ch;">

#### Strings

- important to understand how to manipulate strings as we will clean a lot of data in our work

</div>

In [135]:
# oops weird formatting 
ml_term = "sUperVised   Learning  "

print(f"{len(ml_term) = }")

print(f"{ml_term.strip() = }")
print(f"{ml_term.split() = }\n")

print("cleaned data")
print(f"{' '.join(ml_term.split()).lower() = }")

len(ml_term) = 23
ml_term.strip() = 'sUperVised   Learning'
ml_term.split() = ['sUperVised', 'Learning']

cleaned data
' '.join(ml_term.split()).lower() = 'supervised learning'
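
The cleaning step above can be wrapped in a small function and applied to many strings at once. The function name `clean_term` and the list `raw_terms` are assumptions for this sketch.

```python
def clean_term(term: str) -> str:
    """Collapse repeated whitespace, trim the ends and lowercase."""
    return " ".join(term.split()).lower()


# hypothetical messy input data
raw_terms = ["sUperVised   Learning  ", "  Cross-Validation", "OVERFITTING "]
cleaned_terms = [clean_term(term) for term in raw_terms]

print(f"{cleaned_terms = }")
```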


**Strings as sequences of characters**

In [145]:
ml_term = "clustering algorithms"

print(f"{ml_term[0] = }")
print(f"{ml_term[:10] = }")
print(f"{ml_term[10:-1] = }")

supervised_learning = ["regression", "classification"]
unsupervised_learning = ["clustering"]

# last expression in the cell, so its value is displayed
supervised_learning[0] + ml_term[10:-1]


ml_term[0] = 'c'
ml_term[:10] = 'clustering'
ml_term[10:-1] = ' algorithm'


'regression algorithm'

<div style="max-width:66ch;">

**string operators**

<table border="1" style="text-transform: lowercase; display:inline-block; text-align:left;">
    <thead>
        <tr style="background-color: #174A7E; color: white;">
            <th>Operator Name</th>
            <th>Operator</th>
            <th>Example</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Concatenation</td>
            <td>+</td>
            <td>"Neural" + " Network" = "Neural Network"</td>
        </tr>
        <tr>
            <td>Repetition</td>
            <td>*</td>
            <td>"ML" * 3 = "MLMLML"</td>
        </tr>
        <tr>
            <td>Indexing</td>
            <td>[]</td>
            <td>text = "Regressor"; text[0] = "R"</td>
        </tr>
        <tr>
            <td>Slicing</td>
            <td>:</td>
            <td>text = "Classifier"; text[2:5] = "ass"</td>
        </tr>
        <tr>
            <td>String Comparison</td>
            <td>==, !=, etc.</td>
            <td>"Model" == "Model" evaluates to <span style="text-transform: capitalize;">True</span> </td>
        </tr>
        <tr>
            <td>Membership</td>
            <td>in</td>
            <td>"Loss" in "Mean Squared Loss" evaluates to <span style="text-transform: capitalize;">True</span> </td>
        </tr>
    </tbody>
</table>

</div>

In [156]:
# concatenate strings with + operator
print(f"{supervised_learning[0] + ml_term[10:-1] = }")
print(f"{'reinforcement learning' + 'learning' = }")
print(f"{'5' + '5' = }")

# repeat strings with * operator
print(f"{'feature '*5 = }")

supervised_learning[0] + ml_term[10:-1] = 'regression algorithm'
'reinforcement learning' + 'learning' = 'reinforcement learninglearning'
'5' + '5' = '55'
'feature '*5 = 'feature feature feature feature feature '
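
The remaining operators from the table (indexing, slicing, comparison and membership) can be sketched in the same way. Note that string comparison is case-sensitive.

```python
text = "Classifier"

first_character = text[0]  # indexing
middle = text[2:5]  # slicing, end index is exclusive

is_equal = "Model" == "model"  # case matters, so this is False
has_loss = "Loss" in "Mean Squared Loss"  # substring membership

print(f"{first_character = }")
print(f"{middle = }")
print(f"{is_equal = }")
print(f"{has_loss = }")
```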


<div style="max-width:66ch;">

---
## if-statement

- conditional control structure
- execute block of code if condition fulfilled

Some aspects of if-statements have already appeared in earlier sections, so this section is shorter and tries to cover the parts that are new.

</div>
  

In [158]:
predicted_probability = 0.8

# note however that you probably shouldn't choose 0.5 as the threshold in this case
# try to figure out why
if predicted_probability > 0.5:
    # convention for prediction variable is y_pred
    # 1 for positive and 0 for negative
    y_pred = 1
else:
    y_pred = 0

# one line if-else statement
prediction = "positive" if y_pred else "negative"

print(
    f"{prediction} prediction for the disease with probability {predicted_probability}"
)

positive prediction for the disease with probability 0.8


In [162]:
accuracy = float(input("Give your model's accuracy: "))

if accuracy > 0.9:
    model_performance = "Good"
elif accuracy > 0.7:
    model_performance = "Moderate"
else:
    model_performance = "Bad"

print(f"the accuracy of your model is {accuracy} and the performance is {model_performance.lower()}")

the accuracy of your model is 0.5 and the performance is bad
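
To try several values without typing them in, the same if-elif-else chain can be placed in a function. This is a sketch; the function name `rate_performance` and the test values are not from the lecture, while the thresholds are the ones used above.

```python
def rate_performance(accuracy: float) -> str:
    """Map an accuracy to a label using the thresholds 0.9 and 0.7."""
    if accuracy > 0.9:
        return "good"
    elif accuracy > 0.7:
        return "moderate"
    else:
        return "bad"


# conditions are checked top to bottom and the first match wins
ratings = {accuracy: rate_performance(accuracy) for accuracy in (0.95, 0.9, 0.75, 0.5)}
print(ratings)
```

Because the comparisons are strict, an accuracy of exactly 0.9 falls through to the second branch.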


<div style="max-width:66ch;">

---
## while statement 

- iterative control structure
- iteratively repeat execution as long as specified condition remains True
- can be seen as an "if-loop"

### Example - oil leakage

An oil leak causes the bird population on an island to halve each year. There were 80000 birds at the start. How many years does it take until at most 1/10 of them remain?

</div>

In [1]:
birds = 80000
year = 0

while birds > 8000:
    print(f"Year {year} there were {birds:.0f} birds") 
    birds /= 2  # halve the population
    year += 1

print(f"It takes {year} years for the birds to have 1/10 remaining")

Year 0 there were 80000 birds
Year 1 there were 40000 birds
Year 2 there were 20000 birds
Year 3 there were 10000 birds
It takes 4 years for the birds to have 1/10 remaining
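
The same loop can be generalised into a function and cross-checked against a logarithm, since we are looking for the smallest number of years n with 0.5^n ≤ 0.1. The function name and its parameters are assumptions for this sketch.

```python
import math


def years_until_fraction(start, remaining_fraction=0.1, yearly_factor=0.5):
    """Count the years until the population is at most start * remaining_fraction."""
    threshold = start * remaining_fraction
    population, years = start, 0
    while population > threshold:
        population *= yearly_factor
        years += 1
    return years


years_loop = years_until_fraction(80000)

# closed form: smallest integer n with yearly_factor**n <= remaining_fraction
years_formula = math.ceil(math.log(0.1) / math.log(0.5))

print(f"{years_loop = }, {years_formula = }")
```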


<div style="max-width:66ch;">

---
## for statement 

- iterative control structure
- iterates over a sequence (list, tuple, string, range, ...)
- typically used when you know number of iterations beforehand

</div>

In [3]:
unsupervised_abbreviations = [
    "K-Means",
    "PCA",
    "GMM",
]

unsupervised_algorithms = [
    "K-Means Clustering",
    "Principal Component Analysis",
    "Gaussian Mixture Model",
]

for abbreviation in unsupervised_abbreviations:
    print(abbreviation)

K-Means
PCA
GMM


In [8]:
for abbreviation, algorithm in zip(unsupervised_abbreviations, unsupervised_algorithms):
    print(abbreviation, ": ", algorithm)

K-Means :  K-Means Clustering
PCA :  Principal Component Analysis
GMM :  Gaussian Mixture Model


In [7]:
supervised_algorithms = [
    "Support Vector Machine",
    "Decision Tree",
    "k-Nearest Neighbors"
]

supervised_abbreviations = [
    "SVM",
    "DT",
    "kNN"
]

# this structure is useful later when we create dropdown lists in plotly dash
[{abbreviation: algorithm} for abbreviation, algorithm in zip(supervised_abbreviations, supervised_algorithms)]

[{'SVM': 'Support Vector Machine'},
 {'DT': 'Decision Tree'},
 {'kNN': 'k-Nearest Neighbors'}]
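
If a single lookup table is wanted instead of a list of one-item dictionaries, `dict` can consume the pairs from `zip` directly. The name `algorithm_lookup` is made up for this sketch.

```python
supervised_abbreviations = ["SVM", "DT", "kNN"]
supervised_algorithms = [
    "Support Vector Machine",
    "Decision Tree",
    "k-Nearest Neighbors",
]

# each (abbreviation, algorithm) pair becomes one key-value pair
algorithm_lookup = dict(zip(supervised_abbreviations, supervised_algorithms))

print(algorithm_lookup["SVM"])
```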

In [12]:
# don't do this, this is not pythonic
for i in range(len(supervised_algorithms)):
    print(f"algorithm {i}: {supervised_algorithms[i]}")

print()

# this is pythonic
for i, algorithm in enumerate(supervised_algorithms): 
    print(f"algorithm {i}: {algorithm}")

algorithm 0: Support Vector Machine
algorithm 1: Decision Tree
algorithm 2: k-Nearest Neighbors

algorithm 0: Support Vector Machine
algorithm 1: Decision Tree
algorithm 2: k-Nearest Neighbors
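
When the numbering should start at 1 rather than 0, `enumerate` accepts a `start` argument, so no manual `i + 1` is needed. A minimal sketch:

```python
supervised_algorithms = [
    "Support Vector Machine",
    "Decision Tree",
    "k-Nearest Neighbors",
]

# start=1 only changes the counter, not which elements are visited
numbered = [
    f"algorithm {i}: {algorithm}"
    for i, algorithm in enumerate(supervised_algorithms, start=1)
]

for line in numbered:
    print(line)
```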


<div style="max-width:66ch;">

---
## Summary

In this lecture note we have covered the most fundamental data types in Python, including how to create and manipulate them in useful ways. Some control structures (if, while, for) have also been covered to some extent.

---

</div>

<div style="background-color: #FFF; color: #212121; border-radius: 1px; width:22ch; box-shadow: rgba(0, 0, 0, 0.16) 0px 1px 4px; display: flex; justify-content: center; align-items: center;">
<div style="padding: 1.5em 0; width: 70%;">
    <h2 style="font-size: 1.2rem;">Kokchun Giang</h2>
    <a href="https://www.linkedin.com/in/kokchungiang/" target="_blank" style="display: flex; align-items: center; gap: .4em; color:#0A66C2;">
        <img src="https://content.linkedin.com/content/dam/me/business/en-us/amp/brand-site/v2/bg/LI-Bug.svg.original.svg" width="20"> 
        LinkedIn profile
    </a>
    <a href="https://github.com/kokchun/Portfolio-Kokchun-Giang" target="_blank" style="display: flex; align-items: center; gap: .4em; margin: 1em 0; color:#0A66C2;">
        <img src="https://github.githubassets.com/images/modules/logos_page/GitHub-Mark.png" width="20"> 
        Github portfolio
    </a>
    <span>AIgineer AB</span>
</div>
</div>
