<h1>1. String methods</h1>

Like any programming language, Python allows many operations on strings, such as finding substrings, splitting, and joining. You can find a list of the available methods [here](https://docs.python.org/3/library/stdtypes.html#string-methods).<br>
<br>
<b>Exercise</b><br>
Use the appropriate methods to make the following lines of code work.

In [64]:
string = "In computer programming,  a string is traditionally  a sequence of characters.  "

 
print(string.index('c'))                             # index of the first 'c'
print(string.rindex('c'))                            # index of the last 'c' (see the r{name} methods for right-hand variants)
print(len(string.rstrip()))                          # length of the string without trailing whitespace
print(len(string))                                   # full length, to check the amount of trailing whitespace
print(string.startswith("In"))                       # whether the string starts with "In"
print(string.lower())                                # lowercase letters
print(string.split(","))                             # split the string at the comma
print(" ".join(string.split()))                      # split on any whitespace, then join with single spaces
print(string.replace("traditionally","").strip())    # replace "traditionally" with nothing
print(string.replace(" traditionally ","").strip())  # replace "traditionally" including its surrounding spaces


3
72
78
80
True
in computer programming,  a string is traditionally  a sequence of characters.  
['In computer programming', '  a string is traditionally  a sequence of characters.  ']
In computer programming, a string is traditionally a sequence of characters.
In computer programming,  a string is   a sequence of characters.
In computer programming,  a string is a sequence of characters.
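
The difference between `split()` without arguments and `split(" ")` is easy to miss in the exercise above. The sketch below illustrates it on a shortened example string, which is an assumption and not part of the exercise. It also shows that `find()` returns -1 where `index()` would raise a `ValueError`.

```python
# Shortened example string (an assumption, not the exercise string)
text = "a  string is   a sequence  "

# Without arguments, split() treats any run of whitespace as one separator
# and ignores leading and trailing whitespace.
words = text.split()

# With an explicit separator, every single space splits, which leaves
# empty strings wherever spaces were doubled.
pieces = text.split(" ")

# Joining the whitespace-split words normalizes the spacing
normalized = " ".join(words)

print(words)
print(pieces)
print(normalized)

# find() returns -1 when the substring is absent; index() would raise ValueError
print(text.find("z"))
```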


Formatting a string allows you to export or print data. For example, you can print the string `Client name: %s`, where `%s` is replaced by the name of a client given as a string. Besides substituting strings at `%s`, other data types can also be formatted into the string. See [here](https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting) for a list of all formatting conversions, including the formatting and rounding of numbers.<br>
<br>
A general way to format a string is given below. Note the `%d` for an integer. In case of a single argument, the parentheses `( )` are not necessary.

In [65]:
client_name = "Obelix"
client_age = 32                                                   # [years]
string = "Client %s is %d years old." % (client_name, client_age) # the format is: string % (arguments)
print(string)

Client Obelix is 32 years old.
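
The remark about a single argument has one caveat worth sketching: when the single value is itself a tuple, it has to be wrapped in a one-element tuple. The example below is a minimal illustration; the `point` value and the dictionary keys are assumptions chosen for the example.

```python
client_name = "Obelix"

# A single argument does not need parentheses
single = "Client %s" % client_name

# A tuple on the right-hand side is read as the argument list, so to format
# the tuple itself it must be wrapped in a one-element tuple (trailing comma)
point = (3, 4)
wrapped = "Point: %s" % (point,)

# Named placeholders take their values from a dictionary
named = "Client %(name)s is %(age)d years old." % {"name": client_name, "age": 32}

print(single)
print(wrapped)
print(named)
```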


<b>Exercise</b><br>
Use the appropriate format to make the following lines of code work.

In [121]:
value = 1.73456


print("%.0f" % value)                  # 2        (%d would truncate to 1 instead of rounding)
print("%.1f" % value)                  # 1.7
print("%.2f" % value)                  # 1.73
print("% 7.2f" % value)                #    1.73  (total width of 7 characters, 2 decimals)
print("%07.2f" % value)                # 0001.73  (see Flag '0')
print("%+0.2f" % value)                # +1.73    (see Flag '+')
print("%+07.2f" % value)               # +001.73
print("%0.2e" % value)                 # 1.73e+00 (exponential format)

2
1.7
1.73
   1.73
0001.73
+1.73
+001.73
1.73e+00
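
To make the difference between `%d` and `%.0f` concrete, here is a small sketch. It also shows that f-strings accept a closely related format specification; the f-string line is an addition for comparison and is not required by the exercise.

```python
value = 1.73456

truncated = "%d" % value     # %d converts to an integer, dropping the decimals
rounded = "%.0f" % value     # %.0f rounds to zero decimals

# The flag, width and precision carry over to the f-string format spec
padded_percent = "%+07.2f" % value
padded_fstring = f"{value:+07.2f}"

print(truncated, rounded)
print(padded_percent, padded_fstring)
```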


<h1>3. Regular expressions</h1>

Regular expressions are used to find patterns in text without specifying each character exactly, for example to find words or numbers that were formatted in a particular way.<br>
A single digit, for example, can be matched with `\d`. That would match at 4 locations in the string `The width of the car is 2m, and the height is 1.65m.`.<br>
Another example is matching any one of a set of characters, which is done with `[xyz]`. That would match at 4 locations in the string `If x = 2y, then y = 6z.`.<br>
More information on matching string patterns in Python can be found at [Python Regular Expressions](https://docs.python.org/3/library/re.html). Using this information, complete the following exercise.<br>
<br>
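
The two match counts mentioned above can be checked with `re.findall`, which returns every non-overlapping match as a list. A minimal sketch (the variable names are chosen for illustration):

```python
import re

sentence_digits = "The width of the car is 2m, and the height is 1.65m."
sentence_letters = "If x = 2y, then y = 6z."

# \d matches one digit at a time, so "1.65" contributes three matches
digits = re.findall(r"\d", sentence_digits)

# [xyz] matches any single character from the set x, y, z
letters = re.findall(r"[xyz]", sentence_letters)

print(digits, len(digits))
print(letters, len(letters))
```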
<b>Exercise</b><br>
Consider the 12 lines in the code box below. You will have to find a pattern that:
<ul>
    <li>Matches the first 10 lines with a decimal number.</li>
    <li>Does not match the integer in the 11th line.</li>
    <li>Does not match the text in the 12th line.</li>
</ul>
<br>
1. Before you read the documentation, write the pattern down in words under 'pattern in words:'<br>
<br>
<i>Tip: The lines to match have a general pattern of 3 elements (first this, then that, and finally that).</i><br>
<br>

Open [regex101.com](https://regex101.com/).<br>
On the left-hand side, select the "Python" flavor.<br>
Copy the 12 lines from the code box below into the "TEST STRING" box.<br>
In the "REGULAR EXPRESSION" text box, test and write your pattern.<br>
<br>
2. Write the pattern down in code, regexp = ...<br>
<br>
<i>Tip: Some elements may be "zero or more of X", where X is some type.</i><br>
<i>Tip: Some elements may be "any of Y", where Y is a set of cases it can be.</i>

    0001,2345
    1,2345
    1,23
    ,2345
    1,
    001.2345
    1.2345
    1.23
    .2345
    1.
    1
    thisisnotanumber

In [109]:
import re

# The regular expression pattern: digits, one decimal separator, digits
regexp = r'^\d*[.,]\d*$'

# The list of values to match
values = [
    "0001,2345",
    "1,2345",
    "1,23",
    ",2345",
    "1,",
    "001.2345",
    "1.2345",
    "1.23",
    ".2345",
    "1.",
    "1",
    "thisisnotanumber"
]

# Match each value against the pattern
for value in values:
    if re.match(regexp, value):
        print(f"Match: {value}")
    else:
        print(f"No match: {value}")


Match: 0001,2345
Match: 1,2345
Match: 1,23
Match: ,2345
Match: 1,
Match: 001.2345
Match: 1.2345
Match: 1.23
Match: .2345
Match: 1.
No match: 1
No match: thisisnotanumber


<b>pattern in words:</b><br>
**First**, after `^` anchors the match at the beginning of the input string, `\d*` matches zero or more digits, which allows for an optional integer part; **then**, `[.,]` matches exactly one decimal separator, either a dot or a comma, which is what excludes the plain integer in the 11th line; and **finally**, a second `\d*` matches zero or more digits for the optional fractional part, after which `$` requires the match to end at the end of the input string.

In [107]:
regexp = r'^\d*[.,]\d*$'
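
The three elements asked for in the exercise (digits, a decimal separator, digits) can also be written out one per line with `re.VERBOSE`, which ignores whitespace and allows comments inside the pattern. This is a sketch of one possible solution, not the only one; the names `DECIMAL` and `is_decimal` are hypothetical.

```python
import re

# Hypothetical name; the pattern spells out the three elements one per line
DECIMAL = re.compile(
    r"""
    \d*      # first: zero or more digits (integer part, may be empty)
    [.,]     # then: exactly one decimal separator, dot or comma
    \d*      # finally: zero or more digits (fractional part, may be empty)
    """,
    re.VERBOSE,
)

def is_decimal(text):
    """Return True when the whole string has the digits-separator-digits shape."""
    # fullmatch() requires the entire string to match, so ^ and $ are not needed
    return DECIMAL.fullmatch(text) is not None

samples = ["0001,2345", "1,", ".2345", "1", "thisisnotanumber"]
for sample in samples:
    print(sample, is_decimal(sample))
```

Note that this pattern also accepts a lone `.` or `,`, because both digit parts are optional. That is acceptable for the 12 test lines, but a stricter pattern would be needed if such input had to be rejected.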

<h1>4. Counting characters</h1>

<b>Exercise</b><br>
Print all non-zero frequencies of each character from the alphabet in the text given in the code box.
<ul>
<li>Treat accented characters as normal characters.</li>
<li>Combine uppercase and lowercase characters in a single count.</li>
<li>Print in alphabetical order.</li>
</ul>
<i>Hint: Have one step where you prepare and filter some data, and a second step with a loop.</i><br>
<i>Hint: sets have unique values, and lists are indexed and can thus be sorted (sort()).</i>

In [114]:
# Import unidecode to treat accented characters as normal characters
# (third-party package: pip install Unidecode)
from unidecode import unidecode

# The text
text = "For the movie The Theory of Everything (2014), Jóhann Jóhannsson composed the song A Model of the Universe"

# Step 1: Prepare and filter the data, including handling of accented characters
cleaned_text = ''.join(letter.lower() for letter in unidecode(text) if letter.isalpha())

# Step 2: Count the frequency of each letter
frequency_dict = {}
for letter in cleaned_text:
    if letter in frequency_dict:
        frequency_dict[letter] += 1
    else:
        frequency_dict[letter] = 1

# Step 3: Print the frequencies in alphabetical order
alphabetical_letters = sorted(frequency_dict.keys())
for letter in alphabetical_letters:
    if frequency_dict[letter] > 0:
        print(f"{letter}: {frequency_dict[letter]}")

a: 3
c: 1
d: 2
e: 12
f: 3
g: 2
h: 8
i: 3
j: 2
l: 1
m: 3
n: 8
o: 12
p: 1
r: 4
s: 5
t: 6
u: 1
v: 3
y: 2
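
The same count can be sketched with the standard library only, assuming that accented letters decompose into a base letter plus combining marks (true for `ó`, but not for letters such as `ø`, which `unidecode` does handle). `collections.Counter` replaces the manual dictionary bookkeeping.

```python
import unicodedata
from collections import Counter

text = "For the movie The Theory of Everything (2014), Jóhann Jóhannsson composed the song A Model of the Universe"

# NFKD splits an accented letter into its base letter plus combining marks;
# the marks are dropped below because they are not ASCII letters.
decomposed = unicodedata.normalize("NFKD", text)
letters = [ch.lower() for ch in decomposed if ch.isascii() and ch.isalpha()]

# Counter only stores letters that occur, so every count is non-zero
counts = Counter(letters)
for letter in sorted(counts):
    print(f"{letter}: {counts[letter]}")
```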


<h1>5. Good... afternoon?</h1>

The code below generates a random time of day. Suppose we want to present the user with a welcome message when the user opens a program at that time.<br>
<br>
<b>Exercise</b><br>
<ul>
    <li>Print a message with the (pseudo) format: Good {part of day}, the time is hh:mm</li>
    <li>Parts of the day are night [0-5], morning [6-11], afternoon [12-17] or evening [18-23].</li>
    <li>Hour or minute values below 10 should have a leading 0.</li>
</ul>
<i>Hint: you can use if-elif-else for the part of the day, but you can also have a fixed list of parts of the day and use clever indexing from the hour value.</i>

In [119]:
import random

# Generate random hour and minute
h = random.randint(0, 23)  # hour of the day
m = random.randint(0, 59)  # minute within the hour

# Determine the part of the day
if 0 <= h < 6:
    part_of_day = "night"
elif 6 <= h < 12:
    part_of_day = "morning"
elif 12 <= h < 18:
    part_of_day = "afternoon"
    part_of_day = "afternoon"
else:
    part_of_day = "evening"

# Check if hour and minute are below 10 and add a leading 0 if necessary
hour_str = str(h).zfill(2)
minute_str = str(m).zfill(2)

# Print the welcome message
print(f"Good {part_of_day}, the time is {hour_str}:{minute_str}")


Good evening, the time is 22:54
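
The hint about a fixed list and indexing can be sketched as follows. Since each part of the day spans exactly 6 hours, integer division of the hour by 6 gives the list index directly. The names `PARTS_OF_DAY` and `greeting` are hypothetical, and the zero-padding uses the `02d` format spec instead of `zfill`.

```python
# Hypothetical names; the list order follows the hour ranges of the exercise
PARTS_OF_DAY = ["night", "morning", "afternoon", "evening"]

def greeting(h, m):
    """Build the welcome message for hour h (0-23) and minute m (0-59)."""
    part_of_day = PARTS_OF_DAY[h // 6]   # 0-5 -> 0, 6-11 -> 1, 12-17 -> 2, 18-23 -> 3
    # 02d pads the number with a leading zero to a width of 2
    return f"Good {part_of_day}, the time is {h:02d}:{m:02d}"

print(greeting(22, 54))
print(greeting(5, 7))
```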
