# Python Crash Course - Exercise 06

Topics covered:
* Importing packages in the Jupyter notebook session
* Python from the CLI
* Running Python scripts from the CLI
* Running Python scripts in the Jupyter notebook
* **if there is time** Text processing: regex & reading in txt files

Tasks:
* Task 0: packages setup
* Task 1: Python from the CLI
* Tasks 2-4: Running .py files
* **if there is time** Tasks 5 & 6: text processing

## Task 0 - Packages setup

Make sure (with the help of the TAs) that you know how to load pandas and matplotlib.pyplot. If the code below runs without error messages, you're good to go! (We will use pandas and matplotlib in later lectures and exercises.)

In [None]:
# run this cell to check that you can import pandas and matplotlib correctly
import pandas as pd
import matplotlib.pyplot as plt

# create a pandas DataFrame object
df = pd.DataFrame(
    {
        "weekdays": ["mon", "tue", "wed", "thu", "fri", "sat", "sun"],
        "visitors": [30, 78, 19, 57, 29, 10, 0]
    }
)

fig, ax = plt.subplots(1,1) # create a figure object and an axis object

# draw a barplot on the axis object
ax.bar(
    x = df.weekdays, # x-position
    height = df.visitors, # y-height
    color = "pink" # change the color of the bar
)
ax.set_xlabel("Weekday") # label the x axis
ax.set_ylabel("Visitors") # label the y axis (the plotted column is "visitors")
ax.set_title("Visitors per weekday") # add a plot title
plt.show() # show the plot below


## Task 1 - Back to basics (in the command line interface)

Navigate to your CLI. Type `python`, then press Enter. The `>>>` at the start of each line indicates that you are now in a Python session. See if you can execute the following Python statements in the CLI:

* $2*2 + 3*3$
* $5^{3} > 100$ (note: in Python, powers are written with `**`; `^` is the bitwise XOR operator)
* `(True or False) == True`
* `print("goodbye world")`

**Exit the Python session** by typing `exit()`, then enter.
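
For reference, the four statements could be typed as in the sketch below. It assumes the math notation is translated to Python operators; the variable names are only there for illustration and are not part of the task.

```python
# The four statements from Task 1, written as Python expressions.
result_sum = 2*2 + 3*3                   # multiplication is evaluated before addition
result_power = 5**3 > 100                # ** is the power operator
result_logic = (True or False) == True   # the parentheses are evaluated first

print(result_sum)
print(result_power)
print(result_logic)
print("goodbye world")
```

In the interactive session you do not need `print()` for the first three: typing the expression and pressing Enter shows its value directly.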

## Task 2 - Running a .py script from the CLI

* Together with this notebook, we provided a Python script called `somecode.py`. Open it up (in the text editing software of your choice) and try to predict what the `print()` statement (last line of code) will output. 
* Now, open your CLI (command line interface) and navigate to the folder where you saved `somecode.py`
* Run the script by typing `python somecode.py`, then enter
* Check if the output is what you expected! 

In [None]:
# you can also try to run the code here with %run
# %run somecode.py

**Comment:** In `somecode.py`, we define a function that takes an integer `x` as input, creates a list of all integers from 0 up to **and including** `x`, and returns the sum of that list. When given `4`, the function returns `0+1+2+3+4`, i.e. `10`.
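
The script itself is not reproduced in this notebook. The sketch below shows what a function with this behaviour might look like; the function name `sum_up_to` is an assumption and the real `somecode.py` may be written differently.

```python
# Hypothetical reconstruction of the logic described above;
# the function name is an assumption, not taken from somecode.py.
def sum_up_to(x):
    """Return the sum of all integers from 0 up to and including x."""
    numbers = list(range(x + 1))  # range() stops before x + 1, so x is included
    return sum(numbers)


print(sum_up_to(4))
```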

## Task 3 - Running your own .py script from the CLI

Let's create a program that will tell the user their age in minutes!
* Create a file called `minutes.py`, in the folder of your choice.
* Open up the CLI, navigate to the folder, and type `python minutes.py`, then enter
* Check if the output is what you expected!

In the script, you should:
* ask the user for their age
* convert the user input into a numeric variable
* compute the user's age in minutes
* print out "Your age in minutes is...", inserting the minutes

You are also welcome to experiment with any other script that you want to try and run from the CLI!

**Comment**: See sample solution in `minutes.py` file attached. 
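
Independently of the attached solution, the steps listed above can be sketched as follows. All names here (`MINUTES_PER_YEAR`, `age_in_minutes`, `main`, `ask`) are assumptions made for this example, and the calculation assumes 365-day years.

```python
# Sketch of a possible minutes.py; names and the 365-day year are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def age_in_minutes(age_in_years):
    """Convert an age in years to an approximate age in minutes."""
    return age_in_years * MINUTES_PER_YEAR


def main(ask=input):
    # input() always returns a string, so convert it to a number first
    age = float(ask("How old are you (in years)? "))
    minutes = age_in_minutes(age)
    print(f"Your age in minutes is {minutes:,.0f}")


# When saving this as a script, add a call to main() as the last line.
```

Keeping the calculation in its own function makes it easy to check without typing anything: `age_in_minutes(1)` gives the number of minutes in one (non-leap) year.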

In [None]:
# you can try to run your minutes.py script from here, as well (with %run)
# %run minutes.py

## Task 4 - `smalltalk.py`

Write a `smalltalk.py` script that
* asks the user how they are doing "on a scale from 1 to 5"
* prints out an appropriate reply (depending on the user input)
* asks the user whether they like the weather today
* prints out an appropriate reply (depending on whether the user said "yes" or "no")
* says goodbye to the user in a polite way

**Notes:** You can (but you don't have to) use function definitions within `smalltalk.py` to solve this task. You can (but you don't have to) implement some error catching, for example for the case that the first user input is NOT a number from 1 to 5, or the second user input is NOT either "yes" or "no".

In [None]:
# you can try to run smalltalk.py here with %run
# %run smalltalk.py

**Comment**: See sample solution in `smalltalk.py` file attached. 
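
As one possible way to structure the reply logic with functions and simple error catching, consider the sketch below. The function names and reply texts are made up for this example and are not taken from the attached solution; in a full script, the arguments would come from `input()`.

```python
# Sketch of the reply logic for a possible smalltalk.py; all names are assumptions.
def mood_reply(answer):
    """Return a reply for a mood rating given as text, e.g. "4"."""
    try:
        score = int(answer)
    except ValueError:
        # int() raises ValueError for text such as "abc" or "4.5"
        return "That was not a whole number, but I hope you are fine!"
    if score < 1 or score > 5:
        return "That is outside my scale from 1 to 5!"
    if score <= 2:
        return "Sorry to hear that."
    if score == 3:
        return "Could be worse!"
    return "Glad to hear it!"


def weather_reply(answer):
    """Return a reply for a yes/no answer about the weather."""
    cleaned = answer.strip().lower()  # accept " Yes", "NO", etc.
    if cleaned == "yes":
        return "Me too!"
    if cleaned == "no":
        return "Maybe tomorrow will be better."
    return "I only understand yes or no."


print(mood_reply("4"))
print(weather_reply("No"))
```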

## Task 5: String formatting - Capital cities

Below, we provide you with a dictionary `capitals` that contains key-value pairs, with countries as keys and their capital cities as values. Let's do some data cleaning first:
* Some cities' names contain numbers; these need to be deleted
* Some cities' names consist of several words, but lack a white space; insert a white space where appropriate (for example, "AddisAbaba" needs to be formatted into "Addis Ababa").

Now, use the `f'{}'` syntax to generate a file in which each line contains one sentence: `The capital of <country> is <city>.`, inserting countries and capitals from the dictionary. Save the file to `capitals.txt`.

In [None]:
capitals = {
    "Nigeria" : "Abuja",
    "Colombia" : "0Bo0gotá",
    "Gibraltar": "Gibr2altar",
    "Ethiopia": "AddisAb3aba",
    "United Arab Emirates": "AbuDhab7i"
}

In [None]:
# import re module
import re

# remove numbers from dictionary values with the help of the "\d" or "\d+" regex 

# loop through dictionary items
for key, value in capitals.items():
    # loop through all digits that were found in the value string
    # (the r"..." raw string keeps the backslash in "\d" intact)
    for item in re.findall(r"\d", value):
        # reassign the new value, where item (the number) is replaced by "" (an empty string)
        capitals[key] = capitals[key].replace(item, "")        
capitals
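
As an alternative to the nested loop, the digits can also be removed in one step with `re.sub`. This is a sketch of a different technique, not the solution used in this notebook; it works on its own copy of the data (`raw_capitals`, a name chosen for this example) so that it runs independently.

```python
import re

# Copy of the raw data from the task, so this sketch runs on its own.
raw_capitals = {
    "Nigeria": "Abuja",
    "Colombia": "0Bo0gotá",
    "Gibraltar": "Gibr2altar",
    "Ethiopia": "AddisAb3aba",
    "United Arab Emirates": "AbuDhab7i",
}

# re.sub replaces every match of the pattern (here: any single digit) with "".
cleaned = {country: re.sub(r"\d", "", city) for country, city in raw_capitals.items()}
print(cleaned)
```

Building a new dictionary leaves the original data unchanged, which can make it easier to compare the values before and after cleaning.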

In [None]:
# add white spaces before capital letters 
# (you can use the regex "[A-Z]" to find capital letters)

# the way we know that a white space is missing:
# if there is a capital letter "in the middle" of the word,
# i.e. at position >0.

# as above, loop through the dictionary items,
# this time inserting a white space BEFORE every capital letter
# that is at a position >0.
for key, value in capitals.items():
    # (?<=\S) only matches a capital letter that directly follows a
    # non-whitespace character, so the first letter is left untouched.
    # (a plain .replace("A", " A") would also change the leading "A"
    # of "AddisAbaba" and produce " Addis Ababa")
    capitals[key] = re.sub(r"(?<=\S)([A-Z])", r" \1", value)
capitals

In [None]:
# open up the file; with the opened file,
    # loop through the dictionary items;
    # create a sentence from keys and values (with string formatting) at each iteration step;
    # write the sentence + a line break (expressed as "\n") to the file


# create the file (encoding='utf-8' so that "Bogotá" is written the same way on every system)
with open('capitals.txt', 'w', encoding='utf-8') as opened_file:
    # loop through the dictionary items
    for key, value in capitals.items():
        # create the sentence for this key-value pair
        sentence = f"The capital of {key} is {value}."
        # write the sentence to the file
        opened_file.write(sentence)
        # add a line break
        opened_file.write("\n")

## Task 6: Text processing - Numbers in an article

In the file `article.txt`, we provide the text of this [Guardian article](https://www.theguardian.com/commentisfree/2025/apr/18/i-fear-im-doing-friendship-wrong-why-do-we-lose-the-art-of-just-hanging-out) by Carolin Würfel. Let's say we are **VERY** interested in all the **numbers** that she used in the article. Your tasks:

* `.read()` in the text file 
* find all the numbers (runs of one or more digits, with the regex `"\d+"`) mentioned in the text, and print them out
* `.split()` the text into separate sentences
* loop through the sentences, `.append()`ing only the ones that contain numbers to a list
* Additional challenge: try to write this list to a text file, so that every line in the text file is a sentence (with a number) from the article

In [None]:
# if not done yet, import the re module
import re

In [None]:
# .read() in the text file
with open('article.txt', 'r') as opened_file:
    my_text = opened_file.read()

In [None]:
# print out all numbers (regex "\d+") that you can find
re.findall(r"\d+", my_text)

In [None]:
# split into sentences

# with ". " (a period followed by a space) as separator
sentences = my_text.split(". ")
sentences # sentences is now a list of strings; every string is a sentence (the separator ". " itself is removed)
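
Splitting on `". "` misses sentences that end in `?` or `!` and drops the period. A possible refinement, shown here on a made-up sample string (not taken from the article), is to split with `re.split` at whitespace that follows sentence-ending punctuation.

```python
import re

# Made-up sample text (not from the article), just to show the splitting.
sample_text = "I have 3 friends. Do you? We met 10 years ago! It was fun."

# Split at whitespace that follows ".", "!" or "?".
# The lookbehind (?<=...) checks for the punctuation without removing it.
sample_sentences = re.split(r"(?<=[.!?])\s+", sample_text)
print(sample_sentences)
```

This is still a heuristic: abbreviations such as "e.g. " would also be treated as sentence endings.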

In [None]:
# find all sentences that contain numbers, and append them to a list

sentences_with_numbers = []

for sentence in sentences:
    if re.search(r"\d+", sentence):
        sentences_with_numbers.append(sentence)

print(sentences_with_numbers)

In [None]:
# Challenge: write the sentences_with_numbers to a file

with open("sentences_with_numbers.txt", "w") as opened_file:
    for sentence in sentences_with_numbers:
        opened_file.write(sentence + "\n") # write each sentence plus a line break
