# Recap of Workshop 5

In this workshop, we covered two main topics:

1. Dictionaries in Python
2. Working with Word documents using `python-docx`


## Dictionaries in Python

A dictionary in Python is a collection of key-value pairs in which each key is unique. Since Python 3.7, dictionaries also preserve the order in which items were inserted.

### Creating a Dictionary
You can create a dictionary by placing a comma-separated sequence of key-value pairs within curly braces `{}`, with a colon `:` separating keys and values.

In [None]:
my_dictionary = {"key": "value"}

Dictionaries are useful whenever you need to look values up by name. In the data world, one common application is holding config parameters, for example:

In [None]:
config = {
    "data_path": "/path/to/data",
    "output_path": "/path/to/output",
    "model_path": "/path/to/model",
    "batch_size": 32,
    "learning_rate": 0.001,
    "num_epochs": 10,
    "random_seed": 42
}

### Accessing Dictionary Items
You can access a value in a dictionary by putting its key inside square brackets `[]`.

In [None]:
print(config["batch_size"])
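
Square brackets raise a `KeyError` when the key does not exist. A minimal sketch of the `get()` method, which returns a default instead, is shown here; the `num_workers` key is a hypothetical setting used only for illustration and is not part of the config above.

```python
config = {"batch_size": 32, "learning_rate": 0.001}

# Square brackets raise KeyError when the key is absent
missing = False
try:
    config["num_workers"]  # hypothetical key, not in the dictionary
except KeyError:
    missing = True

# get() returns the supplied default instead of raising
workers = config.get("num_workers", 4)

# For a key that exists, get() returns the stored value and ignores the default
batch_size = config.get("batch_size", 64)

print(missing, workers, batch_size)
```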

### Modifying Dictionary Items
In an existing dictionary, you can update the value of a specific item by referring to its key.

In [None]:
config["batch_size"] = 16

print(config)

### Adding Items to a Dictionary
You can add new items to a dictionary by using a new key and assigning a value to it.

In [None]:
config["log_path"] = "/path/to/logs"

print(config)

### Deleting from a Dictionary
You can remove items from a dictionary using the `del` statement or the `pop()` method.

In [None]:
del config["log_path"]
config.pop("output_path")

print(config)
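
The two approaches differ slightly: `pop()` hands back the value it removed and accepts a default for keys that might not exist, whereas `del` raises a `KeyError` for a missing key. A small sketch, using a trimmed-down copy of the config rather than the full dictionary:

```python
config = {"output_path": "/path/to/output", "batch_size": 32}

# pop() removes the key and returns its value
removed_value = config.pop("output_path")

# With a second argument, pop() returns that default instead of
# raising KeyError ("log_path" is not in this dictionary)
missing_value = config.pop("log_path", None)

print(removed_value)
print(missing_value)
print(config)
```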

### Looping Through a Dictionary
You can loop through a dictionary's keys, its values, or its key-value pairs.

In [None]:
for key in config.keys():
    print(key)

In [None]:
for value in config.values():
    print(value)

In [None]:
for key, value in config.items():
    print(key, "=", value)
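
As a small aside, looping over the dictionary itself gives the same keys as `.keys()`, and `.items()` works well inside a list comprehension. The sketch below uses a shortened config for brevity:

```python
config = {"data_path": "/path/to/data", "batch_size": 32, "learning_rate": 0.001}

# Iterating over the dictionary directly yields its keys
keys_direct = [key for key in config]
keys_method = list(config.keys())

# items() can feed a list comprehension to build formatted lines
lines = [f"{key} = {value}" for key, value in config.items()]

print(keys_direct == keys_method)
for line in lines:
    print(line)
```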

## Working with Word Documents using `python-docx`

The `python-docx` library allows you to create and update Microsoft Word (.docx) files.

### Installing `python-docx`
To install the `python-docx` package, you can use the following command:

In [None]:
!pip install python-docx

### Creating a Document
To create a new Word document, you first need to create an instance of the `Document` class.

In [None]:
from docx import Document

doc = Document()

### Adding Content to the Document

The `Document` object has methods we can use to modify our Word file:

#### Adding a Heading

In [None]:
doc.add_heading('Document Title', level=1)

#### Adding a Paragraph

In [None]:
doc.add_paragraph('This is a paragraph.')

### Saving the Document
To save the file, call the `save` method:

In [None]:
doc.save('word_docs/example.docx')
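
Saving fails with a `FileNotFoundError` if the target folder does not exist. One way to guard against that is to create the folder first. The helper `ensure_parent_dir` below is a hypothetical name, not part of `python-docx`, and the sketch runs in a temporary folder so it does not touch your project files.

```python
from pathlib import Path
import tempfile


def ensure_parent_dir(file_path):
    """Create the parent folder of file_path if it does not exist yet."""
    path = Path(file_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Demonstrated inside a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_parent_dir(Path(tmp) / "word_docs" / "example.docx")
    folder_exists = target.parent.is_dir()
    # With the folder in place, doc.save(str(target)) can write the file

print(folder_exists)
```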

### Replacing Text in a Document

Sometimes you'll have a Word template in which only certain parts need updating. For example, in a monthly report you might replace only the dates and figures while everything else stays the same.

With Python we can automate this!

First, let's load the file `word_docs/gp_registrations_template.docx`:

In [None]:
gp_regs = Document('word_docs/gp_registrations_template.docx')

To replace text, first create a dictionary that maps each placeholder to the text you want to replace it with. In this file, `<DATE>` and `<GP_REGISTRATIONS>` are the placeholders we want to replace:

In [None]:
replacements = {
    "<DATE>": "01 November 2024",
    "<GP_REGISTRATIONS>": "63,669,331"
}
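
In a real monthly run you would probably build these strings from data rather than type them by hand. A minimal sketch, assuming the date and the count arrive as a `date` object and an integer (the variable names `report_date` and `registrations` are illustrative):

```python
from datetime import date

report_date = date(2024, 11, 1)   # assumed reporting date
registrations = 63_669_331        # assumed count, matching the figure above

replacements = {
    # %d = zero-padded day, %B = full month name, %Y = four-digit year
    "<DATE>": report_date.strftime("%d %B %Y"),
    # The "," format specifier adds thousands separators
    "<GP_REGISTRATIONS>": f"{registrations:,}",
}

print(replacements)
```

Note that `%B` follows the current locale, so the month name is in English only under the default or an English locale.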

Then use the functions below to find and replace the placeholders in the paragraphs of the document body:

In [None]:
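def replace_text_in_paragraph(paragraph, replacements):
    """Replace each placeholder found in the paragraph's runs."""
    for old_text, new_text in replacements.items():
        # Only look at the runs when the paragraph contains the placeholder
        if old_text in paragraph.text:
            # Edit run by run so that each run keeps its formatting.
            # Note: a placeholder split across several runs is not matched here.
            for run in paragraph.runs:
                if old_text in run.text:
                    run.text = run.text.replace(old_text, new_text)

def replace_text_in_doc(doc, replacements):
    """Apply the replacements to every paragraph in the document body."""
    for paragraph in doc.paragraphs:
        replace_text_in_paragraph(paragraph, replacements)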


replace_text_in_doc(gp_regs, replacements)
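
One thing to watch for: Word sometimes stores a placeholder such as `<DATE>` across several runs (for example `<`, `DATE`, `>`), in which case no single run contains it and the functions above leave it untouched. The sketch below shows one possible fallback, which collapses the paragraph's text into its first run at the cost of any mixed formatting in that paragraph. `FakeRun` and `FakeParagraph` are stand-ins written for this illustration so the idea can be run without a Word file; they are not `python-docx` classes, and `replace_text_across_runs` is a hypothetical name.

```python
class FakeRun:
    """Stand-in for a run: just holds some text."""
    def __init__(self, text):
        self.text = text


class FakeParagraph:
    """Stand-in for a paragraph: its text is the runs joined together."""
    def __init__(self, texts):
        self.runs = [FakeRun(t) for t in texts]

    @property
    def text(self):
        return "".join(run.text for run in self.runs)


def replace_text_across_runs(paragraph, replacements):
    for old_text, new_text in replacements.items():
        if old_text not in paragraph.text:
            continue
        if any(old_text in run.text for run in paragraph.runs):
            # Placeholder sits inside a single run: replace it in place
            for run in paragraph.runs:
                run.text = run.text.replace(old_text, new_text)
        else:
            # Placeholder is split across runs: put the replaced text
            # in the first run and empty the remaining runs
            full_text = paragraph.text.replace(old_text, new_text)
            paragraph.runs[0].text = full_text
            for run in paragraph.runs[1:]:
                run.text = ""


paragraph = FakeParagraph(["Report for <", "DATE", ">"])
replace_text_across_runs(paragraph, {"<DATE>": "01 November 2024"})
result = paragraph.text
print(result)
```

If you try this on a real document, check the output carefully, since bold or italic text inside an affected paragraph takes on the formatting of the first run.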

Then save the updated file:

In [None]:
gp_regs.save('word_docs/gp_registrations_november.docx')