# Overview – Part 4
1. Installing Modules  
2. Parsing XML with LXML  
3. Config Parser  
4. Threading  
5. NumPy  

# 1. Install Modules

#### Installing Modules
Apart from built-in modules, Python also allows importing external modules.  
Python has a large open-source ecosystem, so there is an abundance of modules available for almost any task.  

**Examples of popular modules:**
- 🎮 `pygame` → For game development  
- 📊 `pandas` → For data manipulation  
- 🔢 `numpy` → For numerical calculations  
- 🌐 `selenium` → For web automation  

> Note: These modules are not part of a standard Python installation (the Anaconda distribution bundles some of them). Once installed, they can be imported using a simple `import` statement.  

---

#### Installing Modules – Methods
Before importing, you need to install the modules. This can be done via:  
1. **Command Prompt** → `pip install [library_name]`  
2. **Anaconda Prompt** → `conda install [library_name]`  
3. **Anaconda Navigator (Interactive)** → Select the library and click *Install*  


## 1.1 Exercise

✅ Try installing the following packages using your preferred method (Command Prompt, Anaconda Prompt, or Anaconda Navigator):  
- `pandas`  
- `scipy`  
- `numpy`  
- `scikit-learn`  
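
To confirm that the installs worked, a short check like the one below may help. It is a minimal sketch that uses only the standard library; `check_installed` is a hypothetical helper name, not part of any package. Note that `scikit-learn` is imported under the name `sklearn`.

```python
import importlib.util


def check_installed(module_names):
    """Return {module_name: True/False} depending on whether Python can find it."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


# scikit-learn is installed as "scikit-learn" but imported as "sklearn".
status = check_installed(["pandas", "scipy", "numpy", "sklearn"])
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

Any module reported as missing can be installed with one of the methods listed above and the check run again.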

# 2. LXML Module

#### LXML Module
Extensible Markup Language (**XML**) is a markup language that defines rules for encoding documents in a format that is both **human-readable** and **machine-readable**.  

- Used for structured webpages, catalogues, and more.  
- In Python, you can extract information from XML documents using the **`lxml`** library.  
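
As a first taste, the sketch below parses a tiny XML string and reads one element. The one-item menu is a hypothetical sample made up for illustration, and the fallback to the standard library's `xml.etree.ElementTree` (which offers a largely compatible API) is an assumption added so the snippet also runs where `lxml` is not installed yet.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library's compatible API.
    import xml.etree.ElementTree as etree

# Hypothetical one-item menu, used only for illustration.
xml_bytes = b"""<menu>
  <food id="f001">
    <name>Pancakes</name>
    <price>$5.95</price>
  </food>
</menu>"""

root = etree.fromstring(xml_bytes)   # parse the bytes into the root element
food = root[0]                       # first child element of the root

print(root.tag)                      # tag name of the root element
print(food.get("id"))                # value of the "id" attribute
print(food.findtext("name"))         # text of the <name> child
print(food.findtext("price"))        # text of the <price> child
```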


## 2.1 Exercise 1

#### Exercise – LXML
- URL: [W3Schools XML Example](https://www.w3schools.com/xml/simple.xml)  
- The XML file contains a breakfast menu with:
  - Name  
  - Price  
  - Calories  
  - Description  

**Task:** Extract all information and store price, calories, and description against each food name.  


In [1]:
from lxml import etree

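def breakfastXmlFile(xmlFile):
    # Text mode works for this file; lxml rejects str input that carries an
    # encoding declaration, in which case open the file with "rb" instead.
    with open(xmlFile) as fobj:
        xml = fobj.read()
    root = etree.fromstring(xml)

    food_dict = {}   # details of the food currently being read
    food_list = []   # one dictionary per <food> element

    # getchildren() is deprecated in lxml; "for food in root" is equivalent.
    for food in root.getchildren():
        print("Food ID: ", food.get('id'))
        print("--------------")

        for elem in food.getchildren():
            text = elem.text or " "   # empty elements become a single space
            print(elem.tag + ':' + text)
            food_dict[elem.tag] = text

        if food.tag == "food":
            food_list.append(food_dict)
            print("")
            food_dict = {}

    # Note: food_list is not returned, so the caller receives None.

my_Food = breakfastXmlFile("1.1_Breakfast_File.xml")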

Food ID:  f001
--------------
name:Pancakes
price:$5.95
calories:350
description:Delicious fluffy pancakes served with maple syrup and butter.

Food ID:  f002
--------------
name:Waffles
price:$6.95
calories:400
description:Crispy golden waffles with fresh strawberries and whipped cream.

Food ID:  f003
--------------
name:French Toast
price:$6.50
calories:380
description:Classic French toast sprinkled with cinnamon sugar.

Food ID:  f004
--------------
name:Omelette
price:$7.25
calories:420
description:Three-egg omelette with cheese, mushrooms, and vegetables.

Food ID:  f005
--------------
name:Fruit Salad
price:$4.50
calories:200
description:A refreshing mix of seasonal fruits and yogurt.
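
The task asks for price, calories, and description to be stored against each food name. One possible way to do that is a dictionary keyed by name, sketched below. The function name `menu_by_name`, the root tag `breakfast_menu`, and the inline two-item sample (values copied from the output above) are assumptions for illustration; the fallback import is only there so the snippet runs without `lxml`.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library's compatible API.
    import xml.etree.ElementTree as etree

# Assumed file layout; the values are copied from the printed output above.
SAMPLE_MENU = b"""<breakfast_menu>
  <food id="f001">
    <name>Pancakes</name>
    <price>$5.95</price>
    <calories>350</calories>
    <description>Delicious fluffy pancakes served with maple syrup and butter.</description>
  </food>
  <food id="f002">
    <name>Waffles</name>
    <price>$6.95</price>
    <calories>400</calories>
    <description>Crispy golden waffles with fresh strawberries and whipped cream.</description>
  </food>
</breakfast_menu>"""


def menu_by_name(xml_bytes):
    """Return {food name: {"price": ..., "calories": ..., "description": ...}}."""
    root = etree.fromstring(xml_bytes)
    menu = {}
    for food in root.iter("food"):
        name = food.findtext("name", default="").strip()
        menu[name] = {
            "price": food.findtext("price", default=""),
            "calories": food.findtext("calories", default=""),
            "description": food.findtext("description", default=""),
        }
    return menu


menu = menu_by_name(SAMPLE_MENU)
print(menu["Pancakes"]["price"])
print(menu["Waffles"]["calories"])
```

Keying by name makes lookups direct, at the cost of silently overwriting entries if two foods share a name.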



## 2.2 Exercise 2

#### LXML Example – Book Catalogue
Consider an XML file that contains a **catalogue of books**.  
The entire catalogue is wrapped inside `<catalogue></catalogue>` tags.  

Each `<book>` has the following descriptors:  

| Descriptor      | Description                        |
|-----------------|------------------------------------|
| ID              | Unique identifier for each book    |
| Author          | Name of the author                 |
| Title           | Title of the book                  |
| Genre           | Genre of the book                  |
| Price           | Price of the book                  |
| Publish Date    | Publication date (`YYYY-MM-DD`)    |
| Description     | Summary or description of the book |

📖 **Source:** [Microsoft Docs](https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ms762271(v=vs.85))  

---

#### LXML – Parsing Example
Objective: **Read the content of the XML file and retrieve its information, storing each book's fields in a dictionary for later access.**  

**Steps:**
1. Read the XML file into Python.  
2. Use `lxml.etree.fromstring()` to parse the file contents into an lxml element.  
3. The entire XML structure is a **tree**; `fromstring()` returns its outermost element, the **root**.  
4. Access child elements with the `getchildren()` method (deprecated in lxml; iterating over the element directly is the current equivalent).  
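
The steps above can be sketched on a small inline string before working with the real file. The two-book sample below is a cut-down, hypothetical catalogue (ids, authors, and titles follow the output shown in this section), and the fallback import is an assumption so the snippet runs without `lxml`.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library's compatible API.
    import xml.etree.ElementTree as etree

# Step 1 stand-in: the XML content, here an inline sample instead of a file.
xml_bytes = b"""<catalogue>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
  </book>
</catalogue>"""

root = etree.fromstring(xml_bytes)   # step 2: parse into an element
print(root.tag)                      # step 3: the outermost tag is the root

for book in root:                    # step 4: iterating yields the child elements
    fields = {child.tag: child.text for child in book}
    print(book.get("id"), fields)

book_ids = [book.get("id") for book in root]
```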

In [2]:
from lxml import etree

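def parseBookXml(xmlfile):
    # Text mode works for this file; lxml rejects str input that carries an
    # encoding declaration, in which case open the file with "rb" instead.
    with open(xmlfile) as fobj:
        xml = fobj.read()
    root = etree.fromstring(xml)

    book_dict = {}   # details of the book currently being read
    books = []       # one dictionary per <book> element

    # getchildren() is deprecated in lxml; "for book in root" is equivalent.
    for book in root.getchildren():
        print("Book ID: ", book.get('id'))
        print("----------------")

        for elem in book.getchildren():
            text = elem.text or " "   # empty elements become a single space
            if elem.tag == "author":
                # "Last, First" is printed as "First Last"; partition() does
                # not raise if the comma is missing.
                last_name, _, first_name = text.partition(',')
                print(elem.tag + ':', first_name, last_name)
            else:
                print(elem.tag + ':' + text)
            book_dict[elem.tag] = text   # stored unchanged for every tag

        if book.tag == 'book':
            books.append(book_dict)
            print(" ")
            book_dict = {}

    return books

my_books = parseBookXml("1.2_Book_File.xml")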

Book ID:  bk101
----------------
author:  Matthew Gambardella
title:XML Developer's Guide
genre:Computer
price:44.95
publish_date:2000-10-01
description:An in-depth look at creating applications 
      with XML.
 
Book ID:  bk102
----------------
author:  Kim Ralls
title:Midnight Rain
genre:Fantasy
price:5.95
publish_date:2000-12-16
description:A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.
 
Book ID:  bk103
----------------
author:  Eva Corets
title:Maeve Ascendant
genre:Fantasy
price:5.95
publish_date:2000-11-17
description:After the collapse of a nanotechnology 
      society in England, the young survivors lay the 
      foundation for a new society.
 
Book ID:  bk104
----------------
author:  Eva Corets
title:Oberon's Legacy
genre:Fantasy
price:5.95
publish_date:2001-03-10
description:In post-apocalypse England, the mysterious 
      agent known only as Oberon helps to create a new life 
      for 

### Program Flow – Example 2 (Book XML File)

##### Code Purpose
Reads **1.2_Book_File.xml**, extracts book details, prints them, and returns a list of dictionaries.

##### Flow
1. **Function Call**  
   `my_books = parseBookXml("1.2_Book_File.xml")`

2. **Open XML File**  
   - File is opened and contents are read into a string `xml`.

3. **Parse XML**  
   - `etree.fromstring(xml)` creates a root element (XML tree).

4. **Initialize Data Structures**  
   - `book_dict = {}` → stores details of one book.  
   - `books = []` → stores all book dictionaries.

5. **Loop Through Each `<book>` Node**  
   - For each book element under root:  
     - Print Book ID.  
     - Loop over its children (e.g., `<author>`, `<title>`, `<genre>`, `<price>`, etc.).

6. **Check Element Text**  
   - If the child element has text, store it in `text`.  
   - If not, assign `" "` (a single space).

7. **Special Case – `<author>` Tag**  
   - Split the author string at the comma → `last_name, first_name`.  
   - Print `author: first_name last_name`.

8. **Normal Case – Other Tags**  
   - Print `tag:text`.  
   - In both cases, add the tag-value pair to `book_dict` (the author is stored in its original `last_name, first_name` form).

9. **End of `<book>` Node**  
   - Append `book_dict` into `books`.  
   - Reset `book_dict = {}` for the next book.

10. **Function End**  
    - `return books` → List of all book dictionaries is returned and stored in `my_books`.
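
Because the function returns a list of dictionaries, the data can be reused after parsing. The sketch below shows a few things one might do with it. The `books` list is a hand-written stand-in for the returned value (values taken from the printed output, with authors assumed to be stored as `Last, First`), and `display_author` is a hypothetical helper.

```python
def display_author(raw):
    """Turn 'Last, First' into 'First Last'; leave other formats unchanged."""
    last, sep, first = raw.partition(",")
    return f"{first.strip()} {last.strip()}" if sep else raw.strip()


# Stand-in for the list returned by the parser; values come from the output above.
books = [
    {"author": "Gambardella, Matthew", "title": "XML Developer's Guide",
     "genre": "Computer", "price": "44.95"},
    {"author": "Ralls, Kim", "title": "Midnight Rain",
     "genre": "Fantasy", "price": "5.95"},
    {"author": "Corets, Eva", "title": "Maeve Ascendant",
     "genre": "Fantasy", "price": "5.95"},
]

# Filter by genre.
fantasy_titles = [b["title"] for b in books if b["genre"] == "Fantasy"]

# Prices are stored as text, so convert them before doing arithmetic.
total_price = sum(float(b["price"]) for b in books)

# Reformat author names for display.
authors = [display_author(b["author"]) for b in books]

print(fantasy_titles)
print(round(total_price, 2))
print(authors)
```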


---

#### ✅ Key Difference Between Example 1 and Example 2

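- Example 1: Prints the food data and collects it in `food_list`, but never returns the list (**no return value**), so `my_Food` is `None`.  
- Example 2: Prints and **returns book data** (so you can reuse it later).  
- Example 2 has an extra special case for `<author>`, splitting the name at the comma so it can be printed as first name followed by last name.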
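
The effect of a missing `return` can be seen in isolation with the minimal sketch below. Both function names are hypothetical and stand in for the two parsing functions.

```python
def collect_without_return(items):
    collected = []
    for item in items:
        collected.append(item)
    # No return statement: Python returns None implicitly.


def collect_with_return(items):
    collected = []
    for item in items:
        collected.append(item)
    return collected


lost = collect_without_return(["Pancakes", "Waffles"])
kept = collect_with_return(["Pancakes", "Waffles"])

print(lost)   # None
print(kept)   # ['Pancakes', 'Waffles']
```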