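This code imports the standard-library `xml.etree.ElementTree` module, parses a small XML string with `ET.fromstring()`, and prints the root element's tag and attributes. The root `<catalog>` element has no attributes, so `root.attrib` is an empty dictionary.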

In [8]:
import xml.etree.ElementTree as ET

xml_string = """
<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
        <price currency="USD">29.99</price>
    </product>
</catalog>
"""

root = ET.fromstring(xml_string)
print(f"Root tag: {root.tag}")
print(f"Root attributes: {root.attrib}")

Root tag: catalog
Root attributes: {}
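
Because the root above has no attributes, it can help to look one level down. The following is a minimal sketch, reusing the same XML string, of how an element can be treated as a sequence of its direct children; the variable names `child_count`, `first_child`, and `grandchild_tags` are illustrative only.

```python
import xml.etree.ElementTree as ET

xml_string = """
<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
        <price currency="USD">29.99</price>
    </product>
</catalog>
"""

root = ET.fromstring(xml_string)

# An Element behaves like a sequence of its direct children
child_count = len(root)
first_child = root[0]
print(f"Children of root: {child_count}")
print(f"First child: {first_child.tag} {first_child.attrib}")

# Iterating over an element yields its direct children in document order
grandchild_tags = [child.tag for child in first_child]
print(f"Tags inside product: {grandchild_tags}")
```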


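This code parses an XML file named `products.xml` with `ET.parse()`, which returns an `ElementTree` object, then calls `getroot()` to obtain the root element and prints its tag. The file must exist in the current working directory.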

In [9]:
# Parse an XML file
tree = ET.parse('products.xml')
root = tree.getroot()

print(f"Root element: {root.tag}")

Root element: catalog
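
The contents of `products.xml` are not shown here, so the sketch below writes a small stand-in file to a temporary directory before parsing it. The sample content and the names `sample_xml`, `sample_path`, and `sample_root` are assumptions made for illustration. It also shows that malformed XML raises `ET.ParseError`.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical stand-in content; the real products.xml is not shown in this tutorial
sample_xml = """<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
    </product>
</catalog>
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "products.xml")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write(sample_xml)

    # ET.parse() returns an ElementTree; getroot() gives the top-level Element
    tree = ET.parse(sample_path)
    sample_root = tree.getroot()

print(f"Root element: {sample_root.tag}")

# Malformed input (mismatched tags here) raises ET.ParseError
try:
    ET.fromstring("<catalog><product></catalog>")
    parse_failed = False
except ET.ParseError as exc:
    parse_failed = True
    print(f"Malformed XML: {exc}")
```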


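This code shows three ways to search an XML structure parsed from a string: `find()` returns the first matching direct child, `findall()` returns a list of all matching direct children, and `iter()` walks the whole subtree and yields every matching element at any depth.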

In [10]:
import xml.etree.ElementTree as ET

xml_data = """
<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
        <categories>
            <category>Electronics</category>
            <category>Accessories</category>
        </categories>
    </product>
    <product id="102">
        <name>USB Mouse</name>
        <categories>
            <category>Electronics</category>
        </categories>
    </product>
</catalog>
"""

root = ET.fromstring(xml_data)

# Method 1: find() - returns the FIRST matching direct child (or None if there is none)
first_product = root.find('product')
print(f"First product ID: {first_product.get('id')}")

# Method 2: findall() - returns ALL direct children that match
all_products = root.findall('product')
print(f"Total products: {len(all_products)}")

# Method 3: iter() - recursively finds ALL matching elements
all_categories = root.iter('category')
category_list = [cat.text for cat in all_categories]
print(f"All categories: {category_list}")

First product ID: 101
Total products: 2
All categories: ['Electronics', 'Accessories', 'Electronics']
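
The difference between direct-child and whole-tree searches is easy to miss. The sketch below, using a shortened version of the same catalog, shows that a bare tag name passed to `find()` does not reach nested elements, that a `.//` prefix does, and that the iterator returned by `iter()` can be consumed only once. Variable names are illustrative.

```python
import xml.etree.ElementTree as ET

xml_data = """
<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
        <categories>
            <category>Electronics</category>
            <category>Accessories</category>
        </categories>
    </product>
</catalog>
"""

root = ET.fromstring(xml_data)

# A bare tag name matches direct children only, so the nested
# <category> elements are not found when searching from the root
direct_match = root.find('category')
print(f"Direct child search: {direct_match}")

# A './/' prefix searches all descendants instead
nested_matches = [c.text for c in root.findall('.//category')]
print(f"Descendant search: {nested_matches}")

# iter() returns a lazy iterator; once exhausted it yields nothing more
category_iter = root.iter('category')
first_pass = [c.text for c in category_iter]
second_pass = [c.text for c in category_iter]
print(f"First pass: {first_pass}, second pass: {second_pass}")
```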


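This code extracts text content and attributes from elements in a parsed XML string. Both `.text` and attribute values are returned as strings, so numeric values such as the price and stock are still text at this point.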

In [11]:
xml_data = """
<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
        <price currency="USD">29.99</price>
        <stock>45</stock>
    </product>
</catalog>
"""

root = ET.fromstring(xml_data)
product = root.find('product')

# Get element text content
product_name = product.find('name').text
price_text = product.find('price').text
stock_text = product.find('stock').text

# Get attributes (two ways)
product_id = product.get('id')  # Method 1: .get() returns None if the attribute is missing
product_id_alt = product.attrib['id']  # Method 2: .attrib dictionary raises KeyError if missing

# Get an attribute from a child element
price_element = product.find('price')
currency = price_element.get('currency')

print(f"Product: {product_name}")
print(f"ID: {product_id}")
print(f"Price: {currency} {price_text}")
print(f"Stock: {stock_text}")

Product: Wireless Keyboard
ID: 101
Price: USD 29.99
Stock: 45
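
Since the extracted values are strings, arithmetic on them needs an explicit conversion. Here is a minimal sketch under the assumption that the price should be handled as a `Decimal` rather than a `float`; that choice, and the `total_value` calculation, are illustrative and not part of the original example.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

xml_data = """
<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
        <price currency="USD">29.99</price>
        <stock>45</stock>
    </product>
</catalog>
"""

product = ET.fromstring(xml_data).find('product')

# .text and attribute values are strings, so convert them explicitly
price = Decimal(product.find('price').text)  # Decimal avoids float rounding for money
stock = int(product.find('stock').text)
product_id = int(product.get('id'))

total_value = price * stock
print(f"Inventory value for product {product_id}: {total_value}")
```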


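This code defines a function, `parse_product_catalog`, that takes the path to an XML catalog file, parses it, and returns a list of product dictionaries containing each product's ID, name, price, currency, stock, and nested categories. The function is then called on the catalog file and the extracted details are printed. Note that it assumes every product has `name`, `price`, and `stock` elements; if one is missing, `find()` returns `None` and accessing `.text` raises an `AttributeError`.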

In [12]:
def parse_product_catalog(xml_file):
    """Parse an XML product catalog and return a list of product dictionaries."""
    tree = ET.parse(xml_file)
    root = tree.getroot()

    products = []

    for product_element in root.findall('product'):
        # Extract product data
        product = {
            'id': product_element.get('id'),
            'name': product_element.find('name').text,
            'price': float(product_element.find('price').text),
            'currency': product_element.find('price').get('currency'),
            'stock': int(product_element.find('stock').text),
            'categories': []
        }

        # Extract categories (nested elements)
        categories_element = product_element.find('categories')
        if categories_element is not None:
            for category in categories_element.findall('category'):
                product['categories'].append(category.text)

        products.append(product)

    return products

# Usage
products = parse_product_catalog('products.xml')  # same file parsed earlier

for product in products:
    print(f"\nProduct: {product['name']}")
    print(f"  ID: {product['id']}")
    print(f"  Price: {product['currency']} {product['price']}")
    print(f"  Stock: {product['stock']}")
    print(f"  Categories: {', '.join(product['categories'])}")


Product: Wireless Keyboard
  ID: 101
  Price: USD 29.99
  Stock: 45
  Categories: Electronics, Accessories

Product: USB Mouse
  ID: 102
  Price: USD 15.99
  Stock: 120
  Categories: Electronics
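
`ET.parse()` accepts an open file object as well as a path, which makes a parsing function like the one above easy to try without a file on disk. The sketch below uses a deliberately simplified, hypothetical helper named `list_product_names` (not the function defined above) together with `io.StringIO`.

```python
import io
import xml.etree.ElementTree as ET

def list_product_names(source):
    """Hypothetical simplified helper: return product names from a path or file object."""
    root = ET.parse(source).getroot()
    return [p.findtext('name') for p in root.findall('product')]

# An in-memory text stream stands in for a file on disk
in_memory_file = io.StringIO("""<catalog>
    <product id="101"><name>Wireless Keyboard</name></product>
    <product id="102"><name>USB Mouse</name></product>
</catalog>""")

names = list_product_names(in_memory_file)
print(names)
```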


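This code shows how to handle elements that may be missing from the XML data. `find()` returns `None` when nothing matches, so the result is checked with `is not None` before its text or attributes are accessed. The explicit comparison matters because an element with no children can evaluate as false in a plain `if` test. The example also passes a default value to `.get()`, which is returned when the attribute is absent.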

In [13]:
xml_data = """
<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
        <price currency="USD">29.99</price>
    </product>
    <product id="102">
        <name>USB Mouse</name>
        <!-- Missing price element -->
    </product>
</catalog>
"""

root = ET.fromstring(xml_data)

for product in root.findall('product'):
    name = product.find('name').text

    # Safe way to handle potentially missing elements
    price_element = product.find('price')
    if price_element is not None:
        price = float(price_element.text)
        currency = price_element.get('currency', 'USD')  # Default value
        print(f"{name}: {currency} {price}")
    else:
        print(f"{name}: Price not available")


Wireless Keyboard: USD 29.99
USB Mouse: Price not available
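
An alternative to the explicit `None` check is `findtext()`, which returns a default value when the element is missing. This is a minimal sketch of that approach; the `summary` dictionary and the `'(unnamed)'` placeholder are illustrative choices.

```python
import xml.etree.ElementTree as ET

xml_data = """
<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
        <price currency="USD">29.99</price>
    </product>
    <product id="102">
        <name>USB Mouse</name>
    </product>
</catalog>
"""

root = ET.fromstring(xml_data)

summary = {}
for product in root.findall('product'):
    name = product.findtext('name', default='(unnamed)')
    # findtext() returns the default instead of raising when <price> is absent
    price_text = product.findtext('price', default=None)
    summary[name] = float(price_text) if price_text is not None else None

print(summary)
```

One caveat: for an element that exists but is empty, `findtext()` returns an empty string rather than the default, so `float()` would still fail on it.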


**Note:** For more advanced navigation and selection, you can explore XPath, a query language for selecting nodes in an XML document that is especially useful for complex structures. `ElementTree` supports a limited subset of XPath in `find()`, `findall()`, and `iterfind()`. We can cover this in another tutorial.
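
As a small taste of path expressions, the sketch below uses a few forms that `ElementTree` accepts in `find()` and `findall()`. The catalog data mirrors the earlier examples, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

xml_data = """
<catalog>
    <product id="101">
        <name>Wireless Keyboard</name>
        <price currency="USD">29.99</price>
        <categories>
            <category>Electronics</category>
            <category>Accessories</category>
        </categories>
    </product>
    <product id="102">
        <name>USB Mouse</name>
        <categories>
            <category>Electronics</category>
        </categories>
    </product>
</catalog>
"""

root = ET.fromstring(xml_data)

# Attribute predicate: the product whose id attribute equals "102"
mouse = root.find(".//product[@id='102']")
print(mouse.findtext('name'))

# Child-text predicate: products whose <name> child has this exact text
keyboard = root.find("./product[name='Wireless Keyboard']")
print(keyboard.get('id'))

# Child-existence predicate: only products that contain a <price> element
priced_ids = [p.get('id') for p in root.findall("./product[price]")]
print(priced_ids)

# Multi-step path: categories of the first product (positions are 1-based)
first_categories = [c.text for c in root.findall("./product[1]/categories/category")]
print(first_categories)
```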