# Data Schema Transformer
This notebook demonstrates how to transform **raw sales data** into a **structured, schema-based dictionary**.  
It is part of my portfolio projects, showcasing **data structuring**, **Python built-in functions** (like `zip`), and **extensible code design**.

## Project Description
The main idea is to start with **raw sales data** (a list of tuples) and a **schema** (a list of field names).  
By combining these, we can build a **dictionary** where:  

- The **main key** is the product name.  
- The **value** is another dictionary containing attributes from the schema.  

This design is **extensible**: if the schema gains a field (e.g., "stock") and each row gains a matching value in the same position, the transformation code picks up the new field without being modified.
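
As a minimal sketch of that claim, the snippet below pairs one row with a schema that has an extra `stock` field. The names `extended_schema` and `extended_row`, and the stock value, are made up for illustration and are not part of this notebook's data.

```python
# Hypothetical schema and row: "stock" is an added field used only for illustration.
extended_schema = ['widget', 'num_sold', 'unit_price', 'stock']
extended_row = ('item1', 10, 100.0, 42)

# zip pairs each field name with the value at the same position in the row.
record = dict(zip(extended_schema, extended_row))

# Remove the first field and use its value as the outer key.
key = record.pop(extended_schema[0])
nested = {key: record}

print(nested)
# {'item1': {'num_sold': 10, 'unit_price': 100.0, 'stock': 42}}
```

The pairing is positional, so the order of the schema must match the order of the values in each tuple.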

In [3]:
# Example raw data and schema
data = [
    ('item1', 10, 100.0),
    ('item2', 5, 25.0),
    ('item3', 100, 0.25)
]

schema = ['widget', 'num_sold', 'unit_price']


In [9]:
# Expected Output
result = {
    'item1': {'num_sold': 10, 'unit_price': 100.0},
    'item2': {'num_sold': 5, 'unit_price': 25.0},
    'item3': {'num_sold': 100, 'unit_price': 0.25}
}

In [5]:
def transform_data(data, schema):
    """
    Transforms raw data into a structured dictionary based on the given schema.
    
    Parameters:
        data (list of tuples): Raw input data
        schema (list of str): Field names corresponding to tuple elements
    
    Returns:
        dict: Structured data with product name as main key
    """
    result = {}
    for row in data:
        # Create mapping between schema and row
        record = dict(zip(schema, row))
        # Use the first schema field (here 'widget') as the main key
        key = record.pop(schema[0])
        result[key] = record
    return result
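
One thing to keep in mind: `zip` stops at the shorter of its inputs, so a row with a missing value is accepted silently and simply loses a field. The sketch below shows one possible guard. `transform_data_checked` and the sample row `short_row` are hypothetical names introduced here, not part of the original notebook.

```python
def transform_data_checked(data, schema):
    """Hypothetical variant that rejects rows whose length differs from the schema."""
    result = {}
    for index, row in enumerate(data):
        # Fail early instead of letting zip truncate the record.
        if len(row) != len(schema):
            raise ValueError(
                f"row {index} has {len(row)} values, expected {len(schema)}"
            )
        record = dict(zip(schema, row))
        key = record.pop(schema[0])
        result[key] = record
    return result


schema = ['widget', 'num_sold', 'unit_price']

# Made-up row with no unit_price: plain zip drops the field without complaint.
short_row = ('item4', 7)
print(dict(zip(schema, short_row)))
# {'widget': 'item4', 'num_sold': 7}

try:
    transform_data_checked([short_row], schema)
except ValueError as error:
    print(error)
# row 0 has 2 values, expected 3
```

On Python 3.10 and later, `zip(schema, row, strict=True)` raises a `ValueError` on a length mismatch and could replace the explicit check.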


In [45]:
from pprint import pprint

def transform_data(data, schema):
    """
    Transforms raw data into a structured dictionary based on the given schema.
    Equivalent to the loop-based version above, written as a dict comprehension.
    
    Parameters:
        data (list of tuples): Raw input data
        schema (list of str): Field names corresponding to tuple elements
    
    Returns:
        dict: Structured data with product name as main key
    """
    # Build dictionary using dictionary comprehension:
    # - Use the first element of each row (row[0]) as the main key
    # - Zip the remaining schema fields with the remaining row values to create nested dict
    result = {
        row[0]: dict(zip(schema[1:], row[1:]))
        for row in data
    }

    return result
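
Because the product name becomes a dictionary key, two rows with the same name cannot both survive: the later row overwrites the earlier one. The sketch below demonstrates this with made-up data (`duplicated` is not part of the notebook) and shows one possible way to spot repeated names beforehand using `collections.Counter`.

```python
from collections import Counter


def transform_data(data, schema):
    # Same comprehension as the cell above, repeated so this sketch runs on its own.
    return {row[0]: dict(zip(schema[1:], row[1:])) for row in data}


schema = ['widget', 'num_sold', 'unit_price']

# Hypothetical data in which 'item1' appears twice.
duplicated = [
    ('item1', 10, 100.0),
    ('item1', 3, 90.0),
]

print(transform_data(duplicated, schema))
# {'item1': {'num_sold': 3, 'unit_price': 90.0}}

# Count how often each product name occurs and keep the repeated ones.
counts = Counter(row[0] for row in duplicated)
repeated = [name for name, count in counts.items() if count > 1]
print(repeated)
# ['item1']
```

Whether duplicates should raise an error, be merged, or be left to overwrite depends on the data source; the notebook's data has unique names, so the question does not arise there.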

In [44]:
pprint(transform_data(data, schema))

{'item1': {'num_sold': 10, 'unit_price': 100.0},
 'item2': {'num_sold': 5, 'unit_price': 25.0},
 'item3': {'num_sold': 100, 'unit_price': 0.25}}
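
With the data keyed by product name, each attribute can be read by name rather than by tuple position. As a small usage sketch, the snippet below derives revenue per product from the output shown above; the `revenue` calculation is an illustrative assumption about how the structure might be used, not something the original notebook does.

```python
# The transformed output shown above, repeated so the sketch is self-contained.
result = {
    'item1': {'num_sold': 10, 'unit_price': 100.0},
    'item2': {'num_sold': 5, 'unit_price': 25.0},
    'item3': {'num_sold': 100, 'unit_price': 0.25},
}

# Direct lookup by product name, then by field name.
print(result['item2']['unit_price'])
# 25.0

# Illustrative derived value: units sold multiplied by unit price.
revenue = {
    name: fields['num_sold'] * fields['unit_price']
    for name, fields in result.items()
}
print(revenue)
# {'item1': 1000.0, 'item2': 125.0, 'item3': 25.0}
```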