# Lume: Understanding Transformation Results

This cookbook walks through the results returned by any job execution and the mapping logic that comes with them.

### Overview

This notebook covers the following topics:

- **Retrieving the Result Spec and Associated Mappings:** Learn how to access and interpret the result specification and associated mappings that are returned after a job execution. This section will guide you through understanding the structure of the results, including spec, mappings, and the confidence score.

- **Excel and PDF Data Extraction:** Extract data from PDF and Excel files via file paths alone. This section demonstrates how to use Lume's tools to efficiently pull data from these file types, handling various formats and layouts, and converting them into structured data for further processing.


In [1]:
# Install Lume's Python SDK
%pip install lume-py

### Set up your lume-api-key

First, set the API key used for making calls to the Lume API.

In [None]:
import lume_py as lume

lume.set_api_key("...")  # sets the api-key

### Prior Context

This cookbook assumes a pipeline called `ecomm_test` has already been created. The existing pipeline maps source ecommerce data to an internal ecommerce data model. The target schema used in the pipeline is in this cookbook's folder, as `target_schema.json`. The cell below loads the target and source JSON files from the `data` directory.

In [3]:
import json
import os

# Build paths relative to the current working directory
data_dir = os.path.join(os.getcwd(), "data")

target_data_path = os.path.join(data_dir, "target.json")
with open(target_data_path) as f:
    target_data = json.load(f)

source_data_path = os.path.join(data_dir, "source.json")
with open(source_data_path) as f:
    source_data = json.load(f)

##### 1. Retrieve a result by id, or list all the results for your account

In [4]:
results = await lume.Result.get_results()
result = await lume.Result.get_by_id(results[0].id)

### For Any Result: Retrieve the Specifications Associated with a Result

#### Spec
A configuration or mapping schema that specifies how your source data was transformed, mapped, and defaulted.

**Spec Keys**

Each field under the top-level sections contains:

- **`@sources`**: A list of data paths (usually strings representing paths in a nested data structure) from which the value for that field should be sourced.
  
- **`@default_values`**: A list of default values to use if the sources do not provide a value.
  
- **`@lookup`**: A dictionary that maps specific source values to target values. This is used for data transformation or categorization.

- **`@confidence_scores`** (optional): Confidence scores for the generated lookup table. These apply only to target properties that run the classifier, namely the ones that have an enum. Confidence values are grouped into the buckets Confident, Very High, High, Medium, Low, Very Low, and Incorrect.
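
To make the roles of `@sources`, `@default_values`, and `@lookup` concrete, here is a minimal sketch of how a single field could be resolved from a source record. It is an illustration only, not Lume's implementation: the example spec fragment, the dot-separated path syntax, the helper names `get_path` and `resolve_field`, and the order of precedence (sources first, then defaults, then lookup) are all assumptions.

```python
from typing import Any

# Hypothetical spec fragment for one field; real specs come from result.get_spec().
example_field_spec = {
    "@sources": ["order.status", "status"],
    "@default_values": ["unknown"],
    "@lookup": {"SHIPPED": "shipped", "PENDING": "pending"},
}


def get_path(record: dict, path: str) -> Any:
    """Follow a dot-separated path through nested dicts (assumed path syntax)."""
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def resolve_field(record: dict, field_spec: dict) -> Any:
    """Use the first source that has a value, else the first default, then apply the lookup."""
    value = None
    for path in field_spec.get("@sources", []):
        value = get_path(record, path)
        if value is not None:
            break
    if value is None:
        defaults = field_spec.get("@default_values", [])
        value = defaults[0] if defaults else None
    lookup = field_spec.get("@lookup", {})
    # Only string values are looked up here; anything else passes through unchanged.
    if isinstance(value, str) and value in lookup:
        return lookup[value]
    return value


resolved = resolve_field({"order": {"status": "SHIPPED"}}, example_field_spec)
```

Under these assumptions, a record with `order.status` set to `"SHIPPED"` resolves through the lookup, while an empty record falls back to the default value.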



In [None]:
await result.get_spec()

##### Mappings:

A Mapping is the generated output from Lume. Each mapping corresponds to a source record, so a job with multiple source records will contain multiple mappings. Each mapping exposes the mapped record through its `mapped_record` attribute.



In [8]:
mappings = await result.get_mappings()

In [None]:
mappings[0].mapped_record  # returns the mapped record from the source data

In [None]:
await result.get_mappings()  # retrieves the list of mappings associated with this result
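
Since each mapping corresponds to one source record, a common next step is to gather every `mapped_record` into a plain list. The sketch below uses stand-in objects so that it runs without the SDK; the helper name `collect_mapped_records`, the example records, and the idea that a mapping might carry no mapped record are assumptions made for illustration.

```python
from types import SimpleNamespace


def collect_mapped_records(mappings):
    """Return the mapped_record of every mapping, skipping any that have none."""
    records = []
    for mapping in mappings:
        record = getattr(mapping, "mapped_record", None)
        if record is not None:
            records.append(record)
    return records


# Stand-ins for the objects returned by result.get_mappings() (hypothetical data).
fake_mappings = [
    SimpleNamespace(mapped_record={"order_id": "A1", "status": "shipped"}),
    SimpleNamespace(mapped_record=None),
    SimpleNamespace(mapped_record={"order_id": "B2", "status": "pending"}),
]

mapped_records = collect_mapped_records(fake_mappings)
```

With real data, the same helper could be called on the `mappings` list retrieved earlier in this notebook.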

Once a job run executes to completion, a Result object is returned. The Result provides two key pieces of information:
1. Confidence Scores
2. Spec: the high-level mapping logic and lookup table of the pipeline used on this job.

In [None]:
# generate confidence from result
confidence = await result.generate_confidence_scores()

# grab spec
spec = await result.get_spec()

print("Confidence: ", confidence)
print("Spec:", spec)
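
One way to act on the confidence buckets is to flag fields that fall at or below a chosen bucket for manual review. The sketch below is hypothetical: it assumes the buckets listed earlier are ordered from most to least confident, and it assumes a simple dictionary of field name to bucket label, which may not match the structure that `generate_confidence_scores()` returns.

```python
# Bucket names as listed in this cookbook, assumed to run from most to least confident.
BUCKETS = ["Confident", "Very High", "High", "Medium", "Low", "Very Low", "Incorrect"]


def fields_needing_review(bucket_by_field: dict, threshold: str = "Medium") -> list:
    """Return the field names whose bucket is at or below `threshold`, sorted by name."""
    cutoff = BUCKETS.index(threshold)
    return sorted(
        field
        for field, bucket in bucket_by_field.items()
        if BUCKETS.index(bucket) >= cutoff
    )


# Hypothetical per-field buckets, for illustration only.
example_buckets = {"status": "Very High", "currency": "Low", "category": "Medium"}
flagged = fields_needing_review(example_buckets)
```

Raising or lowering `threshold` changes how strict the review list is; an unknown bucket label raises a `ValueError`, which surfaces unexpected values early.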