# Structured extract with Box AI

In this notebook, we will learn how to use the Box AI API `/ai/extract_structured` endpoint to extract key/value pairs from files in Box, based on a pre-configured Box Metadata template. The files we will be using are five invoices.

## Prerequisites

You must complete the Setup notebook first. It creates all of the Box objects, folders, and files that you need, writes an environment file to help you get started, and imports all the libraries you will need.

## Workshop

The first step is to import all of the environment variables we need for this exercise.

In [None]:
import os
from dotenv import load_dotenv

load_dotenv(override=True)

BOX_CLIENT_ID=os.getenv('BOX_CLIENT_ID')
BOX_CLIENT_SECRET=os.getenv('BOX_CLIENT_SECRET')
BOX_USER_ID=os.getenv('BOX_USER_ID')
BOX_FOLDER_ID=os.getenv('EXERCISE5_FOLDER')
BOX_METADATA_TEMPLATE_KEY=os.getenv('METADATA_TEMPLATE_KEY')
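
`os.getenv` returns `None` for a variable that is not set, so a missing entry in the environment file only surfaces later as a confusing authentication or API error. As a minimal sketch, the check below reports which of the names read above are unset or empty. The helper `missing_env_vars` and the `REQUIRED_VARS` list are illustrative additions, not part of the Setup notebook.

```python
import os

# Names taken from the cell above; each one must be present in the environment.
REQUIRED_VARS = [
    "BOX_CLIENT_ID",
    "BOX_CLIENT_SECRET",
    "BOX_USER_ID",
    "EXERCISE5_FOLDER",
    "METADATA_TEMPLATE_KEY",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set.")
```

If anything is reported as missing, rerunning the Setup notebook should recreate the environment file.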

Next, we will create a `BoxClient` object from the Python SDK, authenticating to the API with a client credentials grant (CCG). We'll print out the current user's information to confirm that we are properly authenticated.

In [None]:
from box_sdk_gen import BoxClient, CCGConfig, BoxCCGAuth

ccg_config = CCGConfig(
    client_id=BOX_CLIENT_ID,
    client_secret=BOX_CLIENT_SECRET,
    user_id=BOX_USER_ID,
)

ccg_auth = BoxCCGAuth(ccg_config)

client = BoxClient(ccg_auth)

print(f"{client.users.get_user_me()}")

Now we will loop through the files in our folder and, for each one, extract the values defined by the provided metadata template.

In [None]:
from box_sdk_gen import (
    AiItemBase,
    AiItemBaseTypeField,
    CreateAiExtractStructuredMetadataTemplate
)

files = client.folders.get_folder_items(BOX_FOLDER_ID).to_dict()

print("Processing files...")

for file in files["entries"]:
    box_ai_response = client.ai.create_ai_extract_structured(
        items=[
            AiItemBase(
                id=file["id"],
                type=AiItemBaseTypeField.FILE,
            )
        ],
        metadata_template=CreateAiExtractStructuredMetadataTemplate(
            scope="enterprise",
            template_key=BOX_METADATA_TEMPLATE_KEY
        )
    )

    print(f"File: {file['name']}")
    print(f"AI Extract Structured Metadata: {box_ai_response.answer}")
    print("--------------------------------------------------")
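
The loop above assumes that every folder entry is a file and that every extraction call succeeds. As a minimal sketch of a more defensive version, the helper below skips non-file entries and records failures instead of stopping at the first one. The function `extract_from_files` and its `extract_fn` parameter are hypothetical names introduced for illustration; `extract_fn` stands in for a small wrapper around `client.ai.create_ai_extract_structured` that returns the `answer` for a given file ID.

```python
def extract_from_files(entries, extract_fn):
    """Run extract_fn on every file entry, collecting results and errors.

    entries: list of dicts with "id", "name", and "type" keys, as found in
             files["entries"] above.
    extract_fn: hypothetical callable that takes a file ID and returns the
                extracted answer.
    """
    results, errors = {}, {}
    for entry in entries:
        # Folder listings can also contain folders and web links; skip those.
        if entry.get("type") != "file":
            continue
        try:
            results[entry["name"]] = extract_fn(entry["id"])
        except Exception as exc:  # keep going if a single file fails
            errors[entry["name"]] = str(exc)
    return results, errors
```

Because the extraction call is passed in as a function, the same helper can be exercised with a stub before it is pointed at the real API.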

Extract Structured gives you a powerful extraction tool that you can run against your files where they already live in Box, without having to move them around. It also returns its response in a format that you can easily push back into Box as Box Metadata, which makes files easier to find and can power your Box Apps.
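
Pushing the response back into Box could look roughly like the sketch below. It assumes that `answer` is a dictionary keyed by the template's field keys, that the installed `box_sdk_gen` version exposes `client.file_metadata.create_file_metadata_by_id` and `CreateFileMetadataByIdScope` as used here, and that the template lives in the enterprise scope. The helper names `build_metadata_payload` and `save_as_metadata` are illustrative, not part of the SDK.

```python
def build_metadata_payload(answer):
    """Drop fields the extraction left empty so only real values are stored."""
    return {key: value for key, value in answer.items() if value not in (None, "")}


def save_as_metadata(client, file_id, template_key, answer):
    """Attach the extracted values to the file as a metadata instance.

    Assumes `client` is a box_sdk_gen BoxClient and that the template lives
    in the enterprise scope, as in the extraction call above.
    """
    # Imported lazily so build_metadata_payload can be used without the SDK.
    from box_sdk_gen import CreateFileMetadataByIdScope

    return client.file_metadata.create_file_metadata_by_id(
        file_id,
        CreateFileMetadataByIdScope.ENTERPRISE,
        template_key,
        build_metadata_payload(answer),
    )
```

Two caveats apply to this sketch: extracted values may need converting to the types the template's fields expect, and creating an instance on a file that already has one for the same template is expected to fail, in which case an update call would be needed instead.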

To get you started, we've provided a full, runnable file for this exercise. Running the following cell will generate the file for you in the exercise folder. Use it as-is, or use it as inspiration or a starting point for your workflows and applications.

In [None]:
%%writefile box_ai_structured_extract.py
import os
import asyncio
from dotenv import load_dotenv

from box_sdk_gen import (
    BoxClient,
    CCGConfig,
    BoxCCGAuth,
    AiItemBase,
    AiItemBaseTypeField,
    CreateAiExtractStructuredMetadataTemplate
)

load_dotenv(override=True)

BOX_CLIENT_ID=os.getenv('BOX_CLIENT_ID')
BOX_CLIENT_SECRET=os.getenv('BOX_CLIENT_SECRET')
BOX_USER_ID=os.getenv('BOX_USER_ID')
BOX_FOLDER_ID=os.getenv('EXERCISE5_FOLDER')
BOX_METADATA_TEMPLATE_KEY=os.getenv('METADATA_TEMPLATE_KEY')

def get_box_client():
    ccg_config = CCGConfig(
        client_id=BOX_CLIENT_ID,
        client_secret=BOX_CLIENT_SECRET,
        user_id=BOX_USER_ID,
    )

    ccg_auth = BoxCCGAuth(ccg_config)

    client = BoxClient(ccg_auth)

    return client

async def extract_structured_metadata():
    """Extract the template-defined values from every file in the exercise folder."""
    client = get_box_client()

    # Confirm that we are authenticated as the expected user.
    print(f"{client.users.get_user_me()}")

    files = client.folders.get_folder_items(BOX_FOLDER_ID).to_dict()

    print("Processing files...")

    for file in files["entries"]:
        # Ask Box AI to fill in the metadata template's fields for this file.
        box_ai_response = client.ai.create_ai_extract_structured(
            items=[
                AiItemBase(
                    id=file["id"],
                    type=AiItemBaseTypeField.FILE,
                )
            ],
            metadata_template=CreateAiExtractStructuredMetadataTemplate(
                scope="enterprise",
                template_key=BOX_METADATA_TEMPLATE_KEY
            )
        )

        print(f"File: {file['name']}")
        print(f"AI Extract Structured Metadata: {box_ai_response.answer}")
        print("--------------------------------------------------")

if __name__ == "__main__":
    asyncio.run(extract_structured_metadata())