# Setup

This notebook will help you set up your Box development environment for running the following workshops. There are a few things you will need in advance:

* A Box instance with Box AI APIs, Box AI Studio, and Box Hubs enabled
* A Box application with the `Manage AI` scope enabled and Client Credentials authentication
* The Box application enabled in the Box admin console
* The Box user ID for the user that created the app

In addition, you will want to create and activate a virtual environment if you haven't already. You can do so with the following commands at the command line. You will need to exit the Jupyter Notebook process and then re-start it when you are done.

```bash
    python -m venv .venv
    source .venv/bin/activate
```
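
After restarting Jupyter, it can be worth confirming that the kernel is actually running inside the virtual environment. The following is a minimal sketch using only the standard library; `in_virtualenv` is a hypothetical helper name, not part of the workshop code.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


print(f"Running inside a virtual environment: {in_virtualenv()}")
```

If this prints `False`, the notebook was probably started from a shell where the environment was not activated.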

Next, install the required dependencies:

In [4]:
!pip install box-sdk-gen python-dotenv

Collecting python-dotenv
  Using cached python_dotenv-1.1.1-py3-none-any.whl.metadata (24 kB)
Using cached python_dotenv-1.1.1-py3-none-any.whl (20 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.1.1

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.1.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


## Setting up your environment variables

In this exercise, we will use a .env file and `python-dotenv` to share these variables across the exercises. By the time you complete this notebook, the .env file will contain:

* Key, secret, and user ID
* Box Folder IDs for each exercise
* Box Hubs ID for exercise 3
* Box Metadata Template key for exercise 5
* Box AI Agent ID for exercise 6

> **We do not recommend storing plain-text keys and secrets in files in the real world. This is strictly for the purposes of this exercise.**

The first step is to gather your key, secret, and user ID.

In [46]:
import getpass
import os
from dotenv import load_dotenv

BOX_CLIENT_ID=getpass.getpass("Enter your Box Client ID: ")
BOX_CLIENT_SECRET=getpass.getpass("Enter your Box Client Secret: ")
BOX_USER_ID=getpass.getpass("Enter your Box User ID: ")

# Write the credentials to .env, closing the file once all three lines are written
with open('.env', 'w') as env_file:
    env_file.write(f"BOX_CLIENT_ID={BOX_CLIENT_ID}\n")
    env_file.write(f"BOX_CLIENT_SECRET={BOX_CLIENT_SECRET}\n")
    env_file.write(f"BOX_USER_ID={BOX_USER_ID}\n")

load_dotenv(override=True)

True
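
The setup cells in this notebook append `KEY=value` lines to `.env`, so re-running a cell adds another line for the same key. If that becomes a nuisance, a small helper along these lines could replace an existing key instead of duplicating it. This is a sketch under assumptions: `set_env_var` is a hypothetical name, and it only handles simple `KEY=value` lines without quoting or comments.

```python
from pathlib import Path


def set_env_var(key: str, value: str, path: str = ".env") -> None:
    """Write KEY=value to the file, replacing any earlier line for the same key."""
    env_path = Path(path)
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    # Drop any previous assignment of this key so re-runs do not duplicate it.
    kept = [line for line in lines if not line.startswith(f"{key}=")]
    kept.append(f"{key}={value}")
    env_path.write_text("\n".join(kept) + "\n")
```

For example, `set_env_var("BOX_PARENT_ID", parent_folder_id)` could stand in for the `print(..., file=open('.env', 'a'))` pattern in later cells.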

Now let's authenticate our app to Box to power the rest of our setup steps.

In [50]:
from box_sdk_gen import BoxClient, CCGConfig, BoxCCGAuth

ccg_config = CCGConfig(
    client_id=os.getenv('BOX_CLIENT_ID'),
    client_secret=os.getenv('BOX_CLIENT_SECRET'),
    user_id=os.getenv('BOX_USER_ID'),
)

ccg_auth = BoxCCGAuth(ccg_config)

client = BoxClient(ccg_auth)

print(f"{client.users.get_user_me()}")

<class 'box_sdk_gen.schemas.user_full.UserFull'> {'id': '19498290761', 'type': 'user', 'name': 'Scott Hurrey', 'login': 'shurrey+eplusadmin@boxdemo.com', 'created_at': '2022-05-26T10:57:52-07:00', 'modified_at': '2025-08-25T13:51:12-07:00', 'language': 'en', 'timezone': 'America/Los_Angeles', 'space_amount': 999999999999999, 'space_used': 3609201911, 'max_upload_size': 536870912000, 'status': 'active', 'job_title': '', 'phone': '', 'address': '', 'avatar_url': 'https://hurrey.app.box.com/api/avatar/large/19498290761'}
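
If the authentication cell fails, one possible cause is a credential that was left empty or never loaded from `.env`. The check below is a minimal sketch for ruling that out before looking further; `missing_env_vars` and `REQUIRED_VARS` are hypothetical names introduced here for illustration.

```python
import os

# The three variables written to .env earlier in this notebook.
REQUIRED_VARS = ("BOX_CLIENT_ID", "BOX_CLIENT_SECRET", "BOX_USER_ID")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing values for: {', '.join(missing)}")
```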


## Box setup

Now we will go through the steps to create all the Box objects required to run the exercises in this workshop.

### Create the root folder

First we will create the top-level folder, called "boxworks_masterclass", inside the Box root folder (folder ID `0`).

In [None]:
from box_sdk_gen import CreateFolderParent

folder = client.folders.create_folder("boxworks_masterclass", CreateFolderParent(id="0"))

parent_folder_id = folder.id
print(f"Created folder with ID: {parent_folder_id}")

print("BOX_PARENT_ID={}".format(parent_folder_id), file=open('.env', 'a'))

Created folder with ID: 337668789138


### Create the subfolders

In our newly created folder, we will now create folders for exercises 2 through 6. If you look at the `exercise_documents` folder in this project, you'll see we are mirroring its structure.

We'll create a dictionary with these folder IDs to use throughout the setup process and write them to our .env file.

In [None]:
exercise_folder_ids = {}

for i in range(2,7):
    folder_name = f"exercise{i}"
    folder = client.folders.create_folder(folder_name, CreateFolderParent(id=parent_folder_id))
    exercise_folder_ids[folder_name] = folder.id
    print("EXERCISE{}_FOLDER={}".format(i,folder.id), file=open('.env', 'a'))
    print(f"Created folder exercise{i} with ID: {folder.id}")

Created folder exercise2 with ID: 337670070298
Created folder exercise3 with ID: 337668815714
Created folder exercise4 with ID: 337666965541
Created folder exercise5 with ID: 337668018638
Created folder exercise6 with ID: 337668685481


### Upload files

Now that we have our folder structure in place, the next step is to upload the supporting files into the appropriate Box folders. 

We use Python's built-in `os` module to list the files in each local exercise directory and upload them.

Once this step is complete, all of our test content will be available in Box.
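
Before uploading anything, it can help to preview which local files would be sent to each folder. The sketch below only reads the local file system and makes no Box calls; `preview_uploads` is a hypothetical helper, and it assumes the same `exercise_documents/<folder_name>` layout described above.

```python
import os


def preview_uploads(root, folder_names):
    """Map each folder name to the sorted file names found directly inside it."""
    preview = {}
    for folder_name in folder_names:
        directory_path = os.path.join(root, folder_name)
        if not os.path.isdir(directory_path):
            # A missing local directory is reported as empty rather than raising.
            preview[folder_name] = []
            continue
        # Keep regular files only; subdirectories are skipped.
        preview[folder_name] = sorted(
            entry
            for entry in os.listdir(directory_path)
            if os.path.isfile(os.path.join(directory_path, entry))
        )
    return preview
```

Calling `preview_uploads("exercise_documents", exercise_folder_ids)` would return a dictionary keyed by exercise folder name, since iterating over the dictionary yields its keys.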

In [None]:
from box_sdk_gen import UploadFileAttributes, UploadFileAttributesParentField

def upload_sample_files(folder_name, folder_id):
    """Upload every file in exercise_documents/<folder_name> to the given Box folder."""

    directory_path = f"exercise_documents/{folder_name}"

    # Get all entries (files and directories) in the specified path
    all_entries = os.listdir(directory_path)

    # Upload only the regular files, recording each file name and Box file ID
    files_only = {}
    for entry in all_entries:
        full_path = os.path.join(directory_path, entry)
        if os.path.isfile(full_path):
            try:
                with open(full_path, "rb") as file:
                    uploaded_files = client.uploads.upload_file(
                        UploadFileAttributes(
                            name=entry, parent=UploadFileAttributesParentField(id=folder_id)
                        ),
                        file,
                    )
                # The response wraps the new file in an `entries` list of SDK objects
                new_file = uploaded_files.entries[0]
                files_only[new_file.name] = new_file.id
            except Exception as e:
                # For example, Box returns a 409 if a file with this name already exists
                print(f"Error uploading file {entry}: {e}")

    # Print the mapping of file names to Box file IDs
    print(files_only)
    return files_only

file_map = {}

for folder_name, folder_id in exercise_folder_ids.items():
    print(f"{folder_name}: {folder_id}")
    files = upload_sample_files(folder_name, folder_id)

    file_map[folder_name] = files

print(file_map)

exercise2: 337670070298
Error uploading file CFR-2024-title28-vol1-sec2-20.pdf: 	
Timestamp: 2025-08-25 18:18:22.914214
Underlying error: None
Message: 409 Item with the same name already exists; Request ID: o33fcxi4quifco1o
Request: 
	Method: POST
	URL: https://upload.box.com/api/2.0/files/content
	Query params: 
{}
	Headers: 
{       'Authorization': '---[redacted]---',
        'Content-Type': 'multipart/form-data; '
                        'boundary=5528a5aba97540d49a380ae1b7c5f1d0',
        'User-Agent': 'box-python-generated-sdk-1.16.0',
        'X-Box-UA': 'agent=box-python-generated-sdk/1.16.0; env=python/3.11.11'}
	Body: 
<MultipartEncoder: OrderedDict([('attributes', '{"name": "CFR-2024-title28-vol1-sec2-20.pdf", "parent": {"id": "337670070298"}}'), ('file', ('', <_io.BufferedReader name='exercise_documents/exercise2/CFR-2024-title28-vol1-sec2-20.pdf'>, None))])>
Response: 
	Status code: 409
	Headers: 
{       'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000',
      

### Create our Hub

In exercise 3, we will use a Box Hub and the Box AI API to ask questions across a curated list of clinical trial documents. Box Hubs let you curate up to 20,000 documents, and Box handles retrieval-augmented generation (RAG) for you: it splits each file into chunks, vectorizes those chunks, and stores them in a vector store so you don't have to.

In addition, when you add files to the hub or to a folder that is assigned to the hub, Box re-indexes those files. You can still use AI on the hub while the re-index is running.

In this exercise, we are creating a Box Hub and then adding the exercise 3 folder. Creating the vectors and indexing for a new hub can take as long as an hour depending on how much content is being added. This one is fairly small, so it should only take a few minutes. If you get an error when you run exercise 3, give it a few minutes and try again.

Once the Hub is created, we will write the Hub ID to our .env file.
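
Because indexing can lag behind hub creation, the "give it a few minutes and try again" advice can be automated with a simple retry loop. This is a generic sketch, not part of the Box SDK: `retry` is a hypothetical helper, and it catches every exception because the exact error raised while a hub is still indexing is not assumed here.

```python
import time


def retry(operation, attempts=5, delay_seconds=60.0, sleep=time.sleep):
    """Call operation() until it succeeds or attempts run out, pausing between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:  # Broad on purpose: the error type is not assumed.
            if attempt == attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            print(f"Attempt {attempt} failed ({error}); retrying in {delay_seconds}s")
            sleep(delay_seconds)
```

In exercise 3, the call that queries the hub could be wrapped in a zero-argument function or `lambda` and passed as `operation`.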

In [51]:
from box_sdk_gen import HubItemOperationV2025R0, HubItemOperationV2025R0ActionField, FolderReferenceV2025R0

hubs = client.hubs.create_hub_v2025_r0(
    "Exercise 3",
    description="Hub containing documents for Exercise 3",
)

print(f"Created hub with ID: {hubs.id}")
print("HUBS_ID={}".format(hubs.id), file=open('.env', 'a'))


hub_creation_response = client.hub_items.manage_hub_items_v2025_r0(
    hubs.id,
    operations=[
        HubItemOperationV2025R0(
            action=HubItemOperationV2025R0ActionField.ADD,
            item=FolderReferenceV2025R0(id=exercise_folder_ids["exercise3"]),
        )
    ],
)
print(f"Added folder to hub: {hub_creation_response}")

Created hub with ID: 508395911
Added folder to hub: <class 'box_sdk_gen.schemas.v2025_r0.hub_items_manage_response_v2025_r0.HubItemsManageResponseV2025R0'> {'operations': [{'action': 'add', 'item': {'id': '337668815714', 'type': 'folder'}, 'status': 200}]}


### Create Metadata Template

For exercise 5, we have a folder full of invoices, and we are going to loop through them and extract information from them using the `extract_structured` endpoint and this Metadata Template to specify what data we want.

Once the Metadata Template is created, we will write the template key to our .env file.

In [None]:
from box_sdk_gen import (
    CreateMetadataTemplateFieldsTypeField,
    CreateMetadataTemplateFields
)

template = client.metadata_templates.create_metadata_template(
    "enterprise",
    "exercise5",
    template_key="exercise5",
    fields=[
        CreateMetadataTemplateFields(
            type=CreateMetadataTemplateFieldsTypeField.STRING,
            key="companyName",
            display_name="Company Name",
        ),
        CreateMetadataTemplateFields(
            type=CreateMetadataTemplateFieldsTypeField.STRING,
            key="billedCompanyName",
            display_name="Billed Company Name",
        ),
        CreateMetadataTemplateFields(
            type=CreateMetadataTemplateFieldsTypeField.STRING,
            key="invoiceNumber",
            display_name="Invoice Number",
        ),
        CreateMetadataTemplateFields(
            type=CreateMetadataTemplateFieldsTypeField.DATE,
            key="invoiceDate",
            display_name="Invoice Date",
        ),
        CreateMetadataTemplateFields(
            type=CreateMetadataTemplateFieldsTypeField.DATE,
            key="dueDate",
            display_name="Due Date",
        ),
        CreateMetadataTemplateFields(
            type=CreateMetadataTemplateFieldsTypeField.STRING,
            key="orderNumber",
            display_name="Order Number",
        ),
        CreateMetadataTemplateFields(
            type=CreateMetadataTemplateFieldsTypeField.FLOAT,
            key="total",
            display_name="Total",
        ),
    ],
)

# The SDK exposes the JSON field `templateKey` as the snake_case attribute `template_key`
print(f"Created metadata template for exercise 5: {template.template_key}")
print("METADATA_TEMPLATE_KEY={}".format(template.template_key), file=open('.env', 'a'))

Created metadata template with ID: <class 'box_sdk_gen.schemas.metadata_template.MetadataTemplate'> {'id': '9a00dd16-696d-4d4e-944f-93ac7bbafbfc', 'type': 'metadata_template', 'scope': 'enterprise_899905961', 'templateKey': 'exercise5', 'displayName': 'exercise5', 'hidden': False, 'fields': [{'type': 'string', 'key': 'companyName', 'displayName': 'Company Name', 'hidden': False, 'id': '73675dad-2695-4fa8-9a76-c2a5d6f215b0'}, {'type': 'string', 'key': 'billedCompanyName', 'displayName': 'Billed Company Name', 'hidden': False, 'id': '418a9702-99d7-4486-a18a-ea95febae854'}, {'type': 'string', 'key': 'invoiceNumber', 'displayName': 'Invoice Number', 'hidden': False, 'id': '138c8a16-1ff2-4e75-b415-8a6097bd15b3'}, {'type': 'date', 'key': 'invoiceDate', 'displayName': 'Invoice Date', 'hidden': False, 'id': 'dd0b288d-87d0-4574-a305-9ba222d3a28b'}, {'type': 'date', 'key': 'dueDate', 'displayName': 'Due Date', 'hidden': False, 'id': '3c68ea2a-1732-4bfd-9640-c2a66f557971'}, {'type': 'string', 'ke

### AI Agent creation

The final step in our setup is to create a Box AI Studio agent. We will use this in exercise 6 to show how you can create agents and access them from your Platform App. Once this is complete, we will write the Agent ID to our .env file.

In [33]:
from box_sdk_gen import AiStudioAgentAsk, AiStudioAgentBasicTextTool

agent = client.ai_studio.create_ai_agent(
    "Exercise 6 Agent",
    "enabled",
    ask=AiStudioAgentAsk(
        access_state="enabled",
        description="Agent for Exercise 6",
        custom_instructions="""
        You are a mergers and acquisitions expert.

        Your role in this work is to analyze due diligence documentation
        and identify gaps and risks and make an assessment on the acquisition.
        
        Only present your findings. Do not ask follow up questions.
        """,
        basic_text=AiStudioAgentBasicTextTool(
            is_custom_instructions_included=True,
            model="google__gemini_2_5_pro",
        )
    ),
)

print(f"Created AI Agent with ID: {agent.id}")
print("AI_AGENT_ID={}".format(agent.id), file=open('.env', 'a'))

Created AI Agent with ID: 37801642
