# Getting Started - Lab 03 - Vectara Upload API

We'll now explore the Vectara Upload API.

This notebook uses our "lab" authentication profile. If you haven't set this up yet, please work through [Setup Authentication](./00_setup_authentication.ipynb) first.


In [None]:
from vectara.factory import Factory
from vectara.managers import CreateCorpusRequest
from pathlib import Path
import logging

logging.basicConfig(format='%(asctime)s:%(name)-35s %(levelname)s:%(message)s', level=logging.INFO, datefmt='%H:%M:%S %z')
logger = logging.getLogger(__name__)

client = Factory(profile="lab").build()

## Setup Corpus
We will set up a lab corpus below before we upload our data. We'll examine corpus creation in more depth in the following notebooks.

In [None]:
request = CreateCorpusRequest(name="Getting Started - Upload API", key="03-getting-started-upload-api")
response = client.lab_helper.create_lab_corpus(request)

logger.info(f"Our corpus key is [{response.key}]")

## Load Our Content
We'll now upload a research article from the arXiv research repository.

In [None]:
upload_manager = client.upload_manager

upload_manager.upload(response.key, "./resources/arxiv/2409.05865v1.pdf")
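
A relative path such as the one above is resolved against the notebook's working directory, so a typo only surfaces once the upload is attempted. As a minimal sketch, the helper below checks the file locally first. `resolve_upload_path` is a hypothetical name introduced here for illustration; it is not part of the Vectara SDK and uses only the standard library.

```python
from pathlib import Path


def resolve_upload_path(raw_path: str) -> Path:
    """Return the absolute path for raw_path, or raise if it is not a regular file.

    Hypothetical helper for this lab, not part of the Vectara SDK.
    """
    path = Path(raw_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Nothing to upload at {path}")
    return path
```

Assuming `upload` accepts a string path as in the cell above, the checked path could then be passed as `str(resolve_upload_path("./resources/arxiv/2409.05865v1.pdf"))`.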

## Check the data is in the Corpus
We can list the documents in our corpus to confirm that the upload succeeded.

In [None]:
client.documents.list_corpus(response.key)

## Upload with Different Name
Each document in a corpus must have a unique name, which we refer to as the "document ID". We can give the document a different name by passing the `doc_id` argument, as shown below.

In [None]:
upload_manager.upload(response.key, "./resources/arxiv/2409.05865v1.pdf", doc_id="Better_Models_for_Robotics")
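
The ID used above looks like the paper's title with underscores in place of spaces. One way to produce such IDs consistently is sketched below. `make_doc_id` is a hypothetical helper, and restricting IDs to letters, digits and underscores is an assumption made for this lab rather than a documented Vectara requirement.

```python
import re


def make_doc_id(title: str) -> str:
    """Turn a human-readable title into an ID of letters, digits and underscores.

    Hypothetical helper: the allowed character set is an assumption for this lab.
    """
    # Replace every run of characters outside [A-Za-z0-9] with one underscore,
    # then drop any underscore left at either end.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    if not slug:
        raise ValueError("title contains no usable characters")
    return slug


print(make_doc_id("Better Models for Robotics"))  # Better_Models_for_Robotics
```

Because IDs must be unique within a corpus, two titles that collapse to the same slug would still need to be told apart by hand.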

## Re-Run the Document List
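The corpus now holds two documents. The call below filters the list by document ID, so it returns only the document we just uploaded under the new name.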

In [None]:
client.documents.list(response.key, metadata_filter="doc.id = 'Better_Models_for_Robotics'")
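
The filter string above is easy to mistype when written by hand. The sketch below builds the same expression from a document ID. `doc_id_filter` is a hypothetical helper, and because this notebook does not cover how quotes are escaped inside a filter expression, the helper rejects IDs containing a single quote instead of guessing.

```python
def doc_id_filter(doc_id: str) -> str:
    """Build a metadata filter expression matching a single document ID.

    Hypothetical helper: mirrors the expression used in the cell above.
    """
    if "'" in doc_id:
        # Quote escaping is not covered in this lab, so refuse rather than guess.
        raise ValueError("doc_id must not contain a single quote")
    return f"doc.id = '{doc_id}'"


print(doc_id_filter("Better_Models_for_Robotics"))
# doc.id = 'Better_Models_for_Robotics'
```

The returned string could then be passed as the `metadata_filter` argument in the call above.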