<img src="https://imagedelivery.net/Dr98IMl5gQ9tPkFM5JRcng/3e5f6fbd-9bc6-4aa1-368e-e8bb1d6ca100/Ultra" alt="Image description" width="160" />

<br/>

## 📚 Ingesting Documents with Visual Content


In this tutorial, we'll walk through how to ingest documents with visual content into a Contextual AI datastore using the API. Ingestion extracts text, tables, and visual elements from documents such as PDFs, making them searchable and queryable. We'll call the REST API with `requests`; the flow mirrors the standard document ingestion process.

### Important Links
- [Ingest document API reference](https://docs.contextual.ai/api-reference/datastores-documents/ingest-document)

In [None]:
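import json

import requests

API_HOST = "https://api.app.contextual.ai"
TOKEN = "YOUR_API_KEY"  # replace with your Contextual AI API key
DATASTORE_NAME = "DATASTORE_NAME"  # replace with your datastore's name
DATASTORE_ID = "DATASTORE_ID"  # replace with your datastore's ID

# Headers sent with every API request
headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json",
}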
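
Hard-coding the API key works for a quick test, but reading it from the environment keeps it out of the notebook. The following is a minimal sketch: the environment variable name `CONTEXTUAL_API_KEY` and the helper `build_headers` are illustrative assumptions, not names defined by the Contextual AI API.

```python
import os


def build_headers(env_var: str = "CONTEXTUAL_API_KEY") -> dict:
    """Build request headers from an API key stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

With the variable set, `headers = build_headers()` can stand in for the hard-coded dictionary.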


### 🔍 Advanced Document Extraction with Visual Language Models 

Contextual AI supports advanced document extraction using Visual Language Models (VLMs). This enables more sophisticated analysis of visual elements, tables, and layouts in documents, going beyond traditional text-only extraction.

Define `VLM_PROMPT`, the custom prompt used to caption charts, figures, and code snippets found in your documents:


In [13]:
VLM_PROMPT = """
    You are a specialized document chart and figure extractor.
    When given an image of a document page, you accurately extract a chart
    or figure into text for use in a text based search and retrieval system. A red
    bounding box is drawn around the chart or figure that you have to extract.


    Please provide your response in the following format:


    **High level description**

    - Short high level description of the chart or figure, or code snippet in a few sentences, with
    its context in the document.

    - Describe the title and subtitle of the chart or figure if present.

    - Describe the x-axis and y-axis labels including scale, range, and units if present.
    Please describe the text in the axes faithfully.

    - Include description of any legends or annotations present.

    - If colors indicate different categories or values, describe the color coding.

    - Extract ALL text content exactly as it appears in the image, maintaining the original formatting.

    - For code snippets: 
        1. First, identify the programming language in the image (Java, Python, etc.)

        2. Extract the COMPLETE code EXACTLY as it appears, with particular attention to:
            - Preserve ALL indentation (spaces and tabs) precisely as shown
            - Maintain ALL line breaks in their exact positions
            - Keep ALL comments with their original formatting
            - Preserve ALL variable/method names, syntax, and special characters exactly

        3. Format your response using triple backtick markdown with the language specified:
        ```java
        // Exact code goes here with precise indentation
        ```

    **Key data points**

    - Extract exact values or data points from the chart or figure that are important
    for search and retrieval.

    - Include all numerical data present in the chart or figure.

    - Make sure to relate the data point to the legend or axis labels.


    - DO NOT MAKE UP DATA POINTS that are not present in the chart or figure.
    
    When extracting code, prioritize ABSOLUTE ACCURACY over summarization. The goal is to reproduce the code EXACTLY as it appears so it could be copied and used directly. Never abbreviate or summarize code sections.
    **Important**:
    - Accuracy is the priority. If there is code in the image, the goal is to produce a copy-and-paste-ready version of the code appearing in the IDE, and all text shown in the figure.
    - Do **not** summarize or omit parts of the code or text. 
    - Do **not** add or remove anything.

"""

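A long triple-quoted prompt easily picks up stray indentation and runs of blank lines. As an optional tidy-up, the sketch below normalizes the whitespace before the prompt is sent. The helper name `normalize_prompt` is hypothetical, and whether whitespace affects captioning quality is not something the API documentation is assumed to state; this only makes the stored prompt easier to read.

```python
import re
import textwrap


def normalize_prompt(prompt: str) -> str:
    """Remove shared indentation and collapse runs of blank lines.

    Relative indentation (for example, nested list items) is preserved.
    """
    # Strip trailing whitespace so whitespace-only lines become empty
    lines = [line.rstrip() for line in prompt.splitlines()]
    text = textwrap.dedent("\n".join(lines))
    # Collapse two or more consecutive blank lines into a single blank line
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip() + "\n"
```

To use it, call `VLM_PROMPT = normalize_prompt(VLM_PROMPT)` before building the datastore configuration.
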
#### Configure the datastore to enable:
1. V2 extraction pipeline: required for visual content extraction
2. Static captioning with a custom prompt: for consistent image descriptions

In [None]:

datastore_config = {
    "datastore_type": "UNSTRUCTURED",
    "name": DATASTORE_NAME,
    "configuration": {
        "enable_v2_extraction_pipeline": True,  # Enable V2 pipeline for visual content
        "extraction": {
            "static_captioning_prompt": VLM_PROMPT,  # Custom prompt for image captioning
        },
    },
}

# Update the existing datastore with the new configuration
response = requests.put(
    f"{API_HOST}/v1/datastores/{DATASTORE_ID}",
    headers=headers,
    data=json.dumps(datastore_config),
)
assert response.status_code == 200, response.text
print(response.json())
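
The configuration step can also be wrapped in small reusable helpers. This is a sketch under stated assumptions: the function names `build_datastore_config` and `update_datastore` are hypothetical, the payload fields and endpoint path are copied from the cell above, and the 30-second timeout is an arbitrary choice.

```python
import requests


def build_datastore_config(name: str, captioning_prompt: str) -> dict:
    """Return the datastore payload used above, with V2 extraction enabled."""
    return {
        "datastore_type": "UNSTRUCTURED",
        "name": name,
        "configuration": {
            "enable_v2_extraction_pipeline": True,
            "extraction": {"static_captioning_prompt": captioning_prompt},
        },
    }


def update_datastore(api_host, datastore_id, headers, config, timeout=30.0):
    """PUT the configuration and raise requests.HTTPError on a 4xx or 5xx response."""
    response = requests.put(
        f"{api_host}/v1/datastores/{datastore_id}",
        headers=headers,
        json=config,  # requests serializes the dict to a JSON body
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()
```

A call such as `update_datastore(API_HOST, DATASTORE_ID, headers, build_datastore_config(DATASTORE_NAME, VLM_PROMPT))` would then replace the inline request. Unlike a bare `assert`, `raise_for_status()` still runs when Python is started with assertions disabled.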