# Set Field Description Via API 

* Author: docai-incubator@google.com

## Disclaimer

This tool is not supported by the Google engineering team or product team. It is provided and supported on a best-effort basis by the **DocAI Incubator Team**. No guarantees of performance are implied.


## Objective

This document shows how to add a prompt description to an entity field by calling the Document AI API.

## Prerequisites
* Vertex AI Notebook
* Permission to use the Vertex AI Notebook
* Document AI API
* Document AI processor
* Python

## Step-by-step procedure

### 1. Importing Required Modules

In [None]:
!wget https://raw.githubusercontent.com/GoogleCloudPlatform/document-ai-samples/main/incubator-tools/best-practices/utilities/utilities.py

In [None]:
from google.cloud import documentai_v1beta3
from google.api_core.client_options import ClientOptions

### 2. Set up the inputs

* `project_id`: ID of the GCP project
* `location`: Location of the processor, `us` or `eu`
* `processor_id`: ID of the Document AI processor

In [None]:
project_id = "<<project_id>>"  # Project ID of the project
location = "<<location>>"  # location of the processor: us or eu
processor_id = "<<processor_id>>"  # Processor id of processor
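The three inputs above are combined into the processor resource name, the dataset schema name, and the regional endpoint used in the later steps. The following is a minimal sketch of that composition. The helper name `build_processor_paths` and the check that `location` is `us` or `eu` are assumptions for illustration and are not part of the original notebook.

```python
def build_processor_paths(project_id: str, location: str, processor_id: str) -> dict:
    """Return the resource names derived from the three notebook inputs.

    Hypothetical helper: the format strings mirror the ones used in this
    notebook; the location check is an assumption based on the "us or eu"
    note in the input list.
    """
    if location not in ("us", "eu"):
        raise ValueError(f"Unexpected location {location!r}; expected 'us' or 'eu'")
    processor_name = (
        f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    )
    return {
        "processor_name": processor_name,
        "dataset_schema_name": processor_name + "/dataset/datasetSchema",
        "api_endpoint": f"{location}-documentai.googleapis.com",
    }


# Placeholder values, not real identifiers
paths = build_processor_paths("my-project", "us", "abc123")
print(paths["processor_name"])
print(paths["api_endpoint"])
```

Failing early on an unexpected `location` may be easier to diagnose than a connection error from a malformed endpoint.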

### 3. Run the required functions

In [None]:
def get_dataset_schema(processor_name: str) -> object:
    """
    Retrieves the dataset schema for the specified processor.

    Args:
        processor_name (str): The name of the processor from which the dataset schema is to be retrieved.

    Returns:
        object: The response object containing the dataset schema from the `get_dataset_schema` API request.
    """
    # Initialize request argument(s)
    request = documentai_v1beta3.GetDatasetSchemaRequest(
        name=processor_name + "/dataset/datasetSchema",
    )
    # Make the request, and report success only after the call has returned
    response = client.get_dataset_schema(request=request)
    print("Got Dataset Schema From the Processor")
    return response


def update_dataset_schema(schema: object) -> object:
    """
    Updates the dataset schema in the processor with the provided schema object.

    Args:
        schema (object): The schema object containing the dataset name and document schema
                         that will be used to update the processor.

    Returns:
        object: The response object from the `update_dataset_schema` API request,
                which confirms the update of the dataset schema.
    """
    # Initialize request argument(s)
    request = documentai_v1beta3.UpdateDatasetSchemaRequest(
        dataset_schema={"name": schema.name, "document_schema": schema.document_schema}
    )
    # Make the request, and report success only after the call has returned
    response = client.update_dataset_schema(request=request)
    print("Updated Dataset Schema into the Processor")
    return response


def modify_schema(schema: object, changes: list[dict]) -> object:
    """
    Modifies the document schema by applying the specified changes.
    The function locates the relevant entity in the schema and updates its description.

    Args:
        schema (object): The original dataset schema object, which contains schema information.
        changes (list[dict]): A list of dictionaries where each dictionary includes:
            - "change_type" (str): Type of change to apply (e.g., "set_description").
            - "entity_name" (str): Name of the entity whose description needs to be updated.
            - "description" (str): The new description to set for the entity.

    Returns:
        object: The modified dataset schema object after applying the changes.
    """
    for item in changes:
        change_type = item.get("change_type")
        entity_name = item.get("entity_name")
        description = item.get("description")
        # print(change_type, entity_name, description)
        if change_type == "set_description":
            for entity_type in schema.document_schema.entity_types:
                for prop in entity_type.properties:
                    if prop.name == entity_name:
                        print(
                            f"FOUND {prop.name} and setting description: {description}"
                        )
                        prop.description = description
                        break
    return schema
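Note that `modify_schema` silently skips any `entity_name` that does not match a property in the schema. Below is a minimal sketch of a variant that also reports the unmatched names. It uses `types.SimpleNamespace` objects as stand-ins for the real schema so that it runs without credentials. The function name `apply_descriptions` and the stand-in schema are assumptions for illustration only.

```python
from types import SimpleNamespace


def apply_descriptions(schema, changes):
    """Set property descriptions and return the entity names that were not found."""
    missing = []
    for item in changes:
        if item.get("change_type") != "set_description":
            continue
        found = False
        for entity_type in schema.document_schema.entity_types:
            for prop in entity_type.properties:
                if prop.name == item.get("entity_name"):
                    prop.description = item.get("description")
                    found = True
        if not found:
            missing.append(item.get("entity_name"))
    return missing


# Stand-in for a dataset schema with a single property (hypothetical data)
demo_schema = SimpleNamespace(
    document_schema=SimpleNamespace(
        entity_types=[
            SimpleNamespace(
                properties=[SimpleNamespace(name="entity_1", description="")]
            )
        ]
    )
)
demo_changes = [
    {"change_type": "set_description", "entity_name": "entity_1", "description": "First field."},
    {"change_type": "set_description", "entity_name": "entity_2", "description": "Second field."},
]
missing = apply_descriptions(demo_schema, demo_changes)
print(missing)  # ['entity_2']
```

Checking the returned list before calling the update API is one way to catch a misspelled entity name.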

### 4. Run the code

In [None]:
def main():
    # get_dataset_schema() and update_dataset_schema() look up `client` as a
    # module-level name, so declare it global before assigning it here.
    global client
    processor_name = (
        f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    )
    client = documentai_v1beta3.DocumentServiceClient(
        client_options=ClientOptions(
            api_endpoint=f"{location}-documentai.googleapis.com"
        )
    )

    # Get the current dataset schema from the processor
    schema = get_dataset_schema(processor_name)

    # Define schema changes
    changes = [
        {
            "change_type": "set_description",
            "entity_name": "entity_1",
            "description": "The entity description on the document for the entity1.",
        },
        {
            "change_type": "set_description",
            "entity_name": "entity_2",
            "description": "The entity description/location on the document for the entity2.",
        },
    ]

    # Apply changes to the schema
    updated_schema = modify_schema(schema, changes)

    # Update dataset schema in your system
    response_update = update_dataset_schema(updated_schema)


main()
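To confirm the update without opening the console, the schema can be fetched again and its descriptions listed. The sketch below covers only the listing step and runs on a stand-in object. `list_descriptions` is a hypothetical helper; with a real processor, you would pass it the object returned by `get_dataset_schema`.

```python
from types import SimpleNamespace


def list_descriptions(schema) -> dict:
    """Map "entity_type/property" to the property's current description."""
    return {
        f"{entity_type.name}/{prop.name}": prop.description
        for entity_type in schema.document_schema.entity_types
        for prop in entity_type.properties
    }


# Stand-in schema with made-up names, used only to show the output shape
example_schema = SimpleNamespace(
    document_schema=SimpleNamespace(
        entity_types=[
            SimpleNamespace(
                name="custom_extraction_document_type",
                properties=[
                    SimpleNamespace(name="entity_1", description="First field."),
                    SimpleNamespace(name="entity_2", description=""),
                ],
            )
        ]
    )
)
for key, text in list_descriptions(example_schema).items():
    print(f"{key}: {text!r}")
```

Comparing the dictionaries from before and after the update gives a text equivalent of the screenshots in the output step.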

### 5. Output

The screenshots below show the entity field description before and after the update.

#### Before Updating the Entity Field Description
<img src="./Images/before_setting_field_desc.png" width=800 height=400 ></img>
#### After Updating the Entity Field Description
<img src="./Images/after_setting_field_desc.png" width=800 height=400 ></img>