The code snippet below imports the necessary libraries and sets up the execution role that Amazon SageMaker uses to deploy Hugging Face models. It performs the following tasks:

1. **Imports**:
   - `json`: To handle JSON data.
   - `sagemaker`: The SageMaker Python SDK to interact with AWS SageMaker services.
   - `boto3`: AWS SDK for Python, used for interacting with AWS services like IAM (Identity and Access Management).
   - `HuggingFaceModel`: A class from SageMaker to deploy Hugging Face models.
   - `get_huggingface_llm_image_uri`: A helper function to retrieve the appropriate Docker image URI for Hugging Face models.

2. **Execution Role Retrieval**:
   - The code first attempts to get the SageMaker execution role using `sagemaker.get_execution_role()`, which allows SageMaker to access required resources.
   - If the role cannot be retrieved (e.g., when the script is run outside of SageMaker), it uses `boto3` to query IAM and fetch the `Arn` (Amazon Resource Name) for the `sagemaker_execution_role`.

This ensures that the script can dynamically determine the required IAM role, whether running within a SageMaker notebook instance or an external environment.


In [317]:
pip install requests_aws4auth

Note: you may need to restart the kernel to use updated packages.


In [318]:
pip install opensearch-py

Note: you may need to restart the kernel to use updated packages.


In [319]:
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

This code snippet attempts to retrieve the Amazon SageMaker execution role ARN using two methods:

Primary Method: It first calls `sagemaker.get_execution_role()`, which is designed to fetch the execution role when running within a SageMaker environment. This function may raise a `ValueError` when executed outside of SageMaker-managed environments, such as local machines or non-SageMaker AWS services.

Fallback Method: If a `ValueError` occurs, the code initializes a Boto3 IAM client and retrieves the ARN of a role named `sagemaker_execution_role` using `iam.get_role()`. This approach assumes that a role with this specific name exists in the AWS account.

With both methods in place, the execution role ARN can be obtained inside or outside SageMaker, provided the fallback role exists.

In [320]:
try:
    service_role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client("iam")
    service_role = iam.get_role(RoleName="sagemaker_execution_role")["Role"]["Arn"]

In [321]:
service_role

'arn:aws:iam::675379425271:role/service-role/AmazonSageMaker-ExecutionRole-20231211T123709'

This snippet generates a random 10-character identifier that is appended to resource names to keep them unique.

In [322]:
import random
import string

generated_uuid = ''.join(random.choices(string.ascii_lowercase + string.digits, k=10))
generated_uuid


'uvkkpbq2tv'
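The identifier above is a random lowercase alphanumeric suffix rather than a true UUID. A minimal sketch of the same idea as a reusable helper follows. The names `make_suffix` and `make_resource_name` are hypothetical, and the 32-character default limit is an assumption about the strictest naming rule among the resources created later; adjust it to the service you are naming.

```python
import random
import string

ALPHABET = string.ascii_lowercase + string.digits


def make_suffix(length=10, rng=None):
    """Return a random lowercase alphanumeric suffix (not a true UUID)."""
    rng = rng or random
    return "".join(rng.choices(ALPHABET, k=length))


def make_resource_name(prefix, suffix, max_length=32):
    """Join prefix and suffix, failing early if the name is too long.

    max_length is an assumed limit, not a value taken from the notebook.
    """
    name = f"{prefix}-{suffix}"
    if len(name) > max_length:
        raise ValueError(f"{name!r} exceeds {max_length} characters")
    return name


# Seeded only so that the demonstration is repeatable.
suffix = make_suffix(rng=random.Random(0))
collection_name = make_resource_name("deepseek", suffix)
print(collection_name)
```

Checking the length up front surfaces a naming problem before any AWS call is made.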

This snippet attaches necessary policies to an IAM role for enabling SageMaker to access AWS services like OpenSearch and Amazon Bedrock.

1. Create Inline Policy: It defines a custom inline policy that allows all IAM (`iam:*`), OpenSearch Service (`es:*`), and OpenSearch Serverless (`aoss:*`) actions, plus `kms:GenerateDataKey` and `kms:Decrypt`, on all resources (`"Resource": "*"`).

2. Attach Inline Policy: The inline policy is attached to the specified IAM role (`service_role_name`) using `iam_client.put_role_policy()`.

3. Attach Managed Policy: The managed policy `AmazonBedrockFullAccess` is attached to the IAM role using `iam_client.attach_role_policy()`.

This setup ensures the role has the necessary permissions to interact with SageMaker, OpenSearch, and Amazon Bedrock services.

In [323]:
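# Attach the necessary policies to the SageMaker service role
iam_client = boto3.client("iam")
# The role name is the last segment of the role ARN; replace with your IAM role name if needed
service_role_name = service_role.split('/')[-1]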
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "iam:*",
                "es:*",
                "aoss:*",
                "kms:GenerateDataKey",
                "kms:Decrypt"
            ],
            "Resource": "*"
        }
    ]
}

# Create or update the inline policy
policy_name = "opensearch_full_access"
iam_client.put_role_policy(
    RoleName=service_role_name,
    PolicyName=policy_name,
    PolicyDocument=json.dumps(policy_document)
)

bedrock_policy_arn = "arn:aws:iam::aws:policy/AmazonBedrockFullAccess"

# Attach the managed policy
iam_client.attach_role_policy(
    RoleName=service_role_name,
    PolicyArn=bedrock_policy_arn
)


{'ResponseMetadata': {'RequestId': '8f2c65ca-1715-479a-8065-25d68238ed00',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Tue, 04 Mar 2025 13:47:13 GMT',
   'x-amzn-requestid': '8f2c65ca-1715-479a-8065-25d68238ed00',
   'content-type': 'text/xml',
   'content-length': '212'},
  'RetryAttempts': 0}}
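IAM role APIs such as `put_role_policy()` take the role name, not the ARN or the path. A minimal sketch of deriving the name from a role ARN that includes a path such as `service-role/` is shown here. The helper `role_name_from_arn` and the example ARN are hypothetical.

```python
def role_name_from_arn(role_arn):
    """Return the role name, i.e. the last path segment of an IAM role ARN."""
    # An ARN has six colon-separated parts; the sixth is the resource.
    resource = role_arn.split(":", 5)[5]  # e.g. "role/service-role/MyRole"
    if not resource.startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    return resource.rsplit("/", 1)[-1]


# Hypothetical ARN used only for illustration.
example_arn = "arn:aws:iam::123456789012:role/service-role/ExampleSageMakerRole"
print(role_name_from_arn(example_arn))
```

Validating the `role/` prefix guards against passing a user or policy ARN by mistake.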

This code snippet is used to configure and create a Hugging Face model using Amazon SageMaker. The steps performed are as follows:

1. **Model Configuration**:
   - The `model_id` and `model_name` variables define the specific Hugging Face model to be used. In this case, the model being referenced is `"deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"`, and the model name is `"deepseek-v1"`.
   - The `hub` dictionary contains key-value pairs that configure the model for deployment, specifically:
     - `"HF_MODEL_ID"`: Points to the model's identifier on Hugging Face Hub.
     - `"SM_NUM_GPUS"`: Specifies the number of GPUs (in this case, 4) for the SageMaker model deployment.

2. **Model Class Creation**:
   - The `HuggingFaceModel` class is instantiated with the following parameters:
     - `image_uri`: The URI for the appropriate Hugging Face Docker image (version `3.0.1` in this case).
     - `env`: The environment configuration containing the model ID and GPU count.
     - `role`: The IAM role that gives SageMaker access to resources.

3. **GPU Configuration**:
   - The number of GPUs per replica is set depending on the model being deployed. Below is the GPU configuration for each model:
   
   | Model ID                                      | Instance Type   | # of GPUs per Replica |
   |-----------------------------------------------|-----------------|-----------------------|
   | deepseek-ai/DeepSeek-R1-Distill-Llama-70B     | ml.g6.48xlarge  | 8                     |
   | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B      | ml.g6.12xlarge  | 4                     |
   | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B      | ml.g6.12xlarge  | 4                     |
   | deepseek-ai/DeepSeek-R1-Distill-Llama-8B      | ml.g6.2xlarge   | 1                     |
   | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B       | ml.g6.2xlarge   | 1                     |
   | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B     | ml.g6.2xlarge   | 1                     |

   The GPU count for each model can be adjusted by modifying the `"SM_NUM_GPUS"` value in the `hub` dictionary as needed.
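The table can also be kept as a lookup so that the GPU count and instance type always change together. This is a minimal sketch: the values are copied from the table above, while `DEPLOYMENT_CONFIG` and `hub_env_for` are hypothetical names.

```python
import json

# Instance type and GPUs per replica, copied from the table above.
DEPLOYMENT_CONFIG = {
    "deepseek-ai/DeepSeek-R1-Distill-Llama-70B": ("ml.g6.48xlarge", 8),
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B": ("ml.g6.12xlarge", 4),
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B": ("ml.g6.12xlarge", 4),
    "deepseek-ai/DeepSeek-R1-Distill-Llama-8B": ("ml.g6.2xlarge", 1),
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B": ("ml.g6.2xlarge", 1),
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B": ("ml.g6.2xlarge", 1),
}


def hub_env_for(model_id):
    """Return the container environment and instance type for a model."""
    instance_type, num_gpus = DEPLOYMENT_CONFIG[model_id]
    env = {
        "HF_MODEL_ID": model_id,
        # Environment variables must be strings, hence json.dumps.
        "SM_NUM_GPUS": json.dumps(num_gpus),
    }
    return env, instance_type


env, instance_type = hub_env_for("deepseek-ai/DeepSeek-R1-Distill-Qwen-14B")
print(env, instance_type)
```

The returned `env` can be passed as the `env` argument of `HuggingFaceModel`, and `instance_type` to `deploy()`.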


In [324]:
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"
model_name = "deepseek-v1"

# Hub Model configuration. https://huggingface.co/models
hub = {
    "HF_MODEL_ID": model_id,
    "SM_NUM_GPUS": json.dumps(4)
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="3.0.1"),
    env=hub,
    role=service_role,
)

This code snippet is used to deploy a Hugging Face model to Amazon SageMaker for inference. The steps performed are as follows:

1. **Endpoint Configuration**:
   - The `endpoint_name` is dynamically set using the `model_name` variable. This will create an endpoint name that matches the model being deployed.

2. **Model Deployment**:
   - The `huggingface_model.deploy()` method is used to deploy the model to SageMaker Inference. The parameters include:
     - `endpoint_name`: Specifies the name of the endpoint (in this case, the model name is used as the endpoint name).
     - `initial_instance_count`: Sets the initial number of instances for the endpoint, which is `1` in this case.
     - `instance_type`: Defines the type of instance to use for deployment. In this case, it's set to `"ml.g6.12xlarge"`.
     - `container_startup_health_check_timeout`: Sets the timeout for the health check during container startup, which is set to 2400 seconds (40 minutes).

This deployment configuration ensures that the model is hosted on SageMaker and ready for inference with the specified instance type and health check settings.


In [325]:
endpoint_name = f"{model_name}"

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    endpoint_name=endpoint_name,
    initial_instance_count=1,
    instance_type="ml.g6.12xlarge",
    container_startup_health_check_timeout=2400,
)



------------!

This snippet sends a prompt to the model and retrieves the response using the `predict()` method.

In [326]:
prompt = "You are on a game show with three doors. Behind one door is a car; behind the other two are goats. You pick a door. The host, who knows what's behind each door, opens one of the remaining doors, revealing a goat. You are given the choice to stick with your original pick or switch to the other unopened door. What should you do to maximize your chances of winning, and why?"

predictor.predict({"inputs": prompt})



[{'generated_text': "You are on a game show with three doors. Behind one door is a car; behind the other two are goats. You pick a door. The host, who knows what's behind each door, opens one of the remaining doors, revealing a goat. You are given the choice to stick with your original pick or switch to the other unopened door. What should you do to maximize your chances of winning, and why? Use probability concepts to explain.\nAlright, so I'm trying to figure out this probability problem from the game show. It's called the Monty Hall problem, right? Let me recall what it's about.\n\nOkay, there are three doors: one has a car, the other two have goats. I pick a door, say door number 1. Then the host, who knows what's behind each door, opens another door, say door number 3, revealing a goat. Now, the host asks me if I want to stick with my original choice or switch to the remaining unopened door, door number 2. What should I do?\n\nHmm, I've heard conflicting things about this. Some pe
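The response above echoes the prompt at the start of `generated_text`. A minimal sketch of sending generation parameters with the request and stripping the echoed prompt afterwards is shown here. It assumes the endpoint runs the Text Generation Inference container returned by `get_huggingface_llm_image_uri`, whose request body accepts a `parameters` object. The helpers `build_payload` and `strip_prompt` and the parameter values are illustrative choices, not settings from this notebook.

```python
def build_payload(prompt, max_new_tokens=512, temperature=0.6, top_p=0.9):
    """Build a request body with explicit generation parameters."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,  # upper bound on generated tokens
            "temperature": temperature,
            "top_p": top_p,
            "do_sample": True,
        },
    }


def strip_prompt(response, prompt):
    """Remove the echoed prompt from the first generated text, if present."""
    text = response[0]["generated_text"]
    if text.startswith(prompt):
        return text[len(prompt):].lstrip()
    return text


payload = build_payload("Why is the sky blue?")
print(payload["parameters"])
# Usage against the deployed endpoint (requires AWS access):
# answer = strip_prompt(predictor.predict(build_payload(prompt)), prompt)
```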

This snippet creates an encryption policy for an OpenSearch Serverless collection. It uses the `boto3` client to initialize the OpenSearch Serverless service, defines a policy with an encryption rule for the specific collection, and then creates the policy using `create_security_policy()`. The policy is passed as a JSON string. Finally, it prints the response from the `create_security_policy` API call.

In [327]:
import boto3
import json

# Initialize the OpenSearch Serverless client
opensearch_client = boto3.client('opensearchserverless')

#Specify collection name
opensearch_collection_name = 'deepseek-'+str(generated_uuid)

# Define the encryption policy
policy = {
    "Rules": [
        {
            "Resource": [f"collection/{opensearch_collection_name}"],
            "ResourceType": "collection"
        }
    ],
     "AWSOwnedKey": True
}
# Convert the policy dict to a JSON string
encryption_policy = json.dumps(policy)
# Create the encryption policy
Create_security_policy = opensearch_client.create_security_policy(
    name="encryption-policy"+str(generated_uuid),
    type="encryption",
    policy=encryption_policy
)

# Print the response
print(Create_security_policy)


{'securityPolicyDetail': {'type': 'encryption', 'name': 'encryption-policyuvkkpbq2tv', 'policyVersion': 'MTc0MTA5NjUzNTU5OV8x', 'policy': {'Rules': [{'Resource': ['collection/deepseek-uvkkpbq2tv'], 'ResourceType': 'collection'}], 'AWSOwnedKey': True}, 'createdDate': 1741096535599, 'lastModifiedDate': 1741096535599}, 'ResponseMetadata': {'RequestId': '08479009-9de0-4c06-91b5-d55cf9487f86', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '08479009-9de0-4c06-91b5-d55cf9487f86', 'date': 'Tue, 04 Mar 2025 13:55:35 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '299', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}


This snippet defines a network access policy for an OpenSearch Serverless collection. Its rules allow access from public networks (`"AllowFromPublic": True`) to both the collection endpoint and its OpenSearch Dashboards endpoint. The policy is then created using the `create_security_policy()` method, and the response is printed.

In [328]:
#Define network access policy
network_access_policy =[
  {
    "Rules": [
      {
        "Resource": [
          f"collection/{opensearch_collection_name}"
        ],
        "ResourceType": "dashboard"
      },
      {
        "Resource": [
          f"collection/{opensearch_collection_name}"
        ],
        "ResourceType": "collection"
      }
    ],
    "AllowFromPublic": True
  }
]
print(network_access_policy)
network_access_policy_creation = opensearch_client.create_security_policy(
    name="deepseek-network"+str(generated_uuid),  # Name of the policy
    type="network",  # Network access policy
    policy=json.dumps(network_access_policy)
)

print(network_access_policy_creation)

[{'Rules': [{'Resource': ['collection/deepseek-uvkkpbq2tv'], 'ResourceType': 'dashboard'}, {'Resource': ['collection/deepseek-uvkkpbq2tv'], 'ResourceType': 'collection'}], 'AllowFromPublic': True}]
{'securityPolicyDetail': {'type': 'network', 'name': 'deepseek-networkuvkkpbq2tv', 'policyVersion': 'MTc0MTA5NjU0NDA3Ml8x', 'policy': [{'Rules': [{'Resource': ['collection/deepseek-uvkkpbq2tv'], 'ResourceType': 'dashboard'}, {'Resource': ['collection/deepseek-uvkkpbq2tv'], 'ResourceType': 'collection'}], 'AllowFromPublic': True}], 'createdDate': 1741096544072, 'lastModifiedDate': 1741096544072}, 'ResponseMetadata': {'RequestId': 'cc2e6539-d9fb-46c6-8da1-dd44245911e0', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'cc2e6539-d9fb-46c6-8da1-dd44245911e0', 'date': 'Tue, 04 Mar 2025 13:55:44 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '376', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}


This snippet creates an OpenSearch Serverless collection of type `VECTORSEARCH` with standby replicas disabled. It uses the `boto3` client to call the `create_collection()` method and then prints the response.

In [329]:
import boto3
create_collection = opensearch_client.create_collection(
    name=opensearch_collection_name,
    type='VECTORSEARCH',
    standbyReplicas='DISABLED'
)

print(create_collection)

{'createCollectionDetail': {'id': '32cyd54lqv995lcitz90', 'name': 'deepseek-uvkkpbq2tv', 'status': 'CREATING', 'type': 'VECTORSEARCH', 'arn': 'arn:aws:aoss:ap-south-1:675379425271:collection/32cyd54lqv995lcitz90', 'kmsKeyArn': 'auto', 'standbyReplicas': 'DISABLED', 'createdDate': 1741096554515, 'lastModifiedDate': 1741096554515}, 'ResponseMetadata': {'RequestId': 'a672dee3-f446-4378-a457-8fe5341aa31c', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'a672dee3-f446-4378-a457-8fe5341aa31c', 'date': 'Tue, 04 Mar 2025 13:55:54 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '313', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}


This snippet checks the status of an OpenSearch Serverless collection by filtering collections based on its name (`opensearch_collection_name`). It uses the `list_collections()` method and prints the collection status.

In [330]:
check_collection_status = opensearch_client.list_collections(
    collectionFilters={
        'name': opensearch_collection_name,
    },
)
print("collection_status : ", check_collection_status)

collection_status :  {'collectionSummaries': [{'id': '32cyd54lqv995lcitz90', 'name': 'deepseek-uvkkpbq2tv', 'status': 'CREATING', 'arn': 'arn:aws:aoss:ap-south-1:675379425271:collection/32cyd54lqv995lcitz90'}], 'ResponseMetadata': {'RequestId': '67c24fdb-16e3-4322-850a-2962231b4792', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '67c24fdb-16e3-4322-850a-2962231b4792', 'date': 'Tue, 04 Mar 2025 13:56:01 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '181', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}
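The status above is still `CREATING`, and later steps need the collection to be `ACTIVE`. A minimal polling sketch follows, assuming the `batch_get_collection()` call of the `opensearchserverless` client, which returns a `collectionDetails` list. The helper name `wait_for_collection` and the timeout values are hypothetical.

```python
import time


def wait_for_collection(client, name, timeout=600, interval=10, sleep=time.sleep):
    """Poll until the named collection is ACTIVE and return its details.

    `client` is expected to be boto3.client("opensearchserverless").
    The timeout and interval (in seconds) are assumed values.
    """
    waited = 0
    while True:
        details = client.batch_get_collection(names=[name])["collectionDetails"]
        status = details[0]["status"] if details else "NOT_FOUND"
        if status == "ACTIVE":
            return details[0]
        if status == "FAILED":
            raise RuntimeError(f"collection {name!r} failed to create")
        if waited >= timeout:
            raise TimeoutError(f"collection {name!r} still {status} after {waited}s")
        sleep(interval)
        waited += interval


# Usage (requires AWS access):
# detail = wait_for_collection(opensearch_client, opensearch_collection_name)
```

The `sleep` parameter exists only so the loop can be exercised without real delays.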


This snippet extracts and prints key details from the created OpenSearch collection. It retrieves the `collection_id` from the creation response, constructs the `opensearch_endpoint`, extracts the `collection_Arn` from the collection status, and extracts the `account_id` from the ARN.

In [331]:
collection_id =  create_collection['createCollectionDetail']['id']
print("collection_id :",collection_id)
opensearch_endpoint =f"{collection_id}.ap-south-1.aoss.amazonaws.com"
print("opensearch_endpoint :",opensearch_endpoint)

collection_Arn = check_collection_status['collectionSummaries'][0]['arn']
print("collection_Arn :",collection_Arn)

account_id = collection_Arn.split(":")[4]
print("Account ID :",account_id)

collection_id : 32cyd54lqv995lcitz90
opensearch_endpoint : 32cyd54lqv995lcitz90.ap-south-1.aoss.amazonaws.com
collection_Arn : arn:aws:aoss:ap-south-1:675379425271:collection/32cyd54lqv995lcitz90
Account ID : 675379425271
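The region, account ID, and collection ID are all present in the collection ARN, so the endpoint can be derived without hard-coding `ap-south-1`. This is a minimal sketch using the ARN printed above; `parse_collection_arn` is a hypothetical helper, and the endpoint format is the one the notebook itself constructs.

```python
def parse_collection_arn(arn):
    """Split an OpenSearch Serverless collection ARN into its parts."""
    # arn:aws:aoss:<region>:<account-id>:collection/<collection-id>
    parts = arn.split(":", 5)
    region, account_id, resource = parts[3], parts[4], parts[5]
    collection_id = resource.split("/", 1)[1]
    return {
        "region": region,
        "account_id": account_id,
        "collection_id": collection_id,
        "endpoint": f"{collection_id}.{region}.aoss.amazonaws.com",
    }


info = parse_collection_arn(
    "arn:aws:aoss:ap-south-1:675379425271:collection/32cyd54lqv995lcitz90"
)
print(info["endpoint"])
```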


This snippet defines variables for the embedding model and the knowledge base. It sets the Titan embedding dimension (`Titan_embedding_model_dimension`), specifies the ARN of the Titan embedding model (`titan_embedding_arn`), names the knowledge base (`knowledge_base_name`), and creates an S3 bucket (`bucket_name`) in `ap-south-1` to hold the source documents.

In [332]:
Titan_embedding_model_dimension= 1024
titan_embedding_arn = 'arn:aws:bedrock:ap-south-1::foundation-model/amazon.titan-embed-text-v2:0'

#specify knowledge base name
knowledge_base_name =  'deepseek-demo'+str(generated_uuid)

#change bucket name if needed 
bucket_name = "deepseek-"+str(generated_uuid)
s3_client =  boto3.client('s3')
bucket_creation = s3_client.create_bucket(Bucket=bucket_name,CreateBucketConfiguration={"LocationConstraint": "ap-south-1"})

Add your documents to the S3 bucket created above before starting the ingestion job.
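One way to add the documents is to upload a local folder with the S3 client's `upload_file()` method. This is a minimal sketch: the helper `upload_documents`, the folder name `docs`, and the optional key prefix are assumptions, not part of the original notebook.

```python
from pathlib import Path


def upload_documents(s3_client, bucket, local_dir, prefix=""):
    """Upload every file under local_dir to the bucket and return the keys."""
    uploaded = []
    for path in sorted(Path(local_dir).rglob("*")):
        if path.is_file():
            # Keep the folder structure in the object key.
            key = prefix + path.relative_to(local_dir).as_posix()
            s3_client.upload_file(str(path), bucket, key)
            uploaded.append(key)
    return uploaded


# Usage (requires AWS access and a local "docs" folder, both assumed):
# keys = upload_documents(s3_client, bucket_name, "docs")
```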

This snippet creates an IAM policy that allows invoking a Bedrock foundation model. It defines the policy document to permit the `bedrock:InvokeModel` action on the specified `titan_embedding_arn`, then creates the policy using `iam_client.create_policy()`. The policy ARN is extracted from the response and printed.

In [333]:
iam_client = boto3.client('iam')

# Define the policy documents
policy_name_1 = 'AmazonBedrockFoundationModelPolicyForKnowledgeBasedeepseek1'+str(generated_uuid)
policy_document_1 = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "BedrockInvokeModelStatement",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel"
            ],
            "Resource": [
                titan_embedding_arn
            ]
        }
    ]
}

# print(policy_document_1)
policy_document_json_1 = json.dumps(policy_document_1)

policy_response_1 = iam_client.create_policy(
    PolicyName=policy_name_1,
    PolicyDocument=policy_document_json_1,
    Description='This policy allows accessing foundation model from bedrock for '+knowledge_base_name
)
policy_arn_1 = policy_response_1['Policy']['Arn']
print(f"Policy 1 created successfully: {policy_arn_1}")

Policy 1 created successfully: arn:aws:iam::675379425271:policy/AmazonBedrockFoundationModelPolicyForKnowledgeBasedeepseek1uvkkpbq2tv


This snippet creates an IAM policy that grants access to an OpenSearch Serverless collection. It defines the policy document to allow `aoss:APIAccessAll` on the specified `collection_Arn`, then creates the policy using `iam_client.create_policy()`. The policy ARN is extracted from the response and printed.

In [334]:

policy_name_2 = 'AmazonBedrockOSSPolicyForKnowledgeBasedeepseek1'+str(generated_uuid)
policy_document_2 = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "OpenSearchServerlessAPIAccessAllStatement",
            "Effect": "Allow",
            "Action": [
                "aoss:APIAccessAll"
            ],
            "Resource": [
                collection_Arn
            ]
        }
    ]
}
policy_document_json_2 = json.dumps(policy_document_2)
policy_response_2 = iam_client.create_policy(
    PolicyName=policy_name_2,
    PolicyDocument=policy_document_json_2,
    Description='This policy allows knowledge base agent '+knowledge_base_name +' to access open search serverless collection'
)
policy_arn_2 = policy_response_2['Policy']['Arn']
print(f"Policy 2 created successfully: {policy_arn_2}")


Policy 2 created successfully: arn:aws:iam::675379425271:policy/AmazonBedrockOSSPolicyForKnowledgeBasedeepseek1uvkkpbq2tv


This snippet creates an IAM policy that grants access to an S3 bucket. It defines the policy document to allow `s3:ListBucket` on the bucket and `s3:GetObject` on its objects, each with an `aws:ResourceAccount` condition that restricts access to resources owned by `account_id`. The policy is then created using `iam_client.create_policy()`, and the policy ARN is printed.

In [335]:

policy_name_3 = 'AmazonBedrockS3PolicyForKnowledgeBasedeepseek1'+str(generated_uuid)
policy_document_3 = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ListBucketStatement",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::"+bucket_name
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceAccount": [
                        account_id
                    ]
                }
            }
        },
        {
            "Sid": "S3GetObjectStatement",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                f"arn:aws:s3:::{bucket_name}/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceAccount": [
                        account_id
                    ]
                }
            }
        }
    ]
}




# Convert policy documents to JSON format
policy_document_json_3 = json.dumps(policy_document_3)
policy_response_3 = iam_client.create_policy(
    PolicyName=policy_name_3,
    PolicyDocument=policy_document_json_3,
    Description='This policy allows knowledge base agent '+knowledge_base_name +' to access s3'
)
policy_arn_3 = policy_response_3['Policy']['Arn']
print(f"Policy 3 created successfully: {policy_arn_3}")

Policy 3 created successfully: arn:aws:iam::675379425271:policy/AmazonBedrockS3PolicyForKnowledgeBasedeepseek1uvkkpbq2tv


This snippet creates an IAM role with a trust policy allowing Amazon Bedrock to assume the role. The trust policy permits `sts:AssumeRole` by the `bedrock.amazonaws.com` service principal only when the request comes from `account_id` (`aws:SourceAccount`) and from a knowledge base ARN in `ap-south-1` (`aws:SourceArn`). The role is created using `iam_client.create_role()`, and the role ARN is printed.

In [336]:
# Define the role trust relationship document (assume-role policy)
assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonBedrockKnowledgeBaseTrustPolicy",
            "Effect": "Allow",
            "Principal": {
                "Service": "bedrock.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": account_id
                },
                "ArnLike": {
                    "aws:SourceArn": "arn:aws:bedrock:ap-south-1:"+account_id+":knowledge-base/*"
                }
            }
        }
    ]
}
# Convert the assume-role policy to JSON format
assume_role_policy_json = json.dumps(assume_role_policy_document)

# Define the role name and create the role
role_name = 'knowledge_base_role'+str(generated_uuid)
role_response = iam_client.create_role(
    RoleName=role_name,
    AssumeRolePolicyDocument=assume_role_policy_json,
    Description='Role for knowledge base agent ' +knowledge_base_name
)
role_arn = role_response['Role']['Arn']
print(f"Role created successfully: {role_arn}")

Role created successfully: arn:aws:iam::675379425271:role/knowledge_base_roleuvkkpbq2tv


This snippet attaches three previously created IAM policies (`policy_arn_1`, `policy_arn_2`, and `policy_arn_3`) to the specified IAM role (`role_name`) using `iam_client.attach_role_policy()`. It then prints a success message.

In [337]:

# Attach the created policies to the role
iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn_1)
iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn_2)
iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn_3)

print(f"All policies attached successfully to the role {role_name}")



All policies attached successfully to the role knowledge_base_roleuvkkpbq2tv


This snippet extracts the ARN of a newly created IAM role (`role_arn`) from the `role_response` and defines a variable (`Opensearch_indexname`) with the name of an OpenSearch index.

In [338]:
role_arn = role_response['Role']['Arn']

#specify index name
Opensearch_indexname = "deepseek-index"+str(generated_uuid)

This snippet creates an OpenSearch Serverless data access policy that defines permissions on the collection and its indexes. It grants create, update, delete, and describe actions on the collection, and index management plus document read and write actions on its indexes, to two principals: the knowledge base role (`role_arn`) and the SageMaker execution role (`service_role`). The policy is applied using the `create_access_policy()` method of the OpenSearch Serverless client, followed by a 20-second pause.

In [None]:
import time
opensearch_policy_name = f"kb-policy-{generated_uuid}"
# Define the data access policy
opensearch_policy_document = [
  {
    "Rules": [
      {
        "Resource": [
          f"collection/{opensearch_collection_name}"
        ],
        "Permission": [
          "aoss:CreateCollectionItems",
          "aoss:DeleteCollectionItems",
          "aoss:UpdateCollectionItems",
          "aoss:DescribeCollectionItems"
        ],
        "ResourceType": "collection"
      },
      {
        "Resource": [
          f"index/{opensearch_collection_name}/*"
        ],
        "Permission": [
          "aoss:CreateIndex",
          "aoss:DeleteIndex",
          "aoss:UpdateIndex",
          "aoss:DescribeIndex",
          "aoss:ReadDocument",
          "aoss:WriteDocument"
        ],
        "ResourceType": "index"
      }
    ],
    "Principal": [
        role_arn,
        service_role        
    ],
    "Description": "Rule 1"
  }
]
opensearch_client.create_access_policy(
    name=opensearch_policy_name,
    type='data',
    description='Access policy for knowledge base',
    policy=json.dumps(opensearch_policy_document)
)
time.sleep(20)  # Pause so the new access policy can take effect before the index is created

This snippet connects to the OpenSearch Serverless endpoint using the current AWS credentials, signed with `AWS4Auth` for the `aoss` service. It defines a request body that creates an index with k-NN (k-nearest neighbors) search enabled, containing a `knn_vector` field with the Titan embedding dimension that uses the HNSW method with the FAISS engine and L2 distance. It then creates the index and prints the response.

In [340]:
from requests_aws4auth import AWS4Auth
from opensearchpy import OpenSearch, RequestsHttpConnection
import boto3

session = boto3.Session()
credentials = session.get_credentials()
awsauth = AWS4Auth(
        credentials.access_key,
        credentials.secret_key,
        'ap-south-1',
        'aoss',
        session_token=credentials.token
    )
index_client = OpenSearch(              
        hosts=[{'host': opensearch_endpoint, 'port': 443}],
        http_auth=awsauth,
        use_ssl=True,
        verify_certs=True,
        http_compress=True,  # enables gzip compression for request bodies
        connection_class=RequestsHttpConnection
    )
# Define the request body
request_body = {
    "settings": {
        "index": {
            "knn": True,
            "knn.algo_param.ef_search": 512
        }
    },
    "mappings": {
        "properties": {
            Opensearch_indexname: {
                "type": "knn_vector",
                "dimension": Titan_embedding_model_dimension,
                "method": {
                    "name": "hnsw",
                    "engine": "faiss",
                    "parameters": {},
                    "space_type": "l2"
                }
            }
        }
    }
}
# Create the index; note that the vector field reuses the index name
opensearch_index_response = index_client.indices.create(
    index=Opensearch_indexname,
    body=request_body
)
print(opensearch_index_response)

{'acknowledged': True, 'shards_acknowledged': True, 'index': 'deepseek-indexuvkkpbq2tv'}
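The request body maps only the vector field, so the text and metadata fields that the knowledge base writes are presumably left to dynamic mapping. A minimal sketch of a builder that declares all three fields explicitly is shown here. The function `build_knn_index_body` is hypothetical, the default field names are the ones used in this notebook's knowledge base `fieldMapping`, and mapping the metadata field as non-indexed text is an assumption rather than a requirement stated in the notebook.

```python
def build_knn_index_body(
    vector_field,
    dimension,
    text_field="AMAZON_BEDROCK_TEXT_CHUNK",
    metadata_field="AMAZON_BEDROCK_METADATA",
):
    """Return an index definition with explicit vector, text, and metadata fields."""
    return {
        "settings": {"index": {"knn": True, "knn.algo_param.ef_search": 512}},
        "mappings": {
            "properties": {
                vector_field: {
                    "type": "knn_vector",
                    "dimension": dimension,  # must match the embedding model output size
                    "method": {
                        "name": "hnsw",
                        "engine": "faiss",
                        "space_type": "l2",
                        "parameters": {},
                    },
                },
                text_field: {"type": "text"},
                # Assumption: metadata is stored but not searched.
                metadata_field: {"type": "text", "index": False},
            }
        },
    }


body = build_knn_index_body("example-vector-field", 1024)
print(sorted(body["mappings"]["properties"]))
```

The result can be passed as `body` to `index_client.indices.create()` in place of the hand-written dictionary.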


This snippet uses the `boto3` library to create a knowledge base in Amazon Bedrock. It configures the knowledge base as a vector-based system using the Titan embedding model ARN and associates it with the index in the OpenSearch Serverless collection. It also specifies the knowledge base role ARN and the field mapping used for storage. The knowledge base creation response is printed.

In [341]:
bedrock_client = boto3.client('bedrock-agent')

knowledge_base_creation = bedrock_client.create_knowledge_base(
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': titan_embedding_arn
        }
    },
    name=knowledge_base_name,
    roleArn=role_arn,
    storageConfiguration={
        'opensearchServerlessConfiguration': {
            'collectionArn': collection_Arn,
            'fieldMapping': {
                'metadataField': 'AMAZON_BEDROCK_METADATA',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'vectorField': Opensearch_indexname
            },
            'vectorIndexName': Opensearch_indexname
        },

        'type': 'OPENSEARCH_SERVERLESS'
    },
)
print(knowledge_base_creation)


{'ResponseMetadata': {'RequestId': 'c202b4d9-bf68-44dc-978f-94d70982a8ad', 'HTTPStatusCode': 202, 'HTTPHeaders': {'date': 'Tue, 04 Mar 2025 13:59:36 GMT', 'content-type': 'application/json', 'content-length': '899', 'connection': 'keep-alive', 'x-amzn-requestid': 'c202b4d9-bf68-44dc-978f-94d70982a8ad', 'x-amz-apigw-id': 'G54S5Gu6hcwEOoA=', 'x-amzn-trace-id': 'Root=1-67c70745-2412c4650a8462a16a390751'}, 'RetryAttempts': 0}, 'knowledgeBase': {'createdAt': datetime.datetime(2025, 3, 4, 13, 59, 33, 461757, tzinfo=tzlocal()), 'knowledgeBaseArn': 'arn:aws:bedrock:ap-south-1:675379425271:knowledge-base/55VU1H823F', 'knowledgeBaseConfiguration': {'type': 'VECTOR', 'vectorKnowledgeBaseConfiguration': {'embeddingModelArn': 'arn:aws:bedrock:ap-south-1::foundation-model/amazon.titan-embed-text-v2:0'}}, 'knowledgeBaseId': '55VU1H823F', 'name': 'deepseek-demouvkkpbq2tv', 'roleArn': 'arn:aws:iam::675379425271:role/knowledge_base_roleuvkkpbq2tv', 'status': 'CREATING', 'storageConfiguration': {'opensea

This snippet extracts and prints the `knowledgeBaseId` from the response of the `create_knowledge_base` API call in Amazon Bedrock.

In [342]:
knowledge_base_id = knowledge_base_creation['knowledgeBase']['knowledgeBaseId']
print(knowledge_base_id)

55VU1H823F


This snippet retrieves and prints the status of the knowledge base using its `knowledgeBaseId` with the `get_knowledge_base` API call in Amazon Bedrock.

In [343]:

knowledge_base_status = bedrock_client.get_knowledge_base(
                    knowledgeBaseId=knowledge_base_id
                )
print(knowledge_base_status)

{'ResponseMetadata': {'RequestId': '0222f7dc-c0ed-430e-a690-1824c9788344', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Tue, 04 Mar 2025 13:59:53 GMT', 'content-type': 'application/json', 'content-length': '897', 'connection': 'keep-alive', 'x-amzn-requestid': '0222f7dc-c0ed-430e-a690-1824c9788344', 'x-amz-apigw-id': 'G54WDE7ShcwEP5g=', 'x-amzn-trace-id': 'Root=1-67c70759-689d9e004b25bbe2203a0fd4'}, 'RetryAttempts': 0}, 'knowledgeBase': {'createdAt': datetime.datetime(2025, 3, 4, 13, 59, 33, 461757, tzinfo=tzlocal()), 'knowledgeBaseArn': 'arn:aws:bedrock:ap-south-1:675379425271:knowledge-base/55VU1H823F', 'knowledgeBaseConfiguration': {'type': 'VECTOR', 'vectorKnowledgeBaseConfiguration': {'embeddingModelArn': 'arn:aws:bedrock:ap-south-1::foundation-model/amazon.titan-embed-text-v2:0'}}, 'knowledgeBaseId': '55VU1H823F', 'name': 'deepseek-demouvkkpbq2tv', 'roleArn': 'arn:aws:iam::675379425271:role/knowledge_base_roleuvkkpbq2tv', 'status': 'ACTIVE', 'storageConfiguration': {'opensearc

This snippet creates a data source for the knowledge base in Amazon Bedrock, backed by an S3 bucket. It configures fixed-size chunking for ingestion: each chunk holds at most 1024 tokens and overlaps adjacent chunks by 20%. The data source is attached to the knowledge base through its `knowledgeBaseId`.

In [344]:
# Create the data source in the knowledge base
max_tokens = 1024  # maximum tokens per chunk

# Data source name, made unique with the UUID generated earlier
datasource_name = 'deepseek_datasource' + str(generated_uuid)

datasource_response = bedrock_client.create_data_source(
    dataDeletionPolicy='DELETE',
    dataSourceConfiguration={
        's3Configuration': {
            'bucketArn': f'arn:aws:s3:::{bucket_name}'
        },
        'type': 'S3'
    },
    description='Datasource for knowledge base',
    knowledgeBaseId=knowledge_base_id,
    name=datasource_name,
    vectorIngestionConfiguration={
        'chunkingConfiguration': {
            'chunkingStrategy': 'FIXED_SIZE',
            'fixedSizeChunkingConfiguration': {
                'maxTokens': max_tokens,
                'overlapPercentage': 20
            }
        }
    }
)
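To make the `maxTokens` and `overlapPercentage` settings concrete, here is a rough illustration of fixed-size chunking with overlap. It is only a sketch: `fixed_size_chunks` is a hypothetical helper, whitespace-separated words stand in for tokens, and the service's actual tokenizer and boundary handling are not specified in this notebook and may differ.

```python
from typing import List, Sequence


def fixed_size_chunks(
    tokens: Sequence[str], max_tokens: int = 1024, overlap_percentage: int = 20
) -> List[List[str]]:
    """Split tokens into windows of max_tokens that share overlap_percentage of their length."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap_percentage < 100:
        raise ValueError("overlap_percentage must be in [0, 100)")
    overlap = max_tokens * overlap_percentage // 100
    step = max_tokens - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last window already reaches the end
    return chunks


# Small numbers so the windows are easy to read; the notebook uses 1024 and 20.
words = [f"w{i}" for i in range(25)]
for chunk in fixed_size_chunks(words, max_tokens=10, overlap_percentage=20):
    print(chunk[0], "...", chunk[-1], f"({len(chunk)} tokens)")
```

Under this simplified model, the notebook's values of 1024 tokens and 20% overlap would mean each window shares 204 tokens with its predecessor and advances by 820 tokens.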

This snippet extracts the `dataSourceId` from `datasource_response` and stores it in `datasource_id`.

In [345]:
datasource_id = datasource_response['dataSource']['dataSourceId']

This snippet starts an ingestion job that loads the documents from the configured S3 bucket into the knowledge base through the data source created above. It stores the `ingestionJobId` from the response so the job can be tracked.

In [346]:
client_bedrock_agent = boto3.client('bedrock-agent')
start_ingestion_response = client_bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=datasource_id,
)
ingestion_job_id = start_ingestion_response['ingestionJob']['ingestionJobId']

This snippet checks the ingestion job in Amazon Bedrock by calling `get_ingestion_job` with the data source ID, job ID, and knowledge base ID, then prints the job's status. Ingestion runs asynchronously, so the cell may need to be re-run until the status is `COMPLETE`.

In [347]:
response = client_bedrock_agent.get_ingestion_job(
    dataSourceId=datasource_id,
    ingestionJobId=ingestion_job_id,
    knowledgeBaseId=knowledge_base_id
)
status = response['ingestionJob']['status']
print(f"Status for ingestion job ID {ingestion_job_id}: {status}")

Status for ingestion job ID HBBYOR0YUK: COMPLETE
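Beyond the status string, the `get_ingestion_job` response also carries document counts and failure reasons that help confirm the documents were actually indexed. The sketch below reads those fields defensively from a response-shaped dictionary. `summarize_ingestion_job` is a hypothetical helper, and the sample response, including its job ID and counts, is made up for illustration rather than taken from the run above.

```python
from typing import Any, Dict


def summarize_ingestion_job(response: Dict[str, Any]) -> str:
    """Build a one-line summary from a get_ingestion_job-style response."""
    job = response["ingestionJob"]
    stats = job.get("statistics", {})
    scanned = stats.get("numberOfDocumentsScanned", 0)
    indexed = stats.get("numberOfNewDocumentsIndexed", 0)
    failed = stats.get("numberOfDocumentsFailed", 0)
    summary = (
        f"{job['ingestionJobId']}: {job['status']} "
        f"(scanned={scanned}, indexed={indexed}, failed={failed})"
    )
    reasons = job.get("failureReasons", [])
    if reasons:
        summary += " reasons: " + "; ".join(reasons)
    return summary


# Hand-written sample in the shape of the API response; all values are illustrative.
sample_response = {
    "ingestionJob": {
        "ingestionJobId": "EXAMPLEJOB1",
        "status": "COMPLETE",
        "statistics": {
            "numberOfDocumentsScanned": 3,
            "numberOfNewDocumentsIndexed": 3,
            "numberOfDocumentsFailed": 0,
        },
    }
}
print(summarize_ingestion_job(sample_response))
```

In the notebook, the same function could be called on the `response` variable from the cell above.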


This snippet retrieves up to 10 results from the knowledge base in Amazon Bedrock for a user's query. Setting `overrideSearchType` to `HYBRID` combines semantic (vector) search with keyword search over the text, rather than relying on vector similarity alone.

In [354]:
retrieve_client = boto3.client("bedrock-agent-runtime")
user_question = "kuching tourist places list"

# Retrieve the most relevant chunks for the question
response_chunks = retrieve_client.retrieve(
    retrievalQuery={
        'text': user_question
    },
    knowledgeBaseId=knowledge_base_id,
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 10,
            'overrideSearchType': 'HYBRID'
        }
    }
)

This snippet extracts the text content from the retrieval results of the previous Bedrock query and appends each chunk's text into a list (`chunks_list`).

In [355]:
chunks_list = []
# Each entry in retrievalResults is one retrieved chunk
for result in response_chunks['retrievalResults']:
    chunks_list.append(result['content']['text'])

In [356]:
chunks_list

['travel_destination,itinerary_name,overview,tourist_attraction_names,tourist_attraction_description,address\r Kuching & Outskirts,Foreword - Introduction,"Welcome to Kuching: City of Unity and the beautiful capital of Sarawak. Come experience the city\'s breadth of culture, adventure, nature, food and festival offerings and discover many aspects that make this city truly unique. Kuching\'s cityscape is a beautiful mixture of classic Colonial Era shophouses and monuments, as well as modern high risers, contrasted with evergreen trees and shrubs lining the city streets, keeping Kuching green and pleasant. Enjoy the sights and sounds of Kuching City. Revel in its exemplary racial and religious harmony, and unity in diversity. Relish iconic cuisines, some of which are world famous. Venture further out to discover pristine nature and a world of adventures, creating precious memories that will forever be etched in your heart. Come spend a weekend in Kuching and discover extraordinary experi
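Each retrieval result also includes a relevance score and the location of the source document, which the loop above discards. The sketch below keeps them alongside the text and drops low-scoring results. `select_chunks` and the `min_score` threshold are hypothetical, and the sample response is hand-written in the shape of the `retrieve` output, not taken from the run above.

```python
from typing import Any, Dict, List


def select_chunks(response: Dict[str, Any], min_score: float = 0.0) -> List[Dict[str, Any]]:
    """Return text, score, and source URI for results at or above min_score, best first."""
    selected = []
    for result in response.get("retrievalResults", []):
        score = result.get("score", 0.0)
        if score < min_score:
            continue
        # The S3 URI is absent for non-S3 sources, so read it defensively.
        uri = result.get("location", {}).get("s3Location", {}).get("uri")
        selected.append({"text": result["content"]["text"], "score": score, "source": uri})
    return sorted(selected, key=lambda item: item["score"], reverse=True)


# Illustrative response; texts, scores, and URIs are made up.
sample_response = {
    "retrievalResults": [
        {
            "content": {"text": "Chunk about museums"},
            "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/guide.csv"}},
            "score": 0.42,
        },
        {
            "content": {"text": "Chunk about food"},
            "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/guide.csv"}},
            "score": 0.71,
        },
        {"content": {"text": "Weak match"}, "score": 0.05},
    ]
}
for item in select_chunks(sample_response, min_score=0.1):
    print(f"{item['score']:.2f}  {item['source']}  {item['text']}")
```

A suitable threshold depends on the search type and the data, so `min_score` is best tuned by inspecting scores for a few representative queries.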

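This snippet builds a RAG prompt that embeds the retrieved chunks and the user question, then sends it to the deployed endpoint with `predictor.predict()`. Note that the `generated_text` in the output begins with the prompt itself.

In [357]: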
prompt = f'''
        You are a highly skilled and knowledgeable assistant that performs RAG from the retrieved document chunks. 
        Answer the user question with the help of retrieved chunks.
        retrieved chunks: {chunks_list}
        user_question: {user_question}
        '''

predictor.predict({"inputs": prompt})

[{'generated_text': '\n        You are a highly skilled and knowledgeable assistant that performs RAG from the retrieved document chunks. \n        Answer the user question with the help of retrieved chunks.\n        retrieved chunks: [\'travel_destination,itinerary_name,overview,tourist_attraction_names,tourist_attraction_description,address\\r Kuching & Outskirts,Foreword - Introduction,"Welcome to Kuching: City of Unity and the beautiful capital of Sarawak. Come experience the city\\\'s breadth of culture, adventure, nature, food and festival offerings and discover many aspects that make this city truly unique. Kuching\\\'s cityscape is a beautiful mixture of classic Colonial Era shophouses and monuments, as well as modern high risers, contrasted with evergreen trees and shrubs lining the city streets, keeping Kuching green and pleasant. Enjoy the sights and sounds of Kuching City. Revel in its exemplary racial and religious harmony, and unity in diversity. Relish iconic cuisines, s
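The output above repeats the whole prompt, and the chunks appear as an escaped Python list because `chunks_list` was interpolated directly into the f-string. The sketch below shows one way to address both: it joins the chunks as numbered plain text, caps the context size, and asks the endpoint not to echo the prompt. It assumes the endpoint runs the Hugging Face text-generation container implied by `get_huggingface_llm_image_uri`, which accepts a `parameters` object. `build_payload` and its character budget are hypothetical.

```python
from typing import Any, Dict, List


def build_payload(
    chunks: List[str],
    question: str,
    max_context_chars: int = 6000,
    max_new_tokens: int = 512,
) -> Dict[str, Any]:
    """Format chunks as numbered context and wrap the prompt in a request payload."""
    context_parts = []
    used = 0
    for index, chunk in enumerate(chunks, start=1):
        part = f"[{index}] {chunk.strip()}"
        if used + len(part) > max_context_chars:
            break  # stop before the context exceeds the character budget
        context_parts.append(part)
        used += len(part)
    context = "\n\n".join(context_parts)
    prompt = (
        "Answer the user question using only the retrieved chunks.\n\n"
        f"Retrieved chunks:\n{context}\n\n"
        f"User question: {question}\n"
    )
    return {
        "inputs": prompt,
        # return_full_text=False asks the container to omit the prompt from the output
        "parameters": {"max_new_tokens": max_new_tokens, "return_full_text": False},
    }


payload = build_payload(["First chunk ", "Second chunk"], "kuching tourist places list")
print(payload["inputs"])
# In the notebook this would be sent with: predictor.predict(payload)
```

The character budget is a crude stand-in for a token limit; a tokenizer-based count would be more accurate if the model's context window is tight.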

This snippet deletes the deployed model and its associated endpoint using the `delete_model()` and `delete_endpoint()` methods of the `predictor` object.

In [358]:
predictor.delete_model()
predictor.delete_endpoint()