# File Name: simple_knwl_bases_clean_up.ipynb
### Location: Chapter 7
### Purpose: 
#####             1. Cleaning up resources helps avoid unnecessary charges.
#####                a) List and delete all data sources associated with a specified knowledge base in Amazon Bedrock.
#####                b) Delete the Amazon OpenSearch Serverless collection using its ARN.
##### Dependency: All the preceding code of Chapter 7.
# <ins>-----------------------------------------------------------------------------------</ins>

# <ins>Amazon SageMaker Classic</ins>
#### If you are new to Amazon SageMaker Classic, follow this link for details: https://docs.aws.amazon.com/sagemaker/latest/dg/studio.html

# <ins>Environment setup of Kernel</ins>
##### Fill "Image" as "Data Science"
##### Fill "Kernel" as "Python 3"
##### Fill "Instance type" as "ml.t3.medium"
##### Fill "Start-up script" as "No Scripts"
##### Click "Select"

###### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-create-open.html for details.

# <ins>Mandatory installation on the kernel through pip</ins>

##### This lab was tested with the package versions specified below. Newer releases of boto3, awscli, and botocore may change the APIs used here, in which case the code may fail and you may need to update the corresponding calls.

##### You may see pip dependency errors. You can safely ignore them and continue executing the remaining cells.

In [None]:
%pip install --no-build-isolation --force-reinstall -q \
    "boto3>=1.34.84" \
    "opensearch-py>=2.7.1" \
    "retrying>=1.3.4" \
    "ragas" \
    "ipywidgets>=7.6.5" \
    "iprogress>=0.4" \
    "langchain>=0.2.16" \
    "langchain_community>=0.2.17" \
    "awscli>=1.32.84" \
    "botocore>=1.34.84" \
    "langchain-aws>=0.1.7"    

# <ins>Restart the kernel</ins>

In [None]:
# Restart the kernel so that the newly installed packages are picked up
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")
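
Because the install cell uses minimum versions rather than exact pins, it can help to record which versions actually ended up in the kernel. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the `PACKAGES` list and the `installed_versions` helper are illustrative names, not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names copied from the pip install cell; extend the list as needed.
PACKAGES = ["boto3", "botocore", "awscli", "opensearch-py", "langchain", "langchain-aws"]


def installed_versions(packages):
    """Return {package name: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package so version drift is easy to spot.
for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

If a later cell fails with an unknown parameter or missing operation, comparing this output against the minimum versions in the install cell is a reasonable first check.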

# <ins>Python package import</ins>

##### boto3 provides several clients for Amazon Bedrock, each covering a different set of actions.
##### botocore is the low-level interface to AWS services; boto3 is built on top of botocore and adds higher-level features.

In [None]:
import boto3
import time
import warnings
import os
import json

### Ignore warnings

In [None]:
warnings.filterwarnings('ignore')

### Use the %store magic command to retrieve all variable values saved by another notebook.
### Here, that notebook is simple_knwl_bases_building.ipynb

In [None]:
%store -r 
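
`%store -r` restores whatever was saved earlier, but it does not report names that were never stored. A minimal sketch of a pre-flight check follows; `REQUIRED_VARIABLES` and `find_missing_variables` are hypothetical names introduced here, and the list simply mirrors the variables this notebook reads in later cells.

```python
# Names the later cells expect %store -r to have restored.
REQUIRED_VARIABLES = [
    "aws_region_name",
    "aws_account_id",
    "opensearch_service_name",
    "vector_store_name",
    "index_name",
    "bedrock_knowledge_bases_name",
    "aoss_collectionarn",
    "aoss_collection_host",
    "knowledgeBaseId",
    "security_policy_name",
    "network_policy_name",
    "access_policy_name",
]


def find_missing_variables(namespace, required=REQUIRED_VARIABLES):
    """Return the names in `required` that are absent from `namespace`."""
    return [name for name in required if name not in namespace]
```

In the notebook, this could be called as `find_missing_variables(globals())`; a non-empty result suggests that the building notebook has not been run, or that its `%store` cells were skipped.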

## Define important environment variables

In [None]:
%%time

# Try-except block to handle potential errors
try:
    # Create a new Boto3 session to interact with AWS services
    boto3_session_name = boto3.session.Session()

    # Create a Bedrock Agent client using the current session and region
    bedrock_agent_client = boto3_session_name.client('bedrock-agent', region_name=aws_region_name)


    # Create an S3 client to interact with Amazon S3
    s3_client = boto3.client('s3')

    # Create an STS client to interact with AWS Security Token Service (STS)
    sts_client = boto3.client('sts')

    # Create an OpenSearch Serverless (AOSS) client using the current session
    aoss_client = boto3_session_name.client('opensearchserverless', region_name=aws_region_name)


    # Create an IAM client to interact with Identity and Access Management (IAM) service
    iam_client = boto3_session_name.client('iam')

    # Retrieve the ARN of the caller (the STS client was already created above)
    identity_arn = sts_client.get_caller_identity().get('Arn')
    
    # Store all variables in a dictionary
    variables_store = {
        "aws_region_name": aws_region_name,
        "bedrock_agent_client": bedrock_agent_client,
        "opensearch_service_name": opensearch_service_name,
        "s3_client": s3_client,
        "sts_client": sts_client,
        "aws_account_id": aws_account_id,
        "aoss_client": aoss_client,
        "vector_store_name": vector_store_name,
        "index_name": index_name,
        "iam_client": iam_client,
        "identity_arn": identity_arn,
        "bedrock_knowledge_bases_name": bedrock_knowledge_bases_name,
        "aoss_collectionarn": aoss_collectionarn,
        "aoss_collection_host": aoss_collection_host,
        "genaibookedbedrocksagemakerexecutionrolearn": genaibookedbedrocksagemakerexecutionrolearn,
        "knowledgeBaseId": knowledgeBaseId,
        "security_policy_name": security_policy_name,
        "network_policy_name": network_policy_name,
        "access_policy_name": access_policy_name
    }


    # Print all variables
    for var_name, value in variables_store.items():
        print(f"{var_name}: {value}")

except Exception as e:
    print(f"An unexpected error occurred: {e}")
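
The cell above takes `aws_account_id` from the stored variables and `identity_arn` from STS. Since every ARN carries the account ID in its fifth colon-separated field, the two can be cross-checked before deleting anything. The sketch below is illustrative; `account_id_from_arn` is a hypothetical helper, not part of the original notebook. (The `get_caller_identity()` response also includes an `Account` key, which is the more direct source.)

```python
def account_id_from_arn(arn):
    """Extract the account ID from an ARN of the form
    arn:partition:service:region:account-id:resource."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {arn!r}")
    return parts[4]


# Example with a made-up ARN (the account ID is a placeholder):
example_arn = "arn:aws:sts::123456789012:assumed-role/ExampleRole/session"
print(account_id_from_arn(example_arn))
```

In the notebook, comparing `account_id_from_arn(identity_arn)` with `aws_account_id` would flag a session that is signed in to a different account than the one that created the resources.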


# Delete Amazon Bedrock Knowledge Bases and corresponding data sources

##### The code below lists and then deletes every data source associated with the specified knowledge base in Amazon Bedrock.

In [None]:
%%time

def list_data_sources(client, knowledge_base_id):
    """
    Fetch the data sources associated with the given KnowledgeBase.
    
    Parameters:
    - client: Bedrock client instance
    - knowledge_base_id: The ID of the KnowledgeBase
    
    Returns:
    - A list of data sources summaries or None if an error occurs.
    """
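    try:
        data_sources_list = []
        request_args = {"knowledgeBaseId": knowledge_base_id}
        # Follow nextToken so that results beyond the first page are not missed
        while True:
            response = client.list_data_sources(**request_args)
            data_sources_list.extend(response.get('dataSourceSummaries', []))
            next_token = response.get('nextToken')
            if not next_token:
                break
            request_args["nextToken"] = next_token
        print(f"Successfully retrieved {len(data_sources_list)} data source(s).")
        return data_sources_list
    except Exception as e:
        print(f"Error listing data sources: {e}")
        return None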

def delete_data_source(client, data_source_id, knowledge_base_id):
    """
    Delete a specific data source from the KnowledgeBase.
    
    Parameters:
    - client: Bedrock client instance
    - data_source_id: The ID of the data source to delete
    - knowledge_base_id: The ID of the KnowledgeBase
    
    Returns:
    - True if deletion was successful, False otherwise.
    """
    try:
        client.delete_data_source(dataSourceId=data_source_id, knowledgeBaseId=knowledge_base_id)
        print(f"Successfully deleted data source: {data_source_id}")
        return True
    except Exception as e:
        print(f"Error deleting data source {data_source_id}: {e}")
        return False

def delete_all_data_sources(client, knowledge_base_id):
    """
    Delete all data sources associated with the specified KnowledgeBase.
    
    Parameters:
    - client: Bedrock client instance
    - knowledge_base_id: The ID of the KnowledgeBase
    """
    # Step 1: List all data sources
    data_sources_list = list_data_sources(client, knowledge_base_id)
    if data_sources_list is None:
        return  # Exit if listing data sources failed

    # Step 2: Delete each data source
    for ds in data_sources_list:
        data_source_id = ds.get("dataSourceId")
        # Fall back to the caller-supplied ID if the summary omits it
        kb_id = ds.get("knowledgeBaseId", knowledge_base_id)

        # Delete the data source
        if delete_data_source(client, data_source_id, kb_id):
            # Wait 10 seconds between deletions to avoid rate limiting
            time.sleep(10)

# Assuming `bedrock_agent_client` and `knowledgeBaseId` are initialized
delete_all_data_sources(bedrock_agent_client, knowledgeBaseId)
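
The fixed 10-second pause above only spaces out the requests; it does not confirm that a data source is actually gone. As a hedged alternative, the sketch below polls `get_data_source` until the service answers with `ResourceNotFoundException`. The helper name `wait_for_data_source_deletion` and its default timings are assumptions made for illustration.

```python
import time


def wait_for_data_source_deletion(client, knowledge_base_id, data_source_id,
                                  poll_seconds=5, max_attempts=24):
    """Poll get_data_source until the data source no longer exists.

    Returns True once the service reports ResourceNotFoundException,
    False if the data source still exists after max_attempts polls.
    """
    for _ in range(max_attempts):
        try:
            client.get_data_source(knowledgeBaseId=knowledge_base_id,
                                   dataSourceId=data_source_id)
        except Exception as e:
            # botocore's ClientError carries the service error code in e.response
            code = getattr(e, "response", {}).get("Error", {}).get("Code")
            if code == "ResourceNotFoundException":
                return True
            raise  # any other error is unexpected, so surface it
        time.sleep(poll_seconds)
    return False
```

Calling this after each successful `delete_data_source` call, in place of `time.sleep(10)`, would make the loop wait only as long as each deletion actually takes.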

# Delete Amazon OpenSearch serverless Collection using its ARN 

In [None]:
%%time

def extract_collection_id_from_arn(arn):
    """Extracts the collection ID from the OpenSearch Serverless ARN."""
    try:
        # The collection ID is the last segment after the "/" in the ARN
        collection_id = arn.split("/")[-1]
        return collection_id
    except Exception as e:
        print(f"Error extracting collection ID from ARN: {e}")
        return None

def delete_opensearch_collection(aoss_client, collection_arn):
    """Deletes an OpenSearch Serverless collection using its ARN."""
    # Step 1: Extract the collection ID from the ARN
    collection_id = extract_collection_id_from_arn(collection_arn)
    
    if collection_id is None:
        print("Invalid collection ARN. Cannot proceed with deletion.")
        return
    
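    # Step 2: Use the OpenSearch Serverless client to delete the collection by ID
    try:
        response = aoss_client.delete_collection(id=collection_id)
        # Deletion is asynchronous; the response reports the current status
        status = response.get("deleteCollectionDetail", {}).get("status")
        print(f"Deletion of collection {collection_id} requested (status: {status}).")
    except Exception as e:
        print(f"Error deleting collection: {e}")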

# Call the function to delete the collection
delete_opensearch_collection(aoss_client, aoss_collectionarn)
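
Note that `arn.split("/")[-1]` returns the whole input when the string contains no `/`, so the `None` check in the cell above rarely triggers for a malformed ARN. A stricter parser might look like the sketch below, which assumes the collection ARN has the form `arn:<partition>:aoss:<region>:<account>:collection/<id>`; `parse_collection_arn` is a hypothetical name.

```python
def parse_collection_arn(arn):
    """Return the collection ID from an OpenSearch Serverless collection ARN,
    or None if the value does not look like such an ARN."""
    parts = arn.split(":", 5) if isinstance(arn, str) else []
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "aoss":
        return None
    resource_type, _, collection_id = parts[5].partition("/")
    if resource_type != "collection" or not collection_id:
        return None
    return collection_id


# Example with a made-up ARN (account ID and collection ID are placeholders):
print(parse_collection_arn("arn:aws:aoss:us-east-1:123456789012:collection/abc123example"))
```

With a parser like this, a stale or truncated stored value is rejected before any delete request is sent.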

In [None]:
%%time

# Finally delete the knowledge base


try:
    # Attempt to delete the specified knowledge base
    response = bedrock_agent_client.delete_knowledge_base(knowledgeBaseId=knowledgeBaseId)
    
    # Log the successful response
    print(f"Knowledge Base deleted successfully: {response}")

except bedrock_agent_client.exceptions.ResourceNotFoundException as e:
    # Handle the case where the Knowledge Base does not exist
    print(f"Knowledge Base with ID {knowledgeBaseId} not found: {e}")

except bedrock_agent_client.exceptions.AccessDeniedException as e:
    # Handle insufficient permissions
    print(f"Access denied when attempting to delete Knowledge Base: {e}")

except bedrock_agent_client.exceptions.ValidationException as e:
    # Handle invalid input or parameters
    print(f"Invalid input for Knowledge Base deletion: {e}")

except Exception as e:
    # Catch any other unexpected errors
    print(f"An unexpected error occurred: {e}")
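
The variables restored earlier include `security_policy_name`, `network_policy_name`, and `access_policy_name`, but no cell in this notebook removes those policies. If they should be cleaned up as well, a sketch along the following lines could be used. It assumes that `security_policy_name` refers to an encryption policy, `network_policy_name` to a network policy, and `access_policy_name` to a data access policy; verify this against the building notebook before running it. The helper name `delete_aoss_policies` is hypothetical.

```python
def delete_aoss_policies(aoss_client, encryption_policy_name,
                         network_policy_name, access_policy_name):
    """Attempt to delete the three OpenSearch Serverless policies.

    Returns {policy name: True if the delete call succeeded, else False}.
    """
    # Assumed mapping of policy names to policy types (see the note above).
    calls = [
        (aoss_client.delete_security_policy, encryption_policy_name, "encryption"),
        (aoss_client.delete_security_policy, network_policy_name, "network"),
        (aoss_client.delete_access_policy, access_policy_name, "data"),
    ]
    results = {}
    for delete_call, name, policy_type in calls:
        try:
            delete_call(name=name, type=policy_type)
            print(f"Deleted {policy_type} policy: {name}")
            results[name] = True
        except Exception as e:
            print(f"Error deleting {policy_type} policy {name}: {e}")
            results[name] = False
    return results
```

In the notebook, the call would be `delete_aoss_policies(aoss_client, security_policy_name, network_policy_name, access_policy_name)`. Each policy is attempted independently, so one failure does not block the others.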

# End of Notebook

#### <ins>Step 1</ins> 

##### Please shut down the kernel after using this notebook to avoid any potential charges to your account.

##### Process: Open the "Kernel" menu at the top and choose "Shut Down Kernel".
##### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/studio-ui.html for details.
