## AI Integration Demo
This notebook demonstrates the core AI integration techniques I have used at SilverEdge Government Solutions since 2024. These techniques include:
- Document text extraction via Apache Tika
- Web Retrieval Augmented Generation (RAG) via Tavily
- Token count estimation via Tiktoken
- Creation and deployment of AI agents via LangChain

### 1) Text Extraction via Apache Tika

<i>tika 3.1.0 (https://pypi.org/project/tika/)</i>

The first step in creating a reliable AI agent is giving the Large Language Model (LLM) context. This is especially useful when a user asks about current events or organization-specific material, where the information needed to answer lies outside the LLM's training data. First, we'll extract text from a local PDF file.

In [None]:
import tika
from tika import parser

tika.initVM()

# Extract text and metadata from a local PDF file
parsed = parser.from_file('NLPPaper.pdf')

pdf_metadata = parsed['metadata']
pdf_content = parsed['content']

print(pdf_metadata)
print(pdf_content)

Now that the text has been extracted, we'll wrap the Tika output in a LangChain-compatible object using the <i>LangChain Documents</i> module (https://reference.langchain.com/python/langchain_core/documents/).

In [None]:
from langchain_core.documents import Document
import re

# Strip extra whitespace; Tika may return None when a file has no extractable text
cleaned_pdf_content = re.sub(r'\s+', ' ', pdf_content or '').strip()
pdf_document = Document(page_content = cleaned_pdf_content, metadata = {'source': pdf_metadata.get('resourceName')})

print(pdf_document)
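
The `source` value above depends on `resourceName` being present in the Tika metadata. Below is a minimal sketch of a fallback, assuming the key may be missing or may hold a list of values. `pick_source` is a hypothetical helper, not part of Tika or LangChain.

```python
from typing import Any, Dict


def pick_source(metadata: Dict[str, Any], fallback: str) -> str:
    """Return a single source label from metadata, or the fallback if absent."""
    value = metadata.get('resourceName')
    # Some metadata fields may arrive as a list of values; keep the first one
    if isinstance(value, list):
        value = value[0] if value else None
    return str(value) if value else fallback


print(pick_source({'resourceName': 'NLPPaper.pdf'}, 'local-file'))
print(pick_source({}, 'local-file'))
```

Using the file path passed to the parser as the fallback keeps every Document traceable to its origin even when the metadata is sparse.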

### 2) Web RAG via Tavily

<i>tavily 1.1.0 (https://pypi.org/project/tavily/)</i>

Next, we will generate more context for the LLM by running a simple web search and collecting the content snippets returned for each matching URL.

In [None]:
from tavily import TavilyClient

tavily = TavilyClient(api_key = '')

# User query to LLM
query = 'What is NLP?'

web_results = tavily.search(
    query = query,
    search_depth = 'basic',
    max_results = 10,
    include_raw_content = True
)

print(web_results)

Now that we have the web results, let's parse them into a list of LangChain Documents like before.

In [None]:
documents = []

results = web_results.get('results', [])

for result in results:
    documents.append(Document(page_content = result.get('content'), metadata = {'source': result.get('title')}))

print(documents)
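
Because the search was run with `include_raw_content = True`, each result may also carry the full page text, and some results may have no text at all. The sketch below shows one way to handle both cases. It assumes each result is a dict with `title`, `url`, `content`, and `raw_content` keys; `results_to_records` is a hypothetical helper, and `sample_response` is invented data that only mimics that assumed shape.

```python
from typing import Any, Dict, List


def results_to_records(web_results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a search response into plain records, skipping results with no text."""
    records = []
    for result in web_results.get('results', []):
        # Prefer the full page text when present, else fall back to the snippet
        text = result.get('raw_content') or result.get('content')
        if not text:
            continue
        records.append({
            'page_content': text,
            'metadata': {'source': result.get('url'), 'title': result.get('title')},
        })
    return records


# Invented sample data that mimics the assumed response shape
sample_response = {
    'results': [
        {'title': 'NLP overview', 'url': 'https://example.com/nlp',
         'content': 'Short snippet.', 'raw_content': 'Full page text.'},
        {'title': 'Empty page', 'url': 'https://example.com/empty',
         'content': None, 'raw_content': None},
        {'title': 'Snippet only', 'url': 'https://example.com/snippet',
         'content': 'Only a snippet.', 'raw_content': None},
    ]
}

records = results_to_records(sample_response)
print(len(records))
```

Each record could then be passed to `Document(**record)`. Storing the URL as the source, rather than the title, makes it easier to cite or revisit the page later.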

Finally, we'll add the LangChain Document containing the extracted text from the PDF file. This will ensure the LLM has both internal and external context to support the user query.

In [None]:
documents.append(pdf_document)

print(documents)

### 3) Token Estimation via Tiktoken

<i>tiktoken 0.12.0 (https://pypi.org/project/tiktoken/)</i>

Now that the context is preprocessed, we can run a quick token estimate. Most LLMs have a context window that caps the number of input tokens (i.e., query plus context). Tiktoken implements OpenAI's tokenizers, so its counts are only approximate for models from other providers. Even so, the estimate is useful for warning users when they are approaching a model's limit.

In [None]:
import tiktoken
from typing import List

def get_document_content(documents: List[Document]) -> str:
    # Join with a space so the last word of one document does not fuse with the first word of the next
    document_content = ' '.join(document.page_content for document in documents)

    cleaned_document_content = re.sub(r'\s+', ' ', document_content)
    return cleaned_document_content

def num_tokens_from_string(string: str) -> int:
    # Byte Pair Encoding (BPE) tokenization technique (https://www.geeksforgeeks.org/nlp/byte-pair-encoding-bpe-in-nlp/)
    encoding = tiktoken.get_encoding('o200k_base')
    num_tokens = len(encoding.encode(string))
    return num_tokens

# Build context string for token estimate
context_str = query + ' ' + get_document_content(documents)

# Get token estimate
print(f"Token Estimate: {num_tokens_from_string(context_str)}")
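
To turn the estimate into the kind of user warning described above, it can be compared against the model's input limit. This is a minimal sketch: `check_token_budget`, the 80% warning threshold, and the `MAX_INPUT_TOKENS` value are all illustrative assumptions, not values taken from any provider's documentation.

```python
def check_token_budget(num_tokens: int, max_input_tokens: int, warn_ratio: float = 0.8) -> str:
    """Classify a token estimate as 'ok', 'warning', or 'over' relative to a limit."""
    if max_input_tokens <= 0:
        raise ValueError('max_input_tokens must be positive')
    if num_tokens > max_input_tokens:
        return 'over'
    if num_tokens >= warn_ratio * max_input_tokens:
        return 'warning'
    return 'ok'


# Hypothetical limit; check the documentation for the model you deploy
MAX_INPUT_TOKENS = 128_000

for estimate in (50_000, 110_000, 130_000):
    print(estimate, check_token_budget(estimate, MAX_INPUT_TOKENS))
```

Because the estimate is approximate for non-OpenAI models, leaving headroom below the hard limit is safer than warning only at the limit itself.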


### 4) AI Agent Creation via LangChain

<i>langchain 1.2.6 (https://pypi.org/project/langchain/) </i>

LangChain is an open-source framework for building applications powered by LLMs. Users can integrate various commercial providers (e.g., Amazon, Azure, Oracle), access the latest LLMs, and connect them to external data through its comprehensive API (Python/JavaScript).

In the following section, we're going to set up several LLM provider implementations and generate a response from the query and context combination.

In [None]:
from enum import StrEnum, auto

from langchain_aws import ChatBedrock
from langchain_oci import ChatOCIGenAI
from oci.generative_ai_inference import GenerativeAiInferenceClient
from langchain_openai import AzureChatOpenAI

from langchain_core.prompts import ChatPromptTemplate

# LLM Provider Options
class LLMProvider(StrEnum):
    AMAZON_BEDROCK = auto()
    AZURE_OPENAI = auto()
    ORACLE_CLOUD_INFRASTRUCTURE = auto()

# LLM Provider Credentials

# AWS Static Credentials
aws_access_key = ''
aws_secret_access_key = ''

# Azure OpenAI Credentials
azure_api_key = ''
azure_endpoint = ''

# Oracle Cloud Infrastructure Credentials
oci_service_endpoint = ''
oci_compartment_id = ''
oci_user = ''
oci_tenancy = ''
oci_fingerprint = ''
oci_region = ''
oci_private_key = ''

def get_llm_instance(provider: LLMProvider):
    if provider == LLMProvider.AMAZON_BEDROCK:
        return ChatBedrock(
            model_id = 'us.anthropic.claude-sonnet-4-20250514-v1:0',
            aws_access_key_id = aws_access_key,
            aws_secret_access_key = aws_secret_access_key
        )
    if provider == LLMProvider.AZURE_OPENAI:
        return AzureChatOpenAI(
            azure_deployment = 'gpt-4o-mini',
            api_version = '2025-01-01-preview',
            azure_endpoint = azure_endpoint,
            api_key = azure_api_key
        )
    if provider == LLMProvider.ORACLE_CLOUD_INFRASTRUCTURE:
        oracle_config = {
            'user': oci_user,
            'tenancy': oci_tenancy,
            'fingerprint': oci_fingerprint,
            'region': oci_region,
            'key_content': oci_private_key
        }

        client = GenerativeAiInferenceClient(
            config = oracle_config,
            service_endpoint = oci_service_endpoint
        )

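        return ChatOCIGenAI(
            client = client,
            model_id = 'xai.grok-4',
            compartment_id = oci_compartment_id
        )

    # Fail loudly instead of silently returning None for an unrecognized provider
    raise ValueError(f'Unsupported LLM provider: {provider}')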

prompt = ChatPromptTemplate.from_messages([
    ('system',
     'You are a helpful assistant, skilled in finding facts and explaining the meaning behind them.\n'
     'You can draw on your own knowledge and use any documents provided.\n'
    ),
    ('user', 'Uploaded Sources:\n{context}\n\nQuery: {query}')
])

formatted_prompt = prompt.invoke({
    'query': query,
    'context': documents
})

llm = get_llm_instance(LLMProvider.ORACLE_CLOUD_INFRASTRUCTURE)

response = llm.invoke(formatted_prompt)
# Print only the generated text rather than the full message object
print(response.content)
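
When a list of Document objects is passed straight into the template, the `{context}` slot is likely filled with the list's Python string form, including field names and metadata. The sketch below shows one way to render the documents as readable, source-labeled text first. `format_context` is a hypothetical helper that relies only on the `page_content` and `metadata` attributes, so the example uses simple stand-in objects instead of real LangChain Documents.

```python
from types import SimpleNamespace
from typing import Iterable


def format_context(documents: Iterable) -> str:
    """Render documents as numbered, source-labeled blocks of text."""
    blocks = []
    for index, document in enumerate(documents, start=1):
        source = document.metadata.get('source') or 'unknown'
        blocks.append(f'[{index}] Source: {source}\n{document.page_content}')
    return '\n\n'.join(blocks)


# Stand-ins with the same two attributes as a LangChain Document
sample_documents = [
    SimpleNamespace(page_content='NLP studies language.', metadata={'source': 'NLPPaper.pdf'}),
    SimpleNamespace(page_content='Tokenizers split text.', metadata={}),
]

context_text = format_context(sample_documents)
print(context_text)
```

The resulting string could be supplied as the `'context'` value in `prompt.invoke(...)` in place of the raw list. Numbering the blocks also gives the model a simple way to refer back to individual sources.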

### Conclusion

Through this walkthrough, we explored:

- Unstructured data ingestion using Apache Tika to extract text from documents
- Web-based Retrieval Augmented Generation (RAG) to enrich prompts with up-to-date external context
- Token estimation to better understand performance and cost tradeoffs across different prompt sizes
- Agent-based orchestration with LangChain, enabling flexible composition of tools and LLM interactions

Taken together, these components reflect common patterns found in production AI systems: ingesting diverse data sources, grounding responses in external knowledge, managing operational constraints, and abstracting complexity through reusable agents.

While the examples here are intentionally lightweight, the same patterns can be extended to support enterprise-scale use cases such as knowledge assistants, document analysis pipelines, decision-support tools, and AI-augmented workflows. The goal of this demo is not just to show what is possible with modern LLM tooling, but how these pieces fit together in a maintainable and scalable way.

Future enhancements could include persistent vector stores, multi-agent collaboration, streaming responses, or integration with external APIs and databases. This notebook serves as a foundation for experimenting with and iterating on those ideas.