<a href="https://colab.research.google.com/github/aswinaus/Agents/blob/main/Routing_Agent_with_Semantic_Caching.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [9]:
!pip install llama-index -q
!pip install llama-index-tools-wolfram-alpha -q
!pip install langchain -q
!pip install langchain_experimental -q

In [10]:
import os
import nest_asyncio
nest_asyncio.apply()

In [11]:
from google.colab import userdata
# Set the OpenAI API key as an environment variable
os.environ["OPENAI_API_KEY"] =  userdata.get('OPENAI_API_KEY')

In [12]:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings
# Setup OpenAI Model and Embeddings used for indexing the documents
Settings.llm = OpenAI(model='gpt-4o-mini', temperature=0.2)
Settings.embed_model = OpenAIEmbedding(model='text-embedding-3-small')
Settings.chunk_size = 1024

In [13]:
from google.colab import drive
drive.mount('/content/drive')
data_dir = '/content/drive/MyDrive' # Input a data dir path from your mounted Google Drive

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [14]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core import SimpleDirectoryReader
from llama_index.core import StorageContext, load_index_from_storage
from llama_index.core import VectorStoreIndex, SummaryIndex

In [15]:
# To avoid re-embedding the documents on every run, persist each index to disk and reload it when present
PERSIST_INDEX_DIR = f"{data_dir}/RAG/data/"
def get_index(index_name, doc_file_path):
  index = None
  if not os.path.exists(f"{PERSIST_INDEX_DIR}{index_name}/"):
    # Load the documents
    documents = SimpleDirectoryReader(input_files=[doc_file_path]).load_data()
    index = VectorStoreIndex.from_documents(documents)
    # Store the index to disk
    index.storage_context.persist(f"{PERSIST_INDEX_DIR}{index_name}/")
  else: # Load index from disk
    storage_context = StorageContext.from_defaults(persist_dir=f"{PERSIST_INDEX_DIR}{index_name}/")
    index = load_index_from_storage(storage_context)

  return index

In [16]:
# Load OECD guidelines documents for Transfer Pricing
docs_OECD_guidelines = SimpleDirectoryReader(f"{data_dir}/RAG/data/OECD/").load_data()

In [17]:
# Initialise one storage context and share it between the Vector Index and the Summary Index
oecd_nodes = Settings.node_parser.get_nodes_from_documents(docs_OECD_guidelines)
oecd_storage_context = StorageContext.from_defaults()
oecd_storage_context.docstore.add_documents(oecd_nodes)

# Setup Vector and Summary Index from Storage Context
oecd_summary_index = SummaryIndex(oecd_nodes, storage_context=oecd_storage_context)
oecd_vector_index = VectorStoreIndex(oecd_nodes, storage_context=oecd_storage_context)

# Set up the indices. get_index loads a persisted index if one exists and builds (then persists) it otherwise
OECD_index = get_index("oecd_guidelines",f"{data_dir}/RAG/data/OECD/OECD_Transfer_Pricing_Guidelines.pdf")
form990_guidelines_index = get_index("form990_guidelines",f"{data_dir}/RAG/data/Form990/Form990_Guidelines.pdf")

In [18]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
# Create the query engines
OECD_engine = OECD_index.as_query_engine(similarity_top_k=3)
form990_guidelines_engine = form990_guidelines_index.as_query_engine(similarity_top_k=3)
# Create tools for the query engines
OECD_query_tool = QueryEngineTool(
                      query_engine=OECD_engine,
                      metadata=ToolMetadata(
                          name="OECD_QueryEngineTool_2022",
                          description="Provides information about the OECD Transfer Pricing Guidelines for organizations for the year 2022"
                      )
                    )

Form990_query_tool = QueryEngineTool(
                      query_engine=form990_guidelines_engine,
                      metadata=ToolMetadata(
                          name="centralJersey_form990_2022",
                          description="Provides information about Central Jersey Form990 filing for the year 2022"
                      )
                    )

tools = [OECD_query_tool, Form990_query_tool]

filing_engine = RouterQueryEngine(
                      selector= LLMSingleSelector.from_defaults(),
                      query_engine_tools=tools
                      )

In [None]:
OECD_query="What documentation and approach is required for the valuation of assets in Transfer Pricing?"
Form990_query="An organization receives contributions worth $15,000. Which form within Form990 should the organization complete before filing returns with the IRS?"

In [None]:
response = filing_engine.query(OECD_query)
print(response)
print("\n.------------------.\n")
response = filing_engine.query(Form990_query)
print(response)
print("\n.------------------.\n")

The valuation of assets in Transfer Pricing typically requires comprehensive documentation that supports the valuation methods used. This includes:

1. **Comparable Analysis**: Gathering data on comparable transactions or assets to establish a benchmark for valuation.
2. **Valuation Methods**: Clearly outlining the valuation methods applied, such as the income approach, market approach, or cost approach.
3. **Financial Statements**: Providing relevant financial statements and reports that reflect the asset's performance and market conditions.
4. **Market Research**: Conducting market research to understand the economic environment and industry standards that may affect asset value.
5. **Legal and Regulatory Compliance**: Ensuring that the valuation complies with local and international regulations governing Transfer Pricing.

An organized approach that documents each step taken in the valuation process is essential for transparency and compliance with Transfer Pricing regulations.

.--

In [None]:
# Define the Summary and Vector query engines for the OECD guidelines corpus

summary_query_engine = oecd_summary_index.as_query_engine(response_mode="tree_summarize")
vector_query_engine = oecd_vector_index.as_query_engine()

# Now Create the query engine tools from the above query engines
summary_tool = QueryEngineTool(
                query_engine=summary_query_engine,
                metadata=ToolMetadata(
                    name="OECD_Summary",
                    description="Summarizes the OECD guidelines for Transfer Pricing"
                    )
                )
vector_tool = QueryEngineTool(
                query_engine=vector_query_engine,
                metadata = ToolMetadata(
                    name="OECD_Guidelines_QA",
                    description="Retrieves specific passages from the OECD Transfer Pricing guidelines to answer detailed questions"
                    )
                )
oecd_tools = [summary_tool, vector_tool]
# Now define the Router Query Engine
oecd_engine = RouterQueryEngine(
                      selector= LLMSingleSelector.from_defaults(),
                      query_engine_tools=oecd_tools
                      )


In [None]:
# Summarization routing
response = oecd_engine.query("Summarize the OECD guidelines for Transfer Pricing.")
print(response)


The OECD guidelines for Transfer Pricing establish a framework for multinational enterprises (MNEs) to ensure that intercompany transactions are priced in accordance with the arm's length principle. This principle mandates that the terms and conditions of transactions between associated enterprises should reflect those that would be agreed upon by unrelated parties under comparable circumstances.

Key elements of the guidelines include:

1. **Arm's Length Principle**: Central to the guidelines, it requires that intercompany pricing mirrors that of independent entities.

2. **Functional Analysis**: A detailed examination of the functions performed, assets used, and risks assumed by each party is essential for determining appropriate pricing.

3. **Transfer Pricing Methods**: Several methods are outlined for establishing arm's length prices, including Comparable Uncontrolled Price (CUP), Resale Price Method, Cost Plus Method, Transactional Net Margin Method (TNMM), and Profit Split Metho

In [None]:
# Confirm which engine the selector chose (index 0 is the first tool, OECD_Summary)
print(response.metadata["selector_result"])

selections=[SingleSelection(index=0, reason='The question specifically asks for a summary of the OECD guidelines for Transfer Pricing, which corresponds directly to choice 1.')]


In [None]:
# Now route to a Question Answer engine
response = oecd_engine.query("What would be the best possible way to evaluate the Intangibles in Transfer Pricing?")
print(response)

The best possible way to evaluate intangibles in transfer pricing involves a structured and comprehensive approach that adheres to the arm's length principle. Key steps include:

1. **Functional Analysis**: Conduct a thorough analysis to identify the functions performed, assets used, and risks assumed by each party involved in the transaction. This understanding is crucial for assessing contributions to the intangible's value.

2. **Identification of Intangibles**: Clearly define the intangibles involved, including their nature, legal protections, and economic value. This encompasses both registered and unregistered intangibles.

3. **Valuation Techniques**: Employ appropriate valuation methods such as:
   - **Discounted Cash Flow Method**: Estimate the value based on projected future cash flows, considering associated risks.
   - **Comparable Uncontrolled Price (CUP) Method**: Use this method if reliable comparable transactions exist to benchmark pricing.
   - **Cost Approach**: Asses

In [None]:
# Check which engine the selector chose (index 0 = OECD_Summary, index 1 = OECD_Guidelines_QA)
print(response.metadata["selector_result"])

selections=[SingleSelection(index=0, reason='The question pertains to evaluating intangibles in Transfer Pricing, which is directly related to the OECD guidelines for Transfer Pricing that provide frameworks and methodologies for such evaluations.')]
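
The `index` in the selector result is a position in the tool list handed to the router, so `index=0` above points at the first entry of `oecd_tools`. A small helper can turn that index into a tool name, which makes it easier to see which engine actually handled a query. This is only a sketch: `selected_tool_names` is a hypothetical helper, and the stand-in objects below merely mimic the attributes visible in the printed output (`selections`, `index`, `metadata.name`) so the cell runs without calling the LLM.

```python
from types import SimpleNamespace

def selected_tool_names(selector_result, tools):
    """Map each selection index in a selector result to the matching tool name."""
    return [tools[s.index].metadata.name for s in selector_result.selections]

# Stand-in objects (assumptions for illustration only). In the notebook, pass
# response.metadata["selector_result"] and oecd_tools instead.
fake_tools = [
    SimpleNamespace(metadata=SimpleNamespace(name="OECD_Summary")),
    SimpleNamespace(metadata=SimpleNamespace(name="OECD_Guidelines_QA")),
]
fake_result = SimpleNamespace(
    selections=[SimpleNamespace(index=0, reason="...")]
)

print(selected_tool_names(fake_result, fake_tools))
```

With the real objects, the call would look like `selected_tool_names(response.metadata["selector_result"], oecd_tools)`.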


In [None]:
!pip install -U faiss-cpu sentence_transformers transformers

In [None]:
import faiss
import json
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModelForCausalLM
import time

class SemanticCaching:
    def __init__(self, json_file='cache.json'):
        # Initialize a flat FAISS index; 768 matches the all-mpnet-base-v2 embedding size
        self.index = faiss.IndexFlatL2(768)  # exact search, reports squared L2 distances
        if self.index.is_trained:
            print('Index trained')

        # Initialize Sentence Transformer model
        self.encoder = SentenceTransformer('all-mpnet-base-v2')


        # Uncomment the following lines to use DialoGPT for question generation
        # self.tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-large")
        # self.model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-large")

        # Maximum distance reported by the index for a cached question to count as a match
        self.euclidean_threshold = 0.3
        self.json_file = json_file
        self.load_cache()

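    def load_cache(self):
        # Load cache from JSON file, creating an empty cache if the file is not found
        try:
            with open(self.json_file, 'r') as file:
                self.cache = json.load(file)
        except FileNotFoundError:
            self.cache = {'questions': [], 'embeddings': [], 'answers': [], 'response_text': []}
        # Re-add persisted embeddings so FAISS row ids line up with the cache rows
        if self.cache['embeddings']:
            self.index.add(np.array(self.cache['embeddings'], dtype='float32'))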

    def save_cache(self):
        # Save the cache to the JSON file
        with open(self.json_file, 'w') as file:
            json.dump(self.cache, file)

    def ask(self, question: str) -> str:
        # Return a cached response for a semantically similar question, or generate and cache a new one
        start_time = time.time()
        try:
            embedding = self.encoder.encode([question])

            # Search for the nearest neighbor in the index
            D, I = self.index.search(embedding, 1)

            # I is -1 when the index is empty; otherwise accept the hit only within the threshold
            if I[0][0] != -1 and D[0][0] <= self.euclidean_threshold:
                row_id = int(I[0][0])
                print(f'Found cache in row: {row_id} with score {1 - D[0][0]}')  # distance inverted to read as a similarity
                print(f"Time taken: {time.time() - start_time} seconds")
                return self.cache['response_text'][row_id]

            # Cache miss: no neighbor within the threshold, so generate a fresh answer
            answer, response_text = self.generate_answer(question)

            self.cache['questions'].append(question)
            self.cache['embeddings'].append(embedding[0].tolist())
            self.cache['answers'].append(str(answer))  # store text so the cache stays JSON-serializable
            self.cache['response_text'].append(response_text)

            self.index.add(embedding)
            self.save_cache()
            print(f"Time taken: {time.time() - start_time} seconds")

            return response_text
        except Exception as e:
            raise RuntimeError(f"Error during 'ask' method: {e}") from e

    def generate_answer(self, question: str) -> tuple:
        # Generate an answer by querying the OECD router query engine defined earlier
        try:
            result = oecd_engine.query(question)
            # str() on the response object yields the answer text
            response_text = str(result)

            return result, response_text
        except Exception as e:
            raise RuntimeError(f"Error during 'generate_answer' method: {e}") from e
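
A note on reading the threshold, as a hedged sketch rather than a statement about this exact setup: `faiss.IndexFlatL2` reports squared L2 distances, and for unit-length vectors the squared distance equals `2 - 2 * cosine_similarity`. If the encoder's embeddings are unit-length (assumed here for `all-mpnet-base-v2`, and worth verifying on your install), a threshold of 0.3 corresponds to a cosine similarity of at least about 0.85. The helper names below are hypothetical and use only the standard library.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_l2(a, b):
    """Squared Euclidean distance, the quantity IndexFlatL2 reports."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def threshold_to_min_cosine(threshold):
    """For unit vectors: d^2 <= threshold  <=>  cosine >= 1 - threshold / 2."""
    return 1 - threshold / 2

# Two arbitrary toy vectors, normalized to unit length
a = normalize([1.0, 2.0, 3.0])
b = normalize([1.5, 1.5, 3.5])

# The identity d^2 = 2 - 2*cos holds for unit vectors
print(math.isclose(squared_l2(a, b), 2 - 2 * cosine(a, b)))

# Minimum cosine similarity implied by the notebook's threshold of 0.3
print(threshold_to_min_cosine(0.3))
```

Under the same assumption, the printed score `1 - D` equals `2 * cosine - 1`, so it is a similarity-like number rather than the cosine similarity itself.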

In [None]:
cache = SemanticCaching()

Index trained
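
The hit-or-miss logic can be exercised without downloading a model or calling the query engine. The sketch below is a hypothetical stand-in (`ToySemanticCache` is not part of the notebook): it replaces FAISS with a linear scan and uses hand-made 2-dimensional vectors in place of real embeddings, but keeps the same rule of accepting the nearest row only when its squared distance is within the threshold.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class ToySemanticCache:
    """Hypothetical stand-in for SemanticCaching: linear scan instead of FAISS."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold
        self.embeddings = []  # row i here ...
        self.responses = []   # ... must line up with row i here

    def lookup(self, embedding):
        """Return the cached response for the nearest row, or None on a miss."""
        if not self.embeddings:
            return None
        distances = [squared_l2(embedding, e) for e in self.embeddings]
        row = min(range(len(distances)), key=distances.__getitem__)
        if distances[row] <= self.threshold:
            return self.responses[row]
        return None

    def store(self, embedding, response):
        """Append an embedding and its response at the same row position."""
        self.embeddings.append(list(embedding))
        self.responses.append(response)

toy = ToySemanticCache()
toy.store([1.0, 0.0], "answer about intangibles")

print(toy.lookup([0.9, 0.1]))  # near the stored vector: served from the cache
print(toy.lookup([0.0, 1.0]))  # far from the stored vector: miss, returns None
```

Keeping embeddings and responses in parallel lists mirrors the row-id lookup in `ask`, where the row returned by the index is used to read `cache['response_text']`.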


In [None]:
from IPython.display import Markdown, display

def print_gpt_markdown(markdown_text):
    display(Markdown(markdown_text))

In [None]:
# Ask through the semantic cache: a miss is routed to the OECD router engine and cached; a similar later question is served from the cache
response = cache.ask("What would be the best possible way to evaluate the Intangibles in Transfer Pricing?")
print(response)


Found cache in row: 0 with score 1.0
Time taken: 0.21566176414489746 seconds
The best possible way to evaluate intangibles in transfer pricing involves a comprehensive approach that includes several key steps:

1. **Functional Analysis**: Conduct a detailed functional analysis to identify the functions performed, assets used, and risks assumed by each party involved in the transaction. This helps delineate the actual transaction and understand the contributions of each entity.

2. **Identification of Intangibles**: Clearly identify the specific intangibles involved, such as patents, trademarks, or proprietary technology, and understand their unique characteristics and economic value.

3. **Valuation Techniques**: Utilize appropriate valuation methods to determine the arm's length price for the intangibles. Common methods include:
   - **Comparable Uncontrolled Price (CUP)**: This method compares prices from similar transactions between independent parties.
   - **Discounted Cash Flow (