# LangChain demo

## Links
1. [samwit - colab - YT LangChain - Agents.ipynb, ReAct & SerpAPI](https://colab.research.google.com/drive/1QpvUHQzpHPvlMgBElwd5NJQ6qpryVeWE?usp=sharing)
1. [samwit - github youtube samples](https://github.com/samwit/langchain-tutorials/tree/main)
1. [Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)](https://www.youtube.com/watch?v=biS8G8x8DdA): [samwit - colab - YT LangChain Custom Tools & Agents.ipynb](https://colab.research.google.com/drive/1FYsa3x3PzziL57EHEIuIqa5rkCAxCbin?usp=sharing) uses chat-conversational-react-description

## Demo use case

### Tools
1. IncidentTool: Simple text
1. KnowledgeBaseTool: Vectorstore
1. NewRelicTool: PandasAI
1. OrgSearchTool: Team, people, rules of engagement

### Flow
1. Get the incident
1. Describe the steps the user took to create/trigger it
    1. Narrative
    1. Bullet point flow
1. Describe what happened
    1. Narrative
    1. Bullet point flow
1. Team to direct the incident to
    1. Team/person
    1. How to engage (e.g., email, Absa Snow with configs)
1. Extraction

In [141]:
import os
import json

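# Load settings from a JSON file kept outside the project.
# The context manager closes the file once it has been read.
with open('../../env/ai.json') as f:
    settingsJson = json.load(f)

# Export every setting as an environment variable.
for key in settingsJson:
    os.environ[key] = settingsJson[key]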
    
# # OR manually set them
# os.environ['REQUESTS_CA_BUNDLE'] = '../../env/ZCert.pem'
# os.environ['HUGGING_FACE_API_KEY'] = 'Get here: https://huggingface.co/settings/tokens'
# os.environ['OPENAI_API_KEY'] = 'Get here: https://platform.openai.com/account/api-keys'
# os.environ["SERPAPI_API_KEY"] = 'serpapi KEY, Get here: https://serpapi.com/manage-api-key'
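A missing key usually only shows up later as an authentication error, so it can help to check the environment right after loading the settings. The following is a minimal sketch; the helper name `missing_settings` and the choice of `OPENAI_API_KEY` as the only required key are assumptions for illustration, not part of the original notebook.

```python
import os

# Assumption: only the OpenAI key is strictly required by the cells below.
REQUIRED_KEYS = ("OPENAI_API_KEY",)

def missing_settings(required, environ=None):
    """Return the required keys that are absent or empty in the environment."""
    env = os.environ if environ is None else environ
    return [key for key in required if not env.get(key)]

missing = missing_settings(REQUIRED_KEYS)
if missing:
    print("Missing settings:", ", ".join(missing))
```

Passing a plain dict as `environ` makes the check easy to try without touching the real environment.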


In [142]:
#!pip -q install langchain huggingface_hub openai google-search-results tiktoken wikipedia

In [143]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI



In [144]:
llm = OpenAI(temperature=0)

## Create custom tool

In [145]:
from langchain.agents import Tool
from langchain.tools import BaseTool

# Simple Text Tool
Returns simple text

In [146]:
# Simple Text Tool

def incident_search(incident_number: str):
    # Note: incident_number is not used yet. The whole store is returned
    # and the agent is left to pick out the relevant incident.
    with open('data/incident-store.txt') as f:
        lines = f.readlines()

    return lines

incident_search_tool = Tool(
    name='Incident tool', 
    func= incident_search,    
    description="""
        Useful for queries about incidents.
        Use the date to order incidents.
        An incident's 'State' is 'Open' when the State field is not 'Resolved', 'Closed' or 'Cancelled'.
        """
)
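The tool returns every line of the store and leaves filtering to the LLM. A possible next step is to parse the store into records so the tool can look up a single incident and apply the 'Open' rule from the description in code. This is a rough sketch only: the record layout (an `INC<digits>` line followed by `Key: value` lines) is inferred from the tool output shown later in this notebook, and `parse_incidents`, `is_open` and the second sample incident are hypothetical.

```python
import re

# States that count as "not open", per the tool description.
CLOSED_STATES = {"resolved", "closed", "cancelled"}

def parse_incidents(lines):
    """Group 'Key: value' lines under the incident number that precedes them."""
    incidents = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if re.fullmatch(r"INC\d+", line):
            # A new incident record starts here.
            current = {"Number": line}
            incidents[line] = current
        elif current is not None and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return incidents

def is_open(incident):
    """An incident is open unless its State is Resolved, Closed or Cancelled."""
    return incident.get("State", "").lower() not in CLOSED_STATES

# Made-up sample in the assumed layout; INC124 is hypothetical.
sample = [
    "INC123\n",
    "Short Description: Rates not available\n",
    "State: Closed\n",
    "INC124\n",
    "Short Description: Hypothetical second incident\n",
    "State: In Progress\n",
]
incidents = parse_incidents(sample)
open_numbers = [number for number, inc in incidents.items() if is_open(inc)]
```

With records like these, `incident_search` could return only the incident whose number matches `incident_number`, which keeps the observation passed back to the agent short.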

# Vectorstore
Vectorstore toolkit docs: \
https://python.langchain.com/docs/integrations/toolkits/vectorstore \
\
Markdown loading and splitting (try this later on): \
https://python.langchain.com/docs/modules/data_connection/document_loaders/markdown \
https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/markdown_header_metadata



In [147]:

#!pip install unstructured
# I had issues here, so I ran the following to make the PC trust pip:
# pip install --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org pip setuptools
# pip config set global.trusted-host "pypi.org files.pythonhosted.org pypi.python.org"

#!pip install markdown
#!pip install urllib3==1.25.11

#!pip install pypdf

In [148]:
from langchain.document_loaders import PyPDFLoader

loader = PyPDFLoader("data/knowledge-base.pdf")
pages = loader.load_and_split()

# loader = PyPDFLoader("data/south-africa-sarb-currency-and-exchanges-guidelines-for-business-entities.pdf")
# pages.append(loader.load_and_split())

# loader = PyPDFLoader("data/south-africa-sarb-currency-and-exchanges-guidelines-for-business-entities.pdf")
# pages.append(loader.load_and_split())

# south-africa-currency-and-exchanges-manual-for-authorised-dealers.pdf

# for i in pages:
#     print(i)
#     print('**************************')


In [149]:
#!pip install chromadb
# For chromadb you need the C++ build tools installed. See the answer starting "I have faced similar issue on Windows OS, while doing..." at:
#   https://stackoverflow.com/questions/73969269/error-could-not-build-wheels-for-hnswlib-which-is-required-to-install-pyprojec

#!pip install pydantic-settings

In [150]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

embeddings = OpenAIEmbeddings()
# print(pages[0])
knowledgeStore = Chroma.from_documents(
    pages, embeddings, collection_name="knowledge-base"
)

Unexpected exception formatting exception. Falling back to standard exception


Traceback (most recent call last):
  File "c:\sc\ai\lc-simple-agent-3\.venv\Lib\site-packages\langchain\vectorstores\chroma.py", line 80, in __init__
    import chromadb
  File "c:\sc\ai\lc-simple-agent-3\.venv\Lib\site-packages\chromadb\__init__.py", line 4, in <module>
    import chromadb.config
  File "c:\sc\ai\lc-simple-agent-3\.venv\Lib\site-packages\chromadb\config.py", line 12, in <module>
    from pydantic import BaseSettings, validator
  File "c:\sc\ai\lc-simple-agent-3\.venv\Lib\site-packages\pydantic\__init__.py", line 210, in __getattr__
  File "c:\sc\ai\lc-simple-agent-3\.venv\Lib\site-packages\pydantic\_migration.py", line 289, in wrapper
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.3/migration/#basesettings-has-moved-to-pydantic-settings for more details.

For further information visit https://errors.pydantic.dev/2.3/u/import-error

During handling of the above exception, another ex
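
The traceback shows the installed chromadb importing `BaseSettings` from `pydantic`, which pydantic 2.x moved to the `pydantic-settings` package. Installing `pydantic-settings` on its own does not change that import inside chromadb, so the likely options are a chromadb release that supports pydantic 2 or pinning `pydantic<2`; treat both as assumptions to verify against your environment. The sketch below only reports which situation applies, and `installed_major` and `basesettings_hint` are hypothetical helper names.

```python
import re
from importlib import metadata

def installed_major(package):
    """Return the installed major version of a package, or None if it is absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    match = re.match(r"\d+", version)
    return int(match.group()) if match else None

def basesettings_hint(pydantic_major):
    """Describe where BaseSettings lives for the given pydantic major version."""
    if pydantic_major is None:
        return "pydantic not found: install it first"
    if pydantic_major >= 2:
        return "pydantic 2.x: BaseSettings moved to pydantic-settings"
    return "pydantic 1.x: BaseSettings is still importable from pydantic"

print(basesettings_hint(installed_major("pydantic")))
```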

In [151]:
import markdown
from langchain.document_loaders import UnstructuredMarkdownLoader
from langchain.document_loaders import DirectoryLoader
from langchain.text_splitter import CharacterTextSplitter, TokenTextSplitter

#loader = UnstructuredMarkdownLoader("data/knowledge-base.md")
loader = UnstructuredMarkdownLoader("data/test.md")
documents = loader.load()

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
print(len(texts))
for i in texts:
    print(i)
    print('**************************')
# print(texts[0])
# print('********************************')
# print(texts[1])
# print('********************************')
# print(texts[2])
# print('********************************')
# print(texts[3])
# print('********************************')
# print(texts[4])
# print('********************************')
# print(texts[5])
# print('********************************')
# print(texts[6])

# # https://python.langchain.com/docs/integrations/vectorstores/starrocks
# loader = DirectoryLoader(
#     "./data", glob="**/*.md", loader_cls=UnstructuredMarkdownLoader
# )
# documents = loader.load()

# # load text splitter and split docs into snippets of text
# text_splitter = TokenTextSplitter(chunk_size=400, chunk_overlap=50)
# # split_docs = text_splitter.split_documents(documents)

# # tell vectordb to update text embeddings
# #update_vectordb = True


1
page_content='Heading 1\n\nabc\n\nHeading 2\n\nefg\n\nHeading 3\n\nhij\n\nHeading 4\n\nklm' metadata={'source': 'data/test.md'}
**************************
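
The output above is a single chunk: the whole file is shorter than `chunk_size=1000`, and the loader has flattened the headings into plain text, so there is nothing left to split on. One alternative is to split the raw markdown on its headings before embedding. The following is a plain-Python sketch of that idea, not LangChain's own splitter; `split_by_headings` is a hypothetical helper, the sample text only imitates `data/test.md` (its real heading levels are not shown), and headings inside fenced code blocks are not handled.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")

def split_by_headings(markdown_text):
    """Split markdown into sections, one per heading, keeping the heading as metadata."""
    sections = []
    heading, level, body = None, 0, []

    def flush():
        content = "\n".join(body).strip()
        # Skip an empty preamble that has no heading.
        if heading is not None or content:
            sections.append({"heading": heading, "level": level, "content": content})

    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, level, body = match.group(2).strip(), len(match.group(1)), []
        else:
            body.append(line)
    flush()
    return sections

# Assumed sample; the heading levels are illustrative only.
sample_md = "# Heading 1\n\nabc\n\n## Heading 2\n\nefg\n"
sections = split_by_headings(sample_md)
```

Each section can then be embedded separately with its heading stored as metadata, which tends to give the retriever more focused chunks than one block per file.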


In [152]:

# from langchain.text_splitter import (
#     RecursiveCharacterTextSplitter,
#     Language,
# )

# loader = UnstructuredMarkdownLoader("data/knowledge-base.md")
# documents = loader.load()
# # print(type(documents[0]))
# # print(documents[0])

# # https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/code_splitter#markdown
# md_splitter = RecursiveCharacterTextSplitter.from_language(
#     language=Language.MARKDOWN, chunk_size=60, chunk_overlap=0
# )

# md_docs = md_splitter.create_documents([documents])

# md_docs = md_splitter.create_documents([markdown_text])
# md_docs

# embeddings = OpenAIEmbeddings()
# state_of_union_store = Chroma.from_documents(
#     texts, embeddings, collection_name="state-of-union"
# )

#############################################################################

# from langchain.document_loaders import TextLoader

# loader = TextLoader("../../../state_of_the_union.txt")
# documents = loader.load()
# text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
# texts = text_splitter.split_documents(documents)

# embeddings = OpenAIEmbeddings()
# state_of_union_store = Chroma.from_documents(
#     texts, embeddings, collection_name="state-of-union"
# )

# Setup agent

In [153]:
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.agents import initialize_agent

tools = [incident_search_tool]

#del memory 

# conversational agent memory
memory = ConversationBufferWindowMemory(memory_key='chat_history', k=3, return_messages=True)

# Set up the turbo LLM
turbo_llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')

# create our agent
conversational_agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=turbo_llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method='generate',
    memory=memory
)
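
The `k=3` setting on the memory means only the three most recent exchanges are replayed to the model as `chat_history`; older turns are dropped. The sketch below imitates that windowing with a `deque` to make the behaviour concrete. It is an illustration under that assumption, not LangChain's implementation, and `WindowMemory` is a hypothetical class.

```python
from collections import deque

class WindowMemory:
    """Keep only the last k (human, ai) exchanges."""

    def __init__(self, k=3):
        # A bounded deque silently discards the oldest exchange when full.
        self.turns = deque(maxlen=k)

    def save(self, human, ai):
        self.turns.append((human, ai))

    def history(self):
        """Return the retained exchanges as a flat list of (role, text) messages."""
        messages = []
        for human, ai in self.turns:
            messages.append(("human", human))
            messages.append(("ai", ai))
        return messages

memory_sketch = WindowMemory(k=3)
for i in range(1, 6):
    memory_sketch.save(f"question {i}", f"answer {i}")

history = memory_sketch.history()
```

After five exchanges only questions 3 to 5 and their answers remain, so a follow-up that depends on the first question would no longer have that context.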

In [154]:
question = """
You are a support engineer who is helping troubleshoot software issues.
Your responses need to be specific and factual; they can only contain information that you get back from the tools and observations.
If you don't know the answer, say so and don't make anything up.

Question: How do I troubleshoot a user whose application keeps crashing when they click detach?
"""
# Question: What did the user do to create the latest incident?

answer = conversational_agent(question)
print(answer)
x = 1



> Entering new AgentExecutor chain...
{
    "action": "Incident tool",
    "action_input": "application crash when clicking detach"
}
Observation: ['INC123\n', 'Short Description: Rates not available\n', 'Date: 2023-08-20:  \n', 'Description: Rates are not ticking for USD/UG3/SP. Please investigate\n', 'State: Closed\n', 'Issue: All cross pair rates for TD or tenor 20230901 are not available from upstream due to an issue in the Static Data Store Service\n', 'Customer Impact: From 2023-08-20 start of day until 11:35 all clients would not have been able to get any rates or book deals for any cross pairs (all non-USD pairs like EURZAR) for the tenor TD or broken date 20230901.\n', 'Root Cause: A bug was found in Static Data Store Service which was triggered by the US holiday on 20230904 which caused rates to fail for 20230901.\n', 'Root Cause Fix: N/A - FX Data Team is still investigating.\n', 'Runbook: Rates Error Mapping Runbook\n', 'Mitigation fi