<a href="https://colab.research.google.com/github/mrcrchln/Custom-ChatBot-OpenAI/blob/main/Quickstart_langchain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Quickstart: Building a Custom AI Chatbot on the OpenAI ChatGPT API

In [32]:
import warnings
warnings.filterwarnings('ignore')

In [33]:
!pip install -Uqqq pip --progress-bar off
!pip install -qqq langchain==0.0.139 --progress-bar off
!pip install -qqq openai==0.27.4 --progress-bar off
!pip install -Uqqq watermark==2.3.1 --progress-bar off
!pip install -Uqqq chromadb==0.3.21 --progress-bar off
!pip install -Uqqq tiktoken==0.3.3 --progress-bar off


In [2]:
%load_ext watermark

In [3]:
import os
import textwrap

import chromadb
import langchain
import openai
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import WebBaseLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.indexes import VectorstoreIndexCreator
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.vectorstores import Chroma

In [4]:
%watermark --iversions -v -m

Python implementation: CPython
Python version       : 3.10.12
IPython version      : 7.34.0

Compiler    : GCC 11.4.0
OS          : Linux
Release     : 5.15.109+
Machine     : x86_64
Processor   : x86_64
CPU cores   : 2
Architecture: 64bit

openai   : 0.27.4
langchain: 0.0.139
chromadb : 0.3.21



In [5]:
def print_response(response: str):
    print("\n".join(textwrap.wrap(response, width=100)))
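
One caveat about the helper above: `textwrap.wrap` treats its input as a single paragraph and replaces newlines with spaces, so a multi-line answer (a numbered list, for example) is merged into one run of text. A minimal sketch of a variant that wraps each line separately follows; the name `wrap_response` is hypothetical and not part of the original notebook.

```python
import textwrap


def wrap_response(response: str, width: int = 100) -> str:
    """Wrap each line of `response` on its own so existing line breaks survive.

    Hypothetical helper for illustration; the notebook itself uses print_response.
    """
    wrapped_lines = []
    for line in response.splitlines():
        # textwrap.wrap returns [] for an empty line, so keep an explicit
        # empty string to preserve blank lines between paragraphs.
        wrapped_lines.extend(textwrap.wrap(line, width=width) or [""])
    return "\n".join(wrapped_lines)
```

Calling `print(wrap_response(text))` behaves like `print_response(text)` for single-paragraph answers and keeps list items on separate lines otherwise.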

In [21]:
os.environ["OPENAI_API_KEY"] = "your OPENAI API KEY"
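
Pasting a real key into a notebook cell makes it easy to leak when the notebook is shared. As a sketch of one alternative, assuming the key is either already set in the environment or can be typed in interactively, the hypothetical helper `load_api_key` below reads it without storing it in the notebook source.

```python
import os
from getpass import getpass


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, prompting only if it is missing.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed value, so it never appears in cell output.
        key = getpass(f"Enter {env_var}: ").strip()
        os.environ[env_var] = key
    return key
```

Calling `load_api_key()` once before creating the model leaves `OPENAI_API_KEY` set in `os.environ`, which is where the cell above expects it.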

In [22]:
model = OpenAI(temperature=0)

In [34]:
print(
    model(
        "You're Barack Obama. Suggest 5 places to visit in your hometown."
    )
)



1. The Art Institute of Chicago
2. Millennium Park
3. The Field Museum
4. The Museum of Science and Industry
5. The Adler Planetarium


## Q&A Over a Document

In [15]:
loader = WebBaseLoader(
    "https://blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm"
)

In [16]:
documents = loader.load()
len(documents)

1

In [17]:
document = documents[0]
document.__dict__.keys()

dict_keys(['page_content', 'metadata'])

In [18]:
document.page_content[:100]

"\n\n\n\n\nTwitter's Recommendation Algorithm\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nEngineering\n\n\n\n\n\nB"

In [19]:
document.metadata

{'source': 'https://blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm'}

In [24]:
index = VectorstoreIndexCreator().from_loaders([loader])
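
The one-liner above hides several steps: the page is split into chunks, each chunk is embedded, the embeddings go into a vector store, and at query time the chunks most similar to the question are placed in the prompt sent to the model. The sketch below illustrates that flow in plain Python. It is a toy, not LangChain's implementation: the bag-of-words `embed` function stands in for OpenAI embeddings, a plain list stands in for the Chroma store, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=80):
    """Greedily pack whole words into chunks of at most chunk_size characters."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > chunk_size and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(context_chunks)
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}"
```

For example, `build_prompt(q, retrieve(q, split_text(document.page_content)))` would produce the kind of context-stuffed prompt that a retrieval chain passes to the model, with the real pipeline swapping in learned embeddings and a persistent vector store.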



In [38]:
query = """
You're Barack Obama.
Explain the Twitter recommendation algorithm in 5 sentences using analogies from your presidency.
"""
print_response(index.query(query))

 I like to think of the Twitter recommendation algorithm as a kind of filter that helps to distill
the vast amount of information available on Twitter into a manageable selection of the most relevant
and interesting content. It's like having a team of advisors who can quickly sift through the news
of the day and present me with the most important stories. The algorithm uses a combination of
heuristics and embedding spaces to identify the most relevant content, much like I had to use a
combination of data and intuition to make decisions during my presidency. The algorithm also uses
graph traversals to identify out-of-network content, which is like having a network of contacts who
can provide me with information from outside of my usual sources. Finally, the algorithm uses a
logistic regression model to rank the resulting Tweets, which is like having a team of experts who
can evaluate the information and prioritize it for me.


## References

- [Twitter's Recommendation Algorithm](https://blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm)