In [1]:
import os
import langchain
import json

from langchain.llms import OpenAI, Cohere, HuggingFaceHub
from langchain.chat_models import ChatOpenAI
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)

from langchain.document_loaders import TextLoader
from langchain.indexes import VectorstoreIndexCreator

from langchain import LLMChain
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.prompts import PromptTemplate
from langchain.chains import ConversationChain, SimpleSequentialChain

from langchain.agents import load_tools
from langchain.agents import initialize_agent

In [2]:
with open('secrets.json') as f:
    secrets = json.load(f)


In [3]:
os.environ['OPENAI_API_KEY'] = secrets['open_ai_key']

In [4]:
os.environ['HUGGINGFACEHUB_API_TOKEN'] = secrets['hf_hub_key']
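
> The three cells above can be folded into one helper that fails early when a key is missing. This is a minimal sketch, not part of LangChain: the helper name `load_api_keys` and the `KEY_TO_ENV` mapping are hypothetical, while the file name and key names are the ones used above.

```python
import json
import os

# Maps the key names used in secrets.json above to the environment
# variables they populate. The mapping itself is a hypothetical convenience.
KEY_TO_ENV = {
    "open_ai_key": "OPENAI_API_KEY",
    "hf_hub_key": "HUGGINGFACEHUB_API_TOKEN",
}


def load_api_keys(path="secrets.json", key_to_env=KEY_TO_ENV):
    """Read API keys from a JSON file and export them as environment variables."""
    with open(path) as f:
        secrets = json.load(f)

    # Report every missing or empty key at once instead of failing on the first.
    missing = [key for key in key_to_env if not secrets.get(key)]
    if missing:
        raise KeyError(f"{path} is missing required keys: {missing}")

    for key, env_name in key_to_env.items():
        os.environ[env_name] = secrets[key]
    return sorted(key_to_env.values())
```

> Calling `load_api_keys()` once at the top of the notebook would replace cells 2 to 4; the return value lists the environment variables that were set.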

In [5]:
! head -n 20 datasets/IntroJapan.txt


  
  
    
      
        Japan and Its People
        Its famous bullet trains zip through the country at up to
        300 km (186 miles) per hour. Its factories feature the latest
        generation of industrial robots that dont eat, dont sleep, and never
        strike. Its high-tech consumer electronics companies have placed
        affordable — and notoriously reliable — electronic products in
        households around the world.
        But peel back a layer and a different picture starts to
        emerge. In many ways, Japan is not yet a truly modern country. Its
        social roots still lie deeply in its past as a feudal society of
        countless closely knit agricultural communities dominated by a small
        political elite.
        Japan is still trying to define its place in a world in
        which it is constantly accused of being an economic whale but a
        political minnow — just one of the pieces of the puzzle that is modern


In [30]:
# loading a text file
loader = TextLoader('datasets/IntroJapan.txt')

In [31]:
type(loader.load()),type(loader)

(list, langchain.document_loaders.text.TextLoader)

> We will use a vector store index for faster access and search over embeddings. Chroma is the vector store used here; other options include AtlasDB, Redis, and FAISS.

Further information about Chroma can be found here: https://www.trychroma.com/

In [37]:
# indexing the text file
index = VectorstoreIndexCreator().from_loaders([loader])

Using embedded DuckDB without persistence: data will be transient
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).


In [42]:
index.query('how fast is the bullet train')

' The bullet train can reach speeds of up to 300 km (186 miles) per hour.'

In [43]:
index.query('how is the climate in japan')

Retrying langchain.embeddings.openai.embed_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).


' Japan has a temperate climate of unusually distinct seasons, with bitter winters and hot, humid summers. Winters are milder and sunnier on the Pacific coast. The northern tip of Hokkaido offers snowy conditions, while the subtropical region of southern Kyushu and Okinawa has popular coral reefs.'

> Let's check the answer to 'how is the climate in japan' by asking ChatGPT directly, instead of querying the indexed text file as above.

In [44]:
chatgpt = ChatOpenAI(model_name='gpt-3.5-turbo')

In [45]:
chatgpt([HumanMessage(content='how is the climate in japan')])

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).


AIMessage(content="As an AI language model, I do not have personal opinions or feelings, but I can provide you with factual information. Japan's climate is generally temperate, with four distinct seasons. The summers are warm and humid, with temperatures averaging around 30°C (86°F). In winter, temperatures can drop to below freezing, especially in the northern regions. Japan experiences monsoon-like weather in June and July, with heavy rainfall and occasional typhoons. Overall, Japan's climate varies depending on the region as it stretches across a long north-south axis.", additional_kwargs={})

In [52]:
flan = HuggingFaceHub(repo_id="google/flan-t5-base")

In [53]:
flan('how is the climate in japan')

'very cold'

> Going back to loading documents and answering questions

During vector store index creation, three main steps take place after the documents are loaded:

- Splitting documents into chunks

- Creating embeddings for each document

- Storing documents and embeddings in a vectorstore
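
> The three steps can be illustrated without any API calls. The sketch below is a toy stand-in, not what `VectorstoreIndexCreator` does internally: it splits by fixed character windows, uses word counts in place of learned embeddings (the run above uses OpenAI embeddings), and keeps vectors in a Python list in place of Chroma. All names in it are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Step 1: cut the text into (optionally overlapping) character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Step 2: toy embedding, a bag-of-words count vector (not a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Step 3: keep (chunk, vector) pairs and search them by similarity."""

    def __init__(self):
        self.entries = []

    def add(self, chunks):
        self.entries.extend((chunk, embed(chunk)) for chunk in chunks)

    def search(self, query, k=1):
        query_vector = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vector, entry[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:k]]


text = (
    "Its famous bullet trains zip through the country at up to 300 km per hour. "
    "Its factories feature the latest generation of industrial robots."
)
store = ToyVectorStore()
store.add(split_into_chunks(text, chunk_size=80, overlap=0))
best = store.search("how fast is the bullet train", k=1)[0]
print(best)
```

> In the real pipeline the retrieved chunks are then passed to the LLM as context for the question, which is the part `index.query` adds on top of the search.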

**Exploring chains in LangChain**

In [77]:
human_message_prompt = HumanMessagePromptTemplate(
        prompt=PromptTemplate(
            template="What are some advantages of {product}? Limit to two advantages.",
            input_variables=["product"],
        )
    )
human_message_prompt

HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['product'], output_parser=None, partial_variables={}, template='What are some advantages of {product}? Limit to two advantages.', template_format='f-string', validate_template=True), additional_kwargs={})

In [78]:
chat_prompt_template = ChatPromptTemplate.from_messages([human_message_prompt])
chat_prompt_template

ChatPromptTemplate(input_variables=['product'], output_parser=None, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['product'], output_parser=None, partial_variables={}, template='What are some advantages of {product}? Limit to two advantages.', template_format='f-string', validate_template=True), additional_kwargs={})])
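
> The `template_format='f-string'` field in the output above indicates that placeholders use Python's brace syntax. A rough plain-Python picture of the substitution follows; `template_variables` and `fill_template` are hypothetical helpers written for illustration, not LangChain functions.

```python
from string import Formatter

TEMPLATE = "What are some advantages of {product}? Limit to two advantages."


def template_variables(template):
    """Return the placeholder names found in a brace-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def fill_template(template, **values):
    """Substitute values, refusing to run if a placeholder has no value."""
    missing = [name for name in template_variables(template) if name not in values]
    if missing:
        raise KeyError(f"missing template variables: {missing}")
    return template.format(**values)


print(template_variables(TEMPLATE))
print(fill_template(TEMPLATE, product="binoculars"))
```

> The first line printed corresponds to `input_variables` in the prompt object; the second is the text that would be sent as the human message.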

In [79]:
chat = ChatOpenAI(temperature=0.9)
chain = LLMChain(llm=chat, prompt=chat_prompt_template)
print(chain.run("binoculars"))

1. Increased magnification: Binoculars provide a magnified view of distant objects that is not possible with the naked eye, allowing users to see greater detail and clarity.

2. Portability: Binoculars are lightweight and portable, making them easy to carry on outdoor adventures or to events where distant objects might be viewed, such as sporting events or concerts.


In [80]:
second_prompt = PromptTemplate(
    input_variables=["input_advantages"],
    template="""For each of the advantages listed here: 
    {input_advantages}
    ,
    what could be some disadvantages? explain.
    """,
)
chain_two = LLMChain(llm=chat, prompt=second_prompt)

In [81]:
overall_chain = SimpleSequentialChain(chains=[chain, chain_two], verbose=True)

after_chained_response = overall_chain.run("binoculars")
print(after_chained_response)



> Entering new SimpleSequentialChain chain...
1. Magnification: Binoculars allow you to see objects in greater detail and at a greater distance than you would be able to with just your naked eye.

2. Portability: Binoculars are generally small and lightweight, making them easy to carry around and use on-the-go. They can be used for a variety of activities such as hiking, bird watching, and sports events.
1. Magnification: One disadvantage of binoculars with high magnification is that they may not offer a wide field of view, meaning that you may have to scan across the area you are observing to get a full picture. Additionally, higher magnification also means that any hand movement or slight shake can cause the image to blur or become distorted.

2. Portability: While binoculars are generally portable, some models may be heavier or bulkier than others. This can be inconvenient if you have limited space or are trying to minimize the weight of your 
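
> The trace above shows the first chain's output becoming the second chain's input. Conceptually that is function composition; the sketch below mimics it with two stand-in functions in place of LLM calls. It is an illustration of the idea under that assumption, not LangChain's implementation, and every name in it is hypothetical.

```python
def run_sequentially(steps, first_input):
    """Feed each step's output into the next step and return the final output."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Stand-ins for the two LLM calls; a real chain would query a model here.
def list_advantages(product):
    return f"1. {product} magnify distant objects. 2. {product} are portable."


def list_disadvantages(advantages):
    return f"Disadvantages to weigh against: {advantages}"


result = run_sequentially([list_advantages, list_disadvantages], "binoculars")
print(result)
```

> This only works because each step takes a single input and returns a single output, which is the same restriction `SimpleSequentialChain` places on the chains it links.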

**Working with LangChain agents**

In [86]:
gpt3 = OpenAI(model_name='text-davinci-003')

In [88]:
tools = load_tools(["wikipedia", "llm-math"], llm=gpt3)

In [89]:
agent = initialize_agent(tools, llm=gpt3, agent="zero-shot-react-description", verbose=True)

In [91]:
agent.run("What is the area of Japan? give me the area of Tokyo and Osaka together as percentage of the total area of Japan.")



> Entering new AgentExecutor chain...
 I should look up the areas of Tokyo and Osaka, and then compare them to the total area of Japan.
Action: Wikipedia
Action Input: Area of Japan
Observation: Page: Geography of Japan
Summary: Japan is an archipelagic country comprising a stratovolcanic archipelago over 3,000 km (1,900 mi) along the Pacific coast of East Asia. It consists of 14,125 islands. The five main islands are Hokkaido, Honshu, Kyushu, Shikoku and Okinawa. The other 14,120 islands are classified as "remote islands" by the Japanese government. The Ryukyu Islands and Nanpō Islands are south and east of the main islands.
The territory covers 377,973.89 km2 (145,936.53 sq mi). It is the fourth largest island country in the world and the largest island country in East Asia. The country has the 6th longest coastline at 29,751 km (18,486 mi) and the 8th largest Exclusive Economic Zone of 4,470,000 km2 (1,730,000 sq mi) in the world.The terrain i

'Tokyo and Osaka together make up 15.49% of the total area of Japan.'
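
> In the trace above the model writes an `Action:` line naming a tool and an `Action Input:` line with its argument; the tool's result comes back as the `Observation:`. The sketch below parses that text format and dispatches to a stand-in tool. It is a toy built from the trace, not LangChain's parser, and `parse_action` and `TOOLS` are hypothetical names.

```python
import re

# Matches the two-line "Action / Action Input" block seen in the agent trace.
ACTION_PATTERN = re.compile(
    r"Action:\s*(?P<tool>.+?)\s*\nAction Input:\s*(?P<tool_input>.+)"
)


def parse_action(llm_text):
    """Return (tool_name, tool_input), or None if the text names no action."""
    match = ACTION_PATTERN.search(llm_text)
    if match is None:
        return None
    tool = match.group("tool").strip()
    tool_input = match.group("tool_input").strip().strip('"')
    return tool, tool_input


# Stand-in tool registry; a real agent would call Wikipedia here.
TOOLS = {
    "Wikipedia": lambda query: f"(stand-in) summary for {query!r}",
}

step = (
    " I should look up the areas of Tokyo and Osaka, and then compare them "
    "to the total area of Japan.\n"
    "Action: Wikipedia\n"
    "Action Input: Area of Japan"
)

tool, tool_input = parse_action(step)
observation = TOOLS[tool](tool_input)
print(tool, "|", tool_input)
```

> A real agent repeats this parse, call, and observe cycle until the model produces a final answer instead of another action.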

> The LLM-math answer of 15.49% is far off: the manual check below gives about 1.08%. Next, try the Python REPL tool instead of LLM math.

In [96]:
(2193+1905)/377975

0.01084198690389576
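
> The same check, written out as a percentage. The three areas are the figures typed into the cell above (in km²); the notebook does not say where they were taken from, so treat them as the author's inputs. Neither the 15.49% reported above nor the 0.19% the second agent reports further down matches the result.

```python
# Areas in square kilometres, copied from the manual check above.
TOKYO_KM2 = 2193
OSAKA_KM2 = 1905
JAPAN_KM2 = 377975


def percent_of(part, whole):
    """Express part as a percentage of whole."""
    return 100 * part / whole


combined_share = percent_of(TOKYO_KM2 + OSAKA_KM2, JAPAN_KM2)
print(f"{combined_share:.2f}%")  # prints 1.08%
```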

In [92]:
tools = load_tools(["wikipedia", "python_repl"], llm=gpt3)

In [93]:
agent = initialize_agent(tools, llm=gpt3, agent="zero-shot-react-description", verbose=True)

In [94]:
agent.run("What is the area of Japan? give me the area of Tokyo and Osaka together as percentage of the total area of Japan.")



> Entering new AgentExecutor chain...
 I need to find the area of Japan, Tokyo, and Osaka and calculate the percentage.
Action: Wikipedia
Action Input: "Japan area"
Observation: Page: List of metropolitan areas in Japan
Summary: This is a list of metropolitan areas (都市圏, toshiken) in Japan by population as defined by the Statistics Bureau of Japan (SBJ) and the Center for Spatial Information Service of the University of Tokyo. The region containing most of the people in Japan between Tokyo and Fukuoka is often called the Taiheiyō Belt.

Page: List of Japanese prefectures by area
Summary: 

Page: List of dialing codes in Japan
Summary: These tables list the dialing codes (area codes) for calling land lines for various cities and districts in Japan, when dialing from within Japan. The leading 0 is omitted when calling from outside Japan. Cell phones use the dialing codes of 070, 080 or 090. IP-based phone services use the 050 dialing code.
Thou

'The area of Tokyo and Osaka together is 0.19% of the total area of Japan.'