In [31]:
!pip -q install -U langchain pydantic chromadb pypdf sentence_transformers google-search-results wikipedia openai Xformers

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
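
The warning above names its own fix: set `TOKENIZERS_PARALLELISM` explicitly. A minimal sketch, assuming the cell runs before `transformers` or `tokenizers` is first used in the kernel (choosing `false` here is an assumption; `true` is also accepted):

```python
import os

# Set this before the first tokenizer call so the forked `!pip` subprocess
# does not trigger the parallelism warning.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```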


In [1]:
!pip show langchain

Name: langchain
Version: 0.0.179
Summary: Building applications with LLMs through composability
Home-page: https://www.github.com/hwchase17/langchain
Author: 
Author-email: 
License: MIT
Location: /home/ec2-user/anaconda3/envs/pytorch_p39/lib/python3.9/site-packages
Requires: aiohttp, async-timeout, dataclasses-json, numexpr, numpy, openapi-schema-pydantic, pydantic, PyYAML, requests, SQLAlchemy, tenacity
Required-by: 


In [2]:
import re
import sys
import json
import torch
import logging
from random import choice
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.llms import HuggingFacePipeline

In [3]:
logger = logging.getLogger('api')
logger.setLevel(logging.INFO)

logHandler = logging.StreamHandler(sys.stdout)
logger.addHandler(logHandler)

In [4]:
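def setup_model(model_name: str, cache_dir: str = None):
    """Load the tokenizer and an 8-bit quantized causal LM for `model_name`."""
    tokenizer = AutoTokenizer.from_pretrained(
        model_name,
        cache_dir=cache_dir
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        load_in_8bit=True,          # quantize weights with bitsandbytes
        torch_dtype=torch.float16,  # dtype for the layers that are not quantized
        device_map='auto',          # let accelerate place layers on the available devices
        low_cpu_mem_usage=True,
        cache_dir=cache_dir,
    )
    return tokenizer, model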

In [5]:
model_name='TheBloke/wizardLM-7B-HF'
cache_dir='/home/ec2-user/SageMaker/.cache'
tokenizer, model = setup_model(model_name, cache_dir)


Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
bin /home/ec2-user/anaconda3/envs/pytorch_p39/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda117.so
CUDA SETUP: CUDA runtime path found: /home/ec2-user/anaconda3/envs/pytorch_p39/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/ec2-user/anaconda3/envs/pytorch_p39/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...


Either way, this might cause trouble in the future:
If you get `CUDA error: invalid device function` errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
  warn(msg)


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [6]:
!nvidia-smi

Fri May 26 04:43:05 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA A10G         On   | 00000000:00:1E.0 Off |                    0 |
|  0%   27C    P0    62W / 300W |  18574MiB / 23028MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [7]:
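pipe = pipeline(
    task='text-generation',
    model=model,
    tokenizer=tokenizer,
    max_length=2048,          # cap on prompt tokens plus generated tokens
    temperature=0,            # sampling settings; these only take effect when do_sample=True
    top_p=0.95,
    repetition_penalty=1.15,  # values above 1 discourage repeated tokens
)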

In [8]:
llm = HuggingFacePipeline(pipeline=pipe)
print(llm('What is the capital of England?'))


London is the capital city of England.


## Set up LangChain
- Get a `SERPAPI_API_KEY` from https://serpapi.com/

In [9]:
import os
from langchain.agents import load_tools
from langchain.agents import initialize_agent

In [10]:
os.environ["SERPAPI_API_KEY"] = ""
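
The empty string above is a placeholder for a real key. As a hedged sketch, a small helper can fail fast when a key is missing instead of letting the search tool fail later. `require_env` is a hypothetical name, not part of LangChain:

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the notebook.")
    return value


# Usage (raises RuntimeError while the key is still empty):
# serpapi_key = require_env("SERPAPI_API_KEY")
```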

## Run the cell below to use an OpenAI model instead

```python
from langchain.llms import OpenAI

# https://platform.openai.com/account/api-keys
os.environ["OPENAI_API_KEY"] = ""

llm = OpenAI(temperature=0)
```

In [33]:
agent = initialize_agent(tools, 
                         llm, 
                         agent="zero-shot-react-description", 
                         verbose=True)

In [34]:
logger.info(agent.agent.llm_chain.prompt.template)

Answer the following questions as best you can. You have access to the following tools:

Search: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.
Wikipedia: A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.
terminal: Run shell commands on this Linux machine.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Search, Calculator, Wikipedia, terminal]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!
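
The format in the template above is what the agent has to parse out of each completion. The following is a simplified sketch of that step, not LangChain's actual parser; `parse_step` and the regular expression are illustrative assumptions:

```python
import re
from typing import Tuple

# Matches an "Action:" line immediately followed by an "Action Input:" line.
_ACTION_RE = re.compile(
    r"^Action:[ \t]*(?P<tool>.*?)[ \t]*\nAction Input:[ \t]*(?P<tool_input>.*)",
    re.MULTILINE | re.DOTALL,
)


def parse_step(text: str) -> Tuple[str, str]:
    """Return ("final", answer) for a final answer, else (tool_name, tool_input)."""
    if "Final Answer:" in text:
        return "final", text.split("Final Answer:", 1)[1].strip()
    match = _ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {text!r}")
    tool = match.group("tool").strip()
    tool_input = match.group("tool_input").strip().strip('"')
    return tool, tool_input
```

A completion that leaves the `Action:` line blank parses to an empty tool name, which no tool list can match.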



## Run

In [36]:
agent.run("Who is the United States President? What is his current age divided by 2?")

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details..




> Entering new AgentExecutor chain...


Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details..


KeyboardInterrupt: 

## Upgrade agents

In [21]:
tools = load_tools(["serpapi", "llm-math", "wikipedia", "terminal"], llm=llm)

In [22]:
agent = initialize_agent(tools, 
                         llm, 
                         agent="zero-shot-react-description", 
                         verbose=True)

In [23]:
logger.info(agent.agent.llm_chain.prompt.template)

Answer the following questions as best you can. You have access to the following tools:

Search: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.
Wikipedia: A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.
terminal: Run shell commands on this Linux machine.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Search, Calculator, Wikipedia, terminal]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!



In [24]:
agent.run("Who is the head of DeepMind")



> Entering new AgentExecutor chain...
 I should use the search function
Action: Search
Action Input: "Who is the head of DeepMind"
Observation: Demis Hassabis is the co-founder and CEO of DeepMind, a neuroscience-inspired AI company, bought by Google in Jan 2014 in their largest European acquisition to date.
Thought: That's correct.
Action: Next?
Action Input: None
Observation: Next? is not a valid tool, try another one.
Thought: I could use the calculator to find out how much money DeepMind has raised in funding.
Action: Calculator
Action Input: What is the value of x + y = 5

ValueError: unknown format from LLM: when x = 4 and y = 3?

```text
x + y = 5
x = 4, y = 3
```
...numexpr.evaluate("x + y = 5")...
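
The traceback above ends with an equation (`x + y = 5`) being handed to an expression evaluator, which only accepts plain arithmetic. As a hedged illustration, the sketch below validates input before evaluating it. It uses the standard-library `ast` module rather than `numexpr`, and `evaluate_arithmetic` is a hypothetical helper, not the Calculator tool's implementation:

```python
import ast
import operator

# Only these binary operators are allowed in an expression.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def _evaluate(node):
    """Recursively evaluate numeric literals, unary minus, and allowed binary operators."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError(f"Unsupported element in expression: {ast.dump(node)}")


def evaluate_arithmetic(expression: str) -> float:
    """Evaluate a plain arithmetic expression, rejecting equations, names, and prose."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"Not a plain arithmetic expression: {expression!r}") from exc
    return float(_evaluate(tree.body))
```

With this guard, `evaluate_arithmetic("x + y = 5")` raises a `ValueError` that names the bad input.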

In [25]:
agent.run("Where is DeepMind's office?")



> Entering new AgentExecutor chain...
 I should use the calculator to find the distance between London and San Francisco.
Action: 
Action Input: 
Observation:  is not a valid tool, try another one.
Thought: I should use the search function to look up "DeepMind's office location".
Action: 
Action Input: 
Observation:  is not a valid tool, try another one.
Thought: I should use the Wikipedia wrapper to look up "DeepMind" and then click on the "Office" link in the infobox.
Action: 
Action Input: 
Observation:  is not a valid tool, try another one.
Thought: I should use the terminal to run a command to open the DeepMind website and then navigate to the "About Us" page.
Action: 
Action Input: 
Observation:  is not a valid tool, try another one.
Thought:

KeyboardInterrupt: 
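
In the run above the model keeps emitting a blank `Action:` line, so the loop only ends when it is interrupted by hand. One common mitigation is an iteration cap. LangChain's `AgentExecutor` has a `max_iterations` setting for this; whether `initialize_agent` forwards it in this exact version is an assumption worth checking. The idea can be sketched in plain Python with a stubbed model. `run_with_cap` and its `step` callable are hypothetical, not LangChain APIs:

```python
from typing import Callable, Dict


def run_with_cap(
    step: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_iterations: int = 5,
) -> str:
    """Drive a Thought/Action loop and stop after `max_iterations` model calls."""
    scratchpad = f"Question: {question}\nThought:"
    for _ in range(max_iterations):
        output = step(scratchpad)
        if "Final Answer:" in output:
            return output.split("Final Answer:", 1)[1].strip()
        # Extract the tool name and its input; both stay empty if the model left them blank.
        action, action_input = "", ""
        for line in output.splitlines():
            if line.startswith("Action Input:"):
                action_input = line[len("Action Input:"):].strip()
            elif line.startswith("Action:"):
                action = line[len("Action:"):].strip()
        if action in tools:
            observation = tools[action](action_input)
        else:
            observation = f"{action} is not a valid tool, try another one."
        scratchpad += f"{output}\nObservation: {observation}\nThought:"
    return "Agent stopped: iteration limit reached."
```

A stub that always returns a blank action now ends after `max_iterations` calls instead of running until a `KeyboardInterrupt`.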