In [29]:
# %%bash
# pip install langchain -U -i https://pypi.douban.com/simple
# pip install sentence-transformers -i https://pypi.douban.com/simple
# pip install pydantic==1.10.8 -i https://pypi.douban.com/simple
# pip install chromadb #-i https://pypi.douban.com/simple

# RAG

## Internet

https://zhuanlan.zhihu.com/p/643233392

### 1. Get the Doc

In [4]:
import requests
from bs4 import BeautifulSoup

# url = "https://en.wikipedia.org/wiki/GPT-4"
url = "https://openai.com/research/gpt-4"
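response = requests.get(url, timeout=30)
response.raise_for_status()  # stop early on HTTP errors instead of parsing an error page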

soup = BeautifulSoup(response.content, 'html.parser')

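# find the content div
content_div = soup.find('div', {'class': 'ui-block--text'})
if content_div is None:
    raise ValueError("content div not found; the page layout may have changed")

# remove unwanted elements from div
unwanted_tags = ['sup', 'span', 'table', 'ul', 'ol']
for tag in unwanted_tags:
    for match in content_div.find_all(tag):  # find_all replaces the deprecated findAll
        match.extract()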

article_text = content_div.get_text()

print(article_text)

We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks. For example, it passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%. We’ve spent 6 months iteratively aligning GPT-4 using lessons from our adversarial testing program as well as ChatGPT, resulting in our best-ever results (though far from perfect) on factuality, steerability, and refusing to go outside of guardrails.Over the past two years, we rebuilt our entire deep learning stack and, together with Azure, co-designed a supercomputer from the ground up for our workload. A year ago, we trained GPT-3.5 as a first “test run” of the system. We found and fixed some bugs and improved our 

In [23]:
article_text = """
We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks. For example, it passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%. We’ve spent 6 months iteratively aligning GPT-4 using lessons from our adversarial testing program as well as ChatGPT, resulting in our best-ever results (though far from perfect) on factuality, steerability, and refusing to go outside of guardrails.Over the past two years, we rebuilt our entire deep learning stack and, together with Azure, co-designed a supercomputer from the ground up for our workload. A year ago, we trained GPT-3.5 as a first “test run” of the system. We found and fixed some bugs and improved our theoretical foundations. As a result, our GPT-4 training run was (for us at least!) unprecedentedly stable, becoming our first large model whose training performance we were able to accurately predict ahead of time. As we continue to focus on reliable scaling, we aim to hone our methodology to help us predict and prepare for future capabilities increasingly far in advance—something we view as critical for safety.We are releasing GPT-4’s text input capability via ChatGPT and the API (with a waitlist). To prepare the image input capability for wider availability, we’re collaborating closely with a single partner to start. We’re also open-sourcing OpenAI Evals, our framework for automated evaluation of AI model performance, to allow anyone to report shortcomings in our models to help guide further improvements.
"""

### 2. Split the Doc

In [24]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size = 100,
    chunk_overlap  = 20,
    length_function = len,
)

texts = text_splitter.create_documents([article_text])
print(texts[0])
print(texts[1])

page_content='We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is'
page_content='learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text'
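
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping chunking in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as paragraphs, line breaks, and spaces, which is why the chunks above end on word boundaries and are shorter than 100 characters. The function name `sliding_window_chunks` is hypothetical and only illustrates the windowing idea.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each printed window starts with the last three characters of the previous one. That shared tail is the overlap, which keeps a sentence that straddles a boundary visible in both chunks.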


### 3. Text Chunks to Embeddings

In [25]:
from langchain.embeddings import HuggingFaceEmbeddings

embedding = HuggingFaceEmbeddings(model_name='shibing624/text2vec-base-chinese')

In [None]:
# vector_store = FAISS.load_local(vs_path, self.embeddings)

from langchain.vectorstores import Chroma

# use the text chunks and the embeddings model to fill our vector store
db = Chroma.from_documents(texts, embedding)

In [3]:
from langchain import PromptTemplate

user_question = "ChatGPT and GPT4, which one is more stable?"

In [None]:
# use our vector store to find similar text chunks
# note: LangChain's Chroma wrapper takes `k`; `n_results` is the native chromadb argument name
results = db.similarity_search(
    query=user_question,
    k=4
)
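
Conceptually, the similarity search above embeds the question and ranks the stored chunk vectors by how close they are to it. The sketch below shows that ranking with cosine similarity on toy two-dimensional vectors. Cosine similarity is chosen here for illustration only; the distance metric a vector store actually uses depends on its configuration. The names `cosine_similarity` and `top_k` and the toy vectors are assumptions, not part of LangChain or Chroma.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          chunk_vecs: Sequence[Sequence[float]],
          k: int) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# toy "embeddings": real models produce vectors with hundreds of dimensions
toy_chunk_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query_vector = [1.0, 0.1]

best = top_k(toy_query_vector, toy_chunk_vectors, k=2)
print(best)
```

The query points almost along the first axis, so chunk 0 ranks first and the diagonal chunk 2 second. A real store does the same ranking, usually with an approximate index rather than a full scan.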

In [5]:
results = """
[Document(page_content='As a result, our GPT-4 training run was (for us at least!) unprecedentedly stable, becoming our'),
 Document(page_content='GPT-4’s text input capability via ChatGPT and the API (with a waitlist). To prepare the image input'),
 Document(page_content='We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a'),
 Document(page_content='GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that,')]
"""

In [9]:
results

"\n[Document(page_content='As a result, our GPT-4 training run was (for us at least!) unprecedentedly stable, becoming our'),\n Document(page_content='GPT-4’s text input capability via ChatGPT and the API (with a waitlist). To prepare the image input'),\n Document(page_content='We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a'),\n Document(page_content='GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that,')]\n"

In [10]:
# define the prompt template
template = """
You are a chat bot who loves to help people! Given the following context sections, answer the
question using only the given context. If you are unsure and the answer is not
explicitly written in the documentation, say "Sorry, I don't know how to help with that."

Context sections:
{context}

Question:
{users_question}

Answer:
"""

prompt = PromptTemplate(template=template, input_variables=["context", "users_question"])

# fill the prompt template
prompt_text = prompt.format(context=results, users_question=user_question)
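
Passing the retrieved list straight into the template puts its Python repr, including the `Document(page_content=...)` wrappers, into the prompt. One possible alternative, sketched below under assumptions, is to join only the chunk text before formatting. `Chunk` is a hypothetical stand-in for a retrieved document (only a `page_content` attribute is assumed), and `build_context`, `DEMO_TEMPLATE`, and the separator are illustrative choices.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved document; only page_content is used."""
    page_content: str


def build_context(chunks: List[Chunk], separator: str = "\n---\n") -> str:
    """Join the text of each chunk so the prompt contains no object wrappers."""
    return separator.join(chunk.page_content.strip() for chunk in chunks)


# shortened version of the template used in this notebook
DEMO_TEMPLATE = (
    "Context sections:\n{context}\n\n"
    "Question:\n{users_question}\n\n"
    "Answer:\n"
)

retrieved = [
    Chunk("As a result, our GPT-4 training run was (for us at least!) unprecedentedly stable, becoming our"),
    Chunk("GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that,"),
]

context = build_context(retrieved)
demo_prompt = DEMO_TEMPLATE.format(context=context, users_question="Which one is more stable?")
print(demo_prompt)
```

Whether the cleaner context changes the model's answer is not something this notebook measures; the sketch only shows how the prompt text would differ.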

In [16]:
results

[Document(page_content='As a result, our GPT-4 training run was (for us at least!) unprecedentedly stable, becoming our'),
 Document(page_content='GPT-4’s text input capability via ChatGPT and the API (with a waitlist). To prepare the image input'),
 Document(page_content='We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a'),
 Document(page_content='GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that,')]

### 4. Go to LLM

In [7]:
# load_model is a local helper module (not shown here); it provides the `llm` callable used below
from load_model import *

  from pandas.core.computation.check import NUMEXPR_INSTALLED


sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml


The endpoint attribute has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.


In [11]:
# ask the defined LLM
llm(prompt_text)

'GPT-4 is more stable as per the given context.'

In [12]:
llm(user_question)

'It is difficult to determine which version of GPT (GPT3 or GPT4) is more stable as it depends on the specific context. However, GPT4 has been released recently and has some new features and improvements over GPT3. It is also designed to run more efficiently on hardware, which could make it more stable.'
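
The two calls above differ only in the prompt: the first one carries retrieved context, the second sends the bare question. A minimal sketch of how the steps could be wired into one function follows. `answer_with_retrieval`, `fake_retrieve`, and `fake_llm` are hypothetical stand-ins so the example runs without a model or a vector store; in the notebook the retriever would wrap `db.similarity_search` and the model would be `llm`.

```python
from typing import Callable, List

# shortened version of the template used in this notebook
RAG_TEMPLATE = (
    "Context sections:\n{context}\n\n"
    "Question:\n{users_question}\n\n"
    "Answer:\n"
)


def answer_with_retrieval(question: str,
                          retrieve: Callable[[str], List[str]],
                          llm: Callable[[str], str],
                          template: str = RAG_TEMPLATE) -> str:
    """Retrieve chunk texts, fill the template, and pass the prompt to the model."""
    context = "\n".join(retrieve(question))
    return llm(template.format(context=context, users_question=question))


seen_prompts: List[str] = []  # records what the stub model receives


def fake_retrieve(question: str) -> List[str]:
    # stub retriever: always returns one chunk from the article
    return ["As a result, our GPT-4 training run was (for us at least!) "
            "unprecedentedly stable, becoming our"]


def fake_llm(prompt: str) -> str:
    # stub model: remembers the prompt and returns a fixed string
    seen_prompts.append(prompt)
    return "stub answer"


question = "ChatGPT and GPT-4, which one is more stable?"
grounded = answer_with_retrieval(question, fake_retrieve, fake_llm)
ungrounded = fake_llm(question)

print(seen_prompts[0])  # prompt with context
print(seen_prompts[1])  # bare question
```

Keeping retrieval and generation behind plain callables makes it easy to swap the embedding model, vector store, or LLM without touching the prompt logic.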