<a href="https://colab.research.google.com/github/softmurata/colab_notebooks/blob/main/llm/mpt7b_instruct.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Installation

In [None]:
# CT_CUBLAS=1 builds ctransformers from source with CUDA (cuBLAS) support.
!CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers
!pip install langchain

Download model

In [None]:
!wget https://huggingface.co/TheBloke/MPT-7B-Instruct-GGML/resolve/main/mpt-7b-instruct.ggmlv3.q5_0.bin
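
The later cells assume the download landed at `/content/mpt-7b-instruct.ggmlv3.q5_0.bin` (Colab's default working directory). A minimal sketch of a sanity check before loading; the helper name `describe_model_file` is hypothetical and not part of any library.

```python
import os

def describe_model_file(path):
    """Return a short status string for a downloaded model file (hypothetical helper)."""
    if not os.path.isfile(path):
        return f"missing: {path}"
    # Report the size in GiB so a truncated download is easy to spot.
    size_gib = os.path.getsize(path) / 1024**3
    return f"found: {path} ({size_gib:.2f} GiB)"

print(describe_model_file("/content/mpt-7b-instruct.ggmlv3.q5_0.bin"))
```

If the check reports the file as missing, rerunning the `wget` cell is probably the first thing to try.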

Load model

In [None]:
from ctransformers.langchain import CTransformers

# Load the quantized GGML weights; model_type tells ctransformers which architecture to use.
llm = CTransformers(model='/content/mpt-7b-instruct.ggmlv3.q5_0.bin',
                    model_type='mpt')

Inference

In [6]:
from langchain import PromptTemplate, LLMChain
template = """Question: {question}
Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
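
To preview the text that the chain sends to the model, the template can be filled in by hand. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate` (an assumption: for a simple `{question}` placeholder the substitution should come out the same). The function name `preview_prompt` is hypothetical.

```python
# Stand-in for PromptTemplate: plain str.format on the same template text.
template = """Question: {question}
Answer:"""

def preview_prompt(question: str) -> str:
    """Fill the {question} placeholder to show the prompt text (hypothetical helper)."""
    return template.format(question=question)

prompt_text = preview_prompt("What is AI?")
print(prompt_text)
```

The model then continues the text after `Answer:`, which is why the template ends there.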

In [None]:
response = llm_chain.run("What is AI?")
print(response)

In [None]:
# Japanese prompt, meaning "Tell me about Japan".
response = llm_chain.run("日本について教えて")
print("response:", response)

Retrieval QA

In [None]:
!pip install InstructorEmbedding sentence_transformers
!pip install faiss-cpu

In [1]:
from langchain.vectorstores import FAISS
from ctransformers.langchain import CTransformers
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.document_loaders import TextLoader

In [None]:
# Write the sample text to /content/faiss.txt so that TextLoader can read it below.
faiss_text = """
In this article, we tried out ImageBind, a MultiModality AI toolset from MetaAI. The examples provided on their official website for Image, Audio, and Text showed impressive accuracy in classification. It is a remarkable achievement. We also attempted to use Depth data with the help of the DPT large depth estimator, but encountered some issues that need to be resolved. However, this algorithm could be widely applicable if the issues are resolved. It is a good point that MetaAI is not solely relying on LLM to solve their problems.

We plan to continue posting articles on LLM, Diffusion model, Image Analysis, and 3D in the future, so please stay tuned for more.
"""
with open("/content/faiss.txt", "w", encoding="utf-8") as f:
    f.write(faiss_text.strip() + "\n")

In [None]:
llm = CTransformers(model='/content/mpt-7b-instruct.ggmlv3.q5_0.bin',
                    model_type='mpt')
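# Note: all-MiniLM-L6-v2 is a general sentence-transformers model, not an Instructor model;
# an Instructor checkpoint such as hkunlp/instructor-base may be the intended pairing for this class.
instructor_embeddings = HuggingFaceInstructEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2",
                                                      model_kwargs={"device": "cpu"})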

# Load the document
loader = TextLoader("/content/faiss.txt")
documents = loader.load()

db = FAISS.from_documents(documents, instructor_embeddings)
# db = FAISS.load_local("faiss_index", instructor_embeddings)
retriever = db.as_retriever(search_kwargs={"k": 1})

qa_chain = RetrievalQA.from_chain_type(llm=llm,
                                       chain_type="stuff",
                                       retriever=retriever)
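
The cell above retrieves the `k` most similar documents and, with `chain_type="stuff"`, places them into a single prompt for the model. The sketch below illustrates that idea in plain Python. It is a toy under stated assumptions, not the FAISS or LangChain implementation: the bag-of-words `embed` function stands in for a real embedding model, and the prompt wording in `stuff_prompt` is made up for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    words = "".join(c.lower() if c.isalnum() else " " for c in text).split()
    return Counter(words)

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def stuff_prompt(query, docs):
    """'Stuff' all retrieved documents into one prompt (illustrative wording)."""
    context = "\n\n".join(docs)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )

docs = [
    "We tried out ImageBind, a multimodal AI toolset from MetaAI.",
    "We plan to continue posting articles on LLM, Diffusion model, Image Analysis, and 3D.",
]
query = "What is ImageBind?"
top_docs = retrieve(query, docs, k=1)
prompt_text = stuff_prompt(query, top_docs)
print(prompt_text)
```

In the notebook itself, `TextLoader` yields the whole file as one document and no text splitter is applied, so `k=1` returns the full text of `faiss.txt` as context.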

In [None]:
qa_chain.run("What is ImageBind?")