# Deeplake on GPT and GGUF
https://python.langchain.com/docs/use_cases/question_answering/how_to/code/code-analysis-deeplake

This notebook demonstrates the difference between ChatGPT and Llama2 when answering questions about the LangChain source code, with Deep Lake as the vector store.

In [1]:
# openai
# !pip3 install openai tiktoken

## Data Preparation

In [2]:
from langchain.document_loaders import TextLoader
import os
root_dir = "mydata/langchain-sourcecode/libs"

docs = []
files = []
for dirpath, dirnames, filenames in os.walk(root_dir):
    for file in filenames:
        # Index Python sources only and skip virtual-environment directories.
        if file.endswith(".py") and "venv" not in dirpath:
            try:
                filepath = os.path.join(dirpath, file)
                loader = TextLoader(filepath, encoding="utf-8")
                files.append(filepath)
                docs.extend(loader.load_and_split())
            except Exception:
                # Skip files that cannot be read or decoded as UTF-8.
                pass
print(f"load {len(docs)} docs from {len(files)} *.py")


load 3317 docs from 1953 *.py


### CharacterTextSplitter VS RecursiveCharacterTextSplitter
#### RecursiveCharacterTextSplitter
> https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/recursive_text_splitter
> It is parameterized by a list of characters and tries to split on them in order until the chunks are small enough. The default list is \["\n\n", "\n", " ", ""\]

#### CharacterTextSplitter
> https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/character_text_splitter
> This is the simplest method. It splits on a single separator (by default "\n\n") and measures chunk length by number of characters.

In [3]:
from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(chunk_size=2048, chunk_overlap=0)
split_docs = text_splitter.split_documents(docs)
print(f"There are {len(split_docs)} documents after split")

Created a chunk of size 2456, which is longer than the specified 2048
Created a chunk of size 2222, which is longer than the specified 2048
Created a chunk of size 3323, which is longer than the specified 2048
Created a chunk of size 2719, which is longer than the specified 2048
Created a chunk of size 2638, which is longer than the specified 2048
Created a chunk of size 2405, which is longer than the specified 2048
Created a chunk of size 3173, which is longer than the specified 2048
Created a chunk of size 2573, which is longer than the specified 2048
Created a chunk of size 2248, which is longer than the specified 2048
Created a chunk of size 2286, which is longer than the specified 2048
Created a chunk of size 2196, which is longer than the specified 2048
Created a chunk of size 2175, which is longer than the specified 2048
Created a chunk of size 3268, which is longer than the specified 2048
Created a chunk of size 2800, which is longer than the specified 2048
Created a chunk of s

There are 5853 documents after split
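
The "Created a chunk of size ..." warnings above follow from how `CharacterTextSplitter` works: it splits on one separator only, so a block that contains no blank line cannot be cut further and ends up longer than `chunk_size`. The sketch below is a simplified pure-Python illustration of the two strategies, not LangChain's actual implementation; `split_single` and `split_recursive` are hypothetical names introduced here.

```python
def _merge(pieces, sep, chunk_size):
    """Greedily join adjacent pieces while the result fits in chunk_size."""
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece  # may still be longer than chunk_size
    if current:
        chunks.append(current)
    return chunks


def split_single(text, chunk_size, sep="\n\n"):
    """Split on one separator only; oversized pieces are kept whole."""
    return _merge(text.split(sep), sep, chunk_size)


def split_recursive(text, chunk_size, seps=("\n\n", "\n", " ", "")):
    """Retry oversized pieces with the next, finer separator."""
    if len(text) <= chunk_size:
        return [text]
    if not seps or seps[0] == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = seps[0], seps[1:]
    chunks, small = [], []
    for piece in text.split(sep):
        if len(piece) > chunk_size:
            # Flush what fits so far, then split the big piece more finely.
            chunks.extend(_merge(small, sep, chunk_size))
            small = []
            chunks.extend(split_recursive(piece, chunk_size, finer))
        else:
            small.append(piece)
    chunks.extend(_merge(small, sep, chunk_size))
    return chunks


# A short paragraph followed by a 44-character block without any blank line.
long_block = "\n".join(["x" * 8] * 5)
text = "short para\n\n" + long_block

single_chunks = split_single(text, 20)
recursive_chunks = split_recursive(text, 20)
print([len(c) for c in single_chunks])     # [10, 44]
print([len(c) for c in recursive_chunks])  # [10, 17, 17, 8]
```

In the notebook itself, the analogous change would be to use `RecursiveCharacterTextSplitter(chunk_size=2048, chunk_overlap=0)` from `langchain.text_splitter`, which should keep chunks within the limit at the cost of producing more of them.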


In [11]:
# Store data locally
# https://python.langchain.com/docs/integrations/vectorstores/activeloop_deeplake#deep-lake-locally
def make_local_vectorstore(split_docs,embeddings, dataset_path="./.my_deeplake/"):
    from langchain.vectorstores import DeepLake
    local_vectorstore=DeepLake.from_documents(split_docs, dataset_path=dataset_path, embedding=embeddings, overwrite=True) #, read_only=True
    return local_vectorstore 


In [5]:
# Store data on activeloop hub
def make_hub_vectorstore(split_docs, embeddings):
    from langchain.vectorstores import DeepLake
    # The key file holds the Activeloop username on line 1 and the token on line 2.
    with open('./mydata/activeloop_key.txt', 'r') as file:
        username = file.readline().strip()
        activeloop_key = file.readline().strip()

    os.environ['ACTIVELOOP_TOKEN'] = activeloop_key

    hub_vectorstore = DeepLake.from_documents(
        split_docs, embeddings, dataset_path=f"hub://{username}/langchain-code", runtime={"tensor_db": True} #, overwrite=True
    )
    return hub_vectorstore

In [6]:
# Our inference test function
from timeit import default_timer as timer
from langchain.chains import ConversationalRetrievalChain
def inferenceQA(chat_model, vectorstore):
    qa = ConversationalRetrievalChain.from_llm(chat_model, retriever=vectorstore.as_retriever())
    questions = [
        "What is the class hierarchy?",
        "What classes are derived from the Chain class?",
        "What kind of retrievers does LangChain have?",
    ]
    chat_history = []
    qa_dict = {}
    
    for question in questions:
        print(f"-> **Question**: {question} \n")
        start=timer()
        result = qa({"question": question, "chat_history": chat_history})
        end=timer()
        print(f"**{int((end-start)*100)/100.0} secs**\n")
        print(f"**Answer**: {result['answer']} \n")
        chat_history.append((question, result["answer"]))
        qa_dict[question] = result["answer"]
    return qa_dict

## OpenAI

In [7]:
openai_api_key=""
with open('./mydata/openai_api_key.txt', 'r') as file:
    openai_api_key = file.read().strip()
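
Reading the key from a file works, but displaying any object that stores the key can expose it in the cell output. A minimal sketch, assuming the conventional `OPENAI_API_KEY` environment variable with the key file as a fallback; `load_secret` and `mask_secret` are hypothetical helpers, not part of LangChain or the OpenAI client.

```python
import os
from pathlib import Path


def load_secret(env_var, fallback_path=None):
    """Return a secret from the environment, or from a key file if unset."""
    value = os.environ.get(env_var, "").strip()
    if value:
        return value
    if fallback_path is not None and Path(fallback_path).is_file():
        with open(fallback_path, "r", encoding="utf-8") as fh:
            return fh.read().strip()
    raise RuntimeError(f"{env_var} is not set and no key file was found")


def mask_secret(secret, visible=3):
    """Keep only the first few characters so the value is safe to print."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * 8


# Demo with a placeholder value; a real key would come from the environment.
os.environ["DEMO_API_KEY"] = "sk-abc123"
demo_key = load_secret("DEMO_API_KEY")
print(mask_secret(demo_key))  # sk-********
```

In the notebook this would read `openai_api_key = load_secret("OPENAI_API_KEY", "./mydata/openai_api_key.txt")`, and only `mask_secret(openai_api_key)` would ever be printed.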

In [8]:
# ======= OpenAI Embeddings ==========
from langchain.embeddings.openai import OpenAIEmbeddings

gpt_embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
gpt_embeddings

OpenAIEmbeddings(client=<class 'openai.api_resources.embedding.Embedding'>, model='text-embedding-ada-002', deployment='text-embedding-ada-002', openai_api_version='', openai_api_base='', openai_api_type='', openai_proxy='', embedding_ctx_length=8191, openai_api_key='sk-...', openai_organization='', allowed_special=set(), disallowed_special='all', chunk_size=1000, max_retries=6, request_timeout=None, headers=None, tiktoken_model_name=None, show_progress_bar=False, model_kwargs={}, skip_empty=False)

In [12]:
# gpt_vectorstore = make_hub_vectorstore(split_docs, gpt_embeddings)
gpt_vectorstore = make_local_vectorstore(split_docs, gpt_embeddings)



creating embeddings: 100% 920/920 [11:12<00:00,  1.37it/s]
100% 5853/5853 [00:04<00:00, 1276.16it/s]


Dataset(path='./.my_deeplake/', tensors=['text', 'metadata', 'embedding', 'id'])

  tensor      htype       shape       dtype  compression
  -------    -------     -------     -------  ------- 
   text       text      (5853, 1)      str     None   
 metadata     json      (5853, 1)      str     None   
 embedding  embedding  (5853, 1536)  float32   None   
    id        text      (5853, 1)      str     None   




In [13]:
# ======= OpenAI Model==========
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain

chat_model = ChatOpenAI(openai_api_key=openai_api_key)

In [14]:
inferenceQA(chat_model, gpt_vectorstore)

-> **Question**: What is the class hierarchy? 

**3.08 secs**

**Answer**: The class hierarchy for Memory is as follows:

BaseMemory --> BaseChatMemory --> <name>Memory

The class hierarchy for ChatMessageHistory is as follows:

BaseChatMessageHistory --> <name>ChatMessageHistory

The class hierarchy for Document Transformers is as follows:

BaseDocumentTransformer --> <name> 

-> **Question**: What classes are derived from the Chain class? 

**7.63 secs**

**Answer**: The following classes are derived from the Chain class:

- APIChain
- OpenAPIEndpointChain
- AnalyzeDocumentChain
- MapReduceDocumentsChain
- MapRerankDocumentsChain
- ReduceDocumentsChain
- RefineDocumentsChain
- StuffDocumentsChain
- ConstitutionalChain
- ConversationChain
- ChatVectorDBChain
- ConversationalRetrievalChain
- FlareChain
- ArangoGraphQAChain
- GraphQAChain
- GraphCypherQAChain
- FalkorDBQAChain
- HugeGraphQAChain
- KuzuQAChain
- NebulaGraphQAChain
- NeptuneOpenCypherQAChain
- GraphSparqlQAChain
- Hypothe

## Llama2 (GGUF on CTransformers)

In [15]:
# ======== GGUF =========
from langchain.llms import CTransformers
import os

model_id=os.path.abspath('./models/Llama-2-7b-Chat-GGUF')

# context_length (in tokens) must comfortably exceed the text_splitter chunk_size (2048 characters here).
# If the context length is too short, the output will be poor.
config = {'max_new_tokens': 2048, 'repetition_penalty': 1.1,'context_length':4096}
# https://api.python.langchain.com/en/latest/llms/langchain.llms.ctransformers.CTransformers.html
cTransformers_llm = CTransformers(model=model_id, model_file="llama-2-7b-chat.Q4_K_M.gguf", config=config)

In [16]:
# embedding data
from mylib.MyModelUtils import MyModelUtils
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
import os
llama2_embeddings=HuggingFaceEmbeddings(
    model_name=os.path.abspath("./models/sentence-transformers/all-mpnet-base-v2"), 
    model_kwargs={"device": MyModelUtils.device()}
)

In [17]:
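# llama2_vectorstore = make_hub_vectorstore(split_docs, llama2_embeddings)
# Note: this call uses the default dataset_path ("./.my_deeplake/") with overwrite=True,
# so it replaces the on-disk dataset that was built for gpt_vectorstore.
llama2_vectorstore = make_local_vectorstore(split_docs, llama2_embeddings)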



creating embeddings: 100% 920/920 [03:27<00:00,  4.44it/s]
100% 5853/5853 [00:04<00:00, 1284.31it/s]


Dataset(path='./.my_deeplake/', tensors=['text', 'metadata', 'embedding', 'id'])

  tensor      htype       shape      dtype  compression
  -------    -------     -------    -------  ------- 
   text       text      (5853, 1)     str     None   
 metadata     json      (5853, 1)     str     None   
 embedding  embedding  (5853, 768)  float32   None   
    id        text      (5853, 1)     str     None   
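
The two dataset summaries show embedding matrices of different widths: 1536 columns for the OpenAI store and 768 for the all-mpnet-base-v2 store. A query vector can only be compared with rows of the same width, so each store has to be queried with the embedding model that built it. A minimal sketch of top-k retrieval, assuming cosine similarity (the metric Deep Lake actually applies depends on its configuration); `cosine_similarity` and `top_k` are hypothetical helpers written for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=4):
    """Return the indices of the k rows most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, row), i) for i, row in enumerate(rows)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy 3-dimensional "store" standing in for the (5853, 768) embedding tensor.
rows = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
best = top_k([1.0, 0.0, 0.0], rows, k=2)
print(best)  # [0, 2]
```

Passing a 1536-dimensional query to a 768-dimensional store would raise the `ValueError` in this sketch, which mirrors why the LLM can be swapped freely while the embedding model cannot.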




In [None]:
inferenceQA(cTransformers_llm, gpt_vectorstore)

-> **Question**: What is the class hierarchy? 

**220.42 secs**

**Answer**:  The class hierarchy for Memory is shown below.

` class Hierarchy  Memory

BaseMemory --> BaseChatMemory --> <name>Memory  # Examples: ZepMemory, MotorheadMemory

Main helpers:

class BaseChatMessageHistory

Class hierarchy for Chat Message History:

.. code-block::

    BaseChatMessageHistory --> <name>ChatMessageHistory  # Example: ZepChatMessageHistory

Main helpers:

class AIMessage, BaseMessage, HumanMessage

-> **Question**: What classes are derived from the Chain class? 

**189.13 secs**

**Answer**:  I don't know. The code snippet provided doesn't include the definition of the Chat Message History class, so I can't determine the hierarchy without further context. 

-> **Question**: What kind of retrievers does LangChain have? 



In [None]:
inferenceQA(cTransformers_llm, llama2_vectorstore)