*This notebook demonstrates how to use LangChain to build a retrieval-augmented generation (RAG) pipeline over a project’s GitHub issues.*

In [1]:
import kagglehub

# Download latest version
path = kagglehub.model_download("metaresearch/llama-3/transformers/8b-chat-hf")

print("Path to model files:", path)

Attaching model 'metaresearch/llama-3/transformers/8b-chat-hf' to your Kaggle notebook...


Path to model files: /kaggle/input/llama-3/transformers/8b-chat-hf/1


In [2]:
!pip install -q torch 
!pip install -q transformers 
!pip install -q accelerate 
!pip install -q bitsandbytes 
!pip install -q faiss-gpu 
!pip install -q langchain
!pip install -q langchain-community
!pip install -q RAGatouille

## Load data
Load all issues (open and closed, excluding pull requests) from the Run-llama/llama_index repo.

In [3]:
# github token
from getpass import getpass

ACCESS_TOKEN = getpass()



In [4]:
from langchain_community.document_loaders import GitHubIssuesLoader

loader = GitHubIssuesLoader(
    repo="Run-llama/llama_index", 
    access_token=ACCESS_TOKEN, 
    include_prs=False, # False: exclude pull requests
    state="all" # all: include open and closed issues
)

docs = loader.load()

## Retriever - embeddings
### Chunking
The most common and straightforward approach to chunking is to fix the chunk size and decide how much adjacent chunks should overlap.

**Parameters**
- chunk_size: maximum length of each chunk, in characters
- chunk_overlap: number of characters shared between adjacent chunks, so that context is not lost at chunk boundaries
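
To make the two parameters concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. The helper name `chunk_text` is hypothetical, and the logic is a simplification: `RecursiveCharacterTextSplitter` also tries to split on separators such as paragraph and line breaks before falling back to a hard cut.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last character of the previous one, which is the one-character overlap requested.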


In [5]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1024, 
    chunk_overlap=100
)

chunked_docs = splitter.split_documents(docs)

### Embedding and Retriever

**LangChain provides the interface to both the embedding model and FAISS.**
- Vector database: FAISS
- Embedding model: BAAI/bge-base-en-v1.5
- Retriever: the vector database wrapped as a retriever, which returns the chunks most similar to a query.
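
Conceptually, the retriever embeds the query and returns the k chunks whose vectors lie closest to it. The sketch below illustrates that idea with toy 2-D vectors and a brute-force scan. The function name `top_k_indices` and the vectors are made up for illustration, and the choice of Euclidean distance is an assumption about the index's default metric. FAISS performs the same kind of search with optimized index structures.

```python
import math
from typing import List, Sequence


def top_k_indices(query_vec: Sequence[float],
                  doc_vecs: List[Sequence[float]],
                  k: int = 2) -> List[int]:
    """Return indices of the k document vectors nearest to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: math.dist(query_vec, doc_vecs[i]),  # Euclidean distance
    )
    return ranked[:k]


# Toy "embeddings": three chunks and one query (illustrative values only).
doc_vecs = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
query_vec = [0.9, 1.0]
print(top_k_indices(query_vec, doc_vecs, k=2))  # [1, 0]
```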

In [6]:
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings

embed_model = HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5")

db = FAISS.from_documents(chunked_docs, embed_model)



In [7]:
retriever = db.as_retriever(
    search_type="similarity", 
    search_kwargs={"k": 2} # return top-k results
)

In [8]:
question = "I got an import error: cannot import name 'VectorStoreIndex' from 'llama_index'. How to fix it?"
docs = retriever.invoke(question)  # invoke() replaces the deprecated get_relevant_documents()


In [9]:
for doc in docs:
    print("\n==================================Top document==================================")
    print(doc.page_content)
    print("====================================Metadata====================================")
    print(doc.metadata)
    


### Bug Description

While running from llama_index import VectorStoreIndex, StorageContext, i got this error  cannot import name 'VectorStoreIndex' from 'llama_index' (/usr/local/lib/python3.10/dist-packages/llama_index/__init__.py)

### Version

0.9.28.post1

### Steps to Reproduce

i have no idea how to answer this question 

### Relevant Logs/Tracbacks

_No response_
{'url': 'https://github.com/run-llama/llama_index/issues/9948', 'title': "[Bug]: cannot import name 'VectorStoreIndex' from 'llama_index' (/usr/local/lib/python3.10/dist-packages/llama_index/__init__.py)", 'creator': 'Lavinia989', 'created_at': '2024-01-10T00:08:34Z', 'comments': 3, 'state': 'closed', 'labels': ['bug', 'triage'], 'assignee': None, 'milestone': None, 'locked': False, 'number': 9948, 'is_pull_request': False}

### Bug Description

I encountered a problem below. I received an error.
```
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
```

got some error 
cannot import name 'VectorSto

In [10]:
print(docs[1].page_content)

### Bug Description

I encountered a problem below. I received an error.
```
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
```

got some error 
cannot import name 'VectorStoreIndex' from 'llama_index.core' (unknown location)

### Version

0.10.11

### Steps to Reproduce

When I attempted to import, I encountered an error. I suspect that not all core modules were included correctly.

### Relevant Logs/Tracbacks


## Build LLM chain
- Reader LLM: Llama-3-8b-chat-hf
- Prompt template

## Build RAG chain
- Combine the retriever and the LLM chain

In [11]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_name = "/kaggle/input/llama-3/transformers/8b-chat-hf/1"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True, 
    bnb_4bit_use_double_quant=True, 
    bnb_4bit_quant_type="nf4", 
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(model_name)

`low_cpu_mem_usage` was None, now set to True since model is quantized.



Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [12]:
from langchain_community.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate
from transformers import pipeline
from langchain_core.output_parsers import StrOutputParser

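READER_LLM = pipeline(
    model=model,
    tokenizer=tokenizer,
    task="text-generation",
    temperature=0.2,        # low temperature keeps answers focused
    do_sample=True,         # sampling must be enabled for temperature to take effect
    return_full_text=True,  # output includes the prompt; it is stripped when the answer is printed
    max_new_tokens=512,     # upper bound on the number of generated tokens
)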



In [13]:
READER_LLM("What's 2+2=?")

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


[{'generated_text': 'What\'s 2+2=?"\n"Four!"\n"Ah, excellent! Now, what\'s 2+3=?"\n"Five!"\n"Ah, good! Now, what\'s 2+4=?"\n"Six!"\n"Ah, well done! Now, what\'s 2+5=?"\n"Seven!"\n"Ah, excellent! Now, what\'s 2+6=?"\n"Eight!"\n"Ah, well done! Now, what\'s 2+7=?"\n"Nine!"\n"Ah, good! Now, what\'s 2+8=?"\n"Ten!"\n"Ah, excellent! Now, what\'s 2+9=?"\n"Eleven!"\n"Ah, well done! Now, what\'s 2+10=?"\n"Twelve!"\n"Ah, excellent! Now, what\'s 2+11=?"\n"Thirteen!"\n"Ah, good! Now, what\'s 2+12=?"\n"Fourteen!"\n"Ah, well done! Now, what\'s 2+13=?"\n"Fifteen!"\n"Ah, excellent! Now, what\'s 2+14=?"\n"Sixteen!"\n"Ah, well done! Now, what\'s 2+15=?"\n"Seventeen!"\n"Ah, good! Now, what\'s 2+16=?"\n"Eighteen!"\n"Ah, excellent! Now, what\'s 2+17=?"\n"Nineteen!"\n"Ah, well done! Now, what\'s 2+18=?"\n"Twenty!"\n"Ah, excellent! Now, what\'s 2+19=?"\n"Twenty-One!"\n"Ah, well done! Now, what\'s 2+20=?"\n"Twenty-Two!"\n"Ah, excellent! Now, what\'s 2+21=?"\n"Twenty-Three!"\n"Ah, good! Now, what\'s 2+22=?"\n"T

In [14]:
prompt_template = """
<|system|>
Answer the question based on your knowledge. Use the following context to help:

{context}

</s>
<|user|>
{question}
</s>
<|assistant|>

 """

prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=prompt_template,
)
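
The tags used in the template above (`<|system|>`, `</s>`, `<|user|>`) follow a Zephyr-style chat layout rather than the format Llama 3 chat models were trained on, which may explain why the model keeps generating new turns in the outputs that follow. As a hedged sketch, the prompt could instead be laid out with Llama 3's published special tokens. The helper `build_llama3_prompt` is hypothetical; in practice `tokenizer.apply_chat_template` is the safer way to produce this string, since it reads the template shipped with the model.

```python
def build_llama3_prompt(context: str, question: str) -> str:
    """Assemble a single-turn prompt in the Llama 3 chat layout (assumed)."""
    system = (
        "Answer the question based on your knowledge. "
        "Use the following context to help:\n\n" + context
    )
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{question}<|eot_id|>"
        # Ending on the assistant header asks the model to write the answer next.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


print(build_llama3_prompt(context="(retrieved issues go here)",
                          question="How do I fix it?"))
```

Depending on the tokenizer settings, `<|begin_of_text|>` may already be added automatically, in which case it should be dropped from the string. Generation would likely also need `<|eot_id|>` among its stop tokens to end cleanly after one answer.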

In [15]:
llm = HuggingFacePipeline(pipeline=READER_LLM)
llm_chain = prompt | llm | StrOutputParser() 

In [16]:
from langchain_core.runnables import RunnablePassthrough

retriever = db.as_retriever()

rag_chain = {"context": retriever, "question": RunnablePassthrough()} | llm_chain
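
Note that the chain above passes the retriever's output, a list of `Document` objects, straight into `{context}`, so the prompt receives the list's Python representation with all metadata included. A common alternative is to join the chunks' text first. The sketch below shows such a helper; `format_docs` and the `FakeDoc` stand-in are hypothetical names used only so the example runs without LangChain.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document: only the attributes used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs: Iterable[FakeDoc]) -> str:
    """Join the text of each retrieved chunk, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


docs_demo = [FakeDoc("first chunk"), FakeDoc("second chunk")]
print(format_docs(docs_demo))
```

Because LCEL coerces plain functions into runnables, the helper could be placed directly after the retriever, as in `{"context": retriever | format_docs, "question": RunnablePassthrough()}`.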

## Compare the results

In [17]:
question = "I got an import error: cannot import name 'VectorStoreIndex' from 'llama_index'. How to fix it?"

In [18]:
llm_chain_output = llm_chain.invoke({"context": "", "question": question})
rag_chainn_output = rag_chain.invoke(question)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


In [19]:
print(llm_chain_output)


<|system|>
Answer the question based on your knowledge. Use the following context to help:



</s>
<|user|>
I got an import error: cannot import name 'VectorStoreIndex' from 'llama_index'. How to fix it?
</s>
<|assistant|>

  - The error message indicates that the `VectorStoreIndex` class is not found in the `llama_index` module.
  - Check if you have installed the `llama` library correctly. You can try reinstalling it using pip: `pip install llama-index`
  - Verify that you are importing the correct module and class. Make sure the module name and class name are correct and consistent.
  - If you are using a virtual environment, ensure that the `llama` library is installed in the correct environment.
  - If none of the above solutions work, try to import the class using the fully qualified name: `from llama_index import VectorStoreIndex`

If you are still facing issues, please provide more context or code snippets to help me better understand the issue.</assistant>
<s>
<|system|>
Answ

In [20]:
print(rag_chainn_output)


<|system|>
Answer the question based on your knowledge. Use the following context to help:

[Document(page_content="### Bug Description\n\nWhile running from llama_index import VectorStoreIndex, StorageContext, i got this error  cannot import name 'VectorStoreIndex' from 'llama_index' (/usr/local/lib/python3.10/dist-packages/llama_index/__init__.py)\n\n### Version\n\n0.9.28.post1\n\n### Steps to Reproduce\n\ni have no idea how to answer this question \n\n### Relevant Logs/Tracbacks\n\n_No response_", metadata={'url': 'https://github.com/run-llama/llama_index/issues/9948', 'title': "[Bug]: cannot import name 'VectorStoreIndex' from 'llama_index' (/usr/local/lib/python3.10/dist-packages/llama_index/__init__.py)", 'creator': 'Lavinia989', 'created_at': '2024-01-10T00:08:34Z', 'comments': 3, 'state': 'closed', 'labels': ['bug', 'triage'], 'assignee': None, 'milestone': None, 'locked': False, 'number': 9948, 'is_pull_request': False}), Document(page_content="### Bug Description\n\nI encoun

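## Add ColBERT as a reranker
ColBERT is a fast and accurate retrieval model that enables scalable BERT-based search over large text collections in tens of milliseconds. Here it reranks the chunks returned by FAISS, and only the top-ranked ones are passed to the LLM.
- [RAGatouille](https://python.langchain.com/v0.2/docs/integrations/providers/ragatouille/): the library used to load and run ColBERT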
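
ColBERT scores a document by late interaction: every query token embedding is matched with its most similar document token embedding, and those maxima are summed. The sketch below illustrates that scoring rule on toy vectors. `maxsim_score` and the vectors are illustrative only; the real model uses learned BERT token embeddings rather than hand-written ones.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Sum, over query tokens, of the best dot product with any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


query = [[1.0, 0.0], [0.0, 1.0]]  # two query token vectors
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # has a match for both query tokens
doc_b = [[1.0, 0.0]]              # has a match for the first query token only

ranked = sorted(
    {"doc_a": doc_a, "doc_b": doc_b}.items(),
    key=lambda item: maxsim_score(query, item[1]),
    reverse=True,
)
print([name for name, _ in ranked])  # ['doc_a', 'doc_b']
```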

In [21]:
from ragatouille import RAGPretrainedModel
RERANKER = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")



In [22]:
from typing import List, Optional, Tuple

from transformers import Pipeline


def answer_with_rag(
    question: str,
    llm: Pipeline,
    vector_db: FAISS,
    reranker: Optional[RAGPretrainedModel] = None,
    num_retrieved_docs: int = 5,
    num_docs_final: int = 2,
) -> Tuple[List[dict], List[str]]:
    """Retrieve, optionally rerank, and answer.

    Returns the raw pipeline output and the text of the documents used as context.
    """
    print("=> Retrieving documents...")
    retriever = vector_db.as_retriever(
        search_type="similarity",
        search_kwargs={"k": num_retrieved_docs},
    )
    relevant_docs = retriever.invoke(question)

    # Keep each chunk's text together with the URL of the issue it came from.
    relevant_docs_content = [
        f"Page Content:\n{doc.page_content}\nURL: {doc.metadata.get('url', 'None')}\n"
        for doc in relevant_docs
    ]

    # Rerank only when a reranker is provided.
    if reranker is not None:
        print("=> Reranking documents...")
        reranked = reranker.rerank(question, relevant_docs_content, k=num_docs_final)
        relevant_docs_content = [doc["content"] for doc in reranked]

    relevant_docs_content = relevant_docs_content[:num_docs_final]

    # Build the final prompt from the selected documents.
    context = "\nExtracted documents:\n"
    context += "".join(f"Document {i}:\n{doc}\n" for i, doc in enumerate(relevant_docs_content))
    final_prompt = prompt_template.format(question=question, context=context)

    print("=> Generating answer...")
    llm_output = llm(final_prompt)  # use the pipeline passed in, not the global READER_LLM

    print("=> Done...")
    return llm_output, relevant_docs_content

In [23]:
answer, relevant_docs = answer_with_rag(
    question=question,
    llm=READER_LLM,
    vector_db=db,
    reranker=RERANKER,
    num_retrieved_docs=8,
    num_docs_final=3,
)

=> Retrieving documents...
=> Reranking documents...


Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


=> Generating answer...
=> Done...


In [24]:
print("======================== Answer ========================")
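# return_full_text=True echoes the prompt, so keep only what follows the assistant tag
# and drop any extra turn the model starts generating afterwards.
generated_text = answer[0]["generated_text"]
print(generated_text.split("<|assistant|>")[1].split("<|system")[0])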
print("\n==================== Source docs ====================")
for i, doc in enumerate(relevant_docs):
    print(f"\n[Document {i}]")
    print(doc)



  - The error message indicates that the 'VectorStoreIndex' module is not found in the 'llama_index.core' package. This could be due to several reasons:
    - The module 'VectorStoreIndex' might have been removed or renamed in the latest version of llama_index.
    - The package might not be installed correctly, or there might be a version mismatch.
    - The import statement might be incorrect.

  - To fix the issue, you can try the following:
    - Check the llama_index documentation to see if the 'VectorStoreIndex' module has been removed or renamed.
    - Verify that the llama_index package is installed correctly and that you are using the correct version.
    - Check the import statement to ensure that it is correct and that the module is being imported from the correct package.
    - If the issue persists, you can try reinstalling the package or seeking help from the llama_index community or support team.

  - Based on the provided context, it seems that the issue might be rela