### Mistral 7B + FAISS using Langchain

#### Poojitha Venkatram

# Basic RAG
Retrieval-augmented generation (RAG) is an AI framework that combines the capabilities of LLMs with information retrieval systems. It is useful for answering questions or generating content that draws on external knowledge. RAG has two main steps: 1) retrieval: retrieve relevant information from a knowledge base whose text embeddings are stored in a vector store; 2) generation: insert the retrieved information into the prompt so that the LLM can generate an answer.



### Import all the needed packages




In [1]:
! pip install faiss-cpu==1.7.4 mistralai==0.0.12

Collecting mistralai==0.0.12
  Using cached mistralai-0.0.12-py3-none-any.whl (14 kB)
Installing collected packages: mistralai
  Attempting uninstall: mistralai
    Found existing installation: mistralai 0.0.11
    Uninstalling mistralai-0.0.11:
      Successfully uninstalled mistralai-0.0.11
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain-mistralai 0.0.4 requires mistralai<0.0.12,>=0.0.11, but you have mistralai 0.0.12 which is incompatible.
Successfully installed mistralai-0.0.12


In [2]:
! git lfs install

Git LFS initialized.


In [3]:
! git clone https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

fatal: destination path 'Mistral-7B-Instruct-v0.2' already exists and is not an empty directory.


In [4]:
cd Mistral-7B-Instruct-v0.2

/content/Mistral-7B-Instruct-v0.2


In [5]:
!pip install langchain langchain-mistralai==0.0.4

Collecting mistralai<0.0.12,>=0.0.11 (from langchain-mistralai==0.0.4)
  Using cached mistralai-0.0.11-py3-none-any.whl (14 kB)
Installing collected packages: mistralai
  Attempting uninstall: mistralai
    Found existing installation: mistralai 0.0.12
    Uninstalling mistralai-0.0.12:
      Successfully uninstalled mistralai-0.0.12
Successfully installed mistralai-0.0.11


In [6]:
! pip install huggingface_hub



In [7]:
! pip install 'huggingface_hub[cli,torch]'



In [8]:
!python -c "from huggingface_hub.hf_api import HfFolder; HfFolder.save_token('<YOUR_HF_TOKEN>')"

In [9]:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

### Get data



In [10]:
# Connect to Google Drive
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [11]:
! pip install PyPDF2



In [12]:
import PyPDF2

# Open the PDF file
with open("/content/drive/MyDrive/Sample.pdf", "rb") as file:
    pdf = PyPDF2.PdfReader(file)

    # Initialize a variable to hold all text
    full_text = ""

    # Iterate through each page and extract text
    for page in pdf.pages:
        full_text += page.extract_text()

# Printing the extracted text first 500 characters to check the extraction
print(full_text[:500])

Response Prediction of Structural System Subject to  Earthquake 
Motions using Artificial Neural Network 
 
S. Chakraverty*,  T. Marwala** , Pallavi Gupta* and  Thando Tettey**  
 
*B.P.P.P. Division, Central Building Research Institu te 
Roorkee-247 667, Uttaranchal, India 
e-mail :sne_chak@yahoo.com 
 
** School of Electrical and Information Engineering, 
University of the Witwatersrand, Private Bag 3 
Wits, 2050,Republic of South Africa  
  
Abstract 
This paper uses Artificial Neural Network


Check the length of the extracted text:

In [13]:
len(full_text)

24311

## Split document into chunks

In a RAG system, the document is split into smaller chunks so that retrieval can identify the passages most relevant to a question instead of returning the whole document. Here the chunks are fixed-size windows of 2,048 characters.

In [14]:
chunk_size = 2048
chunks = [full_text[i:i + chunk_size] for i in range(0, len(full_text), chunk_size)]

In [15]:
len(chunks)

12
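
Fixed-size splitting can cut a sentence in half at a chunk boundary. One common variation, shown here only as a sketch and not what the cells above do, is to let consecutive chunks share a few characters. The function name `split_with_overlap` and the `overlap` parameter are assumptions introduced for illustration.

```python
def split_with_overlap(text, chunk_size=2048, overlap=200):
    """Split text into character windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each window start advances
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "abcdefghij" * 3  # 30 characters stand in for the extracted PDF text
pieces = split_with_overlap(sample, chunk_size=12, overlap=4)
```

With `overlap=0` this reduces to the same list comprehension used in the notebook, so the overlap can be tuned without changing the rest of the pipeline.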

In [16]:
from transformers import AutoModel, AutoTokenizer
import torch

# Load the model and tokenizer
model_name = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

def get_text_embedding(text):
    # Tokenize the input text and convert to input tensors
    tokens = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

    # Generate embeddings
    with torch.no_grad():
        model_output = model(**tokens)

    # Mean-pool the last hidden state over the token dimension to get one vector per input (shape: (1, 384))
    embeddings = model_output.last_hidden_state.mean(dim=1).numpy()
    return embeddings
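
The function above averages over every token position. That works when one text is embedded at a time, because no padding is added, but if several texts were embedded in one batch the padded positions would be averaged in as well. A minimal sketch of mask-aware mean pooling follows, written with plain Python lists instead of tensors so it runs without a model. `masked_mean_pool` is a hypothetical helper, not part of `transformers`.

```python
def masked_mean_pool(token_vectors, attention_mask):
    """Average token vectors, counting only positions where the mask is 1."""
    dim = len(token_vectors[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            kept += 1
            for j, value in enumerate(vector):
                totals[j] += value
    if kept == 0:
        raise ValueError("attention mask selects no tokens")
    return [total / kept for total in totals]


# Two real tokens followed by one padding position
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = masked_mean_pool(tokens, mask)
```

The padding vector is ignored, so `pooled` is the average of the first two token vectors only.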

In [17]:
import numpy as np
text_embeddings = np.array([get_text_embedding(chunk) for chunk in chunks])

In [18]:
text_embeddings.shape

(12, 1, 384)

In [19]:
text_embeddings

array([[[-0.17309807, -0.10922703,  0.11997147, ...,  0.11830396,
         -0.079504  , -0.11430095]],

       [[-0.18128201, -0.14044508,  0.1545095 , ...,  0.10432599,
          0.00846069, -0.04908906]],

       [[-0.20292331, -0.10011439,  0.12901923, ...,  0.08570072,
         -0.02853273, -0.10067203]],

       ...,

       [[-0.17451054, -0.00692823,  0.17774832, ...,  0.04262985,
         -0.02326536, -0.09262711]],

       [[-0.03157317, -0.13334677,  0.07569563, ...,  0.01338292,
         -0.08685192, -0.1011356 ]],

       [[-0.03780447, -0.08106704,  0.04931236, ...,  0.01181052,
         -0.11106861, -0.05282832]]], dtype=float32)

In [20]:
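import numpy as np

# NOTE: this cell redefines get_text_embedding as a random placeholder, replacing the
# MiniLM-based function defined earlier. Everything below (the index, the question
# embedding, and the retrieval results) is therefore built on random vectors.
def get_text_embedding(text):
    # Replace this with a real embedding from your model
    return np.random.rand(384)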

# Example list of texts
texts = ["text1", "text2", "text3"]

# Generate embeddings for each text
text_embeddings = np.array([get_text_embedding(text) for text in texts])

print("Generated embeddings shape:", text_embeddings.shape)

Generated embeddings shape: (3, 384)


In [21]:
import faiss

# Assuming text_embeddings is your array with shape (3, 384)
d = text_embeddings.shape[1]  # Dimension of the embeddings (should be 384)

# Creating the FAISS index
index = faiss.IndexFlatL2(d)

# Adding the embeddings to the index
index.add(text_embeddings)

### Load into a vector database
Once we have the text embeddings, a common practice is to store them in a vector database for efficient processing and retrieval. Here we use Faiss, an open-source library for efficient similarity search.

With Faiss, we instantiate an instance of the Index class, which defines the indexing structure of the vector database. We then add the text embeddings to this indexing structure.


In [22]:
d = text_embeddings.shape[1]
index = faiss.IndexFlatL2(d)
index.add(text_embeddings)
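
To make the idea of a flat index concrete, here is a toy version in plain Python that stores vectors and compares a query against every one of them by squared Euclidean distance. It is only an illustration of exact (brute-force) search, not how Faiss is implemented, and `ToyFlatL2Index` is a made-up class name.

```python
class ToyFlatL2Index:
    """Exact nearest-neighbour search by comparing against every stored vector."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = []

    def add(self, vectors):
        for vector in vectors:
            if len(vector) != self.dim:
                raise ValueError("vector has wrong dimension")
            self.vectors.append(list(vector))

    def search(self, query, k):
        # Squared Euclidean distance from the query to every stored vector
        scored = [
            (sum((a - b) ** 2 for a, b in zip(query, stored)), position)
            for position, stored in enumerate(self.vectors)
        ]
        scored.sort()  # smallest distance first
        top = scored[:k]
        distances = [distance for distance, _ in top]
        indices = [position for _, position in top]
        return distances, indices


toy_index = ToyFlatL2Index(dim=2)
toy_index.add([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
distances, indices = toy_index.search([0.9, 1.2], k=2)
```

The returned positions play the same role as the indices Faiss returns: they point back into the list of chunks that produced the stored vectors.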



### Create embeddings for a question
Whenever a user asks a question, we also need to create an embedding for that question using the same embedding model that was used for the chunks.


In [23]:
question = "How to write an abstract similar to the response prediction using ANN paper?"
question_embeddings = np.array([get_text_embedding(question)])
question_embeddings.shape

(1, 384)

In [24]:
question_embeddings

array([[0.87802083, 0.36368342, 0.62105168, 0.72783167, 0.64696208,
        0.2486663 , 0.75119849, 0.38147891, 0.20406318, 0.75687456,
        0.87696904, 0.85323777, 0.57878798, 0.22556383, 0.62671107,
        0.77638152, 0.55871051, 0.9421258 , 0.54221419, 0.19933051,
        0.27942118, 0.67654684, 0.86496604, 0.78607431, 0.61996087,
        0.38345325, 0.01384411, 0.36641655, 0.54616124, 0.46914813,
        0.09803941, 0.57252951, 0.38135483, 0.50120672, 0.85621412,
        0.22171292, 0.17454722, 0.49837733, 0.58871526, 0.68125253,
        0.71555964, 0.93322098, 0.83178476, 0.76875748, 0.63129956,
        0.68690865, 0.53978664, 0.04060748, 0.61009988, 0.96331778,
        0.81801412, 0.76830599, 0.41979903, 0.52545301, 0.27009569,
        0.93573108, 0.27216655, 0.25312146, 0.7029919 , 0.51631519,
        0.12121192, 0.00418767, 0.32248962, 0.33006628, 0.96600258,
        0.46138347, 0.57217572, 0.18556137, 0.09911915, 0.32190901,
        0.4799523 , 0.55010289, 0.50235695, 0.80



### Retrieve similar chunks from the vector database
We can perform a search on the vector database with `index.search`, which takes two arguments: the first is the vector of the question embeddings, and the second is the number of similar vectors to retrieve. This function returns the distances and the indices of the most similar vectors to the question vector in the vector database. Then based on the returned indices, we can retrieve the actual relevant text chunks that correspond to those indices.


In [25]:
D, I = index.search(question_embeddings, k=2)
print(I)

[[2 1]]
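
`IndexFlatL2` ranks results by squared Euclidean distance. If the embeddings are scaled to unit length first, that ranking agrees with cosine similarity, because for unit vectors the squared distance equals 2 minus twice the dot product. The sketch below checks this identity on two small vectors. The helper names are assumptions for illustration, and the notebook itself does not normalize its embeddings.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dot(a, b):
    # Equals cosine similarity when both inputs have unit length
    return sum(x * y for x, y in zip(a, b))


a = normalize([3.0, 4.0])
b = normalize([1.0, 0.0])
lhs = squared_l2(a, b)
rhs = 2 - 2 * dot(a, b)
```

Both `lhs` and `rhs` evaluate to the same value up to floating-point rounding, so a smaller L2 distance corresponds to a larger cosine similarity once the vectors are normalized.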


In [26]:
retrieved_chunk = [chunks[i] for i in I.tolist()[0]]
print(retrieved_chunk)

["tures. Muhammad [13] gives certain ANN applications in concrete \nstructures. Pandey and Barai [14] detected damage in a bridge truss by applying \nANN of multilayer perceptron architectures to numericall y simulated data. Some \nstudies such as [15]-[17] used artificial neural network f or structural damage \ndetection and system identification.  \n \nIn the present paper, the Chamoli earthquake ground acceleration at Barkot (NE) \nand Uttarkashi earthquake ground acceleration recorded a t Barkot (NE and NW)  \nhave been considered based on the authors' previous stud y [18]. From their \nground acceleration the responses are computed using the u sual procedure. \nThen the ground acceleration and the corresponding re sponse are trained using \nArtificial Neural Network (ANN) with and without damp ing. After training the \nnetwork with one earthquake, the converged weight mat rices are stored. In order \nto show the power of these converged (trained) networ ks other earthquakes are \n

In [27]:
import pandas as pd

# Assuming retrieved_chunk is your list of text chunks corresponding to the nearest neighbors
retrieved_chunk = [chunks[i] for i in I.tolist()[0]]

# Creating a DataFrame
df = pd.DataFrame(retrieved_chunk, columns=['Retrieved Chunks'])

# Display the DataFrame
print(df)

                                    Retrieved Chunks
0  tures. Muhammad [13] gives certain ANN applica...
1  d esign of the building. All \nbuildings have ...


In [28]:
import pandas as pd

# Set option to increase width and max_colwidth
pd.set_option('display.max_colwidth', None)  # For pandas versions >= 1.0
pd.set_option('display.width', 1000)  # Adjust as necessary for your display

df_full_text = pd.DataFrame([full_text], columns=['Full Text'])
print(df_full_text)

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        


  
### Combine context and question in a prompt and generate response

Finally, we can present the retrieved text chunks as the context information within the prompt. Here is a prompt template where we can include both the retrieved text and user question in the prompt.
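
When a Python list is interpolated into an f-string, the result includes the list's brackets, quotes, and escaped `\n` sequences. A small helper can join the retrieved chunks into plain text before filling the template used in this section. This is a sketch: `build_prompt` is a hypothetical name and the example inputs are invented.

```python
def build_prompt(chunks, question):
    """Join retrieved chunks into one context block and fill the template."""
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return (
        "Context information is below.\n"
        "---------------------\n"
        f"{context}\n"
        "---------------------\n"
        "Given the context information and not prior knowledge, answer the query.\n"
        f"Query: {question}\n"
        "Answer:"
    )


example_prompt = build_prompt(
    ["first retrieved chunk ", " second retrieved chunk"],
    "What does the paper predict?",
)
```

The resulting string contains the chunk text separated by blank lines, with no list syntax mixed into the context.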



In [29]:
prompt = f"""
Context information is below.
---------------------
{retrieved_chunk}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {question}
Answer:
"""

In [30]:
from transformers import pipeline

def run_mistral_pipeline(user_message):
    # Initialize the pipeline with the text-generation model
    pipe = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")

    # Generate a response
    response = pipe(user_message, max_length=512)[0]['generated_text']

    return response

In [31]:
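# NOTE: only the bare question is sent to the model here, so the retrieved context in
# `prompt` is not used. Pass `prompt` instead to make this a retrieval-augmented call.
user_message = "How to write an abstract similar to the response prediction using ANN paper?"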
response = run_mistral_pipeline(user_message)
print("Response:", response)

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Response: How to write an abstract similar to the response prediction using ANN paper?

I'm trying to write an abstract for a paper on a neural network that predicts the response of a system to a given input. The paper is based on a deep learning model, specifically a feedforward neural network. The abstract should be similar to the response prediction using ANN paper by LeCun et al. (1990). Here's a draft of the abstract:

---

Title: Predicting System Responses with a Feedforward Neural Network

Abstract: In this paper, we present a novel approach for predicting the response of a complex system to a given input using a feedforward neural network. Our model is inspired by the seminal work of LeCun et al. (1990) on recognizing handwritten digits using a backpropagation neural network. We apply this concept to predict the response of a system, which can be any physical or mathematical model, given a specific input.

Our neural network consists of an input layer, multiple hidden layers, 
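
The response above begins by repeating the question because the `text-generation` pipeline returns the prompt followed by the continuation by default. One way to keep only the new text is to strip the prompt afterwards, sketched below with a hypothetical `strip_prompt` helper and invented example strings.

```python
def strip_prompt(generated_text, prompt):
    """Return only the continuation when the output starts with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text


example_prompt = "What does the paper predict?"
example_output = example_prompt + "\n\nIt predicts the structural response."
answer_only = strip_prompt(example_output, example_prompt)
```

The pipeline also accepts `return_full_text=False` to omit the prompt at generation time, and `max_new_tokens` limits only the generated part, whereas `max_length` counts the prompt tokens too.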

In [32]:
possible_questions = [
    [
        "What are the key keywords mentioned in the paper?",
        "Can you list the main topics or subject areas covered in the paper?",
        "What technical terms or concepts are referenced in the paper's keywords?",
        "Are there any notable keywords missing that you would expect to see in a paper on this topic?",
        "How do the keywords indicate the focus of the research presented in the paper?"
    ],
    [
        "What is the main purpose of using Artificial Neural Networks in this research?",
        "How are the ANN models trained and used to predict structural responses?",
        "What are the key capabilities of the ANN models described in the paper?",
        "How can the trained ANN models be used to assess the safety of structural systems?",
        "What are the benefits of using ANN-based approaches for earthquake response prediction?"
    ],
    [
        "How does a building's natural frequency affect its response to an earthquake?",
        "What happens when the ground shaking frequency is in resonance with the building's natural frequency?",
        "What factors determine a building's natural frequency?",
        "Why can resonance between ground shaking and building frequency lead to increased risk of damage or collapse?",
        "What is the relationship between a building's natural frequency and its response to earthquake motions?"
    ],
    [
        "What type of neural network model is most commonly used for modeling the dynamic response of structures?",
        "Can you describe the Back-Propagation Neural Network (BPN) and its application in this context?",
        "Why is the BPN model particularly well-suited for modeling the dynamic response of structures?",
        "Are there any other neural network architectures that have been explored for this application?",
        "What are the key advantages of the BPN model for predicting structural responses to earthquakes?"
    ],
    [
        "What data was used to train the ANN model for predicting earthquake responses?",
        "How was the ANN model trained and validated for different earthquake intensities?",
        "Can you describe the process of using the trained ANN architecture to predict structural responses over time?",
        "What were the key findings regarding the accuracy of the ANN model's predictions?",
        "Why is the Artificial Neural Network considered a powerful soft computing technique for this application?"
    ],
    [
        "What are the key implications of the study's findings on using ANN models for predicting structural safety?",
        "How does the training of the ANN model on Indian earthquake data contribute to its potential for accurate predictions?",
        "What are the benefits of being able to predict the safeness of structural systems in advance of an earthquake?",
        "How can the trained ANN model be used to simulate different earthquake intensities and study structural behavior?",
        "Why do the study's findings represent a promising approach for ensuring the safety of buildings and structures during earthquakes?"
    ]
]

### Populate the lists and save as a CSV file

In [33]:
import pandas as pd
from transformers import pipeline

# Initialize the text generation pipeline with the specified model
def initialize_pipeline():
    return pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")

# Function to generate responses using the pipeline
def run_mistral_pipeline(question, pipe):
    # Generate response with specific settings including beam search and truncation
    response = pipe(question, max_length=100, num_beams=5, length_penalty=2.0, truncation=True)[0]['generated_text']
    return response

# Main function to process questions and generate dataset
def main():
    # Initialize the pipeline
    pipe = initialize_pipeline()

    # List of questions to process
    questions = [
        "What are the keywords mentioned in the paper titled Response Prediction of Structural System Subject to Earthquake Motions using Artificial Neural Network?",
        "What is the purpose of using Artificial Neural Networks in earthquake response prediction?",
        "How does a building's natural frequency affect its response to an earthquake?",
        "What kind of neural network model is most frequently applied for modeling dynamic response of structures?",
        "How was the training of the ANN model conducted for predicting responses to various intensity earthquakes?",
        "What is the significance of the study's findings on predicting the safeness of structural systems?"
    ]

    # Corresponding ground truths for each question
    ground_truths = [
        "The keywords written in the paper are: Earthquake, Neural Network, Frequency, Structure, Building.",
        "Artificial Neural Networks (ANNs) are used to compute the response of structural systems to Indian earthquakes and simulate various intensities of earthquakes. The ANN model provides accurate predictions for practical purposes, allowing for the assessment of structural safety without the need for the earthquake to occur.",
        "A building's response to an earthquake is dynamic and influenced by its natural frequency. If the ground shakes at the same frequency as the building's natural frequency, it causes resonance, leading to increased amplitude of sway and potential collapse due to the strain on building components.",
        "The most frequently applied neural network model for modeling the dynamic response of structures is the feedforward, multilayer, supervised neural network with error backpropagation algorithm, known as the BPN.",
        "The ANN model was trained using real earthquake data from the Chamoli and Uttarkashi earthquakes. The training involved using ground motion data to compute structural responses, which were then used to adjust the weights of the ANN for accurate future predictions.",
        "The study's findings demonstrate the ability of the trained ANN architecture to simulate and predict the response of a structural system to future earthquakes. This can be crucial in predicting the safety of structures and in taking pre-emptive measures to mitigate earthquake damage."
    ]

    # Generate responses for each question using the initialized pipeline
    rag_answers = [run_mistral_pipeline(question, pipe) for question in questions]

    # Placeholders for now: no retrieved context is attached to any question, and this
    # local possible_questions shadows the list defined in the earlier cell
    contexts = ['']*len(questions)  # Empty context for each question
    possible_questions = [[] for _ in questions]

    # Compile data into a DataFrame
    data = {
        "question": questions,
        "ground_truth": ground_truths,
        "rag_answer": rag_answers,
        "context": contexts,
        "possible_questions": possible_questions
    }
    dataset = pd.DataFrame(data)

    # Print and save the DataFrame to a CSV file
    print(dataset)
    dataset.to_csv('mistral_output.csv', index=False)

# Execute the main function
if __name__ == "__main__":
    main()

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


                                                                                                                                                      question                                                                                                                                                                                                                                                                                                                        ground_truth  \
0  What are the keywords mentioned in the paper titled Response Prediction of Structural System Subject to Earthquake Motions using Artificial Neural Network?                                                                                                                                                                                                                                  The keywords written in the paper are: Earthquake, Neural Network, Frequency, Structure, Building.   
1                           
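
A column that holds lists, such as `possible_questions`, is written to CSV as text and comes back as a string when the file is read. One option is to store such cells as JSON so they can be parsed back reliably. The sketch below uses only the standard library; `rows_to_csv` and the sample rows are assumptions for illustration, not the notebook's data.

```python
import csv
import io
import json


def rows_to_csv(questions, answers, related_questions):
    """Write one row per question; list-valued cells are stored as JSON text."""
    if not (len(questions) == len(answers) == len(related_questions)):
        raise ValueError("all columns must have the same length")
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["question", "rag_answer", "possible_questions"]
    )
    writer.writeheader()
    for question, answer, related in zip(questions, answers, related_questions):
        writer.writerow({
            "question": question,
            "rag_answer": answer,
            "possible_questions": json.dumps(related),
        })
    return buffer.getvalue()


csv_text = rows_to_csv(
    ["What is predicted?"],
    ["The structural response."],
    [["What data was used?", "Which network was trained?"]],
)
restored = list(csv.DictReader(io.StringIO(csv_text)))
related_back = json.loads(restored[0]["possible_questions"])
```

After the round trip, `related_back` is a Python list again instead of a string that merely looks like one.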

In [2]:
! pip install -U sentence-transformers

Collecting sentence-transformers
  Downloading sentence_transformers-2.6.1-py3-none-any.whl (163 kB)
Installing collected packages: sentence-transformers
Successfully installed sentence-transformers-2.6.1


In [1]:
import os
import pandas as pd
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

# Initialize the text generation pipeline
pipe = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")

# Initialize embedding model
embedder = SentenceTransformer('all-MiniLM-L6-v2')

# Load or initialize history
def load_history():
    if os.path.exists('session_history.csv'):
        return pd.read_csv('session_history.csv')
    else:
        return pd.DataFrame(columns=['question', 'answer', 'embedding'])

# Save history
def save_history(history_df):
    history_df.to_csv('session_history.csv', index=False)

# Update session history with new Q&A and embeddings
def update_history(question, answer, history_df):
    embedding = embedder.encode(answer, convert_to_tensor=True)
    new_data = pd.DataFrame({
        'question': [question],
        'answer': [answer],
        'embedding': [embedding.cpu().numpy().tolist()]  # Move to CPU first so this also works on GPU, then serialize as a list
    })
    history_df = pd.concat([history_df, new_data], ignore_index=True)
    save_history(history_df)
    return history_df

# Generate response and update history
def generate_response_and_update_history(question, history_df):
    answer = pipe(question, max_length=100, num_beams=5, length_penalty=2.0, truncation=True)[0]['generated_text']
    history_df = update_history(question, answer, history_df)
    return answer, history_df

# Main function to process questions
def main():
    history_df = load_history()

    # Example questions
    questions = [
        "How can the trained ANN model be used to simulate different earthquake intensities and study structural behavior?",
        "Why do the study's findings represent a promising approach for ensuring the safety of buildings and structures during earthquakes?"
    ]

    for question in questions:
        answer, history_df = generate_response_and_update_history(question, history_df)
        print(f"Question: {question}\nAnswer: {answer}\n")

# Run the main function
if __name__ == "__main__":
    main()

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Question: How can the trained ANN model be used to simulate different earthquake intensities and study structural behavior?
Answer: How can the trained ANN model be used to simulate different earthquake intensities and study structural behavior?

To simulate different earthquake intensities and study structural behavior, the trained ANN model can be used as follows:

1. Define the input parameters: The input parameters to the ANN model are the ground motion parameters such as PGA, PGV, and spectral acceleration at different frequencies.
2. Define the output parameters: The output parameters of the AN

Question: Why do the study's findings represent a promising approach for ensuring the safety of buildings and structures during earthquakes?
Answer: Why do the study's findings represent a promising approach for ensuring the safety of buildings and structures during earthquakes?

The study's findings represent a promising approach for ensuring the safety of buildings and structures during
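
The session history stores an embedding next to each answer, but the cell above never reads those embeddings back. A possible use is to look up the most similar earlier entry for a new query. The sketch below assumes each stored embedding is a JSON-style list string, which is how a list of floats appears after a CSV round trip. `most_similar_entry` and the sample rows are hypothetical.

```python
import json
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar_entry(query_embedding, history_rows):
    """Return the stored row whose embedding is closest to the query."""
    best_row, best_score = None, -1.0
    for row in history_rows:
        stored = json.loads(row["embedding"])  # embedding saved as text
        score = cosine_similarity(query_embedding, stored)
        if score > best_score:
            best_row, best_score = row, score
    return best_row, best_score


history_rows = [
    {"question": "q1", "answer": "a1", "embedding": json.dumps([1.0, 0.0])},
    {"question": "q2", "answer": "a2", "embedding": json.dumps([0.0, 1.0])},
]
match, score = most_similar_entry([0.1, 0.9], history_rows)
```

Here the query vector points mostly along the second axis, so the second stored entry is returned as the closest match.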