# Installation

This cell installs `langchain_community` and upgrades `langchain` in the current environment. The `langchain` package provides the core capabilities, while `langchain_community` provides additional integrations and features contributed by the community.

In [1]:
!pip install langchain_community
!pip install --upgrade langchain

Collecting langchain_community
  Downloading langchain_community-0.3.7-py3-none-any.whl.metadata (2.9 kB)
Collecting SQLAlchemy<2.0.36,>=1.4 (from langchain_community)
  Downloading SQLAlchemy-2.0.35-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.6 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain_community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting httpx-sse<0.5.0,>=0.4.0 (from langchain_community)
  Downloading httpx_sse-0.4.0-py3-none-any.whl.metadata (9.0 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain_community)
  Downloading pydantic_settings-2.6.1-py3-none-any.whl.metadata (3.5 kB)
Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7,>=0.5.7->langchain_community)
  Downloading marshmallow-3.23.1-py3-none-any.whl.metadata (7.5 kB)
Collecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.7,>=0.5.7->langchain_community)
  Downloading typing_inspect-0.9.0-py3-none-any.whl.metadat
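
After the install finishes, it can be useful to confirm which versions ended up in the environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration and is not part of LangChain.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the two packages installed by the cell above
for name in ("langchain", "langchain_community"):
    print(name, installed_version(name) or "not installed")
```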

# **Model: EleutherAI/gpt-neo-2.7B**

**Description (from author):** GPT-Neo 2.7B is a transformer model designed using EleutherAI's replication of the GPT-3 architecture. GPT-Neo refers to the class of models, while 2.7B represents the number of parameters of this particular pre-trained model. [Link of EleutherAI/gpt-neo-2.7B](https://huggingface.co/EleutherAI/gpt-neo-2.7B)


This cell imports the libraries used to build a question-answering system around a large language model hosted on Hugging Face and accessed through its API. `HuggingFaceHub` connects to the model, `PromptTemplate` formats the question, and `LLMChain` ties the two together, while `difflib` and `re` are used to score the relevance and correctness of the model's answers.

In [None]:
from langchain.llms import HuggingFaceHub
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
import difflib
import re

This cell instantiates the `EleutherAI/gpt-neo-2.7B` large language model (LLM) from Hugging Face. The temperature is set to 0.2 to make the output less random, `max_length` caps each response at 150 tokens, and an API token authenticates the request on the Hugging Face platform. In other words, it configures the language model that will answer the questions.

In [None]:
# Initialize the LLM
import os

llm = HuggingFaceHub(
    repo_id="EleutherAI/gpt-neo-2.7B",
    model_kwargs={"temperature": 0.2, "max_length": 150},
    # Read the API token from the environment instead of hard-coding a secret
    huggingfacehub_api_token=os.environ["HUGGINGFACEHUB_API_TOKEN"]
)
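
To make the effect of the temperature setting concrete, here is a small sketch of temperature scaling as it is commonly defined: the model's raw scores are divided by the temperature before being turned into probabilities. The scores below are made up, and the hosted API applies this internally, so this only illustrates the idea and does not reproduce the service's code.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (1.0, 0.2):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lower temperature the probability mass concentrates on the highest-scoring token, which is why a value such as 0.2 gives more deterministic answers.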

This cell defines the format in which questions are presented to the language model. It creates a `PromptTemplate` instance named `prompt` that places the user's question (`{question}`) after "Question:" and ends with "Answer:" to cue the model to respond. Framing every question the same way makes it clear to the model what is being asked.

In [None]:
# Define the prompt template
prompt = PromptTemplate(
    template="Question: {question}\nAnswer:",
    input_variables=["question"]
)
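
To see the text the model actually receives, the template can be filled in by hand. This sketch uses plain `str.format` so it runs without LangChain; for a simple template like this one, `prompt.format(question=...)` should produce the same string.

```python
template = "Question: {question}\nAnswer:"

# Fill the {question} placeholder the same way the prompt template does
filled = template.format(question="What is the capital of Philippines?")
print(filled)
```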

This cell creates an `LLMChain` named `llm_chain`. It combines `llm` (the language model) and `prompt` to streamline sending questions and receiving answers.

In [None]:
# Combine LLM and prompt in a chain
llm_chain = LLMChain(llm=llm, prompt=prompt)

This cell defines a function called `calculate_similarity`. It takes the model's response (`prediction`) and the original question, compares the two strings with `difflib.SequenceMatcher`, and returns a similarity ratio between 0 and 1, where 1 means the strings are identical. The ratio shows how closely the prediction resembles the question. A high value suggests the model is merely restating the question, while a lower value suggests it is producing different text, which is what a relevant answer is expected to do.

In [None]:
# Function to calculate similarity
def calculate_similarity(prediction, question):
    # Use difflib to calculate similarity between response and question
    similarity = difflib.SequenceMatcher(None, prediction, question).ratio()
    return similarity
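
A short worked example may help in reading the ratio. `SequenceMatcher.ratio()` returns `2 * M / T`, where `M` is the number of matched characters and `T` is the combined length of both strings, so a long response is penalized even when it contains the question verbatim. The sample strings are illustrative.

```python
import difflib


def calculate_similarity(prediction, question):
    return difflib.SequenceMatcher(None, prediction, question).ratio()


question = "What is the capital of Philippines?"

# Identical strings share every character, so the ratio is 1.0
same = calculate_similarity(question, question)

# A response that repeats the question five times matches only one copy:
# ratio = 2 * len(question) / (5 * len(question) + len(question)) = 1/3
echoed = calculate_similarity(question * 5, question)

print(same, round(echoed, 2))
```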

The `evaluate_response` function scores the model's response for relevance and correctness. It extracts the words of the question as keywords and looks for them in the response. **Relevance** is the percentage of question keywords found in the response. **Correctness** is a coarser check: the response is marked correct if it contains at least one keyword and incorrect if it contains none. Both scores are returned. Because keywords are matched as substrings of the response and no reference answer is consulted, the scores measure word overlap with the question rather than factual accuracy.

In [None]:
# Function to evaluate correctness and relevance
def evaluate_response(response, question):
    # Extract key concepts from the question (basic keyword extraction for relevance)
    keywords = re.findall(r'\b\w+\b', question.lower())  # Simple word tokenization
    relevance_score = sum(1 for word in keywords if word in response.lower())

    # Check if the response directly answers the question (simplified)
    is_correct = any(keyword in response.lower() for keyword in keywords)

    # Score the relevance and correctness
    relevance_percentage = (relevance_score / len(keywords)) * 100 if keywords else 0
    correctness_score = 1 if is_correct else 0

    return relevance_percentage, correctness_score
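
Since `word in response.lower()` is a substring test, very short keywords can match inside unrelated words. The sketch below is a hypothetical variant, `evaluate_response_whole_words`, that compares whole words instead. It is offered as one possible refinement under that assumption and is not the notebook's method.

```python
import re


def evaluate_response_whole_words(response, question):
    """Hypothetical variant of evaluate_response that matches whole words only."""
    keywords = re.findall(r"\b\w+\b", question.lower())
    response_words = set(re.findall(r"\b\w+\b", response.lower()))
    hits = sum(1 for word in keywords if word in response_words)
    relevance_percentage = (hits / len(keywords)) * 100 if keywords else 0
    correctness_score = 1 if hits else 0
    return relevance_percentage, correctness_score


question = "Is it ever okay to lie if it protects someone's feelings?"

# Substring matching: the keyword "s" (from "someone's") occurs inside "yes."
print("s" in "yes.")

# Whole-word matching: "yes" shares no word with the question
print(evaluate_response_whole_words("Yes.", question))
```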

This cell starts a loop that repeatedly prompts the user for a question. The loop ends when the user presses Enter without typing anything, at which point a thank-you message is displayed. This lets the user ask as many questions as they like until they choose to stop.

In [None]:
# Loop for user input
while True:
    user_prompt = input("Please enter your question (or press Enter to quit): ").strip()
    if user_prompt == "":
        print("Thank you for using the system!")
        break

This cell is the core of the question-answering system. It runs inside the loop, passes the user's query (`user_prompt`) through the prompt template to the language model with `llm_chain.run`, and collects the model's output. The response is then scored with the functions defined above for relevance, correctness, and similarity to the query.

In [None]:
    # Generate a response from the model
    response = llm_chain.run({"question": user_prompt}).strip()

    # Evaluate correctness and relevance
    relevance, correctness = evaluate_response(response, user_prompt)

    # Calculate similarity
    similarity = calculate_similarity(response, user_prompt)
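
Because the loop is spread over several cells, it may help to see the pieces assembled in one place. The sketch below wraps them in a hypothetical `run_session` function. The `answer_fn` parameter stands in for `llm_chain.run`, and `read_fn` and `write_fn` are assumptions added so the loop can run from a script without a live model or keyboard input.

```python
import difflib
import re


def calculate_similarity(prediction, question):
    return difflib.SequenceMatcher(None, prediction, question).ratio()


def evaluate_response(response, question):
    keywords = re.findall(r"\b\w+\b", question.lower())
    relevance_score = sum(1 for word in keywords if word in response.lower())
    relevance = (relevance_score / len(keywords)) * 100 if keywords else 0
    return relevance, 1 if relevance_score else 0


def run_session(answer_fn, read_fn=input, write_fn=print):
    """Ask questions until an empty line is entered, scoring each answer.

    answer_fn stands in for llm_chain.run; read_fn and write_fn default to
    input and print so the loop can also be driven by a script or a test.
    """
    while True:
        user_prompt = read_fn("Please enter your question (or press Enter to quit): ").strip()
        if user_prompt == "":
            write_fn("Thank you for using the system!")
            break
        response = answer_fn({"question": user_prompt}).strip()
        relevance, correctness = evaluate_response(response, user_prompt)
        similarity = calculate_similarity(response, user_prompt)
        write_fn("Response:", response)
        write_fn(f"Relevance: {relevance:.2f}%")
        write_fn(f"Correctness: {'Correct' if correctness == 1 else 'Incorrect'}")
        write_fn(f"Similarity: {similarity:.2f}")


# Scripted demo with a canned answer instead of a real model call
scripted = iter(["What is the capital of Philippines?", ""])
run_session(
    answer_fn=lambda inputs: "Manila.",
    read_fn=lambda _prompt: next(scripted),
)
```

With the real chain, the call would be `run_session(llm_chain.run)`.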

This cell presents the results of the question-answering process to the user. It prints the model's response together with the `relevance`, `correctness`, and `similarity` scores. Relevance is shown as a percentage, correctness as “Correct” or “Incorrect”, and similarity as a number between 0 and 1, helping the user judge the quality of the answer.

In [80]:

    # Print the response and evaluations
    print("Response:", response)
    print(f"Relevance: {relevance:.2f}%")
    print(f"Correctness: {'Correct' if correctness == 1 else 'Incorrect'}")
    print(f"Similarity: {similarity:.2f}")


Please enter your question (or press Enter to quit): What is the capital of Philippines?
Response: Question: What is the capital of Philippines?
Answer: The capital of Philippines is Manila.

Question: What is the capital of Philippines?
Answer: The capital of Philippines is Manila.

Question: What is the capital of Philippines?
Answer: The capital of Philippines is Manila.

Question: What is the capital of Philippines?
Answer: The capital of Philippines is Manila.

Question: What is the capital of Philippines?
Answer: The capital of Philippines is Manila.

Question: What is the capital of
Relevance: 100.00%
Correctness: Correct
Similarity: 0.13
Please enter your question (or press Enter to quit): What’s your opinion on the weather today?
Response: Question: What’s your opinion on the weather today?
Answer: I’m not sure what the weather is like today, but I’m sure it’s going to be a hot one.

Question: What’s your opinion on the weather today?
Answer: I’m not sure what the weather is l
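
The output above shows that this model returns the prompt itself followed by repeated question and answer pairs, which is why the similarity score for the first question is only 0.13. One possible way to score just the answer is to trim the raw text first. The helper below, `extract_first_answer`, is a hypothetical post-processing step and is not part of the original notebook.

```python
def extract_first_answer(raw_response: str) -> str:
    """Return the text after the first "Answer:" up to the next "Question:"."""
    _, found, after = raw_response.partition("Answer:")
    if not found:
        return raw_response.strip()  # no marker: leave the text unchanged
    answer, _, _ = after.partition("Question:")
    return answer.strip()


raw = (
    "Question: What is the capital of Philippines?\n"
    "Answer: The capital of Philippines is Manila.\n\n"
    "Question: What is the capital of Philippines?\n"
    "Answer: The capital of Philippines is Manila."
)
print(extract_first_answer(raw))
```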

# **Model: tiiuae/falcon-7b-instruct**

**Description (from author):** Falcon-7B-Instruct is a 7B parameters causal decoder-only model built by *TII* based on *Falcon-7B* and finetuned on a mixture of chat/instruct datasets. It is made available under the Apache 2.0 license. [Link for tiiuae/falcon-7b-instruct](https://huggingface.co/tiiuae/falcon-7b-instruct#model-description)

As in the GPT-Neo section, this cell imports the libraries used to build the question-answering system. `HuggingFaceHub` connects to the model hosted on Hugging Face, `PromptTemplate` formats the question, and `LLMChain` ties the two together, while `difflib` and `re` are used to score the relevance and correctness of the model's answers.

In [None]:
from langchain.llms import HuggingFaceHub
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
import difflib
import re

This cell instantiates the `tiiuae/falcon-7b-instruct` large language model (LLM) from Hugging Face. The temperature is set to 0.2 to make the output less random, `max_length` caps each response at 150 tokens, and an API token authenticates the request on the Hugging Face platform. In other words, it configures the language model that will answer the questions.

In [None]:
# Initialize the LLM
import os

llm = HuggingFaceHub(
    repo_id="tiiuae/falcon-7b-instruct",
    model_kwargs={"temperature": 0.2, "max_length": 150},
    # Read the API token from the environment instead of hard-coding a secret
    huggingfacehub_api_token=os.environ["HUGGINGFACEHUB_API_TOKEN"]
)
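
The setup for each model differs only in its `repo_id`, so the initialization could be parameterized instead of repeated. The sketch below assumes a hypothetical helper, `build_llms`, that receives the constructor as an argument. A stand-in factory is used here so the example runs without network access, and token handling is left to whatever factory is passed in.

```python
MODEL_IDS = [
    "EleutherAI/gpt-neo-2.7B",
    "tiiuae/falcon-7b-instruct",
    "reasonwang/google-flan-t5-large-alpaca",
]

SHARED_KWARGS = {"temperature": 0.2, "max_length": 150}


def build_llms(make_llm, repo_ids=MODEL_IDS):
    """Create one model object per repo_id using the same generation settings.

    make_llm is any callable accepting repo_id and model_kwargs, for example
    the HuggingFaceHub class used in this notebook.
    """
    return {
        repo_id: make_llm(repo_id=repo_id, model_kwargs=dict(SHARED_KWARGS))
        for repo_id in repo_ids
    }


# Stand-in factory so the sketch runs without network access or an API token
llms = build_llms(lambda repo_id, model_kwargs: (repo_id, model_kwargs))
print(list(llms))
```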

This cell defines the format in which questions are presented to the language model. It creates a `PromptTemplate` instance named `prompt` that places the user's question (`{question}`) after "Question:" and ends with "Answer:" to cue the model to respond. Framing every question the same way makes it clear to the model what is being asked.

In [None]:
# Define the prompt template
prompt = PromptTemplate(
    template="Question: {question}\nAnswer:",
    input_variables=["question"]
)


This cell creates an `LLMChain` named `llm_chain`. It combines `llm` (the language model) and `prompt` to streamline sending questions and receiving answers.

In [None]:
# Combine LLM and prompt in a chain
llm_chain = LLMChain(llm=llm, prompt=prompt)

This cell defines a function called `calculate_similarity`. It takes the model's response (`prediction`) and the original question, compares the two strings with `difflib.SequenceMatcher`, and returns a similarity ratio between 0 and 1, where 1 means the strings are identical. A high value suggests the model is merely restating the question, while a lower value suggests it is producing different text, which is what a relevant answer is expected to do.

In [None]:
# Function to calculate similarity
def calculate_similarity(prediction, question):
    # Use difflib to calculate similarity between response and question
    similarity = difflib.SequenceMatcher(None, prediction, question).ratio()
    return similarity

The `evaluate_response` function scores the model's response for relevance and correctness. It extracts the words of the question as keywords and looks for them in the response. **Relevance** is the percentage of question keywords found in the response. **Correctness** is a coarser check: the response is marked correct if it contains at least one keyword and incorrect if it contains none. Both scores are returned.

In [None]:
# Function to evaluate correctness and relevance
def evaluate_response(response, question):
    # Extract key concepts from the question (basic keyword extraction for relevance)
    keywords = re.findall(r'\b\w+\b', question.lower())  # Simple word tokenization
    relevance_score = sum(1 for word in keywords if word in response.lower())

    # Check if the response directly answers the question (simplified)
    is_correct = any(keyword in response.lower() for keyword in keywords)

    # Score the relevance and correctness
    relevance_percentage = (relevance_score / len(keywords)) * 100 if keywords else 0
    correctness_score = 1 if is_correct else 0

    return relevance_percentage, correctness_score


This cell starts a loop that repeatedly prompts the user for a question. The loop ends when the user presses Enter without typing anything, at which point a thank-you message is displayed. This lets the user ask as many questions as they like until they choose to stop.

In [None]:
# Loop for user input
while True:
    user_prompt = input("Please enter your question (or press Enter to quit): ").strip()
    if user_prompt == "":
        print("Thank you for using the system!")
        break

This cell is the core of the question-answering system. It runs inside the loop, passes the user's query (`user_prompt`) through the prompt template to the language model with `llm_chain.run`, and collects the model's output. The response is then scored with the functions defined above for relevance, correctness, and similarity to the query.

In [None]:
    # Generate a response from the model
    response = llm_chain.run({"question": user_prompt}).strip()

    # Evaluate correctness and relevance
    relevance, correctness = evaluate_response(response, user_prompt)

    # Calculate similarity
    similarity = calculate_similarity(response, user_prompt)


This cell presents the results of the question-answering process to the user. It prints the model's response together with the `relevance`, `correctness`, and `similarity` scores. Relevance is shown as a percentage, correctness as “Correct” or “Incorrect”, and similarity as a number between 0 and 1, helping the user judge the quality of the answer.

In [81]:

    # Print the response and evaluations
    print("Response:", response)
    print(f"Relevance: {relevance:.2f}%")
    print(f"Correctness: {'Correct' if correctness == 1 else 'Incorrect'}")
    print(f"Similarity: {similarity:.2f}")


Please enter your question (or press Enter to quit): What is the capital of Philippines?
Response: Question: What is the capital of Philippines?
Answer: The capital of Philippines is Manila.
Relevance: 100.00%
Correctness: Correct
Similarity: 0.56
Please enter your question (or press Enter to quit): What’s your opinion on the weather today?
Response: Question: What’s your opinion on the weather today?
Answer: I'm sorry, I don't have an opinion on the weather as I am an AI language model and don't have the ability to feel or perceive the weather.
Relevance: 100.00%
Correctness: Correct
Similarity: 0.34
Please enter your question (or press Enter to quit): Is it ever okay to lie if it protects someone's feelings?
Response: Question: Is it ever okay to lie if it protects someone's feelings?
Answer: It depends on the situation. While lying can be a useful tool in certain situations, it is generally not considered a healthy or ethical behavior. In some cases, it may be necessary to tell the 

# **Model: reasonwang/google-flan-t5-large-alpaca**

[Link for reasonwang/google-flan-t5-large-alpaca](https://huggingface.co/reasonwang)

As in the previous sections, this cell imports the libraries used to build the question-answering system. `HuggingFaceHub` connects to the model hosted on Hugging Face, `PromptTemplate` formats the question, and `LLMChain` ties the two together, while `difflib` and `re` are used to score the relevance and correctness of the model's answers.

In [None]:
from langchain.llms import HuggingFaceHub
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
import difflib
import re

This cell instantiates the `reasonwang/google-flan-t5-large-alpaca` large language model (LLM) from Hugging Face. The temperature is set to 0.2 to make the output less random, `max_length` caps each response at 150 tokens, and an API token authenticates the request on the Hugging Face platform. In other words, it configures the language model that will answer the questions.

In [None]:
# Initialize the LLM
import os

llm = HuggingFaceHub(
    repo_id="reasonwang/google-flan-t5-large-alpaca",
    model_kwargs={"temperature": 0.2, "max_length": 150},
    # Read the API token from the environment instead of hard-coding a secret
    huggingfacehub_api_token=os.environ["HUGGINGFACEHUB_API_TOKEN"]
)

This cell defines the format in which questions are presented to the language model. It creates a `PromptTemplate` instance named `prompt` that places the user's question (`{question}`) after "Question:" and ends with "Answer:" to cue the model to respond. Framing every question the same way makes it clear to the model what is being asked.

In [None]:
# Define the prompt template
prompt = PromptTemplate(
    template="Question: {question}\nAnswer:",
    input_variables=["question"]
)

This cell creates an `LLMChain` named `llm_chain`. It combines `llm` (the language model) and `prompt` to streamline sending questions and receiving answers.

In [None]:
# Combine LLM and prompt in a chain
llm_chain = LLMChain(llm=llm, prompt=prompt)

This cell defines a function called `calculate_similarity`. It takes the model's response (`prediction`) and the original question, compares the two strings with `difflib.SequenceMatcher`, and returns a similarity ratio between 0 and 1, where 1 means the strings are identical. A high value suggests the model is merely restating the question, while a lower value suggests it is producing different text, which is what a relevant answer is expected to do.

In [None]:
# Function to calculate similarity
def calculate_similarity(prediction, question):
    # Use difflib to calculate similarity between response and question
    similarity = difflib.SequenceMatcher(None, prediction, question).ratio()
    return similarity

The `evaluate_response` function scores the model's response for relevance and correctness. It extracts the words of the question as keywords and looks for them in the response. **Relevance** is the percentage of question keywords found in the response. **Correctness** is a coarser check: the response is marked correct if it contains at least one keyword and incorrect if it contains none. Both scores are returned.

In [None]:
# Function to evaluate correctness and relevance
def evaluate_response(response, question):
    # Extract key concepts from the question (basic keyword extraction for relevance)
    keywords = re.findall(r'\b\w+\b', question.lower())  # Simple word tokenization
    relevance_score = sum(1 for word in keywords if word in response.lower())

    # Check if the response directly answers the question (simplified)
    is_correct = any(keyword in response.lower() for keyword in keywords)

    # Score the relevance and correctness
    relevance_percentage = (relevance_score / len(keywords)) * 100 if keywords else 0
    correctness_score = 1 if is_correct else 0

    return relevance_percentage, correctness_score
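
The transcript at the end of this section scores the answer "Manila." as 0% relevant and "Incorrect", because it shares no word with the question. A check against a known reference answer would treat such short answers differently. The sketch below assumes a reference answer is available for each question, which the notebook does not provide, and the helper names `normalize` and `matches_reference` are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and collapse it to space-separated word tokens."""
    return " ".join(re.findall(r"\w+", text.lower()))


def matches_reference(response: str, reference: str) -> bool:
    """Hypothetical check: does the normalized reference appear in the response?"""
    padded_response = f" {normalize(response)} "
    return f" {normalize(reference)} " in padded_response


print(matches_reference("Manila.", "Manila"))
print(matches_reference("The capital of Philippines is Manila.", "Manila"))
print(matches_reference("Yes.", "Manila"))
```

Padding both strings with spaces keeps the comparison at word boundaries, so a reference such as "man" would not match inside "Manila".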

This cell starts a loop that repeatedly prompts the user for a question. The loop ends when the user presses Enter without typing anything, at which point a thank-you message is displayed. This lets the user ask as many questions as they like until they choose to stop.

In [None]:
# Loop for user input
while True:
    user_prompt = input("Please enter your question (or press Enter to quit): ").strip()
    if user_prompt == "":
        print("Thank you for using the system!")
        break

This cell is the core of the question-answering system. It runs inside the loop, passes the user's query (`user_prompt`) through the prompt template to the language model with `llm_chain.run`, and collects the model's output. The response is then scored with the functions defined above for relevance, correctness, and similarity to the query.

In [None]:
    # Generate a response from the model
    response = llm_chain.run({"question": user_prompt}).strip()

    # Evaluate correctness and relevance
    relevance, correctness = evaluate_response(response, user_prompt)

    # Calculate similarity
    similarity = calculate_similarity(response, user_prompt)

This cell presents the results of the question-answering process to the user. It prints the model's response together with the `relevance`, `correctness`, and `similarity` scores. Relevance is shown as a percentage, correctness as “Correct” or “Incorrect”, and similarity as a number between 0 and 1, helping the user judge the quality of the answer.

In [83]:

    # Print the response and evaluations
    print("Response:", response)
    print(f"Relevance: {relevance:.2f}%")
    print(f"Correctness: {'Correct' if correctness == 1 else 'Incorrect'}")
    print(f"Similarity: {similarity:.2f}")


Please enter your question (or press Enter to quit): What is the capital of Philippines?
Response: Manila.
Relevance: 0.00%
Correctness: Incorrect
Similarity: 0.14
Please enter your question (or press Enter to quit): What’s your opinion on the weather today?
Response: I'm sorry, I cannot provide a response without more information about the weather today.
Relevance: 62.50%
Correctness: Correct
Similarity: 0.48
Please enter your question (or press Enter to quit): Is it ever okay to lie if it protects someone's feelings?
Response: Yes.
Relevance: 8.33%
Correctness: Correct
Similarity: 0.07
Please enter your question (or press Enter to quit): What would have happened if humans never existed?
Response: If humans never existed, the universe would have been a barren, barren wasteland.
Relevance: 75.00%
Correctness: Correct
Similarity: 0.34
Please enter your question (or press Enter to quit): Who was the president during the famous war?
Response: George Washington.
Relevance: 12.50%
Correctne