# Data Augmented Question Answering

https://python.langchain.com/en/latest/use_cases/evaluation/data_augmented_question_answering.html

This notebook uses generic prompts and language models to evaluate a question answering system that draws on data sources beyond what is in the model itself. For example, it can be used to evaluate a question answering system over your proprietary data.

## Setup

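Let's set up the question answering chain for this use case: a retriever over a pickled vector store (`faiss_store.pkl`) and an LLM wrapper (`GPT4AllJApi`, imported from the `llms` module).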


In [1]:
from langchain.llms import LlamaCpp
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceInstructEmbeddings
from llms import GPT4AllJApi
import pickle
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
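def get_retriever(k: int = 3):
    """Load the pickled vector store and return a retriever over the top-k chunks."""
    # NOTE: `embedding` is not passed to anything below; the pickled store is
    # assumed to carry its own embedding function.
    embedding = HuggingFaceInstructEmbeddings(
        model_name="hkunlp/instructor-large")
    # Only unpickle files you trust: pickle.load can execute arbitrary code.
    with open("faiss_store.pkl", "rb") as f:
        vector_store = pickle.load(f)
    # Return the k most similar chunks for each query.
    retriever = vector_store.as_retriever(search_kwargs={"k": k})
    return retriever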


llm = GPT4AllJApi()

In [3]:
qa = RetrievalQA.from_llm(llm=llm, retriever=get_retriever())


load INSTRUCTOR_Transformer
max_seq_length  512


## Examples

Now we need some examples to evaluate. We can do this in two ways:

1. Hard code some examples ourselves
2. Generate examples automatically, using a language model


In [4]:
# Hard-coded examples
examples = [
    {
        "query": "What is Flutter?",
        "answer": "Flutter is an open-source mobile application development framework developed by Google. It allows developers to build high-performance, natively compiled mobile apps for iOS and Android, as well as for the web and desktop."
    },
    {
        "query": "What programming language is used in Flutter?",
        "answer": "Flutter uses Dart programming language, which is also developed by Google. Dart is a modern, object-oriented language with features like garbage collection, type inference, and asynchronous programming."
    },
    {
        "query": "How do I install Flutter?",
        "answer": "To install Flutter, you need to first download the Flutter SDK from the official Flutter website. Once downloaded, extract the contents of the zip file to a desired location on your system, and then add the Flutter SDK's bin directory to your system's PATH environment variable."
    },
    {
        "query": "How do I create a new Flutter project?",
        "answer": "To create a new Flutter project, you can use the flutter create command in the terminal, followed by the name of your project. This will create a new Flutter project with the required directory structure and files."
    },
    {
        "query": "What is a widget in Flutter?",
        "answer": "In Flutter, everything is a widget. A widget is a basic building block of a Flutter app, which can be thought of as a visual element or a part of the user interface. Widgets can be either stateful or stateless."
    },
    {
        "query": "What is the difference between stateful and stateless widgets?",
        "answer": "A stateful widget is a widget that contains mutable state, i.e., data that can change over time. A stateless widget, on the other hand, is a widget that does not contain any mutable state and is purely based on its input parameters."
    },
    {
        "query": "How do I handle user input in Flutter?",
        "answer": "To handle user input in Flutter, you can use various event handlers like onPressed for buttons, onChanged for text fields, etc. You can also use Flutter's GestureDetector widget to detect gestures like tapping, swiping, etc."
    },
    {
        "query": "How do I navigate between screens in Flutter?",
        "answer": "To navigate between screens in Flutter, you can use the Navigator class. You can push a new screen onto the navigation stack using the Navigator.push method, and pop the current screen using the Navigator.pop method."
    },
    {
        "query": "How do I handle async operations in Flutter?",
        "answer": "In Flutter, you can use the async and await keywords to handle async operations. You can use the Future class to represent a value or error that may be available at some point in the future, and use async functions to wait for these values."
    },
    {
        "query": "What is a FutureBuilder in Flutter?",
        "answer": "A FutureBuilder is a widget in Flutter that makes it easy to build UIs that depend on asynchronous data. It takes a Future as input and rebuilds itself whenever the future completes with either a value or an error."
    },
    {
        "query": "How do I add animations in Flutter?",
        "answer": "In Flutter, you can use the Animation class to create animations. You can define an animation by specifying the start and end values, a duration, and a curve. You can then use an AnimationController to control the animation and update the UI accordingly."
    },
    {
        "query": "How do I add images in Flutter?",
        "answer": "To add images in Flutter, you can use the Image widget. You can specify the source of the image using a URL or a local file path. You can also specify the width and height of the image, as well as various other properties like the fit, alignment, and color."
    },
    {
        "query": "How do I add custom fonts in Flutter?",
        "answer": "To add custom fonts in Flutter, you need to first add the font files to your project's assets directory. You then need to specify the font family and file name in your pubspec.yaml file. You can then use the TextStyle widget to apply the custom font to your text."
    },
    {
        "query": "How do I add a splash screen in Flutter?",
        "answer": "To add a splash screen in Flutter, you can create a new widget that displays your app's logo or branding, and then use it as the first screen in your app. You can then use a FutureBuilder to load the data required for your app's home screen, and navigate to it once the data is loaded."
    },
    {
        "query": "How do I add internationalization (i18n) support to my Flutter app?",
        "answer": "To add internationalization support to your Flutter app, you can use the intl package. You can define a set of messages for each supported language, and use the Localizations widget to load the appropriate message set based on the user's device language."
    },
    {
        "query": "How do I use the camera in Flutter?",
        "answer": "To use the camera in Flutter, you can use the camera package. You can use the CameraController class to control the camera and capture images or videos. You can also use various other packages for advanced camera features like barcode scanning and face detection."
    },
    {
        "query": "How do I use the device's sensors in Flutter?",
        "answer": "To use the device's sensors in Flutter, you can use various packages like sensors, flutter_blue, etc. These packages provide APIs to access the device's sensors like accelerometer, gyroscope, magnetometer, etc."
    },
    {
        "query": "How do I use Firebase in Flutter?",
        "answer": "To use Firebase in Flutter, you need to first add the Firebase SDK to your app by following the setup instructions provided by Firebase. Once done, you can use various Firebase services like authentication, database, storage, etc. using the respective Flutter plugins."
    },
    {
        "query": "How do I test my Flutter app?",
        "answer": "To test your Flutter app, you can use Flutter's built-in testing framework called flutter_test. You can write unit tests, widget tests, and integration tests using this framework. You can also use various third-party testing tools like mockito, flutter_driver, etc."
    },
    {
        "query": "How do I deploy my Flutter app?",
        "answer": "To deploy your Flutter app, you can use various methods like publishing to the app stores, creating APKs or IPA files for distribution, deploying to web or desktop platforms, etc. You can also use various third-party tools like Google Play Console, Apple App Store Connect, etc. for app store deployment."
    }
]

In [5]:
examples[:10]

[{'query': 'What is Flutter?',
  'answer': 'Flutter is an open-source mobile application development framework developed by Google. It allows developers to build high-performance, natively compiled mobile apps for iOS and Android, as well as for the web and desktop.'},
 {'query': 'What programming language is used in Flutter?',
  'answer': 'Flutter uses Dart programming language, which is also developed by Google. Dart is a modern, object-oriented language with features like garbage collection, type inference, and asynchronous programming.'},
 {'query': 'How do I install Flutter?',
  'answer': "To install Flutter, you need to first download the Flutter SDK from the official Flutter website. Once downloaded, extract the contents of the zip file to a desired location on your system, and then add the Flutter SDK's bin directory to your system's PATH environment variable."},
 {'query': 'How do I create a new Flutter project?',
  'answer': 'To create a new Flutter project, you can use the f

In [None]:
# Generated examples
from langchain.evaluation.qa import QAGenerateChain
from langchain.llms import OpenAI  # needed for OpenAI() below

example_gen_chain = QAGenerateChain.from_llm(OpenAI())

In [None]:
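# NOTE: `texts` is not defined in this notebook; it is assumed to be a list of
# source document chunks (strings) from the indexed corpus.
new_examples = example_gen_chain.apply_and_parse(
    [{"doc": t} for t in texts[:5]])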

In [None]:
new_examples

In [None]:
# Combine examples
examples += new_examples
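
Hard-coded and generated examples need the same shape before they are combined, since the evaluation steps read the `query` and `answer` keys of each example. The following is a minimal sketch of a sanity check, not part of LangChain; the helper name `validate_examples` and the sample data are hypothetical.

```python
def validate_examples(examples, required_keys=("query", "answer")):
    """Return (index, problem) pairs for examples that are malformed."""
    problems = []
    for i, example in enumerate(examples):
        if not isinstance(example, dict):
            problems.append((i, "not a dict"))
            continue
        for key in required_keys:
            value = example.get(key)
            # Each required key must map to a non-empty string.
            if not isinstance(value, str) or not value.strip():
                problems.append((i, f"missing or empty '{key}'"))
    return problems


# Hypothetical examples, for illustration only.
sample_examples = [
    {"query": "What is Flutter?", "answer": "A cross-platform UI framework."},
    {"query": "What language does Flutter use?"},  # no answer
    {"query": "", "answer": "Dart"},  # empty query
]
problems = validate_examples(sample_examples)
for index, problem in problems:
    print(f"Example {index}: {problem}")
```

An empty `problems` list means every example has the keys the evaluator expects.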

## Evaluate

Now that we have examples, we can use the question answering evaluator to evaluate our question answering chain.


In [6]:
from langchain.evaluation.qa import QAEvalChain

In [7]:
predictions = qa.apply(examples[:10])

data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nFlutter is a framework for building cross-platform applications\nthat uses the Dart programming language.\nTo understand some differences between programming with Dart\nand programming with Javascript, \nsee Learning Dart as a JavaScript Developer.\n\nContext:\nFlutter is a framework for building cross-platform applications\nthat uses the Dart programming language.\nTo understand some differences between programming with Dart\nand programming with Swift, see Learning Dart as a Swift Developer\nand Flutter concurrency for Swift developers.\n\nContext:\nFlutter is a multi-paradigm programming environment.\nMany programming techniques developed over the past few decades\nare used in Flutter. We use each one where we believe\nthe strengths of the technique make it particularly well-suited.\nIn no par

In [8]:
predictions[0]

{'query': 'What is Flutter?',
 'answer': 'Flutter is an open-source mobile application development framework developed by Google. It allows developers to build high-performance, natively compiled mobile apps for iOS and Android, as well as for the web and desktop.',
 'result': ' FluTter is a framework for building cross-platform applications using the Dart programming language.'}

In [9]:
for i, p in enumerate(predictions):
    print(f"{i}) {p['query']}")
    print(f"{p['answer']}")
    print(f"{p['result']}\n")

0) What is Flutter?
Flutter is an open-source mobile application development framework developed by Google. It allows developers to build high-performance, natively compiled mobile apps for iOS and Android, as well as for the web and desktop.
 FluTter is a framework for building cross-platform applications using the Dart programming language.

1) What programming language is used in Flutter?
Flutter uses Dart programming language, which is also developed by Google. Dart is a modern, object-oriented language with features like garbage collection, type inference, and asynchronous programming.
 Flutter uses Dart as its programming language.

2) How do I install Flutter?
To install Flutter, you need to first download the Flutter SDK from the official Flutter website. Once downloaded, extract the contents of the zip file to a desired location on your system, and then add the Flutter SDK's bin directory to your system's PATH environment variable.
 There are different ways to install Flutter 

In [None]:
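from langchain.llms import OpenAI  # needed for OpenAI() below

# Use a separate name so the grader does not shadow the QA chain's `llm`.
eval_llm = OpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(eval_llm)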

In [None]:
# Predictions were generated for the first 10 examples only, so grade the same slice.
graded_outputs = eval_chain.evaluate(examples[:10], predictions)

In [None]:
# Iterate over predictions (not examples): only the first 10 examples were answered.
for i, pred in enumerate(predictions):
    print(f"Example {i}:")
    print("Question: " + pred['query'])
    print("Real Answer: " + pred['answer'])
    print("Predicted Answer: " + pred['result'])
    print("Predicted Grade: " + graded_outputs[i]['text'])
    print()
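
To condense the grades into a single accuracy figure, the verdict strings can be tallied. The following is a minimal sketch, assuming each graded output is a dict whose `text` field begins with `CORRECT` or `INCORRECT`; the helper name `tally_grades` and the sample outputs are hypothetical.

```python
from collections import Counter


def tally_grades(graded_outputs, key="text"):
    """Count CORRECT / INCORRECT verdicts in a list of grader outputs."""
    counts = Counter()
    for output in graded_outputs:
        verdict = output[key].strip().upper()
        # Check INCORRECT first: the string "INCORRECT" also contains "CORRECT".
        if verdict.startswith("INCORRECT"):
            counts["INCORRECT"] += 1
        elif verdict.startswith("CORRECT"):
            counts["CORRECT"] += 1
        else:
            counts["UNPARSED"] += 1
    return counts


# Hypothetical grader outputs, for illustration only.
sample_outputs = [
    {"text": " CORRECT"},
    {"text": " INCORRECT"},
    {"text": " CORRECT"},
    {"text": "unsure"},
]
counts = tally_grades(sample_outputs)
accuracy = counts["CORRECT"] / len(sample_outputs)
print(dict(counts), f"accuracy={accuracy:.2f}")
```

Keeping an `UNPARSED` bucket makes it visible when the grader returns something other than the two expected verdicts.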

## Evaluate with Other Metrics

In addition to using a language model to predict whether each answer is correct or incorrect, we can use other metrics to get a more nuanced view of answer quality. To do so, we can use the [Critique](https://docs.inspiredco.ai/critique/) library, which allows for simple calculation of various metrics over generated text.

First, get an API key from the [Inspired Cognition Dashboard](https://dashboard.inspiredco.ai) and do some setup:

```bash
export INSPIREDCO_API_KEY="..."
pip install inspiredco
```


In [None]:
import inspiredco.critique
import os
critique = inspiredco.critique.Critique(
    api_key=os.environ['INSPIREDCO_API_KEY'])

Then run the following code to set up the configuration and calculate the [ROUGE](https://docs.inspiredco.ai/critique/metric_rouge.html), [chrf](https://docs.inspiredco.ai/critique/metric_chrf.html), [BERTScore](https://docs.inspiredco.ai/critique/metric_bert_score.html), and [UniEval](https://docs.inspiredco.ai/critique/metric_uni_eval.html) metrics (you can choose [other metrics](https://docs.inspiredco.ai/critique/metrics.html) too):


In [None]:
metrics = {
    "rouge": {
        "metric": "rouge",
        "config": {"variety": "rouge_l"},
    },
    "chrf": {
        "metric": "chrf",
        "config": {},
    },
    "bert_score": {
        "metric": "bert_score",
        "config": {"model": "bert-base-uncased"},
    },
    "uni_eval": {
        "metric": "uni_eval",
        "config": {"task": "summarization", "evaluation_aspect": "relevance"},
    },
}

In [None]:
critique_data = [
    {"target": pred['result'], "references": [pred['answer']]} for pred in predictions
]
eval_results = {
    k: critique.evaluate(dataset=critique_data,
                         metric=v["metric"], config=v["config"])
    for k, v in metrics.items()
}

Finally, we can print out the results. In general, scores tend to be higher when the output is semantically correct, and also when the output closely matches the gold-standard answer.


In [None]:
# Iterate over predictions (not examples): metrics were computed per prediction.
for i, pred in enumerate(predictions):
    score_string = ", ".join(
        [f"{k}={v['examples'][i]['value']:.4f}" for k, v in eval_results.items()])
    print(f"Example {i}:")
    print("Question: " + pred['query'])
    print("Real Answer: " + pred['answer'])
    print("Predicted Answer: " + pred['result'])
    print("Predicted Scores: " + score_string)
    print()
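
To build intuition for what an overlap metric such as ROUGE-L rewards, here is a minimal pure-Python sketch based on the longest common subsequence (LCS) of whitespace-separated tokens. It is a simplified F1 variant for illustration, not the Critique implementation, and its values should not be expected to match the service exactly; the function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Simplified ROUGE-L: F1 of LCS-based precision and recall."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# A predicted answer and a shortened gold-standard answer from the examples above.
score = rouge_l_f1(
    "Flutter uses Dart as its programming language.",
    "Flutter uses Dart programming language, which is also developed by Google.",
)
print(f"rouge_l_f1={score:.4f}")
```

A short but correct prediction gets high precision and low recall against a long reference, which is one reason overlap metrics are best read alongside the model-based grade rather than instead of it.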