# Data Augmented Question Answering

https://python.langchain.com/en/latest/use_cases/evaluation/data_augmented_question_answering.html

This notebook uses generic prompts and language models to evaluate a question answering system that draws on data sources beyond what is in the model itself. For example, it can be used to evaluate a question answering system over your proprietary data.

## Setup

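Let's set up a custom use case: a local GPT4All-J model answering questions about Flutter, with documents retrieved from a Supabase vector store.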


In [None]:
%pip install -qU supabase

In [4]:
from langchain.llms import LlamaCpp
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from supabase.client import Client, create_client
from langchain.vectorstores import SupabaseVectorStore
import os
import GPT4AllJ
from dotenv import load_dotenv
load_dotenv(dotenv_path=".env_supabase")

data {"prompt": "What is Flutter?", "params": {"seed": -1, "n_threads": -1, "n_predict": 128, "top_k": 40, "top_p": 0.9, "temperature": 0.9, "repeat_penalty": 1, "repeat_last_n": 64, "n_batch": 8}}

Fluctter is a social media app that allows users to create and share text-based messages, called "tweets," with other users. The app was created in 2012 and is available for free on both iOS and Android devices. Users can follow other users or create their own groups and follow them. The app also includes features such as the ability to share photos, links, and GIFs, and the option to share posts to social media platforms such as Twitter, Facebook, and Instagram. Fluiter was a notable app at the time of its creation, as it was the first app that allowed users to send text messages
GPT4AllJ
Params: {'seed': -1, 'n_threads': -1, 'n_predict': 128, 'top_k': 40, 'top_p': 0.9, 'temperature': 0.9, 'repeat_penalty': 1, 'repeat_last_n': 64, 'n_batch': 8}


True

In [6]:
supabase_url = os.environ.get("SUPABASE_URL")
supabase_key = os.environ.get("SUPABASE_KEY")
supabase: Client = create_client(supabase_url, supabase_key)


def get_retriever(k: int = 3):
    embedding_model_name = os.environ.get("EMBEDDING_MODEL_NAME")
    embedding = HuggingFaceEmbeddings(model_name=embedding_model_name)
    vector_store = SupabaseVectorStore(
        client=supabase, embedding=embedding, table_name="documents")
    retriever = vector_store.as_retriever(search_kwargs={"k": k})
    return retriever


print('loading model...')

llm = GPT4AllJ.GPT4AllJ()

loading model...
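
The setup cell reads `SUPABASE_URL`, `SUPABASE_KEY`, and `EMBEDDING_MODEL_NAME` from the environment, and `os.environ.get` silently returns `None` when a variable is absent. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper (not part of LangChain or Supabase), and the demo values are placeholders.

```python
import os


def require_env(names, environ=os.environ):
    """Return the requested variables, raising if any is missing or empty."""
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Demo with a plain dict standing in for os.environ (placeholder values).
demo_env = {"SUPABASE_URL": "https://example.invalid", "SUPABASE_KEY": "placeholder"}
settings = require_env(["SUPABASE_URL", "SUPABASE_KEY"], environ=demo_env)
print(sorted(settings))
```

In the notebook this could be called right after `load_dotenv(...)`, so that a missing key surfaces before the Supabase client is created.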


In [11]:
qa = RetrievalQA.from_llm(llm=llm, retriever=get_retriever())

2023-05-12 15:56:14,443:INFO - Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2
2023-05-12 15:56:18,007:INFO - Use pytorch device: cpu


## Examples

Now we need some examples to evaluate. We can do this in two ways:

1. Hard code some examples ourselves
2. Generate examples automatically, using a language model
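
Whichever way the examples are produced, the cells in this notebook read the `query` and `answer` keys of each example. A minimal sketch of a sanity check is shown here; `validate_examples` is a hypothetical helper, not a LangChain function, and the sample data is made up for illustration.

```python
def validate_examples(examples):
    """Return the indices of examples lacking a non-empty 'query' or 'answer' string."""
    bad = []
    for i, ex in enumerate(examples):
        if not isinstance(ex, dict):
            bad.append(i)
            continue
        ok = all(
            isinstance(ex.get(key), str) and ex[key].strip()
            for key in ("query", "answer")
        )
        if not ok:
            bad.append(i)
    return bad


# Illustrative data only: the second entry has no answer.
sample = [
    {"query": "What is Flutter?", "answer": "A UI toolkit."},
    {"query": "Missing answer"},
]
print(validate_examples(sample))
```

Running a check like this on generated examples before combining them with the hard-coded ones makes malformed entries easier to spot.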


In [7]:
# Hard-coded examples
examples = [
    {
        "query": "What is Flutter?",
        "answer": "Flutter is an open-source mobile application development framework developed by Google. It allows developers to build high-performance, natively compiled mobile apps for iOS and Android, as well as for the web and desktop."
    },
    {
        "query": "What programming language is used in Flutter?",
        "answer": "Flutter uses Dart programming language, which is also developed by Google. Dart is a modern, object-oriented language with features like garbage collection, type inference, and asynchronous programming."
    },
    {
        "query": "How do I install Flutter?",
        "answer": "To install Flutter, you need to first download the Flutter SDK from the official Flutter website. Once downloaded, extract the contents of the zip file to a desired location on your system, and then add the Flutter SDK's bin directory to your system's PATH environment variable."
    },
    {
        "query": "How do I create a new Flutter project?",
        "answer": "To create a new Flutter project, you can use the flutter create command in the terminal, followed by the name of your project. This will create a new Flutter project with the required directory structure and files."
    },
    {
        "query": "What is a widget in Flutter?",
        "answer": "In Flutter, everything is a widget. A widget is a basic building block of a Flutter app, which can be thought of as a visual element or a part of the user interface. Widgets can be either stateful or stateless."
    },
    {
        "query": "What is the difference between stateful and stateless widgets?",
        "answer": "A stateful widget is a widget that contains mutable state, i.e., data that can change over time. A stateless widget, on the other hand, is a widget that does not contain any mutable state and is purely based on its input parameters."
    },
    {
        "query": "How do I handle user input in Flutter?",
        "answer": "To handle user input in Flutter, you can use various event handlers like onPressed for buttons, onChanged for text fields, etc. You can also use Flutter's GestureDetector widget to detect gestures like tapping, swiping, etc."
    },
    {
        "query": "How do I navigate between screens in Flutter?",
        "answer": "To navigate between screens in Flutter, you can use the Navigator class. You can push a new screen onto the navigation stack using the Navigator.push method, and pop the current screen using the Navigator.pop method."
    },
    {
        "query": "How do I handle async operations in Flutter?",
        "answer": "In Flutter, you can use the async and await keywords to handle async operations. You can use the Future class to represent a value or error that may be available at some point in the future, and use async functions to wait for these values."
    },
    {
        "query": "What is a FutureBuilder in Flutter?",
        "answer": "A FutureBuilder is a widget in Flutter that makes it easy to build UIs that depend on asynchronous data. It takes a Future as input and rebuilds itself whenever the future completes with either a value or an error."
    },
    {
        "query": "How do I add animations in Flutter?",
        "answer": "In Flutter, you can use the Animation class to create animations. You can define an animation by specifying the start and end values, a duration, and a curve. You can then use an AnimationController to control the animation and update the UI accordingly."
    },
    {
        "query": "How do I add images in Flutter?",
        "answer": "To add images in Flutter, you can use the Image widget. You can specify the source of the image using a URL or a local file path. You can also specify the width and height of the image, as well as various other properties like the fit, alignment, and color."
    },
    {
        "query": "How do I add custom fonts in Flutter?",
        "answer": "To add custom fonts in Flutter, you need to first add the font files to your project's assets directory. You then need to specify the font family and file name in your pubspec.yaml file. You can then use the TextStyle widget to apply the custom font to your text."
    },
    {
        "query": "How do I add a splash screen in Flutter?",
        "answer": "To add a splash screen in Flutter, you can create a new widget that displays your app's logo or branding, and then use it as the first screen in your app. You can then use a FutureBuilder to load the data required for your app's home screen, and navigate to it once the data is loaded."
    },
    {
        "query": "How do I add internationalization (i18n) support to my Flutter app?",
        "answer": "To add internationalization support to your Flutter app, you can use the intl package. You can define a set of messages for each supported language, and use the Localizations widget to load the appropriate message set based on the user's device language."
    },
    {
        "query": "How do I use the camera in Flutter?",
        "answer": "To use the camera in Flutter, you can use the camera package. You can use the CameraController class to control the camera and capture images or videos. You can also use various other packages for advanced camera features like barcode scanning and face detection."
    },
    {
        "query": "How do I use the device's sensors in Flutter?",
        "answer": "To use the device's sensors in Flutter, you can use various packages like sensors, flutter_blue, etc. These packages provide APIs to access the device's sensors like accelerometer, gyroscope, magnetometer, etc."
    },
    {
        "query": "How do I use Firebase in Flutter?",
        "answer": "To use Firebase in Flutter, you need to first add the Firebase SDK to your app by following the setup instructions provided by Firebase. Once done, you can use various Firebase services like authentication, database, storage, etc. using the respective Flutter plugins."
    },
    {
        "query": "How do I test my Flutter app?",
        "answer": "To test your Flutter app, you can use Flutter's built-in testing framework called flutter_test. You can write unit tests, widget tests, and integration tests using this framework. You can also use various third-party testing tools like mockito, flutter_driver, etc."
    },
    {
        "query": "How do I deploy my Flutter app?",
        "answer": "To deploy your Flutter app, you can use various methods like publishing to the app stores, creating APKs or IPA files for distribution, deploying to web or desktop platforms, etc. You can also use various third-party tools like Google Play Console, Apple App Store Connect, etc. for app store deployment."
    }
]

In [6]:
examples[:10]

[{'query': 'What is Flutter?',
  'answer': 'Flutter is an open-source mobile application development framework developed by Google. It allows developers to build high-performance, natively compiled mobile apps for iOS and Android, as well as for the web and desktop.'},
 {'query': 'What programming language is used in Flutter?',
  'answer': 'Flutter uses Dart programming language, which is also developed by Google. Dart is a modern, object-oriented language with features like garbage collection, type inference, and asynchronous programming.'},
 {'query': 'How do I install Flutter?',
  'answer': "To install Flutter, you need to first download the Flutter SDK from the official Flutter website. Once downloaded, extract the contents of the zip file to a desired location on your system, and then add the Flutter SDK's bin directory to your system's PATH environment variable."},
 {'query': 'How do I create a new Flutter project?',
  'answer': 'To create a new Flutter project, you can use the f

In [None]:
# Generated examples
from langchain.llms import OpenAI  # needed for OpenAI() below
from langchain.evaluation.qa import QAGenerateChain
example_gen_chain = QAGenerateChain.from_llm(OpenAI())

In [None]:
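# NOTE: `texts` is not defined in this notebook; it is assumed to be a list of
# document chunks (objects or strings) loaded beforehand.
new_examples = example_gen_chain.apply_and_parse(
    [{"doc": t} for t in texts[:5]])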

In [None]:
new_examples

In [None]:
# Combine examples
examples += new_examples

## Evaluate

Now that we have examples, we can use the question answering evaluator to evaluate our question answering chain.


In [9]:
from langchain.evaluation.qa import QAEvalChain

In [12]:
predictions = qa.apply(examples[:10])

Batches: 100%|██████████| 1/1 [00:00<00:00,  3.15it/s]


data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nIntroduction\n\nThis page collects some common questions asked about\nFlutter. You might also check out the following\nspecialized FAQs:\n\nWeb FAQ\n\nPerformance FAQ\n\nWhat is Flutter?\n\nFlutter is Google\u2019s portable UI toolkit for crafting beautiful,\nnatively compiled applications for mobile, web,\nand desktop from a single codebase.\nFlutter works with existing code,\nis used by developers and organizations around\nthe world, and is free and open source.\n\nWho is Flutter for?\n\nFor users, Flutter makes beautiful apps come to life.\n\nFor developers, Flutter lowers the bar to entry for building apps.\nIt speeds app development and reduces the cost and complexity\nof app production across platforms.\n\nFor designers, Flutter provides a canvas for\nhigh-end user experiences. Fast Company

Batches: 100%|██████████| 1/1 [00:00<00:00,  9.68it/s]


data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nThe rendering process: How Flutter turns UI code into pixels.\n\nAn overview of the platform embedders: The code that lets mobile and\ndesktop OSes execute Flutter apps.\n\nIntegrating Flutter with other code: Information about different techniques\navailable to Flutter apps.\n\nSupport for the web: Concluding remarks about the characteristics of\nFlutter in a browser environment.\n\nArchitectural layers\n\nFlutter is designed as an extensible, layered system. It exists as a series of\nindependent libraries that each depend on the underlying layer. No layer has\nprivileged access to the layer below, and every part of the framework level is\ndesigned to be optional and replaceable.\n\nTo the underlying operating system, Flutter applications are packaged in the\nsame way as any other native applica

Batches: 100%|██████████| 1/1 [00:00<00:00,  7.22it/s]


data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nDownload the following installation bundle to get the latest\nstable release of the Flutter SDK:\n\n    (loading\u2026)\n\n    For other release channels, and older builds,\nsee the SDK releases page.\n\nExtract the file in the desired location, for example:\n\n    \n$ cd ~/development\n$ tar xf ~/Downloads/flutter_linux_vX.X.X-stable.tar.xz\n    \n\n    If you don\u2019t want to install a fixed version of the installation bundle, \nyou can skip steps 1 and 2. \nInstead, get the source code from the Flutter repo\non GitHub with the following command:\n\n    \n$ git clone https://github.com/flutter/flutter.git\n    \n\n    You can also change branches or tags as needed.\nFor example, to get just the stable version:\n\n    \n$ git clone https://github.com/flutter/flutter.git -b stable\n\nAdd the fl

Batches: 100%|██████████| 1/1 [00:00<00:00,  5.95it/s]




Batches: 100%|██████████| 1/1 [00:00<00:00,  3.65it/s]


data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nIn Flutter, widgets (akin to components in React) are represented by immutable\nclasses that are used to configure a tree of objects. These widgets are used to\nmanage a separate tree of objects for layout, which is then used to manage a\nseparate tree of objects for compositing. Flutter is, at its core, a series of\nmechanisms for efficiently walking the modified parts of trees, converting trees\nof objects into lower-level trees of objects, and propagating changes across\nthese trees.\n\nA widget declares its user interface by overriding the build() method, which\nis a function that converts state to UI:\n\nThe build() method is by design fast to execute and should be free of side\neffects, allowing it to be called by the framework whenever needed (potentially\nas often as once per rendered fra

Batches: 100%|██████████| 1/1 [00:00<00:00,  4.75it/s]


data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nIn order to build more complex experiences\u2014for example,\nto react in more interesting ways to user input\u2014applications\ntypically carry some state. Flutter uses StatefulWidgets to capture\nthis idea. StatefulWidgets are special widgets that know how to generate\nState objects, which are then used to hold state.\nConsider this basic example, using the ElevatedButton mentioned earlier:\n\nYou might wonder why StatefulWidget and State are separate objects.\nIn Flutter, these two types of objects have different life cycles.\nWidgets are temporary objects, used to construct a presentation of\nthe application in its current state. State objects, on the other\nhand, are persistent between calls to\nbuild(), allowing them to remember information.\n\nThe example above accepts user input and direc

Batches: 100%|██████████| 1/1 [00:00<00:00,  7.48it/s]


data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nRetrieving user input\n\nGiven how Flutter uses immutable widgets with a separate state,\nyou might be wondering how user input fits into the picture.\nIn UIKit, you usually query the widgets for their current values\nwhen it\u2019s time to submit the user input, or action on it.\nHow does that work in Flutter?\n\nIn practice forms are handled, like everything in Flutter,\nby specialized widgets. If you have a TextField or a\nTextFormField, you can supply a TextEditingController\nto retrieve user input:\n\nYou can find more information and the full code listing in\nRetrieve the value of a text field,\nfrom the Flutter cookbook.\n\nPlaceholder in a text field\n\nIn Flutter, you can easily show a \u201chint\u201d or a placeholder text\nfor your field by adding an InputDecoration object\nto the deco

Batches: 100%|██████████| 1/1 [00:00<00:00,  5.46it/s]


data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nIn Android, new screens are new Activities.\nIn iOS, new screens are new ViewControllers. In Flutter,\nscreens are just Widgets! And to navigate to new\nscreens in Flutter, use the Navigator widget.\n\nHow do I navigate between screens?\n\nIn React Native, there are three main navigators:\nStackNavigator, TabNavigator, and DrawerNavigator.\nEach provides a way to configure and define the screens.\n\n// React Native\n\nconst\n\nMyApp\n\nTabNavigator\n\nHome\n\nscreen\n\nHomeScreen\n\n},\n\nNotifications\n\nscreen\n\ntabNavScreen\n\n},\n\ntabBarOptions\n\nactiveTintColor\n\n#e91e63\n\n);\n\nconst\n\nSimpleApp\n\nStackNavigator\n\n({\n\nHome\n\nscreen\n\nMyApp\n\n},\n\nstackScreen\n\nscreen\n\nStackScreen\n\n});\n\nexport\n\ndefault\n\nMyApp1\n\nDrawerNavigator\n\n({\n\nHome\n\nscreen\n\nSimpleApp\n

Batches: 100%|██████████| 1/1 [00:00<00:00,  7.58it/s]


data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nOnce the awaited network call is done, update the UI by calling setState(),\nwhich triggers a rebuild of the widget sub-tree and updates the data.\n\nThe following example loads data asynchronously and displays it in a ListView:\n\nRefer to the next section for more information on doing work in the\nbackground, and how Flutter differs from Android.\n\nHow do you move work to a background thread?\n\nIn Android, when you want to access a network resource you would typically\nmove to a background thread and do the work, as to not block the main thread,\nand avoid ANRs. For example, you might be using an AsyncTask, a LiveData,\nan IntentService, a JobScheduler job, or an RxJava pipeline with a\nscheduler that works on background threads.\n\nSince Flutter is single threaded and runs an event loop (lik

Batches: 100%|██████████| 1/1 [00:00<00:00,  7.69it/s]


data {"prompt": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nContext:\nAsync widgets\n\nUI\n\nWidgets\n\nAsync\n\nAsync patterns to your Flutter application.\n\nSee more widgets in the widget catalog.\n\nFutureBuilder\n\nWidget that builds itself based on the latest snapshot of interaction with a Future.\n\nStreamBuilder\n\nWidget that builds itself based on the latest snapshot of interaction with a Stream.\n\nSee more widgets in the widget catalog.\n\nContext:\nIn Flutter, widgets (akin to components in React) are represented by immutable\nclasses that are used to configure a tree of objects. These widgets are used to\nmanage a separate tree of objects for layout, which is then used to manage a\nseparate tree of objects for compositing. Flutter is, at its core, a series of\nmechanisms for efficiently walking the modified parts of trees, converting trees\nof objects

In [13]:
predictions[0]

{'query': 'What is Flutter?',
 'answer': 'Flutter is an open-source mobile application development framework developed by Google. It allows developers to build high-performance, natively compiled mobile apps for iOS and Android, as well as for the web and desktop.',
 'result': ' Flutter is a cross-platform UI toolkit that allows code reuse\nacross operating systems such as iOS and Android, while also enabling applications\nto interface directly with underlying platform services. The goal is to enable developers\nto deliver high-performance apps that feel natural on different platforms, embracing\ndifferences where they exist while sharing as much code as possible. Flutter is open\nsource, with a permiissi BSD license, and has a thriving ecosystem of third-party\npackages that supplement the core library functionality.'}

In [14]:
for i, p in enumerate(predictions):
    print(f"{i}) {p['query']}")
    print(f"{p['answer']}")
    print(f"{p['result']}\n")

0) What is Flutter?
Flutter is an open-source mobile application development framework developed by Google. It allows developers to build high-performance, natively compiled mobile apps for iOS and Android, as well as for the web and desktop.
 Flutter is a cross-platform UI toolkit that allows code reuse
across operating systems such as iOS and Android, while also enabling applications
to interface directly with underlying platform services. The goal is to enable developers
to deliver high-performance apps that feel natural on different platforms, embracing
differences where they exist while sharing as much code as possible. Flutter is open
source, with a permiissi BSD license, and has a thriving ecosystem of third-party
packages that supplement the core library functionality.

1) What programming language is used in Flutter?
Flutter uses Dart programming language, which is also developed by Google. Dart is a modern, object-oriented language with features like garbage collection, type 

In [None]:
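from langchain.llms import OpenAI

# Use a separate name so the local GPT4AllJ `llm` defined earlier is not overwritten
eval_llm = OpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(eval_llm)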

In [None]:
# Predictions were generated for examples[:10] only, so grade the same slice
graded_outputs = eval_chain.evaluate(examples[:10], predictions)

In [None]:
# Iterate over the predictions so the indices always line up with graded_outputs
for i, (pred, graded) in enumerate(zip(predictions, graded_outputs)):
    print(f"Example {i}:")
    print("Question: " + pred['query'])
    print("Real Answer: " + pred['answer'])
    print("Predicted Answer: " + pred['result'])
    print("Predicted Grade: " + graded['text'])
    print()
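
Beyond printing each grade, it can help to tally them. The sketch below assumes each graded output is a dict whose `text` field holds a label such as `CORRECT` or `INCORRECT`, possibly with surrounding whitespace. The exact wording depends on the grading prompt and model, so `summarize_grades` and the mock data are illustrative only.

```python
from collections import Counter


def summarize_grades(graded_outputs, key="text"):
    """Count CORRECT / INCORRECT labels; anything else is counted as OTHER."""
    counts = Counter()
    for graded in graded_outputs:
        label = graded[key].strip().upper()
        if label.startswith("CORRECT"):
            counts["CORRECT"] += 1
        elif label.startswith("INCORRECT"):
            counts["INCORRECT"] += 1
        else:
            counts["OTHER"] += 1
    return counts


# Mock grader outputs, not real results from this notebook.
mock_graded = [{"text": " CORRECT"}, {"text": " INCORRECT"}, {"text": "correct"}]
summary = summarize_grades(mock_graded)
print(dict(summary))
```

The `OTHER` bucket is a deliberate choice: a grader that returns free-form text should be noticed rather than silently counted as wrong.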

## Evaluate with Other Metrics

In addition to using a language model to predict whether an answer is correct or incorrect, we can use other metrics to get a more nuanced view of answer quality. To do so, we can use the [Critique](https://docs.inspiredco.ai/critique/) library, which allows for simple calculation of various metrics over generated text.

First you can get an API key from the [Inspired Cognition Dashboard](https://dashboard.inspiredco.ai) and do some setup:

```bash
export INSPIREDCO_API_KEY="..."
pip install inspiredco
```


In [None]:
import inspiredco.critique
import os
critique = inspiredco.critique.Critique(
    api_key=os.environ['INSPIREDCO_API_KEY'])

Then run the following code to set up the configuration and calculate the [ROUGE](https://docs.inspiredco.ai/critique/metric_rouge.html), [chrf](https://docs.inspiredco.ai/critique/metric_chrf.html), [BERTScore](https://docs.inspiredco.ai/critique/metric_bert_score.html), and [UniEval](https://docs.inspiredco.ai/critique/metric_uni_eval.html) metrics (you can choose [other metrics](https://docs.inspiredco.ai/critique/metrics.html) too):


In [None]:
metrics = {
    "rouge": {
        "metric": "rouge",
        "config": {"variety": "rouge_l"},
    },
    "chrf": {
        "metric": "chrf",
        "config": {},
    },
    "bert_score": {
        "metric": "bert_score",
        "config": {"model": "bert-base-uncased"},
    },
    "uni_eval": {
        "metric": "uni_eval",
        "config": {"task": "summarization", "evaluation_aspect": "relevance"},
    },
}
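
For intuition about what an overlap metric such as ROUGE-L rewards, here is a simplified, self-contained sketch: it lowercases, splits on whitespace, and computes an F1 score from the longest common subsequence of tokens. This is not the Critique implementation. Real implementations differ in tokenization, stemming, and weighting, so the numbers will not match.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(target, reference):
    """Simplified ROUGE-L F1 over whitespace tokens (illustrative only)."""
    t = target.lower().split()
    r = reference.lower().split()
    if not t or not r:
        return 0.0
    lcs = lcs_length(t, r)
    if lcs == 0:
        return 0.0
    precision = lcs / len(t)
    recall = lcs / len(r)
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1("Flutter uses the Dart language", "Flutter uses Dart")
print(f"{score:.2f}")
```

Because the score depends on shared word order, a prediction that is semantically right but phrased differently from the reference can still score low, which is why the notebook pairs it with model-based metrics.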

In [None]:
critique_data = [
    {"target": pred['result'], "references": [pred['answer']]} for pred in predictions
]
eval_results = {
    k: critique.evaluate(dataset=critique_data,
                         metric=v["metric"], config=v["config"])
    for k, v in metrics.items()
}

Finally, we can print out the results. Overall, the scores are higher when the output is semantically correct and when it closely matches the gold-standard answer.


In [None]:
# Iterate over predictions rather than examples: scores exist only for
# the examples that were actually predicted.
for i, pred in enumerate(predictions):
    score_string = ", ".join(
        [f"{k}={v['examples'][i]['value']:.4f}" for k, v in eval_results.items()])
    print(f"Example {i}:")
    print("Question: " + pred['query'])
    print("Real Answer: " + pred['answer'])
    print("Predicted Answer: " + pred['result'])
    print("Predicted Scores: " + score_string)
    print()
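
Per-example scores are easier to compare across metrics once they are averaged. The sketch below assumes the result shape used in the loop above, where each metric maps to a dict with an `examples` list of `{"value": ...}` entries. `mean_scores` is a hypothetical helper and the numbers are mock data, not results from this notebook.

```python
from statistics import mean


def mean_scores(eval_results):
    """Average the per-example 'value' of each metric."""
    return {
        name: mean(example["value"] for example in result["examples"])
        for name, result in eval_results.items()
    }


# Mock results with the same shape as `eval_results`; values are made up.
mock_results = {
    "rouge": {"examples": [{"value": 0.5}, {"value": 0.7}]},
    "chrf": {"examples": [{"value": 0.2}, {"value": 0.4}]},
}
averages = mean_scores(mock_results)
for name, avg in averages.items():
    print(f"{name}: {avg:.4f}")
```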