uptrain-ai · shrjain1312 · Apr 4, 2024
diff --git a/README.md b/README.md
@@ -215,6 +215,8 @@ Speak directly with the maintainers of UpTrain by [booking a call here](https://
 | Eval | Description |
 | ---- | ----------- |
 |[Sub-Query Completeness](https://docs.uptrain.ai/predefined-evaluations/query-quality/sub-query-completeness) | Evaluate whether all of the sub-questions generated from a user's query, taken together, cover all aspects of the user's query or not |
+| [Multi-Query Accuracy](https://docs.uptrain.ai/predefined-evaluations/query-quality/multi-query-accuracy) | Evaluate whether the variants generated accurately represent the original query |
+
 
 <br />
 

diff --git a/docs/mint.json b/docs/mint.json
@@ -128,7 +128,8 @@
         {
           "group": "Query Clarity Evals",
           "pages": [
-            "predefined-evaluations/query-quality/sub-query-completeness"
+            "predefined-evaluations/query-quality/sub-query-completeness",
+            "predefined-evaluations/query-quality/multi-query-accuracy"
           ]
         },
         {

diff --git a/docs/predefined-evaluations/overview.mdx b/docs/predefined-evaluations/overview.mdx
@@ -60,6 +60,7 @@ You can choose evals as per your needs. We have divided them into a few categori
 | Eval | Description |
 | ---- | ----------- |
 |[Sub-query Completeness](/predefined-evaluations/query-quality/sub-query-completeness) | Evaluate if the list of generated sub-questions comprehensively cover all aspects of the main question. |
+|[Multi-query Accuracy](/predefined-evaluations/query-quality/multi-query-accuracy) | Evaluate how accurately the generated variations of the query represent the original question. |
   </Accordion>
 
   <Accordion title="Code Related Evals">

diff --git a/docs/predefined-evaluations/query-quality/multi-query-accuracy.mdx b/docs/predefined-evaluations/query-quality/multi-query-accuracy.mdx
@@ -0,0 +1,83 @@
+---
+title: Multi-Query Accuracy
+description: Evaluates how accurately the generated variations of the query represent the original question.
+---
+
+Columns required:
+- `question`: The question asked by the user
+- `variants`: Variants (rephrasings) of the question generated from it
+
+### How to use it?
+
+```python
+from uptrain import EvalLLM, Evals
+
+OPENAI_API_KEY = "sk-********************"  # Insert your OpenAI key here
+
+data = [
+    {
+        'question': 'How does the stock market work?',
+        'variants': '1. What is the stock market?\n 2. How does the stock market function?\n 3. What is the purpose of the stock market?'        
+    },
+    {
+        'question': 'How does the stock market work?',
+        'variants': '1. What is the stock market?'        
+    }
+]
+
+eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)
+
+res = eval_llm.evaluate(
+    data = data,
+    checks = [Evals.MULTI_QUERY_ACCURACY]
+)
+```
+<Info>By default, we are using GPT 3.5 Turbo for evaluations. If you want to use a different model, check out this [tutorial](https://github.com/uptrain-ai/uptrain/blob/main/examples/open_source_evaluator_tutorial.ipynb).</Info>
+
+Sample Response:
+```json
+[
+   {
+      "question": "How does the stock market work?",
+      "variants": "1. What is the stock market?\n 2. How does the stock market function?\n 3. What is the purpose of the stock market?",
+      "score_multi_query_accuracy": 1.0,
+      "explanation_multi_query_accuracy": "{\n    \"Reasoning\": \"The response provides accurate and relevant information about the functioning and purpose of the stock market, addressing the various aspects of the question across different queries. It covers the definition of the stock market, its functioning, and its purpose, demonstrating a comprehensive understanding of the topic.\",\n    \"Choice\": \"A\"\n}"
+   },
+   {
+      "question": "How does the stock market work?",
+      "variants": "1. What is the stock market?",
+      "score_multi_query_accuracy": 0.0,
+      "explanation_multi_query_accuracy": "{\n    \"Reasoning\": \"The given variation does not directly address the main causes of climate change, but rather focuses on defining the stock market. It does not cover the aspects of how the stock market works, such as trading, investment, and market dynamics.\",\n    \"Choice\": \"C\"\n}"
+   }
+]
+```
+
+<Note>A higher Multi-Query Accuracy score reflects that the generated variants accurately represent the main question. A lower score indicates that the variants do not cover all the aspects of the main question.</Note>
+
+### How it works?
+
+We evaluate Multi-Query Accuracy by determining which of the following three cases applies to the given task data:
+
+* The given variations mean the same as the original question.
+* The given variations partially mean the same as the original question.
+* The given variations do not mean the same as the original question.
+
+
+<CardGroup cols={2}>
+  <Card
+    title="Tutorial"
+    href="https://github.com/uptrain-ai/uptrain/blob/main/examples/checks/query_quality/multi_query_accuracy.ipynb"
+    icon="github"
+    color="#808080"
+  >
+    Open this tutorial in GitHub
+  </Card>
+  <Card
+    title="Have Questions?"
+    href="https://join.slack.com/t/uptraincommunity/shared_invite/zt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg"
+    icon="slack"
+    color="#808080"
+  >
+    Join our community for any questions or requests
+  </Card>
+</CardGroup>
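The three cases listed under "How it works?" in `multi-query-accuracy.mdx` can be read as a choice-to-score mapping. The sketch below is an illustration only, not UpTrain's implementation. The sample response confirms that choice `A` yields 1.0 and choice `C` yields 0.0; the 0.5 score for the partial case `B` is an assumption, and the helper names `score_from_explanation` and `failing_rows` are hypothetical.

```python
import json

# A -> 1.0 and C -> 0.0 match the sample response in the docs.
# B -> 0.5 (partial match) is an ASSUMPTION, not confirmed by the source.
CHOICE_TO_SCORE = {"A": 1.0, "B": 0.5, "C": 0.0}


def score_from_explanation(explanation: str) -> float:
    """Parse the JSON-encoded explanation string and map its Choice to a score."""
    parsed = json.loads(explanation)
    return CHOICE_TO_SCORE[parsed["Choice"]]


def failing_rows(results, threshold=1.0):
    """Return rows whose Multi-Query Accuracy score is below the threshold."""
    return [
        row for row in results
        if row["score_multi_query_accuracy"] < threshold
    ]


# Rows shaped like the sample response (explanations shortened for brevity).
results = [
    {
        "question": "How does the stock market work?",
        "variants": "1. What is the stock market?\n 2. How does the stock market function?",
        "score_multi_query_accuracy": 1.0,
        "explanation_multi_query_accuracy": '{\n    "Reasoning": "...",\n    "Choice": "A"\n}',
    },
    {
        "question": "How does the stock market work?",
        "variants": "1. What is the stock market?",
        "score_multi_query_accuracy": 0.0,
        "explanation_multi_query_accuracy": '{\n    "Reasoning": "...",\n    "Choice": "C"\n}',
    },
]

for row in results:
    print(row["variants"], "->", score_from_explanation(row["explanation_multi_query_accuracy"]))
```

Filtering with `failing_rows(results)` is one way to surface the queries whose variants drifted from the original question, so they can be regenerated or reviewed by hand.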
diff --git a/docs/predefined-evaluations/query-quality/sub-query-completeness.mdx b/docs/predefined-evaluations/query-quality/sub-query-completeness.mdx
@@ -7,7 +7,7 @@ Sub-Query Completeness checks whether the sub-queries generated from a question
 
 Columns required:
 - `question`: The question asked by the user
-- `sub_question`: Sub questions generated from the question
+- `sub_questions`: Sub questions generated from the question
 
 ### How to use it?
 
@@ -17,10 +17,10 @@ from uptrain import EvalLLM, Evals
 OPENAI_API_KEY = "sk-********************"  # Insert your OpenAI key here
 
 data = [
-    {
-        'question': 'What is the Taj Mahal? When was it built, where and by whom',
-        'sub_questions': '1. What is the Taj Mahal? '        
-    }
+  {
+    'question': 'What is the Taj Mahal? When was it built, where and by whom?',
+    'sub_questions': '1. What is the Taj Mahal? 2. When was the Taj Mahal built? 3. Where is the Taj Mahal? 4. Who built the Taj Mahal?'        
+  }
 ]
 
 eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)
@@ -35,15 +35,17 @@ res = eval_llm.evaluate(
 Sample Response:
 ```json
 [
-   {
-      "score_sub_query_completeness": 0.0,
-      "explanation_sub_query_completeness": "Step by step reasoning:\n\n1. The main question is \"What is the Taj Mahal? When was it built, where and by whom.\"\n2. The sub-question provided is \"What is the Taj Mahal?\"\n3. The sub-question does not cover the aspects of when it was built, where, and by whom.\n4. The sub-question collectively does not cover any aspects of the main question.\n\n[Choice]: (C) Sub Questions collectively does not cover any aspects of the main question.\n[Explanation]: The sub-question provided collectively does not cover any aspects of the main question."
-   }
+  {
+    "question": "What is the Taj Mahal? When was it built, where and by whom?",
+    "sub_questions": "1. What is the Taj Mahal? 2. When was the Taj Mahal built? 3. Where is the Taj Mahal? 4. Who built the Taj Mahal?",
+    "score_sub_query_completeness": 1.0,
+    "explanation_sub_query_completeness": "Step by step reasoning:\n\n1. What is the Taj Mahal? - This sub-question covers the aspect of understanding what the Taj Mahal is, providing information about its nature and purpose.\n2. When was the Taj Mahal built? - This sub-question covers the aspect of the time of construction, addressing the historical timeline of the Taj Mahal's creation.\n3. Where is the Taj Mahal? - This sub-question covers the aspect of location, providing information about the geographical placement of the Taj Mahal.\n4. Who built the Taj Mahal? - This sub-question covers the aspect of the creator, addressing the individuals or entities responsible for the construction of the Taj Mahal.\n\nConclusion:\nThe sub-questions collectively cover all the aspects of the main question.\n\n[Choice]: (A) Sub Questions collectively all the aspects of the main question."
+  }
 ]
 ```
 <Note>A higher Sub-Query Completeness score reflects that the generated sub-questions cover all aspects of the question asked.</Note>
 
-The `sub_question` does not contain some parts of the `question` such as: "When was the Taj Mahal?", "Who built the Taj Mahal?", "Where is the the Taj Mahal?"
+The `sub_questions` do not contain some parts of the `question` such as: "When was the Taj Mahal built?", "Who built the Taj Mahal?", "Where is the Taj Mahal?"
 
 Resulting in low Sub-Query Completeness score.
 
@@ -59,7 +61,7 @@ We evaluate Sub-Query Completeness by determining which of the following three c
 <CardGroup cols={2}>
   <Card
     title="Tutorial"
-    href="https://github.com/uptrain-ai/uptrain/blob/main/examples/checks/sub_query/sub_query_completeness.ipynb"
+    href="https://github.com/uptrain-ai/uptrain/blob/main/examples/checks/query_quality/sub_query_completeness.ipynb"
     icon="github"
     color="#808080"
   >

diff --git a/examples/checks/README.md b/examples/checks/README.md
@@ -87,3 +87,4 @@
 | Eval | Description |
 | ---- | ----------- |
 |[Sub-Query Completeness](https://docs.uptrain.ai/predefined-evaluations/query-quality/sub-query-completeness) | Evaluate whether all of the sub-questions generated from a user's query, taken together, cover all aspects of the user's query or not |
+| [Multi-Query Accuracy](https://docs.uptrain.ai/predefined-evaluations/query-quality/multi-query-accuracy) | Evaluate whether the variants generated accurately represent the original query |
diff --git a/examples/checks/sub_query/README.md → examples/checks/query_quality/README.md b/examples/checks/sub_query/README.md → examples/checks/query_quality/README.md
@@ -25,3 +25,4 @@
 | Eval | Description |
 | ---- | ----------- |
 |[Sub-Query Completeness](https://docs.uptrain.ai/predefined-evaluations/query-quality/sub-query-completeness) | Evaluate whether all of the sub-questions generated from a user's query, taken together, cover all aspects of the user's query or not |
+| [Multi-Query Accuracy](https://docs.uptrain.ai/predefined-evaluations/query-quality/multi-query-accuracy) | Evaluate whether the variants generated accurately represent the original query |