Retrieval Augmented Generation with LLM Demo (#16)

- Added a new RAG + prompt + LLM UI (demo).
- Added an example config and notebook.
- Updated main README with "updates" sub-section.
- Updated `run_demo.py` to include all the options to run a demo (UI, UI + service, UI + <user_defined_service>).
IntelLabs · May 21, 2023 · af86bc2
1 parent 056c1c6
commit af86bc2
Showing 13 changed files with 776 additions and 186 deletions.
diff --git a/README.md b/README.md
diff --git a/config/rag_generation_with_dynamic_prompt.yaml b/config/rag_generation_with_dynamic_prompt.yaml
@@ -0,0 +1,50 @@
+components:
+- name: Store
+  params:
+    host: <index ip>
+    index: <index name>
+    port: 80
+    search_fields: ["title", "content"]
+  type: ElasticsearchDocumentStore
+- name: Retriever
+  params:
+    document_store: Store
+    top_k: 100
+  type: BM25Retriever
+- name: Reranker
+  params:
+    batch_size: 32
+    model_name_or_path: cross-encoder/ms-marco-MiniLM-L-6-v2
+    top_k: 5
+    use_gpu: true
+  type: SentenceTransformersRanker
+- name: AParser
+  type: AnswerParser
+- name: LFQA
+  params:
+    name: lfqa
+    prompt_text: "Answer the question using the provided context. Your answer should be in your own words and be no longer than 50 words. \n\n Context: {join(documents)} \n\n Question: {query} \n\n Answer:"
+    output_parser: AParser
+  type: PromptTemplate
+- name: Prompter
+  params:
+    model_name_or_path: MBZUAI/LaMini-Flan-T5-783M
+    use_gpu: true
+    model_kwargs:
+      model_max_length: 2048
+      torch_dtype: torch.bfloat16
+    default_prompt_template: LFQA
+  type: PromptNode
+pipelines:
+- name: query
+  nodes:
+  - inputs:
+    - Query
+    name: Retriever
+  - inputs:
+    - Retriever
+    name: Reranker
+  - inputs:
+    - Reranker
+    name: Prompter
+version: 1.17.0
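
The `prompt_text` in the config above relies on two placeholders, `{join(documents)}` and `{query}`. The snippet below is a minimal, standalone sketch of how such a template could expand. It is not Haystack's implementation: `render_prompt` is a hypothetical helper, and joining documents with a single space is an assumption made for illustration.

```python
import re
from typing import Sequence

# Same wording as the `prompt_text` of the LFQA template in the config.
TEMPLATE = (
    "Answer the question using the provided context. "
    "Your answer should be in your own words and be no longer than 50 words. "
    "\n\n Context: {join(documents)} \n\n Question: {query} \n\n Answer:"
)


def render_prompt(template: str, query: str, documents: Sequence[str]) -> str:
    """Fill the two placeholders in a single pass (hypothetical helper)."""
    values = {
        "{join(documents)}": " ".join(documents),  # assumed separator
        "{query}": query,
    }
    # One regex pass: text that was substituted in is never scanned again,
    # so placeholder-like strings inside a document or query stay literal.
    pattern = re.compile("|".join(re.escape(key) for key in values))
    return pattern.sub(lambda match: values[match.group(0)], template)


if __name__ == "__main__":
    print(render_prompt(TEMPLATE, "Who is Messi?", ["Doc one.", "Doc two."]))
```

With `top_k: 5` on the reranker, the context in the real pipeline would hold up to five reranked documents, so the combined length has to fit within the `model_max_length` of 2048 set for the `Prompter`.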
diff --git a/demo/README.md b/demo/README.md
@@ -1,26 +1,50 @@
 # Running Demos
 
-To run a demo, use its config name; for example:
+To execute a demo, use its configuration name. For instance:
 
 ```sh
 python run_demo.py -t QA1
 ```
 
-The server and UI are are created as subprocesses that run in the background. Use the PIDs to kill them.
+The server and UI will be spawned as subprocesses that run in the background. You can use the PIDs (Process IDs) to terminate them when needed.
 
-Use the  `--help` flag for a list of available configurations.
+To obtain a list of available configurations, utilize the `--help` flag.
 
 ## Available Demos
 
-| Name    | Comment                                                                             | Config Name |
-|:--------|:------------------------------------------------------------------------------------|:-----------:|
-| Q&A     | Abstractive Q&A demo using BM25, SBERT reranker and an FiD.                         | `QA1`       |
-| Q&A     | Abstractive Q&A demo using ColBERT v2 (w/ PLAID index) retriever and an FiD reader. | `QA2`       |
-| Summary | Summarization using BM25, SBERT reranker and long-T5 reader                         | `SUM`       |
-| Image   | Abstractive Q&A demo, with an image generation model for the answer.                | `QADIFF`    |
+| Name          | Description                                                                        | Config Name |
+|:--------------|:-----------------------------------------------------------------------------------|:-----------:|
+| Q&A           | Abstractive Q&A demo using BM25, SBERT reranker, and FiD model.                    | `QA1`       |
+| Q&A           | Abstractive Q&A demo using ColBERT v2 (with PLAID index) retriever and FiD reader. | `QA2`       |
+| Summarization | Summarization demo using BM25, SBERT reranker, and long-T5 reader.                 | `SUMR`      |
+| Image         | Abstractive Q&A demo with an image generation model for the answer.                | `QADIFF`    |
+| LLM           | Retrieval-augmented generation with a generative LLM.                              | `LLM`       |
 
-ColBERT demo with a wikipedia index takes about 15 minutes to load up. Also, see remark about GPU usage in the [README](../README.md#plaid-requirements).
+Please note that the ColBERT demo with a Wikipedia index may take around 15 minutes to load. Also, see the PLAID requirements in [models.md](../models.md#plaid-requirements) for information on GPU usage.
 
-## Demo Screenshot
+### Additional Options
+
+If you already have a fastRAG pipeline service running locally and want to use it with one of the provided UIs, add the `--only-ui` flag to the demo script:
+
+```sh
+python run_demo.py -t LLM --only-ui
+```
+
+If your pipeline service runs on a remote machine or on a port other than 8000, use the `--endpoint` argument to specify its URL:
+
+```sh
+python run_demo.py -t LLM --endpoint http://hostname:80
+```
+
+To manually run a UI with the `API_ENDPOINT` directed to a fastRAG service, you can execute the following command:
+
+```bash
+API_ENDPOINT=http://localhost:8000 \
+             python -m streamlit run fastrag/ui/webapp.py
+```
+
+Make sure to replace `http://localhost:8000` with the appropriate URL of your fastRAG service.
+
+## Screenshot
 
 ![alt text](../assets/qa_demo.png)
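
All of the options described in this README end up setting the `API_ENDPOINT` environment variable for the UI process. How the UI modules read it is not shown in this diff, so the following is only a sketch of one plausible approach: `resolve_endpoint` is a hypothetical helper, and the fallback to `http://localhost:8000` mirrors the default of the `--endpoint` argument.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse

DEFAULT_ENDPOINT = "http://localhost:8000"  # same default as --endpoint


def resolve_endpoint(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the service URL from API_ENDPOINT, or the default if unset."""
    environ = os.environ if environ is None else environ
    # Drop trailing slashes so that paths can be appended consistently.
    endpoint = environ.get("API_ENDPOINT", DEFAULT_ENDPOINT).rstrip("/")
    parsed = urlparse(endpoint)
    # Fail early on values such as "localhost:8000" that lack a scheme.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"API_ENDPOINT is not a valid http(s) URL: {endpoint!r}")
    return endpoint


if __name__ == "__main__":
    print(resolve_endpoint())
```

Validating the value up front gives a clearer error than a failed request later, which helps when the variable is typed by hand as in the manual `streamlit` command.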
diff --git a/demo/run_demo.py b/demo/run_demo.py
@@ -8,13 +8,15 @@
     "QA2": "qa_plaid.yaml",
     "QADIFF": "qa_diffusion_pipeline.yaml",
     "SUMR": "summarization_pipeline.yaml",
+    "LLM": "rag_generation_with_dynamic_prompt.yaml",
 }
 
 SCREENS = {
     "QA1": "webapp",
     "QA2": "webapp",
     "QADIFF": "webapp",
     "SUMR": "webapp_summarization",
+    "LLM": "prompt_llm",
 }
 
 
@@ -40,27 +42,37 @@ def get_pid(cmd):
         choices=list(TASKS.keys()),
         help=f"The abbreviated name for the task configuration. \n {TASKS} \n",
     )
+    parser.add_argument(
+        "-e", "--endpoint", default="http://localhost:8000", help="pipeline service endpoint"
+    )
+    parser.add_argument(
+        "--only-ui",
+        action="store_true",
+        help="launch only the UI interface (without launching a service)",
+    )
 
     args = parser.parse_args()
     path = os.getcwd()
 
-    # Create REST server
-    cmd = f"python -m fastrag.rest_api.application --config={path}/config/TASKCONFIGURATION"
-    cmd = cmd.replace("TASKCONFIGURATION", TASKS[args.task_config])
-    run_service(cmd)
+    s_pid = "NA"
+    if not args.only_ui:
+        # Create REST server
+        cmd = f"python -m fastrag.rest_api.application --config={path}/config/TASKCONFIGURATION"
+        cmd = cmd.replace("TASKCONFIGURATION", TASKS[args.task_config])
+        print("Launching fastRAG pipeline service...")
+        run_service(cmd)
+        time.sleep(10)
+        s_pid = get_pid("fastrag.rest_api.application")
 
     # Create UI
-    os.environ["API_ENDPOINT"] = "http://localhost:8000"
+    os.environ["API_ENDPOINT"] = f"{args.endpoint}"
     cmd = f"python -m streamlit run {path}/fastrag/ui/SCREEN.py"
     cmd = cmd.replace("SCREEN", SCREENS[args.task_config])
+    print("Launching UI...")
+    time.sleep(3)
     run_service(cmd)
-
-    # Sleep and wait for initialization, pids
-    print("Creating services...")
-    time.sleep(10)
-    s_pid = get_pid("fastrag.rest_api.application")
     u_pid = get_pid("streamlit run")
 
     print("\n")
-    print(f"Server on  localhost:8000/docs   PID={s_pid}")
+    print(f"Server on  {args.endpoint}/docs  PID={s_pid}")
     print(f"UI on      localhost:8501        PID={u_pid}")
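
The launch logic after this change can be summarised as: start the REST service unless `--only-ui` is given, then always start the UI with `API_ENDPOINT` set to `--endpoint`. The sketch below reproduces only that decision and returns the command strings without launching anything. `build_parser` and `plan_commands` are hypothetical names, and `QA1` is left out of `TASKS` because its config file name is not visible in this hunk.

```python
import argparse
from typing import List

# Entries copied from the hunk above (QA1's config file is not shown there).
TASKS = {
    "QA2": "qa_plaid.yaml",
    "QADIFF": "qa_diffusion_pipeline.yaml",
    "SUMR": "summarization_pipeline.yaml",
    "LLM": "rag_generation_with_dynamic_prompt.yaml",
}
SCREENS = {
    "QA2": "webapp",
    "QADIFF": "webapp",
    "SUMR": "webapp_summarization",
    "LLM": "prompt_llm",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("-t", dest="task_config", choices=list(TASKS), required=True)
    parser.add_argument("-e", "--endpoint", default="http://localhost:8000")
    parser.add_argument("--only-ui", action="store_true")
    return parser


def plan_commands(args: argparse.Namespace, path: str) -> List[str]:
    """Return the commands the demo script would launch, in order."""
    commands = []
    if not args.only_ui:
        # The service is skipped only when --only-ui is passed.
        config = f"{path}/config/{TASKS[args.task_config]}"
        commands.append(f"python -m fastrag.rest_api.application --config={config}")
    screen = f"{path}/fastrag/ui/{SCREENS[args.task_config]}.py"
    commands.append(f"python -m streamlit run {screen}")
    return commands


if __name__ == "__main__":
    demo_args = build_parser().parse_args(["-t", "LLM", "--only-ui"])
    print(plan_commands(demo_args, "/repo"))
```

As the diff reads, passing `--endpoint` on its own changes where the UI points but does not skip the local service; combining it with `--only-ui` avoids starting a service that the UI will not use.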
diff --git a/examples/rag-prompt-hf.ipynb b/examples/rag-prompt-hf.ipynb
@@ -0,0 +1,164 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "7af1bfad",
+   "metadata": {},
+   "source": [
+    "# Retrieval Augmented Generation with LLMs"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "be7a0c4a",
+   "metadata": {},
+   "source": [
+    "Define an information source to retrieve from"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "359f06de",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from haystack.schema import Document\n",
+    "from haystack.document_stores import InMemoryDocumentStore\n",
+    "\n",
+    "document_store = InMemoryDocumentStore(use_gpu=False, use_bm25=True)\n",
+    "\n",
+    "# 4 example documents to index\n",
+    "examples = [\n",
+    "    \"Lionel Andrés Messi[note 1] (Spanish pronunciation: [ljoˈnel anˈdɾes ˈmesi] (listen); born 24 June 1987), also known as Leo Messi, is an Argentine professional footballer who plays as a forward for Ligue 1 club Paris Saint-Germain and captains the Argentina national team. Widely regarded as one of the greatest players of all time, Messi has won a record seven Ballon d'Or awards,[note 2] a record six European Golden Shoes, and in 2020 was named to the Ballon d'Or Dream Team. Until leaving the club in 2021, he had spent his entire professional career with Barcelona, where he won a club-record 35 trophies, including 10 La Liga titles, seven Copa del Rey titles and four UEFA Champions Leagues. With his country, he won the 2021 Copa América and the 2022 FIFA World Cup. A prolific goalscorer and creative playmaker, Messi holds the records for most goals in La Liga (474), most hat-tricks in La Liga (36) and the UEFA Champions League (8), and most assists in La Liga (192) and the Copa América (17). He has also the most international goals by a South American male (98). Messi has scored over 795 senior career goals for club and country, and has the most goals by a player for a single club (672).\",\n",
+    "    \"Born and raised in central Argentina, Messi relocated to Spain at the age of 13 to join Barcelona, for whom he made his competitive debut aged 17 in October 2004. He established himself as an integral player for the club within the next three years, and in his first uninterrupted season in 2008–09 he helped Barcelona achieve the first treble in Spanish football; that year, aged 22, Messi won his first Ballon d'Or. Three successful seasons followed, with Messi winning four consecutive Ballons d'Or, making him the first player to win the award four times. During the 2011–12 season, he set the La Liga and European records for most goals scored in a single season, while establishing himself as Barcelona's all-time top scorer. The following two seasons, Messi finished second for the Ballon d'Or behind Cristiano Ronaldo (his perceived career rival), before regaining his best form during the 2014–15 campaign, becoming the all-time top scorer in La Liga and leading Barcelona to a historic second treble, after which he was awarded a fifth Ballon d'Or in 2015. Messi assumed captaincy of Barcelona in 2018, and in 2019 he won a record sixth Ballon d'Or. Out of contract, he signed for Paris Saint-Germain in August 2021.\",\n",
+    "    \"An Argentine international, Messi holds the national record for appearances and is also the country's all-time leading goalscorer. At youth level, he won the 2005 FIFA World Youth Championship, finishing the tournament with both the Golden Ball and Golden Shoe, and an Olympic gold medal at the 2008 Summer Olympics. His style of play as a diminutive, left-footed dribbler drew comparisons with his compatriot Diego Maradona, who described Messi as his successor. After his senior debut in August 2005, Messi became the youngest Argentine to play and score in a FIFA World Cup in 2006, and reached the final of the 2007 Copa América, where he was named young player of the tournament. As the squad's captain from August 2011, he led Argentina to three consecutive finals: the 2014 FIFA World Cup, for which he won the Golden Ball, and the 2015 and 2016 Copa América, winning the Golden Ball in the 2015 edition. After announcing his international retirement in 2016, he reversed his decision and led his country to qualification for the 2018 FIFA World Cup, a third-place finish at the 2019 Copa América, and victory in the 2021 Copa América, while winning the Golden Ball and Golden Boot for the latter. This achievement would see him receive a record seventh Ballon d'Or in 2021. In 2022, he captained his country to win the 2022 FIFA World Cup, for which he won the Golden Ball for a record second time, and broke the record for most appearances in World Cup tournaments with 26 matches played.\",\n",
+    "    \"Messi has endorsed sportswear company Adidas since 2006. According to France Football, he was the world's highest-paid footballer for five years out of six between 2009 and 2014, and was ranked the world's highest-paid athlete by Forbes in 2019 and 2022. Messi was among Time's 100 most influential people in the world in 2011 and 2012. In February 2020, he was awarded the Laureus World Sportsman of the Year, thus becoming the first footballer and the first team sport athlete to win the award. Later that year, Messi became the second footballer and second team-sport athlete to surpass $1 billion in career earnings.\",\n",
+    "    \n",
+    "]\n",
+    "\n",
+    "documents = []\n",
+    "for i, d in enumerate(examples):\n",
+    "    documents.append(Document(content=d, id=i))\n",
+    "\n",
+    "document_store.write_documents(documents)"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "id": "7e653e5f",
+   "metadata": {},
+   "source": [
+    "Define the prompt template. `{query}` is replaced with the user's query and `{documents}` with the documents fetched by the retriever.\n",
+    "\n",
+    "We define a `PromptNode` that loads a Hugging Face model specified by `model_name_or_path`.\n",
+    "\n",
+    "`{documents}` can be passed through a small helper function such as `join()`, which concatenates the retrieved documents into a single context string."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6443e7a3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import torch\n",
+    "from haystack.nodes import  PromptNode, PromptTemplate\n",
+    "from haystack.nodes import BM25Retriever, SentenceTransformersRanker\n",
+    "\n",
+    "retriever = BM25Retriever(document_store=document_store, top_k=100)\n",
+    "reranker = SentenceTransformersRanker(model_name_or_path=\"cross-encoder/ms-marco-MiniLM-L-12-v2\", top_k=1)\n",
+    "\n",
+    "\n",
+    "lfqa_prompt = PromptTemplate(name=\"lfqa\",\n",
+    "                             prompt_text=\"Answer the question using the provided context. Your answer should be in your own words and be no longer than 50 words. \\n\\n Context: {join(documents)} \\n\\n Question: {query} \\n\\n Answer:\",\n",
+    "                             output_parser={\"type\": \"AnswerParser\"}) \n",
+    "prompt = PromptNode(model_name_or_path=\"MBZUAI/LaMini-Flan-T5-783M\", default_prompt_template=lfqa_prompt,\n",
+    "                    model_kwargs={\"model_max_length\": 2048, \"torch_dtype\": torch.bfloat16},)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "04408d03",
+   "metadata": {},
+   "source": [
+    "Defining the pipeline"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "4652b226",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from haystack import Pipeline\n",
+    "p = Pipeline()\n",
+    "p.add_node(component=retriever, name=\"Retriever\", inputs=[\"Query\"])\n",
+    "p.add_node(component=reranker, name=\"Reranker\", inputs=[\"Retriever\"])\n",
+    "p.add_node(component=prompt, name=\"prompt_node\", inputs=[\"Reranker\"])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0f842709",
+   "metadata": {},
+   "source": [
+    "Run a query through the pipeline and print the generated answer"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "3dd989ac",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Messi has won a club-record 35 trophies, including 10 La Liga titles, seven Copa del Rey titles, and four UEFA Champions Leagues. He has also won the 2021 Copa América and the 2022 FIFA World Cup.'"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "a = p.run(\"What trophies does Messi has?\", debug=True)\n",
+    "a['answers'][0].answer"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "65963ad0-ac72-4073-ad8d-cf3d459ea5d5",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.11"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/fastrag/__init__.py b/fastrag/__init__.py
@@ -4,7 +4,7 @@
 from fastrag import image_generators, kg_creators, rankers, readers, retrievers, stores
 from fastrag.utils import add_timing_to_pipeline
 
-__version__ = "1.1.0"
+__version__ = "1.2.0"
 
 
 def load_pipeline(config_path: str) -> Pipeline: