diff --git a/integrations/langchain/news-search-langchain.ipynb b/integrations/langchain/news-search-langchain.ipynb
new file mode 100644
index 00000000..da15d044
--- /dev/null
+++ b/integrations/langchain/news-search-langchain.ipynb
@@ -0,0 +1,248 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "685e5fb7",
+   "metadata": {},
+   "source": [
+    "# News search using Hopsworks and LangChain"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "541a9ee1",
+   "metadata": {},
+   "source": [
+    "In this tutorial, you will learn how to create a news search bot that answers users' questions about the news, using OpenSearch in Hopsworks together with LangChain. Concretely, you will create a RAG (Retrieval-Augmented Generation) application that retrieves news matching a user's question and answers the question using an LLM, with the retrieved news as context.\n",
+    "The steps include:\n",
+    "1. [Ingest news data into Hopsworks](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/hsfs/knn_search/news-search-knn.ipynb)\n",
+    "2. Set up a vector store in LangChain using OpenSearch in Hopsworks\n",
+    "3. Create an LLM using a model from Hugging Face\n",
+    "4. Create a RAG application using the `RetrievalQA` chain in LangChain"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "014fe398-8337-44d7-a71a-334b884d5ebf",
+   "metadata": {},
+   "source": [
+    "## Prerequisites"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ba83a905-3944-4bf1-b4d7-43d4336f0beb",
+   "metadata": {},
+   "source": [
+    "You first need to run this [notebook](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/hsfs/knn_search/news-search-knn.ipynb) to ingest the news data into Hopsworks."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b4621965",
+   "metadata": {},
+   "source": [
+    "## Set up a vector store in LangChain using OpenSearch in Hopsworks"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5c55b995",
+   "metadata": {},
+   "source": [
+    "First, you need to get the OpenSearch configuration and the name of the vector store index from Hopsworks."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "942caffc",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import hopsworks\n",
+    "from hsfs.core.opensearch_api import OpenSearchApi\n",
+    "\n",
+    "proj = hopsworks.login()\n",
+    "fs = proj.get_feature_store()\n",
+    "\n",
+    "# Fetch the OpenSearch client configuration from Hopsworks and\n",
+    "# reshape it into the format expected by LangChain\n",
+    "opensearch_config = OpenSearchApi(project_id=proj.id, project_name=proj.name).get_default_py_config()\n",
+    "opensearch_config[\"opensearch_url\"] = f'{opensearch_config[\"hosts\"][0][\"host\"]}:{opensearch_config[\"hosts\"][0][\"port\"]}'\n",
+    "opensearch_config.pop(\"hosts\")\n",
+    "\n",
+    "# `news_fg.embedding_index.index_name` returns the index name\n",
+    "news_fg = fs.get_feature_group(\n",
+    "    name=\"news_fg\",\n",
+    "    version=1,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5cf29f55",
+   "metadata": {},
+   "source": [
+    "Then, you can set up the vector store in LangChain using this configuration and the embedding model that was used to generate the embeddings in the feature group.\n",
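+    "\n",
+    "Note that this must be the same embedding model that embedded the news articles in the ingestion notebook (here, the sentence-transformers model `all-MiniLM-L6-v2` is assumed); query vectors produced by a different model would not match the indexed vectors."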
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "efd8c222",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_community.vectorstores import OpenSearchVectorSearch\n",
+    "from langchain.embeddings import SentenceTransformerEmbeddings\n",
+    "\n",
+    "embeddings = SentenceTransformerEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
+    "\n",
+    "docsearch = OpenSearchVectorSearch(\n",
+    "    index_name=news_fg.embedding_index.index_name,\n",
+    "    embedding_function=embeddings,\n",
+    "    **opensearch_config\n",
+    ")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "40fd60d5",
+   "metadata": {},
+   "source": [
+    "## Create an LLM using a model from Hugging Face"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b7016353",
+   "metadata": {},
+   "source": [
+    "Next, you need to load an LLM from Hugging Face. You can pick any model on the Hugging Face Hub. To accelerate inference, you can load the model onto a GPU if one is available."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "401e21c4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.llms import HuggingFacePipeline\n",
+    "from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline\n",
+    "import torch\n",
+    "\n",
+    "# Check for GPU availability and set the device\n",
+    "if torch.cuda.is_available():\n",
+    "    device = torch.device(\"cuda\")\n",
+    "    print(\"GPU is available!\")\n",
+    "else:\n",
+    "    device = torch.device(\"cpu\")\n",
+    "    print(\"GPU is not available, using CPU.\")\n",
+    "\n",
+    "# Load the TinyLlama chat model (replace with your preferred model)\n",
+    "model_name = \"TinyLlama/TinyLlama-1.1B-Chat-v1.0\"\n",
+    "tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
+    "model = AutoModelForCausalLM.from_pretrained(model_name).to(device)\n",
+    "pipe = pipeline(\"text-generation\", model=model, tokenizer=tokenizer, torch_dtype=torch.bfloat16, device=device)\n",
+    "llm = HuggingFacePipeline(pipeline=pipe)"
+   ]
+  },
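+  {
+   "cell_type": "markdown",
+   "id": "7a1c9e2b",
+   "metadata": {},
+   "source": [
+    "Optionally, you can sanity-check the two components before wiring them together. The cell below is a minimal sketch: the query string and prompt are arbitrary, and the `vector_field`/`text_field` arguments assume the same `news_fg` feature names that the `RetrievalQA` chain uses later in this notebook."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3c5d7f9a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Retrieval check: the vector store should return a matching news document\n",
+    "docs = docsearch.similarity_search(\n",
+    "    \"France\",\n",
+    "    k=1,\n",
+    "    vector_field=news_fg.embedding_body.name,\n",
+    "    text_field=news_fg.article.name,\n",
+    ")\n",
+    "print(docs[0].page_content[:300])\n",
+    "\n",
+    "# Generation check: the LLM pipeline should produce text\n",
+    "print(llm(\"What is the capital of France?\"))"
+   ]
+  },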
\n", + "If you don't know the answer, just say that you don't know, don't try to make up an answer.\n", + "context:\n", + "{context} \n", + "<|user|>\n", + "{question}\n", + "<|assistant|>\n", + "\"\"\"\n", + "QA_CHAIN_PROMPT = PromptTemplate(\n", + " input_variables=[\"context\", \"question\"],\n", + " template=template,\n", + ")\n", + "\n", + "qa_chain = RetrievalQA.from_chain_type(\n", + " llm,\n", + " retriever=docsearch.as_retriever(\n", + " search_kwargs={\n", + " \"vector_field\": news_fg.embedding_body.name, \n", + " \"text_field\": news_fg.article.name, \n", + " \"k\": 1}\n", + " ),\n", + " chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT},\n", + ")\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8b9f317f", + "metadata": {}, + "outputs": [], + "source": [ + "question = \"any news about France?\"\n", + "result = qa_chain({\"query\": question})\n", + "print(result[\"result\"])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "828fc9cd", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/integrations/langchain/requirements.txt b/integrations/langchain/requirements.txt new file mode 100644 index 00000000..2d0a080d --- /dev/null +++ b/integrations/langchain/requirements.txt @@ -0,0 +1,5 @@ +hsfs==3.7.0rc5 +hopsworks==3.7.0rc1 +sentence_transformers +torch +langchain \ No newline at end of file