# RedisVectorStore

This notebook demonstrates the usage of RedisVectorStore from the langchain-redis package. RedisVectorStore leverages Redis as a vector database, enabling efficient storage, retrieval, and similarity search of vector embeddings.

## Installation

First, we need to install the necessary packages. Run the following command to install langchain-redis, langchain-huggingface, sentence-transformers, and scikit-learn:

In [1]:
%pip install -qU langchain-redis langchain-huggingface sentence-transformers scikit-learn

Note: you may need to restart the kernel to use updated packages.


## Importing Required Libraries
We'll import the necessary libraries for our tasks:

In [2]:
# ruff: noqa: E501
import os

from langchain_core.documents import Document
from sklearn.datasets import fetch_20newsgroups

from langchain_redis import RedisVectorStore

## Setting up Redis Connection
To use RedisVectorStore, you need a running Redis instance. For this example, we assume a local Redis instance running on the default port. Modify the URL if your setup differs:

In [3]:
# Use the environment variable if set, otherwise default to localhost
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
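
A Redis URL packs the scheme, host, port, and optionally credentials and a database number into one string. The sketch below uses only the standard library to show what the default URL resolves to. `describe_redis_url` is a hypothetical helper written for illustration; it is not part of langchain-redis or redis-py.

```python
from urllib.parse import urlparse


def describe_redis_url(url: str) -> dict:
    """Split a Redis URL into its parts (hypothetical helper for illustration)."""
    parsed = urlparse(url)
    # The path component, if present, holds the logical database number ("/0").
    db = int(parsed.path.lstrip("/")) if parsed.path.strip("/") else 0
    return {
        "scheme": parsed.scheme,  # "redis" (plain) or "rediss" (TLS)
        "host": parsed.hostname,
        "port": parsed.port or 6379,  # 6379 is the default Redis port
        "db": db,
        "has_password": parsed.password is not None,
    }


info = describe_redis_url("redis://localhost:6379")
print(info)
```

If your instance uses TLS or authentication, the same helper can confirm that the URL you exported in `REDIS_URL` carries the parts you expect before you try to connect.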

Let's check that Redis is up and running by pinging it:

In [4]:
import redis

redis_client = redis.from_url(REDIS_URL)
redis_client.ping()

True

## Preparing Sample Data
We'll use a subset of the 20 Newsgroups dataset for this demonstration. This dataset contains newsgroup posts on various topics. We'll focus on two categories: 'alt.atheism' and 'sci.space':

In [5]:
categories = ["alt.atheism", "sci.space"]
newsgroups = fetch_20newsgroups(
    subset="train", categories=categories, shuffle=True, random_state=42
)

# Use only the first 250 documents
texts = newsgroups.data[:250]
metadata = [
    {"category": newsgroups.target_names[target]} for target in newsgroups.target[:250]
]

documents = [
    Document(page_content=text, metadata=meta) for text, meta in zip(texts, metadata)
]
len(documents)

250

Let's inspect the first document:

In [6]:
documents[0]

Document(metadata={'category': 'alt.atheism'}, page_content='From: bil@okcforum.osrhe.edu (Bill Conner)\nSubject: Re: Not the Omni!\nNntp-Posting-Host: okcforum.osrhe.edu\nOrganization: Okcforum Unix Users Group\nX-Newsreader: TIN [version 1.1 PL6]\nLines: 18\n\nCharley Wingate (mangoe@cs.umd.edu) wrote:\n: \n: >> Please enlighten me.  How is omnipotence contradictory?\n: \n: >By definition, all that can occur in the universe is governed by the rules\n: >of nature. Thus god cannot break them. Anything that god does must be allowed\n: >in the rules somewhere. Therefore, omnipotence CANNOT exist! It contradicts\n: >the rules of nature.\n: \n: Obviously, an omnipotent god can change the rules.\n\nWhen you say, "By definition", what exactly is being defined;\ncertainly not omnipotence. You seem to be saying that the "rules of\nnature" are pre-existant somehow, that they not only define nature but\nactually cause it. If that\'s what you mean I\'d like to hear your\nfurther thoughts on the q

## Creating Embeddings
We'll use a SentenceTransformers model (msmarco-distilbert-base-v4), loaded through HuggingFaceEmbeddings, to create embeddings. The model runs locally and doesn't require an API key:

In [7]:
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="msmarco-distilbert-base-v4")



## Basic Usage with LangChain's RedisVectorStore
Now we'll demonstrate basic usage of RedisVectorStore, including creating an instance, inserting data, and performing a simple similarity search.

### Creating a RedisVectorStore instance and inserting data
We'll create a RedisVectorStore instance and populate it with our sample data:

In [8]:
vector_store = RedisVectorStore.from_documents(
    documents,
    embeddings,
    redis_url=REDIS_URL,
    index_name="newsgroups",
    metadata_schema=[
        {"name": "category", "type": "tag"},
    ],
)

### Performing a simple similarity search
Let's perform a basic similarity search using a query about space exploration:

In [9]:
query = "Tell me about space exploration"
results = vector_store.similarity_search(query, k=2)

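for doc in results:
    # Show a short preview of each hit along with its metadata
    print(f"Content: {doc.page_content[:100]}...")
    print(f"Metadata: {doc.metadata}")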
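
Conceptually, a similarity search embeds the query and returns the k stored vectors closest to it. The brute-force sketch below illustrates that idea with cosine similarity in plain Python. It is an illustration only: Redis runs the search server-side against its vector index, and the toy two-dimensional vectors and the `top_k` helper are made up for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [idx for _, idx in scored[:k]]


# Toy embeddings standing in for real document vectors (assumed values)
toy_docs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
toy_query = [1.0, 0.1]

nearest = top_k(toy_query, toy_docs, k=2)
print(nearest)
```

The query points almost along the first axis, so the first two toy vectors rank ahead of the third, which is nearly orthogonal to it.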

## Advanced Queries with RedisVectorStore
RedisVectorStore supports more advanced query types. We'll demonstrate similarity search with metadata filtering, maximum marginal relevance search, and similarity search with score.

### Similarity search with metadata filtering
We can filter our search results based on metadata:

In [10]:
from redisvl.query.filter import Tag

query = "Tell me about space exploration"

# Create a filter expression
filter_condition = Tag("category") == "sci.space"

filtered_results = vector_store.similarity_search(query, k=2, filter=filter_condition)

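for doc in filtered_results:
    # Every hit should now carry the 'sci.space' category
    print(f"Content: {doc.page_content[:100]}...")
    print(f"Metadata: {doc.metadata}")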

### Maximum marginal relevance search
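Maximum marginal relevance (MMR) search balances relevance to the query against diversity among the returned documents. It first fetches fetch_k candidates by similarity, then selects k of them: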

In [11]:
# Maximum marginal relevance search with filter
mmr_results = vector_store.max_marginal_relevance_search(
    query, k=2, fetch_k=10, filter=filter_condition
)

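for doc in mmr_results:
    # Show a short preview of each hit along with its metadata
    print(f"Content: {doc.page_content[:100]}...")
    print(f"Metadata: {doc.metadata}")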
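
To make the trade-off concrete, here is a minimal pure-Python sketch of the greedy MMR selection rule. It is not the library's implementation. The `mmr` function, the toy vectors, and the use of cosine similarity are assumptions for illustration; `lambda_mult` mirrors the name LangChain uses for the relevance/diversity weight.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Greedily pick k indices, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for idx in candidates:
            relevance = cosine_similarity(query_vec, doc_vecs[idx])
            # Redundancy: similarity to the closest already-selected document
            redundancy = max(
                (cosine_similarity(doc_vecs[idx], doc_vecs[s]) for s in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


# Toy embeddings (assumed values): docs 0 and 1 are near-duplicates
toy_docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
toy_query = [1.0, 0.2]

diverse = mmr(toy_query, toy_docs, k=2, lambda_mult=0.5)
relevance_only = mmr(toy_query, toy_docs, k=2, lambda_mult=1.0)
print(diverse, relevance_only)
```

With `lambda_mult=1.0` the rule reduces to plain similarity ranking and returns both near-duplicates. With `lambda_mult=0.5` the second pick is penalized for resembling the first, so the less similar but more distinct document is chosen instead.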

### Similarity search with score
We can also get similarity scores along with our search results:

In [12]:
# Similarity search with score and filter
scored_results = vector_store.similarity_search_with_score(
    query, k=2, filter=filter_condition
)

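for doc, score in scored_results:
    print(f"Content: {doc.page_content[:100]}...")
    print(f"Metadata: {doc.metadata}")
    print(f"Score: {score}")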
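
How to read the score depends on the distance metric configured for the index. The sketch below assumes the index uses cosine distance and that the returned score is that distance, so a lower value means a closer match; check your own configuration before relying on this. Redis defines cosine distance as 1 minus cosine similarity. The helper name and the sample scores are made up for illustration.

```python
def cosine_distance_to_similarity(distance: float) -> float:
    """Convert a cosine distance (0 = identical direction) to cosine similarity."""
    return 1.0 - distance


# Made-up (name, distance) pairs standing in for real search results
sample_scores = [("doc_a", 0.25), ("doc_b", 0.5)]

similarities = [cosine_distance_to_similarity(d) for _, d in sample_scores]
for (name, distance), similarity in zip(sample_scores, similarities):
    print(f"{name}: distance={distance}, similarity={similarity}")
```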

## Cleanup
After we're done, it's important to clean up our Redis indices:

In [13]:
# Delete the underlying index and its data
vector_store.index.delete(drop=True)