## Creating an index and populating it with documents using Redis

A simple example of how to ingest HTML documents and web page content into a Redis VectorStore.

Requirements:
- A Redis cluster
- A Redis database with at least 2 GB of memory (to match the initial index cap)

### Base parameters: Redis connection info

In [None]:
redis_url = "redis://server:port"
index_name = "dellwebdocs"
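
Before connecting, it can help to check that `redis_url` is well formed. The sketch below is only an illustration using the standard library: `describe_redis_url` is a hypothetical helper, not part of LangChain or the Redis client, and the fallback to port 6379 assumes the Redis default port.

```python
from urllib.parse import urlparse


def describe_redis_url(url: str) -> dict:
    """Split a Redis URL into its parts and fail early on obvious mistakes."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss"):
        raise ValueError(f"unexpected scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("URL has no host")
    return {
        "scheme": parsed.scheme,
        "host": parsed.hostname,
        # Accessing .port raises ValueError if the port is not a number.
        "port": parsed.port or 6379,
        "has_password": parsed.password is not None,
    }


# Example value only; replace with your own server and port.
info = describe_redis_url("redis://localhost:6379")
```

A placeholder such as `redis://server:port` is rejected by this check, because `port` is not a number.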

#### Imports

from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.vectorstores.redis import Redis

## Ingesting new documents

In [None]:
loader = WebBaseLoader(["https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/business-challenge-193/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/solution-introduction-81/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/design-guide-introduction-28/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/terminology-279/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/physical-architecture-69/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/logical-architecture-106/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/virtualization-design-10/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/container-design/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/software-919/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/overview-4230/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/initial-setup/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/haproxy-loadbalancer-for-dell-ecs-storage/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/certificate-creation-and-installation/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/setup-access-to-dell-ecs-storage-cluster/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/configuring-vvols-on-dell-powerstore-storage/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/virtual-environment-setup/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/containerized-environment-set-up/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/introduction-3357/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/backup-and-restore-use-case-2/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/data-virtualization-use-case-2/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/data-tiering-use-case/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/new-t-sql-functions-use-cases/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/deployment-automation-use-case/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/summary-1165/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/conclusion-616/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/request-for-feedback/",
                        "https://infohub.delltechnologies.com/l/design-guide-sql-server-2022-database-solution-with-object-storage-on-dell-hardware-stack/automation-scripts/"
                       ])
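
Long literal lists like this one are easy to get wrong: Python silently concatenates adjacent string literals, so a missing comma fuses two URLs into one invalid entry. The sketch below shows one possible sanity check; `find_suspect_urls` is a hypothetical helper and the `example.com` URLs are made up for illustration.

```python
from typing import List


def find_suspect_urls(urls: List[str]) -> List[str]:
    """Return entries that look like fused URLs, plus exact duplicates."""
    suspects = []
    seen = set()
    for url in urls:
        # A well-formed entry contains "://" exactly once.
        if url.count("://") != 1 or url in seen:
            suspects.append(url)
        seen.add(url)
    return suspects


example_urls = [
    "https://example.com/a/",
    "https://example.com/b/"  # missing comma: fused with the next literal
    "https://example.com/c/",
]
suspects = find_suspect_urls(example_urls)
```

Running the same check on the list passed to `WebBaseLoader` before loading flags any fused or repeated entries.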

In [None]:
data = loader.load()

In [None]:
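# Split the pages into chunks of at most 1024 characters,
# with up to 40 characters shared between consecutive chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024,
                                               chunk_overlap=40)
all_splits = text_splitter.split_documents(data)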
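
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sliding-window splitter. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that one prefers to break on separators such as paragraphs and sentences); `sliding_chunks` is a hypothetical helper that only illustrates how consecutive chunks share characters.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-size windows, each sharing chunk_overlap characters with the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
# chunks == ["abcdefghij", "ghijklmnop", "mnopqrstuv", "stuvwxyz"]
```

Each chunk starts with the last four characters of the previous one, which is what keeps some context across chunk boundaries when the chunks are embedded separately.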

In [None]:
# Connect to an index that already exists in Redis;
# the schema file must describe that index
embeddings = HuggingFaceEmbeddings()
rds = Redis.from_existing_index(embeddings,
                                redis_url=redis_url,
                                index_name=index_name,
                                schema="dellwebdocs_redis_schema.yaml")

In [None]:
rds.add_documents(all_splits)

#### Write the schema to a YAML file so the index can be reopened later

In [None]:
rds.write_schema("redis_schema.yaml")