# How to create and query graph vector stores

Graph vector stores are a special type of vector store that stores the links between documents as well as the documents themselves.
A graph vector store's retrieval methods can use these links to retrieve documents.

## Get started

We load the State of the Union text and split it into chunks, each of which becomes a document.

In [None]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter

raw_documents = TextLoader("state_of_the_union.txt").load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(raw_documents)

Links can be added to documents manually, but it's easier to use a link extractor.
Several common link extractors are available, and you can build your own.
For this guide, we'll use the `KeybertLinkExtractor`, which uses the KeyBERT model to tag documents with keywords and then uses those keywords to create links between documents.
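To make the idea of keyword links concrete, here is a minimal pure-Python sketch of how shared tags can connect documents in both directions. It does not use KeyBERT or LangChain and is not the library's implementation; `doc_tags`, `linked_pairs`, and the tag values are hypothetical names and data invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: document id -> keyword tags (made up for illustration).
doc_tags = {
    "doc_a": {"russia", "ukraine"},
    "doc_b": {"ukraine", "nato"},
    "doc_c": {"economy", "jobs"},
}


def linked_pairs(doc_tags):
    """Return sorted pairs of document ids that share at least one tag."""
    # Invert the mapping: tag -> ids of the documents that carry it.
    docs_by_tag = defaultdict(set)
    for doc_id, tags in doc_tags.items():
        for tag in tags:
            docs_by_tag[tag].add(doc_id)

    pairs = set()
    for doc_ids in docs_by_tag.values():
        # Every two documents carrying the same tag are linked to each other.
        pairs.update(combinations(sorted(doc_ids), 2))
    return sorted(pairs)


print(linked_pairs(doc_tags))  # [('doc_a', 'doc_b')]
```

In this toy data only `doc_a` and `doc_b` share a tag (`ukraine`), so they are the only linked pair.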

In [None]:
from langchain_community.graph_vectorstores.extractors import KeybertLinkExtractor
from langchain_core.graph_vectorstores.links import add_links

extractor = KeybertLinkExtractor()

for doc in documents:
    add_links(doc, extractor.extract_one(doc))

documents[:10]

[Document(metadata={'source': 'state_of_the_union.txt', 'links': [Link(kind='kw', direction='bidir', tag='russia'), Link(kind='kw', direction='bidir', tag='putin'), Link(kind='kw', direction='bidir', tag='ukraine'), Link(kind='kw', direction='bidir', tag='ukrainian'), Link(kind='kw', direction='bidir', tag='vladimir')]}, page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \n\nSix days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscal

## Create the graph vector store and add documents

As an example, we'll use an Apache Cassandra or Astra DB database.
We create a `CassandraGraphVectorStore` from the documents and an `OpenAIEmbeddings` model.

In [None]:
import cassio
from langchain_community.graph_vectorstores import CassandraGraphVectorStore
from langchain_openai import OpenAIEmbeddings

# Initialize cassio and the Cassandra session from the environment variables
cassio.init(auto=True)

store = CassandraGraphVectorStore.from_documents(
    embedding=OpenAIEmbeddings(),
    documents=documents,
)

## Similarity search

If we don't traverse the graph, a graph vector store behaves like a regular vector store,
so all the methods available in a vector store are also available in a graph vector store.
The similarity search method returns documents similar to a query without considering the links between documents.

In [None]:
docs = store.similarity_search(
    "What did the president say about Ketanji Brown Jackson?"
)

## Traversal search

The traversal search method returns documents similar to a query and also takes the links between documents into account.
It first does a similarity search and then traverses the graph to find linked documents.
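The two steps described above (similarity search, then graph traversal) can be sketched in plain Python. This is a toy model under stated assumptions, not the `CassandraGraphVectorStore` implementation: the corpus, the two-dimensional embeddings, the function `toy_traversal_search`, and its `k` and `depth` parameters are all hypothetical, and two documents count as linked when they share a tag.

```python
import math
from collections import deque

# Hypothetical toy corpus: id -> (embedding, keyword tags). Values are made up.
corpus = {
    "a": ([1.0, 0.0], {"court"}),
    "b": ([0.0, 1.0], {"court", "senate"}),
    "c": ([0.0, 1.0], {"senate"}),
    "d": ([-1.0, 0.0], {"economy"}),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def toy_traversal_search(query_vec, corpus, k=1, depth=1):
    # Step 1: similarity search picks the k closest documents as seeds.
    ranked = sorted(
        corpus, key=lambda d: cosine(query_vec, corpus[d][0]), reverse=True
    )
    seeds = ranked[:k]

    # Step 2: breadth-first traversal over shared tags, up to `depth` hops.
    found = list(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        doc_id, hops = queue.popleft()
        if hops == depth:
            continue  # do not expand beyond the hop limit
        for other in sorted(corpus):
            if other not in found and corpus[doc_id][1] & corpus[other][1]:
                found.append(other)
                queue.append((other, hops + 1))
    return found


print(toy_traversal_search([1.0, 0.0], corpus, k=1, depth=1))  # ['a', 'b']
print(toy_traversal_search([1.0, 0.0], corpus, k=1, depth=2))  # ['a', 'b', 'c']
```

Document `b` is not similar to the query vector at all, yet it is returned because it shares the `court` tag with the seed document `a`. That is the behavior a similarity-only search cannot provide.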

In [None]:
docs = list(
    store.traversal_search("What did the president say about Ketanji Brown Jackson?")
)

## Async methods

The graph vector store also provides async versions of these methods. Their names are prefixed with `a`, for example `atraversal_search`.

In [None]:
docs = [
    doc
    async for doc in store.atraversal_search(
        "What did the president say about Ketanji Brown Jackson?"
    )
]
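The cell above relies on the notebook's support for top-level `async for`. In a plain Python script, an async comprehension has to run inside a coroutine. Here is a minimal sketch of that pattern; `fake_atraversal_search` is a hypothetical stand-in for `store.atraversal_search` so that the example runs without a database, and `collect` is likewise an illustrative name.

```python
import asyncio


async def fake_atraversal_search(query):
    """Hypothetical stand-in that yields results as an async iterator."""
    for text in ["first result", "second result"]:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield f"{text} for {query!r}"


async def collect(query):
    # Same async-comprehension pattern as in the notebook cell above.
    return [doc async for doc in fake_atraversal_search(query)]


# asyncio.run starts an event loop, runs the coroutine, and closes the loop.
docs = asyncio.run(collect("example query"))
print(docs)
```

With a real store, the body of `collect` would iterate over `store.atraversal_search(query)` instead of the stand-in.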

## Graph vector store retriever

The graph vector store can be converted to a retriever. It is similar to the vector store retriever, but it also supports traversal search types such as `traversal` and `mmr_traversal`.

In [None]:
retriever = store.as_retriever(search_type="mmr_traversal")
docs = retriever.invoke("What did the president say about Ketanji Brown Jackson?")