# Time Weighted VectorStore Retriever

This retriever ranks documents using a combination of semantic similarity and recency.

The algorithm for combining them is basically:

```
semantic_similarity + decay_factor ** hours_passed
```

Notably, `hours_passed` refers to the hours passed since the object in the retriever was last accessed, not since it was created.
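As a rough illustration of the formula, the sketch below computes the combined score in plain Python. The function `combined_score` and its argument names are made up for this example and are not part of the LangChain API; the retriever's internal implementation may differ in detail.

```python
from datetime import datetime, timedelta


def combined_score(semantic_similarity: float,
                   decay_factor: float,
                   last_accessed_at: datetime,
                   now: datetime) -> float:
    """Hypothetical helper: semantic_similarity + decay_factor ** hours_passed."""
    # hours_passed is measured from the last access, not from creation.
    hours_passed = (now - last_accessed_at).total_seconds() / 3600
    return semantic_similarity + decay_factor ** hours_passed


now = datetime(2023, 4, 15, 21, 0, 0)
# Same similarity, but one document was accessed just now and the other a day ago.
fresh = combined_score(0.8, 0.99, last_accessed_at=now, now=now)
stale = combined_score(0.8, 0.99, last_accessed_at=now - timedelta(hours=24), now=now)
print(fresh, stale)
```

With equal similarity, the document accessed a day ago scores lower than the one accessed just now, because its recency term has decayed below 1.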

```python
import time

import faiss
from langchain.docstore import InMemoryDocstore
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import TimeWeightedVectorStoreRetriever
from langchain.schema import Document
from langchain.vectorstores import FAISS
```

## High decay factor

With a relatively high decay factor (close to 1), the recency term stays near 1 for every document, so the retriever mostly returns documents that are semantically similar.

```python
# Define your embedding model
embeddings_model = OpenAIEmbeddings()
# Initialize the vectorstore as empty
embedding_size = 1536
index = faiss.IndexFlatL2(embedding_size)
vectorstore = FAISS(embeddings_model.embed_query, index, InMemoryDocstore({}), {})
retriever = TimeWeightedVectorStoreRetriever(vectorstore=vectorstore, decay_factor=0.99, k=1)
```

```python
# Add one document, wait 20 seconds, then add a second so their timestamps differ
retriever.add_documents([Document(page_content="hello world")])
time.sleep(20)
retriever.add_documents([Document(page_content="hello foo")])
```

['f6303531-d3a5-44af-b7c8-e3cf76916ce5']

```python
# Query with text that exactly matches the older document
retriever.get_relevant_documents("hello world")
```

0.9999994359334345
1.8408203353689756
0.9999428041917008
1.9999408025741263


[Document(page_content='hello world', metadata={'last_accessed_at': datetime.datetime(2023, 4, 15, 21, 4, 41, 457055), 'created_at': datetime.datetime(2023, 4, 15, 21, 4, 20, 437090), 'buffer_idx': 0})]

## Low decay factor

With a low decay factor (to be extreme, we set it close to 0 here), the recency term drops off quickly, so the retriever mostly returns the most recently accessed documents.
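To get a feel for how much faster a low decay factor discounts older documents, one can compute how long it takes for the recency term `decay_factor ** hours_passed` to fall to 0.5. The sketch below does this arithmetic in plain Python; `recency_half_life_hours` is a hypothetical helper, and `1e-25` is used only as an illustrative value close to 0.

```python
import math


def recency_half_life_hours(decay_factor: float) -> float:
    """Hours until decay_factor ** hours_passed drops to 0.5 (hypothetical helper)."""
    if not 0 < decay_factor < 1:
        raise ValueError("decay_factor must be strictly between 0 and 1")
    return math.log(0.5) / math.log(decay_factor)


high = recency_half_life_hours(0.99)   # slow decay: recency barely matters
low = recency_half_life_hours(1e-25)   # fast decay: recency dominates quickly
print(f"decay_factor=0.99  -> half-life of about {high:.1f} hours")
print(f"decay_factor=1e-25 -> half-life of about {low * 3600:.1f} seconds")
```

With 0.99 the recency term takes close to three days to halve, while with a factor near 0 it halves in well under a minute, so even a short pause between documents can affect the ranking.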

```python
# Define your embedding model
embeddings_model = OpenAIEmbeddings()
# Initialize a fresh, empty vectorstore so this run does not share documents with the previous one
embedding_size = 1536
index = faiss.IndexFlatL2(embedding_size)
vectorstore = FAISS(embeddings_model.embed_query, index, InMemoryDocstore({}), {})
retriever = TimeWeightedVectorStoreRetriever(vectorstore=vectorstore, decay_factor=.0000000000000000000000001, k=1)
```

```python
# Same documents and 20-second gap as in the high decay factor example
retriever.add_documents([Document(page_content="hello world")])
time.sleep(20)
retriever.add_documents([Document(page_content="hello foo")])
```

['f063e5b2-c2eb-42fc-8894-79c7a7a9038e']

```python
# Same query as before; only the decay factor has changed
retriever.get_relevant_documents("hello world")
```

0.9978079943966991
1.8412067636745604
0.7235696005754303
1.7230094271411442


[Document(page_content='hello foo', metadata={'last_accessed_at': datetime.datetime(2023, 4, 15, 21, 5, 2, 243331), 'created_at': datetime.datetime(2023, 4, 15, 21, 5, 1, 579028), 'buffer_idx': 1})]
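The difference between the two runs can be reproduced with the formula alone. The sketch below scores two candidates under both decay factors; the similarity values (1.0 and 0.84) and the 20-second gap are illustrative assumptions loosely based on the example, `1e-25` stands in for a decay factor close to 0, and `rank` is a hypothetical helper, not a LangChain function.

```python
def rank(decay_factor: float) -> str:
    """Return the page content with the highest combined score (hypothetical helper)."""
    # (page_content, assumed semantic similarity to the query, hours since last access)
    candidates = [
        ("hello world", 1.00, 20 / 3600),  # exact match, last accessed 20 seconds ago
        ("hello foo", 0.84, 0.0),          # weaker match, accessed just now
    ]
    scored = {
        content: similarity + decay_factor ** hours_passed
        for content, similarity, hours_passed in candidates
    }
    return max(scored, key=scored.get)


print(rank(0.99))   # high decay factor: similarity decides
print(rank(1e-25))  # decay factor close to 0: recency decides
```

Under these assumptions the first call picks `hello world` and the second picks `hello foo`, which matches the two results shown in this notebook.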