# ObjectBox

This notebook demonstrates how to use [ObjectBox](https://docs.objectbox.io/) as an efficient, on-device vector store with LangChain.

Features of ObjectBox:

- Local, embeddable and fast ACID-compliant NoSQL database
- Bindings available for all major programming languages
- Supports on-device vector search for mobile platforms (Android, iOS and Flutter)


For detailed documentation of all `ObjectBoxVectorStore` features and configurations, head to the [API reference](https://api.python.langchain.com/en/latest/vectorstores/langchain_objectbox.vectorstores.ObjectBoxVectorStore.html).


## Setup

Install the `langchain-objectbox` package from PyPI to get started, for example with `pip install langchain-objectbox`.
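One way to confirm that the package is visible to the current interpreter is to query its distribution metadata. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and introduced here for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("langchain-objectbox"))
```

If this prints `None`, the package is not installed in the environment the notebook kernel is using.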

## Initialization

The `langchain-objectbox` package provides an `ObjectBoxVectorStore` class, which extends LangChain's abstract vector store and implements basic CRUD operations along with vector search using ObjectBox. We perform the following steps:

1. Set up a fake embedding producer to test the vector store
2. Initialize `ObjectBoxVectorStore` with the available parameters

```{=mdx}
import EmbeddingTabs from "@theme/EmbeddingTabs";

<EmbeddingTabs/>
```

In [None]:
from langchain_objectbox.vectorstores import ObjectBoxVectorStore
from langchain_core.embeddings.fake import DeterministicFakeEmbedding

embeddings = DeterministicFakeEmbedding(size=128)

"""
Possible arguments:
embedding_function: Embedding function to use.
embedding_dimensions: Dimensions of the embeddings.
db_directory: Path to the database where data will be stored.
clear_db: Flag for deleting all the data in the database.
entity_model: Creates an objectbox entity.
db: Registers the model with objectbox.
vector_box: Initializing objectbox.
"""
vector_store = ObjectBoxVectorStore(
    embedding=embeddings,
    embedding_dimensions=128
)
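A deterministic fake embedder is useful for testing because the same text always maps to the same vector, so results are reproducible without calling a real model. The following is a minimal standard-library sketch of that idea, not LangChain's actual implementation; the function name `fake_embedding` is hypothetical.

```python
import hashlib
import random
from typing import List


def fake_embedding(text: str, size: int = 128) -> List[float]:
    """Derive a reproducible pseudo-random vector from the text's hash."""
    # Hash the text so that identical inputs always yield the same seed.
    seed = int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


vec_a = fake_embedding("foo")
vec_b = fake_embedding("foo")
vec_c = fake_embedding("bar")
print(len(vec_a), vec_a == vec_b, vec_a == vec_c)
```

Such vectors carry no semantic meaning, so they are suitable only for checking that storage and retrieval work, not for judging search quality.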

## Basic Operations

### Adding documents to the vector-store



In [None]:
from langchain_core.documents import Document

document_1 = Document(
    page_content="foo",
    metadata={"source": "https://example.com"}
)

document_2 = Document(
    page_content="bar",
    metadata={"source": "https://example.com"}
)

document_3 = Document(
    page_content="baz",
    metadata={"source": "https://example.com"}
)

documents = [document_1, document_2, document_3]
vector_store.add_documents(documents=documents, ids=["1", "2", "3"])

### Update documents


In [None]:
updated_document = Document(
    page_content="qux",
    metadata={"source": "https://another-example.com"}
)

vector_store.update_documents(document_id="1", document=updated_document)

### Remove documents


In [None]:
vector_store.delete(ids=["3"])
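To keep track of what the store should contain after the add, update, and delete cells above, the sequence can be mirrored with a plain dictionary. This is only a mental model that assumes an update replaces the document stored under the given ID; it does not use ObjectBox, and the `store` variable is hypothetical.

```python
# Hypothetical in-memory model: ID -> (page_content, metadata).
store = {}

# Add three documents under the IDs "1", "2", and "3".
for doc_id, text in zip(["1", "2", "3"], ["foo", "bar", "baz"]):
    store[doc_id] = (text, {"source": "https://example.com"})

# Update: replace the document stored under ID "1".
store["1"] = ("qux", {"source": "https://another-example.com"})

# Delete: remove the document stored under ID "3".
store.pop("3", None)

for doc_id in sorted(store):
    print(doc_id, store[doc_id])
```

Under that assumption, two documents remain: `qux` from `https://another-example.com` and `bar` from `https://example.com`.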

## Vector Search

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.

In [None]:
# Perform a similarity search restricted to documents with a matching source
results = vector_store.similarity_search(
    query="foo", k=1, filter={"source": "https://another-example.com"}
)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

# Perform a similarity search and get scores
results = vector_store.similarity_search_with_score(
    query="thud", k=1, filter={"source": "https://example.com"}
)
for doc, score in results:
    print(f"* [SIM={score:.3f}] {doc.page_content} [{doc.metadata}]")
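Conceptually, a filtered similarity search first narrows the candidates by metadata and then ranks the remainder by closeness to the query vector. The sketch below illustrates this with a brute-force cosine-similarity scan in pure Python. It is not how ObjectBox performs the search internally, and the names `cosine_similarity` and `filtered_search` as well as the two-dimensional vectors are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filtered_search(
    query_vec: List[float],
    records: List[dict],
    k: int,
    metadata_filter: Dict[str, str],
) -> List[Tuple[str, float]]:
    """Keep records whose metadata matches the filter, then return the top k."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in metadata_filter.items())
    ]
    scored = [(r["text"], cosine_similarity(query_vec, r["vector"])) for r in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


records = [
    {"text": "qux", "vector": [1.0, 0.0], "metadata": {"source": "https://another-example.com"}},
    {"text": "bar", "vector": [0.0, 1.0], "metadata": {"source": "https://example.com"}},
]

# "qux" is closer to the query, but the filter excludes it.
hits = filtered_search([0.9, 0.1], records, k=1, metadata_filter={"source": "https://example.com"})
for text, score in hits:
    print(f"* [SIM={score:.3f}] {text}")
```

The example shows why a filter can change the top result: the nearest vector overall is skipped when its metadata does not match.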

## Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:

- [Tutorials](https://python.langchain.com/docs/tutorials/)
- [How-to: Question and answer with RAG](https://python.langchain.com/docs/how_to/#qa-with-rag)
- [Retrieval conceptual docs](https://python.langchain.com/docs/concepts/#retrieval)

### Simple RAG Example

The following cells set up a simple RAG pipeline that uses ObjectBox's vector search to retrieve relevant documents, which the LLM then uses as context to generate its response.

In [None]:
!pip install langchain-community
!pip install langchain-openai
!pip install langchain-huggingface

In [1]:
from langchain_objectbox.vectorstores import ObjectBoxVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAI
from langchain_community.document_loaders import WebBaseLoader
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts.chat import ChatPromptTemplate
from langchain.chains import create_retrieval_chain
import os
import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

# Initialize LLM, embeddings, and document loader
llm = OpenAI(model_name="gpt-3.5-turbo-instruct")
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")
docs = loader.load()

# Prepare a prompt for RAG with a placeholder for context retrieved by vector search
prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

# Initialize a text splitter to create chunks from the web page text
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
documents = text_splitter.split_documents(docs)

# Create ObjectBoxVectorStore 
# Note: Adjust embedding_dimensions to match the chosen embedding model 
vector = ObjectBoxVectorStore.from_documents(documents, embeddings, embedding_dimensions=384)

document_chain = create_stuff_documents_chain(llm, prompt)
retriever = vector.as_retriever(search_kwargs={"k": 3})  # retrieve top 3 most relevant docs
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# Invoke the retrieval chain with a question
response = retrieval_chain.invoke({"input": "how can langsmith help with testing?"})
print(response["answer"])

# Optional: Print retrieved documents to understand context
print("\nRetrieved Documents:")
for doc in retriever.invoke("how can langsmith help with testing?"):
    print(f"---\n{doc.page_content}")
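The `chunk_size=500` and `chunk_overlap=50` settings used above mean that consecutive chunks share some text, so a sentence cut at a chunk boundary still appears in full in a neighbouring chunk. The sketch below shows the overlap idea with fixed-size character windows. It is a deliberate simplification: `RecursiveCharacterTextSplitter` additionally tries to split on separators such as paragraphs and sentences, and the helper name `chunk_text` is hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
print(chunks[0][-50:] == chunks[1][:50])
```

Smaller chunks make retrieval more precise but give the LLM less surrounding context per hit, so both values are worth tuning for the documents at hand.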
