## Writing content into a Document Store


### `DocumentWriter`

#### Writing regular documents

The `DocumentWriter` component writes `Document` objects into a Document Store. In this example, we create an `InMemoryDocumentStore` and write two `Document` objects into it.

In [None]:
!pip install --upgrade haystack-ai
!pip install "sentence-transformers>=2.2.0"

In [None]:
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.dataclasses import Document

# Initialize an in-memory document store
doc_store = InMemoryDocumentStore()

# Create the DocumentWriter component with the document store
document_writer = DocumentWriter(document_store=doc_store)

# Define a list of documents to write
documents_to_write = [
    Document(content="Document 1 content"),
    Document(content="Document 2 content"),
]

# Use the DocumentWriter component to write documents to the store
result = document_writer.run(documents=documents_to_write)

# Print the number of documents written
print(f"Documents written: {result['documents_written']}")


In [None]:
doc_store.count_documents()

In [None]:
doc_store.filter_documents()

#### Writing embedded documents

At times, either because of the size of the data or to preserve semantic meaning by leveraging embedding models, we may want to store embeddings with our documents instead of plain text alone.

The key steps are:

* Compute embeddings: Use `OpenAIDocumentEmbedder`, `SentenceTransformersDocumentEmbedder`, or another Haystack embedder integration to compute the embeddings for your documents.

* Store embeddings: The computed embeddings are stored in the `embedding` field of the `Document` objects.

* Write to the Document Store: Use the `DocumentWriter` component to write these `Document` objects, now with embeddings, into a Document Store.
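
To make the three steps above concrete before bringing in Haystack, here is a minimal plain-Python sketch. Everything in it is hypothetical and exists only for illustration: `ToyDocument`, `toy_embed`, and `ToyStore` are not Haystack classes, and `toy_embed` is a hash-based stand-in that carries no semantic meaning, unlike a real embedding model.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a document with an optional embedding."""
    content: str
    embedding: Optional[List[float]] = None


def toy_embed(text: str, dim: int = 4) -> List[float]:
    """Map text to a fixed-length vector.

    NOT a real embedding model: it only scales the first `dim` bytes
    of a SHA-256 digest into the range [0, 1].
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dim]]


class ToyStore:
    """Hypothetical in-memory store keyed by document content."""

    def __init__(self) -> None:
        self._docs: Dict[str, ToyDocument] = {}

    def write(self, docs: List[ToyDocument]) -> int:
        """Store the documents and return how many were passed in."""
        for doc in docs:
            self._docs[doc.content] = doc
        return len(docs)

    def count(self) -> int:
        return len(self._docs)


docs = [ToyDocument("first text"), ToyDocument("second text")]

# Steps 1 and 2: compute an embedding and store it on each document
for doc in docs:
    doc.embedding = toy_embed(doc.content)

# Step 3: write the embedded documents to the store
store = ToyStore()
written = store.write(docs)
print(f"Documents written: {written}, stored: {store.count()}")
```

The shape is the same as in the Haystack workflow: documents go in without embeddings, an embedder fills in the embedding field, and a writer persists the result.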

The following example uses `SentenceTransformersDocumentEmbedder` to compute embeddings and then writes the embedded documents into a document store:



In [None]:
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.writers import DocumentWriter
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.dataclasses import Document

# Initialize document store and components
doc_store = InMemoryDocumentStore()
doc_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-mpnet-base-v2")
document_writer = DocumentWriter(document_store=doc_store)

# Example documents
documents = [
    Document(content="The quick brown fox jumps over the lazy dog."),
    Document(content="When it comes to natural language processing, context is key.")
]

# Warm up the embedder (loads the model), then compute embeddings
doc_embedder.warm_up()
embedded_docs = doc_embedder.run(documents=documents)["documents"]

# Write documents with embeddings to the document store
document_writer.run(documents=embedded_docs)


Display each document's content and the first few values of its embedding:

In [None]:
# Retrieve all documents
all_documents = doc_store.filter_documents()

# Print details of each document, including the embedding if it exists
for doc in all_documents:
    print(f"Document ID: {doc.id}")
    print(f"Content: {doc.content}")
    if doc.embedding is not None:
        print(f"Embedding: {doc.embedding[:5]}...")  # Show only the first 5 values for brevity
    print("\n")
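
Stored embeddings are useful because vectors can be compared numerically. One common comparison is cosine similarity; the sketch below implements it in plain Python. The three vectors are made up for illustration and are not output from any model (real embeddings typically have hundreds of dimensions), and `cosine_similarity` is a helper defined here, not a Haystack function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example
fox = [0.9, 0.1, 0.0]
dog = [0.8, 0.2, 0.1]
nlp = [0.0, 0.2, 0.9]

# Vectors pointing in similar directions score closer to 1
print(cosine_similarity(fox, dog) > cosine_similarity(fox, nlp))
```

With real embeddings, the same comparison could be applied to the `embedding` lists of two stored documents.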
