# BigtableByteStore

This guide shows how to use Google Cloud Bigtable as a [key-value store](/docs/concepts/key_value_stores).

For more advanced examples and tutorials, please visit the [langchain-google-bigtable GitHub repository](https://github.com/googleapis/langchain-google-bigtable-python/). For a full feature reference, see the [API documentation](https://python.langchain.com/v0.2/api_reference/google_bigtable/storage/langchain_google_bigtable.storage.BigtableByteStore.html).

## Overview

[Google Cloud Bigtable](https://cloud.google.com/bigtable) is a scalable, fully managed NoSQL wide-column database that is suitable for both real-time access and analytics workloads.

### Integration details

| Class | Package | Local | JS support | Package downloads | Package latest |
| :--- | :--- | :---: | :---: | :---: | :---: |
| [BigtableByteStore](https://python.langchain.com/v0.2/api_reference/google_bigtable/storage/langchain_google_bigtable.storage.BigtableByteStore.html) | [langchain-google-bigtable](https://pypi.org/project/langchain-google-bigtable/) | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-google-bigtable?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-google-bigtable?style=flat-square&label=%20) |

## Setup

The `BigtableByteStore` stores each string key as a Bigtable `row_key` and the corresponding byte value in a designated column of that row.

### Prerequisites

You need a Google Cloud project with the Bigtable API enabled and an active Bigtable instance. The table itself can be created with the helper function shown in the Instantiation section. See the [official setup guide](https://cloud.google.com/bigtable/docs/) for detailed instructions.

### Installation

The integration lives in the `langchain-google-bigtable` package. The following command also installs `langchain-google-vertexai` for the embedding cache example.

In [None]:
%pip install -qU langchain-google-bigtable langchain-google-vertexai

## Instantiation

To instantiate the store, you need your Google Cloud `project_id` and your Bigtable `instance_id` and `table_id`. The example below creates the table if it doesn't exist and then creates the store asynchronously.

In [None]:
from langchain_google_bigtable import BigtableByteStore, init_key_value_store_table

# Your Google Cloud project ID
PROJECT_ID = "<YOUR_PROJECT_ID>"
# Your Bigtable instance ID
INSTANCE_ID = "<YOUR_INSTANCE_ID>"
# The table to use for the key-value store
TABLE_ID = "my-kv-store-table"

# Helper function to create the table and column family if they don't exist
init_key_value_store_table(
    project_id=PROJECT_ID,
    instance_id=INSTANCE_ID,
    table_id=TABLE_ID,
)

# The store is created asynchronously
kv_store = await BigtableByteStore.create(
    project_id=PROJECT_ID,
    instance_id=INSTANCE_ID,
    table_id=TABLE_ID,
)

## Usage

The `BigtableByteStore` supports both synchronous (e.g., `mset`, `mget`) and asynchronous (e.g., `amset`, `amget`) methods. This guide uses the async methods.
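Top-level `await` works in a notebook, but a plain Python script needs an event loop. The sketch below shows one way to wrap the async calls in `asyncio.run`. `InMemoryAsyncStore` is a hypothetical stand-in that only mimics the `amset`/`amget` method names so the example runs without a Bigtable instance; it is not part of the library.

```python
import asyncio


class InMemoryAsyncStore:
    """Hypothetical stand-in exposing the same async method names as the store."""

    def __init__(self):
        self._data = {}

    async def amset(self, pairs):
        # pairs is a list of (key, value) tuples
        self._data.update(pairs)

    async def amget(self, keys):
        # Return one entry per requested key, None when the key is absent
        return [self._data.get(key) for key in keys]


async def main():
    # In real code, the store would be created here instead,
    # e.g. with `await BigtableByteStore.create(...)`.
    store = InMemoryAsyncStore()
    await store.amset([("key1", b"value1")])
    return await store.amget(["key1", "missing"])


# asyncio.run starts an event loop, runs main() to completion, and closes the loop
result = asyncio.run(main())
print(result)
```

In a notebook, you can skip `asyncio.run` and `await` the calls directly, as the cells in this guide do.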

You can set and get data using `amset` and `amget`.

In [None]:
await kv_store.amset(
    [
        ("key1", b"value1"),
        ("key2", b"value2"),
    ]
)

await kv_store.amget(["key1", "key2", "nonexistent_key"])
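The call above asks for a key that was never written. Assuming the store follows the LangChain `BaseStore` contract, where `mget`/`amget` return one entry per requested key in the same order and `None` for keys that are missing, a small helper can separate hits from misses. The function name `split_hits_and_misses` and the sample `values` list are illustrative only.

```python
def split_hits_and_misses(keys, values):
    """Pair each requested key with its result; None marks a missing key."""
    found = {k: v for k, v in zip(keys, values) if v is not None}
    missing = [k for k, v in zip(keys, values) if v is None]
    return found, missing


# Stand-in for the list an `amget` call is assumed to return for these keys
keys = ["key1", "key2", "nonexistent_key"]
values = [b"value1", b"value2", None]

found, missing = split_hits_and_misses(keys, values)
print(found)
print(missing)
```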

You can delete data using `amdelete`.

In [None]:
await kv_store.amdelete(["key1"])

await kv_store.amget(["key1", "key2"])

You can also iterate over the stored keys using `ayield_keys`.

In [None]:
[key async for key in kv_store.ayield_keys()]
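When several kinds of data share one table, it can help to list only the keys that start with a given prefix. The sketch below filters on the client side, so it makes no assumption about prefix support in the store itself. `fake_yield_keys` is a hypothetical stand-in for `kv_store.ayield_keys()`, and the key names are invented for illustration.

```python
import asyncio


async def fake_yield_keys():
    """Hypothetical stand-in for `kv_store.ayield_keys()`."""
    for key in ["user:1", "session:9", "user:2"]:
        yield key


async def keys_with_prefix(key_iter, prefix):
    # Consume the async iterator and keep only keys that start with the prefix
    return [key async for key in key_iter if key.startswith(prefix)]


user_keys = asyncio.run(keys_with_prefix(fake_yield_keys(), "user:"))
print(user_keys)
```

In a notebook, the equivalent would be `await keys_with_prefix(kv_store.ayield_keys(), "user:")`.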

## Advanced Usage: Embedding Caching

A common use case for a key-value store is to cache the results of expensive operations, like computing text embeddings. This saves both time and money on subsequent requests for the same text.

In [None]:
from langchain.embeddings import CacheBackedEmbeddings
from langchain_google_vertexai.embeddings import VertexAIEmbeddings

# Initialize the underlying embeddings model
underlying_embeddings = VertexAIEmbeddings(project=PROJECT_ID, model_name="textembedding-gecko@003")

# Create the cached embedder using the same store.
# The 'namespace' argument prevents key collisions with other data in the store.
# Query embeddings are not cached by default, so enable that explicitly.
cached_embedder = CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings,
    kv_store,
    namespace="text-embeddings",
    query_embedding_cache=True,
)

# The first call to aembed_query computes the embedding and stores it in Bigtable.
# Later calls with the same text fetch the result from the cache instead of calling the model.
embedding_result = await cached_embedder.aembed_query("Hello, world!")
print(embedding_result[:5])
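To make the caching idea concrete, here is a conceptual sketch of the lookup-then-compute pattern. It is not the `CacheBackedEmbeddings` implementation: the key scheme (namespace plus a SHA-256 hash), the JSON serialization, and the names `cache_key`, `embed_with_cache`, and `fake_embed` are assumptions chosen for illustration, and a plain `dict` stands in for the byte store.

```python
import hashlib
import json


def cache_key(namespace, text):
    # Hash the text so arbitrary strings map to fixed-length keys
    return namespace + hashlib.sha256(text.encode("utf-8")).hexdigest()


def embed_with_cache(text, store, embed_fn, namespace="text-embeddings"):
    key = cache_key(namespace, text)
    cached = store.get(key)
    if cached is not None:
        # Cache hit: decode the stored bytes back into a list of floats
        return json.loads(cached.decode("utf-8"))
    # Cache miss: call the (expensive) embedding function and store the result as bytes
    vector = embed_fn(text)
    store[key] = json.dumps(vector).encode("utf-8")
    return vector


calls = []


def fake_embed(text):
    """Hypothetical embedding function that records how often it is called."""
    calls.append(text)
    return [float(len(text)), 0.5]


store = {}  # a dict stands in for the byte store
first = embed_with_cache("Hello, world!", store, fake_embed)
second = embed_with_cache("Hello, world!", store, fake_embed)
print(first == second, len(calls))
```

The second call returns the same vector without invoking `fake_embed` again, which is the saving the cache provides when the embedding function is a paid model call.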