# BigtableByteStore

This guide covers how to use Google Cloud Bigtable as a key-value store.

[Bigtable](https://cloud.google.com/bigtable) is a key-value and wide-column store, ideal for fast access to structured, semi-structured, or unstructured data. 

## Overview

The `BigtableByteStore` is a key-value store implementation that uses Google Cloud Bigtable as the backend. It supports both synchronous and asynchronous operations for setting, getting, and deleting key-value pairs.

### Integration details
| Class | Package | Local | JS support | Package downloads | Package latest |
| :--- | :--- | :---: | :---: | :---: | :---: |
| [BigtableByteStore](https://github.com/googleapis/langchain-google-bigtable-python/blob/main/src/langchain_google_bigtable/key_value_store.py) | [langchain-google-bigtable](https://pypi.org/project/langchain-google-bigtable/) | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-google-bigtable?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-google-bigtable) |

## Setup

### Prerequisites

You need a Google Cloud project with the Bigtable API enabled, an active Bigtable instance, and a table. You can follow these guides for detailed instructions:
* [Create a Google Cloud Project](https://developers.google.com/workspace/guides/create-project)
* [Enable the Bigtable API](https://console.cloud.google.com/flows/enableapi?apiid=bigtable.googleapis.com)
* [Create a Bigtable instance and table](https://cloud.google.com/bigtable/docs/creating-instance)

### Installation

The integration lives in the `langchain-google-bigtable` package. The command below also installs `langchain-google-vertexai`, which is needed for the embedding cache example.

In [None]:
%pip install -qU langchain-google-bigtable langchain-google-vertexai

## Authentication

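Set your Google Cloud project ID and authenticate to your account. The cell below imports `google.colab`, so it runs as written only in a Google Colab runtime; in other environments, authenticate with Application Default Credentials instead.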

In [None]:
# @markdown Please fill in with your Google Cloud project ID, Bigtable instance ID, and table ID.
PROJECT_ID = "your-gcp-project-id"  # @param {type:"string"}
INSTANCE_ID = "your-instance-id"  # @param {type:"string"}
TABLE_ID = "your-table-id"  # @param {type:"string"}

# Set the project id
!gcloud config set project {PROJECT_ID}

# Authenticate user
from google.colab import auth
auth.authenticate_user()

## Instantiation

To use `BigtableByteStore`, we first need to ensure a table exists. The library provides a helper function `init_key_value_store_table` to create one if it doesn't exist. We then create the store asynchronously.

In [None]:
from langchain_google_bigtable import BigtableByteStore, init_key_value_store_table

# Helper function to create the table and column family if they don't exist
init_key_value_store_table(
    project_id=PROJECT_ID,
    instance_id=INSTANCE_ID,
    table_id=TABLE_ID,
)

# The store is created asynchronously (top-level await works in a notebook cell)
store = await BigtableByteStore.create(
    project_id=PROJECT_ID,
    instance_id=INSTANCE_ID,
    table_id=TABLE_ID,
)

## Usage

The `BigtableByteStore` supports both synchronous (e.g., `mset`, `mget`) and asynchronous (e.g., `amset`, `amget`) methods. This guide uses the async methods.

### Set
Use `amset` to save key-value pairs to the store.

In [None]:
kv_pairs = [
    ("key1", b"value1"),
    ("key2", b"value2"),
    ("key3", b"value3"),
]

await store.amset(kv_pairs)
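
The values in the cell above are already `bytes`. When the data you want to store is a Python object, it has to be serialized first. The following is a minimal sketch, assuming JSON is an acceptable encoding for your data; `to_bytes` and `from_bytes` are hypothetical helpers, not part of `langchain-google-bigtable`.

```python
import json


def to_bytes(obj) -> bytes:
    """Serialize a JSON-compatible object to UTF-8 bytes (hypothetical helper)."""
    return json.dumps(obj, sort_keys=True).encode("utf-8")


def from_bytes(raw: bytes):
    """Inverse of to_bytes: decode UTF-8 bytes back into a Python object."""
    return json.loads(raw.decode("utf-8"))


# Illustrative records; the key names are made up for this sketch.
records = {
    "user:1": {"name": "Ada", "visits": 3},
    "user:2": {"name": "Lin", "visits": 7},
}

# Same (key, bytes) shape as kv_pairs, so the list could be passed to store.amset(...)
kv_pairs_json = [(key, to_bytes(value)) for key, value in records.items()]

# Decoding what comes back from the store reverses the encoding step.
decoded = {key: from_bytes(raw) for key, raw in kv_pairs_json}
```

Any encoding that produces `bytes` would work equally well here; JSON is used only because it is in the standard library and human-readable.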

### Get
Use `amget` to retrieve values for a given list of keys. If a key is not found, `None` is returned for that key.

In [None]:
retrieved_vals = await store.amget(["key1", "key2", "nonexistent_key"])
print(retrieved_vals)
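
Because missing keys come back as `None`, it is often convenient to separate hits from misses. The sketch below uses a hard-coded list as a stand-in for the result of `store.amget(keys)` and assumes the results are returned in the same order as the requested keys.

```python
from typing import Dict, List, Optional

keys = ["key1", "key2", "nonexistent_key"]

# Stand-in for: values = await store.amget(keys)
values: List[Optional[bytes]] = [b"value1", b"value2", None]

# Pair each key with its value, then split found entries from missing ones.
found: Dict[str, bytes] = {k: v for k, v in zip(keys, values) if v is not None}
missing: List[str] = [k for k, v in zip(keys, values) if v is None]

print(f"Found: {found}")
print(f"Missing: {missing}")
```

Checking `v is not None` rather than truthiness matters, since an empty value `b""` is falsy but would still be a real stored value.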

### Delete
Use `amdelete` to remove keys from the store.

In [None]:
await store.amdelete(["key3"])

# Verify the deletion: key1 is still present, key3 should now come back as None
print(await store.amget(["key1", "key3"]))

### Iterate over keys
Use `ayield_keys` to iterate over all keys in the store or over keys with a specific prefix.

In [None]:
all_keys = [key async for key in store.ayield_keys()]
print(f"All keys: {all_keys}")

prefixed_keys = [key async for key in store.ayield_keys(prefix="key1")]
print(f"Prefixed keys: {prefixed_keys}")
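
One thing to keep in mind with prefix filters is that a prefix such as `"key1"` would also match `"key10"` or `"key1_backup"`, assuming the filter is a plain starts-with match. The sketch below illustrates this with `InMemoryKeys`, a hypothetical in-memory stand-in that mimics the shape of `ayield_keys`; it does not talk to Bigtable. It also shows how to consume an async generator from a plain script with `asyncio.run` (inside a notebook, use `await collect(...)` instead).

```python
import asyncio
from typing import AsyncIterator, Iterable, List, Optional


class InMemoryKeys:
    """Hypothetical stand-in exposing an ayield_keys-style async generator."""

    def __init__(self, keys: Iterable[str]) -> None:
        self._keys = sorted(keys)

    async def ayield_keys(self, prefix: Optional[str] = None) -> AsyncIterator[str]:
        # Yield every key, or only those starting with the given prefix.
        for key in self._keys:
            if prefix is None or key.startswith(prefix):
                yield key


async def collect(prefix: Optional[str] = None) -> List[str]:
    demo = InMemoryKeys(["key1", "key10", "key2"])
    return [key async for key in demo.ayield_keys(prefix=prefix)]


all_demo_keys = asyncio.run(collect())
prefixed_demo_keys = asyncio.run(collect("key1"))

print(f"All keys: {all_demo_keys}")
print(f"Keys with prefix 'key1': {prefixed_demo_keys}")
```

If you need exact-key grouping, a delimiter in the key scheme (for example `"user:1:"` rather than `"user:1"`) avoids accidental matches.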

## Advanced Usage: Embedding Caching

A common use case for a key-value store is to cache the results of expensive operations, like computing text embeddings. This saves both time and money on subsequent requests for the same text.

In [None]:
from langchain.embeddings import CacheBackedEmbeddings
from langchain_google_vertexai.embeddings import VertexAIEmbeddings

# Initialize the underlying embeddings model
underlying_embeddings = VertexAIEmbeddings(project=PROJECT_ID, model_name="textembedding-gecko@003")

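# Create the cached embedder using the same store
# The 'namespace' argument prevents key collisions with other data in the store
# Query embeddings are not cached by default; query_embedding_cache=True
# reuses the same store for them, which the aembed_query calls below rely on
cached_embedder = CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings,
    store,
    namespace="text-embeddings",
    query_embedding_cache=True,
)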

# The first call to aembed_query computes the embedding and stores it in Bigtable.
print("First call (computes and caches embedding):")
%time embedding_result_1 = await cached_embedder.aembed_query("Hello, world!")

# Subsequent calls with the same text will be much faster as they fetch the result directly from the cache.
print("\nSecond call (retrieves from cache):")
%time embedding_result_2 = await cached_embedder.aembed_query("Hello, world!")
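
To make the caching idea concrete, here is a conceptual sketch of a cache-aside lookup. It is not how `CacheBackedEmbeddings` is implemented; `CountingEmbedder`, `cached_embed`, the key format, and the JSON encoding are all assumptions made for illustration, and a plain `dict` stands in for the byte store.

```python
import hashlib
import json
from typing import Dict, List


class CountingEmbedder:
    """Hypothetical stand-in for an expensive embedding model."""

    def __init__(self) -> None:
        self.calls = 0

    def embed(self, text: str) -> List[float]:
        self.calls += 1
        # Toy "embedding": two numbers derived from the text.
        return [float(len(text)), float(sum(map(ord, text)) % 97)]


def cached_embed(
    text: str,
    embedder: CountingEmbedder,
    byte_store: Dict[str, bytes],
    namespace: str = "text-embeddings",
) -> List[float]:
    # Derive a namespaced key from a hash of the text (illustrative key scheme).
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    key = f"{namespace}:{digest}"

    raw = byte_store.get(key)
    if raw is not None:
        # Cache hit: decode the stored bytes and skip the model call.
        return json.loads(raw.decode("utf-8"))

    # Cache miss: compute, store as bytes, then return.
    vector = embedder.embed(text)
    byte_store[key] = json.dumps(vector).encode("utf-8")
    return vector


embedder = CountingEmbedder()
byte_store: Dict[str, bytes] = {}

first = cached_embed("Hello, world!", embedder, byte_store)
second = cached_embed("Hello, world!", embedder, byte_store)
print(f"Model calls: {embedder.calls}")
```

The second lookup is served from the store, so the model is invoked only once for the repeated text.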

## API reference

For full details on the `BigtableByteStore` class, see the [API reference](https://python.langchain.com/v0.2/api_reference/google_bigtable/storage/langchain_google_bigtable.storage.BigtableByteStore.html).