# Reducing storage usage for vectors on Azure AI Search

This code demonstrates how to use the following features to reduce vector storage on Azure AI Search.

+ Use smaller "narrow" data types instead of `Edm.Single`. Types such as `Edm.Half` store each dimension in 2 bytes instead of 4, which reduces storage overhead.
+ Disable storing vectors for the query response by setting the vector field's `stored` property to `false`. Vectors returned in a query response are stored separately from the vectors used during queries.
+ Compress vectors. Use built-in scalar quantization to compress embeddings to 8-bit integers (`int8`) without reducing query performance. Information loss from compression can be offset by oversampling and reranking with the original uncompressed embeddings.
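
To get a feel for the savings, the sketch below estimates raw vector storage at three widths: 4 bytes per dimension (32-bit float), 2 bytes (16-bit float), and 1 byte (8-bit integer). It assumes 3072-dimensional embeddings; the vector count of 1,000 and the helper name `raw_vector_bytes` are hypothetical and chosen only for illustration. The figures cover raw vector data only. Actual index size on the service also includes overhead for the vector index structures, so treat the output as a rough lower bound.

```python
# Bytes needed to store one dimension at each numeric width.
BYTES_PER_DIMENSION = {
    "float32": 4,  # 32-bit float
    "float16": 2,  # 16-bit float
    "int8": 1,     # 8-bit integer, e.g. after scalar quantization
}

DIMENSIONS = 3072    # embedding length assumed for this estimate
NUM_VECTORS = 1_000  # hypothetical corpus size, for illustration only


def raw_vector_bytes(num_vectors: int, dimensions: int, bytes_per_dim: int) -> int:
    """Return the raw size in bytes of the vector data, excluding index overhead."""
    return num_vectors * dimensions * bytes_per_dim


sizes = {
    name: raw_vector_bytes(NUM_VECTORS, DIMENSIONS, width)
    for name, width in BYTES_PER_DIMENSION.items()
}

for name, size in sizes.items():
    ratio = size / sizes["float32"]
    print(f"{name:8s} {size / 1024**2:8.2f} MiB  ({ratio:.0%} of float32)")
```

Because the estimate is linear in every factor, halving the bytes per dimension halves the raw vector footprint regardless of corpus size.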

### Prerequisites

+ An Azure subscription.
 
+ Azure AI Search, any tier, but we recommend Basic or higher for this workload. [Enable semantic ranker](https://learn.microsoft.com/azure/search/semantic-how-to-enable-disable) if you want to run a hybrid query with semantic ranking.

### Set up a Python virtual environment in Visual Studio Code

1. Open the Command Palette (Ctrl+Shift+P).
1. Search for **Python: Create Environment**.
1. Select **Venv**.
1. Select a Python interpreter. Choose 3.10 or later.

It can take a minute to set up. If you run into problems, see [Python environments in VS Code](https://code.visualstudio.com/docs/python/environments).

### Install packages

```python
! pip install -r vector-compression-and-storage-requirements.txt --quiet
```

### Load .env file (Copy .env-sample to .env and update accordingly)

```python
import os

from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv

load_dotenv(override=True)  # Load environment variables from .env.

# Variables not used here do not need to be updated in your .env file.
endpoint = os.environ["AZURE_SEARCH_SERVICE_ENDPOINT"]
index_name = os.environ["AZURE_SEARCH_INDEX"]

# Use key-based auth when an admin key is set; otherwise fall back to
# DefaultAzureCredential. os.getenv avoids a KeyError if the variable is absent.
admin_key = os.getenv("AZURE_SEARCH_ADMIN_KEY", "")
credential = AzureKeyCredential(admin_key) if admin_key else DefaultAzureCredential()
```

## Load embeddings

Load the embeddings from a precomputed file. These embeddings use [text-embedding-3-large](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#embeddings) with 3072 dimensions. The chunks are from the sample data in the document folder, chunked using the [Split Skill](https://learn.microsoft.com/azure/search/cognitive-search-skill-textsplit).

```python
import json
from lib.embeddings import content_path

with open(content_path, "r") as f:
    chunks = json.load(f)
```
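
Before indexing, it can help to confirm that every loaded embedding has the expected 3072 dimensions. The sketch below is one way to do that. It assumes each chunk is a dictionary holding its vector under an `"embedding"` key; that key name, the helper `embedding_dimensions`, and the stand-in `sample_chunks` data are all hypothetical, so adjust them to match the actual file.

```python
def embedding_dimensions(chunks, key="embedding"):
    """Return the set of distinct embedding lengths found across the chunks."""
    return {len(chunk[key]) for chunk in chunks}


# Stand-in data with the assumed shape; replace with the loaded `chunks` list.
sample_chunks = [
    {"id": "0", "embedding": [0.0] * 3072},
    {"id": "1", "embedding": [0.0] * 3072},
]

dims = embedding_dimensions(sample_chunks)
if dims != {3072}:
    raise ValueError(f"Unexpected embedding dimensions: {sorted(dims)}")
print(f"All {len(sample_chunks)} embeddings have {dims.pop()} dimensions")
```

A set with more than one length signals a mix of embedding models or a truncated record, either of which would typically cause the upload to a fixed-dimension vector field to fail.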