## Bitmask Compression Example ##

Bitmask compression stores sparse tensors efficiently on disk.

Instead of storing every zero element as an actual number, we use a bitmask to record which tensor entries are zero and store only the nonzero values. This is useful when the tensor consists mostly of zeros, because no space is spent on storing those zeros explicitly.
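To make the idea concrete, here is a minimal pure-Python sketch of bitmask compression on a single row of values. It is an illustration only, not the implementation used by `compressed-tensors`; the helper names `bitmask_compress` and `bitmask_decompress` and the bit order inside each byte are assumptions made for this example.

```python
def bitmask_compress(values):
    """Split a flat list of numbers into (nonzero values, packed bitmask bytes)."""
    nonzero = [v for v in values if v != 0]
    mask = bytearray((len(values) + 7) // 8)  # one bit per entry, rounded up to whole bytes
    for i, v in enumerate(values):
        if v != 0:
            # Bit i % 8 of byte i // 8 marks entry i as nonzero.
            # The bit order is an arbitrary choice for this sketch.
            mask[i // 8] |= 1 << (i % 8)
    return nonzero, bytes(mask)


def bitmask_decompress(nonzero, mask, length):
    """Rebuild the original list from the nonzero values and the bitmask."""
    remaining = iter(nonzero)
    return [
        next(remaining) if (mask[i // 8] >> (i % 8)) & 1 else 0.0
        for i in range(length)
    ]


row = [0.0, 1.5, 0.0, 0.0, -2.0, 0.0, 0.0, 0.0, 3.25, 0.0]
nonzero, mask = bitmask_compress(row)
restored = bitmask_decompress(nonzero, mask, len(row))
print(nonzero, mask.hex(), restored == row)
```

Ten values shrink to three stored values plus two bytes of mask. The saving grows with the share of zeros, while the mask cost stays fixed at one bit per entry.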

The example below shows how to save and load sparse tensors using bitmask compression. It also compares the size on disk against the uncompressed ("dense") representation.

In [39]:
import torch
import os
from safetensors import safe_open
from safetensors.torch import save_file
from compressed_tensors import save_compressed, load_compressed, BitmaskConfig

In [40]:
# create a dense 3D tensor
tensor = torch.randn((100, 100, 100))
# use dropout to zero out roughly 50% of the entries (surviving values are scaled by 1 / (1 - p))
sparse_tensor = torch.nn.functional.dropout(tensor, p=0.5)
print(f"Size of the sparse tensor: {sparse_tensor.size()}. The tensor is {100 * (1 - sparse_tensor.nonzero().size(0) / sparse_tensor.numel()):.2f}% sparse")

Size of the sparse tensor: torch.Size([100, 100, 100]). The tensor is 50.00% sparse


In [41]:
# let's save the sparse tensor on disk using safetensors and compressed-tensors and compare the size on disk
tensors = dict(sparse_tensor=sparse_tensor)

## save the sparse tensor using safetensors ##
save_file(tensors, "tensors.safetensors")
size_on_disk_kb = os.path.getsize('tensors.safetensors') / 1024

## save the sparse tensor using compressed-tensors ##
save_compressed(tensors, "compressed_tensors.safetensors", compression_format="sparse-bitmask")
compressed_size_on_disk_kb = os.path.getsize('compressed_tensors.safetensors') / 1024

print(f"Size of the sparse tensor on disk using safetensors: {size_on_disk_kb:.2f} KB")
print(f"Size of the sparse tensor on disk using compressed-tensors: {compressed_size_on_disk_kb:.2f} KB")
print("The compression ratio is x{:.2f}".format(size_on_disk_kb / compressed_size_on_disk_kb))

Compressing model: 100%|██████████| 1/1 [00:00<00:00, 15.82it/s]

Size of the sparse tensor on disk using safetensors: 3906.34 KB
Size of the sparse tensor on disk using compressed-tensors: 2158.75 KB
The compression ratio is x1.81





Storing a sparse tensor in which 50% of the entries are zero takes roughly 45% less disk space with `compressed-tensors` (a compression ratio of 1.81x). The compression ratio improves further as the sparsity increases, because only the nonzero values are stored.

The additional compression metadata is embedded in the saved `safetensors` file. Let's display the first line of `compressed_tensors.safetensors`, which contains the file header.

In [42]:
!head -n 1 compressed_tensors.safetensors


`      {"sparse_tensor.row_offsets":{"dtype":"I64","shape":[100,100],"data_offsets":[0,80000]},"sparse_tensor.shape":{"dtype":"I64","shape":[3],"data_offsets":[80000,80024]},"sparse_tensor.compressed":{"dtype":"F32","shape":[500044],"data_offsets":[80024,2080200]},"sparse_tensor.bitmask":{"dtype":"U8","shape":[100,100,13],"data_offsets":[2080200,2210200]}}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 

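The header describes every tensor stored in the file. The single sparse tensor is saved as four tensors:
- `row_offsets`: an `I64` offset tensor with one entry per row, so it has `tensor.ndim - 1` dimensions (here `[100, 100]`)
- `shape`: an `I64` tensor holding the shape of the uncompressed tensor (here 3 values)
- `compressed`: a one-dimensional tensor holding only the nonzero values (here 500044 `F32` values)
- `bitmask`: a `U8` tensor with the same number of dimensions as the original tensor, where the last dimension packs 8 entries into each byte (here `[100, 100, 13]`, since 100 / 8 rounds up to 13)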
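The shapes and dtypes reported in the header are enough to account for the size of the compressed file and to estimate how it changes with sparsity. The sketch below is a back-of-the-envelope calculation inferred from the header shown in this example, not a function of `compressed-tensors`; the name `bitmask_payload_bytes` is hypothetical, 4-byte values are assumed, and the JSON header itself is not counted.

```python
import math


def bitmask_payload_bytes(shape, num_nonzero, value_bytes=4):
    """Estimate the tensor payload size in bytes for the layout seen in the header."""
    rows = math.prod(shape[:-1])                # number of rows along the last dimension
    row_offsets = rows * 8                      # one int64 per row
    shape_tensor = len(shape) * 8               # one int64 per dimension
    compressed = num_nonzero * value_bytes      # nonzero values only
    bitmask = rows * math.ceil(shape[-1] / 8)   # one bit per entry, padded per row
    return row_offsets + shape_tensor + compressed + bitmask


shape = (100, 100, 100)
total = math.prod(shape)
dense_bytes = total * 4  # float32 dense payload

# Matches the final data offset in the header above.
print(bitmask_payload_bytes(shape, 500044))  # 2210200

ratios = []
for sparsity_pct in (50, 90, 99):
    nnz = total * (100 - sparsity_pct) // 100
    ratio = dense_bytes / bitmask_payload_bytes(shape, nnz)
    ratios.append(ratio)
    print(f"{sparsity_pct}% sparse: estimated compression ratio x{ratio:.2f}")
```

The fixed costs (bitmask and row offsets) do not shrink with sparsity, so they bound the achievable ratio for very sparse tensors.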

Finally, we can load both the uncompressed and the compressed tensors from disk and verify that they have the same representation once in memory.

In [43]:
# load the safetensor and the compressed-tensor and show that they have the same representation

## load the uncompressed safetensors to memory ##
tensors = {}
with safe_open('tensors.safetensors', framework="pt") as f:
    for key in f.keys():
        tensors[key] = f.get_tensor(key)

## load the compressed-tensors to memory ##
config = BitmaskConfig() # we need to specify the method for decompression
compressed_tensors = load_compressed("compressed_tensors.safetensors", config)

tensors_equal = torch.allclose(tensors['sparse_tensor'], compressed_tensors['sparse_tensor'])

print(f"Once loaded, the sparse tensors from safetensors and compressed-tensors are equal: {tensors_equal}")

Once loaded, the sparse tensors from safetensors and compressed-tensors are equal: True
