
Optimizations when building a dense index #1910

Open
ftvalentini opened this issue Jun 5, 2024 · 0 comments

In the main class for building dense indexes:

class AutoDocumentEncoder(DocumentEncoder):

the fp16 argument does not seem to be used anywhere; it is swallowed by the **kwargs of this line:

def encode(self, texts, titles=None, max_length=256, add_sep=False, **kwargs):

It could instead be honored in __init__, for example as:

self.model = AutoModel.from_pretrained(model_name, torch_dtype=torch.float16)
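
Or, to keep float32 as the default, the flag could be threaded through explicitly (a sketch; an fp16 parameter on __init__ is an assumption, not the current signature):

# Sketch: load in half precision only when the caller asks for it.
dtype = torch.float16 if fp16 else torch.float32
self.model = AutoModel.from_pretrained(model_name, torch_dtype=dtype)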

Moreover, the encode() method could wrap the forward pass in torch.inference_mode(), like:

with torch.inference_mode():
    outputs = self.model(**inputs)

which would significantly reduce the memory footprint of inference, since autograd no longer records activations for a backward pass.
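
Putting both changes together, a minimal sketch of what the class could look like (the constructor signature, the imports, and the CLS pooling are simplified assumptions for illustration; title/add_sep handling is omitted, and this is not the exact Pyserini code):

import torch
from transformers import AutoModel, AutoTokenizer

from pyserini.encode import DocumentEncoder


class AutoDocumentEncoder(DocumentEncoder):
    def __init__(self, model_name, tokenizer_name=None, device='cuda:0', fp16=False):
        self.device = device
        # Honor the fp16 flag at load time instead of dropping it.
        dtype = torch.float16 if fp16 else torch.float32
        self.model = AutoModel.from_pretrained(model_name, torch_dtype=dtype).to(device)
        self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_name or model_name)

    def encode(self, texts, titles=None, max_length=256, add_sep=False, **kwargs):
        inputs = self.tokenizer(texts, max_length=max_length, padding=True,
                                truncation=True, return_tensors='pt').to(self.device)
        # inference_mode() disables autograd entirely, so no activations
        # are retained for a backward pass during encoding.
        with torch.inference_mode():
            outputs = self.model(**inputs)
        # CLS pooling; cast back to float32, since downstream FAISS
        # index building typically expects float32 vectors.
        return outputs.last_hidden_state[:, 0, :].float().cpu().numpy()

The final cast back to float32 matters because the memory savings of fp16 are only needed during the forward pass; the stored embeddings should stay in the dtype the index builder expects.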
