
Loading mgenre models is taking 44GB RAM #102

Open
banyous opened this issue Sep 25, 2023 · 0 comments
banyous commented Sep 25, 2023

When I run this test code, my machine's memory usage shoots up to 44 GB, even though the model files are only about 7 GB on disk. I know that .pkl files can take up much more space in memory than they do on disk. What I'm wondering is whether there's a way to shrink these models' memory footprint when loading them, so I can run the code on machines with less RAM?
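As an aside, the on-disk vs. in-memory gap is easy to reproduce with a small stdlib-only sketch (synthetic data, nothing GENRE-specific): pickle stores strings compactly and memoizes repeated objects, while the live dict pays per-object overhead for every tuple, string, and list.

```python
import pickle
import tracemalloc

# Synthetic stand-in for lang_title2wikidataID: (lang, title) -> [IDs].
# The sizes are illustrative only.
tracemalloc.start()
d = {("en", f"Title {i}"): [f"Q{i}"] for i in range(100_000)}
in_memory, _peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

on_disk = len(pickle.dumps(d, protocol=pickle.HIGHEST_PROTOCOL))
print(f"pickled: {on_disk / 1e6:.1f} MB, in memory: {in_memory / 1e6:.1f} MB")
# The in-memory footprint is several times the pickled size.
```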

```python
import pickle

from genre.fairseq_model import mGENRE
from genre.trie import MarisaTrie, Trie

# mapping from (language, title) pairs to Wikidata IDs
with open("../lang_title2wikidataID-normalized_with_redirect.pkl", "rb") as f:
    lang_title2wikidataID = pickle.load(f)

# memory-efficient prefix tree (trie) implemented with `marisa_trie`
with open("../titles_lang_all105_marisa_trie_with_redirect.pkl", "rb") as f:
    trie = pickle.load(f)

# generate Wikipedia titles and language IDs
model = mGENRE.from_pretrained(
    "../fairseq_multilingual_entity_disambiguation.tar.gz"
).eval()

model.sample(
    sentences=["[START] Einstein [END] era un fisico tedesco."],
    # Italian for "[START] Einstein [END] was a German physicist."
    prefix_allowed_tokens_fn=lambda batch_id, sent: [
        e for e in trie.get(sent.tolist()) if e < len(model.task.target_dictionary)
    ],
    # map a "title >> lang" string to the highest-numbered Wikidata ID
    text_to_id=lambda x: max(
        lang_title2wikidataID[tuple(reversed(x.split(" >> ")))],
        key=lambda y: int(y[1:]),
    ),
    marginalize=True,
)
```
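One option worth trying for the big lookup table: instead of unpickling the whole `lang_title2wikidataID` dict into RAM, convert it once to an on-disk key-value store and look entries up lazily. Below is a minimal sketch using the stdlib `shelve` module; the `"lang||title"` key encoding and function names are my own illustration, not part of GENRE, and I haven't measured it against the full 105-language mapping. (If the fairseq checkpoint itself dominates, casting it to FP16 with `model.half()` on supported hardware is another common reduction, also untested here.)

```python
import os
import shelve
import tempfile

# Hypothetical helpers: trade RAM for per-lookup disk reads.

def build_db(mapping, db_path):
    """One-off conversion: write each (lang, title) -> IDs pair to disk."""
    with shelve.open(db_path) as db:
        for (lang, title), ids in mapping.items():
            db[f"{lang}||{title}"] = ids  # shelve keys must be str


def lookup(db_path, lang, title):
    """Load only the requested value from disk, not the whole table."""
    with shelve.open(db_path, flag="r") as db:
        return db[f"{lang}||{title}"]


if __name__ == "__main__":
    tiny = {("it", "Einstein"): ["Q937"]}  # toy stand-in for the real dict
    path = os.path.join(tempfile.mkdtemp(), "lang_title2wikidataID.db")
    build_db(tiny, path)
    print(lookup(path, "it", "Einstein"))  # ['Q937']
```

The `text_to_id` lambda in the repro above could then call such a `lookup` instead of indexing the in-memory dict, at the cost of slower per-candidate lookups.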