This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →


Why GPU Memory is not released after the pipeline is finished? #13114

Closed

Hansyvea opened this issue Nov 8, 2023 · 0 comments


Hansyvea commented Nov 8, 2023

Hi, I am using the trf model in a pipeline. It runs fine on the first batch but fails to load the second batch because the VRAM is not released.
I tried reducing the amount of data fed into the pipeline, but even after the pipeline finishes, the VRAM is still occupied.
How can I release the VRAM after one batch of work is done?

How to reproduce the behaviour

import numpy as np
import pandas as pd
import spacy
from tqdm import tqdm


def spacy_tokenise(df: pd.DataFrame, batch_size: int) -> pd.DataFrame:
    from thinc.api import set_gpu_allocator, require_gpu

    # manage gpu vram
    set_gpu_allocator("pytorch")
    # use GPU 0 (spacy.require_gpu() would also work)
    require_gpu(0)
    # Check if spaCy is using GPU
    print("spaCy is using GPU: ", spacy.prefer_gpu())
    # load model
    model = spacy.load("en_core_web_trf")
    docs = model.pipe(df.TEXT, batch_size=batch_size)
    res = []
    for doc in tqdm(docs, total=len(df.TEXT), desc="spaCy pipeline"):
        for sent in doc.sents:
            lst_token = [word.text for word in sent]
            lst_pos = [word.pos_ for word in sent]
            lst_lemma = [word.lemma_ for word in sent]
            lst_ner_token = [ent.text for ent in sent.ents]
            lst_ner_label = [ent.label_ for ent in sent.ents]
            if len(lst_ner_token) == 0:
                lst_ner_token = np.nan
                lst_ner_label = np.nan
            res.append(
                {
                    "token": lst_token,
                    "pos": lst_pos,
                    "lemma": lst_lemma,
                    "ner_token": lst_ner_token,
                    "ner_label": lst_ner_label,
                }
            )
    res = pd.DataFrame(res)
    return res
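One pattern sometimes tried between batches (a sketch only, assuming the PyTorch backend implied by `set_gpu_allocator("pytorch")`; `release_cached_vram` is a hypothetical helper, not a spaCy API) is to drop references to the pipeline objects and then ask PyTorch's caching allocator to return unused blocks to the driver:

```python
import gc


def release_cached_vram():
    """Run a garbage-collection pass, then ask PyTorch's CUDA caching
    allocator to release unused cached blocks back to the driver.

    Memory held by live objects (e.g. a loaded nlp pipeline or Doc
    objects still in scope) is NOT freed -- delete those references
    first, then call this helper.
    """
    gc.collect()  # drop cycles so tensor refcounts actually reach zero
    try:
        import torch
        if torch.cuda.is_available():
            # empty_cache() releases cached, unreferenced blocks;
            # it cannot free memory still referenced by live tensors.
            torch.cuda.empty_cache()
    except ImportError:
        pass  # torch not installed; nothing GPU-side to release
```

Usage after a batch would be along the lines of `del model, docs` followed by `release_cached_vram()`; whether this actually returns the VRAM in this setup is exactly what the question is about.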

Your Environment

  • Operating System: WSL2 Ubuntu 20
  • Python Version Used: 3.10
  • spaCy Version Used: 3
  • Environment Information: CUDA 12.3
@explosion explosion locked and limited conversation to collaborators Nov 8, 2023
@shadeMe shadeMe converted this issue into discussion #13117 Nov 8, 2023

