
[ktrain 0.25.4] Possible Memory Leak in preprocessor/hf_convert_examples #351

Closed
RAbraham opened this issue Apr 5, 2021 · 3 comments
Labels: user question (Further information is requested)

RAbraham commented Apr 5, 2021

Hi,
I'm investigating a memory leak in our application, and one of the signals I'm seeing is that preprocessor/hf_convert_examples may be leaking memory at line 383:

    return  TransformerDataset(np.array(features_list), np.array(labels))

Our code calls the following methods on ktrain:

class Bert:
    def __init__(self, model_name=None):
        self.predictor = ktrain.load_predictor(..)

    def predict(self, batch_string):
        res = self.predictor.predict(batch_string, return_proba=True)
        output = ..  # post-processing on res
        return output

I am using tracemalloc, running the above code in a for loop, and capturing and printing a memory snapshot after every iteration. Almost all other objects show stable memory usage, except for the ktrain allocation below, whose usage grows with every iteration (first iteration 248 KiB, last iteration 992 KiB):

../ktrain/text/preprocessor.py:383: size=248 KiB, count=17, average=14.6 KiB
../ktrain/text/preprocessor.py:383: size=496 KiB, count=32, average=15.5 KiB
../ktrain/text/preprocessor.py:383: size=744 KiB, count=49, average=15.2 KiB
../ktrain/text/preprocessor.py:383: size=992 KiB, count=67, average=14.8 KiB
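
For reference, the measurement loop was roughly of the following shape (a simplified sketch with placeholder names such as sample_batch, not the exact production code):

    # Simplified sketch of the tracemalloc measurement loop described above;
    # Bert and sample_batch are placeholders, not the exact production code.
    import tracemalloc

    tracemalloc.start()
    bert = Bert()
    for i in range(10):
        bert.predict(sample_batch)
        snapshot = tracemalloc.take_snapshot()
        # Print the top allocation sites, grouped by file and line number.
        for stat in snapshot.statistics("lineno")[:10]:
            print(stat)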

Could this be a memory leak? If so, is there anything I can do in my code to prevent it for now?

Additional info:

  • If this is a memory leak, it is larger in a lower version (i.e. 0.21.4), but I'm reporting against the latest version I can upgrade to. I can't upgrade to 0.26 right now. In fact, I'd prefer to stick with 0.21.4 for now if there are any workarounds.
amaiya added the user question label on Apr 5, 2021

amaiya (Owner) commented Apr 5, 2021

Hi @RAbraham

I wasn't able to reproduce this using the latest versions of ktrain and transformers and TensorFlow 2.3.1.

But if you look at preprocessor.py, it is not building any sort of cache or anything else that would cause a memory leak. It could be something related to your deployment setup. I'm not sure which version of TensorFlow you're using, but if there really is a memory leak, it may be in lower-level TensorFlow code (e.g., tf.data.Dataset, which is used by preprocessor). Also, the hf_convert_examples function and other portions of preprocessor.py have not changed for several versions now. One of the main differences across the ktrain versions you're testing is the version of transformers, so if the leak changes across versions, another possibility is that it is an issue with an older version of transformers. Like I said, I wasn't able to reproduce it with the latest ktrain, which uses transformers>=4.0.

One easy way to possibly address this issue is to deploy your model using ONNX, which allows you to deploy WITHOUT the need for TensorFlow/PyTorch/ktrain. Please see the example ONNX notebook, which shows you how to convert your ktrain-trained transformers model to ONNX. This allows you to deploy ktrain models with much smaller memory/storage footprints. I have done this using both Heroku and AWS Lambda, and it is quite efficient.
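
For reference, once the model has been exported to ONNX (the export steps are in the notebook mentioned above), inference can look roughly like the sketch below. The model path, tokenizer name, and max length are placeholders, and the exact graph input names depend on how the model was exported:

    # Sketch of ONNX-based inference with onnxruntime; "model.onnx",
    # "distilbert-base-uncased", and maxlen are placeholder values.
    import numpy as np
    import onnxruntime as ort
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    session = ort.InferenceSession("model.onnx")

    def predict_proba(texts, maxlen=128):
        enc = tokenizer(texts, padding=True, truncation=True,
                        max_length=maxlen, return_tensors="np")
        # Feed only the inputs the exported graph actually expects.
        input_names = {i.name for i in session.get_inputs()}
        feed = {k: np.asarray(v) for k, v in enc.items() if k in input_names}
        logits = session.run(None, feed)[0]
        # Softmax over the class dimension to get probabilities.
        exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
        return exp / exp.sum(axis=-1, keepdims=True)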

amaiya closed this as completed on Apr 5, 2021

amaiya (Owner) commented Apr 6, 2021

It may or may not be related to this TensorFlow issue. From the thread, a workaround is to use del and gc.collect().
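A rough sketch of that workaround applied to the wrapper class above (whether it actually helps depends on where the memory is being held):

    # Sketch of the del + gc.collect() workaround from the linked thread;
    # `batches` and the Bert wrapper are the placeholders used earlier.
    import gc

    bert = Bert()
    for batch in batches:
        probs = bert.predict(batch)
        # ... consume probs ...
        del probs
        gc.collect()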

However, as I mentioned before, the better solution would be to deploy your model using ONNX.

RAbraham (Author) commented Apr 6, 2021

Thank you very much for your detailed investigation of this issue 🙏
Sorry for not mentioning it earlier, but I'm on TF 2.2 and ktrain 0.25.4.
I did try del and gc.collect(), but that didn't change anything.
The ONNX recommendation is quite valuable!
I'll try out your suggestions. Thank you.
