Hi,
I'm investigating a memory leak in our application, and one of the signals I get is that preprocessor/hf_convert_examples may be leaking memory at line 383. Our code calls the following methods on ktrain:
I am using tracemalloc, running the above code in a for loop, and capturing and printing a memory snapshot after every iteration. Almost all other objects have stable memory usage, except for the ktrain usage shown below, which grows with every iteration (first iteration 248 KiB, last iteration 992 KiB).
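For context, the measurement loop looks roughly like this (a sketch; run_inference is a placeholder for our actual ktrain calls, and the iteration count is arbitrary):

```python
import gc
import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames of traceback per allocation

for i in range(100):   # arbitrary iteration count for illustration
    run_inference()    # placeholder for the ktrain preprocessing/prediction calls above
    gc.collect()
    snapshot = tracemalloc.take_snapshot()
    # group live allocations by source line and print the largest ones
    for stat in snapshot.statistics('lineno')[:10]:
        print(stat)
```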
Would this be a memory leak? If so, is there anything I can do in my code that can prevent this for now?
Additional info:
If this is a memory leak, it's larger in a lower version (i.e., 0.21.4), but I'm reporting against the latest version I can upgrade to. I can't upgrade to 0.26 right now; in fact, I'd prefer to stick with 0.21.4 for now if there are any workarounds.
I wasn't able to reproduce this using the latest versions of ktrain and transformers with TensorFlow 2.3.1.
But, if you look at preprocessor.py, it is not building any sort of cache or anything else that would cause a memory leak. It could be something related to your deployment setup. I'm not sure which version of TensorFlow you're using, but if there really is a memory leak, it may be in lower-level TensorFlow code (e.g., tf.data.Dataset, which is used by preprocessor). Also, the hf_convert_examples function and other portions of preprocessor.py have not changed for several versions now. One of the main differences across the ktrain versions you're testing is the version of transformers, so if the leak changes across versions, another possibility is that it is an issue with an older version of transformers. Like I said, I wasn't able to reproduce this with the latest ktrain, which uses transformers>=4.0.
One easy way to possibly address this issue is to deploy your model using ONNX, which allows you to deploy WITHOUT the need for TensorFlow/PyTorch/ktrain. Please see the example ONNX notebook, which shows you how to convert your ktrain-trained transformers model to ONNX. This lets you deploy ktrain models with a much smaller memory/storage footprint. I have done this with both Heroku and AWS Lambda, and it is quite efficient.
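If it helps, serving the exported model with onnxruntime looks roughly like this (a sketch, not the notebook's exact code; the model path, tokenizer name, and input names input_ids/attention_mask are assumptions — check your export for the actual values):

```python
import numpy as np
import onnxruntime
from transformers import AutoTokenizer

# paths and names here are illustrative; use whatever your export produced
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
session = onnxruntime.InferenceSession("model.onnx")

tokens = tokenizer("some input text", return_tensors="np",
                   padding="max_length", truncation=True, max_length=128)
inputs = {
    "input_ids": tokens["input_ids"].astype(np.int64),
    "attention_mask": tokens["attention_mask"].astype(np.int64),
}
logits = session.run(None, inputs)[0]  # raw model outputs; apply softmax for probabilities
print(logits)
```

Note that this only needs onnxruntime, numpy, and a tokenizer at serving time, which is where the memory/storage savings come from.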
Thank you very much for your detailed investigation into this issue 🙏
Sorry for not mentioning it earlier: I'm on TF 2.2 and ktrain 0.25.4.
I did try del and gc.collect(), but that didn't change anything.
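Concretely, the cleanup I tried looked roughly like this (variable names are placeholders for the objects our code actually holds):

```python
import gc

# drop references to the objects returned by the ktrain calls
# (predictor/preproc are placeholder names, not our real variables)
del predictor, preproc
gc.collect()  # force a full collection; reported memory usage did not change
```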
The ONNX recommendation is quite valuable!
I'll try out your suggestions. Thank you.