transparentmodel is a Python package that provides a convenient wrapper for tracking memory usage and performance metrics (such as FLOPs) around large language models. It aims to simplify the process of monitoring resource consumption and computational efficiency during inference, helping researchers and developers optimize their models and deployments.
Currently, the package supports tracking memory usage and FLOPs for models based on the Transformers library. However, future updates are planned to extend support to other frameworks and architectures.
- Measure memory usage during model inference.
- Calculate FLOPs (floating-point operations) for the model.
- Compatible with models based on the Transformers library.
- Easy-to-use API for integrating with existing codebases.
- Extensible design to support additional frameworks and architectures in the future.
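To give a feel for what a FLOPs estimate involves, here is a minimal sketch using a common rule of thumb (roughly 2 FLOPs per model parameter per generated token for a dense transformer forward pass). The function name and formula are illustrative assumptions, not the package's actual algorithm, which may count operations in more detail.

```python
def estimate_generation_flops(n_params: int, n_tokens: int) -> int:
    """Rough forward-pass FLOPs for dense transformers:
    ~2 FLOPs per parameter per generated token (rule of thumb)."""
    return 2 * n_params * n_tokens

# e.g. a 7B-parameter model generating 128 tokens:
flops = estimate_generation_flops(n_params=7_000_000_000, n_tokens=128)
print(f"{flops:.3e} FLOPs")  # ~1.8e12
```

Real measurements also depend on batch size, sequence length, and attention implementation, which is why instrumenting the actual model is more reliable than a closed-form estimate.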
You can install the package from source using pip:

```shell
git clone https://github.com/dpleus/transparentmodel
cd transparentmodel
pip install .
```
Inference:

```python
from transparentmodel.huggingface import inference

# Replace the original inference call with the wrapped one
# output = model.generate(input_tokens)
output = inference.generate_with_memory_tracking(model, realtime=True)
```
Training:

```python
from transparentmodel.huggingface.training import train_with_memory_tracking

# Replace the original training call with the wrapped one
# trainer.train()
train_with_memory_tracking(trainer, realtime=True)
```
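The wrapping pattern used above can be sketched with the standard library alone. The helper below is hypothetical (it is not part of transparentmodel) and uses `tracemalloc`, which only sees Python-heap allocations; the package itself additionally reports process-level RAM and GPU memory.

```python
import tracemalloc

def generate_with_host_memory_tracking(generate_fn, *args, **kwargs):
    # Hypothetical illustration of the wrapping idea: run the original
    # callable unchanged and report peak Python-heap usage in bytes.
    tracemalloc.start()
    try:
        output = generate_fn(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return output, peak

# Any callable can be wrapped; a real use would pass model.generate.
out, peak_bytes = generate_with_host_memory_tracking(lambda: [0] * 100_000)
```

Because the wrapper takes the original callable and forwards its arguments, existing code only needs its call site changed, which matches how the package's wrapped functions are used.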
Metrics
- System memory: for both GPUs and RAM
- Model metrics: parameter memory and dtype (plus activations and gradients during training)
- Memory tracking: sampled once per second for both GPU and RAM (if `realtime=True`)
- Summary: minimum free RAM and peak utilization (CPU only for now)
- Deep dive: compute time and memory usage per sub-operation
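The "deep dive" idea of attributing compute time to individual sub-operations can be sketched with a small decorator. Everything here (the decorator, the registry, the stub function) is an illustrative assumption about how such per-operation accounting might work, not the package's implementation.

```python
import time
from collections import defaultdict

# Accumulated wall-clock time per named sub-operation.
op_times = defaultdict(float)

def timed(name):
    # Hypothetical decorator: wrap a function and add its elapsed
    # wall-clock time to the registry under the given name.
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            op_times[name] += time.perf_counter() - start
            return result
        return inner
    return wrap

@timed("attention")
def attention_stub(x):
    # Stand-in for a real sub-operation such as an attention block.
    return sum(x)

attention_stub(range(1000))
```

After a run, `op_times` holds a per-operation breakdown that can be sorted to find hot spots; memory usage per sub-operation can be collected the same way with a memory probe in place of the timer.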
For detailed instructions and more advanced usage examples, please refer to the documentation.
Contributions are welcome! If you encounter any issues or have suggestions for improvements, please open an issue on the GitHub repository.
If you would like to contribute code, please follow the contribution guidelines and submit a pull request.
This project is licensed under the MIT License.