Replies: 2 comments
It depends on the source model representation. For a PyTorch model, we run torch.jit.trace under the hood using the example_input you provide as a convert_model parameter. After the model is traced, we walk over the trace and convert each operation from its Torch representation to OpenVINO operations (one or more for each source operation). Along this path we try to keep large weight tensors in their original memory without copying them, so when you convert a PyTorch model, part of the weights in the resulting ov.Model are shared with the original PyTorch model. This is quite handy for LLMs, for example. When you save the model or call compile_model, the weights are copied.

When converting a model from a file, for example TF or ONNX, we can sometimes avoid loading the original model into memory completely, depending on which type of model file is used. For example, for an ONNX model whose weights are stored in a set of separate files, we do not load them fully into memory during conversion. The same is true for the TF saved_model representation. But when you compile the model with compile_model, all of the weights will be required in memory.
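A minimal sketch of the two paths described above, assuming OpenVINO 2023.1+ (the `openvino` package with `ov.convert_model`) and PyTorch installed; the toy model and file paths are illustrative only:

```python
import torch
import openvino as ov

# 1) In-memory PyTorch path: convert_model traces the model (torch.jit.trace
#    under the hood) with example_input, producing an ov.Model whose large
#    constants may still share memory with the original PyTorch tensors.
torch_model = torch.nn.Linear(1024, 1024).eval()   # toy model for illustration
example_input = torch.randn(1, 1024)
ov_model = ov.convert_model(torch_model, example_input=example_input)

# Saving or compiling is the point where OpenVINO makes its own copy of the weights.
ov.save_model(ov_model, "linear.xml")                 # weights written out to linear.bin
compiled = ov.Core().compile_model(ov_model, "CPU")   # weights copied into the device plugin

# 2) File-based path: for an ONNX model (possibly with external weight files) or a
#    TF saved_model directory, conversion can avoid loading all original weights,
#    but compile_model still needs the full weights in memory.
# ov_model_from_file = ov.convert_model("model.onnx")        # illustrative path
# ov_model_from_tf   = ov.convert_model("saved_model_dir")   # illustrative path
# compiled_from_file = ov.Core().compile_model(ov_model_from_file, "CPU")
```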
Can someone please explain the flow of taking a PyTorch/TF/ONNX model through the convert_model API? What happens behind the scenes? I would like to understand whether there are any memory issues with keeping the converted model in memory.