I find this truly fascinating! Have you come across any compression methods such as pruning, distillation, or quantization that could be applied to this model? I'm aware there are some size options, but it would be great if we could use compression techniques for more efficient inference and deployment on edge devices.
I'd recommend knowledge distillation, which trains a smaller "student" model to mimic a larger "teacher". Since GLiNER uses a BERT-like architecture, it should be a good fit. I've used distillation for sentiment-classification models, and the results were very efficient in terms of compute, model size, and accuracy.
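To make the idea concrete, here's a minimal sketch of the classic distillation loss (Hinton et al., 2015): the student is trained to match the teacher's temperature-softened output distribution via KL divergence. This is a generic illustration in NumPy, not GLiNER-specific code; the logits and temperature values below are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about non-target classes.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence from the student's softened distribution to the
    # teacher's, scaled by T^2 so gradients stay comparable across T.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return float(np.mean(kl)) * temperature ** 2

# Hypothetical logits: batch of two examples, three classes.
teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.5, 0.1]])
student = np.array([[3.0, 1.5, 0.5], [0.5, 3.0, 0.2]])
print(distillation_loss(student, teacher))
```

In practice this term is usually combined with the ordinary cross-entropy loss on the true labels, weighted by a mixing coefficient.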