I find this truly fascinating! Have you come across any compression methods such as pruning, distillation, or quantization that could be applied to this model? I'm aware there are some size options, but it would be great if we could use compression techniques for more efficient inference and deployment on edge devices.
I'd recommend knowledge distillation, which trains a smaller "student" model to mimic a larger "teacher". Since GLiNER uses a BERT-like architecture, it should be a good fit. I've used distillation for sentiment-classification models, and the results were very efficient in terms of compute, model size, and accuracy.
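To make the idea concrete, here's a minimal sketch of the classic distillation loss (Hinton et al., 2015): the student is trained to match the teacher's temperature-softened output distribution via KL divergence. This is a generic illustration in NumPy, not GLiNER-specific code; the logits and temperature values below are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about non-target classes.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence from the student's softened distribution to the
    # teacher's, scaled by T^2 so gradients stay comparable across T.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return float(np.mean(kl)) * temperature ** 2

# Hypothetical logits: batch of two examples, three classes.
teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.5, 0.1]])
student = np.array([[3.0, 1.5, 0.5], [0.5, 3.0, 0.2]])
print(distillation_loss(student, teacher))
```

In practice this term is usually combined with the ordinary cross-entropy loss on the true labels, weighted by a mixing coefficient.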