
Another interesting paper related to this idea #3 (Open)

jukofyork opened this issue Apr 22, 2024 · 3 comments

Comments

@jukofyork

Not all Layers of LLMs are Necessary during Inference (v2)

@shamanez (Member)

Interesting!

@jukofyork (Author) commented Apr 23, 2024

I've also read another paper along these lines (but can't find it atm) that suggests the layers marked by the yellow areas at the bottom:

[image attachment]

are just doing some kind of "averaging", and IIRC the paper suggests replacing these penultimate layers with some other, much cheaper operation (it was a training-time modification rather than a post-training one, though).
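For illustration only, here is a minimal PyTorch sketch of that general idea: swap a couple of late decoder blocks for a cheap stand-in. The `CheapAveragingLayer`, the cumulative-mean operation, the layer indices, and the `model.model.layers` (LLaMA-style) layout are all assumptions made for the example, not what the still-unidentified paper actually proposes; that paper also applied the change at training time rather than as a post-hoc swap.

```python
import torch
import torch.nn as nn


class CheapAveragingLayer(nn.Module):
    """Hypothetical stand-in for a full transformer block: smooths the
    hidden states with a running average instead of attention + MLP."""

    def forward(self, hidden_states, *args, **kwargs):
        # Cumulative mean over the sequence dimension, used here purely as a
        # placeholder for whatever "averaging" the paper actually proposed.
        seq_len = hidden_states.size(1)
        counts = torch.arange(1, seq_len + 1, device=hidden_states.device)
        averaged = hidden_states.cumsum(dim=1) / counts.view(1, -1, 1)
        # Return a tuple so callers that expect (hidden_states, ...) still work.
        return (averaged,)


def replace_penultimate_layers(model, n_replaced=2):
    """Swap the last-but-one blocks of a decoder-only model for the cheap op.
    Assumes the blocks live in model.model.layers (LLaMA-style layout)."""
    layers = model.model.layers
    for idx in range(len(layers) - 1 - n_replaced, len(layers) - 1):
        layers[idx] = CheapAveragingLayer()
    return model
```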

@jukofyork (Author) commented Apr 23, 2024

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Can't find the paper about "averaging" though.
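If it's useful, here is a rough sketch of how a ShortGPT-style redundancy score could be computed: rate each layer by how much it actually changes the hidden states (one minus the cosine similarity between a layer's input and output), and treat low-change layers as pruning candidates. This assumes a HuggingFace-style model that exposes per-layer hidden states via `output_hidden_states=True`; it's my reading of the idea, not the authors' code.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def layer_redundancy_scores(model, input_ids):
    """Score each layer by how much it changes the hidden states; a layer
    whose output is nearly parallel to its input is doing little work."""
    outputs = model(input_ids, output_hidden_states=True)
    hs = outputs.hidden_states  # tuple: embeddings + one entry per layer
    scores = []
    for h_in, h_out in zip(hs[:-1], hs[1:]):
        # Cosine similarity between a layer's input and output, averaged
        # over all token positions; 1 - similarity ~ "block influence".
        sim = F.cosine_similarity(h_in, h_out, dim=-1).mean().item()
        scores.append(1.0 - sim)
    return scores  # low score => layer is a candidate for removal
```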
