Instead of duplicating the same spill and unspill instructions, we could encode the spilling and unspilling in a special, non-preemptible, argumentless wrapper around morestack. One such wrapper would be generated for every combination of register arguments encountered, and identical wrappers would be deduplicated.
So the above sequence just becomes:
Alternatively (though I assume this could make stack traces harder), we could save the return address at which to continue after stack growth, then JMP into a wrapper that spills the arguments and calls morestack, and finally JMP back to the saved return address, similar to how duffzero and duffcopy return to the function that used them.
An alternative that may work better for rarely used argument combinations would be to encode the registers used as arguments into a compact value (1 bit per register) and load it into a scratch register before calling a generic spilling wrapper. The wrapper would then consist of a series of bit tests on the scratch register, spilling each register whose bit is set.
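To make the bitmask idea concrete, here is a rough Go model of it. It only simulates the encoding and the bit tests; the register names, their numbering, and the helper names `encodeRegs` and `spillOrder` are made up for illustration and do not reflect the real ABI register order or any existing runtime code.

```go
package main

import "fmt"

// Hypothetical register numbering, for illustration only.
const (
	regAX = iota
	regBX
	regCX
	regDI
	numRegs
)

// encodeRegs packs a set of register indices into a mask,
// using one bit per register.
func encodeRegs(regs ...int) uint32 {
	var mask uint32
	for _, r := range regs {
		mask |= 1 << uint(r)
	}
	return mask
}

// spillOrder models the generic wrapper: it tests one bit per
// register and records each register whose bit is set, in the
// order a real wrapper would spill them.
func spillOrder(mask uint32) []int {
	var spilled []int
	for r := 0; r < numRegs; r++ {
		if mask&(1<<uint(r)) != 0 {
			spilled = append(spilled, r)
		}
	}
	return spilled
}

func main() {
	// A function taking arguments in AX and CX would load this
	// mask into the scratch register before calling the wrapper.
	mask := encodeRegs(regAX, regCX)
	fmt.Printf("mask=%04b spill=%v\n", mask, spillOrder(mask))
}
```

The trade-off this shows is that each function pays only for one constant load and one call, while the wrapper pays a test and a branch for every register, including the ones that are not in use.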
I don't have immediate plans to work on this, but I see large potential for binary size savings. I am interested in what complexities are involved, and whether others have already thought about this or are already working on reducing the binary space used for spilling arguments.
Ack on also needing to take offsets and sizes into account (I noticed that too while coding a prototype, after posting but before your comment). There still seems to be a lot of overlap in the spill sequences. That said, I agree that having a generic spill space and associated generic code in morestack that does the spilling would be even better.
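As a rough illustration of why offsets and sizes matter for deduplication, the sketch below keys each spill sequence on register, offset, and size, and counts how many distinct wrappers would result. The `spill` type, its fields, and the sample values are hypothetical and are not taken from the prototype.

```go
package main

import (
	"fmt"
	"strings"
)

// spill describes one register store. All fields are hypothetical.
type spill struct {
	reg    string
	offset int // stack offset in bytes
	size   int // store width in bytes
}

// key builds a canonical string for a spill sequence, so that two
// sequences share a wrapper only if register, offset, and size all match.
func key(seq []spill) string {
	var b strings.Builder
	for _, s := range seq {
		fmt.Fprintf(&b, "%s@%d:%d;", s.reg, s.offset, s.size)
	}
	return b.String()
}

// countWrappers returns how many distinct wrappers would be needed
// for the given per-function spill sequences.
func countWrappers(funcs [][]spill) int {
	seen := make(map[string]struct{})
	for _, seq := range funcs {
		seen[key(seq)] = struct{}{}
	}
	return len(seen)
}

func main() {
	funcs := [][]spill{
		{{"AX", 8, 8}, {"BX", 16, 8}},
		{{"AX", 8, 8}, {"BX", 16, 8}}, // identical: shares a wrapper
		{{"AX", 8, 4}, {"BX", 16, 8}}, // same registers, different size
	}
	fmt.Println(countWrappers(funcs))
}
```

Keying on the register set alone would merge the third sequence with the first two, which would be incorrect because the stores differ in width.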