Improve binary size of generated code #908
Comments
I have done some analysis, calculating the symbol sizes from the output of arm-eabi-nm --print-size --size-sort --radix=d obj/main. Most of the symbols are quite small, with 4 exceptions (sizes are decimal):
All of them have in common that they contain huge case statements (multiple hundred cases). Knowing that generics increase the code size, we have to find a better way to express the functionality than with generics. However, since not all of these subprograms use generics but all of them contain huge case statements, we may have to find another solution for this. |
I have also identified these large case statements as major sources of proof slowness. So refactoring that would be doubly beneficial. |
I wonder if the generic instances actually have a big impact on the code size. These generic functions just consist of one function call, so I would expect the impact to be rather small.
@kanigsson I think then it would make sense that you have a look at how the mentioned functions could be improved (also as part of #767). |
I agree. |
|
As it seems, gcc is doing aggressive inlining here. I manually added |
This is unexpected, especially given |
So I just got the hint to use |
I noticed another source of problems. E.g. in

    case Fld is
       when F_Major_Version .. F_Negotiate_Algorithms_Request_Req_Alg_Structs =>
          Ctx.Verified_Last := ((Field_Last (Ctx, Fld) + 7) / 8) * 8;
       when F_Get_Digests_Request_Param_1 =>
          Ctx.Verified_Last := ((Field_Last (Ctx, Fld) + 7) / 8) * 8;
       when F_Get_Measurements_Request_Reserved_1 =>
          Ctx.Verified_Last := ((Field_Last (Ctx, Fld) + 7) / 8) * 8;
       when F_Get_Version_Request_Param_1 =>
          Ctx.Verified_Last := ((Field_Last (Ctx, Fld) + 7) / 8) * 8;
       when F_Key_Exchange_Request_Measurement_Summary_Hash_Type =>
          Ctx.Verified_Last := ((Field_Last (Ctx, Fld) + 7) / 8) * 8;
       when F_Key_Update_Request_Key_Operation =>
          Ctx.Verified_Last := ((Field_Last (Ctx, Fld) + 7) / 8) * 8;
       when F_Negotiate_Algorithms_Request_Req_Alg_Struct_Count =>
          Ctx.Verified_Last := ((Field_Last (Ctx, Fld) + 7) / 8) * 8;

You get the idea. This statement can be reduced to:

    case Fld is
       when F_Major_Version .. F_Negotiate_Algorithms_Request_Req_Alg_Structs =>
          Ctx.Verified_Last := ((Field_Last (Ctx, Fld) + 7) / 8) * 8;
    end case;

This change for this single instance alone reduces the code size by 6kb. |
Adding |
That is related to #644. This case statement was introduced to increase the provability of the context predicate. All branches contain the exact same statement. There are more instances of this. |
Yes, you are right. That was a typo. |
Is there a way to easily reverse this behaviour? |
Removing these case statements in the code generator is quite simple. There are three instances in |
With

    function Field_First (Ctx : Context; Fld : Field) return RFLX_Types.Bit_Index is
    begin
       if Fld = F_Major_Version then
          return Ctx.First;
       end if;
       pragma Assert (Fld = F_Minor_Version and then Ctx.Cursors (Fld).Predecessor = F_Major_Version and then Ctx.Cursors (F_Major_Version).Value.Major_Version_Value = 1);
       pragma Assert (Fld = F_Code and then Ctx.Cursors (Fld).Predecessor = F_Minor_Version and then Ctx.Cursors (F_Minor_Version).Value.Minor_Version_Value <= 1);
       ...
       return Ctx.Cursors (Ctx.Cursors (Fld).Predecessor).Last + 1;
    end Field_First;

This has the advantage that the checking code (which I suppose is handled by the precondition?) is not compiled and that no exception raises are required. This also seems to apply to

I think we should generally try to avoid these kinds of case statements that include lots of exception raises. Each raise causes a jump to a local exception call. While this is already the most efficient way to raise the exception in each branch, it still makes up roughly half of the generated code in these functions. |
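
A minimal, self-contained sketch of the idea above, for comparison. All names (Assert_Demo, the three-value Field type, Ctx_First, Ctx_Cursors) are hypothetical and much simpler than the generated code; the first variant is only an assumption about what a case statement with a raise per branch might look like. The second variant states the predecessor relation once in a pragma Assert, which GNAT does not compile into the binary unless assertions are enabled (e.g. with -gnata).

```ada
with Ada.Text_IO;

procedure Assert_Demo is

   --  Hypothetical, simplified stand-ins for the generated types.
   type Field is (F_First, F_Second, F_Third);

   type Cursor is record
      Predecessor : Field;
      Last        : Natural;
   end record;

   type Cursors is array (Field) of Cursor;

   Ctx_First   : constant Natural := 1;
   Ctx_Cursors : constant Cursors :=
     (F_First  => (Predecessor => F_First, Last => 8),
      F_Second => (Predecessor => F_First, Last => 16),
      F_Third  => (Predecessor => F_Second, Last => 24));

   --  Variant 1: each branch checks its predecessor and raises otherwise,
   --  so each branch carries its own jump to exception handling code.
   function First_With_Raises (Fld : Field) return Natural is
   begin
      case Fld is
         when F_First =>
            return Ctx_First;
         when F_Second =>
            if Ctx_Cursors (Fld).Predecessor = F_First then
               return Ctx_Cursors (F_First).Last + 1;
            else
               raise Program_Error;
            end if;
         when F_Third =>
            if Ctx_Cursors (Fld).Predecessor = F_Second then
               return Ctx_Cursors (F_Second).Last + 1;
            else
               raise Program_Error;
            end if;
      end case;
   end First_With_Raises;

   --  Variant 2: one return statement; the predecessor relation is only
   --  stated as an assertion and is not checked at run time by default.
   function First_With_Assert (Fld : Field) return Natural is
   begin
      if Fld = F_First then
         return Ctx_First;
      end if;
      pragma Assert
        ((Fld = F_Second and then Ctx_Cursors (Fld).Predecessor = F_First)
         or else
         (Fld = F_Third and then Ctx_Cursors (Fld).Predecessor = F_Second));
      return Ctx_Cursors (Ctx_Cursors (Fld).Predecessor).Last + 1;
   end First_With_Assert;

begin
   for F in Field loop
      --  Both variants agree for every field of this toy context.
      pragma Assert (First_With_Raises (F) = First_With_Assert (F));
      Ada.Text_IO.Put_Line
        (Field'Image (F) & Natural'Image (First_With_Assert (F)));
   end loop;
end Assert_Demo;
```

Whether the second form is actually smaller depends on the compiler and switches in use, so the symbol sizes should be compared before and after.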
The currently two largest functions are |
Could you please collect all your findings where the code generator needs to be modified in a task list? I think that would be helpful to keep track. |
I guess a function call with |
That would get rid of the exception code, however it would not get rid of a jump/call in each branch of the case statement. And this is what currently increases the code size. The problem is that the compiler can't infer that these functions will never be called.
Putting the case statement as is into a post condition shouldn't be a problem in terms of code size since the compiler doesn't compile that. We could also put it into a ghost function and use that. |
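
The ghost function idea could be sketched roughly as follows. This is an illustration under assumptions, not the generated code: Ghost_Demo, Expected_First, Table and the three-value Field type are hypothetical names. The case expression lives in a function marked with the GNAT aspect Ghost and is referenced only from a postcondition, so it should only end up in the binary when assertions and ghost code are enabled (e.g. with -gnata).

```ada
with Ada.Text_IO;

procedure Ghost_Demo is

   --  Hypothetical, simplified field type.
   type Field is (F_A, F_B, F_C);
   type Offsets is array (Field) of Natural;

   --  Specification only: the large case expression is kept in a ghost
   --  function, which is ignored by the compiler unless ghost code is
   --  enabled.
   function Expected_First (Fld : Field) return Natural is
     (case Fld is
         when F_A => 0,
         when F_B => 8,
         when F_C => 24)
   with Ghost;

   Table : constant Offsets := (F_A => 0, F_B => 8, F_C => 24);

   --  The executable function contains no case statement; the ghost
   --  function is referenced from the postcondition only.
   function Field_First (Fld : Field) return Natural
   with Post => Field_First'Result = Expected_First (Fld);

   function Field_First (Fld : Field) return Natural is
     (Table (Fld));

begin
   for F in Field loop
      Ada.Text_IO.Put_Line
        (Field'Image (F) & Natural'Image (Field_First (F)));
   end loop;
end Ghost_Demo;
```

Building once with and once without -gnata and comparing the symbol sizes would show whether the specification code really stays out of the binary.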
After some discussions it seems that the aggressive inlining is caused by expression functions. Expression functions are always inlined (see V127-006). With GNAT 23.0w using |
All realistic optimizations have been done. |
Potential improvements:
pragma Restrictions (No_Exceptions)
pragma Restrictions (No_Secondary_Stack)
The code size is calculated as the size of the .text section of the generated binary minus the size of the .text section of a reference binary that is built from an empty main procedure.
Important compiler switches:
-ffunction-sections, -fdata-sections, -Wl,-gc-sections
-Os
-gnatp
-fno-inline / -gnatd.8 (for GNAT 22.1, not needed for 23.0w)
-fno-early-inlining
Improvements in the code generator:
Group case statements with identical branches into a single case
Group if-else statements with identical branches into a single expression
Replace case statements that use exceptions with assertions (at least if many/all branches are equal)
Replace case statements that generate boolean values with single boolean expressions
Rewrite case statements in expression functions that return case based enum values as regular functions with single returns
Remove code duplications in RFLX_Generic_Types (merge Insert functions)
(Part 1)
pragma Restrictions (No_Exceptions)
Set_Field_Value_* into Set_*
Incomplete_Message and Initialized by using quantified expressions (no effect on binary size, but maybe positive effect on proof time)
Valid_Length
Path_Condition
Field_Size
Set_* functions
Extract and Insert
Composite_Field
(Part 2)
(Part 3)
Field_Dependent_Value) and therefore case statements
(Part 4)
(Part 5)
Has_Buffer of Verify into precondition
(Part 6)
RFLX_Exception variable with gotos. Currently this variable is used to stay inside a block and call Update on the buffer at the end of that block before jumping via the goto. However, it seems to be more size efficient to call Update and jump directly.
Available_Field (Ctx, Field) >= Field_Size (Ctx, Field) with separate function.
(Part 7)
Available_Space once (Message.size must consider message path)
Valid_Next
(Ideas for messages)
Field_Condition and Successor
Successor for Valid_Predecessor (Improve binary size of generated code #908 (comment))
Valid_Predecessor? (decreases provability, as transformation of expression function into function necessary)
-gnatD (don't ask me why this works)
(Ideas for sessions)
(Rejected ideas)
Add Inline_Always to RFLX.RFLX_Generic_Types.(Insert|Insert_LE|Extract|Extract_LE) (no effect when using 23.0w)
, Get_Field_Value Set_Field_Value and into separate functions (negative impact for Verify Get_Field_Value and Verify)
Refactor Successor and Valid_Predecessor (negative impact)
Remove duplication in Valid, Invalid and Incomplete for fields and cursors (slight negative impact on binary size)
Remove duplications in Field_Condition, Path_Condition and (Structural_|)Valid_Message by adding Link_Condition function (no impact on binary size)
Add precalculation of Val.Fld to Set
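
One of the code generator improvements listed above, replacing case statements that generate boolean values with a single boolean expression, might look like the following sketch. The names (Bool_Demo, Is_Scalar_Case, Is_Scalar_Expr, the four-value Field type) are hypothetical and only illustrate the transformation; they are not taken from the generated code.

```ada
with Ada.Text_IO;

procedure Bool_Demo is

   --  Hypothetical, simplified field type.
   type Field is (F_A, F_B, F_C, F_D);

   --  Before: one branch per field, each yielding a Boolean literal.
   function Is_Scalar_Case (Fld : Field) return Boolean is
     (case Fld is
         when F_A => True,
         when F_B => True,
         when F_C => False,
         when F_D => True);

   --  After: the same result expressed as a single comparison.
   function Is_Scalar_Expr (Fld : Field) return Boolean is
     (Fld /= F_C);

begin
   for F in Field loop
      --  Both formulations agree for every field.
      pragma Assert (Is_Scalar_Case (F) = Is_Scalar_Expr (F));
      Ada.Text_IO.Put_Line
        (Field'Image (F) & " " & Boolean'Image (Is_Scalar_Expr (F)));
   end loop;
end Bool_Demo;
```

For enumerations with hundreds of literals, a membership test such as Fld in F_A .. F_B | F_D is another way to write the single expression.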