@NBruschi
Contributor

This patch disables tiny section relocation, which confuses the linker and causes a link-time error when -flto is passed as a linker flag (LDFLAG).

Member

@FrancescoConti left a comment

Given the strong impact on performance if -flto cannot be enabled, two thumbs up from me 👍 👍

@gautschimi

Is this a fix for -flto or just a workaround?

@NBruschi
Contributor Author

It should be a fix: if you don't define CONFIG_NO_STD_RELOC, you don't use the assembly code optimizations that break -flto.

@FrancescoConti
Member

From the discussion we had this morning with @gtagliavini, it looks to me like the current behavior provides some benefit in very fine-grained cases (e.g., calling a runtime function requires one fewer instruction), but prevents the use of -flto. For realistic benchmarks (https://github.com/pulp-platform/dory_examples) we see a 15-20% performance improvement with this change.
So I think the CONFIG_NO_STD_RELOC define can be useful for micro-benchmarks, but should be disabled by default.

@gautschimi

I think you don't need the define. Just use the version that works with -flto; the few additional instructions are negligible.

But as far as I understand, it's more a problem in the toolchain, which cannot handle these assembly constructs. So this is more a workaround than a fix.

@NBruschi
Contributor Author

The next step will be to improve the toolchain so that it properly supports that construct. We are planning how and when to change it, and will likely do so in a future release. Until then, so as not to compromise compatibility with other versions, I have added a define that is disabled by default. Anyone whose toolchain version supports that construct can enable the more optimized runtime routine calls at build time.
Moreover, I think it is useful for small synthetic benchmarks that benefit considerably from the fine-grained optimization of runtime functions; with those parts removed, such benchmarks could no longer be evaluated.
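
For readers who want to picture the opt-in mechanism described above, here is a minimal sketch in plain C. It is not the actual pulp-sdk code: `rt_demo_call` and `rt_uses_tiny_reloc` are hypothetical names, and the optimized branch is only modeled as a flag, since the real assembly call sequence is not shown in this conversation. The only name taken from the PR is `CONFIG_NO_STD_RELOC`.

```c
/* Hypothetical stand-in for a runtime routine; not a real pulp-sdk function. */
static int rt_demo_call(int x)
{
    return x + 1;
}

/*
 * Reports which call path was selected at compile time.
 * Assumption: defining CONFIG_NO_STD_RELOC opts in to the hand-optimized
 * (tiny-section) path; leaving it undefined selects the standard path,
 * which is the one that stays compatible with -flto.
 */
static int rt_uses_tiny_reloc(void)
{
#ifdef CONFIG_NO_STD_RELOC
    return 1; /* opt-in: optimized call sequence would be used here */
#else
    return 0; /* default: standard relocation, -flto-friendly */
#endif
}
```

Under these assumptions, a default build takes the standard path, while compiling with `-DCONFIG_NO_STD_RELOC` would select the optimized one, which only makes sense on a toolchain that can link it.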

@FrancescoConti merged commit 7863b6f into main Feb 1, 2021
@NBruschi deleted the fix_lto branch May 4, 2021 09:51
Victor-Jung pushed a commit to Victor-Jung/pulp-sdk that referenced this pull request May 22, 2025
Added CONFIG_NO_STD_RELOC: Avoids tiny section -> Fixes linking-time …