Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AMIGAOS: Use LTO, -O3 and dead code elimination by default #4238

Merged
merged 4 commits into from Sep 10, 2022

Conversation

raziel-
Copy link
Contributor

@raziel- raziel- commented Aug 26, 2022

This already fixes a strange hiccup in Grim videos, probably more

@lephilousophe
Any objections regarding buildbot?

@lephilousophe
Copy link
Member

lephilousophe commented Aug 26, 2022

It was pretty slow when building the Vita port but it was optimizing the whole binary.
I don't know... I am not fond of this.
What do you mean by hiccup? Small freeze or error?

@raziel-
Copy link
Contributor Author

raziel- commented Aug 26, 2022

I can do it locally with LDFLAGS envva, if you have doubts.

The whole intro video has hiccups in the way that everytime there was a change in the scene it would slow down to a crawl (both FPS-wise and the viseo playback, until it gains momentum back and plays real-time again, just to be slowed down by the next piece)
i.e. Manny saying "Me? But i'm your friend..." or when Selso starts tapping his foot etc.)
It looked like it was loading something in first before playing the next video piece, this is gone completely with LTO enabled

@raziel-
Copy link
Contributor Author

raziel- commented Aug 26, 2022

@lephilousophe

https://c.web.de/@309311521245107413/BIVPjCt1T6mTTG27NSbFCA

Bad quality, smartphone, but you may get what I mean with "hiccups"

@raziel-
Copy link
Contributor Author

raziel- commented Aug 26, 2022

Actually, no, i can't do it locally.
Using LDFLAGS and/or CXXFLAGS will still error out on not finding a suitable compiler.

I have to do it within configure.
It's limited to AmigaOS anyway, isn't it?

@lephilousophe
Copy link
Member

lephilousophe commented Aug 27, 2022

Could you make sure that optimizations are enabled? I just noticed that even the buildbot builds (where we use --enable-optimizations) have a -O0 flag.
If performance is an issue, I would first ensure that -O2 is used.
It's probable that using LTO force enables the optimizer.

@raziel-
Copy link
Contributor Author

raziel- commented Aug 27, 2022

Yes, they are.

I compiled a full test release yesterday and a smaller debug build today.
Both show that -O2 are used in release builds (confirmed in config.log and config.mk) and -O0 in debug builds.

configure, lines 2583-2590 handle that
if test "$_debug_build" = no; then
_optimization_level=-O2
else
_optimization_level=-O0
append_var CXXFLAGS "-fno-omit-frame-pointer"
append_var CXXFLAGS "-fno-strict-aliasing"
define_in_config_if_yes "$_debug_build" 'DEBUG_BUILD'
fi

If it makes more sense i can also use this to add -flto (but i'd have to add another -shared flag aswell, otherwise it wouldn't get picked up - also already tested)

Best place to add it there would be underneath
configure, line 2590
append_var LDFLAGS "-flto"
append_var LDFLAGS "-shared"

Sorry, i don't know how to inline code samples here

@lephilousophe
Copy link
Member

lephilousophe commented Aug 27, 2022

What about something like that instead:

diff --git a/configure b/configure
index 6dae89fba92..a450687764d 100755
--- a/configure
+++ b/configure
@@ -2579,10 +2579,8 @@ case $_host_os in
                _port_mk="backends/platform/sdl/amigaos/amigaos.mk"
                add_line_to_config_mk 'AMIGAOS = 1'
                append_var CXXFLAGS "-mlongcall"
-               # Enable optimizations for non-debug builds
-               if test "$_debug_build" = no; then
-                       _optimization_level=-O2
-               else
+               # Disable optimizations for debug builds
+               if test "$_debug_build" = yes; then
                        _optimization_level=-O0
                        append_var CXXFLAGS "-fno-omit-frame-pointer"
                        append_var CXXFLAGS "-fno-strict-aliasing"
@@ -6200,6 +6198,14 @@ case $_host_os in
                        append_var LDFLAGS "-Wl,--no-gc-sections"
                fi
                ;;
+       amigaos*)
+               # In release mode use LTO to improve performance
+               if test "$_release_build" = yes; then
+                       append_var CXXFLAGS "-flto"
+                       append_var LDFLAGS "-flto=jobserver"
+                       append_var PLUGIN_LDFLAGS "-flto=jobserver"
+               fi
+               ;;
        android)
                # Force treating unqualified char variables as signed by default.
                # NDK compiler for ARM will treat them as unsigned otherwise, which creates bugs in the code.

This way, without --debug-build, -O2 is used (it's set in _default_optimization_level).
In addition -flto will be used in release build.

@raziel-
Copy link
Contributor Author

raziel- commented Aug 27, 2022

_default_optimization_level doesn't seem to get pciked up.

With these changes i end up with -O0 in config.mk

@raziel-
Copy link
Contributor Author

raziel- commented Aug 27, 2022

Nevermind my last comment, residues interfered...

compiling and testing a two engine build now...

@raziel-
Copy link
Contributor Author

raziel- commented Aug 27, 2022

@lephilousophe

I tried every possible combination but only the below change to your patch works.
Otherwise i either get a linker error about missing the "-shared" flag when using "-f" flags (fixed by adding it to PLUGIN_LDFLAGS) or a compiled binary that refuses to start (fixed by removing the LDFLAGS additions).
I also tried removing the GCCFLAGS addition, tht works aswell, but i leave it be, won't hurt i guess(?)

diff --git a/configure b/configure
index 6dae89fba92..a450687764d 100755
--- a/configure
+++ b/configure
@@ -2579,10 +2579,8 @@ case $_host_os in
_port_mk="backends/platform/sdl/amigaos/amigaos.mk"
add_line_to_config_mk 'AMIGAOS = 1'
append_var CXXFLAGS "-mlongcall"

  •           # Enable optimizations for non-debug builds
    
  •           if test "$_debug_build" = no; then
    
  •                   _optimization_level=-O2
    
  •           else
    
  •           # Disable optimizations for debug builds
    
  •           if test "$_debug_build" = yes; then
                      _optimization_level=-O0
                      append_var CXXFLAGS "-fno-omit-frame-pointer"
                      append_var CXXFLAGS "-fno-strict-aliasing"
    

@@ -6200,6 +6198,14 @@ case $_host_os in
append_var LDFLAGS "-Wl,--no-gc-sections"
fi
;;

  •   amigaos*)
    
  •           # In release mode use LTO to improve performance
    
  •           if test "$_release_build" = yes; then
    
  •                   append_var CXXFLAGS "-flto"
    
  •                   append_var PLUGIN_LDFLAGS "-flto=jobserver -shared"
    
  •           fi
    
  •           ;;
      android)
              # Force treating unqualified char variables as signed by default.
              # NDK compiler for ARM will treat them as unsigned otherwise, which creates bugs in the code.
    

and the hiccups in Grim are also still gone, so that change i could live with.

I'm still doing some tests with -O3 (which gave me strange crashes and random freezes a few years back) instead of -O2, but that will be for another PR.

Also testing dead code elimination with "-ffunction-sections -fdata-sections", but again for a future PR.

@lephilousophe
Copy link
Member

lephilousophe commented Aug 27, 2022

_default_optimization_level doesn't seem to get pciked up.

With these changes i end up with -O0 in config.mk

Yes indeed, I missed that _optimizations must be set to true.

@lephilousophe
Copy link
Member

lephilousophe commented Aug 27, 2022

                        append_var PLUGIN_LDFLAGS "-flto=jobserver -shared"

I don't understand why you need to add -shared there, it should be already here.
In config.mk you should have two lines:

PLUGIN_LDFLAGS = -athread=native -flto=jobserver

and some lines below

PLUGIN_LDFLAGS  += -shared

@raziel-
Copy link
Contributor Author

raziel- commented Aug 27, 2022

Yes, i do.
Maybe this isn't needed after all?
Too many options to test :-)
Will check tomorrow, thank you for the feedback

@raziel-
Copy link
Contributor Author

raziel- commented Aug 28, 2022

Yes, you were right, works fine with "-shared" dropped after "-flto=jobserver"

@sev-
Copy link
Member

sev- commented Aug 29, 2022

So, what is the status then?

@raziel-
Copy link
Contributor Author

raziel- commented Aug 29, 2022

@sev-

Sorry, back at work I hope I didn't break anything, doing this from a smartphone.

Like it is now it will work for me, if @lephilousophe doesn't object?

Or we could wait for the end of the following week (I'm doing a test phase with a build that has lto, -O3 and dead code elimination in place with some fellow AmigaOS users).
Test phase will end middle of next week and I could add the changes to this PR before merging, so it could be reviewed...

...or I can create a new PR...what would be preferred?

Thank you

@lephilousophe
Copy link
Member

lephilousophe commented Aug 30, 2022

I propose we wait for the end of your tests.
You can still add commits to this PR and we will squash it at the end.
The timing is perfect for me because I expect limited availability before the end of next week.

@raziel-
Copy link
Contributor Author

raziel- commented Aug 30, 2022

Will do it like this then...thank you very much for reviewing

@raziel- raziel- changed the title AMIGAOS: Use LTO by default AMIGAOS: Use LTO, -O3 and dead code elimination by default Sep 3, 2022
@raziel-
Copy link
Contributor Author

raziel- commented Sep 3, 2022

That's what works for me locally.

A few tests have been made, none were unsuccessful.

@raziel-
Copy link
Contributor Author

raziel- commented Sep 5, 2022

@lephilousophe

Any remarks, objections or slicker solution?
Thank you

@lephilousophe
Copy link
Member

lephilousophe commented Sep 6, 2022

I can't review right now, will do at the end of week.

@raziel-
Copy link
Contributor Author

raziel- commented Sep 6, 2022

Oh sorry, I misread, I thought you have time upto the end of this week.

No problem...

@raziel-
Copy link
Contributor Author

raziel- commented Sep 10, 2022

I have a question regarding
PLUGIN_LDFLAGS

Are they connected to the plugins (shared engines) and only get used when compiling plugins, or is it just a coincidence (that they are called PLUGIN) and "plugin" in that context means that they are added as plug-in to the systems default LDFLAGS (but still getting used for the main binary as well)?

It's a little unclear to me.

Thank you

@lephilousophe
Copy link
Member

lephilousophe commented Sep 10, 2022

PLUGIN_LDFLAGS is used when linking the plugins objects files to make the shared object.
It is used in place of the LDFLAGS value in this specific case

@lephilousophe
Copy link
Member

lephilousophe commented Sep 10, 2022

Changes look good to me. I will squash.

@lephilousophe lephilousophe merged commit 63b709b into scummvm:master Sep 10, 2022
8 checks passed
@raziel-
Copy link
Contributor Author

raziel- commented Sep 10, 2022

@lephilousophe

Thank you

@raziel- raziel- deleted the patch-4 branch Sep 10, 2022
einstein95 pushed a commit to einstein95/scummvm that referenced this pull request Sep 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants