Skip to content
This repository has been archived by the owner on May 9, 2022. It is now read-only.

Fix instability of White Star when built with optimizations using newer versions of clang #1

Closed
dbsoft opened this issue Feb 21, 2022 · 6 comments
Assignees

Comments

@dbsoft
Copy link
Owner

dbsoft commented Feb 21, 2022

I have been looking into this issue on the path to supporting updated builds for Apple Silicon and just a more modern build system for supporting newer versions of MacOS more completely.

The crash is happening in some RootingAPI.h code that was removed in the following bugzilla issue and commit:

https://bugzilla.mozilla.org/show_bug.cgi?id=1325050
https://hg.mozilla.org/mozilla-central/rev/d2758f635f72f779f712bf9c6e838868ed53c9f7

This is just a massive commit and hits large portions of the code base, but I think it is important to the long term viability of the browser and builds. I will be committing this change in the newclang branch that I have already been working on.

https://github.com/dbsoft/UXP/tree/newclang

If this fixes the issue, I will look at merging this into the master branch.

@dbsoft dbsoft self-assigned this Feb 21, 2022
@dbsoft
Copy link
Owner Author

dbsoft commented Feb 24, 2022

Using this comment as a list of ways my patch diverges from the Mozilla one:

js/src/builtin/DataViewObject.cpp: File does not exist in White Star/Pale Moon, patched js/src/vm/TypedArrayObject.cpp
Since the following patch was not integrated into Pale Moon.
https://hg.mozilla.org/integration/mozilla-inbound/rev/d2ce4dbbc26d Move DataViewObject to its own file.

js/src/gc/Allocator.cpp: Removed an unexpected "AutoKeepAtoms keepAtoms(cx->perThreadData);"
Will need to verify this was not needed by some other change.

js/src/gc/AtomMarking.cpp: File does not exist in White Star/Pale Moon.
Since the following patch was not integrated into Pale Moon.
https://hg.mozilla.org/mozilla-central/rev/7311c06a7271
https://bugzilla.mozilla.org/show_bug.cgi?id=1324002

js/src/gc/GCInternals.h: AutoStopVerifyingBarriers class not in our version.
Haven't been able to find where this code was introduced. Just leaving it out for now.

js/src/gc/GCRuntime.h: JS_GC_ZEAL changes haven't been merged, so minor differences.

js/src/gc/Nursery.cpp: Minor changes due ot missing JS_GC_ZEAL changes.
printProfileHeader() differences.

js/src/gc/Statistics.cpp: printProfileHeader() differences.

js/src/gc/Verifier.cpp: None of the code requiring changes is present, not sure where it was added.
Probably the zeal mode that we don't have or keepAtoms.

js/src/jit/BaselineCacheIR.[cpp|h]: Was renamed BaselineCacheIRCompiler.[cpp|h]
https://hg.mozilla.org/mozilla-central/rev/308412053019c9379eb899378ef575e903d64cbe
Most required code changes missing.

js/src/jit/CacheIR.cpp: Code to change is missing.

js/src/jit/SharedIC.[cpp|h]: BaselineEmitPostWriteBarrierSlot missing.

js/src/jsapi.cpp: JS::GetModuleResolveHook and JS::SetModuleResolveHook different.

js/src/jsgc.cpp: Missing zeal and atom marking.

js/src/jsgc.h: GCParallelTask differences.

js/src/jscript.cpp: Most of the changes not related to ExclusiveContext removal, already present.

js/src/jswatchpoint.cpp: Missing?

js/src/vm/Debugger.cpp: Code to "Iterate through all wasm instances to find ones that need to be updated." missing.
JS_DefineDebuggerObject() code to change missing.

js/src/vm/GeckoProfiler.cpp: Is SPSProfiler.cpp in White Star/Pale Moon.

js/src/vm/HelperThreads.[cpp|h]: EnqueuePendingParseTasksAfterGC() code different.
Also some code de-duplicated in the mozilla version.
ScriptDecodeTask missing.

js/src/vm/Stack.cpp: savedPrevJitTop_ code different.

js/src/vm/Xdr.[cpp|h]: Most code changes already present or code to change is missing.

Needed to make extreme changes to the ModuleResolveHook code.

@dbsoft
Copy link
Owner Author

dbsoft commented Mar 16, 2022

Will need to look into warnings in dom/ips/StructuredCloneData.h in part 2 or a later patch.

I tried to implement fixes, but it caused white star to fail to link.

5:05.96 In file included from /Users/x/objdir-whitestar-current/dist/include/ipc/IPCMessageUtils.h:15:
5:05.96 Warning: -Wdefaulted-function-deleted in /Users/nuke/objdir-whitestar-current/dist/include/mozilla/dom/ipc/StructuredCloneData.h: explicitly defaulted move assignment operator is implicitly deleted
5:05.96 /Users/x/objdir-whitestar-current/dist/include/mozilla/dom/ipc/StructuredCloneData.h:83:3: warning: explicitly defaulted move assignment operator is implicitly deleted [-Wdefaulted-function-deleted]
5:05.96 operator=(StructuredCloneData&& aOther) = default;
5:05.96 ^
5:05.96 /Users/x/objdir-whitestar-current/dist/include/mozilla/dom/ipc/StructuredCloneData.h:61:29: note: move assignment operator of 'StructuredCloneData' is implicitly deleted because base class 'mozilla::dom::StructuredCloneHolder' has a deleted move assignment operator
5:05.96 class StructuredCloneData : public StructuredCloneHolder
5:05.96 ^
5:05.96 /Users/x/objdir-whitestar-current/dist/include/mozilla/dom/StructuredCloneHolder.h:159:3: note: copy assignment operator is implicitly deleted because 'StructuredCloneHolder' has a user-declared move constructor
5:05.96 StructuredCloneHolder(StructuredCloneHolder&& aOther) = default;
5:05.96 ^
5:06.72 2 warnings generated.

dbsoft added a commit that referenced this issue Mar 16, 2022
…ntimes

This is a massive patch hitting huge areas of the javascript engine.  Due to code divergence there are quite a few differences from the Mozilla patch, so there will be at least one more patch related to this, if not several. See UXP Issue #1 for more information.
@dbsoft
Copy link
Owner Author

dbsoft commented Mar 21, 2022

I've now switched to trying to figure out which optimization is causing the issue by enabling the optimizations directly.

According to https://developer.amd.com/wordpress/media/2017/04/AOCC-1.1-Clang%20-%20the%20C%20C++%20Compiler.pdf

The following optimizations are added on at the -O2 level.
-vectorize-loops -vectorize-slp -itodcalls -itodcallsbyclone -inline -mldst-motion -gvn -elim-avail-extern -slpinstcombine -globaldce -constmerge -loop-sink
The following optimizations are dropped at the -O2 level.
-always-inline

https://gist.github.com/lolo32/fd8ce29b218ac2d93a9e

Shows that -fvectorize and -fslp-vectorize options get added at -O2 level where the crash starts occuring.

So I attempted a build with this in the .mozconfig:

ac_add_options --enable-optimize="-O1 -fvectorize -fslp-vectorize"

Which should build at -O1 level with the vectorization options from -O2 enabled. This was stable.

So I am assuming the crash is being caused by one of the remaining optimizations:

-itodcalls -itodcallsbyclone -inline -mldst-motion -gvn -elim-avail-extern -slpinstcombine -globaldce -constmerge -loop-sink

However the rest do not have options to enable individually when calling clang. They only have options for calling LLVM's opt.

https://llvm.org/docs/CommandGuide/opt.html
https://llvm.org/docs/Passes.html

According to the earlier referenced PDF from the AMD site:

-mllvm
Need to provide -mllvm so that the option can pass through the compiler front end and get applied on the optimizer where this optimization is implemented.
For example: -mllvm -enable-strided-vectorization

So I attempted the following to enable additional optimizations from the list:

ac_add_options --enable-optimize="-O1 -fvectorize -fslp-vectorize -mllvm -gvn"

But I get the following error:

0:22.16 DEBUG: configure: error: These compiler flags for C are invalid: -O1 -fvectorize -fslp-vectorize -mllvm -gvn

Is the -mllvm option invalid for Apple's clang? If so how do I enable the optimizations individually using Apple's clang?

@dbsoft
Copy link
Owner Author

dbsoft commented May 2, 2022

I've verified this is absolutely a clang/llvm issue, and not a Mac specific issue. I've just done a clang build on Ubuntu Linux and get the same exact crash.

Will try to use clang on Linux to determine the optimization that is causing the problem, and put a fix in for all platforms. Either by disabling the problematic optimization for clang or work around it in the source.

@dbsoft
Copy link
Owner Author

dbsoft commented May 5, 2022

Fix has been discovered for the instability.

https://repo.palemoon.org/MoonchildProductions/UXP/issues/1891

@dbsoft
Copy link
Owner Author

dbsoft commented May 8, 2022

This fix has been merged into the newly unified Pale Moon UXP.

@dbsoft dbsoft closed this as completed May 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant