Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework PathogenExtensions.cpp into a separate library. #2

Closed
PathogenDavid opened this issue Aug 5, 2020 · 8 comments
Closed

Rework PathogenExtensions.cpp into a separate library. #2

PathogenDavid opened this issue Aug 5, 2020 · 8 comments

Comments

@PathogenDavid
Copy link
Member

Right now the Pathogen Extensions are built directly into libclang. This was done initially to make hacking on libclang easier, but I've since found that it's semi-necessary because we're using methods internal to libclang (such as cxcursor::getCursorDecl, which gets the Decl hidden inside a CXCursor.)

This is not ideal because it means libclang-pathogen is huge since it contains all of libclang. It also means consumers of ClangSharp.Pathogen are shipping two copies of libclang.

We could eliminate the extra copy of libclang by blocking the libclang.runtime. package added by ClangSharp and substituting it for our own. Also I'd have to experiment to see if this is possible to do gracefully from our own NuGet package. In theory this should be fine since the libclang.runtime. packages are only updated on each LLVM release, and we wouldn't expect mixing and matching ClangSharp.Pathogen with different LLVM releases to work anyway.

libClangSharp works around this issue by copy+pasting the required functionality out of libclang: https://github.com/microsoft/ClangSharp/tree/2712e6b3ac30bb914ef99fdf66d7c38256a997ad/sources/libClangSharp

PathogenDavid added a commit that referenced this issue Oct 19, 2020
… computed correctly.

Turns out the root cause of the issues I was observing  was the fact that in many places within LLVM uses the address of the fltSemantics instance to figure its identity. There's only a handful of fltSemantics instances and they're all statically allocated. This means that when there's multiple DLLs there's a set of fltSemantics instances for each DLL. We have multiple DLLs (the one that comes with ClangSharp and the one that holds PathogenExtensions) and we process data allocated in one with another, whoops. For whatever reason in certain scenarios the fltSemantics from the ClangSharp libclang were appearing in the APValue instances in libclang-pathogen.

This problem will be properly fixed by #2
For now I am manually overriding the native library resolver to only resolve everything to libclang-pathogen.dll
@PathogenDavid
Copy link
Member Author

This needs to happen sooner rather than later. I just spent all morning debugging weird behaviors in pathogen_ComputeConstantValue that were ultimately caused by the fact that there's two libclang DLLs in the mix.

For now I've patched Biohazrd to override the DLL resolver in ClangSharp to load libclang-pathogen in place of its libclang.

@PathogenDavid
Copy link
Member Author

Looking into the issue a bit further, I'm actually not confident that refactoring things into a separate library will actually solve the issue permanently. Instead I think the proper solution is that we need to rework libClangSharp to be a static library and rework PathogenExtensions into its own library which statically links to libclang and the other Clang static libraries so everything goes into one monolithic blob that has everything with no duplicates.

@PathogenDavid
Copy link
Member Author

For the sake of posterity, I wanted to document the original issue here. CC @tannergooding: I don't have any reason to believe this is an issue in libClangSharp today, but it might be something you want to keep in mind since it's not a fun landmine to step on. (TL;DR: Linking to Clang libraries statically in one DLL and sharing data libclang from another DLL can be problematic.)


I added functionality to ClangSharp.Pathogen that surfaces functionality provided by Expr::EvaluateAsRValue.

This result of this API is getting a union-like type APValue representing the evaluated constant. In the case of floating point numbers you end up with a llvm::APFloat.

APFloat is designed so that it can have multiple internal representations (IE: half, float, X87 doubles, etc), these semantics are described by a handful of statically allocated fltSemantics instances.

Some APFloat supporting methods directly compare pointers to fltSemantics to determine the type of float that's being dealt with. bitcastToAPInt once such example.

When you have multiple DLLs containing the APFloat implementation, they will have their own independent sets of fltSemantics. So as you can imagine it would be problematic if an APFloat instantiated in one DLL were used in another.

Somewhere deep within Clang (didn't try to narrow things down) these APFloats are instantiated ahead of time or by a virtual method in certain situations. (The situation I ran into was all non-finite floats and floats which are implicitly casted.) When this happens many APFloat APIs misbehave in very strange and unpredictable ways. (For example, going back to the bitcastToAPInt example, all floats will be treated as X87 double extended-type floats when they're just normal IEEE singles/doubles.)

Needless to say this was endlessly confusing and extremely problematic. (And was definitely not how I wanted to spend my morning yesterday.)

PathogenDavid added a commit to MochiLibraries/Biohazrd that referenced this issue Oct 20, 2020
@tannergooding
Copy link

For reference, the number one issue I've hit is that both the consumed copy of libClang and the separate library (such as libClangSharp) need to match exactly.

LLVM, in general, makes breaking changes almost every release and does not even have consistent type layout between the 4 default build configurations (debug, release, minsizerel, relwithdebinfo).
This means that you need to match both on version (possibly commit) and build configuration for things to be expected to work smoothly.

In an ideal setup, libClang would take some changes to make it inherently more extensible, but the various issues I've logged haven't gotten any responses yet.

@PathogenDavid
Copy link
Member Author

For reference, the number one issue I've hit is that both the consumed copy of libClang and the separate library (such as libClangSharp) need to match exactly.

I can definitely believe that. It seems to be a fickle library. The debug releases of LLVM don't even build properly on my machine. (Which makes debugging issues like this all the more annoying.)

Besides this issue I've gotten lucky mixing my MSVC-built release library with yours so far, but I definitely want to move away from hopes and luck. (Especially since this issue wasn't caused by that exactly.)

LLVM, in general, makes breaking changes almost every release

I've been wondering how real that threat of "the internal API is unstable" was. Guess I'll have that to look forward to.

the various issues I've logged haven't gotten any responses yet.

That's unfortunate to hear. The Clang API has so many powerful features that aren't accessible through libclang.

PathogenDavid added a commit to MochiLibraries/Biohazrd that referenced this issue Oct 20, 2020
@tannergooding
Copy link

tannergooding commented Oct 20, 2020

I've been wondering how real that threat of "the internal API is unstable" was. Guess I'll have that to look forward to.

This wasn't even referring to the internal API, which I would naturally assume is prone to breaks. They break public surface area from release to release.

The most common example is inserting new enum members between existing definitions, which shifts the constant value emitted for something like StmtClass_RequiresExpr. So if you are using _RequiresExpr from clang 10 against clang 11, you will actually match against _RecoveryExpr (edit: actually this is even off because there were other enum members inserted as well in 11, so this is shifted even more).

This can also happen with methods. For example LLVMDIBuilderCreateCompileUnit had 4 new trailing parameters added in LLVM 11: const char *SysRoot, size_t SysRootLen, const char *SDK, size_t SDKLen.

I've also seen it happen with "value types" (types passed around by value, rather than by pointer or reference) in a past release which impacts compatibility for code compiled against one and consumed by another as field indexing ends up being different (something which C# doesn't normally have a problem with given it is jitted).

@PathogenDavid
Copy link
Member Author

PathogenDavid commented Oct 20, 2020

This wasn't even referring to the internal API, which I would naturally assume is prone to breaks. They break public surface area from release to release.

"internal API" wasn't the right term here. I was referring to everything that isn't libclang.

The most common example is inserting new enum members between existing definitions

Glad to know my hyper-paranoid enum asserts aren't for nothing.

Thanks for the examples, it's helpful to know what sorts of issues to prepare for.

PathogenDavid added a commit to MochiLibraries/Biohazrd that referenced this issue Feb 12, 2021
ClangSharp.Pathogen's native runtime library (libclang-pathogen) now includes libClangSharp, which both makes it easier to build all of the native Clang-related components in one go and resolves any potential issues similar to MochiLibraries/llvm-project#2 (comment) by unifying all potential instances of Clang into a single DLL.

Part of this involved formalizing the __HACK__InstallLibClangDllWorkaround method in TranslatedLibraryBuilder as LibClangSharpResolver. This new interface also adds the ability to easily override the native runtime library with a single method call (LibClangSharpResolver.OverrideNativeRuntime), which makes iterating on libclang-pathogen much easier. (Previously I did something very similar to this, but it was always quirky and annoying to get right in all situations.)
@PathogenDavid
Copy link
Member Author

Closing this out as this ship has very sailed with the inline issues mentioned above and we've been shipping a single monolithic libclang+libClangSharp+Pathogen runtime for quite a while.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants