Skip to content
This repository was archived by the owner on Oct 12, 2022. It is now read-only.

Fix issue 7995: D runtime initialization from C fails on OSX in 2.059, worked in 2.058. #228

Closed
wants to merge 4 commits into from

Conversation

jacob-carlborg
Copy link
Contributor

Second try.

I also refactored the startup code to remove duplicated code.
This will hopefully reduce this kind of bugs in the future.

I had to remove the call to "runModuleUnitTests" in "rt_init". Meaning the unit tests won't be run if the runtime is just initialized via "rt_init". This basically only happens if you start the D runtime from a C application. The unit tests are still run, just as before, for all D applications with a D main function. Don't know if this is a problem or not.

@@ -265,6 +295,11 @@ alias void delegate(Throwable) ExceptionHandler;

extern (C) bool rt_init(ExceptionHandler dg = null)
{
static __gshared bool result;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static or __gshared? They're mutually exclusive, DMD just doesn't enforce that. What did you intend?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intended __gshared.

fldcw fpucw;
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use only one function name if you
dispatch the version on the outside.

version (OSX)
  void initPlatform() { /*...*/ }
else version (FreeBSD)
  void initPlatform() { /*...*/ }
else
  void initPlatform() { /* NOOP */ }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, good idea.

@MartinNowak
Copy link
Member

What are all these double initialization guards about?
None of them prevents orders concurrent calls to rt_init but
they break the rt_init/rt_term symmetry.

@jacob-carlborg
Copy link
Contributor Author

I thought I would be good idea to prevent the runtime to be initialize/terminated more than once.

@MartinNowak
Copy link
Member

It's not a bad idea, but let's do it in another pull in a thread-safe manner.

@jacob-carlborg
Copy link
Contributor Author

Then you think I should just close this pull request. Open a new one that just fixes issue 7995. After that open an additional pull request that removes the duplicated code, like I've done in this pull request, but in a thread safe manner?

@jacob-carlborg
Copy link
Contributor Author

A couple of follow up questions:

  1. Is this only a problem when initializing/terminating the runtime from C
  2. Is this any less safe than what we have now
  3. Would it be to ask too much that the user would need to synchronize the initialization/termination of the runtime from C

@jacob-carlborg
Copy link
Contributor Author

I pushed two more commits just to make the pull request complete.

@@ -296,26 +342,34 @@ void _d_criticalTerm()
}
}

extern (C) bool rt_term(ExceptionHandler dg = null)
extern (C) bool rt_term(ExceptionHandler dg = null, bool rethrow = false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the signature would break all kinds of code.
You could do the same tricks as suggested for rt_init.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking it wouldn't break anything since it's a default argument. It should be source compatible, but it's probably not binary compatible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that rt_init and rt_term have been used outside of druntime
for quite a while so we have no way of updating that code.

It should be source compatible, but it's probably not binary compatible.

Yup, adding another parameter will change the argument registers.

By the way dylib_fixes.c is already broken because it declares char rt_init(), i.e. dg is passed uninitialized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't the default argument kick in for rt_initwhen called from dylib_fixes.c?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, default args are not reflected in the ABI.

@MartinNowak
Copy link
Member

A couple of follow up questions:

  1. Is this only a problem when initializing/terminating the runtime from C

It's only an issue with a shared runtime library.

  1. Is this any less safe than what we have now

We don't yet know exactly how initialization for shared libraries will work. For example dylib_fixes is one approach that would handle this issue automatically. Deferring this means less unused code now and more options later.

  1. Would it be to ask too much that the user would need to synchronize the initialization/termination of the runtime from C

Surprisingly yes. Consider libcurl and libfreetype would both use D under the hood and the shared runtime required explicit initialization.
While your network thread calls curl_global_init->rt_init your render thread might call FT_Init_FreeType->rt_init and now you have a hidden race condition.

@jacob-carlborg
Copy link
Contributor Author

Is there a point for this pull request staying open?

@MartinNowak
Copy link
Member

First, thanks for your effort, I already wanted to do this for quite some time.

Is there a point for this pull request staying open?

Not sure if I understand you correctly.
If you're asking about reasons why this isn't yet merged.

  • the double initialization guards shouldn't be mixed with a
    refactoring and need more thought w.r.t. shared libraries
  • the ABI breakage (rt_init/rt_term) needs to be fixed

If you meant to replace this pull request you could close it
or just reset your branch.

@jacob-carlborg
Copy link
Contributor Author

I'm asking if I should close this pull request and make a new one that only fixes issue 7995 and doesn't do any refactoring.

Then initialization guards and the refactoring needs to happen at the same time, at least the way I've done it.

I can fix the rt_init/rt_term ABI breakage if you think it's ok to wait with the synchronization of the initialization guards and merge this anyway.

@MartinNowak
Copy link
Member

Looks like this pull request doesn't actually contain the a fix for #7995.
To be able to register the _dyld_register_func... callbacks in rt_init we need to figure out how to unregister them in rt_term (d as shared library).

I started a branch which contains only the refactorings. The fix for #7995 could then be based on it.

@jacob-carlborg
Copy link
Contributor Author

This pull request should contain a fix. The problem is that _d_osx_image_init2 is only called from main. It needs to be called from rt_init as well. The call sequences should look like this.

main -> _d_criticalInit -> initPlatform ->_d_osx_image_init2
And
rt_init -> _d_criticalInit -> initPlatform -> _d_osx_image_init2

@MartinNowak
Copy link
Member

OK, I made some tests and it indeed crashes.

http://www.opensource.apple.com/source/dyld/dyld-195.6/src/dyld.cpp
The OSX folks simply stuff the callbacks into a static vector and without a possibility to remove them.
That means once the library that installed the callbacks gets unloaded you'll crash/corrupt on the next dlopen/dlclose. This basically makes D unusable in a plugin environment.
One solution would be to copy the callbacks to executable heap memory and simply leak but this
has issues with security and privileges.

@jacob-carlborg
Copy link
Contributor Author

Do you have an example to show? The creator of the issue confirmed it was fixed with my pull request.

http://d.puremagic.com/issues/show_bug.cgi?id=7995#c5

@jacob-carlborg
Copy link
Contributor Author

Doesn't the if-statement prevent the callback from being called?

@thelastmammoth
Copy link

Yes, your initial pull request solved both Issue 7995 and Issue 8133, so it was very useful in itself (a real blocker otherwise) even if it is not a perfect solution; I posted the code and compilation steps so you can double check before and after. I would also like to see which tests make it crash?

@MartinNowak
Copy link
Member

Ah I've overlooked it rt_init->_d_criticalInit->initPlatform->_d_osx_image_init2.

I would also like to see which tests make it crash?

We install callbacks for dyld events but there is no mechanism to uninstall them.
Once you unload a D library from your process you will crash or corrupt.

@thelastmammoth
Copy link

Would it be possible to use bundles instead of dylibs for libraries that are intended to be unloaded?

I couldn't find a definitive answer on the topic but it seems unloading dylibs doesn't do anything since OSX 10.5.7. See also:
http://stackoverflow.com/questions/2339679/what-are-the-differences-between-so-and-dylib-on-osx

Here's for example how to unload a bundle:
https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSBundle_Class/Reference/Reference.html#//apple_ref/occ/instm/NSBundle/unload

That means once the library that installed the callbacks gets unloaded you'll crash/corrupt on the next dlopen/dlclose. This basically makes D unusable in a plugin environment.

We install callbacks for dyld events but there is no mechanism to uninstall them.

@MartinNowak
Copy link
Member

but it seems unloading dylibs

You can simply dlclose what you dlopen'ed. This will unload the shared library from the current process, thus the callbacks become stale pointers.

@jacob-carlborg
Copy link
Contributor Author

We install callbacks for dyld events but there is no mechanism to uninstall them.
Once you unload a D library from your process you will crash or corrupt.

I did some tests my self and I noticed this as well. This is some lousy implementation from Apple :(
I did some more digging and found a solution that works:

http://pastebin.com/qZfiWRbN

The only difference compared to the implementation we have now is that the callback called when unloading a dylib is not called for the dylib that installed it. In the example above that would mean that onUnloadis not called when foo.dylibis unloaded. Note that the callback is still called when bar.dylibis unloaded

@jacob-carlborg
Copy link
Contributor Author

Oh, I forgot one thing. The problem with the implementation above is that dyld_register_image_state_change_handler is a private undocumented function in libdyld.dylib.

@jacob-carlborg
Copy link
Contributor Author

I posted about this issue in the d-runtime mailing list.

@jacob-carlborg jacob-carlborg deleted the 7995 branch May 29, 2016 14:49
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants