Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

string_id::NULL_ID isn't always initialized in time #17197

Closed
Coolthulhu opened this issue Jun 15, 2016 · 19 comments
Closed

string_id::NULL_ID isn't always initialized in time #17197

Coolthulhu opened this issue Jun 15, 2016 · 19 comments
Labels
<Crash / Freeze> Fatal bug that results in hangs or crashes.

Comments

@Coolthulhu
Copy link
Contributor

Causes #17177, currently hacked around with #17179 but it's still a big risk of future crashes and so should be fixed by actually addressing the problem.

@Coolthulhu Coolthulhu added the <Crash / Freeze> Fatal bug that results in hangs or crashes. label Jun 15, 2016
@Coolthulhu Coolthulhu added this to the 0.D milestone Jun 15, 2016
@BevapDin
Copy link
Contributor

I say the item factory (which causes the linked bug) should be initialized from the game class (maybe from its constructor). Currently it's statically created, the same as the NULL_ID and that causes the problem.

Alternatively it could be created as static object inside a static wrapper function (similar to DynamicDataLoader::get_instance).

@kevingranade
Copy link
Member

kevingranade commented Jun 16, 2016 via email

@Coolthulhu
Copy link
Contributor Author

Would it be possible to have a single "has ID" class with an "init null ID" method, invoke the method in init.cpp and statically warn if a class uses a null ID but doesn't have said class?

Alternatively, ensure that using null ID causes a crash if it isn't explicitly invoked somewhere.

@BevapDin
Copy link
Contributor

Why not change NULL_ID to a static method:

static const string_id<Foo> &string_id<Foo>::NULL_ID() {
    static const string_id<Foo> instance( "..." );
    return instance;
}

That has no external dependencies and is initialized as soon as it's needed.

@Coolthulhu
Copy link
Contributor Author

What would happen if it was invoked circularly? Crash or unspecified behavior?

@BevapDin
Copy link
Contributor

It's simply not invoked circular. The constructor (that gets invoked when instance is created) should not invoke any other functions (besides copy of std::string).

If we have a circular dependency, it needs to be changed anyway. Function static objects are the best solution because C++ guarantees their constructor is invoked at most once (it's even thread-safe).

@kevingranade
Copy link
Member

kevingranade commented Jan 12, 2017 via email

@Coolthulhu
Copy link
Contributor Author

When did local reference return cause crashes? It should cause a compiler warning rather than undefined behavior later on.

NULL_ID not being initialized in time is not related to references, but rather static initialization. At the moment NULL_IDs are initialized in static context, which means the order varies between OSes and compilers (Windows+mingw doesn't sort alphabetically, for example).

@kevingranade
Copy link
Member

kevingranade commented Jan 12, 2017 via email

@Coolthulhu
Copy link
Contributor Author

Yes, but the compiler should warn about it. I'm sure it does in the most common case.
The only "local" references I've seen returned were ones to statics, declared before returning them. And this "singleton" kind of design is generally safe and in most cases avoids producing extra objects just for tiny comparisons such as type of an item.

It is only risky in the case of objects that change a lot between worlds. But I don't recall this happening in the core game (someone would need to redefine "null" object or something like that) and those objects also benefit the most from not being rebuilt every time.

@kevingranade
Copy link
Member

this "singleton" kind of design is generally safe and in most cases avoids producing extra objects just for tiny comparisons such as type of an item.

In theory yes, in practice this isn't the first time this specific idiom (returning a const reference) has broken things, and I guarantee it won't be the last. My question is, do we have evidence that the performance benefit of doing it this way is worth the fragility that comes with it? People seem to find it easy to internalize the "return objects by const reference is good" mantra, but when one of these methods needs to return a null object, they frequently do not know or think to use the safe idiom.

I'm proposing that we can simply make the method return a copy instead, which has none of this fragility. The only down side to this approach is a potential performance penalty, which AFAIK is totally unquantified, which is why I said:

Do we have a clear idea of the benefit of this idiom, because the cost so far is crash bugs scattered around in the code?

@Coolthulhu
Copy link
Contributor Author

In theory yes, in practice this isn't the first time this specific idiom (returning a const reference) has broken things

When did it break things? The only fragile part of it I can think of here is item factory, which changes between save+load and sometimes during single game. But then, it HAS to return references/pointers in many cases, because items need pointers to types.

My question is, do we have evidence that the performance benefit of doing it this way is worth the fragility that comes with it?

I don't see that fragility. I recall multiple crashes related to pointer/reference getting scrambled, but I can't think of a single case when fixing it by returning a value was a real option.
It causes some minor problems with design, but then removes other problems by allowing many functions to count as cheap performance-wise.

The crash related to NULL_ID would still happen just fine with by-value returns since it depends on constructors being invoked in unspecified order, which by-value return doesn't affect.

I'm proposing that we can simply make the method return a copy instead, which has none of this fragility.

You mean the null id function or everything in general?
In case of null id, there is no risk of any fragility if it is a static. It doesn't depend on anything and doesn't change between calls. The only thing that would change would be wrapping it in a function and initializing it on demand rather than in static order.

It's the static order that causes the crash here, because item_factory is initialized in static context, as opposed to say, through a getter function.

As for performance:
NULL_ID is compared quite a lot in NPC-related code (scanning for harvestable plants, grading items by their weapon value).
It may be used a lot in vehicle code or may not be used much - I can't really tell because it's behind an operator bool() and it's hard to search for uses of it.
It could be changed into ugly two-pass code to avoid it (first check if it isn't null, then check what it actually is) like I did it with map::i_at and map::has_items to speed up fire calculations, but this approach is ugly IMO.

@mugling
Copy link
Contributor

mugling commented Jan 15, 2017

I can't really tell because it's behind an operator bool() and it's hard to search for uses of it.

Remove the definition but leave the declaration for that operator. You should get a series of linker errors for each usage. You may need to adjust your compiler flags to suppress the '...and 43 other occurrences' contraction.

@kevingranade kevingranade added this to Blockers in 0.D Release Mar 2, 2017
@Coolthulhu
Copy link
Contributor Author

This is purely a code quality issue.
It shouldn't really be a blocker for 0.D - the hacks we have now are safe, just ugly, but fixing it could actually be less safe.

@Coolthulhu
Copy link
Contributor Author

Can I take it off from blockers?
It can't cause crashes in its current form, it only makes the code uglier.

@kevingranade kevingranade removed this from the 0.D milestone Mar 21, 2017
@Coolthulhu Coolthulhu moved this from Blockers to Open Issues in 0.D Release Mar 25, 2017
@feinorgh
Copy link
Contributor

feinorgh commented Apr 7, 2017

I don't know if this is wholly related, but clang-3.8 and above warns about undefined instantiations of ::NULL_ID variants in various templated classes.

Example:

src/string_id.h:67:50: warning: instantiation of variable 'string_id<furn_t>::NULL_ID' required here, but no definition is available [-Wundefined-var-template]
        string_id( const null_id_type & ) : _id( NULL_ID._id ), _cid( NULL_ID._cid ) {}
                                                 ^
src/mapdata.h:71:68: note: in instantiation of member function 'string_id<furn_t>::string_id' requested here
                      drop_group("EMPTY_GROUP"), ter_set(NULL_ID), furn_set(NULL_ID) {};
                                                                   ^
src/string_id.h:155:35: note: forward declaration of template entity is here
        static const string_id<T> NULL_ID;
                                  ^
src/string_id.h:67:50: note: add an explicit instantiation declaration to suppress this warning if 'string_id<furn_t>::NULL_ID' is explicitly instantiated in
      another translation unit
        string_id( const null_id_type & ) : _id( NULL_ID._id ), _cid( NULL_ID._cid ) {}

I don't know if adding explicit instatiation declarations where they are needed will help the problem, but at least this current design is causing build failures with the current setup with RELEASE=1 on clang-3.8+.

@Leland
Copy link
Contributor

Leland commented May 17, 2017

@kevingranade with #20946 merged, is this closed?

@kevingranade
Copy link
Member

Good catch, the new template code does force us to in it everything up front, so this should not recur.

@kevingranade
Copy link
Member

I was mistaken, we're still vulnerable to this:
https://travis-ci.org/CleverRaven/Cataclysm-DDA/builds/233488125

@kevingranade kevingranade reopened this May 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
<Crash / Freeze> Fatal bug that results in hangs or crashes.
Projects
None yet
Development

No branches or pull requests

6 participants