Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Remove allocation when comparing type info objects #370

Closed
wants to merge 7 commits into
from

Conversation

Projects
None yet
8 participants
Contributor

Ingrater commented Dec 24, 2012

As request by Andrei Alexandrescu on the newsgroup some time ago, here a pull request for my changes to type info comparison. It has several advantages over the current approach:

-Fixes next() for TypeInfo_Const, TypeInfo_Typedef and TypeInfo_Vector
-Comparing two type info objects will no longer allocate two strings and compare them. This is one step needed to get a non leaking d-runtime when working without a GC.
-Variadic functions can be implemented more easily and efficiently. For a example see: https://github.com/Ingrater/thBase/blob/master/src/thBase/format.d#L257
-It is a lot easier to unqualify type info objects across dll boundaries. For a example see https://github.com/Ingrater/thBase/blob/master/src/thBase/format.d#L13

Ingrater added some commits Dec 24, 2012

-Added type property to all TypeInfo classes.
-Comparing two TypeInfo objects no longer allocates and compares strings

@alexrp alexrp and 3 others commented on an outdated diff Dec 24, 2012

src/object_.d
+ {
+ if(lhs is null && rhs is null) //compare complete
+ return true;
+
+ if(lhs is null || rhs is null) //type info chains have different length
+ return false;
+
+ auto lhsType = lhs.type;
+ auto rhsType = rhs.type;
+ if(lhsType != rhsType)
+ return false;
+
+ switch(lhsType)
+ {
+ case TypeInfo.Type.Struct:
+ {
@alexrp

alexrp Dec 24, 2012

Member

Please dedent one level (and you probably don't need those braces either).

@Ingrater

Ingrater Dec 24, 2012

Contributor

Dedent what exactl? The entire case blocks?

@alexrp

alexrp Dec 24, 2012

Member

Yes. Usually we just leave out the braces and indent the code within by one level.

@jmdavis

jmdavis Dec 24, 2012

Member

Yuck. Really? I'd argue that the cases need to be indented. It's the fact that the brace was further indented was the problem. I'd never not indent the cases and am not aware of anywhere in Phobos which doesn't (though it may be in there somewhere). And there's nothing wrong with using braces inside the case statements. I usually do and think that it's cleaner if you do (though unlike in C, it's not actually needed for scoping). They just need to line up with the case statements.

@alexrp

alexrp Dec 24, 2012

Member

My point is this is right:

switch (foo)
{
    case bar:
        baz();
}

Or this, if you really want:

switch (foo)
{
    case bar:
    {
        baz();
    }
}

But not this:

switch (foo)
{
    case bar:
        {
            baz();
        }
}
@jmdavis

jmdavis Dec 24, 2012

Member

That, I agree with, but that means that he needs to change it again, because he misunderstood you and made the cases line up with the switch.

@alexrp

alexrp Dec 24, 2012

Member

Oh, I see. Yeah, that's not what I meant. :)

@complexmath

complexmath Dec 28, 2012

Member

Weird. I've always seen cases outdented because they're labels. Indenting the makes the flow less clear IMO. I think outdenting is consistent with the classic style layouts as well. But as long as the code is consistently formatted I'm not going to quibble about spacing.

@alexrp alexrp and 3 others commented on an outdated diff Dec 27, 2012

src/object.di
@@ -67,6 +67,51 @@ struct OffsetTypeInfo
class TypeInfo
{
+ enum Type
+ {
+ Info,
@alexrp

alexrp Dec 27, 2012

Member

The convention is lower case for enum members.

For names that conflict with keywords, we append a _.

@Ingrater

Ingrater Dec 29, 2012

Contributor

Do we rellay want this in this case? Almost all of these are keywords and would end up with a underscore at the end. Will look quite ugly when using these in code.

@jmdavis

jmdavis Dec 29, 2012

Member

Yes. That's what we want. That's the coding style. It's what we've done elsewhere when we've needed symbol names which match keywords (e.g. std.traits does it).

@jmdavis

jmdavis Dec 29, 2012

Member

Not to mention, most code won't be using these anyway, so even if you think they're a bit ugly, they'll affect a rather small number of programs. Regardless though, it is the agreed upon style for enums.

@andralex

andralex Dec 29, 2012

Owner

yes, camelCase and append underscores if necessary please

@alexrp alexrp commented on an outdated diff Dec 27, 2012

src/object_.d
@@ -240,6 +238,51 @@ struct OffsetTypeInfo
*/
class TypeInfo
{
+ enum Type
+ {
+ Info,

@alexrp alexrp and 1 other commented on an outdated diff Dec 27, 2012

src/object_.d
+ bool compareTypeInfo(const TypeInfo lhs, const TypeInfo rhs)
+ {
+ if(lhs is null && rhs is null) //compare complete
+ return true;
+
+ if(lhs is null || rhs is null) //type info chains have different length
+ return false;
+
+ auto lhsType = lhs.type;
+ auto rhsType = rhs.type;
+ if(lhsType != rhsType)
+ return false;
+
+ switch(lhsType)
+ {
+ case TypeInfo.Type.Struct:
@alexrp

alexrp Dec 27, 2012

Member

Indent like this:

switch (lhsType)
{
    case TypeInfo.Type.struct_:
        ...
    ...
}
@andralex

andralex Dec 28, 2012

Owner

I find the current indent just fine.

@9rnsr 9rnsr commented on an outdated diff Dec 29, 2012

src/rt/typeinfo/ti_AC.d
@@ -100,4 +100,6 @@ class TypeInfo_AC : TypeInfo_Array
{
return typeid(Object);
}
+
+ @property override Type type() const nothrow pure { return Type.Array; }
@9rnsr

9rnsr Dec 29, 2012

Member

Why this is necessary? You already override TypeInfo.type()in TypeInto_Array. Is this redundant, no?
Also TypeInfo_Ar, TypeInfo_Aq, and others.

@9rnsr 9rnsr and 1 other commented on an outdated diff Dec 29, 2012

src/object_.d
@@ -193,12 +193,10 @@ bool opEquals(TypeInfo lhs, TypeInfo rhs)
// Factor out top level const
// (This still isn't right, should follow same rules as compiler does for type equality.)
- TypeInfo_Const c = cast(TypeInfo_Const) lhs;
- if (c)
- lhs = c.base;
- c = cast(TypeInfo_Const) rhs;
- if (c)
- rhs = c.base;
+ if (lhs.type == TypeInfo.Type.Const)
+ lhs = (cast(TypeInfo_Const)cast(void*)lhs).base;
+ if (lhs.type == TypeInfo.Type.Const)
+ rhs = (cast(TypeInfo_Const)cast(void*)rhs).base;
@9rnsr

9rnsr Dec 29, 2012

Member

Why cast(void*) is necessary? When rhs.type == TypeInfo.Type.Const, cast(TypeInfo_Const)rhs is always non-null. Is there any exception?

@Ingrater

Ingrater Dec 29, 2012

Contributor

Casting it to void* first makes it a unsafe cast, which will not invoke the regular D cast. This is safe because I alerady know it is a TypeInfo_Const and it is faster.

@9rnsr

9rnsr Dec 29, 2012

Member

Maybe, you are intended to avoid dynamic casting (== vtbl lookup overhead)?
if it is true, this is reasonable.

@Ingrater

Ingrater Dec 29, 2012

Contributor

Yes thats what I intended.

@9rnsr

9rnsr Dec 29, 2012

Member

Oh, we commented at the same time. I was convinced.

But it is hard to understand. I want to propose writing a small utility function in object.d:

@property T staticCast(T)(TypeInfo obj) { return cast(T)cast(void*)obj; }

And use:

rhs = base.staticCast!TypeInfo_Const;

@Ingrater

Ingrater Dec 29, 2012

Contributor

I've seen it in many other places inside druntime where it is used exactly like this. So why does there need to be a utitlity function now?

@9rnsr

9rnsr Dec 29, 2012

Member

Because I couldn't immediately understand why it is necessary.
Conversely, is it a language feature which all druntime readers should be known? I couldn't think it is right.
It seems to me that is code smell.

@9rnsr 9rnsr commented on the diff Dec 29, 2012

src/object_.d
override @property uint flags() nothrow pure const { return base.flags; }
override const(void)[] init() nothrow pure const { return base.init(); }
override @property size_t talign() nothrow pure const { return 16; }
+
@9rnsr

9rnsr Dec 29, 2012

Member

Please remove trailing spaces.

@WalterBright

WalterBright Mar 22, 2013

Owner

Running tolf on all source files before checking in will take care of this.

@9rnsr 9rnsr commented on an outdated diff Dec 29, 2012

src/object_.d
override @property uint flags() nothrow pure const { return base.flags; }
override const(void)[] init() nothrow pure const { return base.init(); }
override @property size_t talign() nothrow pure const { return 16; }
+
+ @property override Type type() nothrow pure const @safe { return Type.Vector; }
@9rnsr

9rnsr Dec 29, 2012

Member

Use override @property to fit with others.

@9rnsr 9rnsr commented on an outdated diff Dec 29, 2012

src/object_.d
@@ -281,7 +324,48 @@ class TypeInfo
if (this is o)
return true;
auto ti = cast(const TypeInfo)o;
- return ti && this.toString() == ti.toString();
+ if(ti is null)
@9rnsr

9rnsr Dec 29, 2012

Member

Insert one space after if to fit with others.

@9rnsr 9rnsr commented on an outdated diff Dec 29, 2012

src/object_.d
+ return false;
+
+ bool compareTypeInfo(const TypeInfo lhs, const TypeInfo rhs)
+ {
+ if(lhs is null && rhs is null) //compare complete
+ return true;
+
+ if(lhs is null || rhs is null) //type info chains have different length
+ return false;
+
+ auto lhsType = lhs.type;
+ auto rhsType = rhs.type;
+ if(lhsType != rhsType)
+ return false;
+
+ switch(lhsType)
@9rnsr

9rnsr Dec 29, 2012

Member

Insert one space after switch.

@9rnsr 9rnsr commented on an outdated diff Dec 29, 2012

src/object_.d
+ auto lhsClass = cast(const(TypeInfo_Class))cast(void*)lhs;
+ auto rhsClass = cast(const(TypeInfo_Class))cast(void*)rhs;
+ return lhsClass.name == rhsClass.name;
+ case TypeInfo.Type.Interface:
+ auto lhsInterface = cast(const(TypeInfo_Interface))cast(void*)lhs;
+ auto rhsInterface = cast(const(TypeInfo_Interface))cast(void*)rhs;
+ return lhsInterface.info.name == rhsInterface.info.name;
+ case TypeInfo.Type.Typedef:
+ case TypeInfo.Type.Enum:
+ auto lhsTypedef = cast(const(TypeInfo_Typedef))cast(void*)lhs;
+ auto rhsTypedef = cast(const(TypeInfo_Typedef))cast(void*)rhs;
+ return lhsTypedef.name == rhsTypedef.name;
+ default:
+ }
+
+ return compareTypeInfo(lhs.next(), rhs.next());
@9rnsr

9rnsr Dec 29, 2012

Member

TypeInfo.next is a property function, so () is not necessary.

Member

9rnsr commented Dec 29, 2012

This change breaks Phobos unittests. Could you post a fix-up pull for Phobos?

Contributor

Ingrater commented Dec 29, 2012

I didn't have time to check what exactly breaks within the phobos unittests. But I will investigate and fix this, one way or the other.

Member

alexrp commented Jan 21, 2013

ping?

Contributor

Ingrater commented Jan 21, 2013

I'm having exams at the moment, this will have to wait for 3 weeks.

Owner

andralex commented Jan 22, 2013

good luck!

Contributor

Ingrater commented Feb 20, 2013

I have a question about some implementation detail. Please see and discuss: http://forum.dlang.org/thread/kg1pcr$197c$1@digitalmars.com

Contributor

Ingrater commented Feb 26, 2013

Anyone? Should I just choose whatever I find best?

Member

alexrp commented Feb 26, 2013

Owner

WalterBright commented Mar 22, 2013

This pull grafts tagged variants onto the TypeInfo, rather than use virtual functions. Why not use virtual functions? That also avoids the whole issue of that .next returns.

(I know that the compiler itself uses tagging sometimes! But I'm not sure that is a good idea there, either.)

Contributor

Ingrater commented Mar 22, 2013

Because I designed it with writing vararg functions in mind. With tagging you can write vararg functions by using switch case statements to decide what to do for a certain type.

Member

MartinNowak commented Mar 23, 2013

Because I designed it with writing vararg functions in mind. With tagging you can write vararg functions by using switch case statements to decide what to do for a certain type.

Sounds like you could as well compare the TypeInfo.classinfo ptr. Could you please provide a code example of what you intend to do here.

@MartinNowak MartinNowak commented on the diff Mar 23, 2013

src/object_.d
+ auto rhsStruct = cast(const(TypeInfo_Struct))cast(void*)rhs;
+ return lhsStruct.name == rhsStruct.name;
+ case TypeInfo.Type.class_:
+ auto lhsClass = cast(const(TypeInfo_Class))cast(void*)lhs;
+ auto rhsClass = cast(const(TypeInfo_Class))cast(void*)rhs;
+ return lhsClass.name == rhsClass.name;
+ case TypeInfo.Type.interface_:
+ auto lhsInterface = cast(const(TypeInfo_Interface))cast(void*)lhs;
+ auto rhsInterface = cast(const(TypeInfo_Interface))cast(void*)rhs;
+ return lhsInterface.info.name == rhsInterface.info.name;
+ case TypeInfo.Type.typedef_:
+ case TypeInfo.Type.enum_:
+ auto lhsTypedef = cast(const(TypeInfo_Typedef))cast(void*)lhs;
+ auto rhsTypedef = cast(const(TypeInfo_Typedef))cast(void*)rhs;
+ return lhsTypedef.name == rhsTypedef.name;
+ default:
@MartinNowak

MartinNowak Mar 23, 2013

Member

How can you ensure that every unhandled Type has a valid TypeInfo.next field.
You should instead list all TypeInfos with a valid .next field explicitly.
It's also fairly confusing to read that TypeInfo.Type.double_ is handled by a recursive call with .next is null.

@Ingrater

Ingrater Mar 25, 2013

Contributor

They don't need a valid next field. Next may return null which will terminate the (tail) recursion. If TypeInfo.next would not return a const TypeInfo the tail recursion could be replaced by a loop.
Sorry but I don't see a TypeInfo.Type.double_ in the code?

@MartinNowak

MartinNowak Mar 26, 2013

Member

If TypeInfo.next would not return a const TypeInfo the tail recursion could be replaced by a loop.

This has been resolved with #359, please use a loop.

@alexrp

alexrp Mar 26, 2013

Member

It's kind of bad that we have a const system that basically requires tail recursion in cases like this one (if you don't want to break the type system), and yet we are (for some reason??) allergic to recursion.

@MartinNowak

MartinNowak Mar 26, 2013

Member

This is an isolated issue with const object references and there is Rebindable to work around it.

@Ingrater

Ingrater Mar 26, 2013

Contributor

Whats the problem with tail recursion? D supports functional programming, so why not use it?
Also I'm going to wait until there is agreement that this fix is actually wanted before investing even more time into it which might be wasted.

@MartinNowak

MartinNowak Mar 26, 2013

Member

Let's stop that discussion here. I'm basically nitpicking on the readability of this method but it's not a big deal.

Contributor

Ingrater commented Mar 25, 2013

No I can not compare the type info ptr because then it will not work across dll boundaries.
The description of the pull request says exactly what I'm trying to do here.

Member

MartinNowak commented Mar 25, 2013

No I can not compare the type info ptr because then it will not work across dll boundaries.

That's an ODR violation and does not need a fix in TypeInfo, see #142.

Contributor

Ingrater commented Mar 25, 2013

Well but until this ODR violation gets fixed a lot of time may pass.
Also the other two issues remain:

  1. Comparing type info objects produces garbage (leaks memory)
  2. Writing d-vararg functions is unneccessary complicated
Member

MartinNowak commented Mar 26, 2013

No I can not compare the type info ptr because then it will not work across dll boundaries.

OK, that's partly correct, e.g. for TypeInfo_Class of template classes.
Those are COMDAT but are unfortunately used in multiple DLLs/SOs unless we find a better solution.
The same is not true for basic types, those are strong symbols in druntime and we shouldn't simply hack up something that pretends working DLL support.

As it stands the best solution is to handle this differences in opEquals.
Because TypeInfo_Class can be COMDATs you additionally compare the names and baseclasses but you wouldn't for TypeInfo_i.
BTW I really dislike the idea of comparing names, because IMO it's perfectly valid to load an updated version of a class which is a distinct type with the same name.

  1. Writing d-vararg functions is unneccessary complicated

What's the problem of replacing

switch (ti.type)
{
case TypeInfo.Type.UByte: format(va_arg!ubyte(argptr)); break;
case TypeInfo.Type.UShort: format(va_arg!ushort(argptr)); break;
case TypeInfo.Type.UInt: format(va_arg!uint(argptr)); break;
default: break;
}

with

if (ti == typeid(ubyte)) format(va_arg!ubyte(argptr));
else if (ti == typeid(ushort)) format(va_arg!ushort(argptr));
else if (ti == typeid(uint)) format(va_arg!uint(argptr));

?

Note that the virtual opEquals will correctly handle comparison.

Member

MartinNowak commented Mar 26, 2013

BTW I just thought we should create a TypeInfo_Template which is emitted together with a TypeInfo_Class/Struct/... into EVERY object filed using the template instantiation. It's only purpose would be to specialize opEquals for multiple instances.
All typeinfos for non-templated UDT would ONLY go into the object that defines that type.
The former ones should be COMDATs to reduce the binary size but latter ones should not as they must be unique.

Owner

WalterBright commented Mar 26, 2013

I'm beginning to think that this would be much simpler if the test for equality was just a comparison of the pointers, and if the pointers are not equal, then the mangled type string is compared. The compiler can be adjusted to emit the mangled string for each typeinfo instance.

Contributor

Ingrater commented Mar 26, 2013

What's the problem of replacing

switch (ti.type)
{
case TypeInfo.Type.UByte: format(va_arg!ubyte(argptr)); break;
case TypeInfo.Type.UShort: format(va_arg!ushort(argptr)); break;
case TypeInfo.Type.UInt: format(va_arg!uint(argptr)); break;
default: break;
}

with

if (ti == typeid(ubyte)) format(va_arg!ubyte(argptr));
else if (ti == typeid(ushort)) format(va_arg!ushort(argptr));
else if (ti == typeid(uint)) format(va_arg!uint(argptr));

?

Your way has the complexity O(N). My way has O(1). Even std.format does not use your way. std.format partially parses the the mangeled type string and does a switch on that. Its also currently not easily possible to unqualify a type info object. If you find out that it is a const T, you can't get the T because calling next() will jump both the const and the T type info object. Just try implementeing a full printf wrapper with the current TypeInfo system. I'm my eyes its unneccessarily difficult to make it work for all types.

I'm beginning to think that this would be much simpler if the test for equality was just a comparison of the pointers, and if the pointers are not equal, then the mangled type string is compared. The compiler can be adjusted to emit the mangled string for each typeinfo instance.

But this would compare manageld type strings in most cases, because the common case is that the type info pointers don't match. With my implementation it will just compare two ints in most cases.

The same is not true for basic types, those are strong symbols in druntime and we shouldn't simply hack up something that pretends working DLL support.

AFAIK we already pretend working dll support?

BTW I really dislike the idea of comparing names, because IMO it's perfectly valid to load an updated version of a class which is a distinct type with the same name.

I disagree. I very recently wanted exactly that. I loaded a updated version of the class and wanted it to be the same type so I can make changes to code without restarting the application. See: http://3d.benjamin-thaut.de/?p=25

Member

MartinNowak commented Mar 26, 2013

I disagree. I very recently wanted exactly that. I loaded a updated version of the class and wanted it to be the same type so I can make changes to code without restarting the application. See: http://3d.benjamin-thaut.de/?p=25

I'll get more into the TypeInfo problems and we will find an appropriate solution but I'll have to defer this until next week.

Member

MartinNowak commented Mar 26, 2013

Your way has the complexity O(N). My way has O(1).

You can switch to template vararg functions for performance and I doubt that replacing N pointer comparisions with 1 table lookup would get you a much faster printf wrapper.

Member

MartinNowak commented Apr 8, 2013

I loaded a updated version of the class and wanted it to be the same type

Two distinct TypeInfo_Class instances with different vtables and init[] data should not compare equal.
I also don't see that your code relies on that fact.

Member

MartinNowak commented Apr 8, 2013

I'm beginning to think that this would be much simpler if the test for equality was just a comparison of the pointers, and if the pointers are not equal, then the mangled type string is compared. The compiler can be adjusted to emit the mangled string for each typeinfo instance.

In case that two shared libraries instantiate the same template class we get two weak TypeInfo instances.
Those get merged if at least one definition is available when linking the executable.
In the case that both shared libraries are loaded at runtime the weak symbols won't get merged and we need to resort to comparing the mangled name (also vtable size and init size must match, not their content though).

NB:
C++

Member

MartinNowak commented Apr 8, 2013

The compiler can be adjusted to emit the mangled string for each typeinfo instance.

That would potentially consume a lot of space. AFAIU we only need mangle comparison for weak TypeInfos. So we should use a dedicated wrapper that forwards all methods similar to TypeInfo_Typedef.

class TypeInfo_Weak : TypeInfo
{
    override bool opEquals(Object o)
    {
        if (this is o)
            return true;
        else if (o.classinfo is TypeInfo_Weak.classinfo)
        {
            auto c = *cast(const TypeInfo_Weak*)&o;
            return this.base == c.base &&
                this.mangledName == c.mangledName;
        }
        return false;
    }

    // other overrides like TypeInfo_Typedef

    TypeInfo base;
    string mangledName;
}
Member

MartinNowak commented Apr 8, 2013

I don't see a very convincing argument to add tagged classes. Most TypeInfo comparison are simple pointer comparisons and weak TypeInfo can get a mangledName field as fallback.

It's true that cross-DLL TypeInfo comparison currently fails as well as cross-DLL exception handling.
This is a consequence of linking DLLs against a static phobos/druntime thereby violating ODR. This is a
longstanding bug that gets solved by finalizing Windows DLL support.

Owner

andralex commented May 6, 2013

@dawgfoto what's the status of this?

Member

MartinNowak commented May 6, 2013

I don't think faster vararg functions are a compelling enough use cases to make TypeInfo a tagged variant. For now replacing the tag comparisons with ti.classinfo == TypeInfo_Struct.classinfo should do the job.

Contributor

Ingrater commented May 8, 2013

I don't think faster vararg functions are a compelling enough use cases to make TypeInfo a tagged variant. For now replacing the tag comparisons with ti.classinfo == TypeInfo_Struct.classinfo should do the job.

This will still break in case of dlls until a shared druntime is implemented (which will not happen on all plattforms in the near future in my opinion). But if you don't think this is worth it I don't care anymore.

@Ingrater Ingrater closed this May 8, 2013

Member

MartinNowak commented May 8, 2013

I opened a Bugzilla 10048 for the issue.

This will still break in case of dlls until a shared druntime is implemented

Because it uses == instead of is, it should work, as long as we account for it in TypeInfo_Class.opEquals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment