New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Dictionary #4

Merged
merged 2 commits into from Oct 22, 2017

Conversation

Projects
None yet
2 participants
@bsinky
Collaborator

bsinky commented Oct 14, 2017

Implemented most of what was discussed in issue #3.

BalancedTree<K,V> is used as the primary backing data structure of the Dictionary. This appears to work well, and doesn't present the issues that Map was. There may be drawbacks to using BalancedTree for all types of keys, but I haven't discovered them.

@:arrayAccess is only allowed on abstract types in Haxe, hence why Dictionary is an abstract instead of a class.

toMap is left unimplemented as returning a Map abstract posed the same problems during compilation I detailed in #3. We could have it return IMap<K,V> and just return the internal BalancedTree reference, or a copy of it.

Please let me know if anything in the unit tests, documentation, or implementation is lacking. Thanks in advance!

@nightblade9

This comment has been minimized.

Owner

nightblade9 commented Oct 14, 2017

This is a pretty large code review. I'm going to take my time going through it, including looking things up and asking questions, to make sure I understand all the implications of the implementation.

@nightblade9

This comment has been minimized.

Owner

nightblade9 commented Oct 14, 2017

I don't think BalancedTree is a good choice. According to the docs, it uses Reflect.compare to compare objects, which is "unspecified" for non-string non-integer values. This seems like overkill to have this executed at runtime for every operation.

I'm not clear on why we need another InternalDictionary type with some methods (shouldn't these just be part of Dictionary?). I'm also not clear on your use of @:abstract and @:forward, but let's put that side for now. This seems like a lot of extra code to maintain and a lot of it is not trivially obvious as to why it's written that way.

Can we try a simpler approach? What about a simple ObjectMap<Any, Any> as the internal storage? Any is supposed to be more type-safe than Dynamic, although you may need some type constraints and/or explicit typecasts to make that work.

Does it make sense to try that in a branch?

@bsinky

This comment has been minimized.

Collaborator

bsinky commented Oct 14, 2017

InternalDictionary is needed to be the foundation of the Dictionary abstract in order to provide an O(1) count property implementation. I suppose we might not need count to be O(1) and might find the O(N) implementation acceptable, but I believe .NET's Dictionary.Count is a O(1) operation so I was aiming for that.

Dictionary must be an abstract in Haxe to allow the use of @:arrayAccess, as it is not allowed on class types. But, an abstract cannot contain instance fields of its own (e.g. the count property).

So, without the InternalDictionary class, count on Dictionary would be O(N) since we would need to loop over all the items in the collection just to obtain a count, as none of the standard library Map implementations (and related classes like BalancedTree) seem to contain a length or count property.

For the simpler approach, I did a quick test with ObjectMap<Any,Any> here, but the compiler does not even allow this, raising the following error:

Test.hx:5: characters 22-40 : Constraint check failure for haxe.ds.ObjectMap.K
Test.hx:5: characters 22-40 : Any should be { }

Still no replies on the mailing list topic, but 18 views. Perhaps someone somewhere is ruminating on it.

It would seem that you're right about BalancedTree not being a proper candidate for Key types that are not Int or String. But, EnumValueMap gets around this by simply extending BalancedTree and overriding the compare method. This doesn't help in making the implementation simpler, but barring any miracles at this point, it doesn't seem like there is a simple way to accomplish this.

@nightblade9

This comment has been minimized.

Owner

nightblade9 commented Oct 15, 2017

I spent about an hour messing around with Map, IMap, Any, and Dynamic, and I still can't get a simple solution that just works.

I think we'll go with your solution; it just needs more comments so it's clear to future readers why the design is how it is (which is really not obvious unless you read deeply into Haxe).

I'll try to post detailed code-review comments tomorrow.

Also, overriding BalanceMap and compare won't work. It means everyone who uses Dictionary has to also override compare, which makes it (much?) harder to use. I know Java has something similar in terms of requirements (you have to override hashcode and equals if you want a well-distributed map.)

@nightblade9

This comment has been minimized.

Owner

nightblade9 commented Oct 15, 2017

@bsinky Simn's comment on that thread makes me very worried. I would not like to find out that we have poor performance on this or that target. I think they chose performance via a compile-time switch (through the use of @:abstract) over usability. HaxeSharp will chose usability over performance, but I don't know if that's the right decision.

I thought about (and tried) making the data backing of Dictionary an IMap and switching based on the key type, but it seems like (unlike C#) there's no easy way to do that in Haxe. There's no if T == Int type of call you can make. (Std.is doesn't seem to work either.)

Can you try another experiment for me? (You can clone this repo and try it in a branch so you don't destroy your changes.) I still think we can get away with using the strategy pattern.

What I'm suggesting:

  • Make the data backing of Dictionary an ObjectMap with an object type of the key/value types
  • Wrap the key type (if it's int/string/enum) into an object
  • Make all the methods (get, set, hasKey, etc.) work as expected

To do this cleanly, without duplicating code or horrible if type == ... if-statements, we probably need to create some sort of KeyWrapper class. This can take the actual dictionary value, and store it internally; then all the operations just call var val = KeyWrapper.getValue and everything should work as-is. (data will be an instance of IMap<KeyWrapper, T> I think.)

Does that make sense? Can we try that and get an idea of how bad/good a solution it is?

Future change: we should implement IMap so users of IMap can quickly switch to using Dictionary.

@nightblade9

This comment has been minimized.

Owner

nightblade9 commented Oct 15, 2017

I created a rough prototype of what I had in mind.

The one (major) problem is that we can't check keys for equality because they're objects. If we can solve that (or reduce that to an O(1) operation). For example, ObjectMap (in some targets) keeps an internal key; we could do a two-step process:

  • For any object, create an ObjectMap<ObjectWrapper, Int> that maps the wrapper to an auto-incrementing integer in O(1)
  • Create an IntMap that maps the integer key to the value
  • Whenever we need to get the thing with key K, we look up the integer key, then look up the value, and we're done in two operations

The runtime is pretty good (still O(1)) but the additional space usage is not great. It does give us O(1) read access, though ...

@nightblade9

This comment has been minimized.

Owner

nightblade9 commented Oct 20, 2017

@bsinky did I make you frustrated and rage-quit? Sorry :/

@bsinky

This comment has been minimized.

Collaborator

bsinky commented Oct 20, 2017

Sorry for the delay, unfortunately I just haven't found time to work on this.

I haven't rage-quit, but all the issues that have come up so far have been a bit discouraging. It's unfortunate that it seems like we'll be locked into using just ObjectMap, and can't get a solution that makes full use of all the standard library Map implementations. I imagine if we could use IntMap or StringMap directly for storage, performance would be better, since (I think) there are per-target externs for those classes, which I assume was to achieve the most efficient implementation possible.

@nightblade9

This comment has been minimized.

Owner

nightblade9 commented Oct 20, 2017

Yeah, I agree that it's discouraging. It just seems like Haxe doesn't provide enough reflection/syntax functionality for us to check details about a generic type T at runtime, other than to add constraints (which doesn't work for us here) -- and method overloading isn't an option here.

I think we just fundamentally need to pick between performance and usability. I'm not 100% sure usability will win.

@nightblade9

This comment has been minimized.

Owner

nightblade9 commented Oct 22, 2017

Let's just go with this to start. I have no major qualms with this, although I'm not a fan of the resulting API (which requires you to know a lot more about Haxe).

@nightblade9 nightblade9 merged commit ecfe3cc into nightblade9:master Oct 22, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@nightblade9 nightblade9 referenced this pull request Oct 22, 2017

Closed

Add Dictionary Support #3

@nightblade9

This comment has been minimized.

Owner

nightblade9 commented Oct 23, 2017

Ah, I forgot that I didn't read through your test code in excruciating detail. I'll post the code review now; please make the appropriate changes whenever you get some time.

@nightblade9

Needs a bit of work, mostly superficial stuff.

public function keys()
{
return [
for (key in this.keys())

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

I don't really like this syntax style, although it is quite succinct. Could you please add curly-braces and collapse the for-statement into one line, eg. for (key in x) { key } ?

public function values()
{
return [
for (value in this)

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

Ditto here please.

// can't remove from Dictionary while iterating through keys, so retrieve list of all keys here first
var allKeys = keys();
for (key in allKeys)

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

Please add curly-braces to all for/if statements. (It often leads to bugs when people add a second statement, not realising that it's not part of the if/while loop).

}
/**
Extends BalancedTree to maintain an internal count of keys/values.

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

You can replace this block comment with simple // inline comments because it's internal.

/**
Extends BalancedTree to maintain an internal count of keys/values.
Not intended to be used outside of HaxeSharp.

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

You can replace this with @:dox(hide) instead. Much cleaner IMO 😄

/**
An exception indicating that the given arguments are not valid
*/
class ArgumentNullException extends Exception

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

Shouldn't this extend ArgumentException?

This comment has been minimized.

@bsinky

bsinky Oct 24, 2017

Collaborator

You're right; fixed.

/**
The exception that is thrown when a null reference is passed to a method that does not accept it as a valid argument.
*/
class ArgumentException extends Exception

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

+1 for adding two more .NET common exception classes.

public function dictionaryAllowsIntKeys()
{
testDictionaryGetSetAndRemove(1, 1, 7);
testDictionaryMapConstructor([

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

I think it's really interesting how you refactored out all the common code in a type-safe way (I couldn't figure this out). The test is not easy to read as a result, but I think I'm okay with that trade-off.

}
@Test
public function addThrowsArgumentExceptionOnDuplicateKey()

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

I was going to say "are you missing a test where set throws on duplicate keys" but I realized that add vs. set have different semantics. I'm not sure if this is a Bad Thing.

But, this is a normal byproduct of our array accessors. This is fine.

This comment has been minimized.

@bsinky

bsinky Oct 24, 2017

Collaborator

My idea was the duplicate the functionality of .NET Dictionary's [] setter, which will not throw on duplicate keys.

}
@Test
public function removeThrowsArgumentNullExceptionOnNullKey()

This comment has been minimized.

@nightblade9

nightblade9 Oct 23, 2017

Owner

I don't see a test for "remove removes the object" (verify via get), is that test missing?

This comment has been minimized.

@bsinky

bsinky Oct 24, 2017

Collaborator

Good point, I a bit of code to ensure that containsKey returns false after the specified key is removed.

@bsinky

This comment has been minimized.

Collaborator

bsinky commented Oct 23, 2017

Thanks for the review, I'll try to start on it later today!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment