New implementation of LinkedHashSet/LinkedHashMap #11369

Open
joshlemer opened this Issue Dec 27, 2018 · 23 comments

6 participants
Member

joshlemer commented Dec 27, 2018

Recently, @szeiger rewrote the implementations of mutable.HashMap and mutable.HashSet here. So far as I see there's no reason not to implement LinkedHashSet/Maps the same way, and in fact doing so will allow LinkedHashMaps/Sets to participate in hashcode sharing between:

  • s.c.i.HashMap
  • s.c.i.HashSet
  • s.c.i.VectorMap
  • s.c.m.HashMap
  • s.c.m.HashSet
  • s.c.m.HashMap#keySet
  • s.c.i.HashMap#keySet
  • s.c.i.VectorMap#keySet

as started here

Member Author

joshlemer commented Jan 7, 2019

@szeiger would you approve this?

szeiger commented Jan 10, 2019

I haven't paid much attention to LinkedHashMap so far. I think any performance improvement would be welcome, especially if we can share code again with the new HashMap implementation.

mghildiy commented Jan 12, 2019

I would like to contribute here.

Member Author

joshlemer commented Jan 12, 2019

@mghildiy it's all yours if you want it 😄

adriaanm transferred this issue from scala/scala-dev on Jan 16, 2019

Member

adriaanm commented Jan 16, 2019

Is this still under consideration for RC1?

adriaanm added this to the 2.13.0-RC1 milestone on Jan 16, 2019

mghildiy commented Jan 16, 2019

I went through mutable.HashMap and found broadly the following changes:

  • Extends mutable.AbstractMap

  • Class-level parameters added: initialCapacity: Int, loadFactor: Double

  • HashTable field replaced with an array of nodes: private[this] var table = new Array[Node[K, V]](…)

  • Field 'contentSize' introduced

  • Method 'size' overridden

  • Inline method 'computeHash' added

  • Inline method 'index' added

  • Inline method 'findNode' added to find the entry for an input key (k) in the array 'table'

  • API methods overridden: sizeHint, addAll, remove

  • Abstract iterator class HashMapIterator added

  • Private classes Node and DeserializationFactory added

  • Modified methods: apply, getOrElse, getOrElseUpdate, put
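
To make the roles of 'computeHash', 'index' and 'findNode' concrete, here is a minimal sketch of an array-of-nodes hash table. It is not the actual mutable.HashMap source: TinyHashMap, improveHash and the simplified tableSizeFor are assumptions made for illustration, and resizing by load factor is left out.

```scala
object HashTableSketch {
  // One entry in a bucket chain; the improved hash is cached in the node.
  final class Node[K, V](val key: K, val hash: Int, var value: V, var next: Node[K, V])

  // Hypothetical, simplified map: no resizing, no load factor.
  final class TinyHashMap[K, V](initialCapacity: Int = 16) {
    // Round the requested capacity up to a power of two so that
    // `index` can use a bit mask instead of a modulo.
    private def tableSizeFor(capacity: Int): Int =
      Integer.highestOneBit(math.max(capacity - 1, 4)) * 2

    private val table = new Array[Node[K, V]](tableSizeFor(initialCapacity))
    private var contentSize = 0

    def size: Int = contentSize

    // Mix the high bits into the low bits, since only low bits pick the bucket.
    private def improveHash(h: Int): Int = h ^ (h >>> 16)
    private def computeHash(key: K): Int = improveHash(key.##)
    private def index(hash: Int): Int = hash & (table.length - 1)

    // Walk the bucket chain; returns null when the key is absent.
    private def findNode(key: K): Node[K, V] = {
      val hash = computeHash(key)
      var n = table(index(hash))
      while (n != null && !(n.hash == hash && n.key == key)) n = n.next
      n
    }

    def get(key: K): Option[V] = {
      val n = findNode(key)
      if (n == null) None else Some(n.value)
    }

    def put(key: K, value: V): Unit = {
      val n = findNode(key)
      if (n != null) n.value = value
      else {
        val hash = computeHash(key)
        val i = index(hash)
        table(i) = new Node(key, hash, value, table(i)) // prepend to the chain
        contentSize += 1
      }
    }
  }
}
```

Tracking 'contentSize' separately is what lets 'size' be overridden as a constant-time field read instead of a traversal.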

szeiger commented Jan 17, 2019

"changes" is an understatement - it's a completely new implementation. I think the idea here is to start with a copy of the new HashMap implementation and make the required changes for keeping track of the links.

Member Author

joshlemer commented Jan 17, 2019

@mghildiy, @szeiger is right, and the same applies to LinkedHashSet.

mghildiy commented Jan 17, 2019

Thanks for the inputs.

Member

pavelpavlov commented Jan 18, 2019

I would also like to contribute if there's a need for it.
We use linked collections very heavily in our code.

mghildiy commented Jan 19, 2019

Some methods, such as writeObject and readObject, are never called in the current implementation of LinkedHashMap.
I would remove them.
I would also remove any other code that is no longer needed, since many implementations would now be inherited from HashMap.

Member Author

joshlemer commented Jan 19, 2019

@mghildiy actually those methods are being used. They are "magic methods" that Java knows to call during Java serialization/deserialization. Also, I would recommend not extending HashMap to implement LinkedHashMap, but maybe some of the hash table code can be factored out for reuse.
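
As an illustration of why such methods look unused, here is a small sketch of the Java serialization hook. Bag and roundTrip are hypothetical names invented for this example; only the private writeObject/readObject signatures come from the Java serialization mechanism. Nothing in the code calls the two private methods directly, yet ObjectOutputStream and ObjectInputStream find and invoke them reflectively.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable.ArrayBuffer

object SerializationDemo {
  // Hypothetical container, not part of the Scala library.
  final class Bag(elems: Seq[String]) extends Serializable {
    // Transient: skipped by default serialization, so it must be
    // written and rebuilt by hand in the two hooks below.
    @transient private var buffer: ArrayBuffer[String] = ArrayBuffer.empty[String] ++= elems

    def contents: List[String] = buffer.toList

    // Never called from Scala code; invoked by ObjectOutputStream.
    private def writeObject(out: ObjectOutputStream): Unit = {
      out.defaultWriteObject()
      out.writeInt(buffer.size)
      buffer.foreach(s => out.writeObject(s))
    }

    // Never called from Scala code; invoked by ObjectInputStream.
    private def readObject(in: ObjectInputStream): Unit = {
      in.defaultReadObject()
      val n = in.readInt()
      buffer = ArrayBuffer.empty[String]
      var i = 0
      while (i < n) {
        buffer += in.readObject().asInstanceOf[String]
        i += 1
      }
    }
  }

  // Serialize to bytes and read the object back.
  def roundTrip[A <: AnyRef](a: A): A = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(a)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }
}
```

If the two private methods were deleted, the transient buffer would come back as null after deserialization, so removing them changes behavior even though no call site exists in the source.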

mghildiy commented Jan 20, 2019

OK, so you mean a better approach would be to pull the reusable code out of mutable.HashTable into a class/object and use it wherever needed, such as in the classes listed in the original post.
PS: Extending HashMap is also the approach used in Java's LinkedHashMap.

mghildiy commented Jan 27, 2019

As far as I understand, the new implementation of HashMap/HashSet no longer has any hashtable-related code.

mghildiy commented Feb 6, 2019

Hi,

Any more inputs here?

Member Author

joshlemer commented Feb 7, 2019

@mghildiy If I were doing it, I would simply copy-paste mutable.HashMap as LinkedHashMap and make the changes necessary to link the nodes (taking the linking logic from the current LinkedHashMap), and do the same for LinkedHashSet.

Member Author

joshlemer commented Feb 7, 2019

@mghildiy the first step to changing the mutable.HashMap code into LinkedHashMap would be to change its private Node[K, V] class to include fields var earlier: Node[K, V], var later: Node[K, V]
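
A rough sketch of what that could look like, assuming a fixed-size table and hypothetical names (TinyLinkedHashMap, firstNode, lastNode); only the earlier/later fields come from the suggestion above. Each node sits in two structures at once: its bucket chain (next) for lookup, and a doubly linked list (earlier/later) that records insertion order.

```scala
object LinkedSketch {
  final class Node[K, V](val key: K, val hash: Int, var value: V) {
    var next: Node[K, V] = null    // next node in the same bucket
    var earlier: Node[K, V] = null // node inserted just before this one
    var later: Node[K, V] = null   // node inserted just after this one
  }

  // Hypothetical, simplified map: fixed table size, no resizing.
  final class TinyLinkedHashMap[K, V] {
    private val table = new Array[Node[K, V]](16)
    private var firstNode: Node[K, V] = null
    private var lastNode: Node[K, V] = null

    private def computeHash(key: K): Int = { val h = key.##; h ^ (h >>> 16) }
    private def index(hash: Int): Int = hash & (table.length - 1)

    def put(key: K, value: V): Unit = {
      val hash = computeHash(key)
      val i = index(hash)
      var n = table(i)
      while (n != null && !(n.hash == hash && n.key == key)) n = n.next
      if (n != null) n.value = value // existing key keeps its position
      else {
        val node = new Node(key, hash, value)
        node.next = table(i)
        table(i) = node
        // Append to the insertion-order list.
        if (lastNode == null) firstNode = node
        else { lastNode.later = node; node.earlier = lastNode }
        lastNode = node
      }
    }

    def remove(key: K): Unit = {
      val hash = computeHash(key)
      val i = index(hash)
      var prev: Node[K, V] = null
      var n = table(i)
      while (n != null && !(n.hash == hash && n.key == key)) { prev = n; n = n.next }
      if (n != null) {
        // Unlink from the bucket chain.
        if (prev == null) table(i) = n.next else prev.next = n.next
        // Unlink from the insertion-order list.
        if (n.earlier == null) firstNode = n.later else n.earlier.later = n.later
        if (n.later == null) lastNode = n.earlier else n.later.earlier = n.earlier
      }
    }

    // Iteration follows the links, not the table, so order is insertion order.
    def iterator: Iterator[(K, V)] = new Iterator[(K, V)] {
      private var cur = firstNode
      def hasNext: Boolean = cur != null
      def next(): (K, V) = { val r = (cur.key, cur.value); cur = cur.later; r }
    }
  }
}
```

With this layout, lookup stays a plain bucket walk, and only put and remove need the extra link maintenance.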

Let me know if you have any more questions 😄

mghildiy commented Feb 7, 2019

Thanks for the input, @joshlemer. I will look into it this weekend.

mghildiy commented Feb 10, 2019

There are some methods, like apply, which are not implemented in the current version of immutable.LinkedHashMap but are present in immutable.HashMap.
Should they be implemented?

Member

SethTisue commented Feb 10, 2019

some methods [...] are not implemented

you mean because the implementations are inherited? in many cases the inherited implementations are fine and there's no reason to override them. concrete collections may override some of them, usually for efficiency
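
A small sketch of that point, assuming the Scala 2.13 collections library and a hypothetical AssocListMap class: extending mutable.AbstractMap only requires get, iterator, addOne and subtractOne, while apply, getOrElse, contains and many others are inherited. Here size is the one override, added purely for efficiency.

```scala
import scala.collection.mutable

object InheritedDemo {
  // Hypothetical map backed by an association list; for illustration only.
  final class AssocListMap[K, V] extends mutable.AbstractMap[K, V] {
    private var entries: List[(K, V)] = Nil
    private var contentSize = 0

    // The four members a mutable.AbstractMap subclass must provide.
    def get(key: K): Option[V] =
      entries.collectFirst { case (k, v) if k == key => v }

    def iterator: Iterator[(K, V)] = entries.iterator

    def addOne(elem: (K, V)): this.type = {
      val without = entries.filterNot(_._1 == elem._1)
      if (without.length == entries.length) contentSize += 1 // new key
      entries = elem :: without
      this
    }

    def subtractOne(key: K): this.type = {
      val without = entries.filterNot(_._1 == key)
      if (without.length != entries.length) contentSize -= 1 // key was present
      entries = without
      this
    }

    // Optional override: the inherited size would count by iterating.
    override def size: Int = contentSize
  }
}
```

Calling m("a") on such a map works without any apply definition in the class, because the inherited apply delegates to get.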

mghildiy commented Feb 11, 2019

Sorry, my mistake. I was actually referring to the mutable versions, not the immutable ones.

mghildiy commented Feb 14, 2019

I am done with the changes in LinkedHashMap.
Do I need to write a benchmark class too, just like the one done for HashMap?
Also, there is currently a single test for LinkedHashMap (in LinkedHashMapTest.scala). Should I add tests for all of the code in LinkedHashMap.scala?

Member Author

joshlemer commented Feb 14, 2019

Do I need to write a benchmark class too, just like the one done for HashMap?

No, not necessarily.

Also, there is currently a single test for LinkedHashMap (in LinkedHashMapTest.scala). Should I add tests for all of the code in LinkedHashMap.scala?

I think if you just get all existing LinkedHashMap tests to pass, that should be enough to put up a preliminary [WIP] pull request
