-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add immutable TreeSeqMap (a SeqMap implemented via a customized IntMap/HashMap pair) #7146
Conversation
I gave this a quick look this morning, and it looks really good. I will try to give it a full review later this week or this weekend. cheers! |
I was thinking that since there's nowhere where you're using the TreeMap's lookup functionality, you could get by with a persistent min-max heap. Unfortunately none exist in the standard library, but maybe we could create a |
Nice work, I think in general a properly optimized |
Thanks, @mdedetrich! I must stress that this PR is highly inspired by, and partly based on, your |
@odd Yup completely understand, I am curious what the real world results. Apologies if I comment came off as being too critical. The devil is usually in the detail with these things, I will also try and optimize both implementations if I have enough time |
I pushed an updated version which replaces the |
For reference, the performance improvements to |
supposing we merge this, would there be any reason to keep |
@SethTisue Assuming both |
@SethTisue Afaik the issue is its hard to say which collection is overall better. Afaik Then there is also memory to to take into account, I think that |
I have now (finally) updated and pushed my latest benchmarking code to collection-benchmark; here are the results comparing the three
|
101bf9c
to
c23b977
Compare
@odd Thanks for the benchmarks, I think they vaguely confirmed my suspicions @SethTisue I think from the results its hard to argue that My personal vote is that |
@mdedetrich Thanks for the feedback, I disagree with your conclusion though 🤓 By looking at the benchmark results a couple of things can be noted:
All together I believe the points above speaks for making |
@odd Honestly I am personally borderline, completely understand (and agree!) with your points, the issue I have is that Its the whole using roughly 50% more memory than a collection which is already borderline in terms of acceptable memory usage that I have an issue with (and collections that use a lot of memory pose their own problems, with such high memory churn you are often competing with cache lines versus other resources which isn't something that shows up in benchmarks. Compounded ontop of this we are dealing with the JVM environment which doesn't have the best track record when it comes to memory conservation). So to me the whole equation is whether we should balance using a huge amount of memory against an operation which is at least acceptable? in terms of performance. @Ichoran @NthPortal @joshlemer @julienrf What are your thoughts on this? |
I'm inclined to agree with @odd that in their current incarnations, On the other hand, I'm a little surprised at the results for @mdedetrich, do you have time to track down why |
Also @odd how do you know that (I've run collections-laws on |
@Ichoran I've been meaning to run |
@mdedetrich I will rerun the benchmarks later today on a local Scala build which includes #7508 and also an updated |
@Ichoran Regarding the name
|
|
@joshlemer I considered |
Hmm, well I don't like it a lot because there isn't actually a Seq anywhere backing it, like a
|
@mdedetrich Here are the results of the latest benchmark runs:
|
@joshlemer But the principal interface is called |
I think it makes most sense to call it a TreeSeqMap because what it is is a SeqMap (not a HashMap, though it contains one that is not what it is). It's a SeqMap which is backed by both a HashMap and a Tree, so I think the last word should be SeqMap, and then at the front put something like I think the |
👍 to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just need to rename to TreeSeqMap
and then I guess it's okay. I still am uneasy about the ordering policy, but I will leave that to someone else to decide.
d52cdcc
to
ad4d0c6
Compare
@Ichoran I have changed the name and squashed and pushed but it still looks like there is change requested by you. The problem is that I cannot find your comments any more (I had to do a force push to avoid the earlier commit that failed by Codacy so the change request might have got sidelined somehow). Please let me know if you think anything more should be changed (besides your reservation about the modification ordering). |
Looks good! (Still uneasy about the insertion/modification thing, but not so uneasy that we should hold this up.) |
@odd Thanks for the benchmarks, there are definitely substantial improvements in performance to Is it also possible to document how you generate your benchmarks, don't want to ping you every-time I run them! |
/rebuild |
#8041 proposes to change the default to |
This is an alternative
SeqMap
implementation to compare againstVectorMap
. It uses a customizedIntMap
to provide insertion/modification iteration ordering and aHashMap
to allow efficient lookup by key.