Ordered Dict based on RobinDict implementations #38155

PallHaraldsson · 2020-10-23T17:30:08Z

No description provided.

…ict.

to try to silence test error. Change Set to, so no unordered used.

skip_deleted_floor! from the old Dict (now UDict), that no ordered dict has and ht_keyindex2! (OrderedDict only has ht_keyindex2; swissdict.jl does, but it's not ordered).

JeffBezanson · 2020-10-23T20:06:56Z

Please stop opening PRs like this. Do you really expect somebody to just click "merge" on a large PR with zero description? A change this significant, in particular, would need a detailed proposal and benchmarks.

PallHaraldsson · 2020-10-23T22:32:18Z

Sorry about that, no I was trying out a different implementation, and wasn't expecting anyone to even look until tests passed. I'm having trouble testing on my local machine with them slowing down so much. I actually thought this one would work, then would have filled out.

oscardssmith · 2020-10-23T22:49:11Z

Note that you can mark PRs as drafts to signal to people that you know it isn't ready and are just experimenting.

PallHaraldsson · 2020-10-23T23:36:22Z

Thanks, I'm just new to this, and have now done it to #38145, but it seems too late here (and maybe would be unlikely to be merged anyway; at least I can no longer see the CI errors). Anyway, I'm not sure I'll finish this, it's getting very frustrating.

JeffBezanson · 2020-10-24T03:33:40Z

You can put in the description something like "Just opening this to see if CI passes". Then we will know what's going on.

KristofferC · 2020-10-24T09:12:36Z

Anyway, I'm not sure I'll finish this, it's getting very frustrating.

I can understand that. However, I must say that it doesn't seem like you appreciate the magnitude of the change you are trying to make. You are trying to replace a core data structure which is used everywhere and has very high requirements on its performance. In addition, it introduces a new concept into Base (the concept of a dict being ordered). A PR like that would require extensive benchmarks, a discussion about the trade-offs, potential ways it might break code, how other data structures like Set are influenced etc etc. It might have been a gold idea to discuss the idea and make sure enough people are on board with the idea itself before starting fighting with CI.

PallHaraldsson · 2020-10-24T16:13:26Z

I do realize "the magnitude of the change", by now... :) and my new proposal avoids all those issues, by not replacing Dict, the extensive benchmarks are not needed, this wouldn't break any code, since a new type, wouldn't change Set either, and I also think Dict is often not a performance bottleneck.

StefanKarpinski · 2020-10-24T18:39:09Z

But adding a new Base type is an even bigger change.

PallHaraldsson · 2020-10-24T20:41:46Z

Why "even bigger change"? I guess we should rather continue the discussion at my proposal #38163, as the implementation hasn't been decided (i.e. maybe not based on [Ordered]RobinDict, this failed PR).

The size of the PR is similar or smaller, what I meant is it's way easier to get through CI.

JeffBezanson · 2020-10-26T03:15:15Z

There is no reason to simply move the code for OrderedDict (which already exists) into Base. The change (if any) to consider here is whether Base.Dict should be ordered.

timholy · 2020-10-26T10:28:21Z

@PallHaraldsson, I think part of what people are saying is that there seems to be a mismatch between the seriousness of the change you're proposing and the depth of justification, analysis, and implementation you're providing. All the code you're introducing to base Julia already exists and is available to any Julian in the world---there is no new functionality that your PRs enable that isn't already achievable, and there are precious few advantages to having it in Base over a package. The only question is what should the "default" implementation in Julia be? This is a question that touches on many issues and requires serious benchmarking, profiling, and use-case analysis. None of this work has appeared in any of your PRs; your justifications are essentially quotes pulled from discourse. Reading what the community says and trying to take action to address their concerns is noble, and I endorse the spirit with which you're doing this work (I know you are genuinely trying to make Julia better), but the reason it hasn't been done yet is that it's a lot more complicated than copying code in OrderedCollections.jl and DataStructures.jl and pasting it here.

How to do this more productively? Here are examples of some questions that need to be addressed: are there bugs in the existing implementation? What kind of code gets easier to write if we make them ordered, and are there examples where things get more confusing? (For example, ordering seems to lead to periodic requests to support indexing by order-of-insertion, but this sets up difficult-to-resolve questions: JuliaCollections/OrderedCollections.jl#64. The absence of what seems to be natural functionality can cause additional confusion that the existing implementation does not.) What kinds of workloads do we want to optimize for, which will get faster, and which will get slower? Where are the performance bottlenecks in various implementations? How do changes affect both compile-time latency and runtime performance? There is essentially no benefit to submitting PRs such as these without such analysis.

The other issue to keep in mind is that there are nearly 1000 watchers of this repository, and they get an email or notification for each commit or comment you make. Too many "trial balloons" and people could start feeling like they are getting spammed. Likewise, 50 posts each reporting one additional tweak or tiny chance in a benchmark would also be annoying; much better to spend several days developing a whole suite and then posting a serious analysis of the results.

I'd suggest that you abandon the notion of getting this change in for Julia 1.6, and focus instead on building a more sustainable path for your contributions to Julia: choose a simpler starting project, start building familiarity with the tools you need to analyze code and its weaknesses, and develop an area of expertise where you deeply understand the issues and tradeoffs that affect design. From there, you can slowly grow your "portfolio" until you arrive at the issues (like core data structures and compile latency) that you seem to find interesting but which have high demands in terms of making contributions. In addition to Julia itself, there are many packages where you could surely gain this kind of expertise, and the fact that some packages are not "under the microscope" in the way that Julia itself is may make it easier for you to iterate through the cycle of problem identification, proposed fix, and seeing your changes merged. The more times you get through this cycle, the faster you learn, and if you start with a less high-profile project than the language itself, you may find that your learning curve is steeper and gets you faster to where you want to be.

Páll Haraldsson added 10 commits October 22, 2020 21:41

Change Dict to be ordered by default, changing older to unexported Ud…

9addde2

…ict.

Fixes docs? And revert making WeakKeyiDict ordered.

cbd8120

Silence doc error?

abc8309

Fix the name you get back.

e641421

One more doc fix.

a6c69d7

Change WeakKeyDict to use the new Dict (it was failing with old)

755a156

to try to silence test error. Change Set to, so no unordered used.

Change Set back to unordered, i.e. UDict, as it invasively uses

760fe11

skip_deleted_floor! from the old Dict (now UDict), that no ordered dict has and ht_keyindex2! (OrderedDict only has ht_keyindex2; swissdict.jl does, but it's not ordered).

Change Dict to be ordered by default based on RobinDict implementations

e74f506

Fix whitspace

6570fbe

Drop extra file

a3518b0

JeffBezanson closed this Oct 23, 2020

PallHaraldsson deleted the ordered_dict_based_on_robin branch October 23, 2020 22:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ordered Dict based on RobinDict implementations #38155

Ordered Dict based on RobinDict implementations #38155

PallHaraldsson commented Oct 23, 2020

JeffBezanson commented Oct 23, 2020

PallHaraldsson commented Oct 23, 2020 •

edited

Loading

oscardssmith commented Oct 23, 2020

PallHaraldsson commented Oct 23, 2020

JeffBezanson commented Oct 24, 2020

KristofferC commented Oct 24, 2020

PallHaraldsson commented Oct 24, 2020 •

edited

Loading

StefanKarpinski commented Oct 24, 2020

PallHaraldsson commented Oct 24, 2020

JeffBezanson commented Oct 26, 2020

timholy commented Oct 26, 2020

Ordered Dict based on RobinDict implementations #38155

Ordered Dict based on RobinDict implementations #38155

Conversation

PallHaraldsson commented Oct 23, 2020

JeffBezanson commented Oct 23, 2020

PallHaraldsson commented Oct 23, 2020 • edited Loading

oscardssmith commented Oct 23, 2020

PallHaraldsson commented Oct 23, 2020

JeffBezanson commented Oct 24, 2020

KristofferC commented Oct 24, 2020

PallHaraldsson commented Oct 24, 2020 • edited Loading

StefanKarpinski commented Oct 24, 2020

PallHaraldsson commented Oct 24, 2020

JeffBezanson commented Oct 26, 2020

timholy commented Oct 26, 2020

PallHaraldsson commented Oct 23, 2020 •

edited

Loading

PallHaraldsson commented Oct 24, 2020 •

edited

Loading