-
Couldn't load subscription status.
- Fork 34
feat: Add stable B tree set in stable structures #204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add stable B tree set in stable structures #204
Conversation
|
|
|
I appreciate the idea of having explicit types in stable structures, but I would like to see a good justification for WHY we need to add an explicit I understand that maybe a tailored implementation of BTreeMap has functional tests for empty value (which is essentially a set). And performance benchmarks will show the same result (since it's a wrapper implementation) with the downside of eating the limited budget of canbench tests (total number, time of execution, maintainance, etc). Without renaming the API methods developer can simply get a zero-cost alias for the set via: pub type BTreeSet<K, M> = BTreeMap<K, (), M>;Additionally that's exactly the recommendation from the Roman's original blog post:
A zero‐cost type alias already provides a fully tested set semantics, so introducing an explicit BTreeSet only duplicates code and adds maintenance overhead with no clear benefit. |
|
StableStructure is owned by the execution team; they are not owned by any specific member of the team, and all members should appreciate each other’s work. The rationale for this change is to improve the developer experience, which is one of the primary focuses of the organisation, and below is an example of how that is achieved. This is not a novel PR; it has been ongoing for over a year, based on Islam’s initial proposal, as per community request (I do not currently have a link to that discussion thread, but here is one mention of it). The PR was left unfinished to prioritise the completion of last year’s improvements for StableStructure, and with remaining test and benchmarks to be added. The performances of BTreeSet are consistent with those of BTreeMap, meeting expectations as intended by Islam. Tests conducted on BTreeSet are designed to ensure the absence of future errors. As mentioned in the preceding comment, there are no performance benchmarks for BTreeMap in the scenario where the value is
As discussed in the previous meeting, the removal of artificial limitations is imperative. Presently, these limitations do not present any challenges, and they should not be employed as a justification for hindering the progress of PR. Enclosing BTreeSet within a BTreeMap wrapper represents an enhancement to the developer experience, as this aspect has been a primary focus of the organisation for the past year. Considering this, employing a solution such as
I do not believe this improves the developer experience when you can call The Roman blog post, last edited in February 2023, outlines a workaround for the absence of BTreeSet in our system. The ticket for incorporating BTreeSet was assigned to me after the blog’s last edit. Furthermore, the use of a dedicated type facilitates future enhancements if required. Could this Pull Request potentially conflict with your ongoing work? |
they do, in my current PR the number of total btreemap benchmarks increased significantly, from ~105 to ~450, with total name size almost hitting 20k chars limit, also increasing the total time of running tests to ~7 minutes. Which all together pushed me to carefully consider and limit my current subset which is still in works, see #285. |
You've made a change that will increase the character limit though so I don't think that's a strong reason as it will be not a problem soon (once a release is made that uses the new limit). Re the extra runtime: do we really need 450 benchmarks? Can we maybe get away with fewer? Also, how many does this PR add? I would expect something in the order of a dozen or so? If that's the case, then the additional overhead should be acceptable imo. More generally, it seems to me that the number of benchmarks somehow seems to be a limiting factor for making changes. That seems off to me and we certainly cannot be blocking work to add useful features and quality of life improvements that improve the developer experience because we can't afford the benchmarks for such changes. We might need to revisit our approach here for the longer term. |
one big table is 2 dimensions (keys/values) x 9 points (4,8,16,...1024) v1 small table has 3 points (for combination of types u64/blob8) btreemap_v1 blob 2x9 = 18 insert 67 6x67 = 402 then traversing items, kes, values (different sizes: small, medium, large) -- 3 x 6 = 18 total: 402 + 18 + 9 = 439 With average test name of 35 chars it's total 15.3k chars. I pushed it back to 401 tests now, maybe will decrease it even further. This also brings up a question if some tests (and correspondingly features) are more important than the other ones, which ones are essential and which ones are just 'nice to have'. That's why I'm pushing back for 'nice to have's. |
I would try to consider carefully how many of these points we need for each dimension. The same arguments can be made for other data structures and how many combinations they need to check. E.g. if we wanted to add a I strongly believe that number/runtime of benchmarks should not be the blocker for adding something that provides value to end users. We need to find a way to make adding such benchmarks less of an impact. E.g we could consider having only the benchmarks for the data structure changed in a PR run instead of all of them. That should make things more scalable when adding more data structures.
This should not be a problem soon after the change mentioned above so I would not consider it a blocker. |
I agree with this. That's why I added optimistically most of the reasonable ones to see how much the cost actually is. 7 min running time and unreadable hundreds line report -- is way too much if you ask me. I am also very much into simplifying and reducing the size BUT(!) without sacrificing the essentials. Hope that makes sense. P.S. btw we may want to start having separate |
Yeah this is where I'm arriving as well. Needs some more thinking but looks like the way forward to avoid getting into the same discussions all the time. |
Co-authored-by: Dimitris Sarlis <dimitrios.sarlis@dfinity.org>
This PR adds BTreeSet to StableStructures.