Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add debug checks for misusage of Stash indices #6

Open
Robbepop opened this issue Feb 20, 2017 · 6 comments
Open

Add debug checks for misusage of Stash indices #6

Robbepop opened this issue Feb 20, 2017 · 6 comments

Comments

@Robbepop
Copy link
Contributor

Robbepop commented Feb 20, 2017

In insaneinside's crate symtern (https://github.com/insaneinside/symtern) the system checks for misusage of indices into the string interner during debug builds and runs. For release builds these checks are disabled for performance reasons. Yet, this feature enables a very rich debugging experience when using his string interner.

The same techniques could be applied to stash-rs !

The checks are quite simple:

  • check if an index passed to methods such as get, get_mut and take was generated by this instance of Stash.
  • check if an index passed to methods such as get, get_mut and take was generated before this instance of Stash removed it or was cleared: accesses an "old" entry that is no longer present or has been updated meanwhile.

Both checks could be implemented with tags and timestamps.
For example in debug builds, every instance of Stash receives and stores a unique identifier (which I will call tag: should be u64 for stability purposes). This tag must also be stored for every outgoing index generated by put. This enforces a generic implementation of indices: other people have already suggested this for Stash.
Also every full entry within an instance of a Stash requires a unique timestamp which is again just a number that is incremented for an instance of Stash whenever the user puts a new item into it. This also has to be stored in debug-mode indices generated by put.

With this structure it is then very easy to debug programs for invalid accesses to instances of Stash:

When calling a method such as get, get_mut or take with an index idx the following checks are made:

  • idx.tag == self.tag: This ensures that the index was generated by this Stash instance
  • idx.timestamp == self[idx].timestamp: This ensures that the entry that is associated with the timestamp of idx has not been updated (e.g. taken and put in again) during the lifetime of idx.

Possible timestamp storage design

To store the lifetimes for every bucket it should be possible to create a separate array which stores a lifetime for every bucket. Whenever a bucket is assigned a new Entry::Full or whenver it is cleared we simply increase the timestamp for that bucket in the Stash. This design makes it easy to adjust the code for debug builds.

Behaviour on clear

After clear has been called on an instance s of Stash all extern indices idx should be invalidated. This can be achived by setting s.tag to a new global unique identifier. No other changes are required besides that.

How are failures handled?

The assertions should be handled with debug_assert! macro. This way the assertions are only enabled for debug builds and the API doesn't need to change to Result<_> types.

Feature gate or not?

This feature could also be provided behind a feature gate instead of being active by default for debug builds.

@Robbepop Robbepop changed the title Add debug checks for misusage of Stash indices Add debug checks for misusage of Stash indices Feb 20, 2017
@Stebalien
Copy link
Owner

check if an index passed to methods such as get, get_mut and take was generated before this instance of Stash removed it or was cleared: accesses an "old" entry that is no longer present or has been updated meanwhile.

That's what UniqueStash is for (although you can't disable the check). Do you think I should allow disabling the check at runtime?

check if an index passed to methods such as get, get_mut and take was generated by this instance of Stash.

I could do this in UniqueStash (debug only). However, I generally expect users to implement this themselves if needed (e.g., OwnedTag { owner: u64, tag: Tag } (at runtime) or TagForServiceX(Tag) (at compile time)).


Unfortunately, I can't do either for Stash because:

  1. It usize tickets because lots of other libraries use them as tokens.

  2. I can't even try to encode this information in the usize because I guarantee that if you never have more than N items in a stash at any given point in time, the highest "tag" will be N-1. See Make stashes generic over index types #3.

@Stebalien
Copy link
Owner

Also, sorry about the late reply. GitHub doesn't do reliable notification email delivery 😿.

@Robbepop
Copy link
Contributor Author

Robbepop commented Feb 22, 2017

@Stebalien
I didn't know that UniqueStash was exactly for that usage purpose.
But doesn't UniqueStash behave quite differently compared to Stash and wouldn't Stash also very much profit from a debug-build based checking of those constraints?

You are completely right that this feature would require an overhaul of the usize approach for the tickets into a Stash but I do think that the profit would be huge for debugging crates using your Stash while it would introduce absolutely no runtime on release build which is what you want normally. Of course, opinions are different so maybe you could offer different versions of feature flags for different kinds of usages.

In my string-interner crate I also used a generic approach for the tickets and also offered a DefaultStringInterner that has its ticket type set to usize which is a nice design in my opinion.
With this feature implemented, the usize for example would be wrapped by another entity that handles all the checks for debug builds but would be a plain usize for release builds.

@Stebalien
Copy link
Owner

The problem isn't the overhaul, it's that using usize tickets is a feature. Specifically, I designed it to work with MIO (see http://rustdoc.s3-website-us-east-1.amazonaws.com/mio/master/mio/struct.Token.html).

But doesn't UniqueStash behave quite differently compared to Stash and wouldn't Stash also very much profit from a debug-build based checking of those constraints?

The only difference is that each "slot" is versioned (your second check).

@Robbepop
Copy link
Contributor Author

The good thing is that you can use Mio's trait Index to make your tickets generic over types that implement it. So you still support usize directly. For example in my current application where I use your Stash I always have to convert between my internal index type and usize which is quite ugly compared to directly using my type in your Stash internally due to it being generic. With genericity it would be easily possible to implement a wrapper with

struct TaggedIndex<T> where T: ::mio::Index {
    tag: Tag,
    timestamp: Timestamp,
    ticket: T
}

impl<T> ::mio::Index for TaggedIndex<T> where T: ::mio::Index { ... }

But maybe I haven't understood what you exactly meant by "I designed it to work with MIO".

@Stebalien
Copy link
Owner

The good thing is that you can use Mio's trait Index to make your tickets generic over types that implement it.

Actually, it appears that slab (the crate that used to define Index) now just uses Into + From`. I turned that down in #3 because I didn't feel that it was worth it but now that two people have asked for it, I'll reopen it.

However, I'd still need an extra "DebugIndex" trait for debug checks:

struct Metadata {
    slot_version: u64,
    stash_id: u64,
}

// The `From` and `Into` implementations should probably panic.
trait DebugIndex: From<usize> + Into<usize> {
    fn from(metadata: Metadata, index: usize) -> Self;
    fn into(self) -> (Metadata, usize);
}

The best way to make this work would be specialization. If one wanted debug assertions, one would implement:

#[cfg(debug_assertions)]
impl DebugIndex for MyIndexType ...

If you can figure out a maintainable way to do this that has only opt-in runtime costs and ensures, I'd be happy to accept a patch. However, I don't really have time right now to get something like this working myself.

But maybe I haven't understood what you exactly meant by "I designed it to work with MIO".

It has to be possible to pick an index that's freely convertible to/from a usize. Indices compatible with MIO would be unable to implement DebugIndex`.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants