Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[catnap] Implementing Missing System Calls #68

Merged
merged 6 commits into from
Apr 7, 2022
Merged

Conversation

ppenna
Copy link
Contributor

@ppenna ppenna commented Apr 5, 2022

@ppenna ppenna added the bug Something Isn't Working label Apr 5, 2022
@ppenna ppenna requested a review from BrianZill April 5, 2022 18:33
@ppenna ppenna self-assigned this Apr 5, 2022
@ppenna ppenna requested a review from iyzhang April 5, 2022 18:36
@ppenna ppenna force-pushed the bugfix-pdpix-missing-calls branch from 7a670c4 to 6e397df Compare April 5, 2022 20:38
src/demikernel/libos.rs Outdated Show resolved Hide resolved
src/demikernel/libos.rs Outdated Show resolved Hide resolved
src/catnap/mod.rs Outdated Show resolved Hide resolved
}

// Put back operation in the scheduling queue.
handle.into_raw();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this doing? If the idea is to remove the key from the SchedulerHandle, "handle.take_key()" would be clearer. The comment doesn't really explain what is happening either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll change it to handle.take_key(), but the bad semantics that you have noticed should be fixed in the scheduler, when a better reference counting mechanism is implemented and the scheduling system is re-architected. For now, I have just decoupled it and made it more to work on abstract futures.

Overall, the use of the SchedulerHandle is overload and it is used to refer to both, short-living and long-living operations. In the latter case the reference count in the underlying WakerPageRef is bumped, and in the former case the reference count is not.

To summary up, this line is doing what the comment says: putting back the underlying operation in the scheduling queue (instead of dropping it). What is going on is the following:

  • When we call get_handle() the scheduler performs a lookup in its internal tables and gets a unique identifier for the given operation. When it does so, we create a SchedulerHandle from a given WakerPageRef, but we don't bump the associated reference count.
  • When the SchedulerHandle is dropped, it would call the custom Drop trait implementation. This would decrement the reference counting of the underlying WakerPageRef, if key is not a None value.
  • If the target ScheduleHandle is a short-living one, the drop will happen afterwards, when the associated future completes. Thus, to avoid a double-free scenario, we take out the underlying key in the ScheduleHandle, leaving a None on its place and preventing drop() to run.

This solution is ugly, the exposed interface is not clear and I'm not happy with it.

For now, I've created issue https://github.com/demikernel/catwalk/issues/2 to keep track of this.

Let me know if you have any suggestions on the comment to make the whole mess clearer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see. Yes, that is ugly. I would change the comment to say "Return this operation to the scheduling queue by removing the associated key (which would otherwise cause the operation to be freed)" or something like that.

I would also change the code to be just "handle.take_key();". There's no need to "into_raw()" or "unwrap()" because you're not doing anything with the result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@ppenna ppenna removed the request for review from iyzhang April 5, 2022 23:58
ppenna added a commit that referenced this pull request Apr 6, 2022
@ppenna ppenna requested a review from BrianZill April 6, 2022 12:54
ppenna added a commit that referenced this pull request Apr 6, 2022
Copy link
Contributor

@BrianZill BrianZill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making those changes.

@ppenna ppenna force-pushed the bugfix-pdpix-missing-calls branch from 32edc53 to 5902239 Compare April 7, 2022 11:32
@ppenna ppenna merged commit fbf1a57 into dev Apr 7, 2022
@ppenna ppenna deleted the bugfix-pdpix-missing-calls branch April 7, 2022 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something Isn't Working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants