Create API and stub to communicate checkpoint requests to CrashMonkey #38

Closed
ashmrtn opened this issue Sep 13, 2017 · 5 comments

Member

ashmrtn commented Sep 13, 2017

Part of the revised version of #12.

Checkpoints require support across many parts of CrashMonkey. This part gives user workloads a way to tell the CrashMonkey test harness that they want to create a checkpoint. CrashMonkey should provide both a stub binary for this (similar to the current stubs in the user_tools directory) and a small API for tests subclassed from BaseTestCase.h. Both can make use of the sockets class available in the utils/communication directory.

For checkpoints, we can assume 2 things:

  1. the user has just performed a sync/fsync request of some form
  2. this call will block until all parts of the checkpoint are completed

The stub program or API for this part should do 2 things:

  1. send a message via socket to the CrashMonkey test harness requesting a checkpoint be made
  2. wait for the CrashMonkey test harness to respond that the checkpoint has been completed

After the stub has received confirmation that the checkpoint operation completed, it should exit with no error (for the binary) or return to the caller (for the API).
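
The request/wait exchange above can be sketched roughly as follows. This is only an illustration in Python, not the actual implementation: the message codes, the 4-byte wire format, and the names `request_checkpoint` and `fake_harness` are all assumptions made for the example, and the real stub would use the sockets class in utils/communication instead of a raw socket pair.

```python
import socket
import struct
import threading

# Hypothetical message codes; the real wire format in utils/communication may differ.
MSG_CHECKPOINT = 1
MSG_CHECKPOINT_DONE = 2
MSG_CHECKPOINT_FAILED = 3


def recv_exact(conn, n):
    """Read exactly n bytes or raise if the peer closes the connection early."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before a full message arrived")
        buf += chunk
    return buf


def request_checkpoint(conn):
    """Ask the harness for a checkpoint and block until it answers.

    Returns 0 on success and 1 on failure, so the value can double as the
    exit code of a stub binary.
    """
    conn.sendall(struct.pack("!I", MSG_CHECKPOINT))
    (reply,) = struct.unpack("!I", recv_exact(conn, 4))
    return 0 if reply == MSG_CHECKPOINT_DONE else 1


def fake_harness(conn, log):
    """Stand-in for the CrashMonkey harness: record the request, then reply."""
    (msg,) = struct.unpack("!I", recv_exact(conn, 4))
    log.append(msg)
    reply = MSG_CHECKPOINT_DONE if msg == MSG_CHECKPOINT else MSG_CHECKPOINT_FAILED
    conn.sendall(struct.pack("!I", reply))


def demo():
    """Run one checkpoint round trip over a local socket pair."""
    client, server = socket.socketpair()
    log = []
    harness = threading.Thread(target=fake_harness, args=(server, log))
    harness.start()
    try:
        status = request_checkpoint(client)
    finally:
        harness.join()
        client.close()
        server.close()
    return status, log
```

In this sketch the same blocking function serves both uses: the stub binary would exit with its return value, and the test-case API would hand it back to the caller.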

Member

vijay03 commented Sep 13, 2017

It's unclear from the description whether checkpoints are made automatically whenever a sync() or fsync() happens, or whether they have to be requested manually. It would be great if it were automatic.

Member Author

ashmrtn commented Sep 13, 2017

If we want checkpointing to be done automatically, we have 2 options:

  1. require the user to tell us what sort of sync operation they want to perform, and on what
  2. 'tap into' what the program is doing via something like ptrace to monitor for sync/fsync calls

If neither of the above is implemented, CrashMonkey won't have any idea what the user program is doing. It will only know that the user program requested a checkpoint, and it will assume that the preconditions for a checkpoint (e.g. some sort of sync) have been met.
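
Option 1 could look roughly like the wrapper below. It is only a sketch in Python, and both `fsync_and_checkpoint` and the `request_checkpoint` callable it takes are hypothetical names: the workload goes through the wrapper for its sync, so the checkpoint request always follows a completed fsync.

```python
import os
import tempfile


def fsync_and_checkpoint(fd, request_checkpoint):
    """Flush fd to stable storage, then request a checkpoint.

    request_checkpoint is a hypothetical callable that blocks until the
    harness confirms the checkpoint and returns 0 on success.
    """
    os.fsync(fd)
    return request_checkpoint()


def demo_wrapper():
    """Exercise the wrapper against a temporary file and a fake harness call."""
    calls = []

    def fake_request():
        # Stand-in for the real socket round trip to the harness.
        calls.append("checkpoint")
        return 0

    with tempfile.TemporaryFile() as f:
        f.write(b"data")
        f.flush()  # move Python's buffered bytes to the kernel before fsync
        status = fsync_and_checkpoint(f.fileno(), fake_request)
    return status, calls
```

The trade-off is that workloads have to call the wrapper instead of fsync directly, which is what option 2 would avoid.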

Member

vijay03 commented Sep 13, 2017

If it doesn't degrade performance too much, I was thinking of something like strace/ptrace to identify fsync() or sync() calls. It doesn't matter which file they are calling fsync() on; we want a checkpoint at that point anyway.

But perhaps we should do this as a later optimization. Let's just do manual checkpoints for now.

Member Author

ashmrtn commented Sep 13, 2017

I agree that having ptrace watch the workload would be great, but I also feel we should get manual checkpoints working before we try to work in ptrace. Almost all of the infrastructure created for manual checkpoints will also be used by automatic checkpoints. I think this part (the stub and API) is pretty much the only piece that would no longer be used.

Member

vijay03 commented Sep 13, 2017

Agreed. Let's get manual checkpoints working.
