Implement Checkpoints #12
To summarize the email chain, we were thinking that we could use checksums to help us save the user-visible state when Checkpoint is called. For simplicity, we could likely use a hashmap keyed on file paths whose values are the checksums. Each checkpoint could generate a hashmap by walking the directory structure of the file system and checksumming the files found.
We may also want to checksum at least some of the data available from calls to …
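For concreteness, here is a minimal sketch of the hashmap idea described above, assuming C++17's std::filesystem and a placeholder checksum. The function names and the hash choice are illustrative only, not an existing CrashMonkey interface.

```cpp
// Sketch: walk the mounted file system and record a checksum for every
// regular file, keyed by its path. This map is the "checkpoint".
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Placeholder checksum (FNV-1a); a real implementation would likely use
// a stronger hash such as CRC32 or MD5.
static uint64_t ChecksumFile(const fs::path& p) {
  std::ifstream in(p, std::ios::binary);
  uint64_t h = 1469598103934665603ULL;
  char c;
  while (in.get(c)) {
    h ^= static_cast<unsigned char>(c);
    h *= 1099511628211ULL;
  }
  return h;
}

// Walk the directory tree under `mount_point` and checksum each file found.
std::map<std::string, uint64_t> TakeCheckpoint(const fs::path& mount_point) {
  std::map<std::string, uint64_t> state;
  for (const auto& entry : fs::recursive_directory_iterator(mount_point)) {
    if (entry.is_regular_file()) {
      state[entry.path().string()] = ChecksumFile(entry.path());
    }
  }
  return state;
}
```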
To consolidate what we've said so far and what I've been thinking about this issue:

Overview:
The user-space CrashMonkey test harness needs to be able to receive Checkpoint requests from other processes. Since we are also expanding CrashMonkey to run in the background and have the user kick off their own workload (not one that implements CrashMonkey's …)
Collecting Data:

Implementation Thoughts:
I was planning on using local sockets when implementing #1, thus giving us flexibility down the line if we want to allow RPC calls into CrashMonkey functionality. I believe the implementation for this could also use local sockets, as they allow bidirectional communication across processes and can be treated much like files in C code. On a local machine they may not be as flexible as shared memory regions, but they avoid some of the synchronization/locking issues of shared memory.
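As a rough illustration of that socket idea, the harness side might accept Checkpoint requests over a Unix domain socket along these lines. The socket path and message format here are assumptions for illustration, not part of any existing CrashMonkey interface.

```cpp
// Sketch: the test harness listens on a local (Unix domain) socket and
// treats each "CHECKPOINT" message from the user's workload as a request
// to capture the current user-visible state.
#include <cstdio>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main() {
  const char* kSockPath = "/tmp/crashmonkey.sock";  // assumed path
  int listen_fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (listen_fd < 0) { perror("socket"); return 1; }

  sockaddr_un addr{};
  addr.sun_family = AF_UNIX;
  std::strncpy(addr.sun_path, kSockPath, sizeof(addr.sun_path) - 1);
  unlink(kSockPath);  // remove any stale socket file from a previous run
  if (bind(listen_fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
      listen(listen_fd, 1) < 0) {
    perror("bind/listen");
    return 1;
  }

  // Accept one client (the user's workload) and read checkpoint requests.
  int client_fd = accept(listen_fd, nullptr, nullptr);
  char buf[64];
  ssize_t n;
  while ((n = read(client_fd, buf, sizeof(buf) - 1)) > 0) {
    buf[n] = '\0';
    if (std::string(buf).find("CHECKPOINT") != std::string::npos) {
      // TakeCheckpoint(...) from the earlier sketch would run here.
      std::printf("checkpoint requested\n");
    }
  }
  close(client_fd);
  close(listen_fd);
  return 0;
}
```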
We also need to be able to associate checkpoints with points in our logged … We can assume that the user has just completed a …
Sockets sound reasonable. To associate Checkpoints with the stream of data sent to the device, we could have a file inside the device (let's say called Flag) that is written to every time there is a checkpoint. Using the writes to Flag, we can then locate each checkpoint in the data stream. Another approach would be to have an in-kernel counter that is incremented every time the user calls Checkpoint (via an ioctl, for example). Using the counter, we can associate checkpoints with bios.
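A sketch of what the user-facing side of the in-kernel counter approach might look like is below; the device node and ioctl command number are assumptions for illustration, not CrashMonkey's actual interface.

```cpp
// Sketch: Checkpoint() bumps an in-kernel counter through an ioctl on the
// block-device wrapper, so the driver can tag subsequent bios with the new
// counter value and the write log can later be split at this point.
#include <cstdio>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

// Made-up ioctl command; a real driver would define this in a shared header.
#define CM_CHECKPOINT _IO('c', 1)

int Checkpoint() {
  int fd = open("/dev/cow_ram0", O_RDONLY);  // assumed wrapper device node
  if (fd < 0) { perror("open"); return -1; }
  int res = ioctl(fd, CM_CHECKPOINT, 0);
  if (res < 0) perror("ioctl");
  close(fd);
  return res;
}
```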
@domingues @ashmrtn progress seems to have stalled on this? Are we blocked on something?
I have two questions at this point:
…
Let's ignore lost+found for now. By "user test", do you mean a test that the user runs on top of the mounted file system? If so, this is the default version of that user test. Once the file system mounts, we are basically testing that the data/metadata we expect is in there. If the file system does not mount at all, we just report an error and return.
I think implementing checkpoints is a very large task that is unlikely to be merged in with a single pull request. @domingues, could you merge in parts of it with pull requests as you code it up?
@vijay03 I think it might be advantageous to split the functionality of the original checkpoint idea into two things:
…
I'm going to split this issue up into several smaller ones, since both checkpoints and watches are somewhat complicated and require support across different parts of CrashMonkey.
Should we close this issue now, @ashmrtn?
I have a long email thread in my inbox with @ashmrtn about this. Will add the summary from that thread here later.
In short, we want to have some mechanism to know what data/metadata to expect in each crash state. The idea is to allow users to call Checkpoint, which captures the user-visible state (directory tree + data) of the file system somewhere. On a crash, we go back to the latest Checkpoint and see if we have all the data in there.
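As a rough sketch of that check, assuming the per-checkpoint checksum map from the earlier comment, the post-crash verification could look something like this (function names are illustrative, not CrashMonkey's actual API):

```cpp
// Sketch: after remounting a crash state, re-scan it and verify that every
// file recorded at the latest Checkpoint is still present with a matching
// checksum. Both maps are built the same way as TakeCheckpoint() above.
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>

bool VerifyAgainstCheckpoint(
    const std::map<std::string, uint64_t>& checkpoint,
    const std::map<std::string, uint64_t>& crash_state) {
  bool ok = true;
  for (const auto& [path, checksum] : checkpoint) {
    auto it = crash_state.find(path);
    if (it == crash_state.end()) {
      std::fprintf(stderr, "missing file: %s\n", path.c_str());
      ok = false;
    } else if (it->second != checksum) {
      std::fprintf(stderr, "data mismatch: %s\n", path.c_str());
      ok = false;
    }
  }
  return ok;
}
```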