Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single-task dumping #1284

Merged
merged 15 commits into from
Apr 17, 2023
Merged

Single-task dumping #1284

merged 15 commits into from
Apr 17, 2023

Conversation

mkeeter
Copy link
Collaborator

@mkeeter mkeeter commented Apr 14, 2023

This PR implements single-task dumping in Hubris, mediated by the supervisor task. It includes three new APIs:

  • Dump a single task
  • Dump a subregion of memory from a single task
  • Reinitialize dump areas starting at a particular index

(see oxidecomputer/humpty#5 for the associated UDP types)

Single-task dumping is between the supervisor and the kernel, and doesn't require the RoT!

Most of this PR is plumbing of various kinds:

  • UDP messages to dump-agent → Idol messages to Jefe
  • Idol messages to dump-agent -> Idol messages to Jefe
  • Idol messages to Jefe → actual work being done

Within jefe::dump, dump_task splits into four functions:

  • Setup (dump_task_setup)
  • Run (dump_task_run)
  • dump_task, which now uses the above
  • The new dump_task_region

To be very clear – even though it may go without saying – tasks talking to jefe through these APIs should not be able to crash the supervisor.

The most suspicion is directed towards dump_task_region, because it specifies an arbitrary address + length. To make sure the region is legal, jefe checks this region against the kernel's report of valid dump regions for the task (kipc::get_task_dump_region) and bails out early if it's invalid.

Copy link
Collaborator

@cbiffle cbiffle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some incorrect comparisons (noted below). Rest of this looks okay.

I know you're just moving existing undocumented unsafe blocks around, but if you understand humpty well enough to put a Safety: comment on them, I would be grateful.

Cargo.toml Outdated Show resolved Hide resolved
@@ -22,7 +22,6 @@ enum Trace {
DeserializeHeaderError(hubpack::Error),
SendError(SendError),
WrongVersion(u8),
Hi(u8, usize),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hiiiiii

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋🏻

//
// We need to claim a dump area. Once it's claimed, we have committed
// to dumping into it: any failure will result in a partial or otherwise
// corrupted dump.
//
let area = humpty::claim_dump_area(
base,
DumpContents::SingleTask,
if full {
DumpContents::SingleTask
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit a mild preference for exposing this enum in place of bool in the function signature, as I think it'd make it clearer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do you one better and make a new type, since DumpContents has other variants that aren't relevant here.

Done in c61c063

let mem = start..start + length;
let mut okay = false;
loop {
// This is Accidentally Quadratic; see the note in `dump_task`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either having the kernel return an error for an out-of-range region (as we discussed in chat), or exposing a "validate region" entry point, would be good ways of eliminating this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that's going on my list for later cleanups

if task_index == 0 {
// Can't dump the supervisor
return Err(DumpAgentError::NotSupported.into());
} else if task_index as usize > self.task_states.len() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like all of these comparisons should be >=

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, fixed in bc20ab5

dump::dump_task(self.dump_areas, task_index as usize)
.map_err(|e| e.into())
}
fn dump_task_region(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(would like blank lines between functions, ideally)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too, not sure where it went!

Fixed in d359897

@mkeeter mkeeter enabled auto-merge (squash) April 17, 2023 15:48
@mkeeter mkeeter merged commit 6538971 into master Apr 17, 2023
@mkeeter mkeeter deleted the single-task-dump branch April 17, 2023 15:56
mkeeter added a commit to oxidecomputer/humility that referenced this pull request Apr 17, 2023
This PR follows
- oxidecomputer/humpty#5
- oxidecomputer/hubris#1284

It implements a few things:
- Single-task dumping using the new APIs from the PRs above
- Letting the `NetCore` read RAM using the task region dump API

Things spiraled a _little_ out of control as I attempted to tame the Humility
dependency tree.  This mostly involved splitting modules into crates of their
own, to ease the blockage at `humility` and `humility_cmd`:

- Moved `NetCore` from `humility` to a separate crate `humility-net-core`
- Moved `hiffy`, `idol`, and `doppel` out of `humility_cmd` into standalone
  crates
- Added a new `humility-dump-agent` crate, which is used in both
  `humility_cmd_dump` and `NetCore`
- Fixed everything that these changes broke

The reorganization isn't done – it's still too easy to trigger a rebuild of every single
command – but it's a step in the right direction.
bcantrill pushed a commit to oxidecomputer/humility that referenced this pull request Jun 22, 2023
This PR follows
- oxidecomputer/humpty#5
- oxidecomputer/hubris#1284

It implements a few things:
- Single-task dumping using the new APIs from the PRs above
- Letting the `NetCore` read RAM using the task region dump API

Things spiraled a _little_ out of control as I attempted to tame the Humility
dependency tree.  This mostly involved splitting modules into crates of their
own, to ease the blockage at `humility` and `humility_cmd`:

- Moved `NetCore` from `humility` to a separate crate `humility-net-core`
- Moved `hiffy`, `idol`, and `doppel` out of `humility_cmd` into standalone
  crates
- Added a new `humility-dump-agent` crate, which is used in both
  `humility_cmd_dump` and `NetCore`
- Fixed everything that these changes broke

The reorganization isn't done – it's still too easy to trigger a rebuild of every single
command – but it's a step in the right direction.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants