
[ktest] Implement real parallel ktest and measure ktest time #834

Closed
wants to merge 1 commit from the parallel_ktest branch

Conversation

junyang-zh
Collaborator

This is all you want.

Fix #696.

Also, ktest time is measured in milliseconds to serve as a performance hint.

@junyang-zh junyang-zh force-pushed the parallel_ktest branch 6 times, most recently from 42b0f7f to 2bbab26 Compare May 11, 2024 14:09
Contributor

@tatetian tatetian left a comment


I like the clever trick of introducing KtestDependencies so that ktest can work regardless of the runtime environment.

My big question is whether this PR is solving a problem that users really care about. I think users want to enable benchmarks via cargo osdk bench.

#[cfg(ktest)]
mod tests {
    use super::*;
    use ktest::Bencher;

    #[kbench]
    fn bench_add_two(b: &mut Bencher) {
        b.iter(|| add_two(2));
    }
}

I would prefer not measuring the elapsed time of individual tests:

    let start_millis = (deps.monotonic_millis)();
    let test_result = test.run(&deps.catch_unwind);
    let duration_millis = (deps.monotonic_millis)() - start_millis;

A test usually performs an operation repeatedly. Showing the total time of a test is not very helpful.

framework/libs/ktest/src/runner.rs (outdated review thread, resolved)
@junyang-zh
Collaborator Author

My big question is whether this PR is solving a problem that users really care about. I think users want to enable benchmarks via cargo osdk bench.

No, this PR doesn't go that far.

Measuring test time is also useful for preventing performance regressions, as some tests contain iterations.

And implementing the kbench feature requires more effort: a dedicated library for black-boxing values, deciding the number of iterations to run, and so on. I have not investigated this topic much.
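
For illustration, here is a rough sketch of what such a Bencher might need; the type, its `iters` field, and the adaptive-iteration idea are assumptions for this sketch, not part of this PR.

    use core::hint::black_box;

    /// Hypothetical bencher: the runner would choose `iters` adaptively,
    /// e.g. growing it until a run is long enough to measure reliably.
    pub struct Bencher {
        iters: u64,
    }

    impl Bencher {
        pub fn iter<T>(&mut self, mut body: impl FnMut() -> T) {
            for _ in 0..self.iters {
                // `black_box` keeps the compiler from optimizing the
                // benchmarked call away.
                black_box(body());
            }
        }
    }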

@tatetian
Contributor

tatetian commented May 14, 2024

Measuring test time is also useful for preventing performance regressions, as some tests contain iterations.

Tests are usually built in debug mode, so the resulting execution time cannot represent the true performance. Performance regressions should be detected with dedicated benchmarks.

The overarching principle in designing OSDK is to make cargo osdk <action> behave the same way as cargo <action>. This gives users the least surprise. cargo test does not show the execution time of individual tests, and neither should cargo osdk test.

In the same spirit, I think cargo osdk test should measure and show the total time spent running all tests. This is what cargo test does.

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
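
For illustration, a minimal sketch of reporting only the total run time, reusing the PR's `KtestDependencies` fields; `run_all_tests` and `Summary` are assumed helper names, not part of the PR.

    let start_millis = (deps.monotonic_millis_fn)();
    let summary: Summary = run_all_tests(&deps);
    let elapsed = (deps.monotonic_millis_fn)() - start_millis;
    // Print a cargo-test-like summary line with the total elapsed time only.
    (deps.print_fn)(format_args!(
        "test result: {}. {} passed; {} failed; finished in {}.{:02}s\n",
        if summary.failed == 0 { "ok" } else { "FAILED" },
        summary.passed,
        summary.failed,
        elapsed / 1000,
        (elapsed % 1000) / 10,
    ));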

@junyang-zh
Collaborator Author

Measuring test time is also useful for preventing performance regressions, as some tests contain iterations.

Tests are usually built in debug mode, so the resulting execution time cannot represent the true performance. Performance regressions should be detected with dedicated benchmarks.

Ok then. I may remove it in this PR.

@junyang-zh junyang-zh force-pushed the parallel_ktest branch 2 times, most recently from 9f5800b to 136f28b Compare May 14, 2024 07:40
@junyang-zh
Collaborator Author

In the same spirit, I think cargo osdk test should measure and show the total time spent running all tests. This is what cargo test does.

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

The total time is reported now, although the measurement is really primitive and is inaccurate if tasks can be preempted. They can't be preempted right now so it is OK currently.

@tatetian
Contributor

The measurement is really primitive and is inaccurate if tasks can be preempted.

Preemption is not really a concern. The reported time is intended to give the user a sense of how long it takes to run these tests. The running time is affected by many factors, one of which is preemption. This factor has been faithfully reflected by the reported time. So we are good.


// Wait for all spawned tests.
while FINISHED.load(Ordering::Relaxed) < crate_.nr_tot_tests() {
(deps.yield_fn)()
Contributor

@tatetian tatetian May 14, 2024


This is inefficient.

How about replacing yield_fn with join_fn?

    /// Spawn a new task, returning the task ID.
    pub spawn_fn: fn(fn() -> ()) -> u32,
    /// Join the task of a given ID.
    pub join_fn: fn(u32),
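
For illustration, a rough sketch of how the runner loop could then wait on the spawned tests; `test_entries` is an assumed helper yielding the `fn() -> ()` entry of each test.

    let mut task_ids = alloc::vec::Vec::new();
    for entry in test_entries() {
        // `entry` is the `fn() -> ()` that runs one test.
        task_ids.push((deps.spawn_fn)(entry));
    }
    // Block on each spawned test instead of polling a completion counter.
    for id in task_ids {
        (deps.join_fn)(id);
    }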

Collaborator Author


Not planned. We don't have a join() method in the frame yet. This is indeed inefficient for FIFO scheduling...

framework/libs/ktest/src/runner.rs (outdated review thread, resolved)
}
}

spawn_ktest(deps, test);
Contributor


Spawning too many tasks is not an ideal configuration. A more sensible configuration is to have as many tasks as the number of CPUs. cargo test even allows the user to specify the concurrency at run time (e.g., cargo test -- --test-threads=2). I think the ktest framework should take such an argument.
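
For illustration, a rough sketch of a fixed worker pool under the PR's dependency-injection style; `next_test`, `run_one_test`, and the worker-count parameter are assumptions, and `--test-threads` would map onto `nr_workers`.

    use core::sync::atomic::{AtomicUsize, Ordering};

    static FINISHED: AtomicUsize = AtomicUsize::new(0);

    fn run_with_workers(deps: &KtestDependencies, nr_workers: usize, nr_tests: usize) {
        // Spawn only as many worker tasks as requested (e.g., the number of CPUs).
        for _ in 0..nr_workers {
            (deps.spawn_fn)(worker_entry);
        }
        // Wait until every test has been run by some worker.
        while FINISHED.load(Ordering::Relaxed) < nr_tests {
            (deps.yield_fn)();
        }
    }

    fn worker_entry() {
        // `next_test` pops one test from a shared, lock-protected queue.
        while let Some(test) = next_test() {
            run_one_test(test);
            FINISHED.fetch_add(1, Ordering::Relaxed);
        }
    }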

Collaborator Author


Currently, spawning means creating and running the task immediately, and no preemption is allowed. So on a single CPU the execution order is main -> test1 -> main -> test2 -> main -> test3 ...

It is not a big problem currently, I guess. But supporting test workers would indeed be better.

@junyang-zh
Collaborator Author

The OSDK failure seems to be a known issue, not related to this PR.

@tatetian
Contributor

tatetian commented Jun 8, 2024

I think it is time to revisit the design decision of making ktest relatively independent from aster-frame.

Code like this

/// A set of functions needed to perform ktests.
#[derive(Clone)]
pub struct KtestDependencies {
    /// The corresponding utility of `std::panic::catch_unwind`.
    pub catch_unwind_fn: CatchUnwindFn,
    /// The print function to print the test prompts for the user.
    pub print_fn: fn(core::fmt::Arguments),
    /// The function returning monotonic milliseconds to measure the time.
    pub monotonic_millis_fn: fn() -> u64,
    /// The function to spawn a test.
    pub spawn_fn: fn(fn() -> ()) -> (),
    /// Yield the current task.
    /// The main task may busy-loop waiting for all spawned tasks to finish.
    /// Yielding helps performance if there are few cores in the system.
    pub yield_fn: fn(),
}

and this

/// Define a spinlocked static mutable variable.
/// Example:
/// ```ignore
/// spinlock! {
///     pub static GLOBAL_COUNT: usize = 0;
/// }
/// ```
/// To access the variable, use [`lock_and`].

have clearly shown how much effort we have to make to keep ktest independent from aster-frame.

I don't expect the ktest crate to have any other users than aster-frame itself and crates that are based on aster-frame. Giving up the independence of ktest can simplify its implementation significantly.

Contributor

@tatetian tatetian left a comment


This PR aims to introduce two improvements:

  1. Timekeeping;
  2. Parallel execution.

It has done a good job of achieving the first goal, but not the second one. This can be seen from code like this

            // FIXME: This spawns every ktest as a task, which may be inefficient.
            // We need to spawn runners that take tasks and run them.
            spawn_ktest(deps, test, spawned);
            spawned += 1;

and this

    // Wait for all spawned tests.
    while FINISHED.load(Ordering::Relaxed) < spawned {
        (deps.yield_fn)()

If we want to run tests in parallel, we must do it the right way. Since Asterinas does not have multiprocessor support for now, doing it the "wrong" way does not bring any true benefit. In fact, doing it the "wrong" way is even harmful: having a runnable thread that keeps yielding will make it impossible to benchmark using the ktest infrastructure, as our task scheduler is not smart enough to guarantee that the benchmark task gets most of the CPU time.

So I suggest (1) accomplishing the two goals in separate PRs, and (2) running parallel tests in the right way.

@junyang-zh junyang-zh mentioned this pull request Jun 26, 2024
@junyang-zh junyang-zh added the S-stale Questions that no one has followed up on for a long time label Jun 26, 2024
@junyang-zh
Collaborator Author

Um, we may rethink ktest; this PR can serve as a reference for implementing #975. Closing.

@junyang-zh junyang-zh closed this Jun 26, 2024
Labels
S-stale Questions that no one has followed up on for a long time
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Supporting running kernel mode unit test in kernel threads
2 participants