Create simulator interface which provides thread scheduling + speculative fetching #5843

Closed
derekbruening opened this issue Jan 31, 2023 · 1 comment · Fixed by #6029

Comments

@derekbruening
Contributor

Rather than having each simulator figure out how to schedule traced software threads onto simulated cores in their own ad hoc way, we would like to provide a scheduler service, which should result in several benefits:

  • Ease of use: a new simulator use case has one less thing to implement
  • Consistency: all simulators can now use the same approach
  • Fill in gaps in trace-based simulation:
    • We can re-schedule threads, even when simulating the same hardware the trace was recorded on, to undo the extra context switches caused by tracing overhead
    • We can more easily combine multiple single-workload traces
    • We can provide speculative path fetching using various schemes (from heuristics to additional data recorded during tracing) to help bridge the gap with execution-driven simulation

Xref #5694: provide per-core iterator.
That may be subsumed by this new broader-scope feature.

derekbruening added a commit that referenced this issue Feb 16, 2023
Adds a new scheduler component to drmemtrace which provides
flexibility in combining input traces and is meant to supply key
features for simulation of traces.

This first stage adds a base scheduler which only supports the two
analyzer modes: parallel software thread streams or a single serial
stream.

The input file opening code and the input-to-worker code are moved from
the analyzer to the scheduler.  The analyzer now has to look at the
tid fields in the stream records to identify shards to tools, but the
input-to-worker mapping does belong in the scheduler.

Removes the analyzer external iterator interface; tools should instead
use the scheduler directly.  Updates histogram_launcher and two tests
to do this.

Adds a new scheduler unit test with a mocked reader that takes vectors
of records, containing some initial sanity tests.

The scheduler takes in either file paths and opens its own readers for
those, or it can be passed readers.  This latter interface is used for
online IPC readers, as well as for the unit test using a mocked
reader.  The IPC reader requires a delayed init() call which is
handled by paying for a flag check on each stream advance.

To support -skip_instrs, region-of-interest code is implemented here.
However, it requires fixing a problem in reader_t::skip_instructions()
by adding a queue and a new use-prior-record method.  (The queue can
be merged with the file_reader_t queue later.)  It might be nicer to
separate that out but that would leave -skip_instrs not working.

Future work includes moving the serial mode interleaving from the file
reader to the scheduler, and then adding new scheduling and simulation
features.

Issue: #5843
@derekbruening
Contributor Author

There are many design points here. I am documenting some of the smaller ones in this comment and will probably put the rest in a separate doc:

There are lots of little issues with the scheduler. Here is one: the output
streams have to keep their own record counts (because they combine multiple
inputs). Yet the inputs sometimes "hide" records, like the synthetic headers
after a skip:

<--record#-> <--instr#->: <---tid---> <record details>
------------------------------------------------------------
           0          63:      296231 <marker: timestamp 13319413770947393>
           0          63:      296231 <marker: tid 296231 on core 10>
          90          64:      296231 ifetch       4 byte(s) @ 0x0000000000401028 48 83 eb 01          sub    $0x0000000000000001 %rbx -> %rbx
          91          65:      296231 ifetch       4 byte(s) @ 0x000000000040102c 48 83 fb 00          cmp    %rbx $0x0000000000000000

The scheduler, though, performs the skip and then asks the zipfile reader for
the new record# so it can update its own count. It is told 0, so it counts up
from there and never sees the 90 that appears two entries later (it only
queries the input on a skip):

<--record#-> <--instr#->: <---tid---> <record details>
------------------------------------------------------------
           0          63:      296231 <marker: timestamp 13319413770947393>
           1          63:      296231 <marker: tid 296231 on core 10>
           2          64:      296231 ifetch       4 byte(s) @ 0x0000000000401028 48 83 eb 01          sub    $0x0000000000000001 %rbx -> %rbx
           3          65:      296231 ifetch       4 byte(s) @ 0x000000000040102c 48 83 fb 00          cmp    %rbx $0x0000000000000000

What is the best solution?

Does it have to query both ordinals before and after every input advance?

Have separate "effective" and "presented" ordinals?

Use the same get_last_record_ordinal() proposed for scheduler-inserted
"doesn't count" cpuid markers with ordinals of 0?

Maybe these inserted records should all be reported as having the same prior
ordinal, and we add a separate "inserted" flag which the view tool checks,
displaying 0 in that case. Or abandon the 0 and leave it blank or show "--"
or something similar? But will that be confusing if the view tool shows one
thing and the direct query shows another? I guess it's the 0 that's
confusing: anything else seems compatible with the direct query showing the
prior record ordinal.

I seem to recall a prior discussion where we came up with the 0 and liked
it, though, for synthetic records, which include the post-skip headers
above plus the cpuid markers that the scheduler inserts for synthetic
schedules: we decided those would not interrupt the original record count.

Decision:

  • Add to memtrace_stream_t: is_current_record_synthetic()
  • Remove reader_t games where record ordinals are 0 for a few records
  • Have get_record_ordinal() return the previous record's ordinal for
    synthetic records not present in the original stream
  • Have view_t use the new API to display "--" or some other non-numeric
    indicator for synthetic records
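
As a rough illustration of the decision above, here is a minimal sketch of a stream whose synthetic records report the prior record's ordinal, with a display helper that shows "--" for them. Every name here (toy_record_t, toy_stream_t, ordinal_for_display()) is a hypothetical stand-in and not the drmemtrace API; the accessor names merely mirror the ones proposed in the list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: 'synthetic' marks a record that was not present in the
// original trace (e.g., headers re-inserted after a skip).
struct toy_record_t {
    std::string details;
    bool synthetic;
};

// Hypothetical stream following the decision: a synthetic record does not
// advance the ordinal, so it reports the prior real record's ordinal.
class toy_stream_t {
public:
    // 'start_ordinal' is the ordinal of the last real record before the first
    // record this stream returns (e.g., the record just before a skip target).
    toy_stream_t(std::vector<toy_record_t> records, uint64_t start_ordinal)
        : records_(std::move(records))
        , ordinal_(start_ordinal)
    {
    }
    // Advances to the next record; returns false at the end.
    bool next()
    {
        if (next_index_ >= records_.size())
            return false;
        cur_ = next_index_++;
        if (!records_[cur_].synthetic)
            ++ordinal_;
        return true;
    }
    uint64_t get_record_ordinal() const { return ordinal_; }
    bool is_record_synthetic() const { return records_[cur_].synthetic; }
    // What a view-like tool could print in its record# column.
    std::string ordinal_for_display() const
    {
        return is_record_synthetic() ? "--" : std::to_string(ordinal_);
    }

private:
    std::vector<toy_record_t> records_;
    uint64_t ordinal_;
    size_t next_index_ = 0;
    size_t cur_ = 0;
};
```

With two synthetic headers followed by real records and a start ordinal of 89, the headers both report ordinal 89 and display "--", and the first real instruction reports 90, so the output stream never sees a reset to 0.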

derekbruening added a commit that referenced this issue Feb 24, 2023
Adds a new scheduler component to drmemtrace which provides flexibility
in combining input traces and is meant to supply key features for
simulation of traces.

This first stage adds a base scheduler which only supports the two
analyzer modes: parallel software thread streams or a single serial
stream.

The input file opening code and the input-to-worker code are moved from
the analyzer to the scheduler. The analyzer now has to look at the tid
fields in the stream records to identify shards to tools, but the
input-to-worker mapping does belong in the scheduler.

Removes the analyzer external iterator interface; tools should instead
use the scheduler directly. Updates histogram_launcher and two tests to
do this.

Adds a new scheduler unit test with a mocked reader that takes vectors
of records, containing some initial sanity tests.

The scheduler takes in either file paths and opens its own readers for
those, or it can be passed readers. This latter interface is used for
online IPC readers, as well as for the unit test using a mocked reader.
The IPC reader requires a delayed init() call which is handled by paying
for a flag check on each stream advance.

To support -skip_instrs, region-of-interest code is implemented here.
However, it requires fixing a problem in reader_t::skip_instructions()
by adding a queue and a new use-prior-record method. (The queue can be
merged with the file_reader_t queue later.) It might be nicer to
separate that out but that would leave -skip_instrs not working.

To support skipping with multiple inputs, changes how synthetic records are
treated:
    
    Eliminates synthetic records being considered to have a 0 record
    ordinal: instead they have the ordinal of the prior record.  A new
    memtrace_stream_t function is_record_synthetic() is introduced for
    identifying synthetic records.  This change is required to allow the
    scheduler_t layer to properly figure out output stream ordinals.
    
    Updates the reader, zipfile reader, and tests.  Adds a new test to
    test both synthetic and real headers after a skip.

Future work includes moving the serial mode interleaving from the file
reader to the scheduler, and then adding new scheduling and simulation
features.

Issue: #5843
derekbruening added a commit that referenced this issue Mar 7, 2023
Implements timestamp ordering in scheduler_t rather than relying on
the old implementation inside file_reader_t.

Adds a sanity test.

Removing the file_reader_t code, along with eliminating the
thread-as-sub-reader API routines, will be done as a separate
refactoring.

Issue: #5843
derekbruening added a commit that referenced this issue Mar 8, 2023
Implements timestamp ordering in scheduler_t rather than relying on the
old implementation inside file_reader_t.

Adds a sanity test.

Fixes a bug with only_threads and adds a simple test.

Removing the file_reader_t code, along with eliminating the
thread-as-sub-reader API routines, will be done as a separate
refactoring.

Issue: #5843
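
To make the timestamp ordering in the commits above concrete, here is a minimal sketch of merging several per-thread inputs into one serial stream by always taking the input with the smallest next timestamp. It is a simplification under stated assumptions: every toy entry carries its own timestamp (real traces only have periodic timestamp markers), and the names toy_entry_t and interleave_by_timestamp() are hypothetical rather than scheduler_t code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical record: a timestamp plus the thread it came from.
struct toy_entry_t {
    uint64_t timestamp;
    int tid;
};

// Merges inputs (each already sorted by timestamp) into one stream ordered by
// timestamp, using a min-heap keyed on each input's next timestamp.
// Ties are broken by the lower input index.
std::vector<toy_entry_t>
interleave_by_timestamp(const std::vector<std::vector<toy_entry_t>> &inputs)
{
    // Heap element: (next timestamp, input index).
    using item_t = std::pair<uint64_t, size_t>;
    std::priority_queue<item_t, std::vector<item_t>, std::greater<item_t>> heap;
    std::vector<size_t> pos(inputs.size(), 0);
    for (size_t i = 0; i < inputs.size(); ++i) {
        if (!inputs[i].empty())
            heap.push({ inputs[i][0].timestamp, i });
    }
    std::vector<toy_entry_t> out;
    while (!heap.empty()) {
        size_t i = heap.top().second;
        heap.pop();
        out.push_back(inputs[i][pos[i]++]);
        // Re-insert the input if it has more entries left.
        if (pos[i] < inputs[i].size())
            heap.push({ inputs[i][pos[i]].timestamp, i });
    }
    return out;
}
```

Two inputs with timestamps {10, 30} and {20, 40} come out alternating between the two threads.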
derekbruening added a commit that referenced this issue Mar 9, 2023
Removes multi-input support from file_reader_t and other readers now
that the scheduler_t owns that.  Specifically:

+ Removes read_next_thread_entry() and requires that read_next_entry()
  always check the queue (via a provided helper function).

+ Removes skip_thread_instructions() and refactors the pre-skip header
  reading and the post-skip walking while remembering timestamps.
  Places these latter two inside reader_t for use by all readers, with
  zipfile overriding just the fast skip in the middle and sharing all
  the other code.  This refactoring and sharing solves the problem of
  missing timestamps when skipping from the middle.

+ Removes the arrays of data for multiple inputs from file_reader_t
  and all subclasses.

Updates the view_test to use a scheduler for its multiple-input mock
reader.

While at it, removes is_complete().

Issue: #5843, #5538
derekbruening added a commit that referenced this issue Mar 16, 2023
Adds get_input_stream_count() and get_input_stream_name() to the
scheduler_t drmemtrace interface.

Adds a test of these to the scheduler unit tests which uses real files
and also serves as a test of only_threads for real files, whose code
paths are different enough it had a bug which we fix here as well.

Issue: #5843
derekbruening added a commit that referenced this issue Mar 22, 2023
Adds to the scheduler interface a query to obtain the current input
stream's memtrace_stream_t handle.

Adds a new scheduler flag SCHEDULER_USE_INPUT_ORDINALS and sets it by
default for parallel mode so the output stream's ordinals are
suppressed and instead the current input stream's ordinals are
presented on the output stream.  This fixes a problem where the
parallel analysis tool framework saw accumulated ordinals across
inputs.

Adds a similar flag SCHEDULER_USE_SINGLE_INPUT_ORDINALS which causes
the first flag to be set if there is a single input and single output.
This solves a serial mode problem where an analysis tool does want to
see input gaps when there is no interleaving as there is only one
input.

Adds a test.

Also manually tested a real analysis tool to confirm by tweaking the
view tool to operate in parallel:

  Before:
    ===========================================================================
    [analyzer] Worker 0 starting on trace shard 0 stream is 0x562a2b0ff480
               1           0:     3443916 <marker: version 4>
               2           0:     3443916 <marker: filetype 0x240>
             ...
            1479         585:     3443916 <thread 3443916 exited>
    [analyzer] Worker 0 starting on trace shard 1 stream is 0x562a2b0ff480
    ------------------------------------------------------------
            1480         585:     3443921 <marker: version 4>
            1481         585:     3443921 <marker: filetype 0x240>
    ===========================================================================

  After:
    ===========================================================================
    [analyzer] Worker 0 starting on trace shard 0 stream is 0x555cebc44480
               1           0:     3443916 <marker: version 4>
               2           0:     3443916 <marker: filetype 0x240>
             ...
            1479         585:     3443916 <thread 3443916 exited>
    [analyzer] Worker 0 starting on trace shard 1 stream is 0x555cebc44480
    ------------------------------------------------------------
               1           0:     3443921 <marker: version 4>
               2           0:     3443921 <marker: filetype 0x240>
    ===========================================================================

Issue: #5843
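
The before/after listings above can be reproduced with a small sketch of an output stream that either accumulates its own ordinal across inputs or presents the current input's ordinal. The class and function names are hypothetical, and for simplicity each input is assumed to be consumed fully before the next one starts (so switching inputs just restarts the input count).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical output stream that reads several inputs back to back, as a
// parallel-mode worker does when it moves from one shard to the next.
class toy_output_t {
public:
    explicit toy_output_t(bool use_input_ordinals)
        : use_input_ordinals_(use_input_ordinals)
    {
    }
    // Called when the output switches to a fresh input.
    void start_new_input() { input_ordinal_ = 0; }
    // Called for every record passed from the current input to the output.
    void advance()
    {
        ++input_ordinal_;
        ++output_ordinal_;
    }
    // With the flag set, the input's ordinal is presented instead of the
    // output's accumulated ordinal.
    uint64_t get_record_ordinal() const
    {
        return use_input_ordinals_ ? input_ordinal_ : output_ordinal_;
    }

private:
    bool use_input_ordinals_;
    uint64_t input_ordinal_ = 0;
    uint64_t output_ordinal_ = 0;
};

// Feeds 'first_count' records from one shard, then one record from a second
// shard, and returns the ordinal reported for that record.
uint64_t
first_ordinal_in_second_shard(bool use_input_ordinals, uint64_t first_count)
{
    toy_output_t output(use_input_ordinals);
    output.start_new_input();
    for (uint64_t i = 0; i < first_count; ++i)
        output.advance();
    output.start_new_input();
    output.advance();
    return output.get_record_ordinal();
}
```

With 1479 records in the first shard, the second shard starts at 1480 without the flag and at 1 with it, matching the listings.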
derekbruening added a commit that referenced this issue Mar 23, 2023
Fixes some fencepost errors in scheduler input region of interest handling.

Adds a test of regions of interest which actually contains timestamps,
which is what revealed the errors.

Refactors the scheduler unit tests to use trace_entry_t instead of
memref_t, which is required to properly test the scheduler's input
readers, as that is the record type they operate on.  This results in
no longer needing reader_t::use_prev() which is removed here.

Issue: #5843
derekbruening added a commit that referenced this issue Mar 24, 2023
Adds initial support for MAP_TO_ANY_OUTPUT with multiple outputs.
Uses a simple queue of ready-to-schedule inputs and implements
an instruction-based scheduling quantum.

Adds a test.

Issue: #5843
derekbruening added a commit that referenced this issue Mar 29, 2023
Adds initial support for MAP_TO_ANY_OUTPUT with multiple outputs. Uses a
simple queue of ready-to-schedule inputs and implements an
instruction-based scheduling quantum.

Adds a test.

Adds new types input_ordinal_t and output_ordinal_t and corresponding invalid constants and updates all existing code to use these.

Issue: #5843
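
For readers unfamiliar with the approach, a simple ready queue with an instruction-based quantum can be sketched as below. This is only a toy model under assumptions of mine: outputs advance in lockstep one instruction at a time, inputs are named 'A', 'B', ... by index, and run_schedule() is a hypothetical name unrelated to the scheduler_t implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Schedules inputs (given by their instruction counts) onto 'num_outputs'
// outputs. Each input runs for at most 'quantum' instructions before it is
// preempted and put at the back of the ready queue.
// Returns one string per output with one letter per executed instruction.
std::vector<std::string>
run_schedule(const std::vector<int> &instr_counts, int num_outputs, int quantum)
{
    std::vector<int> left = instr_counts;
    std::deque<int> ready; // Indices of ready-to-schedule inputs.
    for (size_t i = 0; i < left.size(); ++i) {
        if (left[i] > 0)
            ready.push_back(static_cast<int>(i));
    }
    std::vector<std::string> result(num_outputs);
    std::vector<int> current(num_outputs, -1); // Input running on each output.
    std::vector<int> used(num_outputs, 0);     // Instructions used in quantum.
    bool active = true;
    while (active) {
        active = false;
        for (int out = 0; out < num_outputs; ++out) {
            int cur = current[out];
            // Release a finished input, or preempt one whose quantum expired.
            if (cur != -1 && (left[cur] == 0 || used[out] == quantum)) {
                if (left[cur] > 0)
                    ready.push_back(cur);
                current[out] = -1;
            }
            // Pick the next ready input, if any.
            if (current[out] == -1 && !ready.empty()) {
                current[out] = ready.front();
                ready.pop_front();
                used[out] = 0;
            }
            if (current[out] == -1)
                continue; // Idle output.
            // Execute one instruction of the current input.
            result[out] += static_cast<char>('A' + current[out]);
            --left[current[out]];
            ++used[out];
            active = true;
        }
    }
    return result;
}
```

For three inputs with 4, 4, and 2 instructions on two outputs and a quantum of 2, this yields "AACCBB" on the first output and "BBAA" on the second.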
derekbruening added a commit that referenced this issue May 1, 2023
Implements initial speculation support, supplying nops.  Speculation
is separated into its own class where we can fill in different
strategies in the future.

The start_speculation() function takes a flag controlling whether the
scheduler queues up the current record and re-returns it as the first
record after speculation stops.  This is often what a simulator wants
as it has to read the instruction record following a branch to
determine whether it is on the wrong path, and it would like to resume
with that already-read instruction after speculation.

Adds a unit test.

Issue: #5843
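
The queue-and-re-return behavior described in the commit above can be sketched with a toy stream. All names here are hypothetical stand-ins: the start_pc parameter, the 1-byte spacing of the supplied nops, and the toy_instr_t type are assumptions made for this illustration and are not taken from the scheduler's interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical instruction record.
struct toy_instr_t {
    uint64_t pc;
    bool is_nop;
};

// Hypothetical stream that can be switched into a speculation mode in which
// it supplies nops instead of trace records.
class toy_spec_stream_t {
public:
    explicit toy_spec_stream_t(std::vector<toy_instr_t> trace)
        : trace_(std::move(trace))
    {
    }
    // Returns false only at the end of the trace (never while speculating).
    bool next(toy_instr_t &out)
    {
        if (speculating_) {
            out = toy_instr_t{ spec_pc_++, true };
            return true;
        }
        if (have_queued_) {
            // Re-return the record that was current when speculation began.
            have_queued_ = false;
            out = queued_;
            return true;
        }
        if (pos_ >= trace_.size())
            return false;
        out = trace_[pos_++];
        last_ = out;
        have_last_ = true;
        return true;
    }
    // If 'queue_current_record' is set, the most recently returned trace
    // record is saved and returned again once speculation stops.
    void start_speculation(uint64_t start_pc, bool queue_current_record)
    {
        speculating_ = true;
        spec_pc_ = start_pc;
        if (queue_current_record && have_last_) {
            queued_ = last_;
            have_queued_ = true;
        }
    }
    void stop_speculation() { speculating_ = false; }

private:
    std::vector<toy_instr_t> trace_;
    size_t pos_ = 0;
    toy_instr_t last_ = { 0, false };
    toy_instr_t queued_ = { 0, false };
    bool have_last_ = false;
    bool have_queued_ = false;
    bool speculating_ = false;
    uint64_t spec_pc_ = 0;
};
```

A simulator that reads a branch at pc 100 and then its actual target at pc 200 can decide it mispredicted, speculate from the predicted address, and after stopping receive the pc 200 record again as the first real record.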
derekbruening added a commit that referenced this issue May 4, 2023
Adds a lock for each input to enforce missing synchronization during
scheduling decisions.

Fixes a bug with the existing scheduler lock.

Adds a multi-threaded test.

Tested a similar multi-threaded test under ThreadSanitizer which now
reports no races (it did before these code changes).

Fixes #5843
derekbruening added a commit that referenced this issue Oct 9, 2023
Improves two instances of push_back by replacing with emplace_back.

Issue: #5843
derekbruening added a commit that referenced this issue Oct 19, 2023
Add epoll_pwait2, sendmmsg, recvmmsg, and membarrier to the
maybe-blocking syscall list. These don't always block: e.g., membarrier
has some sub-operations for which it never blocks.

Updates the DR syscall headers to include recently added syscalls,
including epoll_pwait2. The uapi headers are only partly updated due to
lack of easy access to a header to fill in the other SYS_ defines.

Issue: #5843
derekbruening added a commit that referenced this issue Nov 1, 2023
Adds a new marker type TRACE_MARKER_TYPE_DIRECT_THREAD_SWITCH for use
with custom kernel scheduling features where one thread directly
switches to another on the same cpu.

Refactors raw2trace marker processing code to allow a subclass to
insert the new marker.

Makes the raw2trace blocking syscall code virtual to allow a subclass
to label custom syscalls as blocking.

Issue: #5843
derekbruening added a commit that referenced this issue Nov 2, 2023
Adds a new marker type TRACE_MARKER_TYPE_DIRECT_THREAD_SWITCH for use
with custom kernel scheduling features where one thread directly
switches to another on the same cpu.

Refactors raw2trace marker processing code to allow a subclass to insert
the new marker.

Makes the raw2trace blocking syscall code virtual to allow a subclass to
label custom syscalls as blocking.

Given that the changes are used in separate code it is not simple to
make a test of the raw2trace refactoring + virtual. For the marker:
tests that use the marker will be forthcoming in scheduler_unit_tests.

Issue: #5843
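
The subclass hook described in the commit above follows the usual virtual-override pattern, sketched below. The class names, the method name, and the syscall numbers are all hypothetical placeholders chosen for this example; they are not the raw2trace declarations.

```cpp
#include <cassert>

// Hypothetical syscall numbers used only in this sketch.
constexpr int TOY_SYS_membarrier = 1;
constexpr int TOY_SYS_getpid = 2;
constexpr int TOY_SYS_custom_wait = 1000;

// Hypothetical base class: decides which syscalls get a maybe-blocking label.
class toy_raw2trace_t {
public:
    virtual ~toy_raw2trace_t() = default;
    virtual bool is_maybe_blocking_syscall(int sysnum) const
    {
        return sysnum == TOY_SYS_membarrier;
    }
};

// Hypothetical subclass that additionally labels a custom syscall as
// blocking while keeping the base class's list.
class toy_custom_raw2trace_t : public toy_raw2trace_t {
public:
    bool is_maybe_blocking_syscall(int sysnum) const override
    {
        return sysnum == TOY_SYS_custom_wait ||
            toy_raw2trace_t::is_maybe_blocking_syscall(sysnum);
    }
};
```

Code that only holds a base-class reference then picks up the custom labeling through virtual dispatch.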
derekbruening added a commit that referenced this issue Nov 7, 2023
Adds a flexible priority queue class which tracks indices and so
supports asking whether an entry is in the queue and removing an entry
from anywhere in the queue.  Adds a simple unit test.

Changes the scheduler to use this new queue class, in anticipation of
needing both new features to handle direct targeted thread switches.

Issue: #5843
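
One common way to get the two features named in the commit above (membership queries and removal from anywhere) is a binary heap paired with a map from entry to heap slot. The sketch below shows that technique; it is not the class added in the commit. It assumes entries are unique and hashable, and indexed_heap_t is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Min-first binary heap that also tracks each entry's slot, so that
// contains() is O(1) on average and erase() of any entry is O(log n).
template <typename T, typename Less = std::less<T>> class indexed_heap_t {
public:
    bool empty() const { return heap_.empty(); }
    bool contains(const T &v) const { return index_.count(v) != 0; }
    const T &top() const { return heap_.front(); }
    void push(const T &v)
    {
        if (contains(v))
            return; // Entries must be unique.
        heap_.push_back(v);
        index_[v] = heap_.size() - 1;
        sift_up(heap_.size() - 1);
    }
    void pop()
    {
        T v = heap_.front(); // Copy: erase() modifies the heap.
        erase(v);
    }
    // Removes v from anywhere in the queue; returns false if absent.
    bool erase(const T &v)
    {
        auto it = index_.find(v);
        if (it == index_.end())
            return false;
        size_t slot = it->second;
        index_.erase(it);
        T last = heap_.back();
        heap_.pop_back();
        if (slot < heap_.size()) {
            // Move the former last entry into the hole and restore order.
            heap_[slot] = last;
            index_[last] = slot;
            sift_up(slot);
            sift_down(index_[last]);
        }
        return true;
    }

private:
    void swap_slots(size_t a, size_t b)
    {
        std::swap(heap_[a], heap_[b]);
        index_[heap_[a]] = a;
        index_[heap_[b]] = b;
    }
    void sift_up(size_t i)
    {
        while (i > 0) {
            size_t parent = (i - 1) / 2;
            if (!less_(heap_[i], heap_[parent]))
                break;
            swap_slots(i, parent);
            i = parent;
        }
    }
    void sift_down(size_t i)
    {
        for (;;) {
            size_t best = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < heap_.size() && less_(heap_[l], heap_[best]))
                best = l;
            if (r < heap_.size() && less_(heap_[r], heap_[best]))
                best = r;
            if (best == i)
                break;
            swap_slots(i, best);
            i = best;
        }
    }
    std::vector<T> heap_;
    std::unordered_map<T, size_t> index_;
    Less less_;
};
```

A direct thread switch needs exactly these operations: check whether the target is waiting in the queue and, if so, pull it out regardless of its position.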
derekbruening added a commit that referenced this issue Nov 8, 2023
Adds support for the TRACE_MARKER_TYPE_DIRECT_THREAD_SWITCH marker,
when it appears after TRACE_MARKER_TYPE_MAYBE_BLOCKING_SYSCALL.  The
scheduler directly switches to the target thread if it is on the ready
queue.  Performing a forced migration if the target is running on
another output is not yet implemented.  Once i/o wait states are
added, waking up a target thread will be added, but that is future
work as well.

Adds a simple unit test.

Issue: #5843
derekbruening added a commit that referenced this issue Nov 12, 2023
Adds support for the TRACE_MARKER_TYPE_DIRECT_THREAD_SWITCH marker, when
it appears after TRACE_MARKER_TYPE_MAYBE_BLOCKING_SYSCALL. The scheduler
directly switches to the target thread if it is on the ready queue.
Performing a forced migration if the target is running on another output
is not yet implemented. Once i/o wait states are added, waking up a
target thread will be added, but that is future work as well.

Adds DEPENDENCY_DIRECT_SWITCH_BITFIELD and renames DEPENDENCY_TIMESTAMPS
to DEPENDENCY_TIMESTAMP_BITFIELD so that the two can be combined. A new
enum entry named DEPENDENCY_TIMESTAMPS combines the two bitfields, which is
what nearly every use case should want. This keeps fine-grained control
available without really breaking compatibility, and because the enum
provides both the bits and their combination, the enum type is still all
that is needed.

Adds a unit test where the schedule would clearly be different without
the switch target.

Issue: #5843
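
The bitfield arrangement in the commit above can be pictured roughly as follows. The numeric values, the DEPENDENCY_IGNORE entry, the toy_dependency_t type name, and the has_dependency() helper are assumptions made for this sketch; only the three other enumerator names come from the commit message.

```cpp
#include <cassert>

// Sketch of an enum that offers individual bits plus their combination.
enum toy_dependency_t {
    DEPENDENCY_IGNORE = 0x00, // Hypothetical "no dependencies" value.
    DEPENDENCY_TIMESTAMP_BITFIELD = 0x01,
    DEPENDENCY_DIRECT_SWITCH_BITFIELD = 0x02,
    // The old name now means "timestamps plus direct switches", so existing
    // users of DEPENDENCY_TIMESTAMPS get both behaviors.
    DEPENDENCY_TIMESTAMPS =
        DEPENDENCY_TIMESTAMP_BITFIELD | DEPENDENCY_DIRECT_SWITCH_BITFIELD,
};

// Hypothetical helper: tests whether a setting includes a given bit.
inline bool
has_dependency(toy_dependency_t setting, toy_dependency_t bit)
{
    return (static_cast<int>(setting) & static_cast<int>(bit)) != 0;
}
```

A caller who wants timestamp ordering without direct switches can pass only the timestamp bit, while the combined entry remains the convenient default choice.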
derekbruening added a commit that referenced this issue Nov 16, 2023
Rather than context switching on every syscall labeled maybe-blocking,
the scheduler uses the now-available syscall latency to decide whether
the syscall should block and result in a context switch.

Adds two new command line options, -sched_syscall_switch_us (default
500us) and -sched_blocking_switch_us (default 100us), and
corresponding scheduler_t inputs, to control the latency thresholds.
To avoid relying too much on the maybe-blocking labels, we do consider
a very-high-latency syscall not marked as maybe-blocking to block.

Adds a new unit test.

Tested in a large proprietary app where this reduces the context
switch rate from ~100x too high down to ~10x too high.  The next step
of adding i/o wait times should further improve the
representativeness.

Issue: #5843
derekbruening added a commit that referenced this issue Nov 16, 2023
Rather than context switching on every syscall labeled maybe-blocking,
the scheduler uses the now-available syscall latency to decide whether
the syscall should block and result in a context switch.

Adds two new command line options, -sched_syscall_switch_us (default
500us) and -sched_blocking_switch_us (default 100us), and corresponding
scheduler_t inputs, to control the latency thresholds. To avoid relying
too much on the maybe-blocking labels, we do consider a
very-high-latency syscall not marked as maybe-blocking to result in
a context switch.

Adds a new schedule_stats unit test.

Tested in a large proprietary app where this reduces the context switch
rate from ~100x too high down to ~10x too high. The next step of adding
i/o wait times should further improve the representativeness.

Issue: #5843
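
The latency-based decision in the two commits above amounts to comparing the syscall's measured latency against one of two thresholds. The sketch below uses the default values quoted there (500us and 100us); the function and constant names are hypothetical, and whether the real comparison is strict or inclusive is an assumption on my part.

```cpp
#include <cassert>
#include <cstdint>

// Defaults quoted above, in microseconds.
constexpr uint64_t kSyscallSwitchUs = 500;  // Applies to any syscall.
constexpr uint64_t kBlockingSwitchUs = 100; // Applies to maybe-blocking ones.

// Decides whether a syscall should result in a context switch, given the
// timestamps before and after it and whether it was labeled maybe-blocking.
bool
syscall_causes_switch(uint64_t pre_us, uint64_t post_us, bool maybe_blocking,
                      uint64_t syscall_switch_us = kSyscallSwitchUs,
                      uint64_t blocking_switch_us = kBlockingSwitchUs)
{
    // Equal pre- and post-syscall timestamps are legal (latency of zero).
    assert(pre_us <= post_us);
    uint64_t latency = post_us - pre_us;
    // A labeled syscall uses the lower threshold; an unlabeled one must be
    // very slow to be treated as blocking.
    uint64_t threshold = maybe_blocking ? blocking_switch_us : syscall_switch_us;
    return latency >= threshold; // Assumption: inclusive comparison.
}
```

Under these defaults, a 150us syscall switches only if it carries the maybe-blocking label, while a 600us syscall switches either way.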
derekbruening added a commit that referenced this issue Nov 17, 2023
Fixes a < assert from PR #6458 to be <=, to allow the pre-syscall
timestamp to equal the post-syscall timestamp.

Adds a test that fails without the fix.

Issue: #5843
derekbruening added a commit that referenced this issue Dec 11, 2023
Changes the quanta accounting to match the real kernel by accumulating
it across executions if a prior execution was terminated early due to
a voluntary context switch.

Adds new testing, and updates old tests with the behavior change.
Scheduler unit test string changes were carefully vetted.  E.g., for
test_synthetic_with_syscalls_multiple(): the output strings changed
because H's quantum accumulates and it hits a preempt in the middle of
its second HH sequence, which decrements B's quantum, causing B to
become available sooner.

Issue: #5843
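
My reading of the quanta accounting change above, expressed as a small sketch: the per-input instruction counter is left untouched by a voluntary switch and restarts only when the quantum is used up. The type and function names are hypothetical, and the exact bookkeeping in the scheduler may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-input state: instructions executed since this input was
// last preempted.
struct toy_input_state_t {
    uint64_t instrs_in_quantum = 0;
};

// Accounts for one executed instruction. Returns true when the quantum is
// used up, in which case the input is preempted and its counter restarts.
bool
on_instruction(toy_input_state_t &input, uint64_t quantum)
{
    if (++input.instrs_in_quantum < quantum)
        return false;
    input.instrs_in_quantum = 0;
    return true;
}

// A voluntary switch (e.g., a blocking syscall) deliberately leaves the
// counter alone, so the partially used quantum carries over to the next run.
void
on_voluntary_switch(toy_input_state_t &input)
{
    (void)input;
}
```

With a quantum of 3, an input that runs 2 instructions and then switches voluntarily is preempted after just 1 more instruction on its next run, rather than receiving a fresh 3.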
derekbruening added a commit that referenced this issue Apr 11, 2024
Adds a new scheduler option field honor_direct_switches and a
corresponding command-line parameter -sched_disable_direct_switches to
allow a way to disable direct thread switches, primarily for
scheduling experimentation.

Adds a unit test.

Issue #5843
derekbruening added a commit that referenced this issue Jun 26, 2024
Fixes an inconsistency in the CLI drmemtrace scheduler quantum and the
internal API by making them both the same at 6 million.  We pick 6
million to match 2 instructions per nanosecond with a 3ms quantum.
The scheduler_launcher default is also made to match.

Issue: #5843
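
The arithmetic behind the 6 million figure in the commit above, spelled out as compile-time constants (the constant names are mine, chosen for this sketch):

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kInstrsPerNs = 2;        // Assumed execution rate.
constexpr uint64_t kQuantumMs = 3;          // Scheduling quantum in ms.
constexpr uint64_t kNsPerMs = 1000 * 1000;  // Nanoseconds per millisecond.

// 2 instrs/ns * 3 ms * 1,000,000 ns/ms = 6,000,000 instructions.
constexpr uint64_t kQuantumInstrs = kInstrsPerNs * kQuantumMs * kNsPerMs;
static_assert(kQuantumInstrs == 6000000, "quantum should be 6 million instrs");
```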