Consider fork-like interface that doesn't schedule new coroutine immediately #1541

Closed
marlonjames opened this issue Mar 27, 2020 · 17 comments · Fixed by #2023
Labels
category:codebase:scheduler relating to cocotb's coroutine scheduler, triggers, and coroutine objects type:feature new or enhanced functionality

Comments

@marlonjames
Contributor

Currently when calling fork, the new forked coroutine gets to run immediately, and the originating coroutine waits until fork returns before continuing. It might be useful to provide a way to queue the forked coroutine to run after the current one yields back to the scheduler.

Originally posted by @garmin-mjames in #1526 (comment)

@eric-wieser
Member

That's how asyncio.ensure_future behaves, right?

@marlonjames
Contributor Author

Yes, I believe so. From a StackOverflow example:

task = asyncio.create_task(coro)
await asyncio.sleep(0) # return control to loop so task can start
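For reference, here is a minimal, self-contained sketch of that asyncio behavior. It uses only the standard library; the names child, parent, and order are made up for illustration. It records the order of events to show that the task created by create_task does not start until the creating coroutine yields to the loop.

```python
import asyncio

order = []  # records the sequence of events


async def child():
    order.append("child started")


async def parent():
    task = asyncio.create_task(child())
    # The task is only scheduled at this point; it has not run yet.
    order.append("parent after create_task")
    await asyncio.sleep(0)  # yield to the event loop so the task can start
    order.append("parent after sleep(0)")
    await task


asyncio.run(parent())
print(order)
```

The child only appears in the list after the parent's first entry, which is the "queue it, run it later" behavior being asked for here.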

@ktbarrett
Member

ktbarrett commented Mar 28, 2020

What's the use case for this?

As mentioned in #1526, this can be implemented by adding coroutines to the scheduler using scheduler.queue, which seems trivial until you realize there are other problems.

We don't really have a way to force a reschedule. NullTrigger fires immediately, not allowing any other coroutines to run. Timer(0) unnecessarily creates a GPI call and leaves behavior up to the simulator (Icarus fails if await Timer(0) follows an await ReadOnly()). So we need a way to force a reschedule.

I also brought up in my comment that because you must yield control to the scheduler for the coroutines to start, and you may forget to do this, it's possible for the test to end before the coroutine runs, resulting in unexpected or invalid behavior. We need a way to force the scheduler to run after a call to run_soon, but in a way the user has control over. We could implement this with something akin to Trio's nursery, where an async with block would let us implement run_soon as a delayed call to fork that runs when the context exits. Perhaps there are other ways of doing this.
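A rough sketch of the async with idea, written against asyncio as a stand-in because cocotb itself needs a simulator to run. The class name DeferredStarts and its run_soon method are hypothetical, not an existing cocotb or Trio API. Coroutines are only recorded inside the block and are started when the block exits, followed by one yield so that each of them gets its first turn.

```python
import asyncio


class DeferredStarts:
    """Hypothetical helper: collect coroutines, start them all on block exit."""

    def __init__(self):
        self._pending = []
        self.tasks = []

    def run_soon(self, coro):
        # Only record the coroutine; nothing runs yet.
        self._pending.append(coro)

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # The block failed: close the coroutines so they never start.
            for coro in self._pending:
                coro.close()
            return False
        self.tasks = [asyncio.create_task(c) for c in self._pending]
        # Yield once so every new task gets its first turn before we go on.
        await asyncio.sleep(0)
        return False


log = []


async def worker(name):
    log.append(f"{name} ran")


async def main():
    async with DeferredStarts() as group:
        group.run_soon(worker("a"))
        group.run_soon(worker("b"))
        log.append("setup done")
    log.append("block exited")


asyncio.run(main())
print(log)
```

Yielding once on exit is a design choice for this sketch; a Trio-style nursery would instead wait for the children to finish before leaving the block.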

@ktbarrett
Member

I'll amend this to: "What's the use case for supporting both behaviors?" I wouldn't be against changing the behavior of fork, but that would probably have to wait for a major version change.

@eric-wieser eric-wieser added the type:feature new or enhanced functionality label Apr 9, 2020
@ktbarrett
Member

After diagnosing the root cause of #1688 and #1347, I am even more on board with the idea of fork not causing an immediate reschedule. Scheduler re-entry seems like a bad idea on its surface, and obviously leads to strange bugs. So if/when we do a rewrite, we should design the scheduler around avoiding that.

@marlonjames
Contributor Author

It also puts the originating Task in a weird state, where it's not pending a trigger but it's also not the running task.
Maybe we should add a new trigger, like Join but only one task can wait on it and it's run when the forked task completes or awaits a trigger. That might require changing the semantics though, i.e.:

await fork(coro())

@ktbarrett
Member

@garmin-mjames What's the purpose of that trigger?

@marlonjames
Contributor Author

The task that calls fork is the currently scheduled task, so as you said we re-enter the scheduler and call schedule on the new task. The original task is now in a "paused" state while the other task runs, but it's not pending a trigger like all other tasks that aren't currently running, it's just on the stack waiting for fork to return.

I guess it's sort of a half-way between the two options, where it creates the new task without scheduling it but then immediately yields to the scheduler to run the new task. It would preserve current behavior whereby the scheduler runs the new task then returns to the task that called fork.

@ktbarrett
Member

ktbarrett commented Apr 24, 2020

@garmin-mjames I think we should just add the forked coroutine to the list of pending tasks, so when the current task yields control back to the scheduler, the forked coroutine starts up. If you want it to start up immediately, you can simply immediately yield control with a NullTrigger.

future = fork(coro())  # `RunningTask` is 90% of the way to a future
# ...
await NullTrigger()
# ...
res = await future
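To make the "add it to the list of pending tasks" idea concrete, here is a toy round-robin scheduler over plain generators. It is not cocotb's scheduler; ToyScheduler and its fork method are invented for this sketch. fork only queues the new task, so the child gets its first turn when the parent yields.

```python
from collections import deque


class ToyScheduler:
    """Toy round-robin scheduler over generators (not cocotb's scheduler)."""

    def __init__(self):
        self._ready = deque()
        self.log = []

    def fork(self, gen):
        # Deferred start: just queue the task; do not run it here.
        self._ready.append(gen)
        return gen

    def run(self):
        while self._ready:
            task = self._ready.popleft()
            try:
                next(task)  # run until the task yields control
            except StopIteration:
                continue  # task finished, drop it
            self._ready.append(task)  # re-queue behind everything else


sched = ToyScheduler()


def child():
    sched.log.append("child start")
    yield
    sched.log.append("child end")


def parent():
    sched.fork(child())
    sched.log.append("parent after fork")
    yield  # the child gets its first turn only now
    sched.log.append("parent resumed")


sched.fork(parent())
sched.run()
print(sched.log)
```

Because fork never calls into the scheduler loop, there is no re-entry and the parent is never left half-running on the stack while the child executes.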

@marlonjames
Contributor Author

@ktbarrett I agree.
But, NullTrigger won't work, as you noted up-thread:

We don't really have a way to force a reschedule. NullTrigger fires immediately, not allowing any other coroutines to run.

@marlonjames marlonjames added the category:core (DEPRECATED) Issues in the core of cocotb (scheduler, error reporting, etc.) label Apr 24, 2020
@ktbarrett
Member

ktbarrett commented Apr 24, 2020

@garmin-mjames I think we should change that behavior. If we are going to break the scheduler interface, we should just break it and do what we need to make it better. We need to queue up issues and detail what the end product of a scheduler rewrite should look like.

This is the test I used to check the behavior of NullTrigger():

@cocotb.test()
async def test_nulltrigger_reschedule(dut):

    @cocotb.coroutine
    async def reschedule(n):
        for i in range(4):
            cocotb.log.info("Fork {}, iteration {}".format(n, i))
            await NullTrigger()

    await Combine(*(reschedule(i) for i in range(4)))

The output shows NullTrigger not rescheduling.

make results.xml
make[1]: Entering directory '/home/kaleb/dev/cocotb/tests/test_cases/test_cocotb'
MODULE=test_scheduler \
        TESTCASE=test_nulltrigger_reschedule TOPLEVEL=sample_module TOPLEVEL_LANG=verilog COCOTB_SIM=1 \
        /usr/bin/vvp -M /home/kaleb/dev/cocotb/cocotb/libs -m libcocotbvpi_icarus   sim_build/sim.vvp 
     -.--ns INFO     cocotb.gpi                         ..mbed/gpi_embed.cpp:71   in set_program_name_in_venv        Did not detect Python virtual environment. Using system-wide Python interpreter
     -.--ns INFO     cocotb.gpi                         ../gpi/GpiCommon.cpp:106  in gpi_print_registered_impl       VPI registered
     0.00ns INFO     cocotb.gpi                                gpi_embed.cpp:332  in embed_sim_init                  Running on Icarus Verilog version 10.3 (stable)
     0.00ns INFO     cocotb.gpi                                gpi_embed.cpp:333  in embed_sim_init                  Python interpreter initialized and cocotb loaded!
     0.00ns INFO     cocotb                                      __init__.py:145  in _initialise_testbench           Running tests with cocotb v1.4.0.dev0 from /home/kaleb/dev/cocotb/cocotb
     0.00ns INFO     cocotb                                      __init__.py:162  in _initialise_testbench           Seeding Python random module with 1587743335
     0.00ns WARNING  cocotb.regression                         regression.py:187  in initialise                      No tests were discovered
     0.00ns INFO     cocotb.regression                         regression.py:194  in initialise                      Found test test_scheduler.test_nulltrigger_reschedule
     0.00ns INFO     cocotb.regression                         regression.py:422  in execute                         Running test 1/1: test_nulltrigger_reschedule
     0.00ns INFO     ..trigger_reschedule.0x7fdfde86aac8       decorators.py:253  in _advance                        Starting test: "test_nulltrigger_reschedule"
                                                                                                                     Description: None
     0.00ns INFO     cocotb                                test_scheduler.py:217  in reschedule                      Fork 0, iteration 0
     0.00ns INFO     cocotb                                test_scheduler.py:217  in reschedule                      Fork 0, iteration 1
     0.00ns INFO     cocotb                                test_scheduler.py:217  in reschedule                      Fork 0, iteration 2
     0.00ns INFO     cocotb                                test_scheduler.py:217  in reschedule                      Fork 0, iteration 3
     0.00ns INFO     cocotb                                test_scheduler.py:217  in reschedule                      Fork 2, iteration 0
     0.00ns INFO     cocotb                                test_scheduler.py:217  in reschedule                      Fork 2, iteration 1
     0.00ns INFO     cocotb                                test_scheduler.py:217  in reschedule                      Fork 2, iteration 2
     0.00ns INFO     cocotb                                test_scheduler.py:217  in reschedule                      Fork 2, iteration 3
     0.00ns ERROR    cocotb.scheduler                            __init__.py:204  in _sim_event                      Failing test at simulator request before test run completion: Simulator shutdown prematurely
     0.00ns ERROR    cocotb.regression                         regression.py:388  in _score_test                     Test error has lead to simulator shutting us down
                                                                                                                     cocotb.result.SimFailure: Failing test at simulator request before test run completion: Simulator shutdown prematurely
     0.00ns ERROR    cocotb.regression                         regression.py:232  in tear_down                       Failed 1 out of 1 tests (0 skipped)
     0.00ns INFO     cocotb.regression                         regression.py:477  in _log_test_summary               ****************************************************************************************************
                                                                                                                     ** TEST                                        PASS/FAIL  SIM TIME(NS)  REAL TIME(S)  RATIO(NS/S) **
                                                                                                                     ****************************************************************************************************
                                                                                                                     ** test_scheduler.test_nulltrigger_reschedule    FAIL            0.00          0.00         0.00  **
                                                                                                                     ****************************************************************************************************
                                                                                                                     
     0.00ns INFO     cocotb.regression                         regression.py:494  in _log_sim_summary                *************************************************************************************
                                                                                                                     **                                 ERRORS : 1                                      **
                                                                                                                     *************************************************************************************
                                                                                                                     **                               SIM TIME : 0.00 NS                                **
                                                                                                                     **                              REAL TIME : 0.00 S                                 **
                                                                                                                     **                        SIM / REAL TIME : 0.00 NS/S                              **
                                                                                                                     *************************************************************************************
                                                                                                                     
     0.00ns INFO     cocotb.regression                         regression.py:239  in tear_down                       Shutting down...
make[1]: Leaving directory '/home/kaleb/dev/cocotb/tests/test_cases/test_cocotb'

Funnily enough, that test fails too! Yay scheduler!

With COCOTB_SCHEDULER_DEBUG: sim.log.

EDIT: forking the coroutines before passing them to Combine shows it runs all the coroutines, but still in order and it still fails.

@eric-wieser
Member

Nice test-case @ktbarrett, thanks

@marlonjames
Contributor Author

marlonjames commented Apr 24, 2020

I'm working on an overhaul of the scheduler debug logging, and as part of that I found something interesting that's related.
Currently, react is not used to start a test, so we aren't in the normal event loop. We enter the event loop when certain Python triggers fire, or after we yield a GPITrigger and return.

If I add an await Timer(1) before the await Combine(...), it works:

@cocotb.test()
async def test_nulltrigger_reschedule(dut):

    @cocotb.coroutine
    async def reschedule(n):
        for i in range(4):
            cocotb.log.info("Fork {}, iteration {}".format(n, i))
            await NullTrigger()

    await Timer(1)

    await Combine(*(reschedule(i) for i in range(4)))

Result:

     0.00ns INFO     cocotb.gpi                                gpi_embed.cpp:332  in embed_sim_init                  Running on Aldec HDL Simulator version 10.01.3088
     0.00ns INFO     cocotb.gpi                                gpi_embed.cpp:333  in embed_sim_init                  Python interpreter initialized and cocotb loaded!
     0.00ns INFO     cocotb                                      __init__.py:145  in _initialise_testbench           Running tests with cocotb v1.4.0.dev0 from xxxx
     0.00ns INFO     cocotb                                      __init__.py:162  in _initialise_testbench           Seeding Python random module with 1587769016
     0.00ns WARNING  cocotb.regression                         regression.py:187  in initialise                      No tests were discovered
     0.00ns INFO     cocotb.regression                         regression.py:194  in initialise                      Found test test_scheduler.test_nulltrigger_reschedule
     0.00ns INFO     cocotb.regression                         regression.py:428  in execute                         Running test 1/1: test_nulltrigger_reschedule
     0.00ns INFO     .._nulltrigger_reschedule.0xe9d8ab0       decorators.py:253  in _advance                        Starting test: "test_nulltrigger_reschedule"
                                                                                                                     Description: None
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 0, iteration 0
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 1, iteration 0
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 2, iteration 0
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 3, iteration 0
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 0, iteration 1
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 1, iteration 1
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 2, iteration 1
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 3, iteration 1
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 0, iteration 2
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 1, iteration 2
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 2, iteration 2
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 3, iteration 2
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 0, iteration 3
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 1, iteration 3
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 2, iteration 3
     0.00ns INFO     cocotb                                test_scheduler.py:216  in reschedule                      Fork 3, iteration 3
     0.00ns INFO     cocotb.regression                         regression.py:366  in _score_test                     Test Passed: test_nulltrigger_reschedule
     0.00ns INFO     cocotb.regression                         regression.py:235  in tear_down                       Passed 1 tests (0 skipped)
     0.00ns INFO     cocotb.regression                         regression.py:483  in _log_test_summary               ****************************************************************************************************
                                                                                                                     ** TEST                                        PASS/FAIL  SIM TIME(NS)  REAL TIME(S)  RATIO(NS/S) **
                                                                                                                     ****************************************************************************************************
                                                                                                                     ** test_scheduler.test_nulltrigger_reschedule    PASS            0.00          0.04         0.05  **
                                                                                                                     ****************************************************************************************************

     0.00ns INFO     cocotb.regression                         regression.py:500  in _log_sim_summary                *************************************************************************************
                                                                                                                     **                                 ERRORS : 0                                      **
                                                                                                                     *************************************************************************************
                                                                                                                     **                               SIM TIME : 0.00 NS                                **
                                                                                                                     **                              REAL TIME : 0.05 S                                 **
                                                                                                                     **                        SIM / REAL TIME : 0.04 NS/S                              **
                                                                                                                     *************************************************************************************

     0.00ns INFO     cocotb.regression                         regression.py:239  in tear_down                       Shutting down...

When we are in the event loop, yielding a NullTrigger will add it to the scheduler's _pending_triggers rather than fire immediately:

cocotb/cocotb/scheduler.py

Lines 332 to 334 in a3fb4b5

if self._is_reacting:
    # queue up the trigger, the event loop will get to it
    self._pending_triggers.append(trigger)

I changed add_test to use react to start the test, mostly to get the "handing control back to simulator" logging to work correctly, and found this. It may also fix other issues.
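The difference between the two runs can be illustrated with a deliberately simplified toy model. This is not cocotb's actual control flow: ToyScheduler, fire, start, and react here are invented names, and a bare yield stands in for awaiting a NullTrigger. The only thing it models is the branch quoted above: a trigger that fires while the loop is reacting gets queued, while one that fires outside the loop resumes its waiter straight away.

```python
from collections import deque


class ToyScheduler:
    """Toy model of 'queue the trigger while reacting' (not cocotb's code)."""

    def __init__(self):
        self.is_reacting = False
        self.pending = deque()
        self.log = []

    def start(self, task):
        # Start a task from outside the event loop.
        self._advance(task)

    def _advance(self, task):
        try:
            next(task)  # run until the task awaits a null trigger
        except StopIteration:
            return
        self.fire(task)  # a null trigger fires as soon as it is primed

    def fire(self, task):
        if self.is_reacting:
            # Inside the event loop: queue it, the loop will get to it.
            self.pending.append(task)
        else:
            # Outside the event loop: resume the waiter right away (re-entry).
            self._advance(task)

    def react(self, tasks):
        """Start tasks from inside the event loop, then drain the queue."""
        self.is_reacting = True
        try:
            for task in tasks:
                self._advance(task)
            while self.pending:
                self._advance(self.pending.popleft())
        finally:
            self.is_reacting = False


def reschedule(sched, n):
    for i in range(2):
        sched.log.append((n, i))
        yield  # stands in for `await NullTrigger()`


outside = ToyScheduler()
for n in range(2):
    outside.start(reschedule(outside, n))

inside = ToyScheduler()
inside.react([reschedule(inside, n) for n in range(2)])

print(outside.log)  # each task runs to completion before the next starts
print(inside.log)   # tasks interleave, one iteration at a time
```

In the toy, starting tasks outside the loop gives the back-to-back ordering seen in the failing log, and starting them inside the loop gives the interleaved ordering seen in the passing one.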

This was referenced Apr 25, 2020
@ktbarrett
Member

ktbarrett commented Aug 18, 2020

Bumping this because this issue is now directly blocking a feature a user (me) wants to implement.

One of our testbenching methodologies is built around the CSP model. Each part of the testbench is an independent process communicating with other parts via channels. The processes are implemented using cocotb.fork (well, our compat layer that allows it to run in asyncio as well).

The limitation is that processes must be instantiated and then started as separate steps. This is because we don't want the process to start immediately (due to cocotb.fork's current behavior) on instantiation and send messages before the consumer is instantiated; data would be dropped.

Because we have to start the process as a separate step, we cannot support anonymous processes. All processes must be managed by some object, that object needs to know about the processes it manages, which starts them up after all testbenching structure is instantiated.

We were hoping we could use the concept of anonymous processes for testbench customization. The idea is that when we delete an anchor point, anonymous processes, which are only weakly referenced by their channels, are automatically pruned. For example, this is useful for customizing a Scoreboard in a reuse VIP, or removing a Driver from an Agent when it should be passive. All Processes which connect to the deleted elements would be automatically pruned and not left dangling, which could improve performance or prevent errors.

Changing cocotb.fork to not immediately schedule would allow our use of anonymous processes. Processes would not start until the Task that is doing the testbench construction and customization yields control, at which point all testbenching structure should be in place and properly connected.
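A small sketch of why deferred start matters for this pattern, again using asyncio as a stand-in for a scheduler that does not run a new task immediately. Channel, producer, and build_testbench are hypothetical names for illustration, not part of any real testbench library. The producer is created before its consumer is attached, yet nothing is dropped, because it only starts once the constructing coroutine yields.

```python
import asyncio


class Channel:
    """Hypothetical channel that drops messages when no consumer is attached."""

    def __init__(self):
        self.consumer = None
        self.dropped = []

    def send(self, item):
        if self.consumer is None:
            self.dropped.append(item)
        else:
            self.consumer.append(item)


async def producer(chan):
    for item in range(3):
        chan.send(item)
        await asyncio.sleep(0)


async def build_testbench():
    chan = Channel()
    # Deferred start: the producer is scheduled here but has not run yet.
    task = asyncio.create_task(producer(chan))
    received = []
    chan.consumer = received  # wiring finishes before control is yielded
    await task  # first yield: the producer starts only now
    return received, chan.dropped


received, dropped = asyncio.run(build_testbench())
print(received, dropped)
```

With an immediately scheduling fork, the producer's first send would happen before the consumer is attached and that message would land in dropped.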

@eric-wieser
Member

eric-wieser commented Aug 18, 2020

What does the code in #1541 (comment) do on master? (edit: it does what that comment describes, it's tested in #1705).

My impression was we "fixed" NullTrigger so that it now gives you what you need.

@eric-wieser
Member

eric-wieser commented Aug 18, 2020

@ktbarrett: Can you get what you need with

def fork_later(coro):
    e = Event()
    async def wrapped():
        await e.wait()
        return (await coro)
    return e, cocotb.fork(wrapped())

used as

e, task = fork_later(coro)

# later

e.set()

@ktbarrett
Member

ktbarrett commented Aug 19, 2020

@eric-wieser I forgot that was fixed... The second suggestion is equivalent to what we have now but uses an Event, instead of simply calling fork later. But the first should be sufficient for our needs.

But if await NullTrigger() does what we need it to, maybe we could codify that pattern and add documentation; that way we can consider this issue closed? I'm thinking something simple like:

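from typing import Coroutine, Union

import cocotb
from cocotb.decorators import RunningTask
from cocotb.triggers import NullTrigger

def run_task_soon(coro: Union[RunningTask, Coroutine]) -> RunningTask:
    """Runs a coroutine concurrently, starting after the current coroutine yields control"""
    async def later():
        await NullTrigger()
        return await coro  # pass the wrapped coroutine's result through
    return cocotb.fork(later())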

I took this issue as a general complaint about the scheduler being recursive and causing a number of bugs, but that should probably be in a separate issue.

@ktbarrett ktbarrett added category:codebase:scheduler relating to cocotb's coroutine scheduler, triggers, and coroutine objects and removed category:core (DEPRECATED) Issues in the core of cocotb (scheduler, error reporting, etc.) labels Sep 9, 2020

3 participants