Coprocessor stalls indefinitely if result transactions are not accepted immediately #59

Closed
moimfeld opened this issue May 4, 2022 · 2 comments
Labels
bug Something isn't working

moimfeld commented May 4, 2022

Hi @michael-platzer

This issue might not be reproducible without the UVM environment, which we plan to open-source next week.

UVM environment

As discussed yesterday, I am in the process of setting up a UVM environment to verify Vicuna. The environment drives the core-side signals of the x-interface channels and therefore "emulates" a core. Any handshake signal controlled by the environment can be configured to have a random delay. Here are a few examples of what this means:

  • commit_valid can be configured to have a random delay (in clock cycles) w.r.t. the issue handshake
  • mem_ready can be configured to have a random delay (in clock cycles) w.r.t. the assertion of the mem_valid signal
  • mem_result_valid can be configured to have a random delay (in clock cycles) w.r.t. the memory request/response handshake

(This is not a complete list of all signals with random delay)

Note: Even though certain transactions are randomly delayed, the core side is strictly in-order, so no transaction initiated by the environment is out of order (OoO).
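To make the random-delay configuration concrete, here is a minimal Python sketch of how a core-side driver could delay a handshake by a random number of clock cycles while staying strictly in-order. This is not the UVM environment itself (which is not published yet); the function name `ready_schedule`, the `max_delay` parameter, and the seeding are assumptions made for illustration only.

```python
import random

def ready_schedule(valid_cycles, max_delay, seed=0):
    """Return the cycle at which each transaction is accepted.

    valid_cycles: cycle at which `valid` is first asserted for each
                  transaction, in program order (hypothetical input).
    max_delay:    upper bound (in clock cycles) of the random delay.
    """
    rng = random.Random(seed)   # seeded so a failing run can be replayed
    accepted = []
    earliest = 0                # in-order: next handshake must come later
    for start in valid_cycles:
        begin = max(start, earliest)
        done = begin + rng.randint(0, max_delay)
        accepted.append(done)
        earliest = done + 1     # at most one handshake per cycle
    return accepted
```

With `max_delay=0` every transaction is accepted in the cycle its `valid` is asserted, which corresponds to running without random delay.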

Issue

When random delay is turned on for the result_ready signal of the result interface, the coprocessor stalls indefinitely if a result transaction is not accepted immediately. The picture below shows the x-interface signals: after 590 ns, the coprocessor stalls indefinitely. I have not investigated this observation further.

[Image: issue_59 (waveform of the x-interface signals)]
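A stall like this can also be flagged automatically instead of being spotted in a waveform. The following Python sketch is a hypothetical watchdog, not part of the UVM environment described here: it scans per-cycle samples of result_valid and result_ready and reports the cycle at which no result handshake has occurred for `timeout` cycles while results are still outstanding. The names `find_stall`, `expected_results`, and `timeout` are assumptions.

```python
def find_stall(trace, expected_results, timeout):
    """Return the cycle at which a stall is declared, or None.

    trace:            list of (result_valid, result_ready) samples,
                      one entry per clock cycle.
    expected_results: number of result transactions still expected.
    timeout:          cycles without a handshake before giving up.
    """
    completed = 0
    idle = 0
    for cycle, (valid, ready) in enumerate(trace):
        if valid and ready:
            # A result transaction completes when both signals are high.
            completed += 1
            idle = 0
        elif completed < expected_results:
            # Results are still outstanding but no handshake this cycle.
            idle += 1
            if idle >= timeout:
                return cycle
    return None
```

The timeout has to be larger than the longest random delay configured for result_ready, otherwise a legitimately delayed handshake would be reported as a stall.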

Problematic Instruction sequence

#  ------------------------------------------------------------
# |
# |
# |	Next Instruction Sequence Info
# |
# | 		Number of Instructions:           7
# |
# | 		 1. Instruction: 	0002f2d7
# | 		 2. Instruction: 	02050007
# | 		 3. Instruction: 	5e000157
# | 		 4. Instruction: 	0002f2d7
# | 		 5. Instruction: 	00058107
# | 		 6. Instruction: 	0002f2d7
# | 		 7. Instruction: 	02050127
# |
# |
#  ------------------------------------------------------------

This sequence corresponds to the following assembly (vle8_8.S), where only the vector instructions are offloaded:

    la              a0, vdata_start     
    li              t0, 16              
    vsetvli         t0, t0, e8,m1,tu,mu
    vle8.v          v0, (a0)
    vmv.v.v         v2, v0
    vsetvli         t0, t0, e8,m1,tu,mu
    vle8.v          v2, (a1), v0.t
    li              t0, 16              
    vsetvli         t0, t0, e8,m1,tu,mu
    vse8.v          v2, (a0)
    la              a0, vdata_start     
    la              a1, vdata_end       
    j               spill_cache         
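The correspondence between the hex words in the log and the assembly can be checked by extracting the instruction fields. Below is a minimal Python sketch, assuming the standard 32-bit RISC-V layout (opcode in bits 6:0, rd/vd/vs3 in bits 11:7, funct3/width in bits 14:12, rs1/vs1 in bits 19:15). The helper names `decode_fields` and `classify` are hypothetical and not part of Vicuna or the UVM environment.

```python
# Major opcodes used by the vector extension.
MAJOR_OPCODES = {
    0x07: "LOAD-FP (vector load)",
    0x27: "STORE-FP (vector store)",
    0x57: "OP-V (vector arithmetic)",
}

def decode_fields(word):
    """Split a 32-bit instruction word into its common fields."""
    return {
        "opcode": word & 0x7F,
        "rd_vd": (word >> 7) & 0x1F,     # rd, vd, or vs3 for stores
        "funct3": (word >> 12) & 0x7,    # width field for loads/stores
        "rs1_vs1": (word >> 15) & 0x1F,  # x10 is a0, x11 is a1
    }

def classify(word):
    """Name the instruction group of a vector instruction word."""
    fields = decode_fields(word)
    # Within OP-V, funct3 == 0b111 selects the configuration-setting group.
    if fields["opcode"] == 0x57 and fields["funct3"] == 0b111:
        return "vector config (vsetvli family)"
    return MAJOR_OPCODES.get(fields["opcode"], "unknown")

if __name__ == "__main__":
    sequence = [0x0002F2D7, 0x02050007, 0x5E000157, 0x0002F2D7,
                0x00058107, 0x0002F2D7, 0x02050127]
    for word in sequence:
        print(f"{word:08x}: {classify(word)} {decode_fields(word)}")
```

For example, `0002f2d7` falls into the configuration-setting group, matching the `vsetvli` lines, and `00058107` is a vector load with base register x11 (a1) and destination v2, matching the masked `vle8.v`.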

Expected execution (with random result_ready delay disabled)

[Image: issue_59_60_61_correct_execution (waveform of the expected execution)]

michael-platzer commented

Hi @moimfeld, I fixed the logic that was responsible for the indefinite stall triggered by this particular test case. However, there are at least two more cases where de-asserting result_ready for a long period of time could cause trouble. I will fix those after PR #43 is merged, since the required changes would conflict with that PR.

moimfeld commented

Thank you very much!
