Cannot do subsequent runs using xrt driver without hot resetting the board #71

Open
PedrooHR opened this issue Jun 27, 2022 · 4 comments · Fixed by #134

@PedrooHR
Contributor

The first execution of the test/host/xrt/test.cpp test works fine (using hardware). Subsequent runs of this test fail with errors like those in the subsequent-run.log file (subsequent runs work fine in the pynq tri test). With the xrt driver, we can only run the test successfully again after a hot reset of the board.

quetric self-assigned this Jun 28, 2022
quetric added the bug (Something isn't working) label Jun 28, 2022
@quetric
Collaborator

quetric commented Jun 28, 2022

Thanks @PedrooHR for reporting this. I was able to replicate this issue on our end.

quetric mentioned this issue Dec 9, 2022
@quetric
Collaborator

quetric commented Dec 15, 2022

It appears that #134 improves soft reset functionality but is not a complete fix.

@PedrooHR
Contributor Author

PedrooHR commented Feb 8, 2023

I'm still not able to do subsequent runs.

Now, when I correctly call the ACCL destructor, I get this error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  internal error: missing kds device

I'm not sure why, but I traced it in the source: the error occurs in the ACCL::deinit() function, during the config-scenario call with the cfgFunc::reset_periph function.

EDIT 1: It was a problem with the final operations when shutting down the OMPC runtime (due to the "non-natural" way we do that). But I still have other problems that I am investigating.

EDIT 2: The deinit() operation seems fine now. However, I get an error in the subsequent run when opening the ports:

terminate called after throwing an instance of 'xrt_core::system_error'
  what():  failed to launch execution buffer: Resource deadlock avoided

@quetric
Collaborator

quetric commented Feb 9, 2023

Your deinit() issues are probably caused by a race condition with XRT destruction. Since the ACCL destructor uses XRT to access the CCLO and clean it up, it must always run before the XRT device is destroyed. The EDIT 2 issue might come from the TCP stack being incorrectly deinitialized when exiting the previous run. I will investigate.
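The ordering constraint described above can be illustrated with a minimal C++ sketch. It does not use the real XRT or ACCL APIs: `FakeDevice`, `FakeAccl`, `Session`, and `run_session` are hypothetical stand-ins, assumed here only to show one way to guarantee the order. C++ destroys class members in reverse declaration order, so declaring the device before the driver object makes the driver's cleanup run while the device still exists.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an XRT device handle (NOT the real xrt::device).
struct FakeDevice {
    explicit FakeDevice(std::vector<std::string>& log) : log_(log) {}
    ~FakeDevice() { log_.push_back("device closed"); }
    std::vector<std::string>& log_;
};

// Hypothetical stand-in for the ACCL driver object (NOT the real ACCL class).
struct FakeAccl {
    FakeAccl(FakeDevice& dev, std::vector<std::string>& log)
        : dev_(dev), log_(log) {}
    // Cleanup needs the device, so the device must still be alive here.
    ~FakeAccl() { log_.push_back("accl deinit via device"); }
    FakeDevice& dev_;
    std::vector<std::string>& log_;
};

// Owns both objects. Members are destroyed in reverse declaration order,
// so `accl` is always destroyed before `device`.
struct Session {
    explicit Session(std::vector<std::string>& log)
        : device(log), accl(device, log) {}
    FakeDevice device;  // declared first, destroyed last
    FakeAccl accl;      // declared last, destroyed first
};

// Runs one mock session and returns the recorded order of events.
std::vector<std::string> run_session() {
    std::vector<std::string> log;
    {
        Session s(log);
        log.push_back("run test");
    }  // `s` goes out of scope here: accl cleanup runs, then the device closes
    return log;
}
```

Calling `run_session()` records the driver cleanup before the device close. The same idea applies if the objects are held in smart pointers: whichever owner holds the device should outlive the owner of the driver object. Whether this matches how the OMPC runtime tears things down is an assumption that would need checking against the actual code.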
