Skip to content

Conversation

nick1udwig
Copy link
Member

@nick1udwig nick1udwig commented Feb 21, 2024

Problem

kit run-tests not working

Solution

  1. Fix tester processes here.
  2. Add a starting "verbosity" level command-line option.
  3. Add a "timed out" print.

Docs Update

Corresponding docs PR

Notes

Solution 3 may be controversial, so please take a look at it.

@nick1udwig nick1udwig marked this pull request as ready for review February 21, 2024 23:57
@nick1udwig nick1udwig requested a review from dr-frmr February 21, 2024 23:57
@nick1udwig
Copy link
Member Author

nick1udwig commented Feb 22, 2024

To test you must use this PR, this kit, and this core_tests. You'll have to edit the core_tests tests.toml to point at your Kinode repo with this branch selected.

@nick1udwig
Copy link
Member Author

Some notes:

  1. Tests are obliged to wait for a timeout in kns_indexer/app_store before proceeding. It'd be nice if this weren't the case, but I didn't dig into what is going on here. I'd guess the problem is with the nodes not being connected to an ETH RPC.
  2. run-tests: don't print ERROR when pinging to connect to node kit#92
  3. capture errors when runtime_verbose = false core_tests#7

let our = self.metadata.our.clone();
let timeout_handle = tokio::spawn(async move {
tokio::time::sleep(std::time::Duration::from_secs(timeout_secs)).await;
print(&send_to_terminal, 0, format!("{our}: timed out")).await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, controversial indeed.. i don't think this is a good timeout-tracking solution

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this actually necessary or just a visual debugging tool?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is quite hard to debug timeouts without an indication of what is going on. We could increase verbosity level or try to figure out some alternative way to do help devs debug, but the status quo of not informing devs is bad (I spent a day tracking down timeout-related bug because it wasn't clear to me what timeouts were happening, whereas if there were some type of print, it would've been much faster).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard for me to relate since timeouts do show up in the event loop trace, and unless there's a kernel bug in how they get sent back to processes, they should come up in the receive() message queue. In almost all the programs I've written, I have a print that shows any errors that come into my process. Is this an issue that's specific to runtime extension development?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. I think maybe this was due to the specific path I followed in bringing run-tests into a more usable state or perhaps getting confused about multiple timeouts hitting. Regardless, I've removed the print in f2b8d38

Copy link
Contributor

@dr-frmr dr-frmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, everything looks good now

@nick1udwig nick1udwig merged commit b2a75df into develop Feb 24, 2024
@nick1udwig nick1udwig deleted the hf/fix-tests branch February 24, 2024 05:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants