Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to how timeouts are handled. #1043

Merged
merged 15 commits into from
Jul 26, 2022
Merged

Improvements to how timeouts are handled. #1043

merged 15 commits into from
Jul 26, 2022

Conversation

Lucaweihs
Copy link
Collaborator

We have code that times out the AI2-THOR unity process if it doesn't respond to a request in some number of seconds when using the WSGI server. Similar code did not seem to exist for the, now default, fifo server. This PR adds this timeout functionality (with tests) and adds a few small typing improvements.

@lgtm-com
Copy link

lgtm-com bot commented Jul 19, 2022

This pull request introduces 3 alerts when merging ebf9519 into 391b3fa - view on LGTM.com

new alerts:

  • 2 for Module is imported with 'import' and 'import from'
  • 1 for Unused import

warnings.warn(message)
self.start(width=self.width, height=self.height, x_display=self.x_display)
self.reset()
raise RestartError(message)
except Exception as e:
self.server.stop()
raise (TimeoutError if isinstance(e, TimeoutError) else RuntimeError)(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be more clear if we had an except branch for TimeoutError?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree regarding clarity but I'd prefer to have a single branch here as, otherwise, we would be duplicating the same self.server.stop() etc code twice.

ai2thor/controller.py Outdated Show resolved Hide resolved
self.last_event = self.server.receive(
timeout=max(
60.0 * 5,
self.server_timeout * 5 if self.server_timeout is not None else 0.0
Copy link
Collaborator

@AlvaroHG AlvaroHG Jul 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this multiplied by 5 wouldn't it be better for the value to be the actual timeout_value so there aren't unexpected behaviors?

Also seems like the if condition will never be true because there is an assert for self.server_timeout is None in the constructor

Copy link
Collaborator Author

@Lucaweihs Lucaweihs Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided to just create a server_start_timeout parameter. I was trying to keep the number of initialization parameters low but this was probably unnecessary.

Also seems like the if condition will never be true because there is an assert for self.server_timeout is None in the constructor

The assert you're talking about checks that self.server_timeout is None or that it's positive.

start_t = time.time()
message = b""

while message_size > 0:
Copy link
Collaborator

@AlvaroHG AlvaroHG Jul 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this way of reading with the while loop and directly from the file descriptor any slower? I don't think it should, but an option could be if server_timeout is None then we could use the old function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's already there if look at lines 116-117.

max_attempts = (
float("inf")
if timeout is None or timeout == float("inf") else
max(int(math.ceil(timeout / 0.5)), 1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A line is a bit hard to read, any reason for / 0.5 vs * 2, making it floating point?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made this a bit clearer, the idea is that the 0.5 references the 0.5 used by the que.get(...) call a few lines layer.

@@ -1987,6 +1987,16 @@ abstract public class BaseFPSAgentController {
actionFinished(true);
}

protected IEnumerator waitForSecondsRealtime(int seconds) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't see this being referenced anywhere

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's being used by the new Sleep action which is used by the new tests that check that make sure the timeout code works properly.

@AlvaroHG
Copy link
Collaborator

A couple of comments this is great! LGTM after that

@Lucaweihs Lucaweihs merged commit 79e92c5 into nanna Jul 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants