Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #800 - Invalid Command Response on Database Outage #1009

Conversation

benjamin-lawson
Copy link
Contributor

Closes #800.

@Askaholic
Copy link
Collaborator

Awesome, thanks for the PR! Sorry if it takes me a little while to get to reviewing it properly, but at first glance it looks great!

@benjamin-lawson
Copy link
Contributor Author

Awesome, thanks for the PR! Sorry if it takes me a little while to get to reviewing it properly, but at first glance it looks great!

No worries at all! Just wanted to give you a heads up. I'll try to look at some of the other open issues.

Copy link
Collaborator

@Askaholic Askaholic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for you patience. Sorry it took me a while to get to this. I finally replaced the battery on my laptop so now I can use it without having to be plugged in all the time which makes me significantly more likely actually open it up haha.

@@ -216,6 +216,15 @@ async def on_message_received(self, message):
self.get_user_identifier(),
cmd
)
except OperationalError:
# When the database goes down, SqlAlchemy will throw an OperationalError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we still want to log the exception here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can go ahead and add logging here. I was avoiding it only because if the DB is truly down, it could cause a lot of logs to the logging service to be ingested due to failed commands. But if we feel like the logging service can handle that volume without issue, I have no qualms about adding it in.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good question. I'm not sure it's every happened before, but I think it's safer to have it log something even if it ends up spamming a lot of messages than to have it error silently and require users to start complaining before we can see that there's an issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too easy. I've added that logging in the most recent version.

Comment on lines +223 to +226
"style": "error",
"text": "Unable to connect to database. Please try again later."
})
# Make sure to abort here to avoid a thundering herd problem.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to use the kick style to prevent the client from trying to automatically reconnect. @Sheikah45?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that's what https://github.com/FAForever/server/pull/1009/files#diff-de94292bdcec0d8f98b8eb76040f97830866bed0c1fb5b55165a9e54f175ea0bR227 would do for the client, but if that is not the case, we for sure need to send the client a "Stop and don't retry" message.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, there's currently no good system for telling the client that it should not attempt to reconnect. I believe the kick style message is all we have. It might be cleaner to add a dedicated message for that but that would be for another PR and also require client changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think https://github.com/FAForever/downlords-faf-client/blob/ca8554d8767e748498dee37caa39cf3f657284a6/src/main/java/com/faforever/client/remote/FafServerAccessor.java#L115 could be an issue since the logic is if we were connected and suddenly were disconnected, retry the login and connect. Net new connections should be okay since it will be going from connecting to disconnected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main problem with the kick style is that it will also close the client. If this is desired behavior then sure but otherwise it shouldn't be kick.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sheikah45 @Askaholic I think this should work for what we want then. This follows the same pattern as if someone were banned where it sends a message and aborts the connection.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, we'll leave it like this then.

@Askaholic Askaholic merged commit 6312b1e into FAForever:develop Apr 20, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Command invalid when db is down
3 participants