Very long startup with badgerds #8034

Closed
RubenKelevra opened this issue Mar 29, 2021 · 4 comments · Fixed by #8040
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@RubenKelevra
Contributor

RubenKelevra commented Mar 29, 2021

I noticed that go-ipfs can take quite a while to start up when using badgerds. By default, the systemd service files allow a 2-minute startup time. I've raised this to 15 minutes, and the service still has a hard time starting within that limit.

I recommended a while ago that the startup timeout be extended, since badgerds sometimes takes a while to start up, and since the --migration step might be interrupted when the service is stopped after 2 minutes.

I don't think it's good practice to ship a systemd unit file that might interrupt the --migration step after an update, or cleanup operations on badgerds, leaving the service in need of manual intervention.

Sending systemd a 'startup completed' signal early, on the other hand, might break other services that wait for this service to be ready for network operations, so we can't just send a ready signal and keep fiddling around with the database...

The right solution would be to extend the startup timeout, as well as the shutdown timeout, by sending EXTEND_TIMEOUT_USEC=… to systemd while a database operation is still in progress. This ensures systemd doesn't kill the daemon prematurely.

Maybe we should also write a notice to the log whenever this signal is sent, to make sure the administrator is aware of what is going on.

I've captured a stack trace of such a startup with kill -SIGABRT after 222 seconds of runtime:

long_startup.log

Version information:

go-ipfs version: 0.9.0-dev-3f9c3f455
Repo version: 11
System version: amd64/linux
Golang version: go1.16.2

@RubenKelevra added the kind/bug and need/triage labels Mar 29, 2021
@RubenKelevra
Contributor Author

#6741 describes a similar problem, but that one seems to be caused by a flatfs limitation.

@RubenKelevra
Contributor Author

#7269 is similar, but only covers database migrations. We also need this when opening a badger datastore.

@gammazero
Contributor

We should consider #7895 if/when addressing this issue.

Stebalien added a commit that referenced this issue Mar 31, 2021
@Stebalien removed the need/triage label Mar 31, 2021
@RubenKelevra
Contributor Author

Thanks for the quick workaround :)

3 participants