Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leaking goroutines through model and leveldb #5505

Closed
imsodin opened this issue Feb 3, 2019 · 6 comments

Comments

Projects
None yet
3 participants
@imsodin
Copy link
Member

commented Feb 3, 2019

While running a model test with -count 1000 I ran into OOM. Looking at a stacktrace I found these culprits:

    340 github.com/syncthing/syncthing/lib/model.(*ProgressEmitter).Serve
    340 github.com/syndtr/goleveldb/leveldb.(*DB).compactionError
    340 github.com/syndtr/goleveldb/leveldb.(*DB).mCompaction
    340 github.com/syndtr/goleveldb/leveldb.(*DB).mpoolDrain
    340 github.com/syndtr/goleveldb/leveldb.(*DB).tCompaction
    340 github.com/syndtr/goleveldb/leveldb/util.(*BufferPool).drain

The progress emitter is straight forward: It should be added to the supervisor not manually run in a goroutine. The leveldb stuff is because we do not close the the database, both in tests and in cmd/syncthing's main. In main the database is manipulated before passing it to the model, but not concurrently with the model. So there's two options

  1. Close the db on model.Close: Nice because we don't need to think about it in all the tests.
  2. Close the db where it is created, meaning in cmd/syncthing's main and every single test that uses a db. The "cleaner" approach because the same function that opens it also closes it but it's tedious.

Opinions? @AudriusButkevicius @calmh

@imsodin imsodin added the bug label Feb 3, 2019

@imsodin imsodin self-assigned this Feb 3, 2019

@AudriusButkevicius

This comment has been minimized.

Copy link
Member

commented Feb 3, 2019

I think the db should be closed where it was opened.

@imsodin

This comment has been minimized.

Copy link
Member Author

commented Feb 4, 2019

Actually, looking at your locations package, I wan't to open and close the db in model - main has no business handling the database. And locations allows for that to happen.

@calmh

This comment has been minimized.

Copy link
Member

commented Feb 4, 2019

I don't think the model package should carry the responsibility for opening, upgrading, etc the database. If nothing else that's quite late in the startup, and it's least one thing that isn't currently under the jurisdiction of model. I think open+defer-close, in main and in tests.

@imsodin

This comment has been minimized.

Copy link
Member Author

commented Feb 4, 2019

Ok, got it - also db.OpenMemory for tests...

@imsodin

This comment has been minimized.

Copy link
Member Author

commented Feb 4, 2019

And obviously this is more complicated than I thought: leveldb.DB panics when any function is called on it after Close (except Close again). It's in no way assured that after model.Stop there won't be any such calls (exported functions on Model, go sendIndexes, probably more). Now I could add some closed channel to Model and try to check for that wherever necessary, but that's both tedious and error prone. A bit less error prone but just as or even more tedious would be to make leveldb.DB a named member of db.Lowlevel and add such a close condition on db.Lowlevel, which then protects all functions on leveldb.DB. And all of that just for tests, as otherwise we never create multiple models.
Any ideas for elegant solutions?

imsodin added a commit to imsodin/syncthing that referenced this issue Apr 16, 2019

imsodin added a commit to imsodin/syncthing that referenced this issue Apr 28, 2019

imsodin added a commit to imsodin/syncthing that referenced this issue Apr 28, 2019

@imsodin imsodin closed this in fe4daf2 May 2, 2019

@calmh calmh added this to the v1.1.3 milestone May 8, 2019

@calmh

This comment has been minimized.

Copy link
Member

commented May 10, 2019

I'm removing the bug tag (for release notes) because as far as I understand this wasn't an issue in actual production (since we don't restart the relevant components there).

@calmh calmh removed the bug label May 10, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.