
Introduce weak references #165

Merged
merged 6 commits into from Dec 3, 2022

Conversation

@mpenkov (Collaborator) commented Nov 26, 2022

Author is @Cologler

Continued from #153 because that branch was mistakenly deleted.

@mpenkov (Collaborator, Author) commented Nov 26, 2022

@piskvorky @Cologler Please eyeball this PR and let me know if you spot anything odd

@Cologler (Contributor) commented:

> @piskvorky @Cologler Please eyeball this PR and let me know if you spot anything odd

Looks Good To Me 👌.

[Review comment on sqlitedict.py (outdated, resolved)]
@Cologler (Contributor) commented Nov 26, 2022

What if the user does some long-running work in the loop?

for value in a_sqlite_dict.values():
    # long-running work here, e.g. an HTTP request
    ...

Co-authored-by: Radim Řehůřek <me@radimrehurek.com>
@mpenkov (Collaborator, Author) commented Nov 29, 2022

> What if the user does some long-running work in the loop?
>
>     for value in a_sqlite_dict.values():
>         # long-running work here, e.g. an HTTP request
>         ...

I think the response queue will continue to receive new items.

Basically, there is a producer/consumer problem here. The producer is the sqlite thread and the consumer is the user. We want to produce results quickly enough for the user to consume them without overwhelming the user. If we produce too quickly (or the user is too slow to consume) then the queue grows in size and occupies more memory.

I think one way to solve that particular problem is to use fixed-size queues for the responses. If a queue is full, then it means the user is consuming faster than we are producing, and we can produce slower.

@piskvorky What do you think? Is there a reason why we use queues of unlimited size?
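[Editor's note: not part of the original thread. A minimal sketch of the bounded-queue idea discussed above, using a plain `queue.Queue` with `maxsize` rather than sqlitedict's actual internals; the producer thread here stands in for the sqlite worker and the loop stands in for the user:]

```python
import queue
import threading

def produce(rows, q):
    # Stand-in for the sqlite worker thread: push result rows into the
    # queue. put() blocks whenever the queue is full, so a slow consumer
    # throttles the producer instead of the queue growing without limit.
    for row in rows:
        q.put(row)
    q.put(None)  # sentinel: no more rows

q = queue.Queue(maxsize=4)  # illustrative bound; a real value would be tuned
rows = list(range(100))
worker = threading.Thread(target=produce, args=(rows, q), daemon=True)
worker.start()

consumed = []
while True:
    item = q.get()  # the user's (possibly slow) consuming loop
    if item is None:
        break
    consumed.append(item)
worker.join()
print(consumed == rows)  # True
```

With `maxsize=4`, at most four undelivered rows ever sit in memory, regardless of how slowly the consumer loop runs.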

@piskvorky (Owner) commented Dec 1, 2022

> If a queue is full, then it means the user is consuming faster than we are producing, and we can produce slower.

Do you mean "consuming slower"?

Anyway, limiting the queue is technically easy, here:

https://github.com/RaRe-Technologies/sqlitedict/blob/8545a174a4a32bd2f8eb7c17d89899370138fe3a/sqlitedict.py#L623-L626

But I'm not sure about the implications. The code is sufficiently complex that a blocking put (due to "queue full") could lead to deadlocks, unless we think through all the code paths carefully.

What happens when the consumers don't get (for whatever reason – generator, exception…), so that the put blocks forever?
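[Editor's note: not part of the original thread. A minimal illustration of the failure mode raised here, using the standard-library `queue` module: if the consumer never drains a bounded queue, a plain `put()` blocks forever, whereas a `put()` with a timeout surfaces the stall as a `queue.Full` exception:]

```python
import queue

q = queue.Queue(maxsize=1)
q.put("first")  # fills the queue; no consumer ever calls get()

stalled = False
try:
    # A plain q.put("second") would block forever here; the timeout
    # turns the stall into an exception the producer can react to.
    q.put("second", timeout=0.1)
except queue.Full:
    stalled = True

print(stalled)  # True
```

A timeout does not by itself solve the problem, but it gives the producer thread a chance to detect an abandoned consumer instead of hanging.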

@mpenkov (Collaborator, Author) commented Dec 2, 2022

> Do you mean "consuming slower"?

Yes, sorry.

> Anyway, limiting the queue is technically easy, here:

Yes, as you point out, this is the easy part. The hard part is ensuring nothing else breaks.

> What happens when the consumers don't get (for whatever reason – generator, exception…), so that the put blocks forever?

Good question...

Anyway, I think this discussion is out of scope for this PR (but definitely helpful). I think we're done for now.

@mpenkov mpenkov merged commit 109e5ea into master Dec 3, 2022
@mpenkov mpenkov deleted the Cologler_feat-weakref-reader branch December 3, 2022 04:42
Successfully merging this pull request may close these issues.

- Interrupted iteration over a SqliteDict instance
- SqliteDict memory issue
3 participants