🪲 BUG-#33: Auto-clear UID cache after iteration and warn on large caches#53
Conversation
FernandoCelmer
left a comment
There was a problem hiding this comment.
🔍 Code Review
Code issues found: 1
See inline comment below.
| for fetched in fetcher.chunks(uids): | ||
| done = min(done + size, total) | ||
| logger.info( | ||
| "Fetched %d/%d messages from %s (mode=%s)", |
There was a problem hiding this comment.
[Blocking]
Problem — The clear_cache() call inside the finally block of messages() fires whenever the generator is closed, including partial consumption. This means calling first(), which iterates only one message and then abandons the generator, will trigger clear_cache() and set _cached_uids = None.
Failure scenario — Consider this common usage pattern:
where = app.inbox.where(Q.unseen())
count = where.count() # caches UIDs via _uids()
msg = where.first() # calls messages(chunk_size=1), returns 1st msg
# generator abandoned -> finally fires -> cache cleared
count2 = where.count() # _cached_uids is None -> re-queries IMAP server!Every call to first(), last(), or partial for loop iteration will destroy the cached UIDs, forcing a full IMAP SEARCH on the next access. This is a significant performance regression for interactive or polling use cases.
Fix — Instead of unconditionally clearing in finally, only clear the cache when the generator is fully exhausted (not abandoned):
def messages(self, ...):
uids = self._uids()
...
exhausted = False
try:
for fetched in fetcher.chunks(uids):
...
exhausted = True
finally:
if exhausted:
self.clear_cache()Alternatively, consider a TTL-based expiration or a max-size eviction policy instead of aggressive clearing.
Summary
messages()generator body intry/finallysoclear_cache()is called when iteration completes or the generator is closed/GC'd, preventing unbounded UID cache growth._uids()when the cached UID list exceeds 50,000 entries, advising the caller to consider pagination.Test plan
pytest tests/ -x -q)_cached_uidsisNoneafter fully iteratingmessages()Closes #33