Remove document cache after a certain inactivity #1494

Open
fflorent opened this issue Mar 3, 2025 · 5 comments
Labels
enhancement New feature or request gouv.fr

Comments

@fflorent
Collaborator

fflorent commented Mar 3, 2025

Describe the problem to be solved

Currently, in a multi-worker architecture, the /persist folder fills up with local copies of the documents that have been loaded. This cache is only removed after a doc worker restart.

Describe the solution you would like

It would be neat if, once a document has been closed and a certain period of inactivity has passed, its local copy in /persist were deleted.

If there is some activity in the meantime (that is, after the document has been closed), the document is loaded again and the timer for wiping the local cache is cancelled.
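As a rough illustration of the requested behaviour, here is a minimal TypeScript sketch of a per-document wipe timer. Every name in it (`LocalCacheWiper`, `onDocClosed`, `onDocOpened`, the `wipeLocalCopy` callback) is hypothetical and does not come from the Grist codebase; it only assumes that something can notify it when a document is closed or loaded again.

```typescript
// Hypothetical sketch: these names are illustrative, not Grist APIs.
type TimerHandle = ReturnType<typeof setTimeout>;

class LocalCacheWiper {
  // One pending wipe timer per closed document.
  private _timers = new Map<string, TimerHandle>();
  private _inactivityMs: number;
  // Assumed callback that deletes the local copy of a document in /persist.
  private _wipeLocalCopy: (docId: string) => void;

  constructor(inactivityMs: number, wipeLocalCopy: (docId: string) => void) {
    this._inactivityMs = inactivityMs;
    this._wipeLocalCopy = wipeLocalCopy;
  }

  // Call when a document has been closed: start the inactivity countdown.
  public onDocClosed(docId: string): void {
    this.onDocOpened(docId);  // never keep two timers for the same document
    const timer = setTimeout(() => {
      this._timers.delete(docId);
      this._wipeLocalCopy(docId);
    }, this._inactivityMs);
    this._timers.set(docId, timer);
  }

  // Call when the document is loaded again: cancel the pending wipe.
  public onDocOpened(docId: string): void {
    const timer = this._timers.get(docId);
    if (timer !== undefined) {
      clearTimeout(timer);
      this._timers.delete(docId);
    }
  }

  // True while a wipe is scheduled for this document.
  public hasPendingWipe(docId: string): boolean {
    return this._timers.has(docId);
  }
}
```

In this sketch the timers live outside the document object, so they do not depend on the document instance staying alive after it is closed.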

@paulfitz
Member

paulfitz commented Mar 3, 2025

The "Housekeeper" might be a natural place for this code to live:

/**
 * Take care of periodic tasks:
 *
 * - deleting old soft-deleted documents
 * - deleting old soft-deleted workspaces
 * - logging metrics
 *
 * Call start(), keep the object around, and call stop() when shutting down.
 *
 * Some care is taken to elect a single server to do the housekeeping, so if there are
 * multiple home servers, there will be no competition or duplication of effort.
 */
export class Housekeeper {

It may be simpler than other housekeeping tasks, since it is scoped to an individual doc worker.
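To make the idea of a periodic task scoped to one doc worker more concrete, here is a minimal TypeScript sketch of a single sweep pass. It is not the Housekeeper's actual code: `LocalCopyInfo`, `selectStaleCopies`, `sweepStaleCopies` and the `removeLocalCopy` callback are all hypothetical, and the sketch assumes the worker can somehow list its local copies along with when each was last closed.

```typescript
// Hypothetical record a doc worker could keep for each local copy in /persist.
interface LocalCopyInfo {
  docId: string;
  isOpen: boolean;         // true while the document is currently loaded
  lastClosedAtMs: number;  // when it was last closed (epoch milliseconds)
}

// Pick the documents that are closed and have been idle for at least maxIdleMs.
function selectStaleCopies(
  copies: LocalCopyInfo[], nowMs: number, maxIdleMs: number
): string[] {
  return copies
    .filter((c) => !c.isOpen && nowMs - c.lastClosedAtMs >= maxIdleMs)
    .map((c) => c.docId);
}

// One sweep pass: remove every stale local copy and report what was removed.
// removeLocalCopy is an assumed callback that deletes the file from /persist.
function sweepStaleCopies(
  copies: LocalCopyInfo[],
  nowMs: number,
  maxIdleMs: number,
  removeLocalCopy: (docId: string) => void
): string[] {
  const stale = selectStaleCopies(copies, nowMs, maxIdleMs);
  for (const docId of stale) {
    removeLocalCopy(docId);
  }
  return stale;
}
```

A periodic task could call `sweepStaleCopies` at a fixed interval, passing `Date.now()` as `nowMs`. Since each doc worker only looks at its own local copies, no election between servers would be needed in this sketch.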

@fflorent
Collaborator Author

fflorent commented Mar 4, 2025

I thought of this place as well. But I have rather considered adding a timer in ActiveDoc for these reasons:

  • a timer already exists that closes the document after a period of inactivity; I thought of adding another one, which would be enabled once the document has been closed for inactivity;
  • instead of letting each document worker inspect its documents and wipe the unused cache, the document would remove its own cache, just like it closes itself.

I hope that makes sense; otherwise I would be glad to hear your feedback :).

@paulfitz
Member

paulfitz commented Mar 4, 2025

I have rather considered adding a timer in ActiveDoc

Hmm, you'll need to work carefully. When a document is unused for some time, its ActiveDoc currently shuts down to recover RAM, which means that the ActiveDoc no longer exists. That would be a bit early to wipe the file from disk in my opinion, since it would be common for a document to be visited, say, once a day. You do want to reclaim RAM in that scenario, but reclaiming disk space is arguable, given that recovery from S3 can take several seconds (depending).

It is possible to add a new state to ActiveDoc, but it will need a good mental model to avoid breaking interactions with DocManager and other classes. Definitely doable but be prepared for adventures.
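The distinction between reclaiming RAM early and reclaiming disk space much later can be sketched as two separate idle thresholds. The TypeScript below is only an illustration under assumed names (`CacheStage`, `IdleThresholds`, `stageForIdleTime`), and the threshold values in the usage note are made up for the example rather than taken from Grist.

```typescript
// Hypothetical stages for a document on one doc worker.
type CacheStage = "in-memory" | "on-disk-only" | "wiped";

interface IdleThresholds {
  closeAfterMs: number;  // idle time before the document is closed to recover RAM
  wipeAfterMs: number;   // idle time before the local copy is removed from disk
}

// Decide which stage a document should be in after idleMs of inactivity.
function stageForIdleTime(idleMs: number, t: IdleThresholds): CacheStage {
  if (t.wipeAfterMs <= t.closeAfterMs) {
    throw new Error("wipeAfterMs must be greater than closeAfterMs");
  }
  if (idleMs < t.closeAfterMs) {
    return "in-memory";
  }
  if (idleMs < t.wipeAfterMs) {
    return "on-disk-only";
  }
  return "wiped";
}
```

With illustrative thresholds of a few minutes for `closeAfterMs` and two days for `wipeAfterMs`, a document visited once a day would sit in the `"on-disk-only"` stage between visits, so reopening it would not require fetching it from external storage again.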

@paulfitz
Member

paulfitz commented Mar 4, 2025

The bulk of RAM is held by the sandbox associated with the ActiveDoc, so keeping ActiveDocs around after their sandboxes shut down is certainly not ruled out; it would just be a significant change.

@fflorent
Collaborator Author

fflorent commented Mar 4, 2025

Oh, I see: a shutdown unregisters the ActiveDoc from the DocManager, which means, if I am correct, that reopening the document afterwards would create a new instance of ActiveDoc.

I thought the ActiveDoc instance would be reused.
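The lifecycle described here can be illustrated with a deliberately simplified TypeScript sketch. `SketchDoc` and `SketchDocRegistry` are hypothetical stand-ins, not the real ActiveDoc and DocManager classes; the sketch only shows why state held on a document instance, such as a timer, would not survive a shutdown.

```typescript
// Hypothetical stand-in for a per-document object (not the real ActiveDoc).
class SketchDoc {
  public readonly docId: string;

  constructor(docId: string) {
    this.docId = docId;
  }
}

// Hypothetical stand-in for a registry of open documents (not the real DocManager).
class SketchDocRegistry {
  private _docs = new Map<string, SketchDoc>();

  // Return the registered instance, or create and register a new one.
  public open(docId: string): SketchDoc {
    let doc = this._docs.get(docId);
    if (!doc) {
      doc = new SketchDoc(docId);
      this._docs.set(docId, doc);
    }
    return doc;
  }

  // Shutting down unregisters the instance, so it is never handed out again.
  public shutdown(docId: string): void {
    this._docs.delete(docId);
  }

  public isOpen(docId: string): boolean {
    return this._docs.has(docId);
  }
}
```

In this model, opening a document twice while it is registered returns the same object, but opening it after `shutdown` returns a fresh one, so anything the old instance was tracking would have to be kept somewhere else.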

Now using Housekeeper makes total sense to me. Thank you @paulfitz!

Projects
Status: In Progress
Development

No branches or pull requests

2 participants