Expose config option to not persist SessionPool and Statistics #1789

Closed
metalwarrior665 opened this issue Feb 14, 2023 · 2 comments · Fixed by #2213
Labels
feature Issues that represent new features or improvements to existing features. t-tooling Issues with this label are in the ownership of the tooling team.

Comments

@metalwarrior665
Member

Which package is the feature request for? If unsure which one to select, leave blank

None

Feature

This is a recurring issue: people don't want extra files to be added or extra Apify API calls to be triggered by the persistence of SessionPool and Statistics. We should probably have flags to easily disable these; currently you have to hack the internal APIs.

Motivation

as above

Ideal solution or implementation, and any additional constraints

...

Alternative solutions or implementations

No response

Other context

No response

@metalwarrior665 metalwarrior665 added the feature Issues that represent new features or improvements to existing features. label Feb 14, 2023
@B4nan
Member

B4nan commented Feb 14, 2023

So you'd like to disable this on the Apify platform, right? With memory storage you can already disable persistence completely and keep things only in memory, but this should be more fine-grained?

Can you maybe provide some of the hackish solutions you can use right now?

@metalwarrior665
Member Author

metalwarrior665 commented Feb 14, 2023

Yeah, the reason would be to make it more fine-grained for cases where you e.g. want to persist queue and session but not stats (example).

I was using

import { Statistics } from 'crawlee';
// Workaround: replace the internal persistState method with a no-op so no stats are written.
Statistics.prototype.persistState = () => Promise.resolve();
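
For readers unfamiliar with this kind of prototype patch, here is a minimal self-contained sketch of the same idea. `FakeStatistics` and `disablePersistence` are hypothetical stand-ins made up for illustration; they are not part of crawlee and only mimic the shape of a `persistState()` method.

```typescript
// Hypothetical stand-in for a class that persists its own state.
// It is NOT crawlee's Statistics; it only mimics the persistState() shape.
class FakeStatistics {
    // Records every simulated write so the effect of the patch is observable
    static writes: string[] = [];
    requestsFinished = 0;

    async persistState(): Promise<void> {
        FakeStatistics.writes.push(JSON.stringify({ requestsFinished: this.requestsFinished }));
    }
}

// Replace persistState with a no-op for all instances and
// return a function that restores the original method.
function disablePersistence(): () => void {
    const original = FakeStatistics.prototype.persistState;
    FakeStatistics.prototype.persistState = () => Promise.resolve();
    return () => {
        FakeStatistics.prototype.persistState = original;
    };
}
```

Because the patch lives on the prototype, it affects every instance in the process, including ones created by other crawlers. That side effect is presumably part of why a per-instance flag is the nicer solution.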

The use case was that I was running a helper crawler in the background and didn't want it to store any extra files. Another option would be to allow disabling the stats altogether (no log, no persist).

The last trigger for this was a user complaining that they had too many KV writes after running tons of small actors, because of the persisting :)

Nothing super important, but I wanted to think about how to kill 2 birds (local and platform concerns) with 1 stone.

@mtrunkat mtrunkat added the t-tooling Issues with this label are in the ownership of the tooling team. label Sep 12, 2023
@foxt451 foxt451 self-assigned this Nov 28, 2023
B4nan added a commit that referenced this issue Dec 20, 2023
Allow disabling automatic persistence in Statistics and SessionPool.
Add additional methods for manual disabling in case it's needed, but
just not the automatic one
Closes #1789

---------

Co-authored-by: Martin Adámek <banan23@gmail.com>
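
The behaviour described in the commit message (automatic persistence can be switched off per component while manual persistence stays available) could be modelled roughly as below. This is only a sketch under assumptions: `PersistenceOptions`, `MemoryStore`, `PersistableCounter` and `onPersistTick` are hypothetical names invented here, not the API that the fix introduced.

```typescript
// Hypothetical option shape; the name and field are illustrative assumptions.
interface PersistenceOptions {
    enable?: boolean;
}

// Minimal in-memory store standing in for a key-value store
class MemoryStore {
    readonly records = new Map<string, string>();
    writeCount = 0;

    set(key: string, value: string): void {
        this.records.set(key, value);
        this.writeCount += 1;
    }
}

// Hypothetical component with both automatic and manual persistence
class PersistableCounter {
    private count = 0;
    private readonly key: string;
    private readonly store: MemoryStore;
    private readonly autoPersist: boolean;

    constructor(key: string, store: MemoryStore, options: PersistenceOptions = {}) {
        this.key = key;
        this.store = store;
        // Automatic persistence stays on unless explicitly disabled
        this.autoPersist = options.enable ?? true;
    }

    increment(): void {
        this.count += 1;
    }

    // Stands in for the periodic persistence event
    onPersistTick(): void {
        if (this.autoPersist) this.persistState();
    }

    // Manual persistence works regardless of the flag
    persistState(): void {
        this.store.set(this.key, JSON.stringify({ count: this.count }));
    }
}

// Persist sessions automatically, but not stats
const store = new MemoryStore();
const sessions = new PersistableCounter('SESSIONS', store);
const stats = new PersistableCounter('STATS', store, { enable: false });
sessions.increment();
stats.increment();
sessions.onPersistTick();
stats.onPersistTick();
```

Keeping the flag per component matches the fine-grained request in the thread: the queue and sessions can keep persisting while stats writes are skipped, with no global patching.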
4 participants