Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed cache flush exporting files #3477

Conversation

fsilvestro
Copy link
Contributor

@fsilvestro fsilvestro commented Dec 29, 2021

The Writer class, in the write function, is flushing the cache at the end of its process.
Using the illuminate cache driver, this will result in flushing the entire cache, losing:

  • All the keys stored by other jobs/processes.
  • Other potential export jobs.

1️⃣ Why should it be added? What are the benefits of this change?
The benefit is to allow multiple exports at the same time, without losing any job/keys stored in the cache.

2️⃣ Does it contain multiple, unrelated changes? Please separate the PRs out.
No

3️⃣ Does it include tests, if possible?
Yes

4️⃣ Any drawbacks? Possible breaking changes?
No

5️⃣ Mark the following tasks as done:

  • Checked the codebase to ensure that your feature doesn't already exist.
  • Take note of the contributing guidelines.
  • Checked the pull requests to ensure that another person hasn't already submitted a fix.
  • Added tests to ensure against regression.
  • Updated the changelog

6️⃣ Thanks for contributing! 🙌

@patrickbrouwers
Copy link
Member

Hey thanks for your PR. I don't think it's a good idea to disable the entire flushing as that could clog up the cache driver. Maybe good to see if we could do something with a key and only clear that queue? Would you be up to PR'ing that?

@patrickbrouwers
Copy link
Member

Another, perhaps better, option could be to allow to specify a specific illuminate cache store. That way your could have a separate cache for just excel, that way flushing it wouldn't break anything.

@fsilvestro
Copy link
Contributor Author

fsilvestro commented Dec 30, 2021

Yes, I agree that disabling the entire flush isn't ideal. Still I think it's better to force the developers to manage themselves the flushing rather than not being able to have multiple users exporting a file.
Take this scenario as an example: User A is exporting a file, user B after 3 seconds tries to export another file. If the cache driver is illuminate, the jobs related to one of the 2 users will be flushed.
Basically the net result is that this library, configured with the illuminate cache driver, is not supporting more than 1 export at the time.
I'll have a look in January if it is possible to update this PR to use different cache stores.

@patrickbrouwers
Copy link
Member

I guess every export should hold a unique cache key then

@fsilvestro
Copy link
Contributor Author

Yes, that should be the idea. The only concern is that the cache prefix is different based on the cache driver.
As an example the prefix for people using horizon is configured in horizon.prefix while the "normal" cache prefix is in here cache.prefix.
Maybe I'm missing something but I still think the cache flush at the end of the export is not needed.
Can you please explain what are the entries needed to be flushed?
As far as I can see doing a simple keys * (i'm using redis as cache driver), when an export finishes, the items present in the cache are only the ones related to the jobs executed (AppendQueryToSheet and CloseSheet for example).
They will be trimmed when they will expire based on the app configurations.
On a side note, it can be useful to see the jobs there until they are trimmed to get the average export time for each queued job.

Please let me know your thoughts.
Thanks

@patrickbrouwers
Copy link
Member

From the phpsreadsheet docs:

As opposed to common cache concept, PhpSpreadsheet data cannot be re-generated from scratch. If some data is stored and later is not retrievable, PhpSpreadsheet will throw an exception.

That means that the data stored in cache must not be deleted by a third-party or via TTL mechanism.

So be sure that TTL is either de-activated or long enough to cover the entire usage of PhpSpreadsheet.

That means if we don't flush it at the end of the process, the cache will clog up; because it will add a cache key for every cell, which adds up on big sheets (which the caching feature is intended for)

@fsilvestro
Copy link
Contributor Author

I was under the same impression. However i tried to delete this line locally: WriterFlush and all the keys having phpsreadsheet as prefix were flushed.
I got the prefix here: PhpSpreadsheetPrefix.
My feeling is that all the keys managed by PhpSpreadsheet are manually deleted by them regardless. The only keys added by LaravelExcel are the ones related to the jobs dispatched.
Give it a try with a fresh simple queued export example and please let me know if you can see something different.
Thanks

@patrickbrouwers
Copy link
Member

@fsilvestro
Copy link
Contributor Author

Correct. That's why I couldn't understand what keys the library is trying to flush. Unless I'm missing something, it is just the keys related to the dispatched jobs.

@patrickbrouwers
Copy link
Member

Can you update the PR to remove the flush completely? (So no config setting needed)

@fsilvestro fsilvestro changed the title Added flush_writing configuration Removed cache flush exporting files Jan 4, 2022
@fsilvestro
Copy link
Contributor Author

PR updated just now.
Also changed the PR title and description to match the new goal.

@patrickbrouwers patrickbrouwers merged commit adb273c into SpartnerNL:3.1 Jan 4, 2022
@patrickbrouwers
Copy link
Member

Thanks 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants