Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remote storage and cache - Huge amount of (useless ?) cache #35820

Open
1 of 5 tasks
Nuranto opened this issue Jul 26, 2022 · 16 comments · May be fixed by #37653
Open
1 of 5 tasks

Remote storage and cache - Huge amount of (useless ?) cache #35820

Nuranto opened this issue Jul 26, 2022 · 16 comments · May be fixed by #37653
Assignees
Labels
Area: Framework Component: Cache Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Priority: P2 A defect with this priority could have functionality issues which are not to expectations. Progress: PR in progress Reported on 2.4.x Indicates original Magento version for the Issue report. Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it

Comments

@Nuranto
Copy link
Contributor

Nuranto commented Jul 26, 2022

Summary (*)

We have L2 caching enabled. But I guess you can reproduce the issue with just setting up the cache backend to Filesystem.
When a S3 remote system is configured, a file cache is created for each single image (with tags flysystem and mage).

The result is that we crashed one of our server by launching bin/magento catalog:images:resize --async and bin/magento queue:consumers:start media.storage.catalog.image.resize, because we reached server's inode limits.

Proposed solution

I'm not sure if this cache is relevant, and it should probably be removed.
If I'm wrong and that cache is usefull for performance, then we should disable that caching on bulk operation like bin/magento catalog:images:resize

Additional Information

We have tried to debug this in detail and for me, the cache path is var/cache:

image

I have also observed that in order to save the cache it is using the below method from colinmollenhour
/
Cm_Cache_Backend_File:

https://github.com/colinmollenhour/Cm_Cache_Backend_File/blob/034bf73adfdc5b02057ae3ef2a2255b381f46944/File.php#L178-L205

And the $hash variable is empty in this method.

This function is calling when we have tried to debug the below command:

XDEBUG_CONFIG=idekey=PHPSTORM bin/magento remote-storage:sync


Please provide Severity assessment for the Issue as Reporter. This information will help during Confirmation and Issue triage processes.

  • Severity: S0 - Affects critical data or functionality and leaves users with no workaround.
  • Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
  • Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
  • Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
  • Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.
@Nuranto Nuranto added the Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it label Jul 26, 2022
@m2-assistant
Copy link

m2-assistant bot commented Jul 26, 2022

Hi @Nuranto. Thank you for your report.
To speed up processing of this issue, make sure that you provided the following information:

  • Summary of the issue
  • Information on your environment
  • Steps to reproduce
  • Expected and actual results

Make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, Add a comment to the issue:

@magento give me 2.4-develop instance - upcoming 2.4.x release

For more details, review the Magento Contributor Assistant documentation.

Add a comment to assign the issue: @magento I am working on this

To learn more about issue processing workflow, refer to the Code Contributions.


⚠️ According to the Magento Contribution requirements, all issues must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.

🕙 You can find the schedule on the Magento Community Calendar page.

📞 The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, join the Community Contributions Triage session to discuss the appropriate ticket.

✏️ Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

@Nuranto
Copy link
Contributor Author

Nuranto commented Jul 30, 2022

Additional informations

We tried to disable L2 cache, and use our dedicated-to-cache redis instance instead.

Here is what I can see after the image.resize is done :

127.0.0.1:6379> info memory
# Memory
used_memory:68741172840
used_memory_human:64.02G

64G of cache, for "only" 50K products and 229 869 entries in catalog_product_entity_media_gallery. With resizes, category and cms, multiple store views, we have a total of 5 663 932 files in our S3 instance.
(Of course, in the 64G, there is also other Magento's cache, but I think it is insignificant)

I guess that's not a good thing for performances especially since in our case, we use Minio as S3 remote engine which is in the same cluster, so a call to redis is probably not significantly faster than requesting Minio directly.

@Nuranto Nuranto changed the title Remote storage - File-based cache - Image data caching Remote storage and cache - Huge amount of (useless ?) cache Jul 30, 2022
@engcom-Hotel engcom-Hotel self-assigned this Aug 5, 2022
@m2-assistant
Copy link

m2-assistant bot commented Aug 5, 2022

Hi @engcom-Hotel. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: 👇

  • 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).

    DetailsIf the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please, edit issue description if needed, until label Issue: Format is valid appears.

  • 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.

  • 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.

  • 4. Verify that the issue is reproducible on 2.4-develop branch

    Details- Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!

  • 5. Add label Issue: Confirmed once verification is complete.

  • 6. Make sure that automatic system confirms that report has been added to the backlog.

@Nuranto Nuranto mentioned this issue Aug 9, 2022
5 tasks
@engcom-Hotel
Copy link
Contributor

Hello @Nuranto,

Thanks for the report and collaboration!

We have tried to reproduce the issue in Magento 2.4-develop but the issue is not reproducible for us. We have verified it in the var/cache folder.

We have followed the below steps:

  • Install the fresh Magento instance
  • Install Sample Data
  • Run the command bin/magento catalog:images:resize --async

Please let us know if we have missed anything.

Thanks

@engcom-Hotel engcom-Hotel added Issue: Cannot Reproduce Cannot reproduce the issue on the latest `2.4-develop` branch Issue: needs update Additional information is require, waiting for response labels Aug 24, 2022
@m2-community-project m2-community-project bot moved this from Ready for Confirmation to Needs Update in Issue Confirmation and Triage Board Aug 24, 2022
@m2-community-project m2-community-project bot removed Issue: Cannot Reproduce Cannot reproduce the issue on the latest `2.4-develop` branch Issue: ready for confirmation labels Aug 24, 2022
@Nuranto
Copy link
Contributor Author

Nuranto commented Aug 25, 2022

Hello @engcom-Hotel ,

Have you enabled & configured a remote storage (S3) during your test ?

@engcom-Hotel
Copy link
Contributor

Hello @Nuranto,

Yes, we have tried it with S3 enabled. But unable to reproduce the issue. Please refer to the below screenshot:

image

The below path doesn't have any changes:

var/cache

Please let us know if we have missed anything.

Thanks

@Nuranto
Copy link
Contributor Author

Nuranto commented Aug 29, 2022

Hello @engcom-Hotel,

Have you run the consumer ? bin/magento queue:consumers:start media.storage.catalog.image.resize

I think you can also run the resize command in synchronous mode, it should give the same result.

@engcom-Hotel
Copy link
Contributor

Hello @Nuranto,

Yes, we tried it with both Sync and Async commands. But the issue was not reproducible for us. Can you please suggest the path where this cache has been saved for you?

Thanks

@Nuranto
Copy link
Contributor Author

Nuranto commented Aug 31, 2022

Well, in my case I can see all that cache files with : ls -l /dev/shm/* |grep JPG

Each of this files look like this :

a:4:{s:4:"hash";s:0:"";s:5:"mtime";i:1661872995;s:6:"expire";i:1661880195;s:4:"tags";s:0:"";}
{"media\/catalog\/product\/w\/e\/we***.jpg":{"path":"media\/catalog\/product\/w\/e\/we***.jpg","dirname":"media\/catalog\/product\/w\/e","basename":"we***","extension":"jpg","filename":"we***","type":"file","size":86308,"timestamp":1646762270,"visibility":null,"mimetype":"image\/jpeg","extra":{"image-height":"600","image-width":"450"}}}

You can find the cache logic in Magento\RemoteStorage\Driver\Adapter\CachedAdapter, for example have a look at fileExists method, I guess that's where it gets generated.
See also Magento\RemoteStorage\Driver\Adapter\Cache\Generic::updateMetadata where you can see that a cache file is created on a image filepath basis.

However I don't understand why you cannot reproduce this. The $persist parameter of updateMetadata is hardcoded to true in fileExists so it should not be related to a configuration...

@engcom-Hotel
Copy link
Contributor

engcom-Hotel commented Oct 17, 2022

Hello @Nuranto,

I have tried to debug this in detail and for me, the cache path is var/cache:

image

I have also observed that in order to save the cache it is using the below method from colinmollenhour
/
Cm_Cache_Backend_File:

https://github.com/colinmollenhour/Cm_Cache_Backend_File/blob/034bf73adfdc5b02057ae3ef2a2255b381f46944/File.php#L178-L205

And the $hash variable is empty in this method.

This function is calling when we have tried to debug the below command:

XDEBUG_CONFIG=idekey=PHPSTORM bin/magento remote-storage:sync

Need your input on these findings.

Thanks

@Nuranto
Copy link
Contributor Author

Nuranto commented Oct 17, 2022

Hello @engcom-Hotel

Yes, in my case, as specified in my first message, we are using L2 caching, with basic configuration took from the docs, with this line :

'local_backend_options' => [
            'cache_dir' => '/dev/shm/'
        ],

But I don't think the cache location matters for this issue.

For the $hash being empty, I'm also not sure if this is relevant. In my previous message, you can see that this data is also an empty string : s:4:"hash";s:0:"".

But if you're looking there, it means you managed to reproduce the caching behaviour of 1 cache file per image ?

To me the whole caching of remote file's metadatas is an error that should be either removed, refactored or make as optionnal (I guess that can be usefull with few images, or when the distant S3 server is slow).

@engcom-Hotel
Copy link
Contributor

Hello @Nuranto,

Thanks for the reply!

Yes I can manage to reproduce the issue that the caching behavior of 1 cache file per image by running the below command:

bin/magento remote-storage:sync

But as I told you that this has been due to colinmollenhour
/
Cm_Cache_Backend_File repo. But we can prevent this behavior by passing false in the below function call:

$this->cache->updateMetadata($path, $cacheEntry, true);

Please suggest.

Thanks

@Nuranto
Copy link
Contributor Author

Nuranto commented Oct 24, 2022

Hello @engcom-Hotel ,

I don't understand why you think the issue is in colinmollenhour. The redis extension does just what it is asked for : save a cache entry for each remote file .

But I agree with your fix proposal : we should not persist this cache.

@engcom-Hotel
Copy link
Contributor

Hello @Nuranto,

Thanks for your reply!

As per the process, we are moving this ticket for a fix. Hence confirming the issue.

Thanks

@engcom-Hotel engcom-Hotel added Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Component: Cache Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Priority: P2 A defect with this priority could have functionality issues which are not to expectations. labels Oct 25, 2022
@m2-community-project m2-community-project bot added this to Dev In Progress in High Priority Backlog Oct 25, 2022
@engcom-Hotel engcom-Hotel added Reported on 2.4.x Indicates original Magento version for the Issue report. and removed Progress: dev in progress labels Oct 25, 2022
@github-jira-sync-bot
Copy link

✅ Jira issue https://jira.corp.adobe.com/browse/AC-6907 is successfully created for this GitHub issue.

@m2-community-project m2-community-project bot moved this from Dev In Progress to Ready for Development in High Priority Backlog Oct 25, 2022
@m2-community-project m2-community-project bot moved this from Ready for Development to Dev In Progress in High Priority Backlog Oct 25, 2022
@m2-assistant
Copy link

m2-assistant bot commented Oct 25, 2022

✅ Confirmed by @engcom-Hotel. Thank you for verifying the issue.
Issue Available: @engcom-Hotel, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

@m2-community-project m2-community-project bot moved this from Dev In Progress to Ready for Development in High Priority Backlog Oct 25, 2022
@engcom-Hotel engcom-Hotel removed the Issue: needs update Additional information is require, waiting for response label Oct 25, 2022
@Nuranto Nuranto linked a pull request Jun 22, 2023 that will close this issue
5 tasks
@m2-community-project m2-community-project bot moved this from Ready for Development to Pull Request In Progress in High Priority Backlog Jun 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Framework Component: Cache Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Priority: P2 A defect with this priority could have functionality issues which are not to expectations. Progress: PR in progress Reported on 2.4.x Indicates original Magento version for the Issue report. Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it
Projects
High Priority Backlog
  
Pull Request In Progress
Development

Successfully merging a pull request may close this issue.

3 participants