gatsby-source-filesystem Race Issue copying Files #27984

Closed
notjosh opened this issue Nov 12, 2020 · 12 comments
Labels
stale? Issue that may be closed soon due to the original author not responding any more.
topic: source-plugins Relates to the Gatsby source plugins (e.g. -filesystem)
type: bug An issue or pull request relating to a bug in Gatsby

Comments

notjosh commented Nov 12, 2020

Description

This issue is a somewhat "hard-ish to reproduce, but seems clear to explain" issue with gatsby-source-filesystem when repeatedly copying the same file many times as part of a build. I think the logs exhibit the issue clearly enough to make sense of it though.

I'm using gatsby-remark-relative-images to collect the images. The pages have references to an "OS/platform" (macos, linux, windows), with an icon used for each. As a result, I end up with a lot of operations for gatsby-source-filesystem to process on the same images.

The core of the issue is: Gatsby will synchronously check if a file exists, and then asynchronously perform the file copy (ref). Occasionally it gets into a state where the existsSync() is false, so it tries to copy in parallel with another queued task, and it will fail with a fairly vague error about a file not existing.
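
To make the interleaving concrete, here is a minimal sketch of the check-then-copy pattern. It is not the plugin's actual code: `copyIfMissing`, `demoRace`, and the `stats` counter are hypothetical names, and Node's built-in `fs.promises.copyFile` stands in for the `fs.copy` call from fs-extra. Both callers run their `existsSync()` check before either copy has written the destination, so both start a copy.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper mirroring the pattern described above:
// a synchronous existence check followed by an asynchronous copy.
async function copyIfMissing(src, dest, stats) {
  if (!fs.existsSync(dest)) {
    stats.started += 1; // count how many copies were actually started
    await fs.promises.mkdir(path.dirname(dest), { recursive: true });
    // Stand-in (assumption) for fs-extra's fs.copy
    await fs.promises.copyFile(src, dest);
  }
}

async function demoRace() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "race-demo-"));
  const src = path.join(dir, "icon.svg");
  const dest = path.join(dir, "public", "icon.svg");
  fs.writeFileSync(src, "<svg/>");

  const stats = { started: 0 };
  // Each call runs synchronously up to its first await, so both
  // existence checks happen before any copy has created `dest`.
  await Promise.allSettled([
    copyIfMissing(src, dest, stats),
    copyIfMissing(src, dest, stats),
  ]);
  return stats.started;
}
```

Calling `demoRace()` resolves to 2: the same file is copied twice in parallel. Whether the overlapping copies then fail (as with the `chmod` ENOENT reported here) depends on the copy implementation and timing, which this sketch does not try to reproduce.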

Steps to reproduce

  1. Have many references to the same file for gatsby-source-filesystem to copy
  2. Run a build
  3. Build fails

Expected result

The build should succeed, and should only attempt to copy each file once.

Actual result

The build fails with a fairly vague error about the file not existing:

$ rm -rf .cache public && node node_modules/.bin/gatsby build
success open and validate gatsby-configs - 0.261s
success load plugins - 0.506s
success onPreInit - 0.021s
success delete html and css files from previous builds - 0.002s
success initialize cache - 0.003s
success copy gatsby files - 0.022s
success onPreBootstrap - 0.009s
success createSchemaCustomization - 0.118s
success Checking for changed pages - 0.000s
success source and transform nodes - 0.495s
warning Plugin `gatsby-plugin-ts-config` has customized the built-in Gatsby GraphQL type `Site`. This is allowed, but could potentially cause conflicts.
success building schema - 0.229s
info Total nodes: 211, SitePage nodes: 52 (use --verbose for breakdown)
success createPages - 0.111s
success Checking for changed pages - 0.000s
success createPagesStatefully - 0.025s
success update schema - 0.021s
success onPreExtractQueries - 0.000s
success extract queries from components - 3.029s
success write out redirect data - 0.001s
success Build manifest and related icons - 0.144s
success onPostBootstrap - 0.144s
info bootstrap finished - 6.509s
success run static queries - 0.058s - 8/8 137.84/s
error [object Object] ENOENT: no such file or directory, chmod '/Users/joshua/dev/work/notlmk/tldr.games/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg'


  Error: ENOENT: no such file or directory, chmod '/path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg'

not finished run page queries - 0.294s

I added some logging locally to gatsby-source-filesystem/src/extend-file-node.js to show when it's processing files, and dump more context when there's an error.

$ rm -rf .cache public && node node_modules/.bin/gatsby build
success open and validate gatsby-configs - 0.250s
success load plugins - 0.481s
success onPreInit - 0.022s
success delete html and css files from previous builds - 0.002s
success initialize cache - 0.003s
success copy gatsby files - 0.023s
success onPreBootstrap - 0.009s
success createSchemaCustomization - 0.111s
success Checking for changed pages - 0.000s
success source and transform nodes - 0.489s
warning Plugin `gatsby-plugin-ts-config` has customized the built-in Gatsby GraphQL type `Site`. This is allowed, but could potentially cause conflicts.
success building schema - 0.227s
info Total nodes: 211, SitePage nodes: 52 (use --verbose for breakdown)
success createPages - 0.108s
success Checking for changed pages - 0.000s
success createPagesStatefully - 0.027s
success update schema - 0.020s
success onPreExtractQueries - 0.000s
success extract queries from components - 3.017s
success write out redirect data - 0.001s
success Build manifest and related icons - 0.143s
success onPostBootstrap - 0.144s
info bootstrap finished - 6.460s
success run static queries - 0.061s - 8/8 132.03/s
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-nintendo.svg -> /path/to/project/public/static/4dedfd96921e29686f53982a636e0bad/platform-nintendo.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-nintendo.svg -> /path/to/project/public/static/4dedfd96921e29686f53982a636e0bad/platform-nintendo.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-nintendo.svg -> /path/to/project/public/static/4dedfd96921e29686f53982a636e0bad/platform-nintendo.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-nintendo.svg -> /path/to/project/public/static/4dedfd96921e29686f53982a636e0bad/platform-nintendo.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
/path/to/project/static/media/platform-windows.svg -> /path/to/project/public/static/266240296b7f64e8903f75407849806a/platform-windows.svg
/path/to/project/static/media/platform-linux.svg -> /path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg
/path/to/project/static/media/platform-macos.svg -> /path/to/project/public/static/30a64a6430f1da52db917015d61af0e1/platform-macos.svg
error [Error: ENOENT: no such file or directory, chmod '/path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'chmod',
  path: '/path/to/project/public/static/583a7fdd3703b81ef6aacdff5156efc8/platform-linux.svg'
}
error file exists? true

The smoking gun, for me, is at the end where a repeated call to fs.existsSync(publicPath) shows that the file does exist, despite the copy operation failing.

Workarounds

  1. Adding { overwrite: false, errorOnExist: false } to the fs.copy options seems to fix the issue for me. Potentially this is still vulnerable to race issues depending on the internals of fs-extra, as it's performing a few async checks internally.
  2. Using fs.copySync also fixes the issue for me, and since it will only be called once per file, I think it's a safer fallback option. I'm not seeing any performance regressions personally, but potentially large projects might? Not sure.
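
As a rough illustration of why the second workaround closes the gap, the sketch below pairs the synchronous check with a synchronous copy, so no other queued task can run between the two. The names `copyIfMissingSync`, `demoSync`, and `stats` are hypothetical, and Node's built-in `fs.copyFileSync` is used as a stand-in for fs-extra's `fs.copySync`; this is not the plugin's code.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: the check and the copy both complete
// before control returns to the event loop.
function copyIfMissingSync(src, dest, stats) {
  if (!fs.existsSync(dest)) {
    stats.started += 1;
    fs.mkdirSync(path.dirname(dest), { recursive: true });
    // Stand-in (assumption) for fs-extra's fs.copySync
    fs.copyFileSync(src, dest);
  }
}

async function demoSync() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "sync-demo-"));
  const src = path.join(dir, "icon.svg");
  const dest = path.join(dir, "public", "icon.svg");
  fs.writeFileSync(src, "<svg/>");

  const stats = { started: 0 };
  // Ten "queued tasks" asking for the same file
  await Promise.all(
    Array.from({ length: 10 }, async () => copyIfMissingSync(src, dest, stats))
  );
  return { started: stats.started, content: fs.readFileSync(dest, "utf8") };
}
```

Here only the first caller starts a copy; the other nine see the finished file. The trade-off is that each copy blocks the event loop while it runs, which is the performance concern raised above.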

Environment

  System:
    OS: macOS 11.0.1
    CPU: (20) x64 Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz
    Shell: 5.8 - /bin/zsh
  Binaries:
    Node: 12.18.3 - /var/folders/q2/267x3sjd4h7478r0_4plr8n80000gn/T/yarn--1605152878518-0.8217355554857697/node
    Yarn: 1.22.10 - /var/folders/q2/267x3sjd4h7478r0_4plr8n80000gn/T/yarn--1605152878518-0.8217355554857697/yarn
    npm: 6.14.6 - ~/.nvm/versions/node/v12.18.3/bin/npm
  Languages:
    Python: 2.7.16 - /usr/bin/python
  Browsers:
    Chrome: 86.0.4240.198
    Firefox: 83.0
    Safari: 14.0.1
  npmPackages:
    gatsby: ^2.25.0 => 2.25.3
    gatsby-cli: ^2.12.91 => 2.12.117
    gatsby-link: ^2.4.2 => 2.4.16
    gatsby-plugin-catch-links: ^2.3.1 => 2.3.15
    gatsby-plugin-feed: ^2.5.1 => 2.6.0
    gatsby-plugin-google-gtag: ^2.1.1 => 2.1.13
    gatsby-plugin-manifest: ^2.4.27 => 2.5.2
    gatsby-plugin-netlify: ^2.3.13 => 2.4.0
    gatsby-plugin-netlify-cms: ^4.3.12 => 4.3.17
    gatsby-plugin-optimize-svgs: ^1.0.4 => 1.0.4
    gatsby-plugin-postcss: ^3.0.4 => 3.0.4
    gatsby-plugin-react-helmet: ^3.3.1 => 3.3.14
    gatsby-plugin-sharp: ^2.6.31 => 2.7.1
    gatsby-plugin-sitemap: ^2.4.12 => 2.5.1
    gatsby-plugin-ts-config: ^1.1.1 => 1.1.1
    gatsby-plugin-typescript: ^2.5.0 => 2.5.0
    gatsby-remark-autolink-headers: ^2.3.2 => 2.4.1
    gatsby-remark-copy-linked-files: ^2.3.13 => 2.3.19
    gatsby-remark-images: ^3.3.28 => 3.4.2
    gatsby-remark-relative-images: ^2.0.2 => 2.0.2
    gatsby-remark-responsive-iframe: ^2.4.2 => 2.4.17
    gatsby-remark-smartypants: ^2.3.1 => 2.3.13
    gatsby-remark-video: ^1.2.5 => 1.2.5
    gatsby-source-filesystem: ^2.3.27 => 2.4.2
    gatsby-transformer-remark: ^2.8.32 => 2.9.2
    gatsby-transformer-sharp: ^2.5.14 => 2.5.21
@notjosh notjosh added the type: bug An issue or pull request relating to a bug in Gatsby label Nov 12, 2020
@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Nov 12, 2020
Contributor

pieh commented Nov 12, 2020

Workarounds

  1. Adding { overwrite: false, errorOnExist: false } to the fs.copy options seems to fix the issue for me. Potentially this is still vulnerable to race issues depending on the internals of fs-extra, as it's performing a few async checks internally.
  2. Using fs.copySync also fixes the issue for me, and since it will only be called once per file, I think it's a safer fallback option. I'm not seeing any performance regressions personally, but potentially large projects might? Not sure.

We can also add a copyInProgress array/Set: once we start a copy operation we push an entry to it, and we also check it, to cover the period between the start and end of a file copy and prevent duplicate copy operations (an approach similar to #3859). This should have less impact than fs.copySync but still result in similar behaviour.
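
One possible shape for this idea, as a sketch only: a module-level map from destination path to the promise of the pending copy, so later callers share the in-flight copy instead of starting another. `copiesInProgress`, `copyOnce`, `demoInFlight`, and `stats` are hypothetical names, and Node's built-in `fs.promises.copyFile` stands in for fs-extra's `fs.copy`.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical registry: destination path -> promise of the pending copy
const copiesInProgress = new Map();

function copyOnce(src, dest, stats) {
  // Check the registry first: the file may already exist on disk
  // while its copy is still being written.
  const pending = copiesInProgress.get(dest);
  if (pending) return pending;
  if (fs.existsSync(dest)) return Promise.resolve(dest);

  stats.started += 1;
  const copy = fs.promises
    .mkdir(path.dirname(dest), { recursive: true })
    .then(() => fs.promises.copyFile(src, dest)) // stand-in for fs.copy
    .then(() => dest)
    .finally(() => copiesInProgress.delete(dest)); // clear on success or failure
  copiesInProgress.set(dest, copy);
  return copy;
}

async function demoInFlight() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "inflight-demo-"));
  const src = path.join(dir, "icon.svg");
  const dest = path.join(dir, "public", "icon.svg");
  fs.writeFileSync(src, "<svg/>");

  const stats = { started: 0 };
  await Promise.all(
    Array.from({ length: 10 }, () => copyOnce(src, dest, stats))
  );
  return { started: stats.started, content: fs.readFileSync(dest, "utf8") };
}
```

A registry like this only deduplicates copies for callers that share the same map in the same process; code that calls `fs` directly bypasses it.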

@pieh pieh added topic: source-plugins Relates to the Gatsby source plugins (e.g. -filesystem) and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Nov 12, 2020
Contributor

pieh commented Nov 12, 2020

Oh, that wouldn't work if those operations are triggered by different plugins. Then I guess sync makes sense.

Contributor

styxlab commented Nov 14, 2020

We are also seeing the same symptoms with plugin gatsby-rehype-inline-images for about 2 - 3 weeks now. This plugin also sources in a lot of images with the createRemoteFileNode method from gatsby-source-filesystem. After reading this thread, I modified extend-file-node.js to use fs.copySync as suggested above and I can confirm that this fixes the issue.

Maybe the copyInProgress approach could be retained by registering the publicPath in the global cache, I think this could work even if triggered by different plugins.

sallakos commented Nov 16, 2020

We have also been experiencing similar problems for about 2-3 weeks now. Our project has a lot of images, and using fs.copySync instead of fs.copy, as pointed out above, seems to solve the problem.

Contributor

styxlab commented Nov 19, 2020

@pieh: I could send in a PR for the copySync solution. Let me know, if you are interested.

Contributor

pieh commented Nov 19, 2020

Please open a pull request and mention me there.

Contributor

pieh commented Nov 23, 2020

We just published gatsby-source-filesystem@2.6.1 using the copySync approach. Thanks @styxlab for opening the pull request!

Contributor

pieh commented Nov 23, 2020

Maybe the copyInProgress approach could be retained by registering the publicPath in the global cache, I think this could work even if triggered by different plugins.

That would be the ideal approach, and something that we will try to move to in the future (probably by providing some abstraction on top of fs, so it's not copyInProgress per se, but rather something like gatsbyFS.copy (etc.) that would make use of those caches), but issue with both just exposing global cache like that or using gatsbyFS is that plugins could still use fs so it's really hard to enforce migration to it :(

Contributor

styxlab commented Nov 23, 2020

but issue with both just exposing global cache like that or using gatsbyFS is that plugins could still use fs so it's really hard to enforce migration to it :(

That's true, but on the one hand using just fs is fine in many cases, and on the other it's the plugins' responsibility to keep up with the latest developments. As long as the API is easy to understand and simple to use, it should be widely adopted (maybe I am too optimistic ;-)

Contributor

pieh commented Nov 23, 2020

I mentioned gatsbyFS here because concurrent copies are not the only thing we would want to tackle. Other parts here are:

  • we will want to be able to use different threads / processes (or even remote workers) at some point, and then something like a copyInProgress global cache wouldn't really work, because those different threads don't have access to the same memory / global caches (hence an abstraction over fs would let plugins not care about which thread they execute in; the complexity of cross-process checks would be handled in gatsbyFS rather than being the responsibility of plugins)
  • not super urgent or high in priority, but there are some cases where users want to build their site artifacts in a directory other than public. The main reason we haven't done any work on this yet is that the whole ecosystem right now uses fs directly and has the public directory hardcoded

Because of all of the above, we want to be able to provide a comprehensive solution for all those issues in a single go, so that plugin maintainers have a single migration to do. We don't want to force them into many small incremental updates, which also impact users of plugins because compatibility would be very confusing (in cases where maintainers decide not to handle backward compatibility and instead bump the minimum supported gatsby core version). There are just a lot of considerations to make here, and it's not something that we can address lightly.

@github-actions

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 60 days of inactivity. It’s been at least 20 days since the last update here.
If we missed this issue or if you want to keep it open, please reply here.
As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks for being a part of the Gatsby community! 💪💜

@github-actions github-actions bot added the stale? Issue that may be closed soon due to the original author not responding any more. label Dec 13, 2020
Author

notjosh commented Dec 13, 2020

This has been fixed, and released via #28176, and seems to be working well for me. Closing the issue. Thanks!

4 participants