feat(gatsby-source-contentful): move types into createSchemaCustomization #33207

Merged
merged 27 commits into from
Oct 13, 2021

Conversation

axe312ger
Collaborator

@axe312ger axe312ger commented Sep 16, 2021

Replacement of #32351

@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Sep 16, 2021
const fetchData = require(`./fetch`)
const { createPluginConfig } = require(`./plugin-options`)

export async function onPreBootstrap(
Contributor

@wardpeet we would want to make sure with v4 that this function is onPluginInit. Otherwise the workers with source nodes and createSchemaCustomization will have the wrong data, correct?

Contributor

It does a lot of tricky things, and I don't understand why we're doing them in onPreBootstrap, or why we would even do them in onPluginInit.

Why are we fetching data here?

Collaborator Author

Yes, if you don't call this code, you will be stuck with the old Contentful data.

Contributor

@wardpeet wardpeet Sep 16, 2021

I understand, but why can't we put this all in sourceNodes? This makes it super complex, I don't even understand what's happening here 😬

Collaborator Author

@wardpeet I need the content types to be available in createSchemaCustomization

The next "major" #31385 will require even more data from Contentful to be available when creating the schema.

Collaborator Author

@axe312ger axe312ger Sep 16, 2021

It might be possible to split up the huge fetch function into two functions:

createSchemaCustomization will load content types
sourceNodes will load content types + all of the rest

Loading data twice is bad, that's why I originally moved all of this into bootstrap. I could work around this with the Gatsby cache, but only if I can rely on the call order of createSchemaCustomization and sourceNodes.

WDYT?
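
A minimal sketch of that split, not the plugin's actual code. The function names `fetchContentTypes` and `fetchContent`, the two step functions, and the stub client are all hypothetical; the stub's `getContentTypes()` and `sync()` methods are only loosely modeled on a Contentful client and never touch the network:

```javascript
// Hypothetical stand-in for a Contentful client; it records which calls were made.
function createStubClient() {
  const calls = []
  return {
    calls,
    async getContentTypes() {
      calls.push(`getContentTypes`)
      return { items: [{ sys: { id: `blogPost` }, name: `Blog Post` }] }
    },
    async sync() {
      calls.push(`sync`)
      return { entries: [{ sys: { id: `entry-1` } }], assets: [] }
    },
  }
}

// Small request: only the content model, enough to build the schema.
async function fetchContentTypes(client) {
  const { items } = await client.getContentTypes()
  return items
}

// Large request: the actual entries and assets.
async function fetchContent(client) {
  const { entries, assets } = await client.sync()
  return { entries, assets }
}

// Schema step (hypothetical): needs content types only.
async function schemaStep(client) {
  return fetchContentTypes(client)
}

// Sourcing step (hypothetical): needs content types plus all of the rest.
async function sourcingStep(client) {
  const contentTypes = await fetchContentTypes(client)
  const content = await fetchContent(client)
  return { contentTypes, ...content }
}
```

Running both steps against the same client issues the content types request twice and the content request once, which is exactly the duplication in question.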

Contributor

Why can't we call contentTypes separately from data? Doesn't Contentful support that? Relying on cache will be problematic when the process crashes - you don't really know where we left off.

Collaborator Author

We can, it is supported. I just wonder if I can avoid calling contentTypes twice.

Will try it now and let you know where I end up :)

Collaborator Author

GATSBY_CONTENTFUL_EXPERIMENTAL_FORCE_CACHE is going to make it tricky, as it assumes we have all the data in one place which will not be true anymore.

I might store the content type information in a separate file 🤔

Collaborator Author

I got a variant working in e2e and manual tests. Unit tests are green except the huge snapshot based tests. Will fix these after lunch, then we should be able to do another review round :)

@axe312ger
Collaborator Author

axe312ger commented Sep 17, 2021

@wardpeet @DanielSLew

I split up the fetching and processing of content types from the rest of the data we get from Contentful.

  • Snapshots: these always shared the same content types across all files. I moved all content type data into its own fixture
  • fetch.js: now split up into two functions, one for content types, the other one for the rest
  • onPreBootstrap: will now only create the cache directory for assets
  • createSchemaCustomization:
    1. fetches content types
    2. processes them
    3. stores these in Gatsby cache for sourceNodes (is this safe?): not anymore
    4. creates the (incomplete aka inheriting) schema
  • sourceNodes:
    1. gets content types and a copy of the last build content from Gatsby cache
    2. exits early and uses existing data when GATSBY_CONTENTFUL_OFFLINE === true
    3. fetches new content from Contentful
    4. fetches latest content types from Contentful
    5. merges fetched data with data from cache
    6. stores merged build data in cache
    7. resolves links
    8. touches existing nodes
    9. deletes deleted nodes
    10. creates new nodes
    11. downloads assets to FS if requested
  • the flag GATSBY_CONTENTFUL_EXPERIMENTAL_FORCE_CACHE got removed as it did not work since tags got enabled and removal was approved by @benrobertsonio
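
The cache-related part of the sourceNodes flow above (offline early exit, merging fetched data with the last build, dropping deleted entries) could be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `mergeSyncData`, `sourceContent`, the `contentful-data` cache key, and the shape of the fetched data are hypothetical, and `cache` is any object with async `get`/`set`:

```javascript
// Hypothetical helper: merge a freshly fetched delta into the data kept from the last build.
function mergeSyncData(cached, fetched) {
  const deleted = new Set(fetched.deletedEntries.map(entry => entry.sys.id))
  const byId = new Map()
  // Keep cached entries unless they were deleted upstream.
  for (const entry of cached.entries) {
    if (!deleted.has(entry.sys.id)) byId.set(entry.sys.id, entry)
  }
  // Newly fetched entries replace cached versions with the same id.
  for (const entry of fetched.entries) byId.set(entry.sys.id, entry)
  return { entries: [...byId.values()], deletedIds: [...deleted] }
}

// Hypothetical outline of the sourcing step.
async function sourceContent({ cache, fetchContent, offline }) {
  const cached = (await cache.get(`contentful-data`)) || { entries: [] }
  // Offline mode: reuse the last build's data and skip the network.
  if (offline) return { entries: cached.entries, deletedIds: [] }

  const fetched = await fetchContent()
  const merged = mergeSyncData(cached, fetched)
  // Store the merged data so the next build can start from it.
  await cache.set(`contentful-data`, { entries: merged.entries })
  return merged
}
```

The returned `deletedIds` would then drive node deletion, and `entries` would drive touching existing nodes and creating new ones.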

@DanielSLew
Contributor

Thank you for the detailed walk through of the changes, that really helps to follow what's happening.

  1. stores these in Gatsby cache for sourceNodes (is this safe?)

I think the following comment from @wardpeet sums this up

Relying on cache will be problematic when the process crashes - you don't really know where we left off.

I'm still a little fuzzy as to what this experimental force cache does and what the purpose of it is? Is this a feature that people actually use?

@benrobertsonio
Contributor

The experimental force cache was a way to bypass the sourcing step. It was primarily used internally when working with customers with really large datasets. It was always undocumented / experimental. I think the customer success team was the primary user of that flag, and we aren't actively using it anymore. I'm fine with removing it.

@axe312ger
Collaborator Author

Relying on cache will be problematic when the process crashes - you don't really know where we left off.

Alright, then I'll request the content types twice from Contentful. As this is only a minimal amount of data compared to the actual content, it should not add too much overhead.

@axe312ger
Collaborator Author

I updated the code to request fresh content type data for every node sourcing and schema customization and updated my comment above to reflect the new flow.

@LekoArts LekoArts added topic: source-contentful Related to Gatsby's integration with Contentful and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Sep 20, 2021
@axe312ger axe312ger force-pushed the refactor/prepare-contentful-for-gatsby-v4 branch from a607dd1 to 80f3df8 Compare September 24, 2021 08:24
@axe312ger axe312ger force-pushed the refactor/prepare-contentful-for-gatsby-v4 branch 2 times, most recently from 573bc2b to 97cacdf Compare September 25, 2021 12:14
@KyleAMathews
Contributor

Why refetch content types for every sourcing? Is that needed? I assume this would slow down inc builds?

@axe312ger
Collaborator Author

Why refetch content types for every sourcing? Is that needed? I assume this would slow down inc builds?

That was my first approach, but Daniel and Ward say storing the content types in the cache is not safe? #33207 (comment)

I can still revert/remove 59af410 to use Gatsby cache for storing the content types

Contributor

@wardpeet wardpeet left a comment

Added questions and comments. There are some possible race conditions with cache & node store

reporter,
})

createTypes(`
Contributor

@vladar is it faster to only run createTypes once vs many times or are things the same?

Collaborator Author

@wardpeet @vladar as far as I can see, the whole codebase calls createTypes with multiple types at once.

So I went that way: it's cleaner, and tests pass locally :)
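
For illustration, the "multiple types at once" pattern looks roughly like this. It is a sketch under assumptions: `buildTypeDefs` and the recorder are hypothetical, the recorder merely stands in for Gatsby's `createTypes` action, and the generated type definitions are simplified placeholders rather than the plugin's real schema:

```javascript
// Hypothetical helper: turn content type names into SDL type definitions.
function buildTypeDefs(contentTypeNames) {
  return contentTypeNames.map(
    name => `type Contentful${name} implements Node { id: ID! }`
  )
}

// Stand-in for the createTypes action; it only records what it was called with.
function createRecorder() {
  const calls = []
  return {
    calls,
    createTypes: typeDefs => {
      calls.push(typeDefs)
    },
  }
}

const recorder = createRecorder()
// One call carrying every type definition, instead of one call per type.
recorder.createTypes(buildTypeDefs([`BlogPost`, `Author`]))
```

Whether a single call is actually faster than many is the open question above; the sketch only shows the call shape.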

packages/gatsby-source-contentful/src/source-nodes.js: three outdated review threads, resolved
@wardpeet wardpeet force-pushed the refactor/prepare-contentful-for-gatsby-v4 branch from f49eaf9 to 4f22ccf Compare October 12, 2021 20:14
@wardpeet wardpeet changed the title Prepare gatsby-source-contentful for Gatsby v4 feat(gatsby-source-contentful): move types into createSchemaCustomization Oct 12, 2021
Contributor

@wardpeet wardpeet left a comment

If tests are green, let's get this in! 🚢 Thanks a ton @axe312ger

@wardpeet wardpeet merged commit d01a373 into master Oct 13, 2021
@wardpeet wardpeet deleted the refactor/prepare-contentful-for-gatsby-v4 branch October 13, 2021 07:10
wardpeet added a commit to herecydev/gatsby that referenced this pull request Oct 29, 2021
…tion (gatsbyjs#33207)

Co-authored-by: Ward Peeters <ward@coding-tech.com>
axe312ger added a commit that referenced this pull request Nov 9, 2021
…tion (#33207)

Co-authored-by: Ward Peeters <ward@coding-tech.com>
bartveneman added a commit to bartveneman/gatsby that referenced this pull request Dec 28, 2023
This bit me today when I wanted to use it. Apparently `forceFullSync` was removed years ago. 

PR's: gatsbyjs#33238, gatsbyjs#33207

@axe312ger mentioned he'd remove it in the new version, but since that's not live yet I'm proposing it here.
pieh added a commit that referenced this pull request Jan 2, 2024
… from readme (#38787)

* Remove `forceFullSync` option from readme

This bit me today when I wanted to use it. Apparently `forceFullSync` was removed years ago. 

PR's: #33238, #33207

@axe312ger mentioned he'd remove it in the new version, but since that's not live yet I'm proposing it here.

* remove option from create-gatsby as well

---------

Co-authored-by: Michal Piechowiak <misiek.piechowiak@gmail.com>
Labels
topic: source-contentful Related to Gatsby's integration with Contentful