feat(gatsby-source-drupal): Disable caching + add http/2 agent #32012

Merged
merged 9 commits into from
Jun 22, 2021
25 changes: 24 additions & 1 deletion packages/gatsby-source-drupal/src/gatsby-node.js
@@ -21,7 +21,30 @@ const agent = {
}

 async function worker([url, options]) {
-  return got(url, { agent, ...options })
+  return got(url, {

You also need to pass request: http2wrapper.auto here, as Got v11 doesn't use the newest version of http2-wrapper. Got v12 is going to be released in the next few weeks.

Contributor

wardpeet commented on Jun 22, 2021

Thanks for weighing in @szmarczak !

Ah, I almost forgot. You need to set the http2 option to true as well. It will pass the entire agent object to http2wrapper, otherwise it would pass agent: httpsAgent which results in an error if the endpoint is HTTP/2.
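
The two review suggestions so far (pass request: http2wrapper.auto and set http2: true) can be sketched as an options object. This is a minimal sketch, not the PR's code: buildGotOptions is a hypothetical helper, and fakeAuto and fakeAgent are stand-ins for http2wrapper.auto and the agent object so the snippet runs without the got or http2-wrapper packages installed.

```javascript
// Hypothetical helper (not part of the PR): assembles the options for got().
// `http2Auto` stands in for `http2wrapper.auto` from the http2-wrapper package.
function buildGotOptions(agent, options, http2Auto) {
  return {
    agent, // the whole agent object, not a single https agent
    request: http2Auto, // request function suggested in the review
    http2: true, // makes Got hand the entire agent object to the request function
    ...options, // per-request options are spread last so they can override
  }
}

// Stand-in values (assumptions) so the sketch runs offline.
const fakeAuto = () => {}
const fakeAgent = { http: {}, https: {}, http2: {} }
const gotOptions = buildGotOptions(fakeAgent, { searchParams: { page: 1 } }, fakeAuto)
```

In the worker this would be used roughly as got(url, buildGotOptions(agent, options, http2wrapper.auto)), assuming both packages are available.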

Contributor Author

oh haha thanks!

Contributor Author

This should fix it 5fbf618

Thanks btw for all your great work on Got! It's a fantastic piece of software.

Contributor Author

Random aside while we have you, I profiled this with http/2 & it looks like normalizeArguments is using more than its fair share of CPU?

[Screenshot: Screen Shot 2021-06-22 at 9 42 43 AM]

> Thanks btw for all your great work on Got! It's a fantastic piece of software.

No problem :) I really enjoy improving it.

> oh haha thanks!

https://github.com/sindresorhus/got/blob/f896aa52abc41fe40d4942da94a0408477358f14/source/core/index.ts#L2364-L2367

This hasn't been documented, sorry. One of the ways to fix this in Got would be to always pass the entire agent object to the request function, but then we would need to default request to something like this:

request = (url, options) => {
	if (url.protocol === 'https:') {
		options.agent = options.agent.https;
		return https.request(url, options);
	}

	options.agent = options.agent.http;
	return http.request(url, options);
};

But then request: https.request wouldn't work. We could detect the native http functions, but then it would fail on custom functions that ultimately return the same thing https.request does.

I'm not sure what the best solution is here. Alternatively, we could make request an object with the protocols http2, https, and http as keys...

For now http2: true does the trick :P
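
The "request as an object keyed by protocol" idea above could look roughly like the following. This is only a sketch of the proposal, not an existing Got API: requestByProtocol and pickRequest are hypothetical names. The http2 key is left out because choosing HTTP/2 depends on negotiation with the server rather than on the URL's protocol.

```javascript
const http = require('http')
const https = require('https')

// Hypothetical shape for a protocol-keyed `request` option (not a Got API).
const requestByProtocol = {
  http: http.request,
  https: https.request,
}

// Hypothetical helper: picks the request function for a parsed URL.
function pickRequest(url, requestMap) {
  const key = url.protocol.slice(0, -1) // 'https:' becomes 'https'
  const requestFn = requestMap[key]
  if (typeof requestFn !== 'function') {
    throw new TypeError('No request function for protocol: ' + url.protocol)
  }
  return requestFn
}

const chosen = pickRequest(new URL('https://example.com/jsonapi'), requestByProtocol)
```

With this shape a caller can still override a single protocol, for example by swapping in a custom function under the https key.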

> It looks like normalizeArguments is using more than its fair share of CPU?

Indeed. Can you narrow it down to a few lines? I suppose it's caused by overly intensive object creation. The upcoming Got version uses getters & setters to avoid redoing normalization when it isn't necessary. If you could create a reproducible repo, that would be awesome.

Also feel free to create an issue in the Got repo about this :)
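
The getters & setters approach mentioned above can be illustrated with a small sketch. This is an assumption about the general technique, not Got's actual implementation: the Options class and its normalizeCount counter are hypothetical and exist only to show that normalization runs when a value is assigned, not on every read.

```javascript
// Hypothetical class for illustration; not Got's implementation.
class Options {
  constructor() {
    this._url = undefined
    this.normalizeCount = 0 // counts how often normalization ran
  }

  set url(value) {
    // Normalize once, at assignment time.
    this._url = value instanceof URL ? value : new URL(value)
    this.normalizeCount++
  }

  get url() {
    // Reads return the already-normalized value without extra work.
    return this._url
  }
}

const opts = new Options()
opts.url = 'https://example.com/jsonapi'
const firstRead = opts.url.href
const secondRead = opts.url.href
```

Reading opts.url any number of times leaves normalizeCount unchanged; only a new assignment triggers normalization again.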

+    agent,
+    ...options,
+    cache: {
+      get: key => options.cache.get(key),
+      set: (key, value) => {
+        const parsed = JSON.parse(value)
+        // Drupal users often set very long max-age as they want to maximize
+        // the max-age of their varnish cache. This would make gatsby-source-drupal
+        // cache API calls for hours or days and people would wonder why
+        // their content doesn't update. We'll change `cache-control` so every request
+        // must revalidate.
+        try {
+          delete parsed.value.cachePolicy.rescc[`max-age`]
+          delete parsed.value.cachePolicy.resh[`cache-control`]
+          parsed.value.cachePolicy.rescc[`must-revalidate`] = true
+          parsed.value.cachePolicy.rescc = {}
+        } catch (e) {
+          console.log(e)
+        }
+        return options.cache.set(key, JSON.stringify(parsed))
+      },
+    },
+  })
 }

const requestQueue = require(`fastq`).promise(worker, 20)
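
The cache wrapper in the worker above rewrites each serialized cache entry before storing it. The same idea is sketched here as a standalone function. This is a sketch under assumptions: forceRevalidation is a hypothetical helper, and the entry shape (value.cachePolicy.resh and value.cachePolicy.rescc) is taken from the diff rather than from any documented format. Unlike the diff, the sketch sets must-revalidate as part of the reset so the flag is not wiped afterwards; that ordering is a choice made here, not something the PR confirms.

```javascript
// Hypothetical helper (not part of the PR): rewrites a serialized cache entry
// so the stored policy no longer carries the upstream max-age.
function forceRevalidation(serialized) {
  const parsed = JSON.parse(serialized)
  const policy = parsed && parsed.value && parsed.value.cachePolicy
  if (policy) {
    // Drop the raw response header that carried the long max-age.
    if (policy.resh) {
      delete policy.resh['cache-control']
    }
    // Replace the parsed directives, keeping only must-revalidate.
    policy.rescc = { 'must-revalidate': true }
  }
  return JSON.stringify(parsed)
}

// Example entry with a one-day max-age (made-up values for illustration).
const entry = JSON.stringify({
  value: {
    cachePolicy: {
      resh: { 'cache-control': 'max-age=86400', etag: '"abc"' },
      rescc: { 'max-age': '86400' },
    },
  },
})
const rewritten = JSON.parse(forceRevalidation(entry))
```

Entries without a cachePolicy pass through unchanged, so the helper does not need the try/catch used in the diff.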