Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support include files in Bikeshed and ReSpec. #60

Merged
merged 14 commits into from May 21, 2020

Conversation

jyasskin
Copy link
Contributor

@jyasskin jyasskin commented May 9, 2020

Fixes the part of #27 that can be fixed in PR-Preview.

Initial description:

This depends on a to-be-added extension to
https://hg.csswg.org/dev/bikeshed-web to accept an include_urls[] query
parameter.

I sent @plinss a preliminary patch based on the -F file=@header.include solution suggested in #27, but I've now discovered that PR-Preview uses the url parameter to send the file instead of the file parameter. If it sounds good to y'all, I'll update my patch to bikeshed-web to accept extra URLs instead of (or maybe in addition to) extra files.

However, the new plan, described below, is to update Bikeshed to accept a URL as its root spec argument, and resolve includes relative to that. Then bikeshed-web can just pass the url to Bikeshed, which will handle the recursive include fetching. This PR just makes sure to re-generate the spec when included files change.

@tobie
Copy link
Owner

tobie commented May 9, 2020

I’m curious what’s missing from the way params are handled to warrant this special treatment for the include parameter. Couldn’t we just add what’s needed to the object passed to parameters instead?

@jyasskin
Copy link
Contributor Author

jyasskin commented May 9, 2020

When I thought it was going to be a file upload, that definitely wouldn't fit in params, but I guess users could build the raw URLs themselves to avoid any special handling. The raw URLs are complicated though, so this seems like better ergonomics?

@tobie
Copy link
Owner

tobie commented May 9, 2020

Could we build the raw URL and pass it to the config along with the other things we are passing? It seems like it’s maybe a little less ergonomic, but easier to explain and more consistent.

@jyasskin
Copy link
Contributor Author

jyasskin commented May 9, 2020

I'll try something like that.

lib/models/pr.js Outdated
get relevantSrcFiles() {
return [this.config.src_file];
return [this.config.src_file] + this.config.includes;
Copy link
Owner

@tobie tobie May 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm. This implies that we do have to do some special casing.

Have you considered just auto-detecting for include statements in the src and doing the right thing without the editor having to specify the includes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it briefly, but you're currently not fetching the source at all, so we'd have to add that in order to scan it for includes, and I'm not eager to write my own partial Bikeshed parser to do that scan.

We could have the server do the scan, and resolve relative to the URL, but that again raises the question of how to write the parser. We could teach Bikeshed to resolve includes relative to URLs, if we give it the URL of the base file instead of a local path. @tabatkins, does that sound plausible?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could teach Bikeshed to resolve includes relative to URLs, if we give it the URL of the base file instead of a local path.

That would be ideal, obviously.

@marcoscaceres, would that also work for ReSpec?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I believe that would work. cc @sidvishnoi.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that having Bikeshed and Respec fetch their own includes doesn't solve the problem this line addresses of having PR Preview know when relevant source files have changed.

If we want to fetch the files here, it wouldn't be too hard to do a recursive line-by-line grep for ^path:(.*)$ (Bikeshed) and data-include="([^"]*)"|data-include='([^']*)' (Respec) to identify the watched files.

It'll probably still be good for the processors to fetch their own files on the server, since the scan here will be an overestimate, so I think we shouldn't upload everything it finds to the processor web service.

Feel free to override me on any of that, but I'll go in that direction until I hear otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem this line addresses of having PR Preview know when relevant source files have changed.... If we want to fetch the files here, it wouldn't be too hard to do a recursive line-by-line grep for..

👍 It would be nice if pr-preview finds these dependencies rather than users specifying includes.

ReSpec (and spec-generator) already fetches includes relative to base URL, so this passing of includes files is not relevant to ReSpec, unless I'm missing the point of this PR.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to either let Bikeshed take a base URL (and then do url-fetches of its include files, rather than local file fetches), or add a command that parses a document and returns all the files it might be including. (You'd then have to check if those paths actually exist, because things like local boilerplate files are included by default when they exist, without an explicit indicator.)

lib/models/pr.js Outdated
get relevantSrcFiles() {
return [this.config.src_file];
return [this.config.src_file] + this.config.includes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return [this.config.src_file] + this.config.includes;
return [this.config.src_file].concat(this.config.includes);

Python much? 😄

Copy link
Contributor Author

@jyasskin jyasskin May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, 😳, a test caught that for me somewhere else, but this function must not be tested. I re-learned the other day that PHP uses . for string concatenation, but I'm sure I'll forget it again by the next time I need it.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are very few tests because there is very little budget. :)

@jyasskin
Copy link
Contributor Author

How's this look?

/** When this gets back to empty, we're done searching includes. */
const inProgressFetches = new Set();

await new Promise((resolve, reject) => {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first glance, this promise seems redundant as it doesn't operate on a callback... just await scanOneFile() and get it to throw as normal. You can probably also extract scanOneFile() outside of this function and get it to return recursiveIncludes ... they could be passed as default arguments... something like:

scanOneFile(path);
async function scanOneFile(path, inProgressFetches = new Set(), recursiveIncludes = new Set()) {
    // do sync stuff... can throw 
    // do recursion:
    await scanOneFile(path, inProgressFetches, recursiveIncludes);
    // eventually
    return recursiveIncludes;
} ;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I was worried about allowing maximal parallelism in the fetches, but a Promise.all() seems to allow the same amount, and it's much simpler to read.

Doing the accumulator sets as parameters initialized by default arguments seems less clear, to me, than making it a nested function. Similarly, it seems clearer which value is returned if it's returned unconditionally as the last line of scanIncludes(), than if readers have to also check what scanOneFile() returns.

lib/models/pr.js Outdated
get relevantSrcFiles() {
return [this.config.src_file];
return [this.config.src_file].concat(this.includes);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
return [this.config.src_file].concat(this.includes);
return [].concat(this.config.src_file, this.includes);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd somewhat expected that to iterate the letters in src_file, but it works. However, it turns out that this.includes both includes src_file on its own, and is a Set, so this line didn't work at all. Fixed.

I'm not sure what style you want here, @tobie. I could just assign to this.relevantSrcFiles and remove the getter?

Comment on lines 68 to 69
for (let i = 1; i < match.length; i++) {
const included = match[i].trim();

This comment was marked as outdated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's even simpler with the 1-capture regex.

lib/scan-includes.js Outdated Show resolved Hide resolved
lib/scan-includes.js Outdated Show resolved Hide resolved
test/scan-includes.js Outdated Show resolved Hide resolved
@sidvishnoi
Copy link
Contributor

sidvishnoi commented May 14, 2020

As most specs don't make use of includes feature, I think it makes sense to not incur the fetching/parsing overhead for every spec. Maybe add an hasIncludes: boolean flag in .pr-preview.json and only look out for includes if asked for?

@jyasskin
Copy link
Contributor Author

I'm happy to add hasIncludes to the config, but since @tobie was resistant to adding new keys to the top level there, I want to double check with him that it's the right thing to do.

lib/services.js Outdated Show resolved Hide resolved
lib/models/pr.js Outdated Show resolved Hide resolved
@tobie
Copy link
Owner

tobie commented May 15, 2020

I'm happy to add hasIncludes to the config, but since @tobie was resistant to adding new keys to the top level there, I want to double check with him that it's the right thing to do.

Yeah, I think paying the extra CPU cycles to avoid the UX cost is fine. Let’s not add this.

Copy link
Owner

@tobie tobie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we’re getting close to landing this. I’m super excited; as far as I remember, it’s the biggest contribution to PR Preview to date. <3

lib/scan-includes.js Show resolved Hide resolved
lib/scan-includes.js Outdated Show resolved Hide resolved
@tobie tobie self-requested a review May 16, 2020 13:12
Copy link
Owner

@tobie tobie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good to go.

Would love reviews from @marcoscaceres and @sidvishnoi before merging.

What's the story on the ReSpec/Bikeshed path now?

And thanks all, it's awesome to see folks contributing this way!

lib/services.js Outdated Show resolved Hide resolved
@tobie
Copy link
Owner

tobie commented May 16, 2020

I do have a slight concern about the roll-out, though, as open pull requests will have the master branch already cached. So all of the newly added includes will show up as a diff for currently open pull requests that get modified before they're merged.

Copy link
Contributor

@sidvishnoi sidvishnoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:
Don't forget to change the PR title before merging!

lib/models/pr.js Outdated
if (this.includes) {
return new Set(this.includes).add(this.config.src_file);
}
return new Set([this.config.src_file]);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more of a nit (ignore if you disagree) but it's usually better to just return arrays in a API like this. Sets are generally only useful inside algorithms to avoid duplicates and perform simple "set" thins - but as accessor properties, arrays are better IMO. Also, this will violate .relevantSrcFiles === .relevantSrcFiles.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd be more confortable with this too, tbh.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This violates the === property for arrays too, right? And does even before this PR?

The Set is useful in the single caller of this function, in touchesSrcFile: it switches an O(N^2) algorithm to O(N).

This could just as easily be a method as a getter; I just left the kind of property that was already here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This violates the === property for arrays too, right? And does even before this PR?

True. Maybe just rename it function getRelevantSrcFiles() or deriveRelevantSrcFiles()? At least idiomatically, it gives it a reason to return a new thing every time.

The Set is useful in the single caller of this function, in touchesSrcFile: it switches an O(N^2) algorithm to O(N).

True, but given that we are only dealing with files in the maybe tens of files, I don't think it makes a huge difference. If we were processing a thousands of files, it may have a noticeable performance impact... or even the JIT might treat it as a set under the hood or do some other magic optimization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

let body;
try {
body = await fetchUrl(resolvedUrl, GITHUB_SERVICE);
} catch {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is always risky if things go wrong for debugging purposes... maybe console.warn(err)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tobie, is anything watching the console warnings for PR-Preview?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It's using Papertrail.

You could log both cases. Like it is being done for caching.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@jyasskin
Copy link
Contributor Author

I think ReSpec will Just Work, and probably has been Just Working except that PR-Preview could miss times it needed to regenerate the preview.

For Bikeshed, I need to teach it to operate over URLs instead of just files, and then I need to have bikeshed-web use that support instead of fetching the URLs you pass it itself.

I'm not so worried about in-progress PRs that suddenly get a big diff once the bikeshed-web support lands: specs with includes are very badly supported now, so it's unlikely people are looking at them much. The in-progress PRs will just be a last set of PRs that don't work well, and even they will at least show the new version correctly.

@tobie
Copy link
Owner

tobie commented May 18, 2020

I think ReSpec will Just Work, and probably has been Just Working except that PR-Preview could miss times it needed to regenerate the preview.

Oh—Interesting.

For Bikeshed, I need to teach it to operate over URLs instead of just files, and then I need to have bikeshed-web use that support instead of fetching the URLs you pass it itself.

Should we enable this for Bikeshed right away or wait until that's implemented?

@jyasskin
Copy link
Contributor Author

For Bikeshed, I need to teach it to operate over URLs instead of just files, and then I need to have bikeshed-web use that support instead of fetching the URLs you pass it itself.

Should we enable this for Bikeshed right away or wait until that's implemented?

I'd be comfortable enabling it for Bikeshed right away: the only harm should be re-generating previews when includes change, when those includes should change the output but won't yet. But it's up to you.

@jyasskin jyasskin changed the title Support include files in Bikeshed. Support include files in Bikeshed and ReSpec. May 18, 2020
@tobie
Copy link
Owner

tobie commented May 20, 2020

This looks good. I'll deploy it as soon as I have a good enough window of time to be able to test it and roll it back/fix it if there are issues.

Also changed a property to a function since its return value wasn't === with
itself.
@tobie tobie merged commit c2c3ead into tobie:master May 21, 2020
@tobie
Copy link
Owner

tobie commented May 21, 2020

How tf did this now make @sidvishnoi and myself the authors of this pull request!?

@sidvishnoi
Copy link
Contributor

Things go 🤷‍♂️ when you rebase and merge

@tobie
Copy link
Owner

tobie commented May 21, 2020

I squashed but it wasn't picked up (super flaky network, probably a UI glitch). Now force pushing to master.

tobie pushed a commit that referenced this pull request May 21, 2020
This triggers a build if any of the includes of a spec are modified by
a pull request.

This is in preparation for adding support for Bikeshed includes.

ReSpec includes should already be supported.
tobie pushed a commit that referenced this pull request May 21, 2020
This triggers a build if any of the includes of a spec are modified by
a pull request.

This is in preparation for adding support for Bikeshed includes.

ReSpec includes should already be supported.

Co-authored-by: Sid Vishnoi <sidvishnoi8@gmail.com>
@jyasskin jyasskin deleted the support-bikeshed-includes branch May 21, 2020 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants