Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools: build all.json by combining generated JSON #21637

Closed
wants to merge 1 commit into from

Conversation

rubys
Copy link
Member

@rubys rubys commented Jul 3, 2018

Notes:

  1. Removed a number of root properties that did not seem relevant: source,
    desc, and introduced_in. There no longer is a source, and the other two are
    from the first include and do not reflect the entire API.

  2. As with doc: all.html is seriously broken link-wise #20100, the current "desc"
    properties sometimes contained in-page links, other times referenced another
    page, and often did not match the links in the original HTML or JSON file.
    I chose to standardize on external links as "desc" values are isolated
    snippets as opposed to all.html which can be viewed as a standalone and self
    contained document.

  3. Eliminated preprocessing for @include entirely, including the test case
    for this function.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are updated
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added build Issues and PRs related to build files or the CI. doc Issues and PRs related to the documentations. tools Issues and PRs related to the tools directory. labels Jul 3, 2018
@rubys
Copy link
Member Author

rubys commented Jul 3, 2018

See #21568 (comment) for the original request.

@Trott
Copy link
Member

Trott commented Jul 3, 2018

Not sure if @tolmasky or any of their RunKit colleagues will be able to test this ahead of time, but pinging anyway as their the one significant consumer of our docs-in-JSON-format that I'm aware of. (I don't know if this would help or not, but maybe it would help them if the resulting JSON could be stashed somewhere, maybe in a gist or something.)

@vsemozhetbyt
Copy link
Contributor

What will happen with https://github.com/nodejs/node/blob/master/doc/api/index.md? It also contains one include directive.

@rubys
Copy link
Member Author

rubys commented Jul 3, 2018

@Trott @tolmasky gist containing both what all.json will look like after this change, and a diff between what it looks like after vs what it looked like before.

@vsemozhetbyt good catch! Initial thoughts: _toc.md should be renamed to _index.md (overwriting this file). allhtml.js and alljson.js would need to be updated to read this data from its new location.

If you agree, I'll make this change and squash the commit.

@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Jul 3, 2018

Seems good, but we should also check mentions like this one:

const gtocPath = path.join(docPath, 'api', '_toc.md');

@rubys rubys force-pushed the alljson branch 2 times, most recently from 95ad110 to 511f403 Compare July 3, 2018 19:04
@vsemozhetbyt
Copy link
Contributor

I hope I will be able to review at the weekend. If anybody feels confident to review and land sooner please do)

@Me1000
Copy link

Me1000 commented Jul 3, 2018

Hey all! I work on the RunKit team, thanks for the ping @Trott.

The new output looks like it contains all the existing data. If that is indeed the case (the output is quite large, so maybe I missed something) then it should be fairly easy for us consume the new format. :)

@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Jul 7, 2018

One breaking change in the index.html: previously, preprocess.js not only processed @include directives but also stripped the special comments (see commentExpr mentions in preprocess.js). Maybe we should make these comments formatted as HTML comments — these are just two lines in the index.md. Otherwise, they are rendered in the index.html:

cmm


const actualDocs = allDocs.filter(
(name) => {
const extension = path.extname(name);
return (extension === '.html' || extension === '.json') &&
name !== '_toc.html';
return (extension === '.html' || extension === '.json');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the parentheses are redundant now.

@@ -0,0 +1,56 @@
'use strict';

// Build all.json by combining the miscs, modules, classes, global, and methods
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

global -> globals?

// Read the table of contents.
const toc = fs.readFileSync(source + '/index.html', 'utf8');

// initialize results. Only these four data values will be collected.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. initialize -> Initialize.
  2. Extra space between the sentences.

// Extract (and concatenate) the selected data from each document.
// Expand hrefs found in json to include source HTML file.
for (const link of toc.match(/<a.*?>/g)) {
const href = /href="(.*?)"/.exec(link)[1].replace('.html', '.json');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.replace('.html', '.json') seems erroneous here?

if (!jsonFiles.includes(json) || seen[json]) continue;
const data = JSON.parse(
fs.readFileSync(source + '/' + json, 'utf8')
.replace(/<a href="#/g, `<a href="${href}#`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this should be /<a href=\\"#/g, `<a href=\\"${href}#`, as double quotes are escaped in .json sources.

}

// Mark source as seen.
seen[href] = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seen[href] -> seen[json]?

@@ -36,7 +36,7 @@ marked.setOptions({ renderer });

const docPath = path.resolve(__dirname, '..', '..', 'doc');

const gtocPath = path.join(docPath, 'api', '_toc.md');
const gtocPath = path.join(docPath, 'api', 'index.md');
const gtocMD = fs.readFileSync(gtocPath, 'utf8').replace(/^@\/\/.*$/gm, '');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, it seems this line also need to be updated so that the new HTML comment from index.md could be excluded from all the docs except the index.html.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the .replace(/^@\/\/.*$/gm, '') part.

const gtocPath = path.join(docPath, 'api', '_toc.md');
const gtocMD = fs.readFileSync(gtocPath, 'utf8').replace(/^@\/\/.*$/gm, '');
const gtocPath = path.join(docPath, 'api', 'index.md');
const gtocMD = fs.readFileSync(gtocPath, 'utf8').replace(/^<!--.*?-->/gms, '');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just to prevent doubts: dotAll flag is supported since V8 6.2, so this line is backportable for Node.js 8 LTS.)


// Write results.
fs.writeFileSync(source + '/all.json',
JSON.stringify(results, null, 2), 'utf8');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one more significant nit I've found while diffing the old and new results: new all.json lacks the \n at the end of the file, just after the last }. So maybe it is worth to replace JSON.stringify(results, null, 2) with `${JSON.stringify(results, null, 2)}\n` to be on the safe side)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@vsemozhetbyt
Copy link
Contributor

A note: amended commits are not reported to the people that are subscribed to a PR, so maybe it is worth to let everybody know about a new commit version by a comment)

Notes:

1) Removed a number of root properties that did not seem relevant: source,
   desc, and introduced_in.  There no longer is a source, and the other two are
   from the first include and do not reflect the entire API.

2) As with nodejs#20100, the current "desc"
   properties sometimes contained in-page links, other times referenced another
   page, and often did not match the links in the original HTML or JSON file.
   I chose to standardize on external links as "desc" values are isolated
   snippets as opposed to all.html which can be viewed as a standalone and self
   contained document.

3) Eliminated preprocessing for @include entirely, including the test case
   for this function.

4) _toc.md was renamed to index.md.

5) index comments no longer appear in embedded TOCs (left hand side column in
   the generated documentation.
@rubys
Copy link
Member Author

rubys commented Jul 7, 2018

Perhaps I should just stop squashing commits.

@vsemozhetbyt
Copy link
Contributor

@vsemozhetbyt vsemozhetbyt added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Jul 7, 2018
@vsemozhetbyt
Copy link
Contributor

Let's land this on Monday, maybe somebody from @nodejs/documentation will want to chime in.

vsemozhetbyt pushed a commit that referenced this pull request Jul 9, 2018
Notes:

1) Removed a number of root properties that did not seem relevant:
   source, desc, and introduced_in.  There no longer is a source, and
   the other two are from the first include and do not reflect the
   entire API.

2) As with #20100, the current
   "desc" properties sometimes contained in-page links, other times
   referenced another page, and often did not match the links in the
   original HTML or JSON file. I chose to standardize on external links
   as "desc" values are isolated snippets as opposed to all.html which
   can be viewed as a standalone and self contained document.

3) Eliminated preprocessing for @include entirely, including the test
   case for this function.

4) _toc.md was renamed to index.md.

5) index comments no longer appear in embedded TOCs (left hand side
   column in the generated documentation.

PR-URL: #21637
Reviewed-By: Vse Mozhet Byt <vsemozhetbyt@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
@vsemozhetbyt
Copy link
Contributor

Landed in 0c743b5
Thank you!

targos pushed a commit that referenced this pull request Jul 12, 2018
Notes:

1) Removed a number of root properties that did not seem relevant:
   source, desc, and introduced_in.  There no longer is a source, and
   the other two are from the first include and do not reflect the
   entire API.

2) As with #20100, the current
   "desc" properties sometimes contained in-page links, other times
   referenced another page, and often did not match the links in the
   original HTML or JSON file. I chose to standardize on external links
   as "desc" values are isolated snippets as opposed to all.html which
   can be viewed as a standalone and self contained document.

3) Eliminated preprocessing for @include entirely, including the test
   case for this function.

4) _toc.md was renamed to index.md.

5) index comments no longer appear in embedded TOCs (left hand side
   column in the generated documentation.

PR-URL: #21637
Reviewed-By: Vse Mozhet Byt <vsemozhetbyt@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
@targos targos removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Jul 12, 2018
@targos targos mentioned this pull request Jul 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Issues and PRs related to build files or the CI. doc Issues and PRs related to the documentations. tools Issues and PRs related to the tools directory.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants