adding index and table parser for new probot feature #60

constanceyu · 2019-08-09T18:08:34Z

Added idx_html.pre.js and idx_html.htl to extract the elements needed for the new helix-bot feature.

tripodsan

hmm, I think the html.pre.js got changed accidentally....

constanceyu · 2019-08-09T21:25:03Z

@tripodsan yeah... force pushed to fix it haha..

kptdobe · 2019-08-12T13:47:00Z

src/idx_html.pre.js

+module.exports.before = {
+  fetch: (context, action) => {
+    action.secrets = action.secrets || {};
+    action.secrets.HTTP_TIMEOUT = 5000;


Why do you need this ?

The default timeout of 1000 ms is not enough; the request times out.

Is this a local issue or does this happen in production, too? The low timeout is a resilience feature because it is preferable to have the renderer fail for a single request quickly (after 1000 ms) rather than block four more concurrent execution slots while waiting for a longer timeout.

If it's a local issue, I'd rather look for a local (testing-only) fix. If it happens in production, we need to investigate.
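One way to keep such a fix testing-only is sketched below, assuming the override is driven by the environment. The helper name `applyTestTimeout` and the variable `HLX_TEST_HTTP_TIMEOUT` are invented for illustration and are not part of Helix; only `action.secrets.HTTP_TIMEOUT` and the 1000 ms default come from this thread.

```javascript
// Hypothetical helper: raise the HTTP timeout only for local test runs,
// leaving the 1000 ms default untouched everywhere else.
const DEFAULT_HTTP_TIMEOUT = 1000; // ms, the default mentioned in this thread

function applyTestTimeout(action, env = process.env) {
  const secrets = { ...(action.secrets || {}) };
  // HLX_TEST_HTTP_TIMEOUT is an invented variable name, not a Helix setting.
  const override = Number(env.HLX_TEST_HTTP_TIMEOUT);
  if (env.NODE_ENV === 'test' && Number.isFinite(override) && override > 0) {
    secrets.HTTP_TIMEOUT = override;
  } else if (secrets.HTTP_TIMEOUT === undefined) {
    secrets.HTTP_TIMEOUT = DEFAULT_HTTP_TIMEOUT;
  }
  // Return a copy so the caller's action object is not mutated.
  return { ...action, secrets };
}
```

With this shape, production requests still fail fast after 1000 ms, and only a test run that opts in gets the longer timeout.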

I haven't tried it in production yet, but locally it is an issue for now. Let me investigate more.

OK, if we use utils/preFetch there is, on average, a 5% chance that we get [hlx] error: Gateway timout of 1000 milliseconds exceeded for {url...}. I will test the default preFetch after this PR gets merged.

P.S. timeout is misspelled in the error message, here is the fix PR: adobe/helix-pipeline#439

@craeyu the local issue should go away when you use PollyJS, which records the request and replays it – the main purpose is to allow offline testing, but making requests faster (by faking them) is a nice side effect.
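To illustrate the record-and-replay idea only, here is a dependency-free sketch. It is not the PollyJS API: `createReplayer` and `slowFetch` are made-up names, the recordings live in memory rather than on disk, and the fetch is synchronous to keep the example short.

```javascript
// Not PollyJS: a minimal in-memory illustration of record-and-replay.
function createReplayer(realFetch) {
  const recordings = new Map();
  let liveCalls = 0;
  return {
    fetch(url) {
      if (!recordings.has(url)) {
        liveCalls += 1;
        recordings.set(url, realFetch(url)); // record on the first request
      }
      return recordings.get(url); // replay on every later request
    },
    get liveCalls() {
      return liveCalls;
    },
  };
}

// Stand-in for a slow network call (hypothetical).
const slowFetch = (url) => ({ status: 200, body: `content of ${url}` });

const replayer = createReplayer(slowFetch);
replayer.fetch('https://example.com/README.md');
replayer.fetch('https://example.com/README.md');
console.log(replayer.liveCalls); // 1
```

The second request never reaches `slowFetch`, which is why replayed tests can run offline and cannot hit the gateway timeout.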

kptdobe · 2019-08-12T13:54:12Z

src/idx_html.pre.js

+  tables.push(images);
+  context.content.json = { tables };
+
+  context.content.json.string = JSON.stringify(context.content.json);


Minor details, but... I think adding a string property to an object to store its stringified version is a bit... strange ;) Also, you do not seem to need the json object.
why not just:

context.content.tables = JSON.stringify(tables);

or at least:

context.content.tables = { json: tables, str: JSON.stringify(tables) };

I see. I am currently using the JSON object as part of the request in my Node app. If we use this, the Content-Type header would be text/html and I would need to call JSON.parse, but let me look into it and get back to you.
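A small sketch of the trade-off discussed here, assuming a made-up `tables` value and a plain object standing in for `context.content`: storing only the string is simpler on the producing side, but the consumer then has to parse it back.

```javascript
// Hypothetical data; the real tables come from the parsed document.
const tables = [{ name: 'images', entries: ['a.png', 'b.png'] }];

// Producer side: keep a single stringified property, as suggested above.
const content = {};
content.tables = JSON.stringify(tables);

// Consumer side: a body served as text must be parsed before use.
function readTables(body) {
  return JSON.parse(body);
}

const roundTripped = readTables(content.tables);
console.log(roundTripped[0].entries.length); // 2
```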

tripodsan · 2019-08-12T21:05:06Z

I looked into this again. I think it would be better to create what we call a pure script, which does not use the .htl template at all. For example:

filename: idx_json.js

module.exports.main = (context, action) => {
  const tables = {
    'foo': 42,
  };

  return {
    response: {
      body: tables,
    }
  };
};

this produces:

$ curl -D- http://localhost:3000/index.idx.json
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: application/json; charset=utf-8
Content-Length: 10
ETag: W/"a-+vkST8+aT8XFbCzXLFtrSA9KqEM"
Date: Mon, 12 Aug 2019 21:01:18 GMT
Connection: keep-alive

{"foo":42}

having the correct response content type.
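As a quick sanity check on the output above, the Content-Length header matches the serialized body. This snippet only uses Node built-ins; it does not model how the pipeline itself serializes the response.

```javascript
// Serialize the same object the pure script returns as its body.
const body = JSON.stringify({ foo: 42 });

// Content-Length counts bytes, not characters, so measure the UTF-8 size.
const length = Buffer.byteLength(body, 'utf8');

console.log(body, length); // {"foo":42} 10
```

The `a-` prefix in the ETag appears to be the same length written in hexadecimal, although that depends on how Express generates the tag.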

trieloff

Use the JSON pipeline instead of the HTML pipeline; it will also make your string handling easier.

/Edit: @tripodsan beat me to the recommendation by 4 minutes.

trieloff · 2019-08-12T21:08:16Z

src/idx_html.htl

@@ -0,0 +1 @@
+${content.json.string}


I think we should use the JSON pipeline for this. Instead of requesting README.idx.html, you'd request README.idx.json. The htl file can be dropped and you'd transform the idx_html.pre.js into an idx_json.js file. You can use this as a starting point: https://github.com/adobe/helix-cli/blob/master/test/integration/src/json.js

I agree that having a request with a .json extension correspond to Content-Type: application/json is the right path. However, I think we currently have a general issue in that we choose which pipeline to use based on the extension of the request path.
I think that, given that the idx.json represents a view of the DOM, or more specifically of the HTML output to be indexed, this should use the HTML pipeline instead of the JSON pipeline.

Furthermore, I can no longer come up with a use case that would require a separate JSON pipeline altogether, especially considering that most JSON use cases I can think of are better served by the GraphQL interface to begin with. I currently think that the most obvious .json use case would be the .idx.json, which clearly should operate on the HTML pipeline, which makes me think that we might not need multiple pipelines anymore.

When looking at the pipelines, it makes sense to unify the front part, so that JSON, HTML and XML behave more alike, but the back part (output formatting) should probably still stay different, to account for things like MIME types, OpenWhisk encoding expectations, and so on.
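A rough sketch of that split, under the assumption that the shared front part produces one data object and each extension only differs in how it formats the output. None of these names (`extract`, `outputs`, `render`) exist in helix-pipeline; they are placeholders for the idea.

```javascript
// Shared front part: every format starts from the same extracted data.
function extract(context) {
  const content = context.content || {};
  return { title: content.title || '', tables: content.tables || [] };
}

// Minimal escaping so the HTML output stays well-formed.
const escapeHtml = (text) => String(text)
  .replace(/&/g, '&amp;')
  .replace(/</g, '&lt;')
  .replace(/>/g, '&gt;');

// Format-specific back part: MIME type and body encoding differ per output.
const outputs = new Map([
  ['json', (data) => ({
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(data),
  })],
  ['html', (data) => ({
    headers: { 'Content-Type': 'text/html' },
    body: `<h1>${escapeHtml(data.title)}</h1>`,
  })],
]);

function render(context, extension) {
  const format = outputs.get(extension);
  if (!format) {
    throw new Error(`unsupported extension: ${extension}`);
  }
  return format(extract(context));
}
```

In this arrangement, supporting another output such as XML would mean adding one entry to `outputs` and leaving `extract` alone.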

trieloff · 2019-08-14T20:41:19Z

src/idx_json.js

+    basic.entries.title = titleEl.textContent;
+  }
+
+  const descEl = document.querySelector('.title .header p');


I'm not sure what the advantage of getting the description from the HTML version instead of the markdown version would be, but I guess it works.

trieloff

Ok. I'll let @davidnuescheler explain why we get the metadata from the rendered HTML instead of taking it from the document.meta.
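For comparison, the two sources under discussion can be sketched side by side. The selector is the one from the diff above. The `meta.description` fallback is an assumption about what the markdown-derived metadata might look like, and `describe` is a hypothetical helper, not code from this PR.

```javascript
// `document` is anything exposing querySelector (a DOM document in practice).
// `meta` is a hypothetical metadata object derived from the markdown.
function describe(document, meta = {}) {
  const descEl = document.querySelector('.title .header p');
  const fromHtml = descEl ? descEl.textContent.trim() : '';
  if (fromHtml) {
    // Rendered HTML wins: this is what would actually be indexed.
    return { source: 'html', description: fromHtml };
  }
  return { source: 'meta', description: meta.description || '' };
}

// Stub document for illustration only.
const stubDocument = {
  querySelector: (selector) => (
    selector === '.title .header p' ? { textContent: ' Project intro ' } : null
  ),
};

console.log(describe(stubDocument).description); // Project intro
```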

adobe-bot · 2019-09-11T00:37:46Z

🎉 This PR is included in version 1.0.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

constanceyu requested review from tripodsan and trieloff August 9, 2019 18:08

tripodsan suggested changes Aug 9, 2019

constanceyu force-pushed the conyu/idx_html branch from 73b805c to c6a6b23 Compare August 9, 2019 21:23

tripodsan reviewed Aug 9, 2019

constanceyu force-pushed the conyu/idx_html branch 3 times, most recently from 48577e0 to bed8372 Compare August 10, 2019 01:15

tripodsan approved these changes Aug 10, 2019

kptdobe reviewed Aug 12, 2019

trieloff suggested changes Aug 12, 2019

feat(index): adding idx json for rendering tables

485f845

Create idx_html.pre.js
Create idx_html.htl
style change and import html.pre.js
Update idx_html.pre.js

constanceyu force-pushed the conyu/idx_html branch from bed8372 to 485f845 Compare August 14, 2019 17:20

tripodsan approved these changes Aug 14, 2019

tripodsan requested a review from trieloff August 14, 2019 18:21

trieloff reviewed Aug 14, 2019

trieloff approved these changes Aug 14, 2019

trieloff merged commit 9c67684 into master Aug 14, 2019

constanceyu mentioned this pull request Aug 15, 2019

GraphQL Repository API adobe/helix-home#24

Closed

adobe-bot added the released label Sep 11, 2019

tripodsan deleted the conyu/idx_html branch January 15, 2020 03:41
