-
Notifications
You must be signed in to change notification settings - Fork 9
adding index and table parser for new probot feature #60
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, I think the html.pre.js
got changed accidentally....
73b805c
to
c6a6b23
Compare
@tripodsan yeah... force pushed to fix it haha.. |
48577e0
to
bed8372
Compare
src/idx_html.pre.js
Outdated
module.exports.before = { | ||
fetch: (context, action) => { | ||
action.secrets = action.secrets || {}; | ||
action.secrets.HTTP_TIMEOUT = 5000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1000 default time is not enough and times out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a local issue or does this happen in production, too? The low timeout is a resilience feature because it is preferable to have the renderer fail for a single request quickly (after 1000 ms) rather than block four more concurrent execution slots while waiting for a longer timeout.
If it's a local issue, I'd rather look for a local (testing-only) fix. If it happens in production, we need to investigate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i haven't tried it in prod yet but locally it is an issue for now - let me investigate more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok if we use utils/preFetch
there is a 5% chance on average we will get [hlx] error: Gateway timout of 1000 milliseconds exceeded for {url...}
i will test the usage of default preFetch after this PR gets merged in
P.S. timeout is misspelled in the error message, here is the fix PR: adobe/helix-pipeline#439
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@craeyu the local issue should go away when you use PollyJS, which records the request and replays it – the main purpose is to allow offline testing, but making requests faster (by faking them) is a nice side effect.
src/idx_html.pre.js
Outdated
tables.push(images); | ||
context.content.json = { tables }; | ||
|
||
context.content.json.string = JSON.stringify(context.content.json); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor details, but... I think adding a string
property to an object to store its stringified version is a bit... strange ;) Also, you do not seem to need the json object.
why not just:
context.content.tables = JSON.stringify(tables);
or at least:
context.content.tables = {
json: tables,
str: JSON.stringify(tables);
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see.. i am currently using json obj as part of my request in my node app, if we use this then the header would be text/html
and i would need to do JSON.parse
but let me look into it and get back to you.
I looked into this agin, I think it would be better to create a (what we call) pure script, which does not use the filename:
this produces:
having the correct response content type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the JSON pipeline instead of the HTML pipeline, it will also make your string handling easier.
/Edit: @tripodsan beat me to the recommendation by 4 minutes.
src/idx_html.htl
Outdated
@@ -0,0 +1 @@ | |||
${content.json.string} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use the JSON pipeline for this. Instead of requesting README.idx.html
, you'd request README.idx.json
. The htl
file can be dropped and you'd transform the idx_html.pre.js
into a idx_json.js
file. You can use this as a starting point: https://github.com/adobe/helix-cli/blob/master/test/integration/src/json.js
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i agree that the request for something with a .json
extension to correspond with the Content-Type: application/json
is the right path to go. However, I think we currently have a general issue where we make a choice of which pipeline to use based on the extension of the request path.
I think given that the idx.json
represents a view of the DOM or more specifically of the HTML output to be indexed, this should use the html pipeline instead of the json pipeline.
Furthermore i cannot really come up with a usecase anymore that would require a separate json pipeline alltogether, especially considering that most JSON usecases that i can think of are better served with the GraphQL interface to begin with. I currently think that the most obvious .json
usecase would be the .idx.json
which clearly should operate on the HTML pipeline, which makes me think that we might not need multiple pipelines anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When looking at the pipelines, it makes sense to unify the front part of the pipelines, so that JSON, HTML and XML behave more alike, but the back part (output) formatting should probably still stay different, just to ensure things like mine types, OpenWhisk encoding expectations, and so on.
Create idx_html.pre.js Create idx_html.htl style change and import html.pre.js Update idx_html.pre.js
bed8372
to
485f845
Compare
basic.entries.title = titleEl.textContent; | ||
} | ||
|
||
const descEl = document.querySelector('.title .header p'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the advantage of getting the description from the HTML version instead of the markdown version would be, but I guess it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I'll let @davidnuescheler explain why we get the metadata from the rendered HTML instead of taking it from the document.meta
.
🎉 This PR is included in version 1.0.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
added
idx_html.pre.js
andidx_html.htl
to extract necessary elements for new helix-bot feature