Migrate from readable-web-to-node-stream to readable-from-web #78

Merged
5 commits merged on Apr 12, 2024

surilindur
Contributor

After dropping support for Node 16, it is now possible to also drop the library used to convert Web ReadableStreams to Node's own Readable streams, by using the built-in Readable.fromWeb function that was added in Node 17.

This helps both reduce the number of dependencies and, most critically, avoid the uncaught errors when the stream is aborted, such as when the SPARQL endpoint crashes for whatever reason. This finally fixes the SPARQL benchmark runner crashing when the server crashes. It only took two weeks or so to figure it out. 😢
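For readers unfamiliar with the built-in, here is a minimal sketch of the conversion described above, assuming a Node version that ships Readable.fromWeb. The helpers `createFailingWebStream` and `consume` are hypothetical names introduced only for this illustration; they are not part of this PR or of any library.

```typescript
import { Readable } from 'node:stream';
import { ReadableStream } from 'node:stream/web';

// Hypothetical helper: a web stream that yields one chunk and then fails,
// standing in for a response body whose connection is dropped mid-transfer.
function createFailingWebStream(): ReadableStream<Uint8Array> {
  let sent = false;
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      if (!sent) {
        sent = true;
        controller.enqueue(new TextEncoder().encode('partial'));
      } else {
        controller.error(new Error('connection closed'));
      }
    },
  });
}

// Hypothetical helper: convert the web stream with the built-in and report
// either a clean end or the error delivered through the 'error' event.
function consume(
  webStream: ReadableStream<Uint8Array>,
): Promise<{ chunks: string[]; error?: Error }> {
  return new Promise((resolve) => {
    const chunks: string[] = [];
    const nodeStream = Readable.fromWeb(webStream);
    nodeStream.on('data', (chunk: Buffer) => chunks.push(chunk.toString()));
    nodeStream.on('error', (error: Error) => resolve({ chunks, error }));
    nodeStream.on('end', () => resolve({ chunks }));
  });
}
```

With an 'error' listener attached, the failure of the underlying web stream arrives as an ordinary event that the caller can handle, rather than as an uncaught exception.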

@coveralls

coveralls commented Mar 21, 2024

Pull Request Test Coverage Report for Build 8427459576

Details

  • 3 of 3 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 100.0%

Totals Coverage Status
Change from base Build 8372373746: 0.0%
Covered Lines: 83
Relevant Lines: 83

💛 - Coveralls

@rubensworks
Owner

Unfortunately, we can't depend on Node built-ins, as they are not available in the browser.

Perhaps this functionality also exists in the readable-stream package?

if (isStream(httpResponse.body)) {
  responseStream = <NodeJS.ReadableStream> <unknown> httpResponse.body;
} else {
  const httpResponseBodyReader = httpResponse.body.getReader();
Owner

So Readable.fromWeb does not exist in readable-stream?

If not, to what extent does the code below correspond to Node's Readable.fromWeb implementation? Just to know how stable and tested the code below is.

Contributor Author

Correct, Readable.fromWeb does not exist in readable-stream. It has a from that supports AsyncIterable and something else, but not the response body directly as-is.

The code does approximately the same thing as Node's own implementation here in the read method, except it does not destroy the stream upon error, just emits the error as an event.

It seems to work the same way as Readable.fromWeb with regard to successful and failed streams, based on testing with the SPARQL benchmark runner and the unit test. The new unit test I added makes sure the errors can be handled through the error events in the converted Readable stream, whereas they were previously uncaught.

I do not know how to further test this, so any tips would be welcome.
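To make the comparison above concrete, here is a rough sketch of a reader-based conversion that emits the error as an event instead of destroying the stream. This is not the code from the PR; `webToNodeReadable` is a hypothetical name, and `node:stream` is used here only to keep the example self-contained (the PR itself cannot rely on Node built-ins).

```typescript
import { Readable } from 'node:stream';
import { ReadableStream } from 'node:stream/web';

// Hypothetical stand-in for the conversion under discussion.
function webToNodeReadable(webStream: ReadableStream<Uint8Array>): Readable {
  const reader = webStream.getReader();
  return new Readable({
    read() {
      reader.read().then(
        ({ done, value }) => {
          // Pushing null signals end-of-stream; otherwise forward the chunk.
          this.push(done ? null : value);
        },
        (error: unknown) => {
          // Per the comment above, Node's version destroys the stream at this
          // point; this variant only emits the error event.
          this.emit('error', error);
        },
      );
    },
    destroy(error, callback) {
      // Release the underlying web stream when the Node stream is destroyed.
      reader.cancel(error ?? undefined).then(
        () => callback(error),
        () => callback(error),
      );
    },
  });
}
```

A consumer that attaches an 'error' listener before reading receives the failure as an event, and the converted stream itself is left undestroyed, which is the behavioural difference described above.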

Owner

Could you also manually test this in a browser build of comunica targeting a SPARQL endpoint?
The easiest way would probably be to link this within the comunica jQuery widget and then send some queries to SPARQL endpoints.

Contributor Author

I will try to reimplement the entire fromWeb function using Node's as reference. No idea how it will work.

Owner

Can't we just stick with readable-web-to-node-stream then? We know that works. No need to re-invent the wheel, unless you see a good reason?

Contributor Author

The readable-web-to-node-stream library does not handle errors at all, and it crashes the whole program with an uncaught error exception when the web stream errors out, such as when the server terminates the connection or the undici body timeout is reached. That is what also causes the SPARQL benchmark runner to crash when the SPARQL endpoint crashes or otherwise closes the connection.

Contributor Author

I could look into fixing the fork of readable-web-to-node-stream next week, I guess. This uncaught error has already taken two weeks or more, so spending a day or two on fixing that should not be that bad.

Owner

> The readable-web-to-node-stream library does not handle errors at all, and it crashes the whole program with an uncaught error exception when the web stream errors out

Ok, that's a good point indeed.

Fixing the fork then sounds like a good idea. Perhaps you could even create your own fork (maybe we even want it under the comunica namespace)?
If I remember correctly, readable-web-to-node-stream is used in another place in comunica, so having that reusable fork would be good.

Contributor Author

There is now a library under Comunica that handles this conversion without crashing everything, and I have updated the code in this PR to use that instead.

@surilindur changed the title from "Use built-in Readable.fromWeb over readable-web-to-node-stream" to "Migrate from readable-web-to-node-stream to readable-from-web" on Mar 25, 2024
@surilindur changed the title from "Migrate from readable-web-to-node-stream to readable-from-web" to "Migrate from readable-web-to-node-stream to readable-from-web" on Mar 25, 2024
@surilindur force-pushed the feat/use-readable-fromweb branch 2 times, most recently from eb60a15 to c866592, on March 25, 2024 12:08
Owner

@rubensworks left a comment

Nice work!

Before we merge, can you confirm that you have linked this to the comunica jQuery widget, built it for the browser, and tried out a SPARQL query sent to a SPARQL endpoint, to manually check that things still work correctly in the browser?

   */
-  public async fetchBindings(endpoint: string, query: string): Promise<NodeJS.ReadableStream> {
+  public async fetchBindings(endpoint: string, query: string): Promise<Readable> {
Owner

I apologise, but I must backtrack on my previous comment.
We'll want the return type here to remain NodeJS.ReadableStream to avoid breaking changes in the API (in practice, both should be usable interchangeably).

Contributor Author

Sounds reasonable. 🙂 I have reverted the changes now, leaving only the function-internal one (to replace the library) and a cast in fetchTriples after swapping the Readable to come from readable-stream (it was already returning that one, as opposed to NodeJS.ReadableStream).

Do the changes look acceptable now?
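As an illustration of keeping the public return type stable while changing the internals, here is a small sketch. It assumes Node's own Readable from `node:stream`, which is assignable to NodeJS.ReadableStream without a cast; with the Readable from the readable-stream package, a cast was needed, as mentioned above. The function name `fetchBindingsSketch` is hypothetical.

```typescript
import { Readable } from 'node:stream';

// Hypothetical stand-in for a fetcher method with a stable public signature.
function fetchBindingsSketch(rows: string[]): NodeJS.ReadableStream {
  // Internally, a concrete Readable is created from the given rows...
  const internal: Readable = Readable.from(rows);
  // ...but callers only see the broader NodeJS.ReadableStream interface,
  // so swapping the internal implementation is not a breaking API change.
  return internal;
}
```

Callers that only rely on the event-based interface (`on('data')`, `on('end')`, `on('error')`) are unaffected by what produces the stream internally.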

@surilindur
Contributor Author

> Before we merge, can you confirm that you have linked this to the comunica jQuery widget, built it for the browser, and tried out a SPARQL query sent to a SPARQL endpoint, to manually check that things still work correctly in the browser?

I finally managed to add browser tests to the readable-from-web library for Firefox, Chromium and WebKit, and those pass after using webpack to bundle the library itself for browser use. This means that the new replacement library works for converting streams in those browsers, as well.

If still needed, I can do the linking and testing with the jQuery widget, but I think it will work just fine in a browser, since the old library also did. 🤔

@rubensworks
Owner

> If still needed, I can do the linking and testing with the jQuery widget, but I think it will work just fine in a browser, since the old library also did. 🤔

I would definitely do this. From personal experience, I know that unexpected breakages easily happen when working across Node and browser environments.

@surilindur
Contributor Author

I linked the version from this branch to Comunica (v3 master), linked that Comunica to the jQuery widget, built it, and tested it with the DBpedia SPARQL endpoint without issues. 🎉

@rubensworks merged commit 6f62a48 into rubensworks:master on Apr 12, 2024
8 checks passed
@rubensworks
Owner

Thanks @surilindur!

I'll wait to release a new version until #79 is resolved.

Could you also look into applying this change in comunica's ActorHttp?

@surilindur deleted the feat/use-readable-fromweb branch on April 23, 2024 13:16