Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor BufReader/BufWriter interfaces to be more idiomatic #444

Merged
merged 1 commit into from
May 29, 2019

Conversation

piscisaureus
Copy link
Member

No description provided.

@piscisaureus
Copy link
Member Author

A similar change could be made w.r.t the ReadResult interface, which we could change to type ReadResult = number | EOF.
(denoland/deno#2384)

@piscisaureus
Copy link
Member Author

also cc @kevinkassimo

@piscisaureus piscisaureus changed the title WIP: refactor BufReader/BufWriter to be more idiomatic Refactor BufReader/BufWriter interfaces to be more idiomatic May 26, 2019
@piscisaureus
Copy link
Member Author

also cc @bartlomieju

@kitsonk
Copy link
Contributor

kitsonk commented May 27, 2019

I agreed with your approach you mentioned in the original issue. I think this API is far more idiomatic Js/Ts. I haven't combed through the PR but liked what I did scan.

@kevinkassimo
Copy link
Contributor

kevinkassimo commented May 27, 2019

I also like the proposal, though I do feel we might want to move the definition EOF symbol to core (cli) at some point (so it is possible to also encourage users to do similar stuff without relying on remote deps)

I'm slightly hesitant of type ReadResult = number | EOF though. One benefit of Uint8Array | EOF is that TypeScript enforces user to check possible EOF condition before use of the data, which does not seems to be the case for number | EOF, since this number is not the data itself. Personally I feel it might make more sense either to make type ReadResult = number (0 for EOF, unix-y), or type ReadResult = Uint8Array | EOF (either a subarray (no copy) from the typed array the user provides on actual data read (so that .byteLength is nread), or EOF)

@zekth
Copy link
Contributor

zekth commented May 27, 2019

Looks good to me. Great job!
Also i like the addition of assertNotEOF 🥇

@zekth
Copy link
Contributor

zekth commented May 27, 2019

@ry about:
https://github.com/denoland/deno_std/pull/444/files#diff-231e8b30ed54f62e59f73019a10ef3d0R146

        // TODO(ry):
        // This skipSpace() is definitely misplaced, but I don't know where it
        // comes from nor how to fix it.

see:
https://github.com/golang/go/blob/master/src/net/textproto/reader.go#L162-L177

and used here: https://github.com/golang/go/blob/master/src/net/textproto/reader.go#L150

Not sure if really misplaced.

@kt3k
Copy link
Member

kt3k commented May 27, 2019

I totally agree that this is more natural as JS/TS. (golang uses returning multi value pattern because it doesn't use throwing for usual error handling, and that's not the case for javascript)

@bartlomieju
Copy link
Member

+1 from me @piscisaureus, definitely cleaner, let me know if you need help with tests.

We should land this PR before landing #453 - the other PR will require extensive rebase

return [0, 0, false];
}
default: {
const Big = 1000000; // arbitrary upper bound
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is an arbitrary upper bound for the HTTP version necessary here? Isn't HTTP/2 already binary instead of text-based, and doesn't have a status line?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See golang api. It's a simple port.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems out of scope for this PR to fix it, but I agree with @MarkTiedemann, this parser seems rather over-engineered.
I think we should accept "HTTP/0.9" and /^HTTP\/1.(\d+)$/ and call it a day.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Copy link
Contributor

@MarkTiedemann MarkTiedemann May 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems out of scope for this PR to fix it

Agreed.

I think we should accept "HTTP/0.9" and /^HTTP\/1.(\d+)$/ and call it a day.

I think we should accept only 0.9, 1.0, and 1.1 in this case. This is effectively an HTTP 1 parser. Since H2 and H3 are completely different and there's not gonna be a 1.2 version (at least not any time soon), I think we can simplify this even more. (And even if 1.3 should come out one day, we'll probably know about it in advance and can just simply add the missing case then. I'd go with YAGNI here.)

This brings up another important point (which is unimportant for this PR, however): Do we need to parse and handle version 0.9 and 1.0 differently? For example, 0.9 has different methods, such as -- and this is not a joke -- SPACEJUMP. Are there important differences that we need to take into account? And on another related note, is anyone still using 0.9 and 1.0? 1.1 is from 1997, 22 years old. If no one is using 0.9 and 1.0, should we just drop support for them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with all of you. Created an issue for this: #463

}

export async function readRequest(
bufr: BufReader
): Promise<[ServerRequest, BufState]> {
): Promise<ServerRequest | EOF> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

encoding/test.ts Outdated
@@ -1,3 +1,3 @@
// Copyright 2018-2019 the Deno authors. All rights reserved. MIT license.
import "./toml_test.ts";
import "./csv_test.ts";
//import "./csv_test.ts";
Copy link
Member

@ry ry May 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create separate issues for csv and ws tests to be re-enabled. Link to them here this:

// TODO(#1234) Re-enable csv test.
// import "./csv_test.ts";

Copy link
Member

@ry ry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is going to break ws and csv... can we temporarily create something like //bufio_old so they can continue to work until ported over?

I'm holding all PRs in deno_std until this one lands, as it's big and complex.

@zekth
Copy link
Contributor

zekth commented May 27, 2019

If this is landed i can fix csv pretty quickly.

@ry
Copy link
Member

ry commented May 27, 2019

@zekth If you have time, can you do that preemptively? (Check out this branch, do the port, and send the diff here for @piscisaureus to include in this PR.) I think this would help us land it. Of course we still need to deal with ws

@ry ry mentioned this pull request May 27, 2019
@zekth
Copy link
Contributor

zekth commented May 27, 2019

Will be fixed today

@zekth
Copy link
Contributor

zekth commented May 27, 2019

csv rewrite is done : piscisaureus#2

@piscisaureus
Copy link
Member Author

piscisaureus commented May 27, 2019

I'm slightly hesitant of type ReadResult = number | EOF though. One benefit of Uint8Array | EOF is that TypeScript enforces user to check possible EOF condition before use of the data, which does not seems to be the case for number | EOF, since this number is not the data itself. Personally I feel it might make more sense either to make type ReadResult = number (0 for EOF, unix-y), or type ReadResult = Uint8Array | EOF (either a subarray (no copy) from the typed array the user provides on actual data read (so that .byteLength is nread), or EOF)

For that reason, BufReader.readFull should maybe take a number of bytes and return a subarray, instead of copying into a user-provided Uint8Array.

(Note that this change can be made gradually)

@ry
Copy link
Member

ry commented May 27, 2019

We should work towards a future where

type EOF = 0;
type ReadResult = number;

See denoland/deno#2384

@kitsonk
Copy link
Contributor

kitsonk commented May 28, 2019

The one "challenge" is that EOF and number will always collapse into the type number in TypeScript, since 0 is contained within number. There aren't negation/subtraction type operators in TypeScript (yet) so you can't have something like "all positive integers" be a distinct type in TypeScript. So while it can't be written as EOF and number, there is no enforcement in TypeScript and in a lot of cases intellisense will collapse to number.

I think it is still a better API, it is just going to be possible to write some unsound code where TypeScript won't help.

@kevinkassimo
Copy link
Contributor

kevinkassimo commented May 28, 2019

@kitsonk do you think type ReadResult = number | null would work in that case? null for EOF might make sense as EOF implies "nothing" to read any more (and more interestingly C NULL is 0, so might be okay-ish for people to understand that null implies EOF)

@piscisaureus
Copy link
Member Author

Tests are broken now because I re-enabled mime/multipart tests in 9605729, but there's a bug in there that I can't figure out at the moment.

The other TODO item is to fix ws plus its tests.

@kitsonk
Copy link
Contributor

kitsonk commented May 28, 2019

do you think type ReadResult = number | null would work in that case?

In strict mode, TypeScript will keep the type safety of null, so yeah, that might work better. While it is illogical, seeking or reading 0 would result in a read of 0, but not denote EOF, so it does feel like they should be distinct. A type EOF = -1 would also make sense, but again that would collapse to a number again, so only null (or undefined) can be carved out in strict mode. It would allow safe narrowing of the type as well... like:

if (readResult === EOF) {
 // readResult type is `null` here
} else {
  // readResult type is `number` here
}

@kevinkassimo
Copy link
Contributor

@kitsonk also null (emptiness, nothing) sounds better than undefined (unknown, not well defined) to me in this case by definition in JS

@ry
Copy link
Member

ry commented May 28, 2019

@kitsonk @kevinkassimo if that's the case we can remove the EOF type and just use 0. Everyone knows that 0 means EOF anyway. I do not want number | null - it's unnecessary complicated.

@piscisaureus piscisaureus requested a review from ry May 29, 2019 08:13
@piscisaureus
Copy link
Member Author

I think this is done.
@ry PTAL

@piscisaureus
Copy link
Member Author

piscisaureus commented May 29, 2019

@ry, @kevinkassimo, @kitsonk

I wonder if there are any objections to keeping the EOF symbol and type and moving it into core.
Originally I only added it as a "hack" to get typescript's help during this refactor, thinking that eventually I'd do a global search-and-replace and replace EOF by null, but I must say that it works rather well.

We could define EOF to be 0 or null but I see only disadvantages. Using 0 doesn't work when the return value is a number (e.g. BufReader.readByte() currently returns -1 on eof; I didn't touch it in this PR but I think we might want to change it over as well). Using null is useless in non-strict mode.

So what about just keeping it?

@kitsonk
Copy link
Contributor

kitsonk commented May 29, 2019

So what about just keeping it?

A 👍 from me. It hadn't occurred to me to do that, but a symbol will always be a seperate type and not collapse, and it is also seperate at runtime too. Moving it to core make sense to me as well.

@MarkTiedemann
Copy link
Contributor

Everyone knows that 0 means EOF anyway.

Not sure that's universal. For example, Java's BufferedReader#read returns -1, .NET's Stream#read returns 0, and C's EOF expands to "a negative integer".

Copy link
Member

@ry ry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo comments

io/bufio.ts Show resolved Hide resolved
io/bufio.ts Show resolved Hide resolved
io/bufio.ts Show resolved Hide resolved
ws/mod.ts Show resolved Hide resolved
piscisaureus added a commit to piscisaureus/deno_std that referenced this pull request May 29, 2019
io: refactor BufReader/Writer interfaces to be more idiomatic (denoland#444)

Thanks Vincent Le Gofff (@zekth) for porting over the CSV
implementation.

Fixes: denoland#436
piscisaureus added a commit to piscisaureus/deno_std that referenced this pull request May 29, 2019
io: refactor BufReader/Writer interfaces to be more idiomatic (denoland#444)

Thanks Vincent Le Goff (@zekth) for porting over the CSV reader
implementation.

Fixes: denoland#436
…nd#444)

Thanks Vincent Le Goff (@zekth) for porting over the CSV reader
implementation.

Fixes: denoland#436
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants