Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor BufReader/BufWriter interfaces to be more idiomatic #444

Merged
merged 1 commit into from May 29, 2019

Conversation

8 participants
@piscisaureus
Copy link
Contributor

commented May 24, 2019

No description provided.

@piscisaureus piscisaureus force-pushed the piscisaureus:bufio branch 2 times, most recently from b60b841 to 97c67e1 May 24, 2019

@piscisaureus

This comment has been minimized.

Copy link
Contributor Author

commented May 26, 2019

A similar change could be made w.r.t the ReadResult interface, which we could change to type ReadResult = number | EOF.
(denoland/deno#2384)

@piscisaureus

This comment has been minimized.

Copy link
Contributor Author

commented May 26, 2019

also cc @kevinkassimo

@piscisaureus piscisaureus changed the title WIP: refactor BufReader/BufWriter to be more idiomatic Refactor BufReader/BufWriter interfaces to be more idiomatic May 26, 2019

@piscisaureus

This comment has been minimized.

Copy link
Contributor Author

commented May 27, 2019

also cc @bartlomieju

@kitsonk

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

I agreed with your approach you mentioned in the original issue. I think this API is far more idiomatic Js/Ts. I haven't combed through the PR but liked what I did scan.

@kevinkassimo

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

I also like the proposal, though I do feel we might want to move the definition EOF symbol to core (cli) at some point (so it is possible to also encourage users to do similar stuff without relying on remote deps)

I'm slightly hesitant of type ReadResult = number | EOF though. One benefit of Uint8Array | EOF is that TypeScript enforces user to check possible EOF condition before use of the data, which does not seems to be the case for number | EOF, since this number is not the data itself. Personally I feel it might make more sense either to make type ReadResult = number (0 for EOF, unix-y), or type ReadResult = Uint8Array | EOF (either a subarray (no copy) from the typed array the user provides on actual data read (so that .byteLength is nread), or EOF)

@zekth

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

Looks good to me. Great job!
Also i like the addition of assertNotEOF 🥇

@zekth

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

@ry about:
https://github.com/denoland/deno_std/pull/444/files#diff-231e8b30ed54f62e59f73019a10ef3d0R146

        // TODO(ry):
        // This skipSpace() is definitely misplaced, but I don't know where it
        // comes from nor how to fix it.

see:
https://github.com/golang/go/blob/master/src/net/textproto/reader.go#L162-L177

and used here: https://github.com/golang/go/blob/master/src/net/textproto/reader.go#L150

Not sure if really misplaced.

@kt3k

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

I totally agree that this is more natural as JS/TS. (golang uses returning multi value pattern because it doesn't use throwing for usual error handling, and that's not the case for javascript)

@bartlomieju

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

+1 from me @piscisaureus, definitely cleaner, let me know if you need help with tests.

We should land this PR before landing #453 - the other PR will require extensive rebase

return [0, 0, false];
}
default: {
const Big = 1000000; // arbitrary upper bound

This comment has been minimized.

Copy link
@MarkTiedemann

MarkTiedemann May 27, 2019

Contributor

Why is an arbitrary upper bound for the HTTP version necessary here? Isn't HTTP/2 already binary instead of text-based, and doesn't have a status line?

This comment has been minimized.

Copy link
@zekth

zekth May 27, 2019

Contributor

See golang api. It's a simple port.

This comment has been minimized.

Copy link
@piscisaureus

piscisaureus May 29, 2019

Author Contributor

It seems out of scope for this PR to fix it, but I agree with @MarkTiedemann, this parser seems rather over-engineered.
I think we should accept "HTTP/0.9" and /^HTTP\/1.(\d+)$/ and call it a day.

This comment has been minimized.

Copy link
@ry

ry May 29, 2019

Contributor

SGTM

This comment has been minimized.

Copy link
@MarkTiedemann

MarkTiedemann May 29, 2019

Contributor

It seems out of scope for this PR to fix it

Agreed.

I think we should accept "HTTP/0.9" and /^HTTP\/1.(\d+)$/ and call it a day.

I think we should accept only 0.9, 1.0, and 1.1 in this case. This is effectively an HTTP 1 parser. Since H2 and H3 are completely different and there's not gonna be a 1.2 version (at least not any time soon), I think we can simplify this even more. (And even if 1.3 should come out one day, we'll probably know about it in advance and can just simply add the missing case then. I'd go with YAGNI here.)

This brings up another important point (which is unimportant for this PR, however): Do we need to parse and handle version 0.9 and 1.0 differently? For example, 0.9 has different methods, such as -- and this is not a joke -- SPACEJUMP. Are there important differences that we need to take into account? And on another related note, is anyone still using 0.9 and 1.0? 1.1 is from 1997, 22 years old. If no one is using 0.9 and 1.0, should we just drop support for them?

This comment has been minimized.

Copy link
@zekth

zekth May 29, 2019

Contributor

Agree with all of you. Created an issue for this: #463

}

export async function readRequest(
bufr: BufReader
): Promise<[ServerRequest, BufState]> {
): Promise<ServerRequest | EOF> {

This comment has been minimized.

Copy link
@ry

ry May 27, 2019

Contributor

👍

@@ -1,3 +1,3 @@
// Copyright 2018-2019 the Deno authors. All rights reserved. MIT license.
import "./toml_test.ts";
import "./csv_test.ts";
//import "./csv_test.ts";

This comment has been minimized.

Copy link
@ry

ry May 27, 2019

Contributor

Please create separate issues for csv and ws tests to be re-enabled. Link to them here this:

// TODO(#1234) Re-enable csv test.
// import "./csv_test.ts";
@ry
Copy link
Contributor

left a comment

Since this is going to break ws and csv... can we temporarily create something like //bufio_old so they can continue to work until ported over?

I'm holding all PRs in deno_std until this one lands, as it's big and complex.

@zekth

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

If this is landed i can fix csv pretty quickly.

@ry

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

@zekth If you have time, can you do that preemptively? (Check out this branch, do the port, and send the diff here for @piscisaureus to include in this PR.) I think this would help us land it. Of course we still need to deal with ws

@ry ry referenced this pull request May 27, 2019

Merged

[encoding] add csv parse #458

@zekth

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

Will be fixed today

@zekth

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

csv rewrite is done : piscisaureus#2

@piscisaureus

This comment has been minimized.

Copy link
Contributor Author

commented May 27, 2019

I'm slightly hesitant of type ReadResult = number | EOF though. One benefit of Uint8Array | EOF is that TypeScript enforces user to check possible EOF condition before use of the data, which does not seems to be the case for number | EOF, since this number is not the data itself. Personally I feel it might make more sense either to make type ReadResult = number (0 for EOF, unix-y), or type ReadResult = Uint8Array | EOF (either a subarray (no copy) from the typed array the user provides on actual data read (so that .byteLength is nread), or EOF)

For that reason, BufReader.readFull should maybe take a number of bytes and return a subarray, instead of copying into a user-provided Uint8Array.

(Note that this change can be made gradually)

@ry

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

We should work towards a future where

type EOF = 0;
type ReadResult = number;

See denoland/deno#2384

@piscisaureus piscisaureus force-pushed the piscisaureus:bufio branch 2 times, most recently from 01d6c96 to d2e3819 May 27, 2019

@kitsonk

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

The one "challenge" is that EOF and number will always collapse into the type number in TypeScript, since 0 is contained within number. There aren't negation/subtraction type operators in TypeScript (yet) so you can't have something like "all positive integers" be a distinct type in TypeScript. So while it can't be written as EOF and number, there is no enforcement in TypeScript and in a lot of cases intellisense will collapse to number.

I think it is still a better API, it is just going to be possible to write some unsound code where TypeScript won't help.

@kevinkassimo

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

@kitsonk do you think type ReadResult = number | null would work in that case? null for EOF might make sense as EOF implies "nothing" to read any more (and more interestingly C NULL is 0, so might be okay-ish for people to understand that null implies EOF)

@piscisaureus

This comment has been minimized.

Copy link
Contributor Author

commented May 28, 2019

Tests are broken now because I re-enabled mime/multipart tests in 9605729, but there's a bug in there that I can't figure out at the moment.

The other TODO item is to fix ws plus its tests.

@kitsonk

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

do you think type ReadResult = number | null would work in that case?

In strict mode, TypeScript will keep the type safety of null, so yeah, that might work better. While it is illogical, seeking or reading 0 would result in a read of 0, but not denote EOF, so it does feel like they should be distinct. A type EOF = -1 would also make sense, but again that would collapse to a number again, so only null (or undefined) can be carved out in strict mode. It would allow safe narrowing of the type as well... like:

if (readResult === EOF) {
 // readResult type is `null` here
} else {
  // readResult type is `number` here
}
@kevinkassimo

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

@kitsonk also null (emptiness, nothing) sounds better than undefined (unknown, not well defined) to me in this case by definition in JS

@ry

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

@kitsonk @kevinkassimo if that's the case we can remove the EOF type and just use 0. Everyone knows that 0 means EOF anyway. I do not want number | null - it's unnecessary complicated.

@piscisaureus piscisaureus force-pushed the piscisaureus:bufio branch 2 times, most recently from ea92926 to 621e3bc May 29, 2019

@piscisaureus piscisaureus requested a review from ry May 29, 2019

@piscisaureus

This comment has been minimized.

Copy link
Contributor Author

commented May 29, 2019

I think this is done.
@ry PTAL

@piscisaureus

This comment has been minimized.

Copy link
Contributor Author

commented May 29, 2019

@ry, @kevinkassimo, @kitsonk

I wonder if there are any objections to keeping the EOF symbol and type and moving it into core.
Originally I only added it as a "hack" to get typescript's help during this refactor, thinking that eventually I'd do a global search-and-replace and replace EOF by null, but I must say that it works rather well.

We could define EOF to be 0 or null but I see only disadvantages. Using 0 doesn't work when the return value is a number (e.g. BufReader.readByte() currently returns -1 on eof; I didn't touch it in this PR but I think we might want to change it over as well). Using null is useless in non-strict mode.

So what about just keeping it?

@kitsonk

This comment has been minimized.

Copy link
Contributor

commented May 29, 2019

So what about just keeping it?

A 👍 from me. It hadn't occurred to me to do that, but a symbol will always be a seperate type and not collapse, and it is also seperate at runtime too. Moving it to core make sense to me as well.

@MarkTiedemann

This comment has been minimized.

Copy link
Contributor

commented May 29, 2019

Everyone knows that 0 means EOF anyway.

Not sure that's universal. For example, Java's BufferedReader#read returns -1, .NET's Stream#read returns 0, and C's EOF expands to "a negative integer".

@ry

ry approved these changes May 29, 2019

Copy link
Contributor

left a comment

LGTM modulo comments

Show resolved Hide resolved io/bufio.ts
Show resolved Hide resolved io/bufio.ts
Show resolved Hide resolved io/bufio.ts
Show resolved Hide resolved ws/mod.ts

@piscisaureus piscisaureus force-pushed the piscisaureus:bufio branch from 816f10e to 314d292 May 29, 2019

piscisaureus added a commit to piscisaureus/deno_std that referenced this pull request May 29, 2019

draft commit message
io: refactor BufReader/Writer interfaces to be more idiomatic (denoland#444)

Thanks Vincent Le Gofff (@zekth) for porting over the CSV
implementation.

Fixes: denoland#436

@piscisaureus piscisaureus force-pushed the piscisaureus:bufio branch from 314d292 to ed399d5 May 29, 2019

piscisaureus added a commit to piscisaureus/deno_std that referenced this pull request May 29, 2019

draft commit message
io: refactor BufReader/Writer interfaces to be more idiomatic (denoland#444)

Thanks Vincent Le Goff (@zekth) for porting over the CSV reader
implementation.

Fixes: denoland#436
io: refactor BufReader/Writer interfaces to be more idiomatic (#444)
Thanks Vincent Le Goff (@zekth) for porting over the CSV reader
implementation.

Fixes: #436

@piscisaureus piscisaureus force-pushed the piscisaureus:bufio branch from 5988292 to 0ee6334 May 29, 2019

@piscisaureus piscisaureus merged commit 0ee6334 into denoland:master May 29, 2019

5 checks passed

denoland.deno_std Build #20190529.22 succeeded
Details
denoland.deno_std (Linux) Linux succeeded
Details
denoland.deno_std (Mac) Mac succeeded
Details
denoland.deno_std (Windows) Windows succeeded
Details
license/cla Contributor License Agreement is signed.
Details

@piscisaureus piscisaureus deleted the piscisaureus:bufio branch May 29, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.