New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add File module #71
Add File module #71
Conversation
The I think there are a few nitpicky problems with function implementations as well. Go ahead and point them out if you see them, but right now I'm more worried about making this look translatable into Hemlock. |
bootstrap/src/basis/file.ml
Outdated
let of_path ?(mode=Mode.RW) path = | ||
let open Unix in | ||
let unix_flags = Mode.to_unix_flags mode in | ||
try | ||
Ok ({fd=Some (openfile path unix_flags 0o640)}) | ||
with | ||
| Unix_error (errno, fn_name, fn_param) -> begin | ||
Error (String.concat ~sep:" -> " [Unix.error_message errno; fn_name; fn_param]) | ||
end | ||
| _ -> not_reached () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this is a good example of where I'd like to improve things. The contents of this function look nothing like what Hemlock will look like. I'd like to hide the Unix.openfile
call and exception handler behind a wrapper that looks more Hemlock-like, but not really sure what the wrapper function signature should look like. I also wonder if it'd be easier to wrap native c function calls than to wrap Unix
calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing native wrappers around the system calls sounds like a pretty good idea, actually. The result would be quite close to how the Hemlock code will look.
bootstrap/src/basis/file.ml
Outdated
let%expect_test "of_path_hlt" = | ||
let open Format in | ||
printf "@[<h>"; | ||
let t = of_path_hlt ~mode:Mode.R_O "/home/cevans/test.txt" in | ||
let bytes = read_hlt t in | ||
let s = Bytes.to_string_hlt bytes in | ||
let _ = close_hlt t in | ||
printf "t = %a\n" String.pp s; | ||
printf "@]"; | ||
|
||
[%expect{| | ||
t = "The quick brown fox jumped over the red fence.\n" | ||
|}] | ||
|
||
let%expect_test "Stream.of_path_hlt" = | ||
let open Format in | ||
printf "@[<h>"; | ||
Stream.of_path_hlt "/home/cevans/test.txt" | ||
|> Array.of_stream | ||
|> Bytes.to_string_hlt | ||
|> printf "%a\n" String.pp; | ||
printf "@]"; | ||
|
||
[%expect{| | ||
"The quick brown fox jumped over the red fence.\n" | ||
|}] | ||
|
||
let%expect_test "Stream.of_path_hlt" = | ||
let open Format in | ||
printf "@[<h>"; | ||
let _ = Stream.of_path_hlt "/home/cevans/test.txt" | ||
|> Stream.write_hlt (of_path_hlt ~mode:Mode.W_A "/home/cevans/test2.txt") | ||
in | ||
printf "@]"; | ||
|
||
[%expect{| |}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll get rid of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that it's possible to use the stand-alone test infrastructure similar to what I did for the entropy tests (in bootstrap/test/basis/
) for the cases where you need a test file to operate on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the overall way this is coming together.
bootstrap/src/basis/array.ml
Outdated
[0; 1; 2]; | ||
[0; 1; 2; 3]; | ||
] in | ||
List.iter streams ~f:test; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
List.iter streams ~f:test; | |
List.iter streams ~f:test; |
| [] -> None | ||
| hd :: tl -> Some (hd, tl) | ||
end in | ||
let streams = List.map ~f:(Stream.init_indef ~f) [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was tempted to add Stream.of_list
, but I'll leave that alone for now and take care of it when I try to put Stream
everywhere that it belongs.
bootstrap/src/basis/file.mli
Outdated
type t = | ||
| R_O (** open existing file for read, failing if it does not exist *) | ||
| W (** create new or truncate existing file and open for write *) | ||
| W_A (** create new or append to existing file and open for write *) | ||
| W_AO (** append to existing file and open for write, failing if it does not | ||
exist *) | ||
| W_C (** create new file and open for write, failing if file exists *) | ||
| W_O (** truncate existing file and open for write, failing if file does | ||
not exist *) | ||
| RW (** create new or truncate existing file and open for read and write *) | ||
| RW_A (** create new append to existing file and open for read and write *) | ||
| RW_AO (** append to existing file and open for read and write, failing if | ||
file does not exist *) | ||
| RW_C (** create new file and open for read and write, failing if file | ||
exists *) | ||
| RW_O (** truncate existing file and open for read and write, failing if | ||
file does not exist *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned earlier today, I really like this approach, since the number of legal combinations isn't overwhelming, and this prevents weird/bad combinations.
bootstrap/src/basis/file.mli
Outdated
|
||
module Mode : sig | ||
type t = | ||
| R_O (** open existing file for read, failing if it does not exist *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general comment on, ahem, comments, I try hard to use capitalization and periods whenever it's feasible to write sentence-like comments (and often even for phrases vaguely approximating sentences, but that's not always a good fit). These doc strings are all complete sentences (yay), so they're almost great. 😉
bootstrap/src/basis/file.mli
Outdated
| W_O (** truncate existing file and open for write, failing if file does | ||
not exist *) | ||
| RW (** create new or truncate existing file and open for read and write *) | ||
| RW_A (** create new append to existing file and open for read and write *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| RW_A (** create new append to existing file and open for read and write *) | |
| RW_A (** create new or append to existing file and open for read and write *) |
bootstrap/src/basis/file.ml
Outdated
let t_res = of_path ?mode path in | ||
match t_res with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let t_res = of_path ?mode path in | |
match t_res with | |
match of_path ?mode path with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same basic idea in numerous other places. As a general rule, if you have to make up a temporary name that doesn't aid understanding, inline the expression.
bootstrap/src/basis/file.ml
Outdated
let rec fn i n chars fd = begin | ||
let n' = Unix.write fd chars i n in | ||
match n' = -1 with | ||
| true -> Some "Error reading file" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| true -> Some "Error reading file" | |
| true -> Some "Error writing file" |
bootstrap/src/basis/file.mli
Outdated
(** [read_hlt n t] reads up to [n] bytes from [t] and returns the read bytes or | ||
halts if bytes could not be read. *) | ||
|
||
val write: Bytes.t -> t -> e option |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need to support partial writes. Ordinarily that just causes a bunch of trouble for files (yay interruptible syscalls), but what about for broken pipes, where we did write some bytes before the pipe was closed? I think this comes up for sockets too.
val write: Bytes.t -> t -> (usize, e) Result.t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. Maybe partial writes should return an error that contain the number of lines written, and None
still indicates success? Regardless, the unix standard write operation returns the number of bytes written, so we should as well.
bootstrap/src/basis/file.ml
Outdated
match write file t with | ||
| Some e -> halt e | ||
| None -> () | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bootstrap/src/basis/file.ml
Outdated
let t', t'' = Stream.split buflen t in | ||
let bytes = Array.of_stream t' in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So nice.
Cool, I agree about dropping the I'll fix the doc comments and add real file I/O tests in |
Oh, one thing we're missing here is a way to access files which have names that aren't UTF-8. We talked about having a bottom-level open function which takes a |
Ah, right. I'll fix that. I was hasty here and just plugged in the type that OCaml didn't complain about. Iirc, most filesystems just care about the byte content of the filename, but shells often interpret the bytes as utf8 depending on the locale environment variables? |
Yep, that's the way I understand it. Dealing with non-UTF-8 in the shells is difficult these days, which means non-UTF-8 file names almost never come up in practice. But we can't completely ignore them, unfortunately. |
e01cb67
to
82350d3
Compare
e9f58e9
to
27c2c17
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot has changed since my first review. Once you have my feedback incorporated I'll do another full pass.
bootstrap/src/basis/file.mli
Outdated
type file = t | ||
type t = Buffer.t Stream.t | ||
|
||
(* val read: ?file !-> file * (Buffer.t, Error.t) result Stream.t *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val read: ?file !-> file * (Buffer.t, Error.t) result Stream.t *) | |
(* val of_file: file -> (Buffer.t, Error.t) result Stream.t *) |
I'm uneasy about requiring isolated file references here (?file), because it requires some care to keep a reference isolated until it is consumed. Additionally, there's no benefit here, since we return a reference to the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intent here was to consume a unique reference to a file. The stream would own it. The file
that we return here would only be useful for the File.close
function, basically the interruptable type we mentioned before. Every other function in the API that takes a file input except File.close
requires a the file to be a unique reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unique references are rather precious, and they require meticulous care. I've been calling them "isolated" rather than "unique", because the only thing that keeps them unique is isolation. We should definitely chat about this in more detail and figure out whether they're appropriate for this use case.
bootstrap/src/basis/file.mli
Outdated
(* val read_hlt: ?file !-> file * t *) | ||
val of_file_hlt: file -> file * t | ||
|
||
(* val write: ?file -> t !-> Error.t option * ?file *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val write: ?file -> t !-> Error.t option * ?file *) | |
(* val write: file -> t !-> Error.t option *) |
bootstrap/src/basis/file.mli
Outdated
(** [of_path mode path] opens file in [mode] at [path] and returns the resulting | ||
file or halts if file could not be opened. *) | ||
|
||
(* val of_stdin: unit !-> Bytes.t !-> ?t *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val of_stdin: unit !-> Bytes.t !-> ?t *) | |
(* val of_stdin: unit !-> Bytes.t !-> t *) |
The file
isn't necessarily isolated; it could be a copy of a value in the environment. Same for stdout
and stderr
. In fact, these should probably be much simpler (values rather than functions).
(* val of_stdin: unit !-> Bytes.t !-> ?t *) | |
(* val stdin: t *) |
bootstrap/src/basis/file.c
Outdated
|
||
CAMLprim value | ||
hemlock_file_of_stdin_inner(value a_unit) { | ||
return hemlock_file_finalize_result(dup(STDIN_FILENO)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't dup the terminal file descriptors.
bootstrap/src/basis/file.mli
Outdated
(** [of_path mode path] opens file in [mode] at [path] and returns the resulting | ||
file or an error if file could not be opened. *) | ||
|
||
(* val of_path_hlt: ?flag:Flag.t -> ?mode:usize -> ^Buffer.t !-> t *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val of_path_hlt: ?flag:Flag.t -> ?mode:usize -> ^Buffer.t !-> t *) | |
(* val of_path_hlt: ?flag:Flag.t -> ?mode:usize -> ^Buffer.t !$-> t *) |
bootstrap/src/basis/file.mli
Outdated
(* val read_into: !^Buffer.t -> t $-> Error.t option *) | ||
val read_into: Buffer.t -> t -> (Buffer.t, Error.t) result | ||
|
||
(* val read_into_hlt: !^Buffer.t -> t !-> Buffer.t *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val read_into_hlt: !^Buffer.t -> t !-> Buffer.t *) | |
(* val read_into_hlt: !^Buffer.t -> t !$-> Buffer.t *) |
bootstrap/src/basis/file.mli
Outdated
(** [read n t] reads up to [n] bytes from [t] and returns the read bytes or an | ||
error if bytes could not be read. *) | ||
|
||
(* val read_hlt: t !-> @Buffer.t *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val read_hlt: t !-> @Buffer.t *) | |
(* val read_hlt: t !$-> @Buffer.t *) |
bootstrap/src/basis/file.mli
Outdated
(** [write bytes t] writes [bytes] to [t] and returns an error if bytes could | ||
not be written. *) | ||
|
||
(* val write_hlt: ^Buffer.t -> t !-> usize *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val write_hlt: ^Buffer.t -> t !-> usize *) | |
(* val write_hlt: ^Buffer.t -> t !$-> usize *) |
bootstrap/src/basis/file.mli
Outdated
(** [close t] closes file [t] and returns an error if it could not be closed. | ||
*) | ||
|
||
(* val close_hlt: t !-> unit *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val close_hlt: t !-> unit *) | |
(* val close_hlt: t !$-> unit *) |
bootstrap/src/basis/file.mli
Outdated
index relatve to the beginning of the file or an error if it could not be | ||
changed. *) | ||
|
||
(* val seek_hlt: isize -> t !-> usize *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val seek_hlt: isize -> t !-> usize *) | |
(* val seek_hlt: isize -> t !$-> usize *) |
bootstrap/src/basis/file.mli
Outdated
after to the beginning of the file. Returns the new byte index relatve to | ||
the beginning of the file or an error if it could not be changed. *) | ||
|
||
(* val seek_hd_hlt: isize -> t !-> usize *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val seek_hd_hlt: isize -> t !-> usize *) | |
(* val seek_hd_hlt: isize -> t !$-> usize *) |
bootstrap/src/basis/file.mli
Outdated
before to the end of the file. Returns the new byte index relatve to the | ||
beginning of the file or an error if it could not be changed. *) | ||
|
||
(* val seek_tl_hlt: ?i:isize -> t !-> usize *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val seek_tl_hlt: ?i:isize -> t !-> usize *) | |
(* val seek_tl_hlt: ?i:isize -> t !$-> usize *) |
bootstrap/src/basis/file.mli
Outdated
writes contents of [byte_stream] to it. Returns an error if not all bytes | ||
could be written. *) | ||
|
||
(* val write_hlt: file -> t !-> unit *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val write_hlt: file -> t !-> unit *) | |
(* val write_hlt: file -> t !$-> unit *) |
bootstrap/src/basis/stream.mli
Outdated
@@ -9,6 +9,7 @@ open Rudiments | |||
|
|||
(** {1 Type and derivations} *) | |||
|
|||
(* (`a, >e) t *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* (`a, >e) t *) | |
(* ('a, >e) t *) |
bootstrap/src/basis/stream.mli
Outdated
@@ -34,63 +38,76 @@ val init_indef: f:('state -> ('a * 'state) option) -> 'state -> 'a t | |||
|
|||
(** {1 Length} *) | |||
|
|||
(* val length: 'a >e t >e-> usize *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(* val length: 'a >e t >e-> usize *) | |
(* val length: ('a, >e) t >e-> usize *) |
Same thing in several other places.
8cc7539
to
6e0143b
Compare
Merged via #84. |
ref #52