Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Revamp file reading and writing #4906

Merged
merged 8 commits into from
Aug 7, 2024
Merged

feat: Revamp file reading and writing #4906

merged 8 commits into from
Aug 7, 2024

Conversation

hargoniX
Copy link
Contributor

@hargoniX hargoniX commented Aug 3, 2024

This PR:

  • changes the implementation of readBinFile and readFile to only require two system calls (stat + read) instead of one read per 1024 byte chunk.
  • fixes a bug where Handle.getLine would get tripped up by a NUL character in the line and cut the string off. This is caused by the fact that the original implementation uses strlen and lean_mk_string which is the backer of mk_string does so as well.
  • fixes a bug where Handle.putStr and thus by extension writeFile would get tripped up by a NUL char in the line and cut the string off. Cause here is the use of fputs when a NUL char is possible.

Closes: #4891
Closes: #3546
Closes: #3741

@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc August 3, 2024 08:06 Inactive
@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Aug 3, 2024
@leanprover-community-mathlib4-bot
Copy link
Collaborator

leanprover-community-mathlib4-bot commented Aug 3, 2024

Mathlib CI status (docs):

  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase e83f78d5af8887d2d384e2405adf85e3ee72fec4 --onto ee430b6c80f7fbd80c3cae1804e634cb4318d147. (2024-08-03 08:13:09)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase e83f78d5af8887d2d384e2405adf85e3ee72fec4 --onto 647a5e94925791f6a16b629b6c16ffad60a7f478. (2024-08-06 08:28:35)

@hargoniX hargoniX marked this pull request as ready for review August 5, 2024 06:38
@hargoniX hargoniX requested a review from Kha August 5, 2024 06:38
IO.println cond
IO.FS.removeFile tmpFile

#eval test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use #guard_msgs, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well move this to tests/run now?

src/Init/System/IO.lean Outdated Show resolved Hide resolved
@Kha
Copy link
Member

Kha commented Aug 5, 2024

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit aaff432.
There were significant changes against commit e83f78d:

  Benchmark          Metric             Change
  ======================================================
+ lake build clean   task-clock          -1.2% (-16.7 σ)
+ stdlib             tactic execution    -1.5% (-19.1 σ)
+ stdlib             wall-clock          -1.6% (-31.0 σ)

@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc August 6, 2024 08:22 Inactive
@hargoniX
Copy link
Contributor Author

hargoniX commented Aug 6, 2024

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit 1be6f69.
There were no significant changes against commit e83f78d.

@hargoniX hargoniX requested a review from Kha August 6, 2024 09:27
Copy link
Member

@TwoFX TwoFX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

IO.println cond
IO.FS.removeFile tmpFile

#eval test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well move this to tests/run now?


def readFile (fname : FilePath) : IO String := do
let data ← readBinFile fname
match String.fromUTF8? data with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make a copy of the data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, unfortunately the runtime representation of ByteArray has the data part inlined instead of as a pointer and the layout of the ByteArray and String object are not compatible :/.

@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc August 6, 2024 15:24 Inactive
@hargoniX hargoniX added this pull request to the merge queue Aug 7, 2024
Merged via the queue into master with commit 473b345 Aug 7, 2024
13 checks passed
@hargoniX
Copy link
Contributor Author

hargoniX commented Aug 7, 2024

Of course windows doesn't have getline and this breaks everything /o\

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IO.FS.readFile corrupts the contents of files containing null bytes null characters break readFile/getLine
6 participants