Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetching multiple times on Windows with gix 0.63.0 eventually results in "Could not move a temporary file into its desired place" #1404

Closed
udoprog opened this issue Jun 16, 2024 · 7 comments · Fixed by #1411
Assignees
Labels
acknowledged an issue is accepted as shortcoming to be fixed C-Windows Windows-specific issues

Comments

@udoprog
Copy link

udoprog commented Jun 16, 2024

Current behavior 😯

I have a project that uses gix to try to fetch from a remote repo, it results in an error I'm either not sure how to deal with or if it's a bug.

After running the provided example project a few times (2-3), I get this output:

Info: done 0 items in 0.00s (0 items/s)
Info: done 0 items in 0.00s (0 items/s)
Info: done 0 items in 0.05s (0 items/s)
Info: done 0 items in 0.05s (0 items/s)
Success: done 1.1KB in 0.00s (7.2MB/s)
Success: done 0.0 objects in 0.05s (0.0 objects/s)
Info: done 31 items in 0.05s (567 items/s)
Info: pack file at repo-0\.git\objects\pack\pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack is retained despite failing to move the index file into place. You can use plumbing to make it usable.
Error: Could not move a temporary file into its desired place

Caused by:
    Access is denied. (os error 5)

Steps to reproduce 🕹

The gix dependency is configured like this to use reqwest:

[dependencies.gix]
version = "0.63.0"
features = [
    "blocking-http-transport-reqwest",
    "blocking-http-transport-reqwest-rust-tls-trust-dns",
    "blocking-http-transport-reqwest-native-tls",
]

I've set up a minimal project here which tries to fetch refs/tags/v4 from actions/upload-artifact:

https://github.com/udoprog/playground/tree/gix-issue

@Byron Byron added acknowledged an issue is accepted as shortcoming to be fixed C-Windows Windows-specific issues labels Jun 17, 2024
@Byron
Copy link
Member

Byron commented Jun 17, 2024

Thanks for reporting!

I expected the current tempfile handling to cause this eventually, on Windows, and it's similar to what I have been dealing with here.

While researching this I noticed that Git does implement retries for renaming tempfiles into place, and maybe, just maybe, the tempfile could set additional flags on Windows to make this more likely to work.

One thing is for sure - if something legitimately opened the index file for reading, nobody is allowed to write over it. Right now I assume nobody really opened it, but the laziness of the filesystem causes files to remain opened even after they have been closed in the same (or another) process.

CC @EliahKagan - tempfiles are the greatest issue on Windows and this comment sums up everything I know. Generally I have issues reproducing any of this organically, so I suppose more focussed test-programs have to be created to validate certain changes.

@udoprog
Copy link
Author

udoprog commented Jun 17, 2024

One thing is for sure - if something legitimately opened the index file for reading, nobody is allowed to write over it. Right now I assume nobody really opened it, but the laziness of the filesystem causes files to remain opened even after they have been closed in the same (or another) process.

I just tried the included reproduction from a cold but already initialized project (rebooted from yesterday), I verified that there were no open files, the same thing happens. I also noted that this is reliably reproduced for me by simply running the included example project two times. Any subsequent runs causes the issue to happen. There seems to be something about the state of the git repo which triggers this behavior.

I've included a zip file repo-0.zip which includes the state of my repo, extracting it to repo-0 into the reproduction project and running the project reliably reproduces the issue:
repo-0.zip

I've also run a diff between the first and the second run in the repo, it seems like when an existing repo is gix::open-ned, rather than constructed with git::init. With git::open it leaves the following files behind after the process completes:

udoprog@memphis:/mnt/d/repo/playground$ diff -r repo-1 repo-0
Only in repo-0/objects/pack: pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.idx
Only in repo-0/objects/pack: pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack

@udoprog
Copy link
Author

udoprog commented Jun 17, 2024

So I've added some file tracing to a branch of gitoxide, this branch in my playground repo uses it.

[dependencies.gix]
version = "0.63.0"
features = [
    "blocking-http-transport-reqwest",
    "blocking-http-transport-reqwest-rust-tls-trust-dns",
    "blocking-http-transport-reqwest-native-tls",
    "tracing",
]

[patch.crates-io]
gix = { git = "https://github.com/udoprog/gitoxide", branch = "file-tracing" }

With this, and tracing_subscriber configured, I get this:

$env:RUST_BACKTRACE="1"; $env:RUST_LOG="gix=trace"; cargo run
2024-06-17T13:07:14.686913Z DEBUG ThreadSafeRepository::open():open_from_paths(): gix_path::env: invoking git to get system prefix/exec path cmd="git.exe" "--exec-path"
2024-06-17T13:07:16.147940Z TRACE fetch::Prepare::receive(): gix_pack::index::init: open data path="repo-0\\objects\\pack\\pack-1ce5dc60458160361a10a41d8055651c170e39fd.idx"
2024-06-17T13:07:16.148946Z TRACE fetch::Prepare::receive(): gix_pack::data::file::init: open data path="repo-0\\objects\\pack\\pack-1ce5dc60458160361a10a41d8055651c170e39fd.pack"
2024-06-17T13:07:16.150513Z TRACE fetch::Prepare::receive(): gix_pack::index::init: open data path="repo-0\\objects\\pack\\pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.idx"
Info: done 0 items in 0.00s (0 items/s)
Info: done 0 items in 0.00s (0 items/s)
Info: done 0 items in 0.05s (0 items/s)
Info: done 0 items in 0.05s (0 items/s)
Success: done 1.1KB in 0.00s (4.4MB/s)
Success: done 0.0 objects in 0.05s (0.0 objects/s)
Info: done 31 items in 0.05s (572 items/s)
2024-06-17T13:07:16.350491Z TRACE fetch::Prepare::receive():gix_pack::Bundle::write_to_directory(): gix_tempfile::handle::persist: persist self.id=1 path="repo-0\\objects\\pack\\pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack" res=Some(1)
2024-06-17T13:07:16.351354Z TRACE fetch::Prepare::receive():gix_pack::Bundle::write_to_directory(): gix_tempfile::handle::persist: persisted path="repo-0\\objects\\pack\\pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack"
2024-06-17T13:07:16.351828Z TRACE fetch::Prepare::receive():gix_pack::Bundle::write_to_directory(): gix_tempfile::handle::persist: persist self.id=2 path="repo-0\\objects\\pack\\pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.idx" res=Some(2)
2024-06-17T13:07:16.352651Z TRACE fetch::Prepare::receive():gix_pack::Bundle::write_to_directory(): gix_tempfile::handle::persist: errored path="repo-0\\objects\\pack\\pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.idx" error=Os { code: 5, kind: PermissionDenied, message: "Access is denied." }
Info: pack file at repo-0\objects\pack\pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack is retained despite failing to move the index file into place. You can use plumbing to make it usable.
2024-06-17T13:07:16.354211Z TRACE gix_pack::index: close index path="repo-0\\objects\\pack\\pack-1ce5dc60458160361a10a41d8055651c170e39fd.idx"
2024-06-17T13:07:16.354532Z TRACE gix_pack::data: close data path="repo-0\\objects\\pack\\pack-1ce5dc60458160361a10a41d8055651c170e39fd.pack"
2024-06-17T13:07:16.354799Z TRACE gix_pack::index: close index path="repo-0\\objects\\pack\\pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.idx"
Error: Could not move a temporary file into its desired place

Caused by:
    Access is denied. (os error 5)

With this we can see that fetching opens the existing index files, which are still open when we try to persist the new temporary files in the same location since they share the same name.

This is not an issue on Linux since overwriting the files with new ones won't affect or be blocked by any existing open files. But since Windows uses file locking by default this is what we get.

@udoprog
Copy link
Author

udoprog commented Jun 17, 2024

I've included a zip file repo-0.zip ...

I realized that this issue is very likely to be transitory meaning the file I included might not work, since I believe the index files will be different depending on the content of the fetch. So if the remote changes between runs it might succeed.

@Byron
Copy link
Member

Byron commented Jun 17, 2024

Thanks for researching this and providing everything one would need for a reproduction.

Something I confused initially is that when you write index, you mean pack index, but I would understand index file (the Git stage in .git/index). That was confusing as of course the file will be read, but it is held completely in memory without retaining memory maps.

However, with packs and indices, this is different as they will be held open once the index (and later its pack) have been lazily loaded.

Once a fetch is repeated the same file is received, which it then tries to move into place.

What's great is that in this case, we could ignore these errors entirely (if the file destination existed, at least), as it's clear that thanks to the hash in the name, the new file must have the same content as the old one.

Of course, there is an issue before that which is that it should notice that there is nothing to do. Thanks to --depth = 1 this is not happening though as it will always try to deepen, provide the shallow boundary, and get the same pack.

On the wire, fix gix this looks like this:

0012command=fetch
001bagent=git/oxide-0.63.0
0001000ethin-pack
000eofs-delta
0010include-tag
0035shallow 65462800fd760344b1a7b4382951275a0abb4808
000ddeepen 1
0032want 65462800fd760344b1a7b4382951275a0abb4808
0032have 65462800fd760344b1a7b4382951275a0abb4808
0009done
0000

and for git -c http.proxy=localhost:9090 fetch --depth 1 https://github.com/actions/upload-artifact refs/tags/v4 it looks like this (with newlines added):

0011command=fetch
0024agent=git/2.39.3.(Apple.Git-146)
0016object-format=sha1
0001000dthin-pack
000dofs-delta
0034shallow 65462800fd760344b1a7b4382951275a0abb4808
000cdeepen 1
0032want 65462800fd760344b1a7b4382951275a0abb4808
0032have 65462800fd760344b1a7b4382951275a0abb4808
0000

So Git does the same, but does not fail. The reason for this probably is that it receives an empty pack.

0014acknowledgments
0031ACK 65462800fd760344b1a7b4382951275a0abb4808
000aready
00010011shallow-info
0001000dpackfile
0024�PACK�����������‚;بêµ�­jÇ\‚<ı>Ó003a�Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
0006��0000

interestingly, gix also receives an empty pack, but the difference is that it writes it to disk.

0011shallow-info
0001000dpackfile
0024�PACK�����������‚;بêµ�­jÇ\‚<ı>Ó003a�Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
0006��0000

So the solution should be to discard empty packs, as indeed, otherwise it should never receive a pack that it already has. But even if it does, it should be save to still use the assumption that an existing pack is the same as the one that would be put in place.

Byron added a commit that referenced this issue Jun 17, 2024
Failure can occour on Windows as it's likely such a pack or index is already opened
during negotiation. Then, when an empty fetch via `--depth 1` is repeated,
a temporary file would be renamed onto one an mmapped pack or index, causing
a failure.

Now packs or indices aren't written anymore if they are empty.
@Byron Byron linked a pull request Jun 17, 2024 that will close this issue
9 tasks
@Byron Byron self-assigned this Jun 17, 2024
Byron added a commit that referenced this issue Jun 17, 2024
Failure can occour on Windows as it's likely such a pack or index is already opened
during negotiation. Then, when an empty fetch via `--depth 1` is repeated,
a temporary file would be renamed onto one an mmapped pack or index, causing
a failure.

Now packs or indices aren't written anymore if they are empty.
Byron added a commit that referenced this issue Jun 19, 2024
Failure can occour on Windows as it's likely such a pack or index is already opened
during negotiation. Then, when an empty fetch via `--depth 1` is repeated,
a temporary file would be renamed onto one an mmapped pack or index, causing
a failure.

Now packs or indices aren't written anymore if they are empty.
@udoprog
Copy link
Author

udoprog commented Jun 20, 2024

Hi! I'm sorry to report that I'm still seeing this, but only when using gix::remote::fetch::Tags::None.

I've updated the reproduction branch:

https://github.com/udoprog/playground/tree/gix-issue

In order to reproduce, you have to delete the original repository used for testing. Since it seems like if you sync once with gix::remote::fetch::Tags::Included the issue no longer happens.

This is my full test session:

PS D:\Repo\playground> del repo-0

Confirm
The item at D:\Repo\playground\repo-0 has children and the Recurse parameter was not specified. If you continue, all children will be removed with the item. Are you sure you want to continue?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"): 
PS D:\Repo\playground> cargo run 
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.51s
     Running `target\debug\playground.exe`
Info: done 28 items in 0.03s (856 items/s)
Info: done 129981 items in 0.03s (3948749 items/s)
Info: done 0 items in 0.05s (0 items/s)
Info: done 0 items in 0.05s (0 items/s)
Success: done 1.9KB in 0.00s (10.6MB/s)
Success: done 28.0 objects in 0.09s (312.0 objects/s)      
Info: done 34339 items in 0.09s (376858 items/s)
[src\git\mod.rs:42:9] r = Direct {
    full_ref_name: "refs/tags/v1",
    object: Sha1(241e9bf5f01f63286a020946ddaa3fd6d9c32cca),
}
[src\git\mod.rs:42:9] r = Direct {
    full_ref_name: "refs/heads/v1",
    object: Sha1(241e9bf5f01f63286a020946ddaa3fd6d9c32cca),
}
PS D:\Repo\playground> cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.50s
     Running `target\debug\playground.exe`
Info: done 28 items in 0.03s (874 items/s)
Info: done 129981 items in 0.03s (4034234 items/s)
Info: done 0 items in 0.05s (0 items/s)
Info: done 0 items in 0.05s (0 items/s)
Success: done 1.9KB in 0.00s (10.4MB/s)
Success: done 28.0 objects in 0.09s (316.0 objects/s)
Info: done 34339 items in 0.09s (380402 items/s)
Error: Could not move a temporary file into its desired place

Caused by:
    Access is denied. (os error 5)
error: process didn't exit successfully: `target\debug\playground.exe` (exit code: 1)
PS D:\Repo\playground> 

@Byron Byron reopened this Jun 21, 2024
@Byron
Copy link
Member

Byron commented Jun 21, 2024

Thanks for retesting this!

The core issue was that during negotiation, it would access packed objects, leaving the respective packs memory mapped. Then, when receiving a pack that is one that it already has, it fails. Pulling exactly the same pack should not happen, but it seems it more likely with --depth=1.

The solution will be to not fail if the target file already exists as we know it has the right content thanks to the hash in its filename.

Another option could be to mmap files such that Windows allows them to be replaced, but I think I will take the tempfile route for now.

Byron added a commit that referenced this issue Jun 21, 2024
…nes (#1404).

That way, on Windows there is no chance for access-denied errors.
Maybe there are other advantages as well even for Unix.

Since packs are usually written rarely, in comparison to loose objects,
the extra file accesss seems acceptable.
@Byron Byron removed a link to a pull request Jun 21, 2024
9 tasks
@Byron Byron linked a pull request Jun 21, 2024 that will close this issue
Byron added a commit that referenced this issue Jun 21, 2024
…nes (#1404).

That way, on Windows there is no chance for access-denied errors.
Maybe there are other advantages as well even for Unix.

Since packs are usually written rarely, in comparison to loose objects,
the extra file accesss seems acceptable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledged an issue is accepted as shortcoming to be fixed C-Windows Windows-specific issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants