Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix authenticated flag in a transfer adapter response #159

Merged
merged 5 commits into from
May 7, 2024
Merged

Fix authenticated flag in a transfer adapter response #159

merged 5 commits into from
May 7, 2024

Conversation

vit-zikmund
Copy link
Contributor

The authenticated flag in the /batch endpoint response is always true, which surely is not the case when the PRE_AUTHORIZED_ACTION_PROVIDER is null in the startup configuration.

This prevented me from using the github auth provider, because with authenticated: true the git client doesn't try the stored credentials previously used to authenticate against the batch endpoint.

There's couple loosely related tweaks included, see the details in the particular commits.

The Identity.id field is used in the JWTAuthenticator._generate_token_for_action as one of the fields in the preauth JWT. The human-readable GitHub login is better suitable for this place than an opaque number internally used by GitHub (which they also happen to call an 'id').

Additionally, add some comments to the code I managed to forget how it works :)
The LFS batch response always included authenticated = true, even when the default "PRE_AUTHORIZED_ACTION_PROVIDER" was disabled (= null). This led the git client to forever retry the download/upload attempts without realizing it should try the credentials from its creds store.

This comes in handy with the GitHub auth provider which can handle both the usual authentication and the same for the streaming transfer adapter.
The only package that's being packaged here is 'giftless', any other "rogue" directory confuses the packager.
@vit-zikmund
Copy link
Contributor Author

Finally fixed all lints and tests. Also here's a little story on how I got here.

In my setup, I'm using a storage backend on a private network (while the API is public), so I naturally ended up using the basic streaming transfer adapter. With this, the default JWT PRE_AUTHORIZED_ACTION_PROVIDER hitched a ride and everything seemed to be heading for the brightest future. This worked OK during numerous smoke tests, but once our folks started using the service for real (with a tens GiB project with a thousands binary files), the default JWTs coming back with the download action (but I guess other actions would be hit, too) beared a little ticking time bomb. Because suddenly an innocent git pull started downloading many of those changed files... and then hung forever (as luck would have it, when my colleague eventually ^C'd that, his day's worth of work was gone - I'd attribute that to the git lfs client, but there's still something we/I could have done better).

After some time trying to figure out what's wrong, I found there's a handful of successfully downloaded objects, but then just an endless string of 401s. All due to the JWTs being expired. The lfs client obviously has a very stupid retry logic that - without any backoff - retries any 401 immediately and forever. The only thing that made it stop retrying a particular file was ironically a 503 (server temporarily unavailable, usually to be retried) from the poor proxy I have on the front...

But let's focus now on why it had to retry in the first place. The default lifetime for those JWTs is 60s, but it's obviously possible that the downloading won't start at once for all the files due to some sane connection limits in the client. Therefore I'd expect the JWT's lifetime would span at least the lifetime of the batch response, which, as I found later, has its own 15 minute default. Giftless docs mention this config parameter only for the multipart transfer mode, which seems to have a different default. This could use some unification, I'm sure. And docs.

Looking at the decoded JWTs eventually made me do this mostly cosmetic 40eee64.

Finally understanding what's most of this preauth logic about (it took a while 🙄), I realized I could use my github auth provider to serve the same purpose (the streaming storage API URL neatly overlaps with the batch API URL, so the auth provider still know which org/repo it is) without dragging in the the JWT preauth logic. That's what I tried by disabling the PRE_AUTHORIZED_ACTION_PROVIDER in the config. Disabling worked, but then the streaming transfer adapter still always responded with authenticated: true, which made the git lfs client assume the authentication data is present in the object's metadata, which wasn't the case (client started with 401s right away and didn't even stop after the action lifetime expired, lol, it's a tough one) . Counting on the fact the preauth is disabled, I believe this flag should be rightfully false. So I introduced a piece of basic logic that sets authenticated to false if the preauth provider is not there at all. I can imagine we could push it further by introducing an interface for letting the preauth provider say how this flag should be set, but I don't see a particular use case for that atm. After the fix the git lfs client correctly provided saved github credentials for the hostname, which was properly handled by the github auth. I'm happy there, one auth for everything, and the github token doesn't expire that easily.

The last change 5575147 just tells setuptools to package only the giftless directory/package (because in my fork I have a new directory which breaks tox). I know this is rather presumptuous of me to want this upstream, but then again it doesn't break anything here 😇 (Just flick a finger and I'll revert it 😉).

@vit-zikmund
Copy link
Contributor Author

I dunno if I should tag you, dear @athornton (whoops, I did it again 😇), as you previously said you're not really part of the Datopian pack (still you were the most fruitful partner in getting things merged).

@athornton
Copy link
Collaborator

Sorry! Way behind on my email. I'll take a look at this stuff in a bit.

@athornton
Copy link
Collaborator

Packaging just giftless is fine, and, yeah, I got really lost trying to go through the preauth stuff too several months back.

This looks OK to merge to me, but I want to build an image from it and run that in testing for a little while to make sure that my sort-of-different-s3-via-Google-Workload-Identity doesn't break under it. From what I'm reading it shouldn't, but I should make sure, especially since I have a tough time following the thread through all the abstraction layers.

@athornton
Copy link
Collaborator

{
    "business": "GitLFS",
    "failure_count": 1,
    "monkey_count": 1,
    "name": "gitlfs",
    "start_time": "2024-03-28T22:52:07.192506Z",
    "success_count": 56033
}

The failure was when I restarted it with the new image. I'm going to merge, and then I'll get the dependabot PRs.

@athornton athornton merged commit 67ee430 into datopian:main May 7, 2024
7 checks passed
@vit-zikmund vit-zikmund deleted the fix-preauth branch May 7, 2024 16:33
@vit-zikmund
Copy link
Contributor Author

sweet! Thank you! I'm polishing some further patches for github (including docs), so stay tuned :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants