Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mutex around key refresh with get_public_keys_from_web() #137

Merged
merged 1 commit into from
May 14, 2024

Conversation

jthiltges
Copy link
Contributor

Limit key refresh to a single simultaneous request to avoid overloading issuers.

Limit key refresh to a single simultaneous request to avoid
overloading issuers.
Copy link
Contributor

@djw8605 djw8605 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, this will only limit downloads when we already have a valid token in the database. We will still have a stampede when we first see an issuer, but hopefully not after that?

src/scitokens_internal.cpp Show resolved Hide resolved
@jthiltges
Copy link
Contributor Author

As far as I can tell, this will only limit downloads when we already have a valid token in the database. We will still have a stampede when we first see an issuer, but hopefully not after that?

That's correct. This PR focused on the refresh case alone, since that was causing the issue at hand. I could work on expanding it to include new issuers if you prefer.

@djw8605
Copy link
Contributor

djw8605 commented Apr 30, 2024

I'm wondering what type of unit tests would test this? Though, I don't think we need a unittest for this.

@djw8605
Copy link
Contributor

djw8605 commented Apr 30, 2024

Have you tested this on a system?

@jthiltges
Copy link
Contributor Author

I've tested it lightly with a build that included some logging to stdout, and it seemed to work as intended: for a burst of requests and an expired key, saw a single attempt on the issuer. I'd also checked with packet capture, and confirming just two connections to the issuer.

Getting proper testing would be great. Should be do-able, but I think we'd need some sort of logging from the library to verify.

@djw8605
Copy link
Contributor

djw8605 commented Apr 30, 2024

I would accept just making a quick build and running it on a production xrootd server for ~48 hours.

jthiltges added a commit to unlhcc/hcc-packaging that referenced this pull request Apr 30, 2024
@jthiltges
Copy link
Contributor Author

Getting back to this, I'd made a scratch build and have been running it on Nebraska xfer servers--along with cmsaf-dev xcaches--for a number of days now. It doesn't appear to have caused any issues.

Copy link
Contributor

@djw8605 djw8605 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@djw8605 djw8605 merged commit 6fced37 into scitokens:master May 14, 2024
7 checks passed
@jthiltges jthiltges deleted the pr/mutex branch May 24, 2024 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants