Skip to content

Change stashcp to read caches list from wlcg-wpad servers by default#108

Merged
djw8605 merged 7 commits intoopensciencegrid:masterfrom
DrDaveD:download-stashservers
May 7, 2020
Merged

Change stashcp to read caches list from wlcg-wpad servers by default#108
djw8605 merged 7 commits intoopensciencegrid:masterfrom
DrDaveD:download-stashservers

Conversation

@DrDaveD
Copy link
Copy Markdown
Contributor

@DrDaveD DrDaveD commented Apr 9, 2020

See discussion in SOFTWARE-3516.

caches.json files can still be used with stashcp -j option, but if it is not set then the list of caches is read from the wlcg-wpad server and signature verified with the OSG cvmfs public key. Adds a new option -n to use an alternate provided list; currently the only option for that is '-n xroots'.

@DrDaveD
Copy link
Copy Markdown
Contributor Author

DrDaveD commented Apr 13, 2020

@djw8605 @efajardo @rynge please review.

Comment thread stashcp/__init__.py
@DrDaveD DrDaveD force-pushed the download-stashservers branch from 4954204 to 667853d Compare April 14, 2020 20:15
Copy link
Copy Markdown
Member

@djw8605 djw8605 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a lot of this logic could be made more clear if there was a new class that handled all of the interaction with the special formatted file. Would have functions to verify the authenticity, handle getting the right lines... Containing all that logic would make this easier to understand.

Comment thread stashcp/__init__.py Outdated
Comment thread stashcp/__init__.py Outdated
Comment thread stashcp/__init__.py Outdated
Comment thread stashcp/__init__.py Outdated
@DrDaveD
Copy link
Copy Markdown
Contributor Author

DrDaveD commented Apr 14, 2020

I think a lot of this logic could be made more clear if there was a new class that handled all of the interaction with the special formatted file. Would have functions to verify the authenticity, handle getting the right lines...

There aren't currently any classes in the command and I don't think it's a good idea to change the style that much for a small subset. Some of this could probably be put into functions that are grouped closer together in the source file, would that do?

@DrDaveD
Copy link
Copy Markdown
Contributor Author

DrDaveD commented Apr 20, 2020

@djw8605 This has stagnated. Can we proceed?

How about @brianhlin, do you have any comments?

@brianhlin brianhlin requested a review from djw8605 April 20, 2020 21:18
@djw8605
Copy link
Copy Markdown
Member

djw8605 commented Apr 20, 2020

How do we update the list of caches?

@DrDaveD
Copy link
Copy Markdown
Contributor Author

DrDaveD commented Apr 21, 2020

By changing the list in CVMFS_EXTERNAL_URL in the master branch of the config-repo repository. The OSG cvmfs config repo is the osg branch and the EGI cvmfs config repo is the egi branch. The master branch is the superset of the two of them.

@matyasselmeci
Copy link
Copy Markdown
Contributor

If I read the code correctly, it still supports admin overrides via a caches.json, right? Can we keep the old caches.json as an example? Rename it to caches.json.example and stick it under the docs directory?

@efajardo
Copy link
Copy Markdown
Contributor

@matyasselmeci yes I support that and I still need that feature so I can use it for folks that do not have their caches in CVMFS, like CMS.

@DrDaveD
Copy link
Copy Markdown
Contributor Author

DrDaveD commented Apr 21, 2020

I put it in docs/configs. I also removed the longitude & latitude information because they were being ignored.

@DrDaveD
Copy link
Copy Markdown
Contributor Author

DrDaveD commented Apr 23, 2020

Ok, is anybody ready to approve this? Can it proceed?

@DrDaveD
Copy link
Copy Markdown
Contributor Author

DrDaveD commented Apr 27, 2020

This has been sitting here long enough, somebody please approve.

Comment thread docs/configs/caches.json Outdated
@brianhlin brianhlin requested a review from efajardo April 28, 2020 15:34
@brianhlin
Copy link
Copy Markdown
Member

At this point, these changes are over my head. @djw8605 @efajardo please look these changes over again and approve if appropriate.

@DrDaveD
Copy link
Copy Markdown
Contributor Author

DrDaveD commented May 6, 2020

I moved json loading and stashserver loading into functions as requested by @djw8605

@djw8605
Copy link
Copy Markdown
Member

djw8605 commented May 6, 2020

Another quick question. When it requests the list of stashcache servers in that file, is it in the ordered list for nearest?

@DrDaveD
Copy link
Copy Markdown
Contributor Author

DrDaveD commented May 6, 2020

The first line of the stashservers.dat api response is the dynamically calculated order of servers, and the rest is the static, signed contents which includes all the servers. stashcp then applies the order and rearranges them nearest to farthest. It is done this way because the server names are sensitive; we don't want a man-in-the-middle attack. Once we have the names, the ordering is not sensitive information so it does not need to be verified. By putting them together in one response, we have minimal communications overhead.

Copy link
Copy Markdown
Contributor

@efajardo efajardo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@djw8605 djw8605 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@djw8605 djw8605 merged commit 1acaa38 into opensciencegrid:master May 7, 2020
@DrDaveD DrDaveD deleted the download-stashservers branch May 7, 2020 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants