support creating partial mirror (Stratum 1) of CernVM-FS repo #3554
Comments
If a client tried to access a file that did not exist on the partial server, it would get a 404 and then, I believe, fail over to a different server altogether, making the partial server no longer useful. Or is the idea to prevent clients from accessing parts of a repo at all? Wouldn't a proxy server or a writeable (non-preloaded) alien cache be a better solution? With those you can safely empty the cache or clean up files at will without risk of causing problems.
I wonder if @boegel is expecting that partially replicating a repo would also prevent the non-replicated files from showing up in a directory listing to the user. I don't think that's a possibility, since the top level catalog (and therefore all nested catalogs) has to match the publish-time hash.
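To illustrate the point about catalog hashes, here is a toy sketch. It assumes a simplified model in which a catalog's hash covers its own entries plus the hashes of its nested catalogs; the `catalog_hash` helper, the paths, and the JSON encoding are all hypothetical and are not how CernVM-FS actually stores catalogs. SHA-1 is only a stand-in hash here.

```python
import hashlib
import json


def catalog_hash(entries, nested):
    """Hash a toy catalog: its file entries plus the hashes of nested catalogs."""
    payload = json.dumps({"entries": sorted(entries), "nested": nested}, sort_keys=True)
    return hashlib.sha1(payload.encode()).hexdigest()


# Repository as published: one nested catalog per architecture (hypothetical paths).
aarch64 = catalog_hash(["/aarch64/bin/gcc", "/aarch64/lib/libm.so"], {})
x86_64 = catalog_hash(["/x86_64/bin/gcc", "/x86_64/lib/libm.so"], {})
published_root = catalog_hash(["/README"], {"/aarch64": aarch64, "/x86_64": x86_64})

# A mirror that only skips data objects keeps the catalogs untouched,
# so the root hash still matches what was published.
mirrored_root = catalog_hash(["/README"], {"/aarch64": aarch64, "/x86_64": x86_64})

# Hiding the aarch64 subtree from listings would mean rewriting the root
# catalog, which changes its hash.
pruned_root = catalog_hash(["/README"], {"/x86_64": x86_64})

print(mirrored_root == published_root)  # True
print(pruned_root == published_root)    # False
```

In this model a partial mirror can leave out file contents, but it cannot leave entries out of the listing without producing a root catalog that no longer matches the published hash.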
The intention here is not to prevent users from accessing particular parts of the repository, but to only have an in-network copy of the parts of the repository that are actually relevant for that particular site.
But the delay/404 would result in the local Stratum 1 not being used at all anymore. Is the goal to save a little bit of storage space, or are people objecting to replicating data they don't need?
Maybe as a small note: disk storage should not be such a big problem. The public repos at CERN combined are around 650 TB, but on disk they only take around 70 TB. That's nearly a 10x reduction. I would argue that for big HPC sites it should not really be a problem to have a couple of TB for the EESSI repo - and a huge part of the CERN total is the container image repo unpacked.cern.ch. I would agree with @rptaylor and @DrDaveD that an alien cache would be the better choice if storage space is really a problem. If EESSI really grows so much that it becomes a problem, maybe it makes sense to split EESSI based on the architecture? You can have the main repo that has symlinks to the different architectures (software-aarch64.eessi.io, software-x86_64.eessi.io, ...). And if a site just wants one specific architecture, they just replicate the repo for that architecture.
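As a quick check of the storage figures quoted above, a minimal sketch using only the two approximate numbers from the comment (650 TB logical, 70 TB on disk); the variable names are illustrative.

```python
# Approximate figures quoted in the comment above.
logical_tb = 650   # combined size of the public repos
on_disk_tb = 70    # space they actually occupy on disk

reduction_factor = logical_tb / on_disk_tb
space_saved = 1 - on_disk_tb / logical_tb

print(f"{reduction_factor:.1f}x reduction")  # 9.3x reduction
print(f"{space_saved:.0%} saved")            # 89% saved
```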
It would be nice if CernVM-FS could provide support for creating a Stratum 1 mirror for only parts of a CernVM-FS repository.
This is already supported by the shrinkwrap utility (see https://cvmfs.readthedocs.io/en/stable/cpt-shrinkwrap.html#creating-an-image-for-root); it would be nice to also support this for Stratum 1 servers.