No way to share large content between repo directories. #766

Closed
jwittner opened this issue Oct 21, 2015 · 16 comments

Comments

@jwittner

My objective in our CI is to sync multiple branches one-to-one with folders, but right now that means a new download of our large content (roughly 15 GB) for every new branch. It seems like I should be able to move the storage for the local media and its metadata out to a shared folder that git-lfs could use. As far as I can tell, there is no way to change the local media directory that "git lfs env" reports. If there is a way, then this might just be a documentation clarity issue. I'm not certain whether implementation details prevent this either.

@technoweenie
Contributor

I'd be into allowing the lfs.LocalMediaDir constant to change based on an environment variable. It wouldn't be difficult:

https://github.com/github/git-lfs/blob/f56cd54ad7f8e13df6415a568f5681394bd44b31/lfs/lfs.go#L99-L119

This would lead to a shared LFS directory on each CI server. I also wanted to experiment with some kind of LFS proxy, which could let you have your own shared storage for ALL your CI servers. This could greatly reduce the bandwidth with your upstream LFS server if it isn't already hosted inside your network.
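As a rough illustration of the environment variable idea above, here is a minimal Go sketch. It is not the Git LFS implementation: the variable name `GIT_LFS_MEDIA_DIR` and the function `localMediaDir` are hypothetical, and the `<gitDir>/lfs/objects` default is an assumption based on the layout discussed in this thread.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// mediaDirEnv is a hypothetical variable name; Git LFS does not read it.
const mediaDirEnv = "GIT_LFS_MEDIA_DIR"

// localMediaDir returns the override from the environment when one is set,
// and otherwise falls back to <gitDir>/lfs/objects (assumed default).
// getenv is injected so the lookup can be exercised without changing the
// real process environment.
func localMediaDir(gitDir string, getenv func(string) string) string {
	if dir := getenv(mediaDirEnv); dir != "" {
		return filepath.Clean(dir)
	}
	return filepath.Join(gitDir, "lfs", "objects")
}

func main() {
	fmt.Println(localMediaDir(".git", os.Getenv))
}
```

With a design like this, every clone on a CI server that exports the same variable would resolve to one shared directory.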

@jwittner
Author

Shared folders and the environment variable could get you close to a single storage location solution.

@jwittner
Author

What about driving lfs.LocalMediaDir via git config? It would match the way we configure the endpoint in git-lfs.

git config --add lfs.localMediaDir ?

@technoweenie
Contributor

I like the git config option. Easy to configure per repo or globally 👍

@technoweenie technoweenie changed the title No way to share large content between repo directoies. No way to share large content between repo directories. Oct 21, 2015
@tristanz

I have workflows that would also be helped by this option, so 👍 .

@sinbad
Contributor

sinbad commented Oct 23, 2015

Have you tried using git worktree? Git LFS already supports that so if you create a separate worktree on the other branch, the lfs/objects directory will be shared between them just like the git repo itself. Another advantage is that this will automatically be respected by other functionality like the upcoming git lfs prune.

https://git-scm.com/docs/git-worktree

@tristanz

That's a good suggestion, but my application is slightly different. I want a local cache of all my objects across all my projects. We have large binaries that are shared across multiple projects and get added and removed fairly frequently. Ideally there would be a global LRU cache of objects. The folder for this cache could be configured per project.

@technoweenie
Contributor

Sounds like there are two use cases here:

  1. Multiple clones for a CI server. Git worktrees may work here if the CI server supports them.
  2. A shared object directory for multiple repos. Doable, but the upcoming prune command (#742, "Prune feature (delete old / unreferenced local objects)") would have to be disabled for every repository. Otherwise one repository might remove every other repository's objects because they're not referenced in the first repository's history. This is definitely a more technical option, and users would be responsible for managing this shared directory according to their needs.

Anyone want to take this on?

@jwittner
Author

Git worktree looks like a potential solution. I'm using Jenkins and will have to dig into whether I can use it easily in my case. If not, I'm definitely willing to start looking at a git-lfs solution. Caveats: I'm new to Go and new to Git. If someone else wants to pick this up before that, just let me know here. I'll get back to this thread next week after digging into the worktree solution and let you know.

@sinbad
Contributor

sinbad commented Oct 26, 2015

A shared object directory for multiple repos. Doable, but the upcoming prune command (#742) would have to be disabled for every repository.

You mean unrelated repos? If they were required to register with the shared directory somehow (just a dir with files that are pointers back to the repo), then prune could follow that and behave like it does for worktrees, including those other repos in the retain set. It would need an equivalent of git worktree prune to tidy up references to repos that went away, though, and moving repos could be an issue.
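To make the retain-set idea concrete, here is a small Go sketch under stated assumptions. It is not how the prune command is implemented: the function `prunable`, the repository paths, and the short object IDs are all made up for illustration. The point is only that an object in the shared directory is safe to delete when no registered repository still references it.

```go
package main

import "fmt"

// prunable returns the object IDs in the shared store that no registered
// repository still references. refsByRepo maps a (hypothetical) repository
// path to the object IDs that repository wants to keep.
func prunable(stored []string, refsByRepo map[string][]string) []string {
	// Build the retain set as the union of every repository's references.
	retain := make(map[string]bool)
	for _, oids := range refsByRepo {
		for _, oid := range oids {
			retain[oid] = true
		}
	}

	// Anything stored but not retained may be removed.
	var out []string
	for _, oid := range stored {
		if !retain[oid] {
			out = append(out, oid)
		}
	}
	return out
}

func main() {
	stored := []string{"aaa", "bbb", "ccc"}
	refs := map[string][]string{
		"/ci/repo-one": {"aaa"},
		"/ci/repo-two": {"bbb"},
	}
	fmt.Println(prunable(stored, refs))
}
```

If only `/ci/repo-one` were consulted, `bbb` would also be reported as prunable, which is the failure mode described earlier in the thread.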

In git-lob I supported a shared object store for many repos (related or not) and used hard links to track how many references existed to each object. Basically each repo used its own local object store but the files were really hard links to the shared directory. Advantage was moving folders around didn't break anything but it did complicate the storage engine somewhat (and limit support to filesystems with hard link support - I had to write a hard link implementation for Windows/NTFS). Also not sure how this would interoperate with worktrees now (this was before they were added to git).
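The hard-link approach can be sketched in Go roughly as follows. This is not git-lob's code: the flat directory layout, the function names, and the object ID `abc123` are assumptions made for the example. It only shows a per-repo store whose files are hard links into a shared directory, using `os.Link` and `os.SameFile` from the standard library.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// linkFromShared makes object oid available in repoStore as a hard link to
// the copy held in sharedStore. Both stores use a flat layout here, which is
// a simplification for the example.
func linkFromShared(sharedStore, repoStore, oid string) error {
	if err := os.MkdirAll(repoStore, 0o755); err != nil {
		return err
	}
	dst := filepath.Join(repoStore, oid)
	if _, err := os.Lstat(dst); err == nil {
		return nil // already present in this repository's store
	}
	return os.Link(filepath.Join(sharedStore, oid), dst)
}

// sameObject reports whether both paths refer to the same file on disk.
func sameObject(a, b string) (bool, error) {
	ai, err := os.Stat(a)
	if err != nil {
		return false, err
	}
	bi, err := os.Stat(b)
	if err != nil {
		return false, err
	}
	return os.SameFile(ai, bi), nil
}

// demo builds a throwaway shared store and repo store, links one object,
// and reports whether the two paths share storage.
func demo() (bool, error) {
	root, err := os.MkdirTemp("", "lfs-hardlink-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)

	shared := filepath.Join(root, "shared")
	repo := filepath.Join(root, "repo", ".git", "lfs", "objects")
	if err := os.MkdirAll(shared, 0o755); err != nil {
		return false, err
	}
	if err := os.WriteFile(filepath.Join(shared, "abc123"), []byte("large content"), 0o644); err != nil {
		return false, err
	}
	if err := linkFromShared(shared, repo, "abc123"); err != nil {
		return false, err
	}
	return sameObject(filepath.Join(shared, "abc123"), filepath.Join(repo, "abc123"))
}

func main() {
	ok, err := demo()
	fmt.Println(ok, err)
}
```

Because the link lives inside the repo, moving the repo directory on the same filesystem keeps it valid. Hard links cannot cross filesystems, which matches the limitation mentioned above.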

@jwittner
Author

Worktrees are working for me so I'm not compelled to add this functionality right now. @sinbad for the win.

@jlehtniemi-broadsoft

What if the shared object directory could be provided as an alternate, similar to a git clone reference repository (git clone --reference)? That is, objects would be looked up in this order:

  1. local LFS storage directory
  2. alternative object storage directory if configured
  3. LFS server

It would be a very flexible setup for object sharing, yet it should not cause any problems with the prune command.
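The lookup order proposed here could be sketched in Go like this. Everything in it is illustrative: `locate`, `objectPath`, and `demo` are hypothetical names, the `objects/<2 chars>/<2 chars>/<oid>` layout is an assumption, and the download step is a stand-in function rather than a real LFS API call.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// objectPath assumes an objects/<first 2 chars>/<next 2 chars>/<oid> layout.
func objectPath(storeDir, oid string) string {
	return filepath.Join(storeDir, "objects", oid[0:2], oid[2:4], oid)
}

func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// locate checks the local store, then the optional alternate store, and only
// then calls fetch. It returns where the object came from.
func locate(oid, localDir, alternateDir string, fetch func(oid string) error) (string, error) {
	if len(oid) < 4 {
		return "", errors.New("object ID too short")
	}
	if exists(objectPath(localDir, oid)) {
		return "local", nil
	}
	if alternateDir != "" && exists(objectPath(alternateDir, oid)) {
		return "alternate", nil
	}
	if err := fetch(oid); err != nil {
		return "", err
	}
	return "server", nil
}

// demo seeds one object in each store and resolves three made-up object IDs.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "lfs-lookup-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	local := filepath.Join(root, "local")
	alt := filepath.Join(root, "alternate")
	seed := func(store, oid string) error {
		p := objectPath(store, oid)
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return err
		}
		return os.WriteFile(p, []byte("data"), 0o644)
	}
	if err := seed(local, "aaaa1111"); err != nil {
		return nil, err
	}
	if err := seed(alt, "bbbb2222"); err != nil {
		return nil, err
	}

	fetch := func(string) error { return nil } // stand-in for a server download
	var found []string
	for _, oid := range []string{"aaaa1111", "bbbb2222", "cccc3333"} {
		src, err := locate(oid, local, alt, fetch)
		if err != nil {
			return nil, err
		}
		found = append(found, src)
	}
	return found, nil
}

func main() {
	fmt.Println(demo())
}
```

Since the alternate store is only read from in this sketch, a prune in one repository would never touch it, which is why the proposal avoids the prune problem.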

@technoweenie
Contributor

Added to the roadmap in #1438.

@Wingjam

Wingjam commented Apr 18, 2017

For now, how can we change lfs.LocalMediaDir?

@damirdev

damirdev commented May 4, 2018

Old question, but I was also confused about how to configure a shared directory (for build controllers). I only found a solution after digging through the source code in an IDE.
In the source code:
config.go

if c.fs == nil {
	lfsdir, _ := c.Git.Get("lfs.storage")
	c.fs = fs.New(c.LocalGitDir(), c.LocalWorkingDir(), lfsdir)
}

fs.go

// New initializes a new *Filesystem with the given directories. gitdir is the
// path to the bare repo, workdir is the path to the repository working
// directory, and lfsdir is the optional path to the `.git/lfs` directory.
func New(gitdir, workdir, lfsdir string) *Filesystem {
	fs := &Filesystem{
		GitStorageDir: resolveGitStorageDir(gitdir),
	}

	fs.ReferenceDir = resolveReferenceDir(fs.GitStorageDir)

	if len(lfsdir) == 0 {
		lfsdir = "lfs"
	}

	if filepath.IsAbs(lfsdir) {
		fs.LFSStorageDir = lfsdir
	} else {
		fs.LFSStorageDir = filepath.Join(fs.GitStorageDir, lfsdir)
	}

	return fs
}
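
To see what the branch logic quoted above means for a given lfs.storage value, here is a small standalone Go sketch. The function `storageDir` is a hypothetical stand-in that copies only the empty/absolute/relative decision from the excerpt, and the paths are made up (Unix-style) for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// storageDir mirrors the decision in the excerpt: an empty value defaults to
// "lfs", an absolute value is used as-is, and a relative value is resolved
// against the git storage directory.
func storageDir(gitStorageDir, lfsdir string) string {
	if lfsdir == "" {
		lfsdir = "lfs"
	}
	if filepath.IsAbs(lfsdir) {
		return lfsdir
	}
	return filepath.Join(gitStorageDir, lfsdir)
}

func main() {
	// Example values only; none of these paths come from a real setup.
	for _, v := range []string{"", "lfs-cache", "/srv/lfs/storage"} {
		fmt.Printf("lfs.storage=%q -> %s\n", v, storageDir("/work/repo/.git", v))
	}
}
```

So only an absolute lfs.storage value points several clones at the same directory; a relative value still ends up inside each repository's own git directory.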

Docs: git-lfs-config.5.ronn
Notes about prune
Example (in the global git config, Windows):

>git config --global lfs.storage C:\lfs\storage
>git config lfs.storage
C:\lfs\storage
>git clone https://my-vsts-account.visualstudio.com/_git/lfs-test
Cloning into 'lfs-test'...
Unpacking objects: 100% (14/14), done.
Filtering content: 100% (5/5), 6.80 MiB | 801.00 KiB/s, done.
>ls C:\lfs\storage
incomplete  objects  tmp
>ls C:\lfs\storage\objects
18  1c  51  a7  ca
>cd lfs-test
>ls -a .git
.  ..  HEAD  config  description  hooks  index  info  logs  objects  packed-refs  refs

Per-project directory in bash (MinGW):

$ git -c lfs.storage=/C/lfs/storage/lfs-test clone https://my-vsts-account.visualstudio.com/_git/lfs-test && cd lfs-test && git config lfs.storage /C/lfs/storage/lfs-test

@ttaylorr
Contributor

ttaylorr commented May 4, 2018

Hi @dtulepov, these are all great references. I'm not quite sure about your specific question: are you having trouble configuring lfs.storage, or is something not behaving as expected? Either way, I'd be more than happy to help, but I think that a new issue would be the best place to discuss.
