Use internal (local) git repo to version selected files #15

Open
sickill opened this Issue Jan 25, 2012 · 12 comments

8 participants

@sickill
Owner

There are some feature requests related to versioning synced files. I don't plan to implement native versioning because the goal of bitpocket was and still is to be a simple sync tool built on rsync.

However, I can see one, possibly working, solution based on git:

bitpocket could have an internal, local git repository (e.g. hidden in .bitpocket/git) with your ~/BitPocket dir as its working tree. All files could be ignored by default, and you would run bitpocket track somefile to add a file to git and start versioning it. Later you could either run bitpocket snapshot at any time to commit all the changes to tracked files, or run bitpocket sync --snapshot to commit and sync. This local git repository would be transferred with rsync to other machines like all other files, so all of the versions would be accessible on all your machines.

This way the repository size can stay reasonably small because you track only the files you want (and you definitely don't want to track mp3 or avi files). Files not tracked by git would still be synced to other machines as they are now.
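A minimal sketch of how the track and snapshot commands described above might map onto plain git invocations. The helper names, the commit message, and the use of `git add -f` to get past the ignore-everything default are assumptions for illustration, not bitpocket's actual implementation. The functions only build argument lists; nothing runs until you pass one to `subprocess.run`.

```python
from pathlib import PurePosixPath

# Assumed location of the hidden repository, taken from the proposal above.
GIT_DIR_NAME = ".bitpocket/git"


def git_base(root):
    """Git prefix that keeps the repo hidden while using root as the work tree."""
    root = PurePosixPath(root)
    return ["git", f"--git-dir={root / GIT_DIR_NAME}", f"--work-tree={root}"]


def track_cmd(root, path):
    """Hypothetical 'bitpocket track': start versioning one file.

    -f is needed because every file is assumed to be ignored by default.
    """
    return git_base(root) + ["add", "-f", "--", path]


def snapshot_cmd(root, message="bitpocket snapshot"):
    """Hypothetical 'bitpocket snapshot': commit changes to tracked files.

    -a stages modifications and deletions of already-tracked files only,
    so untracked files stay out of the history.
    """
    return git_base(root) + ["commit", "-a", "-m", message]
```

For example, `subprocess.run(track_cmd("/home/me/BitPocket", "notes.txt"), check=True)` would stage notes.txt in the hidden repository, assuming that repository has already been initialised.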

You could also adjust the .gitignore file to e.g. have all *.txt files tracked automatically, without needing to run bitpocket track manually for each file.
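As a rough illustration of the ignore-file idea, the sketch below generates patterns that ignore everything except a chosen set of globs. The function name is hypothetical; the patterns themselves use standard gitignore syntax, where a leading `!` re-includes a path and `!*/` keeps directories visible so git can descend into them.

```python
def ignore_patterns(tracked_globs):
    """Build ignore-file contents that hide everything except tracked_globs."""
    lines = [
        "*",    # ignore every file by default
        "!*/",  # but keep directories, so matching files in subdirs are found
    ]
    lines += [f"!{glob}" for glob in tracked_globs]  # re-include wanted files
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(ignore_patterns(["*.txt"]), end="")
```

With patterns like these in place, a snapshot that stages with `git add -A` would pick up new *.txt files on its own, while everything else stays untracked and is still synced by rsync.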

Thoughts? Suggestions?

@mindctrl

Interesting idea. I like the ability to specify which files to track individually, and the option for a wildcard to track *.txt.

It would be neat to have the option to flip that and have it default to tracking all with option to ignore tracking on wildcards *.mp3, kinda like .bitpocket/exclude does now for the sync function.

When you said you could run 'bitpocket snapshot' to commit changes from tracked files, were you meaning that you could make changes to revisions and they would be synced/merged back into the master? I'm kinda confused about that.

@flxfxp

The problem with using git is that it does a complete sync of the repository, automatically syncing all revisions as well. This means that your usable size can be 20MB but the size of the repo can be as large as 1GB due to all revisions.

@sickill
Owner

@flxfxp Note that such disk usage (20MB -> 1GB) would only result from storing binary files, which you normally should not store in git. Of course you may want to version your PNG files or something similar, but I wouldn't expect such dramatic disk usage (correct me if I'm wrong).
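To put rough numbers on the 20MB -> 1GB figure, here is a deliberately crude model. The function and its `stored_fraction` parameter are assumptions for illustration: the first revision is stored in full and every later revision costs a fixed fraction of the file size. Real git packing (zlib compression plus delta heuristics) is more involved, so treat this as an order-of-magnitude estimate only.

```python
def estimated_history_mb(file_mb, revisions, stored_fraction):
    """Crude repository size estimate in MB.

    stored_fraction is the assumed cost of each later revision relative to
    the full file: near 1.0 for already-compressed binaries that do not
    delta well, much smaller for text with small edits.
    """
    if revisions < 1:
        return 0.0
    return file_mb * (1 + (revisions - 1) * stored_fraction)


if __name__ == "__main__":
    # 20 MB of binaries, 50 revisions, no delta savings at all.
    print(estimated_history_mb(20, 50, 1.0))
    # The same amount of text where each revision changes about 2%.
    print(round(estimated_history_mb(20, 50, 0.02), 1))
```

Under these assumptions, fifty full rewrites of 20 MB of binaries reach the 1 GB mentioned above, while the text case stays around 40 MB.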

@sickill
Owner

@mindctrl bitpocket snapshot would just do git add + git commit to master. Sync could do one of two things:

  • rsync the internal git repository as well (the con is that there might already be a new commit in the bitpocket remote's git repo, which could lead to losing HEAD)
  • git pull/push (the con is that there might be a conflict, and we would need to ask the user somehow to resolve it).

@flxfxp

@sickill sorry but I'm going to correct you :) The main reason a lot of people use Dropbox is to store any kind of data. For some people this means documents, mp3s, applications, program code, psds, designs, etc. The key element of Dropbox is that it doesn't judge: it works for any type of file, and by using a cloud revision system it keeps the actual Dropbox size small, whilst still keeping all revisions.

@kibiz0r

I agree with @flxfxp on this. To be a true Dropbox replacement, you must be able to version any kind of file without incurring a significant overhead.

The way I picture it is this:

  • The server has a git repo
  • Clients say bitpocket track <myfile> to add and commit a file
  • When clients sync, the server adds and commits tracked files
  • Clients always have the latest version of all files
  • Clients can view timestamps of previous versions of a particular file with bitpocket versions <myfile>
  • Clients can read an old copy with bitpocket read <myfile> <timestamp> which prints the contents to stdout (for example, piped into less)
  • Clients can download an old copy with bitpocket open <myfile> <timestamp> which opens it from the temp dir
  • Clients can rewind a file to an old version with bitpocket rewind <myfile> <timestamp>
  • As far as I can tell, there shouldn't be any conflicts
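The versions, read and rewind commands in the list above could plausibly be thin wrappers around existing git commands. The sketch below only builds the argument lists; the function names are hypothetical, and resolving a timestamp to a commit first (rather than passing it straight through) is an assumption about how such a tool might work.

```python
def versions_cmd(path):
    """List commit timestamps (Unix seconds) of every version of path."""
    return ["git", "log", "--format=%ct", "--", path]


def resolve_cmd(path, timestamp):
    """Find the newest commit touching path at or before timestamp.

    timestamp may be any date string that git itself accepts.
    """
    return ["git", "rev-list", "-1", f"--before={timestamp}", "HEAD", "--", path]


def read_cmd(path, commit):
    """Print an old copy of path to stdout (path is relative to the repo root)."""
    return ["git", "show", f"{commit}:{path}"]


def rewind_cmd(path, commit):
    """Restore path in the working tree to the version from commit."""
    return ["git", "checkout", commit, "--", path]
```

A read would then be two steps: run `resolve_cmd` to get a commit hash, then pass that hash to `read_cmd`. Rewinding only changes the working tree and index, so the restored file would be committed by the next sync like any other edit.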
@sickill
Owner

@kibiz0r Yeah, this is something that can work. The files tracked by git would be excluded from the rsync transfer and sent to the main repo with a normal git pull/push combo. Initially I thought that the git repo could be rsync'ed, but that could lose commits by resetting HEAD when there are diverged branches on both ends. pull/push would be much safer, though it could require human intervention in case of file conflicts.
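The diverged-branches case is the one where overwriting a repository could drop commits, and git can report it directly. Below is a small sketch, assuming the local branch has an upstream configured; the function names and the returned labels are made up for illustration. `git rev-list --left-right --count HEAD...@{upstream}` prints two numbers: commits only on the local side, then commits only on the remote side.

```python
# Command whose output parse_ahead_behind expects.
COUNT_CMD = ["git", "rev-list", "--left-right", "--count", "HEAD...@{upstream}"]


def parse_ahead_behind(output):
    """Parse '<ahead><TAB><behind>' as printed by COUNT_CMD."""
    ahead, behind = (int(field) for field in output.split())
    return ahead, behind


def sync_action(ahead, behind):
    """Decide what a sync step would have to do (labels are illustrative)."""
    if ahead and behind:
        return "diverged"  # both ends have new commits: a merge is needed
    if ahead:
        return "push"      # only local commits: safe to send
    if behind:
        return "pull"      # only remote commits: fast-forward is enough
    return "in sync"


if __name__ == "__main__":
    print(sync_action(*parse_ahead_behind("2\t3\n")))
```

Only the "diverged" outcome needs a merge and possibly human intervention; the other three can be handled without losing history.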

@rennis250

Would this be a good project to work from?

https://github.com/karalabe/gitbox/wiki

@sickill
Owner

@rennis250 I'm not sure, I think rather not. But thanks for the idea. Gitbox is built around Dropbox and bitpocket just around rsync.

@skaapgif

For git-like versioning of large files have a look at https://github.com/bup/bup

@torfason
Collaborator

torfason@b7f82cb provides another, somewhat different approach to this, although the idea there (and in torfason@06f56e6) is more about data safety than about versioning.

That approach stores history in local git repositories on each client, and they are never synced. However, this could be adapted to run git on the server instead, creating a single, authoritative git repository there. I still think the git repositories should not be synced (as in bidirectional syncing); however, one could imagine a command to rsync the server repository to the client to look at the history.
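A one-way copy of the server repository for browsing history could be as small as a single rsync call. This is a sketch under stated assumptions: the function name, the remote spec and the local directory are all hypothetical, and only the well-known rsync flags `-a` and `--delete` are used.

```python
def fetch_history_cmd(remote_git_dir, local_dir):
    """Build a one-way rsync from the server's git dir to a local copy.

    The trailing slash on the source copies the directory's contents, and
    --delete makes the local copy an exact mirror. Nothing is sent back.
    """
    source = remote_git_dir.rstrip("/") + "/"
    return ["rsync", "-a", "--delete", source, local_dir]


if __name__ == "__main__":
    # Hypothetical remote and local paths, for illustration only.
    print(fetch_history_cmd("server:BitPocket/.bitpocket/git", "/tmp/bp-history"))
```

The mirrored directory can then be inspected read-only with, for example, `git --git-dir=/tmp/bp-history log`, which keeps the server repository the single authoritative copy.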

@DanielHeath DanielHeath pushed a commit to DanielHeath/bitpocket that referenced this issue Mar 30, 2014
@sickill Versioning! (see sickill/bitpocket#15) 02add46
@DanielHeath DanielHeath pushed a commit to DanielHeath/bitpocket that referenced this issue Mar 30, 2014
@sickill Revert "Versioning! (see sickill/bitpocket#15)"
This reverts commit 02add46.
4bfd73d
@frank-dspeed

Hello friends. Syncing and versioning via git is already completely implemented in git annex v6: you can check in files, run git annex sync, and so on (google it). I have also forked this script and will add git annex support to it. The good news is that you can use git annex with this script to speed up syncs even on really big repositories with some millions of files; that is the use case where we found your script.

git annex is good but takes too long for our sync case.

git annex will finally add versioning of the backups to bitpocket and will make bitpocket one of the fastest two-way sync tools for really large datasets.
