An RDF and URN-based content-addressing datastore for creating, processing, exporting, and sharing snapshots of files and directories
Java CSS JavaScript HTML Makefile Batchfile
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
doc
ext-lib
installer
src
test/contentcouch
web
.classpath
.gitignore
.project
ContentCouch.jar.urn
JSON-FORMAT.txt
Makefile
README
build-jar.xml
winstone.bat
winstone.sh

README

ContentCouch is an RDF-and-URI-based system for storing and sharing
snapshots of a hierarchical filesystem.

More conceptual details about ContentCouch can be found on the wiki at
http://wiki.github.com/TOGoS/contentcouch and under doc/ in the project
directory.  This file is intended only to help you get started using
ContentCouch at a basic level.

== Installation ==

* Compile java classes by running build.sh or using Eclipse
* cd into the 'installler' directory
* Run install.bat or install.sh
  - this should create a file, 'ccouch-install.properties'
* Edit ccouch-install.properties, following the instructions therein
* Run install.bat/sh again
  - This will create a script with the classpath and default repository
    configured at the 'script-path' you gave in ccouch-install.properties.

You should now be able to run ccouch at the command line simply by typing
'ccouch' (or whatever you gave as script-path).

For more information, run 'ccouch -?' to see a list of sub-commands, and
'ccouch <subcommand> -?' to see a list of options for any given sub-command.

== Repositories and Configuration Files ==

Each repository can have a configuration file in it called 'ccouch-config'.

This file contains information about other local and remote repositories, and
the options within can also be given on the command line to the 'ccouch'
command.  Configuration files can also give default arguments for subcommands.

Example:

  # Some additional repositories to check when searching for content
  # or to cache heads from (using ccouch cache-heads //<repo-name>/...)
  -local-repo:music /home/ted/ccouch-music/
  -remote-repo:jon http://www.jon.com/ccouch/

  [checkout]
  # Extra arguments to be passed when 'ccouch checkout' is invoked:
  -link
  -merge

Configuration files can also be outside of the main repository and direct to
it.  For example, if there is a file, /home/ted/ccouch.conf:

  -repo /home/ted/ccouch/

Then invoking 'ccouch -repo /home/ted/ccouch.conf' would be equivalent to
'ccouch -repo /home/ted/ccouch/' (which indicates to use /home/ted/ccouch/
as the root directory of the main repository and also load whatever
options are given in /home/ted/ccouch/ccouch-config).

== Repository Roles ==

- main - this is the repository that will be checked first when loading
  files, and that files will be stored in when 'ccouch store' is invoked.
  This repository can reference any number of local and remote
  repositories.  The repository specified using -repo <path> is the main one.

- local - these are other repositories that can be accessed quickly
  (i.e. on the same filesystem or computer as the main repository).
  Specify a local repository using -local-repo[:<name>] <path>.
  
- remote - a repository residing on a machine that is slow or difficult
  to access, such as one connected only by a low-bandwidth link (such as
  the internet).  You generally want to cache files that are loaded from
  remote repositories rather than downloading them each time they are needed.
  Specify a remote repository using -remote-repo[:<name>] <path>.

== Heads ==

Each repository can have a 'heads' directory that contains named files.
The directory structure within heads has the following structure:

  heads/<origin-repo-name>/<project-name>/<version-number>

For example, if I have downloaded heads that were created on my home
computer, 'togos-win', relating to my music archives, I might have a file:

  heads/togos-win/music-archives/5

The origin repository name is included so that repositories can cache
information created by other repositories while providing some namespacing.
If togos-ubn has heads for togos-win, that is saying that togos-ubn is
storing these heads in togos-win's name.  It may be that togos-ubn has its
own music archives (in fact, it does) which may or may not mirror what is
on togos-win.  Because these things are namespaced, the files don't conflict
and it is easy to tell which version of music-archives you are accessing.

The project name part of the head path can itself be split up by sub-project.
You could have a head named like:

  heads/joe-laptop/work/ACME/timelogs/113

The ccouch commands and servlet recognise a URI form that looks like:

  x-ccouch-head:[//<repo-name>]/<origin-repo-name>/[<project-name>/[<version>]]

Using these URIs you can reference specific heads.  If <repo-name> is
included, the URI references the head in that specific repository. Otherwise,
only the main and local (except by the 'cache-heads' command, which will default to using <origin-repo-name> as <repo-name>).

If the URI ends with0 "/latest", it will reference the latest version found
in whichever repositories are being checked, where 'latest' is interpreted
as 'highest filename using natural comparison'.

If the URI ends with "/", it is interpreted as pointing the entire directory
of heads.

If a head is to be interpreted as an RDF document (usually heads contain
RDFified Commit objects), its URI must be prepended with "x-parse-rdf:".
e.g., to reference the latest Commit of togos-win's music archives:

  x-parse-rdf:x-ccouch-head://togos-win/togos-win/music-archives/latest

For convenience, 'ccouch cache-heads' does some special handling of head
paths.  For more information, run

  ccouch cache-heads -?

== Servlet ==

A shell script is included (winstone.sh for unix, winstone.bat for windows)
that will start up the Winstone servlet container with the
ContentCouchExplorerServlet.  This servlet may work with other containers,
but has not been tested with them.

The purpose of the servlet is to give you a way to easily poke around in
your repositories.  It is not a necessary component to use ccouch.

The included shell scripts point Winstone to the web/ directory in the
ContentCouch project.  This directory contains:

  repo-config - configuration file used to configure the servlet repository
  WEB-INF/classes - where all the class files are loaded from
  WEB-INF/web.xml - servlet configuration file

repo-config in the web folder should generally be a single line pointing
to the actual location of the main repository, e.g.

  -repo F:/datastore/ccouch/

== Example: Joe's Laptop == 

Joe has a laptop that he does a lot of work on.  He has a directory full of
photos, a directory full of music, and a bunch of work stuff.  He wants to
keep snapshots of these things, and so he sets up some ContentCouch
repositories.  He sets up one repository for the bulk of his files and
calls it "joe-laptop".  He keeps his work on a separate hard drive, and
so it makes sense for him to have a separate repository on that hard drive
for storing work stuff so that he can store and checkout using hardlinks.
(Note that if his work was on the same drive as his other files, he could
get away with using a single repository, and still keep data separate by
storing it in different 'sectors')

  joe-laptop       ; catch-all repository for files archived from joe-laptop
  joe-laptop-work  ; a repository storing joe's work-related files

Joe's 'work' hard drive is C, and his photos and music are stored on F, so
he sets up his repositores to be on the same drives as the files they are
storing (again, so that he can use hardlinks):

  C:/ccouch/work/
  F:/ccouch/main/

Joe eventually decides that he wants other people to be able to pull from
his repositories, and so he sets up a web server and maps his ccouch
repositories to the following URLs:

  http://laptop.joe.example.org/ccouch/main/
  http://laptop.joe.example.org/ccouch/work/

Next Joe is going to set up a script that will back up files from his friend
Bill.  Bill has a the following repository set up:

  (repo name)  (repo URL)
  bill         http://ccouch.billstuff.example.org/bill/
  
(Note that Bill was not required to include the name of his repository in
its URL, but doing so makes management easier).

Joe now adds a reference to Bill's repository to his main repository's
config file, F:/ccouch/main/ccouch-config:

  # Local stuff:
  -repo-name joe-laptop
  -local-repo C:/ccouch/work
  # Bill's repository:
  -remote-repo:bill http://ccouch.billstuff.example.org/bill/

  # I always wanna use hardlinks to store:
  [checkout]
  -link
  
  [store]
  -link

Now Joe is all set up to back up Bill's stuff.  Here is the shell script:

  # Download all heads from bill that originated there
  ccouch -repo F:/ccouch/main/ cache-heads x-ccouch-head://bill/bill/
  # Cache the objects that one of those heads points at
  ccouch -repo F:/ccouch/main/ cache x-parse-rdf:x-ccouch-head:bill/memoirs/latest
  
  # Also download heads that bill cached from his friend Dave:
  ccouch -repo F:/ccouch/main/ cache-heads x-ccouch-head://bill/dave/
  # And cache Dave's music:
  ccouch -repo F:/ccouch/main/ cache x-parse-rdf:x-ccouch-head:dave/music/latest

Now Joe can run that script to back up files from Bill and Dave.  He
might even run it from a cron job so that those things get
automatically backed up every night.