a remotestorage server for POSIX systems
JavaScript C CSS Shell C++ Objective-C Python
Switch branches/tags
Nothing to show
Clone or download
Pull request Compare This branch is 4 commits ahead, 5 commits behind remotestorage:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
auth
demo
lib
scripts
src
test
tools
.gitignore
.gitmodules
CHANGELOG
CHANGES
COPYING
LICENSE
LIMITATIONS
Makefile
README
TODO
init-script-defaults
init-script.sh
package.json

README

krsd - A Kerberised RemoteStorage server implementation
=======================================================

Contents:

1. Introduction
2. Overview
2.1 remotestorage
2.2 webfinger
2.3 authorization tools
2.4 storage system
3. Installing
3.1 Dependencies
3.2 Getting the code
3.3 Building
3.4 Installing system-wide
3.5 Setting options
3.6 Integrating authorization

1) Introduction
---------------

remotestorage is an open specification for personal data storage. It is supposed
to replace the currently popular proprietary "cloud storage" protocols using an
open standard and thereby promoting the seperation of applications and their
data on the web.

For more information, check out these links:
  * http://remotestorage.io/ - Information about the remotestorage protocol
                               and current implementations.
  * http://unhosted.org/     - Philosophy, hands-on Tutorials and App collection.

2) Overview
-----------

krsd brings three things:
* a HTTP endpoint implementing remotestorage: /storage/{user}
* a HTTP endpoint implementing webfinger: /.well-known/webfinger

User management is based on Kerberos identities.  The SPNEGO mechanism
specified in RFC 4559 is used to this end.  Briefly put, this is a GSS-API
exchange embedded as base64 in WWW-Authenticate and Authorization headers
of the Negotiate type.

User accounts are unrelated to accounts on the server hosting this service.
Storage is both available to users and to services, and each is identified
with their usual principal names, including forms like these:

	xmpp/xmpp.arpa2.net@ARPA2.NET
	john@ARPA2.NET
	john/admin@ARPA2.NET

These forms are translated to paths on the local filesystem as described
below, under "Storage system".

krsd is a fork of the useful work on rs-serve, and the respective
locations of the original and derived work are:
https://github.com/remotestorage/rs-serve
https://github.com/arpa2/krsd
The name has been changed to avoid confusion with application users.  It is
not clear to date if this project will live its own life or get integrated
with the original branch.  In the first case, a separate name brings
clarity; in the second, it has been harmless.

krsd is entirely written in C, using mostly POSIX library functions. It
relies on a few portable libraries, see the list under "Dependencies" below.
It does however currently use the signalfd() system call, which is only
available on Linux. (this is a solvable problem though, if you want to
be able to run on another system, please open an issue to ask for help)

2.1) remotestorage
------------------

The currently implemented protocol version is "draft-dejong-remotestorage-01".  It has been modified to permit implied, that is HTTP-level, authentication and authorisation.  Demo code is available in a separate directory.

Currently the following features are supported:
* CORS support for all verbs (TODO: May not work at present)
* GET, PUT, DELETE requests on files and folders
* Opaque version strings (in directory listings and "ETag" header)
* Conditional GET, PUT and DELETE requests ("If-Match", "If-None-Match" headers)
* Protection of all non-public paths via authentication by the browser at the HTTP level, and authorisation based on a .k5remotestorage file in the destination file system
* Special handling of public paths (i.e. those starting with /public/), such that
  requests on non-directory paths succeed without authorization.
* HEAD requests on files and folders with "Content-Length" header
  (not part of remotestorage-01, only enabled when --experimental flag is given)

2.2) webfinger
--------------

The webfinger implementation only serves information about remotestorage
and is currently not extensible.
The hostname part of user addresses is expected to be the hostname set for
the rs-serve instance. This currently defaults to "local.dev" and can be
overridden with the --hostname option.
Virtual hosting (== hosting storage for multiple domains from a single
instance) is currently not supported.

The pathname returned ermits for the krsd to parse out two components of
the pathname to which it stores: /storage/domain.tld/user/ should be at
the beginning of the URI if it is to be accepted as a remoteStorage URI
on krsd.  Anything after this is taken as a path into the storage
structures used.

2.3) authorization
------------------

This version of rs-serve cannot employ the original authorization backend
and frontend, because it removes the implicit bearer flow described by
OAuth2, for reasons of security.  Instead, it relies on the implicit and
time-constrained access provided by Kerberos.

It might have been possible to rely on a bearer token obtained from a
Kerberos-specific OAuth2 node, which would have made minimal or no changes
to rs-serve, but the downside of that would be that the intermediate form
of the bearer tokens provide access for all times, whereas Kerberos tickets
passed directly put a time constraint on the exchange.  (This point however,
is negotiable. It seems that OAuth2 permits it.  Encryption to resource
and information-rich handling in the IDP would be possible, in a site
managed and controlled by the user's side of things.  Let's talk?)

Another option might have been to package Kerberos tickets in bearer tokens,
but that would have introduced an extra intermediate party with no added
value but unpacking / repacking the token, so it was deemed less attractive.

The current implementation does not constraint access to particular users,
or groups of users.  It is very likely that such a facility will be added
in a future version, possibly based on central management infrastructure.

The --auth-uri option now causes a warning, but is not fatal.


2.4) Storage system
-------------------

The payload data of the remotestorage endpoint is stored on the local filesystem
within the respective user's target directory.  Determining this directory is
done by splitting the principal name at the @ sign and reshaping it according
to these steps:

 0. Initially, the pathname of a realm is a fixed string that is shared by
    all principals.

    Example 0a.
	A fallback default could be /var/lib/krs/

    Example 0b.
	When using OpenAFS as a backing store for this service, the starting
	point /afs/ may be more useful.

 1. Firstly, the realm is removed and translated into its related domain name,
    which should be supported in the surrounding infrastructure.  In case of
    generic hosting, the surrounding infrastructure could be the DNS, with
    DNSSEC validated for sites that use it.  The result from this mapping is
    the DNS domain name, represented as lowercase characters without a
    trailing dot.

    Example 1a.
	ARPA2.NET is setup in the local infrastructure, and its domain name
	is found to be arpa2.net -- which is used as a pathname component.

    Example 1b.
	ARPA2.NET is not setup in local infrastructure, and it is looked up
	in DNS under the same domain name, arpa2.net -- and if that site
	thought it sufficiently important to sign with DNSSEC, then this
	will be validated.  What is looked up under the domain name is a
	TXT record named _kerberos, holding what should match literally
	with the realm name, case included.  In case of a mismatch, fail to
	help the user.  Since the realm is being looked up, rather than a
	DNS hostname, there will not be a trace up to parent domains.

    What we now concatenate the DNS-name to the installation directory
    as another subdirectory level.

 2. Secondly, the local part (before the @ sign) is treated literally as a
    path, with one or more levels.  Note that this means that a slash is
    treated as a directory separator.  The last level is, once again, a
    directory name.

    Example 2a.
	john@ARPA2.NET will append john/ to the path found so far.

    Example 2b.
	john/admin@ARPA2.NET will append john/admin/ to the path found
	so far.

    Example 2c.
	http/chitchat.arpa2.net will append http/chitchat.arpa2.net/ to
	the path found so far.

  3. Thirdly, a fixed directory name such as remotestorage/ is appended to
     the path name.  This is done to separate remote storage from local
     storage and from other remote storage components, and to make all the
     other directories unavailable.  This is of no use to local storage, but
     it is immensely useful when dealing with storage on an OpenAFS share
     that is also employed for other purposes.

The result is a concrete path into a filesystem.  A few complete examples may
be helpful at this point.

Example.  The user john@ARPA2.NET could be found in DNS zone arpa2.net, and
dependent on local settings his files could end up in mounted locations like:

  /afs/arpa2.net/john/remotestorage/...
  /var/lib/krs/arpa2.net/john/remotestorage/...

Example.  After changing user-ID to john/admin under the same realm, the
files accessible to John are found in places like:

  /afs/arpa2.net/john/admin/remotestorage/...
  /var/lib/krs/arpa2.net/john/admin/remotestorage/...

Example.  When a server is not acting on behalf of a user (through S4U with
Constrained Delegation) but on its own title, it depends on its principal
name.  For example, xmpp/xmpp.arpa2.net@ARPA2.NET could be found on:

  /afs/arpa2.net/xmpp/xmpp.arpa2.net/remotestorage/...
  /var/lib/krs/arpa2.net/xmpp/xmpp.arpa2.net/remotestorage/...

The filesystem path is configured in the webfinger profile, and may or may
not relate to the Kerberos Principal Name used to access the resource.
For full flexibility, the remotestorage directory should hold a file named
.k5remotestorage which must hold the Kerberos Principal Name on a line of
its own, if it is to have read/write access to anything underneath this
directory.

The reliance on filesystem paths implies a few noteworthy restrictions:

* The remotestorage endpoint cannot be used to store both a directory and a file
  under the same path (ignoring the trailing slash). That means you cannot store
  /foo/bar/baz and /foo/bar, but only one of them. This is a natural restriction
  of traditional filesystems, that is currently well adhered to by all apps using
  remotestorage (as far as I know).

* MIME types may not be exact for files that were added "out-of-band", that is
  not added via the remotestorage protocol, but by copying to the remotestorage/
  directory by other means. krsd stores MIME type and character encoding
  under the "user.mime_type" and "user.charset" extended attributes, given these
  are supported by the underlying filesystem. When these attributes aren't set,
  a MIME type is guessed using libmagic, which may not always yield desirable
  results. (for example an empty file, created using "touch" will be transmitted
  via remotestorage with a Content-Type header of "inode/x-empty; charset=binary")
  If even libmagic fails to make sense of a file, the Content-Type is set to
  "application/octet-stream; charset=binary".

* Filesystem privileges must be setup to grant access to the user.  In the
  case of OpenAFS, this will be based on the user, and krsd will act
  on behalf of the user when storing files.  In this case, Constrained
  Delegation must be permissive to S4U2Proxy use, and the principal ticket
  for the user must probably be created proxiable.  The details of this have
  not been ironed out yet.

3) Installing
-------------

These steps should enable you to install krsd.

3.1) Dependencies
-----------------

- GNU make
- pkg-config (or tweak the Makefile)
- gcc
- libc
- libevent (>= 2.0)
- libmagic
- libattr
- BerkeleyDB

On Debian based systems, this should give you all you need:

  apt-get install build-essential libevent-dev libmagic-dev libattr1-dev libssl-dev libdb-dev pkg-config

If you want to develop, you may also want debug symbols and valgrind (required by
leakcheck.sh script):

  apt-get install libevent-dbg valgrind

3.2) Getting the code
---------------------

Given you are reading this file, you probably have the code already, but just to
be sure:

Currently the krsd code is hosted on github.

You can browse it online, at:

  https://github.com/arpa2/krsd

or close it using git:

  git clone git://github.com/arpa2/krsd.git

Note that krsd itself is a clone of rs-serve, with the distinctive
feature that krsd introduces Kerberos authentication.

3.3) Building
-------------

Given you have all dependencies installed, simply run

  make

and you should be good to go.

3.4) Installing system-wide
---------------------------

To install the krsd binary to /usr/bin, run

  make install

as a privileged user.

To install somewhere else, tweak the Makefile first.

This will also install an init script to /etc/init.d/krsd and a default
configuration to /etc/default/krsd.

On Debian based systems (i.e. when "update-rc.d" is present), "make install"
will also install the krsd init script into /etc/rc*.d/.

3.5) Setting options
--------------------

There are a variety of options 

If you want to use the init script, you can set options in /etc/default/krsd,
otherwise just pass them on the command line.

Run:

  krsd --help

to get a list of supported options.

3.6) Integrating authorization
------------------------------

To integrate an authorization endpoint, you need to do two things:

* configure endpoint URI

  Set the --auth-uri option to a printf style format string. "%s" will be
  replaced with the username.

* configure your authorization endpoint to manage krsd tokens

  krsd doesn't care where tokens come from, but it need to know them to
  decide whether a given request is authorized or not. It maintains an internal
  store for authorizations (i.e. structures of [user-name, token, scopes]),
  which must be managed from the outside.

  The tools to do this are:

  * rs-add-token:

      Usage: rs-add-token <user> <token> <scope1> [<scope2> ... <scopeN>]

    - <user> is the login name of the user (krsd must be able to resolve
      it using getpwnam() in order to find the home directory)
    - <token> is the token string authenticating future requests. For krsd
      it is an opaque string.
    - <scope1>..<scopeN> are scope strings in the same form as described in
      draft-dejong-remotestorage-01, Section 9.

  * rs-remove-token:

      Usage: rs-remove-token <user> <token>

    <user> and <token> must both be given.
    If the token cannot be found, rs-remove-token terminates with non-zero status.

  * rs-list-tokens:

    Lists all currently installed tokens and their respective scopes.

    The output format is primarily meant for (human) debugging and subject to change.

4) Contributing
---------------

* Note that krsd is a fork of the original admirable work named rs-serve.
  Never blame the rs-serve authors for mistakes that I (Rick) have added.
  And don't expect them to support my mistakes!

* If you've found a bug, or have any questions, please open an issue on github:
  https://github.com/remotestorage/rs-serve/issues

* If you want to contribute, fork the project on github and send pull requests.

* In any case, don't hesitate to talk with us on IRC:
  #remotestorage and #unhosted, both on irc.freenode.org

  Webchat links:
  - #unhosted: http://webchat.freenode.net/?channels=unhosted
  - #remotestorage: http://webchat.freenode.net/?channels=remotestorage