
Docs.

1 parent 49b34ac commit 83ab12bcc94510bff5bc8d3cc16635d35dc38b66 @chrisrossi chrisrossi committed Sep 13, 2012
Showing with 164 additions and 32 deletions.
  1. +32 −22 acidfs/__init__.py
  2. +132 −10 docs/index.rst
acidfs/__init__.py
@@ -15,17 +15,27 @@
class AcidFS(object):
"""
- Exposes a view of a filesystem with ACID semantics usable via the
- `transaction <http://pypi.python.org/pypi/transaction>`_ package. The
- filesystem is backed by a `Git <http://git-scm.com/>`_ repository. An
- instance of `AcidFS` is not thread safe and should not be shared by multiple
- concurrent contexts.
+ An instance of `AcidFS` exposes a transactional filesystem view of a `Git`
+ repository. Instances of `AcidFS` are not threadsafe and should not be
+ shared across threads, greenlets, etc.
- Constructor Arguments
+ **Paths**
- ``path``
+ Many methods take a `path` as an argument. All paths use forward slash `/`
+ as a separator, regardless of the path separator of the
+ underlying operating system. The path `/` represents the root folder of
+ the repository. Paths may be relative or absolute: paths beginning with a
+ `/` are absolute with respect to the repository root, paths not beginning
+ with a `/` are relative to the current working directory. The current
+ working directory always starts at the root of the repository. The current
+ working directory can be changed using the :meth:`chdir` and
+ :meth:`cd` methods.
- The path in the real, local fileystem of the repository.
+ **Constructor Arguments**
+
+ ``repo``
+
+ The path to the repository in the real, local filesystem.
``head``
@@ -47,20 +57,20 @@ class AcidFS(object):
session = None
_cwd = ()
- def __init__(self, path, head='HEAD', create=True, bare=False):
- wdpath = path
- dbpath = os.path.join(path, '.git')
+ def __init__(self, repo, head='HEAD', create=True, bare=False):
+ wdpath = repo
+ dbpath = os.path.join(repo, '.git')
if not os.path.exists(dbpath):
wdpath = None
- dbpath = path
+ dbpath = repo
if not os.path.exists(os.path.join(dbpath, 'HEAD')):
if create:
- args = ['git', 'init', path]
+ args = ['git', 'init', repo]
if bare:
args.append('--bare')
else:
- wdpath = path
- dbpath = os.path.join(path, '.git')
+ wdpath = repo
+ dbpath = os.path.join(repo, '.git')
subprocess.check_output(args)
else:
raise ValueError('No database found in %s' % dbpath)
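The repository-detection branch of the constructor above can be read as a standalone function. The following is a minimal sketch under stated assumptions: `locate_repository` is a hypothetical name, it mirrors only the lookup logic shown in this hunk, and it leaves out the `git init` step that the real constructor runs when `create` is true.

```python
import os


def locate_repository(repo):
    """Return (wdpath, dbpath) for an existing repository at `repo`.

    Hypothetical helper mirroring the constructor's lookup logic:
    a `.git` subdirectory means a working-tree repository, a `HEAD`
    file directly under `repo` means a bare repository.
    """
    dbpath = os.path.join(repo, '.git')
    if os.path.exists(dbpath):
        # Non-bare: the working directory is `repo` itself.
        return repo, dbpath
    if os.path.exists(os.path.join(repo, 'HEAD')):
        # Bare: there is no working directory.
        return None, repo
    raise ValueError('No database found in %s' % repo)
```

A bare repository is reported with a working directory of `None`, matching the `wdpath = None` assignment in the hunk.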
@@ -143,13 +153,13 @@ def cd(self, path):
def open(self, path, mode='r'):
"""
- Open a file for reading or writing. Supported modes are::
+ Open a file for reading or writing. Supported modes are:
- + 'r', file is opened for reading
- + 'w', file opened for writing
- + 'a', file is opened for writing in append mode
+ + `r`, file is opened for reading
+ + `w`, file is opened for writing
+ + `a`, file is opened for writing in append mode
- 'b' may appear in any mode but is ignored. Effectively all files are
+ `b` may appear in any mode but is ignored. Effectively all files are
opened in binary mode, which should have no impact for platforms other
than Windows, which is not supported by this library anyway.
@@ -197,8 +207,8 @@ def open(self, path, mode='r'):
def listdir(self, path=''):
"""
- Return list of files in directory indicated py `path`. If `path` is
- omitted, the current working directory is used.
+ Return a list of files in the directory indicated by `path`. If `path` is
+ omitted, the current working directory is used.
"""
session = self._session()
obj = session.find(self._mkpath(path))
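The path rules described in the new class docstring (forward slashes only, a leading `/` meaning the repository root, anything else being relative to the current working directory) can be illustrated with a small sketch. This is not the library's `_mkpath`, whose body is not shown in this diff. `resolve` is a hypothetical function, and representing the working directory as a tuple of names is an assumption based on `_cwd = ()` above. The sketch makes no attempt to interpret `.` or `..`.

```python
def resolve(cwd, path):
    """Resolve `path` against `cwd`, a tuple of directory names.

    Hypothetical illustration of the documented rules; returns the
    resulting location as a tuple of names, with () being the root.
    """
    if path.startswith('/'):
        parts = []            # absolute: start at the repository root
    else:
        parts = list(cwd)     # relative: start at the working directory
    for name in path.split('/'):
        if name:              # skip empty segments from '/', '//' or ''
            parts.append(name)
    return tuple(parts)
```

With this sketch an empty path resolves to the working directory itself, which is consistent with `listdir(path='')` using the current working directory.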
docs/index.rst
@@ -3,20 +3,142 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
-Welcome to AcidFS's documentation!
-==================================
+======
+AcidFS
+======
-Contents:
+**The filesystem on ACID**
-.. toctree::
- :maxdepth: 2
+`AcidFS` allows interaction with the filesystem using transactions with ACID
+semantics. `Git` is used as a back end, and `AcidFS` integrates with the
+`transaction <http://pypi.python.org/pypi/transaction>`_ package allowing use of
+multiple databases in a single transaction.
+
+Features
+========
+
++ Changes to the filesystem will only be persisted when a transaction is
+ committed and if the transaction succeeds.
+
++ Within the scope of a transaction, your application will only see a view of
+ the filesystem consistent with that filesystem's state at the beginning of the
+ transaction. Concurrent writes do not affect the current context.
+
++ A full history of all changes is available, since files are stored in a
+ backing `Git` repository. The standard `Git` toolchain can be used to recall
+ past states, roll back particular changes, replicate the repository remotely,
+ etc.
+
++ Changes to an `AcidFS` filesystem are synced automatically with any other
+ database making use of the `transaction` package and its two-phase commit
+ protocol, e.g. `ZODB` or `SQLAlchemy`.
+
++ Most common concurrent changes can be merged. There's even a decent chance
+ concurrent modifications to the same text file can be merged.
+
++ Transactions can be started from an arbitrary commit point, allowing, for
+ example, a web application to apply the results of a form submission to the
+ state of your data at the time the form was rendered, making concurrent edits
+ to the same resource less risky and effectively giving you transactions that
+ can span request boundaries.
+
+Motivation
+==========
+
+The motivation for this package is that, for certain very simple problems, it
+is often convenient to simply read and write data on a filesystem, but a
+database of some sort often winds up being used anyway because of the
+power and safety available with a system which uses transactions and ACID
+semantics. For example, you wouldn't want a web application with any amount of
+concurrency at all to be writing directly to the filesystem, since it would be
+easy for two threads or processes to both attempt to write to the same file at
+the same time, with the result that one change is clobbered by another, or even
+worse, the application is left in an inconsistent, corrupted state. After
+thinking about various ways to attack this problem and looking at `Git's`
+datastore and plumbing commands, it was determined that `Git` was a very good fit,
+allowing a graceful solution to this problem.
+
+Limitations
+===========
+
+In a nutshell:
+
++ Only platforms where `fcntl` is available are supported. This excludes
+ Microsoft Windows and probably the JVM as well.
++ Kernel level locking is used to manage concurrency. This means `AcidFS`
+ cannot handle multiple application servers writing to a shared network drive.
++ The type of locking used only synchronizes other instances of `AcidFS`.
+ Other processes manipulating the `Git` repository without using `AcidFS`
+ could cause a race condition. A repository used by `AcidFS` should only be
+ written to by `AcidFS` in order to avoid unpleasant race conditions.
+
+All of the above limitations are a result of the locking used to synchronize
+commits. For the most part, during a transaction, nothing special needs to
+be done to manage concurrency since `Git's` storage model makes management of
+multiple, parallel trees trivially easy. At commit time, however, any new data
+has to be merged with the current head which may have changed since the
+transaction began. This last step should be synchronized such that only one
+instance of `AcidFS` is attempting this at a time. The current mechanism
+for doing this is the `fcntl` module, which takes advantage of an
+advisory locking mechanism available in Unix kernels.
-Indices and tables
-==================
+Usage
+=====
-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
+`AcidFS` is easy to use. Just create an instance of `acidfs.AcidFS` and start
+using the filesystem::
+
+ import acidfs
+
+ fs = acidfs.AcidFS('path/to/my/repo')
+ fs.mkdir('foo')
+ with fs.open('/foo/bar', 'w') as f:
+ print >> f, 'Hello!'
+
+If there is not already a `Git` repository at the path specified, one is created.
+An instance of `AcidFS` is not thread safe. The same `AcidFS` instance should
+not be shared across threads or greenlets, etc.
+
+The `transaction <http://pypi.python.org/pypi/transaction>`_ package is used to
+commit and abort transactions::
+
+ import transaction
+
+ transaction.commit()
+ # If no exception has been thrown, then changes are saved! Yeah!
+
+.. note::
+
+ If you're using `Pyramid <http://www.pylonsproject.org/>`_, you should use
+ `pyramid_tm <http://pypi.python.org/pypi/pyramid_tm>`_. For other WSGI
+ frameworks there is also `repoze.tm2
+ <http://pypi.python.org/pypi/repoze.tm2>`_.
+
+API
+===
+
+.. automodule:: acidfs
+
+ .. autoclass:: AcidFS
+
+ .. automethod:: open
+ .. automethod:: cwd
+ .. automethod:: chdir
+ .. automethod:: cd(path)
+ .. automethod:: listdir
+ .. automethod:: mkdir
+ .. automethod:: mkdirs
+ .. automethod:: rm
+ .. automethod:: rmdir
+ .. automethod:: rmtree
+ .. automethod:: mv
+ .. automethod:: exists
+ .. automethod:: isdir
+ .. automethod:: empty
+ .. automethod:: get_base
+ .. automethod:: set_base
+
+.. toctree::
+ :maxdepth: 2
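
The Limitations section of the new docs says commits are serialized through the `fcntl` module's advisory locking. The diff does not show which `fcntl` call AcidFS uses or where the lock file lives, so the following is only a sketch of the general technique: `commit_lock` and its `lock_path` argument are hypothetical, and the choice of `fcntl.flock` is an assumption.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def commit_lock(lock_path):
    """Hold an exclusive advisory lock on `lock_path` for the block's duration.

    Hypothetical helper; other processes are only excluded if they
    take the same lock, which is what "advisory" means.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)      # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)  # release before closing
    finally:
        os.close(fd)
```

Code that merges new data with the current head would run inside `with commit_lock(path):`. Because the lock is advisory and kernel-level, a process that writes to the repository without taking it is not blocked, which matches the race-condition caveat in the docs.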
