Merge pull request #18 from tomster/master
merge downstream changes
tomster committed Sep 1, 2016
2 parents 091a315 + fb59298 commit 46ac881
Showing 293 changed files with 8,314 additions and 2,636 deletions.
16 changes: 16 additions & 0 deletions .travis.yml
@@ -0,0 +1,16 @@
language: python
python: 2.7
sudo: false
env:
- TOX_ENV=py27
install:
- pip install setuptools-git
- pip install tox
before_script: cd application
script:
- tox -e $TOX_ENV
notifications:
irc:
- "irc.freenode.org#pyfidelity"
on_success: change
on_failure: change
7 changes: 7 additions & 0 deletions CHANGES.rst
@@ -1,3 +1,10 @@
0.2.0 - Unreleased
-------------------

- major refactoring
- use ephemeral cleanser jails
- use ephemeral storage for initial fileupload

0.1.10 - Unreleased
-------------------

162 changes: 154 additions & 8 deletions README.rst
@@ -177,19 +177,165 @@ Finally::

After restarting the application, the new translations will be active.


Further Documentation
*********************

For more details check these links:

* `pyramid.i18n <http://docs.pylonsproject.org/projects/pyramid/en/1.3-branch/narr/i18n.html>`_
* `Chameleon <http://chameleon.repoze.org/docs/latest/i18n.html>`_
* `Babel <http://babel.edgewall.org/wiki/Documentation/0.9/index.html>`_

Roadmap
-------

While the original releases were geared towards an instance of the briefkasten application hosted by `ZEIT ONLINE <https://ssl.zeit.de/briefkasten/submit>`_ further development is planned to make the application useful 'out of the box'. In particular:
The life cycle of a submission
******************************

Users entrusting us with sensitive data is the key concern of the software, so getting it straight when and where this data is stored, for how long and in what form, is crucial.

The stages are numbered with a three-digit integer code, which allows them to be grouped and sorted.

Status codes beginning with `0` mean that the submission is still being handled by the web application (which implies that it is still unencrypted).

The life of a submission begins with the POST of the client browser succeeding.
Any attachments are first stored in memory before writing them to disk into a dedicated dropbox directory.
At this point the submission has the status `010 received` and is readable in plaintext by any attacker who gains access to the application jail.

Next, the web application hands off the submission to an external processing script, which immediately either errors out or acknowledges the receipt of the drop directory.

The error case at this stage means that the cleansing setup is seriously broken and the web application will take it upon itself to delete the attachments immediately to avoid exposing them in plaintext unduly.
(TODO: a cronjob on the jailhost should additionally monitor for dropboxes in the 'submitted' or 'submitted failed' state for longer than a given threshold)

If the submission was successful (the process script returns `0` as exit code) the dropbox is considered to be `020 submitted`.

Once submitted, the cleanser performs basic sanity checking. If that fails for whatever reason it will set the status to `500 cleanser init failure`. Since it's basically being able to accept the attachments it will delete the attachment itself (TODO: confirm with @erdgeist)

If the process script determines that the cleansing setup is intact (whether locally or via one or more cleanser jails) it will set the status to `100 processing`.
The submission still resides in plaintext inside the application jail.

The process will now initiate the cleansing, either locally or by submitting it to a cleanser jail. Either way, once the submission is successful, the status will change to `200 quarantined`, and the submission is (finally) no longer readable inside the application jail.

If the submission has failed, the status will be set to `501 cleanser submission failure` and the attachments will be deleted.

Now we are left with three possible outcomes: success, failure during cleansing or timeout:

- `510 cleanser processing failure`
- `520 cleanser timeout failure`
- `900 success`

In all cases except `900` the attachments will have been deleted from the filesystem of the briefkasten host.
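
The status codes described above can be modelled as plain strings that sort by their numeric prefix. The following is a minimal sketch, not code from the briefkasten package: the status strings are copied from the text, while ``STATUSES``, ``handled_by_webapp`` and ``is_failure`` are hypothetical names. ``status_int`` mirrors the attribute name used in ``dropbox_post_factory``, but its implementation here is an assumption.

```python
import re

# Status strings as listed in the text above, in life-cycle order.
STATUSES = [
    "010 received",
    "020 submitted",
    "100 processing",
    "200 quarantined",
    "500 cleanser init failure",
    "501 cleanser submission failure",
    "510 cleanser processing failure",
    "520 cleanser timeout failure",
    "900 success",
]


def status_int(status):
    """Return the leading three-digit code of a status string as an integer."""
    match = re.match(r"(\d{3})\b", status)
    if match is None:
        raise ValueError("not a status string: %r" % status)
    return int(match.group(1))


def handled_by_webapp(status):
    """Codes beginning with 0 mean the web application still holds the data."""
    return status.startswith("0")


def is_failure(status):
    """Assumption: every failure code named in the text lies in the 5xx range."""
    return 500 <= status_int(status) < 600
```

Because the codes are zero-padded to three digits, sorting the strings alphabetically gives the same order as sorting them numerically, which is what makes a check such as ``status_int >= 20`` meaningful.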


Further Documentation
*********************


TODO
====

general bugs
------------

X fix claim mechanism

X investigate 'heisenbug'

- update docs re: `source bin/activate`

x ensure appserver is running after config changes

X ensure testing secret is present in themed forms

x use private devpi with git-setuptools-version



feature: refactor process workflow
----------------------------------

- break into wrapping `process` call which will catch any exceptions and set the status accordingly
and will also be responsible for calling cleanup

- `Dropbox.process` is

x the only entry point into and encapsulates the entire cleansing process

x a long-running, synchronous call that always succeeds (to the caller)

x but catches underlying failures and updates the status of the dropbox accordingly

x always calls cleanup

x separate 'private' tasks:

x if we have attachments:

x create uncleansed, encrypted fallback copy of attachments
- failures:
- no valid keys

x clean attachments (this also encrypts them)
- failures:
x no cleansers configured
x no cleansers available
- time-out

x archive clean attachments if cleaning was successful and size is over limit

- archive uncleaned attachments if cleaning failed
(re-uses the initially created encrypted backup before that is wiped during cleanup)

x notify editors via email

x (include cleaned attachments if cleaning was successful and size below limit)
x otherwise include link to share
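
The contract sketched in this list (a synchronous ``process`` call that never raises to the caller, records failures in the status, and always runs cleanup) could look roughly like the following. This is an illustrative stand-in, not the project's ``Dropbox`` class: the constructor, ``_run_tasks`` and the ``cleaned_up`` flag are hypothetical, and mapping every exception to ``510 cleanser processing failure`` is an assumption made to keep the example short.

```python
class Dropbox(object):
    """Hypothetical stand-in used only to illustrate the process contract."""

    def __init__(self, tasks):
        self.tasks = tasks            # callables that each receive the dropbox
        self.status = "020 submitted"
        self.cleaned_up = False

    def _run_tasks(self):
        # The 'private' tasks: backup copy, cleansing, archiving, notification.
        self.status = "100 processing"
        for task in self.tasks:
            task(self)
        self.status = "900 success"

    def cleanup(self):
        # The real cleanup would wipe plaintext attachments; here we only flag it.
        self.cleaned_up = True

    def process(self):
        """Single entry point: always returns, never raises to the caller."""
        try:
            self._run_tasks()
        except Exception:
            # Assumption: one generic failure status for any underlying error.
            self.status = "510 cleanser processing failure"
        finally:
            self.cleanup()
        return self.status
```

Putting the cleanup call in a ``finally`` clause is what guarantees that it runs on both the success and the failure path.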


feature: large attachments support
----------------------------------

x calculate total size of attachments

x add configurable threshold value (support MB/GB via humanfriendly)

x configure cleansed/uncleansed file system paths

x configure formatstrings to render them as shares
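
As a rough illustration of the first two items, the total attachment size can be compared against a human-readable threshold. The list above names the ``humanfriendly`` library for parsing; the sketch below deliberately uses a small standard-library stand-in instead, so ``parse_size``, ``total_size``, ``exceeds_threshold`` and the decimal unit table are assumptions, not the project's code.

```python
import os
import re

# Assumption: decimal units (1 MB = 1000 * 1000 bytes).
UNITS = {"B": 1, "KB": 1000, "MB": 1000 ** 2, "GB": 1000 ** 3}


def parse_size(text):
    """Convert strings such as '10MB' or '1.5 GB' into a number of bytes."""
    match = re.match(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*$", text)
    if match is None:
        raise ValueError("cannot parse size: %r" % text)
    number, unit = match.groups()
    unit = unit.upper() or "B"   # a bare number is taken as bytes
    if unit not in UNITS:
        raise ValueError("unknown unit: %r" % unit)
    return int(float(number) * UNITS[unit])


def total_size(paths):
    """Sum the on-disk size of all attachment files."""
    return sum(os.path.getsize(path) for path in paths)


def exceeds_threshold(paths, threshold):
    """True if the attachments are too large to be sent by email."""
    return total_size(paths) > parse_size(threshold)
```

With a check like this, the notification step can decide between attaching the cleaned files and including a link to the share.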


feature: asynchronous workers
-----------------------------

x separate worker process (either using celery, or a custom worker)

x runs in separate jail with mapped dropbox container file system

x reads identical configuration on init

x watches for appearance of new dropboxes and reacts according to their status

x keep dropbox specific settings in settings.json file inside container directory, only keep pyramid specific settings in .ini file (including path to dropbox container)

TODOS:

x create `worker` entry point

x create supervisord config for worker

x create configuration reader (hardcode python dict for now)

x factor rendering of email text out of pyramid view into separate dropbox subtask


feature: re-activate watchdog feature
-------------------------------------

x set recipients to configured watchdog address instead of editors

- integrate watchdog setup into makefile and base.conf

- configure watchdog without buildout and from ploy.conf values


feature: local janitor (in python)
----------------------------------

- create cronjob (in worker jail)

* provide fully functional deployment scripts that create a 'best practice' installation from scratch, including web server, SSL setup, installation of all dependencies etc.
- write tests for erdgeist's python code :)
4 changes: 3 additions & 1 deletion application/.coveragerc
@@ -1,5 +1,7 @@
[run]
omit = briefkasten/tests/*
omit =
briefkasten/tests/*
briefkasten/testing.py
[report]
# Regexes for lines to exclude from consideration
exclude_lines =
23 changes: 13 additions & 10 deletions application/Makefile
@@ -1,18 +1,21 @@
pyversion = 2.7
python = python$(pyversion)
buildoutcfg = development
cfgs = buildout development deployment

all: buildout
all: bin/tox

$(cfgs): %: %.cfg bin/buildout
bin/buildout -c $@.cfg
tests: bin/py.test
@bin/py.test

buildout.cfg:
ln -s $(buildoutcfg).cfg buildout.cfg
bin/pserve: requirements.txt bin/pip
bin/pip install -r requirements.txt
@touch $@

bin/buildout: bin/pip
bin/pip install zc.buildout
bin/tox bin/py.test bin/devpi: bin/python bin/pip setup.py bin/pserve
bin/python setup.py dev
@touch $@

upload: setup.py bin/py.test bin/devpi
PATH=${PWD}/bin:${PATH} bin/devpi upload --no-vcs

bin/python bin/pip:
virtualenv .
@@ -21,4 +24,4 @@ bin/python bin/pip:
clean:
git clean -fXd

.PHONY: all $(cfgs) clean
.PHONY: all $(cfgs) clean tests upload
1 change: 1 addition & 0 deletions application/README.rst
67 changes: 51 additions & 16 deletions application/briefkasten/__init__.py
@@ -1,17 +1,43 @@
# -*- coding: utf-8 -*-
from pyramid.config import Configurator
from pyramid.httpexceptions import HTTPNotFound

from dropbox import DropboxContainer
dropbox_container = DropboxContainer()

from pyramid.httpexceptions import HTTPNotFound, HTTPGone
from pyramid.i18n import TranslationStringFactory
from itsdangerous import SignatureExpired, URLSafeTimedSerializer
from .dropbox import DropboxContainer, generate_drop_id

_ = TranslationStringFactory('briefkasten')


def generate_post_token(secret):
""" returns a URL safe, signed token that contains a UUID"""
return URLSafeTimedSerializer(secret, salt=u'post').dumps(generate_drop_id())


def parse_post_token(token, secret, max_age=300):
return URLSafeTimedSerializer(secret, salt=u'post').loads(token, max_age=max_age)


def dropbox_post_factory(request):
"""receives a UUID via the request and returns either a fresh or an existing dropbox
for it"""
try:
drop_id = parse_post_token(
token=request.matchdict['token'],
secret=request.registry.settings['post_secret'])
except SignatureExpired:
raise HTTPGone('dropbox expired')
except Exception: # don't be too specific on the reason for the error
raise HTTPNotFound('no such dropbox')
dropbox = request.registry.settings['dropbox_container'].get_dropbox(drop_id)
if dropbox.status_int >= 20:
raise HTTPGone('dropbox already in processing, no longer accepts data')
return dropbox


def dropbox_factory(request):
""" expects the id of an existing dropbox and returns its instance"""
try:
return dropbox_container.get_dropbox(request.matchdict['drop_id'])
return request.registry.settings['dropbox_container'].get_dropbox(request.matchdict['drop_id'])
except KeyError:
raise HTTPNotFound('no such dropbox')

@@ -31,6 +57,7 @@ def is_equal(a, b):


def dropbox_editor_factory(request):
""" this factory also requires the editor token"""
dropbox = dropbox_factory(request)
if is_equal(dropbox.editor_token, request.matchdict['editor_token'].encode('utf-8')):
return dropbox
@@ -39,21 +66,29 @@ def dropbox_editor_factory(request):


def german_locale(request):
""" a 'negotiator' that always returns german"""
return 'de'


def main(global_config, **settings):
""" Configure and create the main application. """
def configure(global_config, **settings):
config = Configurator(settings=settings, locale_negotiator=german_locale)
config.begin()
config.add_translation_dirs('briefkasten:locale')
app_route = settings.get('appserver_root_url', '/')
config.add_static_view('%sstatic/deform' % app_route, 'deform:static')
config.add_static_view('%sstatic' % app_route, 'briefkasten:static')
config.include('pyramid_deform')
config.add_renderer('.pt', 'pyramid_chameleon.zpt.renderer_factory')
config.add_route('fingerprint', '%sfingerprint' % app_route)
config.add_route('dropbox_form', '%ssubmit' % app_route)
config.add_route('dropbox_editor', '%s{drop_id}/{editor_token}' % app_route, factory=dropbox_editor_factory)
config.add_route('dropbox_view', '%s{drop_id}' % app_route, factory=dropbox_factory)
config.scan()
dropbox_container.init(settings)
return config.make_wsgi_app()
config.add_route('dropbox_form_submit', '%s{token}/submit' % app_route, factory=dropbox_post_factory)
config.add_route('dropbox_fileupload', '%s{token}/upload' % app_route, factory=dropbox_post_factory)
config.add_route('dropbox_editor', '%sdropbox/{drop_id}/{editor_token}' % app_route, factory=dropbox_editor_factory)
config.add_route('dropbox_view', '%sdropbox/{drop_id}' % app_route, factory=dropbox_factory)
config.add_route('dropbox_form', app_route)
config.scan(ignore=['.testing'])
config.registry.settings['dropbox_container'] = DropboxContainer(root=config.registry.settings['fs_dropbox_root'])
config.commit()
return config


def main(global_config, **settings):
""" Configure and create the main application. """
return configure(global_config, **settings).make_wsgi_app()
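
The post token introduced in this file is a signed, time-limited wrapper around the drop id. In ``dropbox_post_factory`` an expired token leads to ``HTTPGone`` and any other failure to ``HTTPNotFound``. The commit uses ``itsdangerous.URLSafeTimedSerializer`` for this. The sketch below is not that library: it is a standard-library HMAC stand-in that only illustrates the same idea, and ``make_token``, ``read_token``, ``Expired`` and ``Invalid`` are hypothetical names.

```python
import base64
import hashlib
import hmac
import time


class Expired(Exception):
    """Signature is valid but the token is older than max_age."""


class Invalid(Exception):
    """Token is malformed or was not signed with our secret."""


def _sign(secret, payload):
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def make_token(secret, drop_id, now=None):
    """Return '<base64 payload>.<hex signature>' for the given drop id."""
    issued = int(time.time() if now is None else now)
    payload = ("%s.%d" % (drop_id, issued)).encode("ascii")
    encoded = base64.urlsafe_b64encode(payload).decode("ascii")
    return "%s.%s" % (encoded, _sign(secret, payload))


def read_token(secret, token, max_age=300, now=None):
    """Return the drop id, or raise Invalid / Expired."""
    try:
        encoded, signature = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode("ascii"))
    except Exception:
        raise Invalid("malformed token")
    expected = _sign(secret, payload).encode("ascii")
    # Constant-time comparison, so timing does not leak the signature.
    if not hmac.compare_digest(signature.encode("utf-8"), expected):
        raise Invalid("bad signature")
    drop_id, issued = payload.decode("ascii").rsplit(".", 1)
    current = time.time() if now is None else now
    if current - int(issued) > max_age:
        raise Expired("token too old")
    return drop_id
```

Keeping the two failure modes as separate exceptions is what lets a caller answer "gone" for a token that has merely expired while staying deliberately vague about every other kind of error.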
