Merge pull request #122 from NBISweden/hotfix/master

Hotfix/master
EGA-archive · Sep 29, 2017 · bea4e39
2 parents a7b354f + eb528cc
commit bea4e39

Showing 47 changed files with 949 additions and 694 deletions.
diff --git a/README.md b/README.md
@@ -14,50 +14,70 @@ containers or as virtual machines.
 
 | Components | Role |
 |------------|------|
-| db         | Sets up a postgres database with appropriate schema |
-| mq         | Sets up a RabbitMQ message broker with appropriate accounts, exchanges, queues and bindings |
-| inbox      | SFTP server where user credentials are in the db component |
-| frontend   | Documentation for the users |
-| connectors | Back and Forth communication between CentralEGA and LocalEGA |
+| db         | A Postgres database with appropriate schema |
+| mq         | A RabbitMQ message broker with appropriate accounts, exchanges, queues and bindings |
+| inbox      | SFTP server, acting as a dropbox, where user credentials are in the db component |
 | monitors   | Gathers the logs of all components |
-| keys       | Handles the GPG and master keys for encryption/decryption |
+| keyserver  | Handles the encryption/decryption keys |
 | workers    | Connect to the keys component (via SSL) and do the actual re-encryption work |
 | vault      | Stores the files from the staging area to the vault. It includes a verification step afterwards. |
+| frontend   | Documentation for the users |
+
+The workflow is as follows and consists of two ordered parts.
+
+### Handling users
+
+Central EGA contains a database of users. A user's ID can be their Elixir-ID
+(from which we strip the @elixir-europe.org suffix).
 
-The workflow is as follows. We indicate the involved component in
-between parenthesis.
+We have developed some custom-made NSS and PAM modules, which allow user
+authentication via either a password or an RSA key against the
+CentralEGA database itself. The user is chrooted into their home
+folder.
 
-Central EGA drops a message containing the user account information,
-which is picked up (connectors) and forwarded internally (mq). 
+The procedure is as follows. The inbox is started without any created
+user. When a user wants to log into the inbox (actually, only sftp
+uploads are allowed), the NSS module looks up the username in a local
+database, and, if not found, queries the CentralEGA database. Upon
+return, we store the user credentials in the local database and
+create the user's home folder. The user now gets logged in if the
+password or public key authentication succeeds. Upon subsequent login
+attempts, only the local database is queried, until the user's
+credentials expire, making the local database effectively act as a
+cache.
 
-Once in the internal message broker (mq), the inbox service gets the
-message and creates the user account. It simply drops the information
-into the database and creates a home folder with the right
-permissions. The user ID can be its Elixir-ID (of which we strip the
-@elixir-europe.org). The custom-made NSS and PAM modules allow user
-authentication via either a password or an RSA key. The user is
-chrooted into their home folder.
+After proper configuration, no user maintenance is needed: it is
+automagic. Another advantage is having a central location for the
+EGA users.
+
+Note that it is also possible to add non-EGA users if necessary, by
+adding them to the local database, and specifying a
+non-expiration/non-flush policy for those users.
+
+
+### Ingesting files
 
 Central EGA drops a message per file to ingest, containing the
 username, the filename and the checksums (along with their related
 algorithm) of the encrypted file and the decrypted content. The
-message is picked up (connectors) and forwarded internally (mq).
+message is picked up by some ingestion workers. Many ingestion workers
+can be created.
 
-For each file, if it is found in the inbox (by a worker), checksums
-are computed to verify the integrity of the file (ie. did we receive
-it entirely). If the checksums are not provided, they will be derived
-from companion files. That worker retrieves the decryption key (keys)
-in a secure manner and decrypts the file.
+For each file, if it is found in the inbox, checksums are computed to
+verify the integrity of the file (i.e., did we receive it entirely). If
+the checksums are not provided, they will be derived from companion
+files. That worker retrieves the decryption key in a secure
+manner (from the keyserver) and decrypts the file.
 
 To improve efficiency, each block that are decrypted are piped into a
 separate process for re-encryption. This has the advantage to
 constrain the memory usage per worker and save the re-encryption
-time. Moreover, we compute the checksum of the decrypted
-content. After completion, the re-encrypted file is located in the
-staging area, with a UUID name, and a message is drop into the message
-broker to signal that the next step can start.
-
-The file is moved from the staging area into the vault. A verification
-step is included to ensure that the storing went fine (vault+verify).
-After that, a message of completion is sent (connector) to Central EGA
-(mq).
+time. In addition to the re-encryption, we also compute the checksum
+of the decrypted content. After completion, the re-encrypted file is
+located in the staging area, with a UUID name, and a message is
+dropped into the local message broker to signal that the next step can
+start.
+
+The next step is to move the file from the staging area into the
+vault. A verification step is included to ensure that the storing went
+fine. After that, a message of completion is sent to Central EGA.
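
Reviewer note on the "Handling users" section added to README.md: the lookup-then-cache behaviour it describes can be sketched as below. This is only an illustrative Python model, not the NSS/PAM implementation from this PR. The class `UserCache`, the `fetch_remote` callback standing in for the CentralEGA query, and the `ttl_seconds` expiry are all hypothetical names chosen for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Credentials = Dict[str, str]


class UserCache:
    """Hypothetical model of the local user database acting as a cache."""

    def __init__(
        self,
        fetch_remote: Callable[[str], Optional[Credentials]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch_remote = fetch_remote  # stands in for the CentralEGA query
        self._ttl = ttl_seconds            # assumed expiry policy
        self._clock = clock
        # username -> (credentials, expiry time or None for "never expires")
        self._local: Dict[str, Tuple[Credentials, Optional[float]]] = {}

    def lookup(self, username: str) -> Optional[Credentials]:
        now = self._clock()
        entry = self._local.get(username)
        if entry is not None:
            creds, expires_at = entry
            if expires_at is None or now < expires_at:
                return creds               # served locally, no remote query
            del self._local[username]      # expired: forget and ask again
        creds = self._fetch_remote(username)
        if creds is None:
            return None                    # unknown user everywhere
        self._local[username] = (creds, now + self._ttl)
        return creds

    def add_local_user(self, username: str, creds: Credentials) -> None:
        # Non-EGA user: stored locally with a non-expiration policy.
        self._local[username] = (creds, None)
```

The first `lookup` for a user triggers one remote query; later calls are answered locally until the entry expires, which mirrors the "local database acts as a cache" wording in the README. Home-folder creation and the actual password or key check are deliberately left out of the sketch.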
diff --git a/docker/README.md b/docker/README.md
@@ -1,40 +1,117 @@
 # Deploy LocalEGA using Docker
 
-## Preliminaries
+## The environment variables
 
-It is necessary to also create a `.env` file with the following variables:
-(mostly used to parameterize the docker-compose file itself)
+It is necessary to create a `.env` file with the following variables:
+(mostly used to parameterize docker-compose)
 
-	COMPOSE_PROJECT_NAME=ega
-	CODE=<python/code/folder>    # path to folder where setup.py is
-	CONF=<path/to/your/ini/file> # will be mounted in the containers as /etc/ega/conf.ini
-
-	SSL_CERT=<path/to/ssl.cert>  # for the ingestion workers to communicate with the keys server
+```
+COMPOSE_PROJECT_NAME=ega
+COMPOSE_FILE=ega.yml
 
-Moreover, there are settings to include regarding the
-encryption/decryption for the keys server.  We locate those variables
-(in order to not make them available to all containers) in the
-subfolder (to be created in not already exisiting) `.env.d/keys`:
+CODE=<python/code/folder>    # path to folder where setup.py is
+CONF=<path/to/your/ini/file> # will be mounted in the containers as /etc/ega/conf.ini
 
-```
+# settings regarding the encryption/decryption
 KEYS=<path/to/keys.conf>
+SSL_CERT=<path/to/ssl.cert>  # for the ingestion workers to communicate with the keys server
 SSL_KEY=<path/to/ssl.key>
 RSA_SEC=<path/to/rsa/sec.pem>
 RSA_PUB=<path/to/rsa/pub.pem>
-PGP_SEC=<path/to/pgp/sec.key>
-PGP_PUB=<path/to/pgp/pub.key>
-PGP_PASSPHRASE='something'
-PGP_PASSPHRASE=<something-complex>
+GPG_HOME=<path/to/gpg/homedir> # including pubring.kbx, trustdb.gpg, private-keys-v1.d and openpgp-revocs.d
+
+# Temporarily faking Central EGA
+CEGA_USERS=<path/to/users/folder> # containing one .yml file per user
 ```
 
+You may get started with some extra instructions to create
+the [private data](stubbing/private.md).
+
 For the database, we create `.env.d/db` containing:
 
 ```
-POSTGRES_USER=postgres
+POSTGRES_USER=<some-user>
 POSTGRES_PASSWORD=<some-password>
 ```
 
-## Running
+For the keyserver, we create `.env.d/gpg` containing:
+
+```
+GPG_PASSPHRASE=the-correct-passphrase
+```
+## The CONF file
+
+The file pointed to by `CONF` should contain the values that override those
+from [defaults.ini](../src/lega/conf/defaults.ini). For example:
+
+```
+[DEFAULT]
+# We want more output
+log = debug
+
+[ingestion]
+gpg_cmd = /usr/local/bin/gpg --homedir ~/.gnupg --decrypt %(file)s
+
+## Connecting to Central EGA
+[cega.broker]
+host = cega_mq
+username = <some-user>
+password = <some-password>
+vhost = <some-vhost>
+heartbeat = 0
+
+[db]
+host = ega_db
+username = <same-as-POSTGRES_USER-above>
+password = <same-as-POSTGRES_PASSWORD-above>
+```
+
+All the other values will remain unchanged.<br/>
+To list the resulting configuration, use `docker-compose exec <some-container> ega-conf --list` in any container (except inbox).
+
+## The KEYS file
+
+The file pointed to by `KEYS` should contain the information about the
+keys and will be located _only_ on the keyserver. For example:
+
+```
+[DEFAULT]
+active_master_key = 1
+
+[master.key.1]
+seckey = /etc/ega/rsa/sec.pem
+pubkey = /etc/ega/rsa/pub.pem
+passphrase = <something-complex>
+
+[master.key.2]
+seckey = /etc/ega/rsa/sec2.pem
+pubkey = /etc/ega/rsa/pub2.pem
+passphrase = <something-complex-too>
+```
+
+Docker will map the path from `RSA_PUB` in the `.env` file to
+`/etc/ega/rsa/pub.pem` in the keyserver container, for example.
+
+## A Central EGA user
+
+We fake the CentralEGA message broker and user database, with 2
+containers: `cega_mq` and `cega_users`.
+
+`cega_users` is a very simple file-based server that reads from
+the folder pointed to by `CEGA_USERS`. The latter contains one file per user, of the following form:
+
+```
+---
+password_hash: $1$xyz$sx8gPI05DJdJe4MJx5oXo0
+pubkey: ssh-rsa AAAAB3NzaC1yc...balbla...MiFw== some.comment@lega.sftp
+expiration: some interval
+```
+
+The file name `john.yml` is used for the user `john`. You must at
+least specify a `password_hash` or a `pubkey`. Other values can be
+empty or missing.
+
+## Running
 
 	docker-compose up -d
 
@@ -54,26 +131,3 @@ will be created on-the-fly by docker-compose.
 ## Status
 
 	docker-compose ps
-
-## Example
-
-<a href="https://asciinema.org/a/nhHCuLd7mYjL4UgKQDI7uRJHs">
-<img src="https://asciinema.org/a/nhHCuLd7mYjL4UgKQDI7uRJHs.png" width="836" style="display:block;margin:0 auto;"/>
-</a>
-
-	1) docker-compose up -d
-
-We now create a "user" message in the broker. For that, we use the frontend and ega-publisher.
-
-	2) docker-compose exec frontend ega-publisher inbox --broker 'cega.broker' --routing 'sweden.user' 'test' 'ssh-pub-key' 'some-hash'
-
-We upload an encrypted file (here named data1)
-
-	3) sftp -P 2222 test@localhost
-	sftp> put <absolute/or/relative/path/to/file/named/data1>
-
-Finally, we create a "file" message, again using the frontend.
-
-	4) docker-compose exec frontend ega-publisher ingestion --broker 'cega.broker' --routing 'sweden.file' --enc "<bla>" --enc_algo 'md5' --unenc "<bla...>" --unenc_algo 'md5' test data1
-
-Check now that the vault has the file and the message broker sent a message back in the CentralEGA queue, named `sweden.v1.commands.completed`.
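
Reviewer note on "The CONF file" section added to docker/README.md: the idea that the user's file only overrides selected values from `defaults.ini` can be illustrated with Python's standard `configparser`. This is a sketch under assumptions. Whether `lega` loads its configuration exactly this way is not shown in this diff, and the default values below (`log = info`, `host = localhost`, `port = 5432`) are made up for the example.

```python
from configparser import ConfigParser

# Hypothetical stand-in for defaults.ini (values are illustrative only).
DEFAULTS = """
[DEFAULT]
log = info

[db]
host = localhost
port = 5432
"""

# Stand-in for the file pointed to by CONF: only the values to override.
OVERRIDES = """
[DEFAULT]
log = debug

[db]
host = ega_db
"""


def load_layered(defaults_text: str, overrides_text: str) -> ConfigParser:
    # interpolation=None keeps literal placeholders such as %(file)s intact
    # instead of trying to expand them while reading values.
    parser = ConfigParser(interpolation=None)
    parser.read_string(defaults_text)   # first layer: defaults
    parser.read_string(overrides_text)  # second layer: later values win
    return parser


conf = load_layered(DEFAULTS, OVERRIDES)
print(conf.get("db", "host"))  # overridden by the second layer
print(conf.get("db", "port"))  # untouched default
print(conf.get("db", "log"))   # inherited from [DEFAULT]
```

Keys absent from the override file keep their default value, which matches the README's statement that all other values remain unchanged.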