Merge pull request #122 from NBISweden/hotfix/master
Hotfix/master
viklund committed Sep 29, 2017
2 parents a7b354f + eb528cc commit bea4e39
Showing 47 changed files with 949 additions and 694 deletions.
84 changes: 52 additions & 32 deletions README.md
@@ -14,50 +14,70 @@ containers or as virtual machines.

| Components | Role |
|------------|------|
| db | Sets up a postgres database with appropriate schema |
| mq | Sets up a RabbitMQ message broker with appropriate accounts, exchanges, queues and bindings |
| inbox | SFTP server where user credentials are in the db component |
| frontend | Documentation for the users |
| connectors | Back and Forth communication between CentralEGA and LocalEGA |
| db | A Postgres database with appropriate schema |
| mq | A RabbitMQ message broker with appropriate accounts, exchanges, queues and bindings |
| inbox | SFTP server, acting as a dropbox, where user credentials are in the db component |
| monitors | Gathers the logs of all components |
| keys | Handles the GPG and master keys for encryption/decryption |
| keyserver | Handles the encryption/decryption keys |
| workers | Connect to the keys component (via SSL) and do the actual re-encryption work |
| vault | Stores the files from the staging area to the vault. It includes a verification step afterwards. |
| frontend | Documentation for the users |

The workflow is as follows and consists of two ordered parts.

### Handling users

Central EGA contains a database of users. A user's ID can be their Elixir-ID
(in which case we strip the @elixir-europe.org suffix).
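
The suffix handling could look like the following minimal sketch. The function name `normalize_user_id` and the constant are hypothetical and not taken from the LocalEGA code base; only the suffix itself comes from the text above.

```python
# Hypothetical helper: the name and the constant are illustrative only.
ELIXIR_SUFFIX = "@elixir-europe.org"


def normalize_user_id(user_id: str) -> str:
    """Return the user ID without a trailing Elixir suffix, if present."""
    if user_id.endswith(ELIXIR_SUFFIX):
        return user_id[: -len(ELIXIR_SUFFIX)]
    return user_id
```

IDs without the suffix are returned unchanged, so the same function can be applied to every incoming username.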

The workflow is as follows. We indicate the involved component in
parentheses.
We have developed custom NSS and PAM modules that allow user
authentication via either a password or an RSA key against the
CentralEGA database itself. The user is chrooted into their home
folder.

Central EGA drops a message containing the user account information,
which is picked up (connectors) and forwarded internally (mq).
The procedure is as follows. The inbox is started without any created
user. When a user wants to log into the inbox (actually, only sftp
uploads are allowed), the NSS module looks up the username in a local
database and, if not found, queries the CentralEGA database. Upon
return, we store the user credentials in the local database and
create the user's home folder. The user is logged in if the
password or public key authentication succeeds. Upon subsequent login
attempts, only the local database is queried until the user's
credentials expire, so the local database effectively acts as a
cache.
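
The lookup-then-cache behaviour described above can be sketched in Python as follows. This is only an illustration of the idea: the real logic lives in the NSS and PAM modules, and the class `UserCache`, the `fetch_remote` callback and the `ttl_seconds` parameter are assumptions made for this example.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Illustrative only: a user record is modelled as a plain dict.
UserRecord = Dict[str, str]


class UserCache:
    """Local cache in front of a (hypothetical) CentralEGA user lookup."""

    def __init__(
        self,
        fetch_remote: Callable[[str], Optional[UserRecord]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch_remote = fetch_remote
        self._ttl = ttl_seconds
        self._clock = clock
        # username -> (record, expiry timestamp)
        self._entries: Dict[str, Tuple[UserRecord, float]] = {}

    def lookup(self, username: str) -> Optional[UserRecord]:
        now = self._clock()
        entry = self._entries.get(username)
        if entry is not None and entry[1] > now:
            # Still valid: only the local database is consulted.
            return entry[0]
        # Missing or expired: ask the central database.
        record = self._fetch_remote(username)
        if record is None:
            self._entries.pop(username, None)
            return None
        # Store the credentials locally; this is also the point where
        # the user's home folder would be created.
        self._entries[username] = (record, now + self._ttl)
        return record
```

Authentication itself (checking the password or public key) is a separate step that would run on the returned record.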

Once in the internal message broker (mq), the inbox service gets the
message and creates the user account. It simply drops the information
into the database and creates a home folder with the right
permissions. The user ID can be its Elixir-ID (of which we strip the
@elixir-europe.org). The custom-made NSS and PAM modules allow user
authentication via either a password or an RSA key. The user is
chrooted into their home folder.
After proper configuration, no user maintenance is needed: accounts
are created automatically. Another advantage is that the EGA users
are managed in one central location.

Note that it is also possible to add non-EGA users if necessary, by
adding them to the local database and specifying a
non-expiration/non-flush policy for those users.


### Ingesting files

Central EGA drops a message per file to ingest, containing the
username, the filename and the checksums (along with their related
algorithm) of the encrypted file and the decrypted content. The
message is picked up (connectors) and forwarded internally (mq).
message is picked up by some ingestion workers. Many ingestion workers
can be created.

For each file, if it is found in the inbox (by a worker), checksums
are computed to verify the integrity of the file (i.e., did we receive
it entirely). If the checksums are not provided, they will be derived
from companion files. That worker retrieves the decryption key (keys)
in a secure manner and decrypts the file.
For each file, if it is found in the inbox, checksums are computed to
verify the integrity of the file (i.e., did we receive it entirely). If
the checksums are not provided, they will be derived from companion
files. That worker retrieves the decryption key in a secure
manner (from the keyserver) and decrypts the file.
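
As a rough illustration of the integrity check, the sketch below computes a checksum block by block and compares it with the expected value. The function names are hypothetical, and the fallback to companion files is not shown; only the use of Python's standard `hashlib` is assumed.

```python
import hashlib


def file_checksum(path: str, algorithm: str, block_size: int = 1 << 20) -> str:
    """Compute the hex digest of a file without loading it fully in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_integrity(path: str, expected_hex: str, algorithm: str) -> bool:
    """Return True if the file matches the checksum announced in the message."""
    return file_checksum(path, algorithm) == expected_hex.strip().lower()
```

The `algorithm` argument corresponds to the algorithm name sent along with each checksum in the message.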

To improve efficiency, each decrypted block is piped into a
separate process for re-encryption. This has the advantage of
constraining the memory usage per worker and saving the re-encryption
time. Moreover, we compute the checksum of the decrypted
content. After completion, the re-encrypted file is located in the
staging area, with a UUID name, and a message is dropped into the message
broker to signal that the next step can start.

The file is moved from the staging area into the vault. A verification
step is included to ensure that the storing went fine (vault+verify).
After that, a message of completion is sent (connector) to Central EGA
(mq).
time. In addition to the re-encryption, we also compute the checksum
of the decrypted content. After completion, the re-encrypted file is
located in the staging area, with a UUID name, and a message is
dropped into the local message broker to signal that the next step can
start.
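
The block-wise piping can be sketched as below. This is a simplified illustration under stated assumptions, not the actual worker code: `reencrypt_stream` is a hypothetical name, `reencrypt_cmd` stands for whatever command performs the re-encryption, and publishing the message to the broker is left out.

```python
import hashlib
import subprocess
import uuid
from pathlib import Path
from typing import Iterable, List, Tuple


def reencrypt_stream(
    blocks: Iterable[bytes],
    reencrypt_cmd: List[str],
    staging_dir: str,
    algorithm: str = "sha256",
) -> Tuple[Path, str]:
    """Pipe decrypted blocks into a separate re-encryption process.

    Returns the staged file path (UUID name) and the checksum of the
    decrypted content. Only one block is held in memory at a time.
    """
    digest = hashlib.new(algorithm)
    target = Path(staging_dir) / str(uuid.uuid4())
    with open(target, "wb") as out:
        # The child process writes directly to the staging file.
        proc = subprocess.Popen(reencrypt_cmd, stdin=subprocess.PIPE, stdout=out)
        try:
            for block in blocks:
                digest.update(block)      # checksum of the decrypted content
                proc.stdin.write(block)   # hand the block to the re-encryption
        finally:
            proc.stdin.close()
        returncode = proc.wait()
    if returncode != 0:
        target.unlink()
        raise RuntimeError(f"re-encryption failed with exit code {returncode}")
    return target, digest.hexdigest()
```

Because the child process writes straight to the staging file, the parent never has to read its output back, which avoids pipe deadlocks.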

The next step is to move the file from the staging area into the
vault. A verification step is included to ensure that the file was
stored correctly. After that, a message of completion is sent to Central EGA.
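
A minimal sketch of a store-then-verify step is shown below. It assumes that the verification compares checksums of the staged and stored copies, which the text does not specify; the function names are hypothetical and the completion message is not shown.

```python
import hashlib
import shutil
from pathlib import Path


def _sha256(path: Path, block_size: int = 1 << 20) -> str:
    """Hex digest of a file, read block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def store_in_vault(staged: str, vault_dir: str) -> Path:
    """Copy a staged file into the vault, verify the copy, then drop the original."""
    source = Path(staged)
    target = Path(vault_dir) / source.name
    expected = _sha256(source)
    shutil.copyfile(source, target)
    # Verification step (assumed here to be a checksum comparison).
    if _sha256(target) != expected:
        target.unlink()
        raise IOError(f"verification failed for {target}")
    source.unlink()  # the staged copy is only removed after verification
    return target
```

Copying first and deleting the staged file only after verification means a failed store never loses the data.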
138 changes: 96 additions & 42 deletions docker/README.md
@@ -1,40 +1,117 @@
# Deploy LocalEGA using Docker

## Preliminaries
## The environment variables

It is necessary to also create a `.env` file with the following variables:
(mostly used to parameterize the docker-compose file itself)
It is necessary to create a `.env` file with the following variables:
(mostly used to parameterize docker-compose)

COMPOSE_PROJECT_NAME=ega
CODE=<python/code/folder> # path to folder where setup.py is
CONF=<path/to/your/ini/file> # will be mounted in the containers as /etc/ega/conf.ini

SSL_CERT=<path/to/ssl.cert> # for the ingestion workers to communicate with the keys server
```
COMPOSE_PROJECT_NAME=ega
COMPOSE_FILE=ega.yml
Moreover, there are settings to include regarding the
encryption/decryption for the keys server. We locate those variables
(in order to not make them available to all containers) in the
subfolder (to be created if it does not already exist) `.env.d/keys`:
CODE=<python/code/folder> # path to folder where setup.py is
CONF=<path/to/your/ini/file> # will be mounted in the containers as /etc/ega/conf.ini
```
# settings regarding the encryption/decryption
KEYS=<path/to/keys.conf>
SSL_CERT=<path/to/ssl.cert> # for the ingestion workers to communicate with the keys server
SSL_KEY=<path/to/ssl.key>
RSA_SEC=<path/to/rsa/sec.pem>
RSA_PUB=<path/to/rsa/pub.pem>
PGP_SEC=<path/to/pgp/sec.key>
PGP_PUB=<path/to/pgp/pub.key>
PGP_PASSPHRASE='something'
PGP_PASSPHRASE=<something-complex>
GPG_HOME=<path/to/gpg/homedir> # including pubring.kbx, trustdb.gpg, private-keys-v1.d and openpgp-revocs.d
# Temporarily faking Central EGA
CEGA_USERS=<path/to/users/folder> # containing one .yml file per user
```

You may get started with some extra instructions to create
the [private data](stubbing/private.md).

For the database, we create `.env.d/db` containing:

```
POSTGRES_USER=postgres
POSTGRES_USER=<some-user>
POSTGRES_PASSWORD=<some-password>
```

## Running
For the keyserver, we create `.env.d/gpg` containing:

```
GPG_PASSPHRASE=the-correct-passphrase
```
## The CONF file

The file pointed to by `CONF` should contain the values that override those
from [defaults.ini](../src/lega/conf/defaults.ini). For example:

```
[DEFAULT]
# We want more output
log = debug
[ingestion]
gpg_cmd = /usr/local/bin/gpg --homedir ~/.gnupg --decrypt %(file)s
## Connecting to Central EGA
[cega.broker]
host = cega_mq
username = <some-user>
password = <some-password>
vhost = <some-vhost>
heartbeat = 0
[db]
host = ega_db
username = <same-as-POSTGRES_USER-above>
password = <same-as-POSTGRES_PASSWORD-above>
```

All the other values will remain unchanged.<br/>
Use `docker-compose exec <some-container> ega-conf --list` in any container (except inbox).

## The KEYS file

The file pointed to by `KEYS` should contain the information about the
keys and will be located _only_ on the keyserver. For example:

```
[DEFAULT]
active_master_key = 1
[master.key.1]
seckey = /etc/ega/rsa/sec.pem
pubkey = /etc/ega/rsa/pub.pem
passphrase = <something-complex>
[master.key.2]
seckey = /etc/ega/rsa/sec2.pem
pubkey = /etc/ega/rsa/pub2.pem
passphrase = <something-complex-too>
```

Docker will map the path from `RSA_PUB` in the `.env` file to
`/etc/ega/rsa/pub.pem` in the keyserver container, for example.

## A Central EGA user

We fake the CentralEGA message broker and user database, with 2
containers: `cega_mq` and `cega_users`.

The `cega_users` container is a very simple file-based server that reads from
the folder pointed to by `CEGA_USERS`. This folder contains one file per user, of the following form:

```
---
password_hash: $1$xyz$sx8gPI05DJdJe4MJx5oXo0
pubkey: ssh-rsa AAAAB3NzaC1yc...balbla...MiFw== some.comment@lega.sftp
expiration: some interval
```

The file name `john.yml` is used for the user `john`. You must
specify at least a `password_hash` or a `pubkey`. Other values can be
empty or missing.

# Running

docker-compose up -d

@@ -54,26 +131,3 @@ will be created on-the-fly by docker-compose.
## Status

docker-compose ps

## Example

<a href="https://asciinema.org/a/nhHCuLd7mYjL4UgKQDI7uRJHs">
<img src="https://asciinema.org/a/nhHCuLd7mYjL4UgKQDI7uRJHs.png" width="836" style="display:block;margin:0 auto;"/>
</a>

1) docker-compose up -d

We now create a "user" message in the broker. For that, we use the frontend and ega-publisher.

2) docker-compose exec frontend ega-publisher inbox --broker 'cega.broker' --routing 'sweden.user' 'test' 'ssh-pub-key' 'some-hash'

We upload an encrypted file (here named data1)

3) sftp -P 2222 test@localhost
sftp> put <absolute/or/relative/path/to/file/named/data1>

Finally, we create a "file" message, again using the frontend.

4) docker-compose exec frontend ega-publisher ingestion --broker 'cega.broker' --routing 'sweden.file' --enc "<bla>" --enc_algo 'md5' --unenc "<bla...>" --unenc_algo 'md5' test data1

Check now that the vault has the file and the message broker sent a message back in the CentralEGA queue, named `sweden.v1.commands.completed`.
