Docs/sftp inbox+other docs (#97)
blankdots committed Dec 18, 2023
2 parents 5a08609 + d781dc2 commit 9f64e52
Showing 5 changed files with 90 additions and 53 deletions.
1 change: 1 addition & 0 deletions aggregate-mappings.json
@@ -8,6 +8,7 @@
"sda/cmd/s3inbox/s3inbox.md": "docs/services/s3inbox.md",
"sda/cmd/syncapi/syncapi.md": "docs/services/syncapi.md",
"sda/cmd/sync/sync.md": "docs/services/sync.md",
"sda-sftp-inbox/README.md": "docs/services/sftpinbox.md",
"./GETTINGSTARTED.md": "docs/guides/sda-dev-test-doc.md",
"sda/sda.md": "docs/services/sda.md"
}
5 changes: 5 additions & 0 deletions docs/dictionary/wordlist.txt
@@ -295,3 +295,8 @@ wyenrumyh
yaml
yihkqimti
yml
FS
Mina's
SPRINGFRAMEWORK
env
programmatically
10 changes: 0 additions & 10 deletions docs/guides/secret-management.md

This file was deleted.

79 changes: 79 additions & 0 deletions docs/services/sftpinbox.md
@@ -0,0 +1,79 @@
# SFTP Inbox

## Federated EGA/LocalEGA login system

`CentralEGA` contains a database of users, with IDs and passwords.

We have developed a solution based on [Apache Mina SSHD](https://mina.apache.org/sshd-project/)
to allow user authentication via either a password or an RSA key against the CentralEGA database
itself. The user is locked within their home folder, which is done programmatically using
[RootedFileSystem](https://github.com/apache/mina-sshd/blob/master/sshd-core/src/main/java/org/apache/sshd/common/file/root/RootedFileSystem.java).

The solution uses `CentralEGA`'s user IDs but can also be extended to
use LifeScience AAI IDs (from which we strip the ``@elixir-europe.org`` suffix).
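
As a small illustration of the suffix stripping mentioned above, here is a minimal Python sketch. The function name `normalize_user_id` and the constant `ELIXIR_SUFFIX` are hypothetical and are not taken from the inbox code base (which is built on the Java library Apache Mina SSHD); the sketch only shows the idea.

```python
# Suffix carried by LifeScience AAI IDs, as described in the text above.
ELIXIR_SUFFIX = "@elixir-europe.org"


def normalize_user_id(user_id: str) -> str:
    """Return the ID without the LifeScience AAI suffix, if it is present.

    Hypothetical helper for illustration only.
    """
    if user_id.endswith(ELIXIR_SUFFIX):
        return user_id[: -len(ELIXIR_SUFFIX)]
    return user_id
```

IDs that do not carry the suffix pass through unchanged, so the same helper could be applied to plain `CentralEGA` user IDs as well.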

The procedure is as follows. The inbox starts with no users
created. When a user tries to log into the inbox (only ``sftp``
uploads are allowed), the code looks up the username in a local
cache and, if it is not found, queries the `CentralEGA` [REST endpoint](https://nss.ega-archive.org/spec/). When the
query returns, we store the user credentials in the local cache and create
the user's home directory. The user is then logged in if password
or public key authentication succeeds. Subsequent login attempts
query only the local cache until the user's credentials
expire. The cache has a default TTL of 5 minutes and is wiped clean
upon reboot (as a cache should be). The TTL can be configured via the ``CACHE_TTL`` environment variable.
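
The lookup procedure described above can be sketched roughly as follows. This is a minimal Python illustration, not the inbox's actual implementation: the class `CredentialCache`, the `fetch_remote` callable (a stand-in for the `CentralEGA` REST query), and the injectable `clock` are all assumptions made for the example. Home directory creation is left out.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CredentialCache:
    """In-memory credential cache with a per-entry time-to-live (sketch)."""

    def __init__(
        self,
        fetch_remote: Callable[[str], Optional[dict]],
        ttl_seconds: float = 300.0,  # 5 minutes, matching the documented default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_remote = fetch_remote  # hypothetical stand-in for the REST call
        self._ttl = ttl_seconds
        self._clock = clock
        # username -> (expiry time, credentials); lost on restart by design
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, username: str) -> Optional[dict]:
        """Return credentials from the cache, querying the remote on a miss."""
        now = self._clock()
        entry = self._entries.get(username)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit that has not expired yet
        # Miss or expired entry: drop it and ask the remote endpoint.
        self._entries.pop(username, None)
        credentials = self._fetch_remote(username)
        if credentials is not None:
            self._entries[username] = (now + self._ttl, credentials)
        return credentials
```

Passing the clock in as a parameter is a design choice for the sketch only; it makes the expiry behaviour easy to exercise without waiting for real time to pass.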

The user's home directory is created upon successful login.
Moreover, for each user, we detect when a file upload is completed and compute the file's
checksum.
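
To make the checksum step concrete, here is a minimal Python sketch that hashes a completed upload in fixed-size chunks, so large files are never loaded into memory at once. The text does not state which digest algorithm the inbox uses; SHA-256 is an assumption chosen for the example, and the function name `file_checksum` is hypothetical.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of the file at `path`.

    The default algorithm is an assumption; the source does not name one.
    """
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read 1 MiB at a time until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```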

## S3 integration

The default storage back-end for the inbox is the local file system, but an S3 service is also supported as a back-end. It can be
enabled using the S3-related environment variables (see the configuration details below).

If S3 is enabled, files are still stored locally first; after a successful upload, they are uploaded to the
specified S3 back-end. With this approach the local file system plays the role of a so-called "staging area",
while S3 is the final destination for the uploaded files.

## Configuration

Environment variables used:


| Variable name | Default value | Description |
|---------------------|--------------------|-----------------------------------------------------------------|
| BROKER_USERNAME | guest | RabbitMQ broker username |
| BROKER_PASSWORD | guest | RabbitMQ broker password |
| BROKER_HOST | mq | RabbitMQ broker host |
| BROKER_PORT | 5672 | RabbitMQ broker port |
| BROKER_VHOST | / | RabbitMQ broker vhost |
| BROKER_EXCHANGE | sda | RabbitMQ broker exchange |
| BROKER_ROUTING_KEY | files | RabbitMQ broker routing key |
| INBOX_PORT | 2222 | Inbox port |
| INBOX_LOCATION | /ega/inbox/ | Path to POSIX Inbox backend |
| INBOX_FS_PATH | | Prefix path when custom filesystem is used on top of POSIX |
| INBOX_KEYPAIR | | Path to RSA keypair file |
| KEYSTORE_TYPE | JKS | Keystore type to use, JKS or PKCS12 |
| KEYSTORE_PATH | /etc/ega/inbox.jks | Path to Keystore file |
| KEYSTORE_PASSWORD | | Password to access the Keystore |
| CACHE_TTL | 300.0 | CEGA credentials time-to-live |
| CEGA_ENDPOINT | | CEGA REST endpoint |
| CEGA_ENDPOINT_CREDS | | CEGA REST credentials |
| S3_ENDPOINT | inbox-backend:9000 | Inbox S3 backend URL |
| S3_REGION | us-east-1 | Inbox S3 backend region (us-east-1 is default in Minio) |
| S3_ACCESS_KEY | | Inbox S3 backend access key (S3 disabled if not specified) |
| S3_SECRET_KEY | | Inbox S3 backend secret key (S3 disabled if not specified) |
| S3_BUCKET | | Inbox S3 backend bucket (S3 disabled if not specified) |
| USE_SSL | true | true if S3 Inbox backend should be accessed by HTTPS |
| LOGSTASH_HOST | | Hostname of the Logstash instance (if any) |
| LOGSTASH_PORT | | Port of the Logstash instance (if any) |

If `LOGSTASH_HOST` or `LOGSTASH_PORT` is empty, Logstash logging will not be enabled.
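
The enable/disable rules in the table and the note above can be summarised in a short Python sketch. The variable names and default values are copied from the table; the helper names (`setting`, `s3_enabled`, `logstash_enabled`) are hypothetical, and treating an empty value the same as an unset one is an assumption of this example rather than documented behaviour.

```python
import os
from typing import Mapping, Optional

# A subset of the documented defaults, copied from the table above.
DEFAULTS = {
    "INBOX_PORT": "2222",
    "INBOX_LOCATION": "/ega/inbox/",
    "CACHE_TTL": "300.0",
    "S3_ENDPOINT": "inbox-backend:9000",
    "S3_REGION": "us-east-1",
    "USE_SSL": "true",
}


def setting(env: Mapping[str, str], name: str) -> Optional[str]:
    """Return the configured value, falling back to the documented default."""
    # Assumption: an empty string is treated like an unset variable.
    return env.get(name) or DEFAULTS.get(name)


def s3_enabled(env: Mapping[str, str]) -> bool:
    """S3 is used only when access key, secret key and bucket are all given."""
    return all(env.get(key) for key in ("S3_ACCESS_KEY", "S3_SECRET_KEY", "S3_BUCKET"))


def logstash_enabled(env: Mapping[str, str]) -> bool:
    """Logstash logging needs both the host and the port to be non-empty."""
    return bool(env.get("LOGSTASH_HOST")) and bool(env.get("LOGSTASH_PORT"))


if __name__ == "__main__":
    print("S3 back-end enabled:", s3_enabled(os.environ))
    print("Logstash enabled:", logstash_enabled(os.environ))
```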

In addition, environment variables can be used to configure the log level for different packages. Package loggers are configured using the corresponding package names: for example, to turn off Spring's logs, set the environment variable `LOGGING_LEVEL_ORG_SPRINGFRAMEWORK=OFF`, or to set Mina's own logs to debug, set `LOGGING_LEVEL_SE_NBIS_LEGA_INBOX=DEBUG`.
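
The naming pattern visible in the two examples above can be expressed as a one-line mapping from a package name to a variable name. The Python sketch below is illustrative only: the helper name `logging_level_variable` is hypothetical, and the dotted package names are inferred from the variable names in the text rather than stated there.

```python
def logging_level_variable(package: str) -> str:
    """Map a dotted package name to its log-level environment variable name.

    Follows the pattern of the documented examples: prefix with
    LOGGING_LEVEL_, replace dots with underscores, and upper-case.
    """
    return "LOGGING_LEVEL_" + package.replace(".", "_").upper()
```

For instance, the package name `org.springframework` maps to `LOGGING_LEVEL_ORG_SPRINGFRAMEWORK`, the variable used in the first example.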

### SFTP Inbox Local Development/Testing

For local development and testing, see the instructions in the [dev_utils](https://github.com/neicnordic/sensitive-data-archive/tree/main/sda-sftp-inbox/dev_utils) folder.
The README file in that folder has sections on running the pipeline locally using Docker Compose.
48 changes: 5 additions & 43 deletions docs/submission.md
@@ -143,7 +143,7 @@ Submission Inbox
have developed several solutions allowing user authentication against
CentralEGA user database:

- [Apache Mina Inbox](submission.md#apache-mina-inbox);
- [Apache Mina Inbox](submission.md#sftp-inbox);
- [S3 Proxy Inbox](submission.md#s3-proxy-inbox);
- [TSD File API](submission.md#tsd-file-api).

@@ -157,48 +157,10 @@ CentralEGA REST endpoint. Upon return, we store the user credentials in
the local cache and create the user's home directory. The user now gets
logged in if the password or public key authentication succeeds.

### Apache Mina Inbox

This solution makes use of [Apache Mina SSHD
project](https://mina.apache.org/sshd-project/), the user is locked
within their home folder, which is done by using `RootedFileSystem`.

The user's home directory is created upon successful login. Moreover,
for each user, we detect when the file upload is completed and compute
its checksum. This information is provided to CentralEGA via a
[shovel mechanism on the local message broker](connection.md).
We can configure default cache TTL via `CACHE_TTL` environment variable.

#### Apache Mina Configuration

Environment variables used:

| Variable name | Default value | Description |
|:----------------------|:-------------------|:-----------------------------------------------------------|
| `BROKER_USERNAME` | guest | RabbitMQ broker username |
| `BROKER_PASSWORD` | guest | RabbitMQ broker password |
| `BROKER_HOST` | mq | RabbitMQ broker host |
| `BROKER_PORT` | 5672 | RabbitMQ broker port |
| `BROKER_VHOST` | `/` | RabbitMQ broker vhost |
| `INBOX_PORT` | `2222` | Inbox port |
| `INBOX_LOCATION` | /ega/inbox/ | Path to POSIX Inbox backend |
| `INBOX_KEYPAIR` | | Path to RSA keypair file |
| `KEYSTORE_TYPE` | JKS | Keystore type to use, JKS or PKCS12 |
| `KEYSTORE_PATH` | /etc/ega/inbox.jks | Path to Keystore file |
| `KEYSTORE_PASSWORD` | | Password to access the Keystore |
| `CACHE_TTL` | 3600.0 | CEGA credentials time-to-live |
| `CEGA_ENDPOINT` | | CEGA REST endpoint |
| `CEGA_ENDPOINT_CREDS` | | CEGA REST credentials |
| `S3_ENDPOINT` | inbox-backend:9000 | Inbox S3 backend URL |
| `S3_REGION` | us-east-1 | Inbox S3 backend region(us-east-1 is default in Minio) |
| `S3_ACCESS_KEY` | | Inbox S3 backend access key (S3 disabled if not specified) |
| `S3_SECRET_KEY` | | Inbox S3 backend secret key (S3 disabled if not specified) |
| `USE_SSL` | true | true if S3 Inbox backend should be accessed by HTTPS |
| `LOGSTASH_HOST` | | Hostname of the Logstash instance (if any) |
| `LOGSTASH_PORT` | | Port of the Logstash instance (if any) |

As mentioned above, the implementation is based on Java library Apache
Mina SSHD.
{%
include-markdown "services/sftpinbox.md"
heading-offset=3
%}

> NOTE:
> Sources are located at the separate repository:
