Adding Dmytro and Maja's comments. Thanks!
Frédéric Haziza authored and Frédéric Haziza committed Jan 22, 2018
1 parent af3bc95 commit 4a9cc59
Showing 6 changed files with 54 additions and 155 deletions.
29 changes: 18 additions & 11 deletions docs/connection.rst
@@ -4,25 +4,31 @@ Connection CEGA |connect| LEGA
==============================

All Local EGA instances are connected to Central EGA using
`RabbitMQ`_. The latter is the **only** component with the necessary
credentials to connect to Central EGA.
`RabbitMQ`_, a message broker that allows application components to
send and receive messages. Messages are queued, not lost, and resent
on network failure or connection problems. Naturally, this is configurable.

The RabbitMQ message brokers of each LocalEGA are the **only**
components with the necessary credentials to connect to Central
EGA. The other LocalEGA components cannot.

We call ``CegaMQ`` and ``LegaMQ``, the RabbitMQ message brokers of,
respectively, Central EGA and Local EGA.

.. note:: We have fixed the RabbitMQ version to ``3.6.14``.


CentralEGA declares a ``vhost`` per LocalEGA instance. It also
creates the credentials to connect to that ``vhost`` in the form
of a *username/password* pair. The connection uses the AMQP(S)
protocol (The S adds TLS encryption to the traffic).
``CegaMQ`` declares a ``vhost`` for each LocalEGA instance. It also
creates the credentials to connect to that ``vhost`` in the form of a
*username/password* pair. The connection uses the AMQP(S) protocol
(The S adds TLS encryption to the traffic).

LocalEGA uses then a connection string with the following syntax:
``LegaMQ`` then uses a connection string with the following syntax:

.. code-block:: console
amqp[s]://<user>:<password>@<cega-host>:<port>/<vhost>
We call ``CegaMQ`` and ``LegaMQ``, the RabbitMQ message brokers of,
respectively, Central EGA and Local EGA.
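
As an illustration of the connection string syntax above, here is a minimal Python sketch that assembles such a URL. The helper name ``amqp_url`` and the example host, port and credentials are hypothetical and not part of LocalEGA; the only point is that the user, password and ``vhost`` should be percent-encoded before they are placed in the URL.

```python
from urllib.parse import quote


def amqp_url(user, password, host, port, vhost, tls=True):
    """Build amqp[s]://<user>:<password>@<host>:<port>/<vhost>.

    Hypothetical helper for illustration only, not a LocalEGA function.
    """
    scheme = "amqps" if tls else "amqp"  # the S adds TLS encryption
    # Percent-encode every reserved character, including "/" in a vhost.
    return (
        f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(vhost, safe='')}"
    )


if __name__ == "__main__":
    # Made-up values; real credentials are issued by Central EGA.
    print(amqp_url("lega", "s3cr/t", "cega.example.org", 5671, "lega1"))
```

Encoding matters mostly for passwords containing ``/`` or ``@`` and for the default RabbitMQ ``vhost`` named ``/``, which becomes ``%2F`` in the URL.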
``CegaMQ`` contains an exchange named ``localega.v1``. ``v1`` is used for
versioning and is internal to CentralEGA. The queues connected to that
@@ -101,9 +107,10 @@ EGA. They can be added later on, if necessary.
.. _supported checksum algorithm: md5

Adding a new Local EGA instance
===============================
-------------------------------

Central EGA must only prepare a user/password pair along with a ``vhost`` in their RabbitMQ.
Central EGA only has to prepare a user/password pair along with a
``vhost`` in their RabbitMQ.

When Central EGA has communicated these details to the given Local EGA
instance, the latter can contact Central EGA using the federated queue
37 changes: 19 additions & 18 deletions docs/inbox.rst
@@ -5,20 +5,19 @@ Inbox login system

Central EGA contains a database of users, with IDs and passwords.

We have developped an NSS+PAM solution to allow user
authentication via either a password or an RSA key against the
CentralEGA database itself. The user is chrooted into their home
folder.
We have developed an NSS+PAM solution to allow user authentication via
either a password or an RSA key against the CentralEGA database
itself. The user is chroot'ed into their home folder.

The solution uses CentralEGA's user IDs but can also be extended to
use Elixir IDs (of which we strip the @elixir-europe.org suffix).
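
The suffix stripping mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name ``sanitize_id`` is borrowed from the SQL helper of the same name that this commit removes from ``extras/db.sql``, and, like that helper, it drops everything from the first ``@`` onwards rather than matching the Elixir domain exactly.

```python
def sanitize_id(user_id):
    """Return the part of the identifier that precedes the first '@'."""
    return user_id.split("@", 1)[0]


if __name__ == "__main__":
    # Made-up identifier, for illustration only.
    print(sanitize_id("jane@elixir-europe.org"))
```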


The procedure is as follows. The inbox is started without any created
user. When a user wants log into the inbox (actually, only sftp
user. When a user wants to log into the inbox (actually, only sftp
uploads are allowed), the NSS module looks up the username in a local
cache, and, if not found, queries the CentralEGA database. Upon
return, we stores the user credentials in the local cache and create
return, we store the user credentials in the local cache and create
the user's home directory. The user now gets logged in if the password
or public key authentication succeeds. Upon subsequent login attempts,
only the local cache is queried, until the user's credentials
@@ -128,13 +127,13 @@ source code has its own `repository
<https://github.com/NBISweden/LocalEGA-auth>`_. A makefile is provided
to compile and install the necessary shared libraries.

We copied the ``/sbin/sshd`` into an ``/sbin/ega`` binary and configured
the *ega* service by adding a file into the ``/etc/pam.d`` directory. In
this case, name the file ``/etc/pam.d/ega``.
We copied the ``/sbin/sshd`` into an ``/sbin/ega`` binary and
configured the *ega* service by adding a file into the ``/etc/pam.d``
directory. In this case, the name of the file is ``/etc/pam.d/ega``.

.. literalinclude:: /../deployments/docker/images/inbox/pam.ega

The *ega* service is configured as ``sshd`` would. We only use the
The *ega* service is configured just like ``sshd`` is. We only use the
``-c`` switch to specify where the configuration file is. The service
runs for the moment on port 9000.

@@ -148,9 +147,9 @@ whether the user has a valid ssh public key. If it is not the case,
the user is prompted to input a password. Central EGA stores password
hashes using the `BLOWFISH
<https://en.wikipedia.org/wiki/Blowfish_(cipher)>`_ hashing
algorithm. LocalEGA supports also the usual ``MD5``, ``SHA256`` and
``SHA512`` available on most Linux distribution (They are part of the
C library).
algorithm. LocalEGA also supports the usual ``md5``, ``sha256`` and
``sha512`` algorithms available on most Linux distributions (they are
part of the C library).

Updating a user password is not allowed (i.e., the ``password``
*type* is therefore configured to deny every access).
@@ -167,9 +166,11 @@ the session closes, ``setcred`` is bound to fail. However, it
succeeded on the original login, and it will again on the subsequent
logins. That way, if a user logs again, within a cache TTL delay, we
do not re-query the CentralEGA database. After the TTL has elapsed, we
shall query anew the CentralEGA database, eventually receiving new
credentials for that user. Note that it is unlikely that a user will
keep logging in and out, while its password and/or ssh key have been
reset. If so, we can implement a flush mechanism, given to CentralEGA,
if necessary.
query the CentralEGA database anew, possibly receiving new
credentials for that user.
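
The cache behaviour described above (answer from the local cache within the TTL, re-query CentralEGA once it has elapsed) can be sketched as follows. This is a Python illustration under stated assumptions, not the actual NSS/PAM code, which lives in the LocalEGA-auth repository: the class ``CredentialCache``, the ``fetch`` callback standing in for the CentralEGA query, and the ``flush`` method are all hypothetical names.

```python
import time


class CredentialCache:
    """Keep user credentials for ttl_seconds before re-fetching them."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch      # stand-in for the CentralEGA lookup
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the TTL can be tested
        self._entries = {}       # username -> (stored_at, credentials)

    def get(self, username):
        now = self._clock()
        hit = self._entries.get(username)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # still fresh: no remote query
        credentials = self._fetch(username)
        self._entries[username] = (now, credentials)
        return credentials

    def flush(self, username):
        """Drop one entry so that the next login re-queries the source."""
        self._entries.pop(username, None)
```

A ``flush`` of this kind is one possible shape for the flush mechanism mentioned in the text; it forces a fresh lookup without waiting for the TTL to expire.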

Note that it is unlikely that a user will keep logging in and out
while their password and/or ssh key have been reset. If that happens,
we can implement a flush mechanism, given to CentralEGA (not
complicated, and ... not a priority).

34 changes: 1 addition & 33 deletions docs/ingestion/db.rst
@@ -7,7 +7,7 @@ schema is as follows.

.. literalinclude:: /../extras/db.sql
:language: sql
:lines: 5,6,14-23,94-110,130-136
:lines: 5-7,14-31,50-56

We do not use any Object-Relational Mapper (ORM, such as
SQLAlchemy). Instead, we simply implemented, in SQL, a few functions
@@ -29,35 +29,3 @@ in order to insert or manipulate the database entry.
Look at :doc:`the SQL definitions </../extras/db.sql>` if you are also
interested in the database triggers.


..
.. code-block:: sql
FUNCTION sanitize_id(elixir_id users.elixir_id%TYPE)
RETURNS users.elixir_id%TYPE

FUNCTION insert_user(elixir_id users.elixir_id%TYPE,
password_hash users.password_hash%TYPE,
public_key users.pubkey%TYPE,
exp_int users.expiration%TYPE DEFAULT INTERVAL '1' MONTH)
RETURNS users.id%TYPE

-- Delete other user entries that are too old
FUNCTION refresh_user(elixir_id users.elixir_id%TYPE)
RETURNS void

-- Refresh expiration for user
FUNCTION update_users()
RETURNS trigger AS $update_users$
BEGIN
DELETE FROM users WHERE last_accessed < current_timestamp - expiration;
RETURN NEW;
END;
$update_users$ LANGUAGE plpgsql;

TRIGGER delete_expired_users_trigger AFTER UPDATE ON users EXECUTE PROCEDURE update_users();

-- Remove user entry from the database cache
FUNCTION flush_user(elixir_id users.elixir_id%TYPE)
RETURNS void
14 changes: 8 additions & 6 deletions docs/ingestion/overview.rst
@@ -10,19 +10,21 @@ procedure. We assume the files are already uploaded in the user inbox.
:target: ../_static/CEGA-LEGA.png
:alt: General Architecture

Central EGA drops a message per file to ingest, containing the
*username*, the *filename* and the *checksums* (along with their
related algorithm) of the encrypted file and the decrypted
content. The message is picked up by some ingestion workers. Many
ingestion workers can be running concurrently.
For a given LocalEGA, Central EGA selects the associated ``vhost`` and
drops, in the ``files`` queue, one message per file to ingest. A
message contains the *username*, the *filename* and the *checksums*
(along with their related algorithm) of the encrypted file and the
decrypted content. The message is picked up by some ingestion
workers. Several ingestion workers may be running concurrently at any
given time.

For each file, if it is found in the inbox, checksums are computed to
verify the integrity of the file (i.e., whether the file was properly
uploaded). If the checksums are not provided, they will be derived
from companion files. Each worker retrieves the decryption key in a
secure manner, from the keyserver, and decrypts the file.
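
The integrity check described above amounts to recomputing a digest over the uploaded file and comparing it with the value received in the message. A minimal sketch using Python's standard ``hashlib`` follows; the function names ``file_checksum`` and ``is_intact`` are hypothetical and this is not the ingestion worker's actual code.

```python
import hashlib


def file_checksum(path, algorithm, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so memory use stays bounded."""
    digest = hashlib.new(algorithm)  # e.g. "md5" or "sha256"
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_hex, algorithm):
    """Compare the recomputed digest with the expected hexadecimal value."""
    return file_checksum(path, algorithm) == expected_hex.strip().lower()
```

Reading in chunks rather than loading the whole file keeps the memory footprint independent of the file size, which matters for large uploads.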

To improve efficiency, each block that are decrypted are piped into a
To improve efficiency, each block that is decrypted is piped into a
separate process for re-encryption. This has the advantage of
constraining the memory usage per worker and saving the re-encryption
time. In addition to the re-encryption, we also compute the checksum
13 changes: 7 additions & 6 deletions docs/setup.rst
@@ -9,18 +9,19 @@ The sources for LocalEGA can be downloaded and installed from the `NBIS Github r
$ pip install git+https://github.com/NBISweden/LocalEGA.git
The preferred method is however to use one of our deployment strategy: either on `docker`_ or on `Openstack cloud`_.
The preferred method is, however, to use one of our deployment strategies: either on `Docker`_ or on `OpenStack cloud`_.

Configuration
=============

A few files are required in order to connect the different components.

The main configurations are set by default, and it is possible to
overwrite any of them. All Python components can be indeed started
overwrite any of them. All Python components can indeed be started
using the ``--conf <file>`` switch to specify the configuration file.

The settings are loaded, in order:

* from the package's ``defaults.ini``
* from the file ``/etc/ega/conf.ini`` (if it exists)
* and finally from the file specified as the ``--conf`` argument.
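
The loading order listed above maps naturally onto Python's standard ``configparser``, where files read later override values from files read earlier. The following is a minimal sketch under that assumption, not LocalEGA's actual loader; the function name ``load_settings`` and its parameters are hypothetical.

```python
import configparser


def load_settings(defaults_path, conf_path=None, system_path="/etc/ega/conf.ini"):
    """Read defaults, then the system-wide file, then the --conf file."""
    config = configparser.ConfigParser()
    candidates = [defaults_path, system_path]
    if conf_path:
        candidates.append(conf_path)
    # read() skips files that do not exist and returns those it parsed;
    # values from later files replace values from earlier ones.
    loaded = config.read(candidates, encoding="utf-8")
    return config, loaded
```

Returning the list of files that were actually parsed makes it easy to log which configuration sources took effect.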
@@ -56,8 +57,8 @@ Bootstrap
=========

In order to simplify the setup of LocalEGA's components, we have
developped a few bootstrap scripts (one for the `docker`_ deployment
and one for the `Openstack cloud`_ deployment).
developed a few bootstrap scripts (one for the `Docker`_ deployment
and one for the `OpenStack cloud`_ deployment).

Those scripts create random passwords, configuration files, GnuPG keys,
RSA keys and connect the different components together.
@@ -68,6 +69,6 @@ file there.


.. _NBIS Github repo: https://github.com/NBISweden/LocalEGA
.. _docker: https://github.com/NBISweden/LocalEGA/tree/dev/deployments/docker
.. _Openstack cloud: https://github.com/NBISweden/LocalEGA/tree/dev/deployments/terraform
.. _Docker: https://github.com/NBISweden/LocalEGA/tree/dev/deployments/docker
.. _OpenStack cloud: https://github.com/NBISweden/LocalEGA/tree/dev/deployments/terraform
.. _available: https://github.com/NBISweden/LocalEGA/tree/dev/lega/conf/loggers
82 changes: 1 addition & 81 deletions extras/db.sql
@@ -8,92 +8,12 @@ CREATE TYPE hash_algo AS ENUM ('md5', 'sha256');
CREATE EXTENSION pgcrypto;


-- ##################################################
-- USERS
-- ##################################################
CREATE TABLE users (
id SERIAL, PRIMARY KEY(id), UNIQUE(id),
elixir_id TEXT NOT NULL, UNIQUE(elixir_id),
password_hash TEXT,
pubkey TEXT,
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT clock_timestamp(),
last_accessed TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT clock_timestamp(),
expiration INTERVAL NOT NULL,
CHECK (password_hash IS NOT NULL OR pubkey IS NOT NULL)
);

CREATE FUNCTION sanitize_id(elixir_id users.elixir_id%TYPE)
RETURNS users.elixir_id%TYPE AS $sanitize_id$
DECLARE
eid users.elixir_id%TYPE;
BEGIN
-- eid := trim(trailing '@elixir-europe.org' from elixir_id);
eid := regexp_replace(elixir_id, '@.*', '');
RETURN eid;
END;
$sanitize_id$ LANGUAGE plpgsql;

CREATE FUNCTION insert_user(elixir_id users.elixir_id%TYPE,
password_hash users.password_hash%TYPE,
public_key users.pubkey%TYPE,
exp_int users.expiration%TYPE DEFAULT INTERVAL '1' MONTH)

RETURNS users.id%TYPE AS $insert_user$
#variable_conflict use_column
DECLARE
user_id users.elixir_id%TYPE;
eid users.elixir_id%TYPE;
BEGIN
eid := sanitize_id(elixir_id);
INSERT INTO users (elixir_id,password_hash,pubkey,expiration) VALUES(eid,password_hash,public_key,exp_int)
ON CONFLICT (elixir_id) DO UPDATE SET last_accessed = DEFAULT, expiration = exp_int
RETURNING users.id INTO user_id;
RETURN user_id;
END;
$insert_user$ LANGUAGE plpgsql;

-- Delete other user entries that are too old
CREATE FUNCTION refresh_user(elixir_id users.elixir_id%TYPE)

RETURNS void AS $refresh_user$
#variable_conflict use_column
DECLARE
eid users.elixir_id%TYPE;
BEGIN
eid := sanitize_id(elixir_id);
UPDATE users SET last_accessed = DEFAULT WHERE elixir_id = eid;
RETURN;
END;
$refresh_user$ LANGUAGE plpgsql;

CREATE FUNCTION update_users()
RETURNS trigger AS $update_users$
BEGIN
DELETE FROM users WHERE last_accessed < current_timestamp - expiration;
RETURN NEW;
END;
$update_users$ LANGUAGE plpgsql;

CREATE TRIGGER delete_expired_users_trigger AFTER UPDATE ON users EXECUTE PROCEDURE update_users();

CREATE FUNCTION flush_user(elixir_id users.elixir_id%TYPE)
RETURNS void AS $flush_user$
#variable_conflict use_column
DECLARE
eid users.elixir_id%TYPE;
BEGIN
eid := sanitize_id(elixir_id);
DELETE FROM users WHERE elixir_id = eid; -- Future: and ega_user is true
RETURN;
END;
$flush_user$ LANGUAGE plpgsql;

-- ##################################################
-- FILES
-- ##################################################
CREATE TABLE files (
id SERIAL, PRIMARY KEY(id), UNIQUE (id),
elixir_id TEXT REFERENCES users (elixir_id) ON DELETE CASCADE,
elixir_id TEXT NOT NULL,
filename TEXT NOT NULL,
enc_checksum TEXT,
enc_checksum_algo hash_algo,
