Merge pull request #29 from EGA-archive/bye_bye_vault
Rename Vault to Archive
blankdots committed Feb 18, 2019
2 parents 08eb815 + bc1f2f4 commit 69c687d
Showing 25 changed files with 141 additions and 141 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -25,9 +25,9 @@ LocalEGA is divided into several components, as docker containers.
| inbox | SFTP server, acting as a dropbox, where user credentials are fetched from CentralEGA |
| ingesters | Split the Crypt4GH header and move the remainder to the storage backend. No cryptographic task, nor access to the decryption keys. |
| verifiers | Decrypt the stored files and checksum them against their embedded checksum. |
- | vault | Storage backend: as a regular file system or as a S3 object store. |
+ | archive | Storage backend: as a regular file system or as a S3 object store. |
| finalizers | Handle the so-called _Stable ID_ filename mappings from CentralEGA. |
| outgesters | Front-facing checks for download permissions. |
- | streamers | Fetch the files from the vault and re-encrypt its header for the given requester. |
+ | streamers | Fetch the files from the archive and re-encrypt its header for the given requester. |

Find the [LocalEGA documentation](http://localega.readthedocs.io) hosted on [ReadTheDocs.org](https://readthedocs.org/).
2 changes: 1 addition & 1 deletion deploy/README.md
@@ -45,7 +45,7 @@ Use `docker-compose up -d --scale ingest=3 --scale verify=5` instead,
if you want to start 3 ingestion and 5 verification workers.

Note that, in this architecture, we use separate volumes, e.g. for
- the inbox area, for the vault (here backed by S3). They
+ the inbox area, for the archive (here backed by S3). They
will be created on-the-fly by docker-compose.

## Stopping
40 changes: 20 additions & 20 deletions deploy/bootstrap/run.sh
@@ -196,19 +196,19 @@ database = lega
try = 30
sslmode = require
- [vault]
- driver = S3Storage
- url = http://vault:9000
- access_key = ${S3_ACCESS_KEY}
- secret_key = ${S3_SECRET_KEY}
+ [archive]
+ storage_driver = S3Storage
+ s3_url = http://archive:9000
+ s3_access_key = ${S3_ACCESS_KEY}
+ s3_secret_key = ${S3_SECRET_KEY}
#region = lega
EOF

if [[ ${INBOX_BACKEND} == 's3' ]]; then
cat >> ${PRIVATE}/conf.ini <<EOF
[inbox]
- driver = S3Storage
+ storage_driver = S3Storage
url = http://inbox-s3-backend:9000
access_key = ${S3_ACCESS_KEY_INBOX}
secret_key = ${S3_SECRET_KEY_INBOX}
@@ -239,7 +239,7 @@ networks:
volumes:
db:
inbox:
- vault:
+ archive:
EOF

if [[ ${INBOX_BACKEND} == 's3' ]]; then
@@ -378,8 +378,8 @@ cat >> ${PRIVATE}/lega.yml <<EOF
labels:
lega_label: "ingest"
environment:
- - VAULT_ACCESS_KEY=${S3_ACCESS_KEY}
- - VAULT_SECRET_KEY=${S3_SECRET_KEY}
+ - S3_ACCESS_KEY=${S3_ACCESS_KEY}
+ - S3_SECRET_KEY=${S3_SECRET_KEY}
- AWS_ACCESS_KEY_ID=${S3_ACCESS_KEY}
- AWS_SECRET_ACCESS_KEY=${S3_SECRET_KEY}
volumes:
@@ -463,8 +463,8 @@ cat >> ${PRIVATE}/lega.yml <<EOF
image: egarchive/lega-base:latest
environment:
- LEGA_PASSWORD=${LEGA_PASSWORD}
- - VAULT_ACCESS_KEY=${S3_ACCESS_KEY}
- - VAULT_SECRET_KEY=${S3_SECRET_KEY}
+ - S3_ACCESS_KEY=${S3_ACCESS_KEY}
+ - S3_SECRET_KEY=${S3_SECRET_KEY}
- AWS_ACCESS_KEY_ID=${S3_ACCESS_KEY}
- AWS_SECRET_ACCESS_KEY=${S3_SECRET_KEY}
volumes:
@@ -479,7 +479,7 @@ cat >> ${PRIVATE}/lega.yml <<EOF
# Data Out re-encryption service
res:
depends_on:
- - vault
+ - archive
- keys
hostname: res
container_name: res
@@ -500,7 +500,7 @@ cat >> ${PRIVATE}/lega.yml <<EOF
- EGA_SHAREDPASS_PATH=/etc/ega/pgp/ega.shared.pass
- EGA_EBI_AWS_ACCESS_KEY=${S3_ACCESS_KEY}
- EGA_EBI_AWS_ACCESS_SECRET=${S3_SECRET_KEY}
- - EGA_EBI_AWS_ENDPOINT_URL=http://vault:${DOCKER_PORT_s3}
+ - EGA_EBI_AWS_ENDPOINT_URL=http://archive:${DOCKER_PORT_s3}
- EGA_EBI_AWS_ENDPOINT_REGION=
volumes:
- ./pgp/ega.shared.pass:/etc/ega/pgp/ega.shared.pass:ro
@@ -509,17 +509,17 @@ cat >> ${PRIVATE}/lega.yml <<EOF
- lega
# Storage backend: S3
- vault:
- hostname: vault
- container_name: vault
+ archive:
+ hostname: archive
+ container_name: archive
labels:
- lega_label: "vault"
+ lega_label: "archive"
image: minio/minio:RELEASE.2018-12-19T23-46-24Z
environment:
- MINIO_ACCESS_KEY=${S3_ACCESS_KEY}
- MINIO_SECRET_KEY=${S3_SECRET_KEY}
volumes:
- - vault:/data
+ - archive:/data
restart: on-failure:3
networks:
- lega
@@ -659,8 +659,8 @@ DB_LEGA_OUT_USER = lega_out
CEGA_CONNECTION = ${CEGA_CONNECTION}
CEGA_ENDPOINT_CREDS = ${CEGA_USERS_CREDS}
#
- VAULT_ACCESS_KEY = ${S3_ACCESS_KEY}
- VAULT_SECRET_KEY = ${S3_SECRET_KEY}
+ S3_ACCESS_KEY = ${S3_ACCESS_KEY}
+ S3_SECRET_KEY = ${S3_SECRET_KEY}
#
DOCKER_PORT_inbox = ${DOCKER_PORT_inbox}
DOCKER_PORT_mq = ${DOCKER_PORT_mq}
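Note on the `run.sh` change above: the generated `conf.ini` now carries an `[archive]` section with `storage_driver`, `s3_url`, `s3_access_key` and `s3_secret_key` in place of the old `[vault]` keys. The following is only a minimal sketch of how a consumer might read the renamed section with Python's standard `configparser`. It is not LocalEGA's own configuration loader; the helper name `load_archive_settings` and the placeholder credentials are assumptions made for illustration.

```python
import configparser

# Sample mirroring the [archive] section that run.sh writes after this commit.
# The credential values are placeholders, not real keys.
SAMPLE_CONF = """
[archive]
storage_driver = S3Storage
s3_url = http://archive:9000
s3_access_key = example-access-key
s3_secret_key = example-secret-key
"""


def load_archive_settings(text):
    """Return the [archive] options as a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # A leftover [vault] section suggests a config generated before the rename.
    if parser.has_section("vault"):
        raise ValueError("found legacy [vault] section; rename it to [archive]")
    return dict(parser["archive"])


settings = load_archive_settings(SAMPLE_CONF)
print(settings["storage_driver"], settings["s3_url"])
```

Failing loudly on a stale `[vault]` section is a design choice of this sketch: a half-migrated deployment surfaces as an explicit error rather than a missing-key lookup later on.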
30 changes: 15 additions & 15 deletions deploy/images/db/download.sql
@@ -13,7 +13,7 @@ CREATE TABLE local_ega_download.status (
INSERT INTO local_ega_download.status(id,code,description)
VALUES (10, 'INIT' , 'Initializing a download request'),
(20, 'REENCRYPTING', 'Re-Encrypting the header for a given user'),
- (30, 'STREAMING' , 'Streaming file from the Vault'),
+ (30, 'STREAMING' , 'Streaming file from the Archive'),
(40, 'DONE' , 'Download completed'), -- checksums are in the Crypt4GH formatted file
-- and validated by the decryptor
(0, 'ERROR' , 'An Error occured, check the error table');
@@ -41,37 +41,37 @@ CREATE TABLE local_ega_download.main (
);


- -- Insert new request, and return some vault information
+ -- Insert new request, and return some archive information
CREATE TYPE request_type AS (req_id INTEGER, -- local_ega_download.main.id%TYPE,
- file_id INTEGER, -- local_ega.vault_files.id%TYPE,
- header TEXT, -- local_ega.vault_files.header%TYPE,
- vault_path TEXT, -- local_ega.vault_files.vault_file_reference%TYPE,
- vault_type local_ega.storage);--local_ega.vault_files.vault_file_type%TYPE);
+ file_id INTEGER, -- local_ega.archive_files.id%TYPE,
+ header TEXT, -- local_ega.archive_files.header%TYPE,
+ archive_path TEXT, -- local_ega.archive_files.archive_file_reference%TYPE,
+ archive_type local_ega.storage);--local_ega.archive_files.archive_file_type%TYPE);

CREATE FUNCTION make_request(sid local_ega.main.stable_id%TYPE)
RETURNS request_type AS $make_request$
#variable_conflict use_column
DECLARE
req local_ega_download.request_type;
- vault_rec local_ega.vault_files%ROWTYPE;
+ archive_rec local_ega.archive_files%ROWTYPE;
rid INTEGER;
BEGIN

- SELECT * INTO vault_rec FROM local_ega.vault_files WHERE stable_id = sid LIMIT 1;
+ SELECT * INTO archive_rec FROM local_ega.archive_files WHERE stable_id = sid LIMIT 1;

- IF vault_rec IS NULL THEN
- RAISE EXCEPTION 'Vault file not found for stable_id: % ', sid;
+ IF archive_rec IS NULL THEN
+ RAISE EXCEPTION 'Archive file not found for stable_id: % ', sid;
END IF;

INSERT INTO local_ega_download.main (file_id, status)
- VALUES (vault_rec.id, 'INIT')
+ VALUES (archive_rec.id, 'INIT')
RETURNING local_ega_download.main.id INTO rid;

req.req_id := rid;
- req.file_id := vault_rec.id;
- req.header := vault_rec.header;
- req.vault_path := vault_rec.vault_file_reference;
- req.vault_type := vault_rec.vault_file_type;
+ req.file_id := archive_rec.id;
+ req.header := archive_rec.header;
+ req.archive_path := archive_rec.archive_file_reference;
+ req.archive_type := archive_rec.archive_file_type;
RETURN req;
END;
$make_request$ LANGUAGE plpgsql;
10 changes: 5 additions & 5 deletions deploy/images/db/ebi.sql
@@ -3,11 +3,11 @@
CREATE VIEW local_ega.ebi_files AS
SELECT id AS file_id,
stable_id AS file_name,
- vault_file_reference AS file_path,
- vault_file_type AS file_type,
- vault_file_size AS file_size,
- vault_file_checksum AS unencrypted_checksum,
- vault_file_checksum_type AS unencrypted_checksum_type,
+ archive_file_reference AS file_path,
+ archive_file_type AS file_type,
+ archive_file_size AS file_size,
+ archive_file_checksum AS unencrypted_checksum,
+ archive_file_checksum_type AS unencrypted_checksum_type,
header AS header,
created_by AS created_by,
last_modified_by AS last_updated_by,
2 changes: 1 addition & 1 deletion deploy/images/db/grants.sql
@@ -16,7 +16,7 @@ CREATE USER lega_out;
GRANT USAGE ON SCHEMA local_ega TO lega_in, lega_out;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA local_ega TO lega_in; -- Read/Write access on local_ega.* for lega_in
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA local_ega TO lega_in; -- Don't forget the sequences
- GRANT SELECT ON local_ega.vault_files TO lega_out; -- Read-Only access for lega_out
+ GRANT SELECT ON local_ega.archive_files TO lega_out; -- Read-Only access for lega_out
GRANT SELECT ON local_ega.ebi_files TO lega_out; -- Used by EBI
GRANT SELECT ON local_ega.index_files TO lega_out; -- Used by EBI
GRANT SELECT ON local_ega.file2dataset TO lega_out; -- Used by EBI
48 changes: 24 additions & 24 deletions deploy/images/db/main.sql
@@ -30,8 +30,8 @@ CREATE TABLE local_ega.status (
INSERT INTO local_ega.status(id,code,description)
VALUES (10, 'INIT' , 'Initializing a file ingestion'),
(20, 'IN_INGESTION', 'Currently under ingestion'),
- (30, 'ARCHIVED' , 'File moved to Vault'),
- (40, 'COMPLETED' , 'File verified in Vault'),
+ (30, 'ARCHIVED' , 'File moved to Archive'),
+ (40, 'COMPLETED' , 'File verified in Archive'),
(50, 'READY' , 'File ingested, ready for download'),
-- (60, 'IN_INDEXING', 'Currently under index creation'),
(0, 'ERROR' , 'An Error occured, check the error table'),
@@ -41,12 +41,12 @@ VALUES (10, 'INIT' , 'Initializing a file ingestion'),
-- ##################################################
-- ENCRYPTION FORMAT
-- ##################################################
- CREATE TABLE local_ega.vault_encryption (
+ CREATE TABLE local_ega.archive_encryption (
mode VARCHAR(16) NOT NULL, PRIMARY KEY(mode), UNIQUE (mode),
description TEXT
);

- INSERT INTO local_ega.vault_encryption(mode,description)
+ INSERT INTO local_ega.archive_encryption(mode,description)
VALUES ('CRYPT4GH' , 'Crypt4GH encryption (using version)'),
('PGP' , 'OpenPGP encryption (RFC 4880)'),
('AES' , 'AES encryption with passphrase'),
@@ -78,15 +78,15 @@ CREATE TABLE local_ega.main (
submission_file_size BIGINT NULL,
submission_user TEXT NOT NULL, -- Elixir ID, or internal user

- -- Vault information
- vault_file_reference TEXT, -- file path if POSIX, object id if S3
- vault_file_type storage, -- S3 or POSIX file system
- vault_file_size BIGINT,
- vault_file_checksum VARCHAR(128) NULL, -- NOT NULL,
- vault_file_checksum_type checksum_algorithm,
+ -- Archive information
+ archive_file_reference TEXT, -- file path if POSIX, object id if S3
+ archive_file_type storage, -- S3 or POSIX file system
+ archive_file_size BIGINT,
+ archive_file_checksum VARCHAR(128) NULL, -- NOT NULL,
+ archive_file_checksum_type checksum_algorithm,

-- Encryption/Decryption
- encryption_method VARCHAR REFERENCES local_ega.vault_encryption (mode), -- ON DELETE CASCADE,
+ encryption_method VARCHAR REFERENCES local_ega.archive_encryption (mode), -- ON DELETE CASCADE,
version INTEGER , -- DEFAULT 1, -- Crypt4GH version
header TEXT, -- Crypt4GH header
session_key_checksum VARCHAR(128) NULL, -- NOT NULL, -- To check if session key already used
@@ -138,11 +138,11 @@ SELECT id,
submission_file_calculated_checksum AS inbox_file_checksum,
submission_file_calculated_checksum_type AS inbox_file_checksum_type,
status,
- vault_file_reference AS vault_path,
- vault_file_type AS vault_type,
- vault_file_size AS vault_filesize,
- vault_file_checksum AS unencrypted_checksum,
- vault_file_checksum_type AS unencrypted_checksum_type,
+ archive_file_reference AS archive_path,
+ archive_file_type AS archive_type,
+ archive_file_size AS archive_filesize,
+ archive_file_checksum AS unencrypted_checksum,
+ archive_file_checksum_type AS unencrypted_checksum_type,
stable_id,
header, -- Crypt4gh specific
version,
@@ -173,7 +173,7 @@ RETURNS local_ega.main.id%TYPE AS $insert_file$
submission_user,
submission_file_extension,
status,
- encryption_method) -- hard-code the vault_encryption
+ encryption_method) -- hard-code the archive_encryption
VALUES(inpath,eid,file_ext,'INIT','CRYPT4GH') RETURNING local_ega.main.id
INTO file_id;
RETURN file_id;
@@ -198,7 +198,7 @@ CREATE FUNCTION finalize_file(inpath local_ega.files.inbox_path%TYPE,
-- inbox_path = inpath AND
-- status <> 'COMPLETED')
-- THEN
- -- RAISE EXCEPTION 'Vault file not in completed state for stable_id: % ', sid;
+ -- RAISE EXCEPTION 'Archive file not in completed state for stable_id: % ', sid;
-- END IF;
-- Go ahead and mark _them_ done
UPDATE local_ega.files
@@ -275,12 +275,12 @@ CREATE TRIGGER mark_ready
-- For data-out
-- ##########################################################################

- -- View on the vault files
- CREATE VIEW local_ega.vault_files AS
+ -- View on the archive files
+ CREATE VIEW local_ega.archive_files AS
SELECT id,
stable_id,
- vault_file_reference,
- vault_file_type,
+ archive_file_reference,
+ archive_file_type,
header
FROM local_ega.main
WHERE status = 'READY';
@@ -290,14 +290,14 @@ WHERE status = 'READY';
-- ##########################################################################
--
--
- -- We can support multiple encryption types in the vault
+ -- We can support multiple encryption types in the archive
-- (Say, for example, Crypt4GH, PGP and plain AES),
-- in the following manner:
--
-- We create a table for each method of encryption.
-- Each table will have its own set of fields, refering to data it needs for decryption
--
- -- Then we update the main file table with a vault_encryption "keyword".
+ -- Then we update the main file table with a archive_encryption "keyword".
-- That will tell the main file table to look at another table for that
-- particular file. (Note that this file reference is found in only one
-- encryption table).
@@ -22,7 +22,7 @@ public Outgetsion(Context context) {
When("^I download archived file$", () -> {
try {
Map<String, String> ingestionInformation = context.getIngestionInformation();
- String filePath = ingestionInformation.get("vault_path");
+ String filePath = ingestionInformation.get("archive_path");
URL resURL = new URL(String.format("http://localhost:8081/file?sourceKey=%s&sourceIV=%s&filePath=%s",
context.getSessionKey(),
context.getIv(),
2 changes: 1 addition & 1 deletion deploy/tests/src/test/resources/config.properties
@@ -13,6 +13,6 @@ container.label.inbox = inbox
container.label.ingest = ingest
container.label.keys = keys
container.label.mq = mq
- container.label.s3 = vault
+ container.label.s3 = archive
container.label.verify = verify
container.label.cega-users = cega-users
20 changes: 10 additions & 10 deletions docs/connection.rst
@@ -52,16 +52,16 @@ use the stub implementation of CentralEGA and the following queues, per

``LegaMQ`` contains two exchanges named ``lega`` and ``cega``, and the following queues, in the default ``vhost``:

- +-----------------+-------------------------------------+
- | Name | Purpose |
- +=================+=====================================+
- | files | Trigger for file ingestion |
- +-----------------+-------------------------------------+
- | archived | The file is in the vault |
- +-----------------+-------------------------------------+
- | qc | The file is "verified" in the vault |
- | | and Quality Controllers can execute |
- +-----------------+-------------------------------------+
+ +-----------------+---------------------------------------+
+ | Name | Purpose |
+ +=================+=======================================+
+ | files | Trigger for file ingestion |
+ +-----------------+---------------------------------------+
+ | archived | The file is in the archive |
+ +-----------------+---------------------------------------+
+ | qc | The file is "verified" in the archive |
+ | | and Quality Controllers can execute |
+ +-----------------+---------------------------------------+

``LegaMQ`` registers ``CegaMQ`` as an *upstream* and listens to the
incoming messages in ``files`` using a *federated queue*. Ingestion
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -16,7 +16,7 @@ The workflow consists of two ordered parts:

The user first logs onto the Local EGA's inbox and uploads its
files. He/She then goes to the Central EGA's interface to prepare a
- submission. Upon completion, the files are ingested into the vault and
+ submission. Upon completion, the files are ingested into the archive and
become searchable by the Central EGA's engine.

----
2 changes: 1 addition & 1 deletion docs/ingestion.rst
@@ -32,7 +32,7 @@ either a regular file system on disk, or an S3 object storage.

The files are read chunk by chunk in order to bound the memory
usage. After completion, the remainder of the file (the AES encrypted
- bulk part) is in the vault and a message is dropped into the local
+ bulk part) is in the archive and a message is dropped into the local
message broker to signal that the next step can start.

The next step is a verification step to ensure that the stored file is
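Note on the `docs/ingestion.rst` hunk above: the text says files are read chunk by chunk to bound memory usage before the remainder lands in the archive. As a rough illustration of that idea only, and not the ingestion worker's actual code, a bounded-memory copy could look like the sketch below. The function name `copy_in_chunks` and the 4 MiB default are assumptions; the documentation does not state a chunk size, and header splitting is left out.

```python
import io

# Assumed default; the documentation does not state the real chunk size.
CHUNK_SIZE = 4 * 1024 * 1024


def copy_in_chunks(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst, holding at most chunk_size bytes in memory at once.

    Returns the number of bytes copied. Both arguments are binary
    file-like objects (hypothetical helper, for illustration only).
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes object means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# Usage with in-memory streams standing in for the inbox file and the archive.
inbox_file = io.BytesIO(b"0123456789")
archive_file = io.BytesIO()
copied = copy_in_chunks(inbox_file, archive_file, chunk_size=3)
print(copied)
```

Peak memory here depends on `chunk_size` rather than on the file size, which is the property the documentation paragraph relies on.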
