
Stable IDs #341

Merged Sep 11, 2018 (11 commits)

Conversation

silverdaz (Contributor)

According to Oscar, stable ID generation for CentralEGA is handled by a process at EBI, and it depends on the content of the original file.

We use the generated stable ID after a successful ingestion, which allows us:

  • not to waste any stable ID in case of errors (and no need to recycle or clean them up);
  • to mark the ingested file as completed, but not yet ready for data-out. Once the stable ID comes in, we flag the file as ready for download.
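A sketch of the resulting two-step flow (illustrative only: `db.mark_completed` appears in this PR's diff, while the import path and `set_stable_id` are assumptions):

```python
from lega.utils import db  # assumed import path

def on_verified(file_id):
    """Mark the file as ingested, but not yet visible for data-out."""
    db.mark_completed(file_id)

def on_stable_id(file_id, stable_id):
    """Flag the file as ready for download once CentralEGA's stable ID arrives."""
    db.set_stable_id(file_id, stable_id)
```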

@silverdaz silverdaz added this to the Sprint 35 milestone Sep 3, 2018
@silverdaz silverdaz self-assigned this Sep 3, 2018
@silverdaz (Contributor, Author)

@blankdots: When you are back from holidays and read this, could you please give me a hand with the tests? (My local tests worked fine.)

The tests fail on the Eureka part, but I have touched neither the Eureka code nor its tests. So either it was failing before, or there is some magic happening in the mocks.

@dtitov commented Sep 4, 2018

@silverdaz Could you please elaborate on why we don't request the stable ID in verify.py, right after a successful checksum validation? What is the necessity of introducing a new microservice?

In my mind, the logic of verify.py could be:

  1. Validate the checksum:
    a. If validation failed, do what we currently do in that case;
    b. If validation succeeded, do what we currently do in that case, plus query CentralEGA for the stable ID and update it in our database.

Will it work like that or am I missing something?
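A hypothetical sketch of that synchronous flow (the `requests` call, the CentralEGA endpoint, and the `db` helpers are illustrative assumptions, not an actual API):

```python
import requests

from lega.utils import db  # assumed import path

def verify(file_id, computed_checksum, expected_checksum):
    if computed_checksum != expected_checksum:
        db.set_error(file_id, 'Checksum mismatch')  # 1a: current error path
        return
    db.mark_completed(file_id)                      # 1b: current success path
    # ...plus query CentralEGA synchronously for the stable ID:
    resp = requests.get(f'https://cega.example.org/stable-ids/{file_id}')
    resp.raise_for_status()
    db.set_stable_id(file_id, resp.json()['stable_id'])
```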

@silverdaz (Contributor, Author)

Ah, yes, sure, I had not thought about putting that code inside verify.py.

But this is asynchronous: we do not "query" CentralEGA and wait for the response synchronously.
We instead send them a message, and they respond when they find the time.
That's why there is a separate service that listens for their answer, and why verify does not handle the stable ID.
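A minimal sketch of the publishing side, assuming the pika client and hypothetical exchange, routing-key, and payload names:

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Publish and return immediately; CentralEGA's answer arrives later, out of
# band, and is picked up by the dedicated listener service.
message = {'file_id': 42}  # hypothetical payload
channel.basic_publish(exchange='cega',          # hypothetical exchange
                      routing_key='stableID',   # hypothetical routing key
                      body=json.dumps(message),
                      properties=pika.BasicProperties(delivery_mode=2))  # persistent
connection.close()
```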

@dtitov commented Sep 4, 2018

Are we making our system more complicated just to make CEGA's life easier? :)

To be honest, I would even include stable ID querying in ingest.py, because I don't care much about wasting some IDs: a UUID is a 128-bit identifier, so you can imagine what a nearly endless space of IDs it gives us.

Also, I don't think that ID generation is such a heavy, CPU-consuming operation, or that it's hard to implement: it's basically a call to UUID.randomUUID().
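In Python terms (LocalEGA's language), the equivalent one-liner would be:

```python
import uuid

# A random 128-bit (version 4) UUID: cheap to generate, no coordination needed.
print(uuid.uuid4())
```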

So, concluding, I would definitely not overload LocalEGA's setup with yet another microservice just because CentralEGA cannot or doesn't want to generate UUIDs for us upon request: neither they nor we should be concerned about running out of IDs or about the heaviness of a synchronous approach.

Can you discuss this with Oscar and Sabela?

@jrambla commented Sep 5, 2018

@dtitov we are talking about stable IDs, not UUIDs. Stable IDs have a different life cycle, which includes tracking them through creation, updates (with versions), and eventual deprecation or obsolescence. Thus, we prefer not to "waste" them, if possible.

@sdelatorrep commented Sep 5, 2018

I'm not sure what we are discussing here: async vs. sync communication, or CEGA IDs vs. LEGA IDs (i.e. UUIDs)?
In my opinion, there is no benefit to synchronous communication here. If CEGA is down, what would you do? Retry until it is up again? What if there is a communication error? Retry again? How many times? And what happens if CEGA never answers? Retry? Or, the other way around, what if LEGA requests an ID but something fails when answering back, what should CEGA do? Retry? We already have RabbitMQ in place, so let's use it :) It guarantees messages are not lost, and neither CEGA nor LEGA needs this information immediately.
About using UUIDs instead of CEGA IDs: I guess it's another solution, but probably with lots of consequences that must be considered.

@dtitov commented Sep 5, 2018

@jrambla Thank you for the clarification, I was not aware of the stable ID lifecycle; that, of course, makes things a bit more difficult.

At the moment, after your comment, I see two possible ways:

  1. CentralEGA creates and provides us with a "stable IDs API" (probably a REST one) so that we can perform all the operations you've specified: issue new IDs, version them, deprecate them, and so on.
  2. Or LocalEGA creates yet another microservice to handle that (basically, this PR).

In other words, one of the sides has to become more complicated. And, obviously, as a representative of the LocalEGA team, I would prefer not to add any extra complexity to LocalEGA's architecture if it can be avoided.

@sdelatorrep I see your point and it makes perfect sense to me, but any communication channel has its own benefits and downsides. What if we send you a message, you receive it, and after that, during ID generation, CentralEGA goes down for some reason (e.g. a power outage)? Then we will never receive the message back, because after the restart your side will simply "forget" that it actually needs to respond (no messages queued, no state preserved). In this rare but still possible case, the archived file will never get its ID. With an HTTP request, we can at least be aware that ID retrieval failed, so that we can fall back to some backup mechanism.

Look, I propose another variant not only because of my personal preferences, but more for practical reasons that relate to our local infrastructure limitations. But I guess we'd better discuss it tomorrow in a meeting. Anyway, thanks for the explanations.

@silverdaz (Contributor, Author)

> What if we send you a message, you receive it, and after that, during ID generation, CentralEGA goes down for some reason (e.g. a power outage)? Then we will never receive the message back, because after the restart your side will simply "forget" that it actually needs to respond (no messages queued, no state preserved).

This is what RabbitMQ handles: the message is not forgotten, and it will be (re)processed.
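A minimal consumer sketch of that acknowledgment pattern (pika 1.x style, hypothetical queue name): the message is acked only after successful processing, so if the consumer dies before the ack, RabbitMQ redelivers it.

```python
import pika

def handle(body):
    """Placeholder for the real work, e.g. storing the stable ID in the DB."""

def on_message(channel, method, properties, body):
    handle(body)
    channel.basic_ack(delivery_tag=method.delivery_tag)  # ack only on success

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='stableIDs', durable=True)  # survives broker restarts
channel.basic_consume(queue='stableIDs', on_message_callback=on_message)
channel.start_consuming()
```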

This PR is a solution to issue #260 .
There is always the possibility to make another PR/issue if you, @dtitov, feel that this solution is not suitable.
In the Agile spirit.

@dtitov commented Sep 5, 2018

@silverdaz I have read more about acknowledgments in RabbitMQ, and yes, the case above can be handled.

```yaml
container_name: id-mapper
volumes:
  - ./lega/conf.ini:/etc/ega/conf.ini:ro
  - ~/_ega/lega:/home/lega/.local/lib/python3.6/site-packages/lega
```


This should only be for local development (even more so since this is specific to @silverdaz's env :) ); I'm not particularly fond of having these sorts of paths in the public repo (the same goes for other services).

@silverdaz (Contributor, Author)

Ah... oops... I thought I had removed all of them...

```diff
@@ -25,12 +25,11 @@

 def retry_loop(func):
-    """Retry connection for ``try`` times every ``try_interval`` seconds."""
+    """Decorator retry something ``try`` times every ``try_interval`` seconds."""
```
@blankdots commented Sep 6, 2018

I am following https://www.python.org/dev/peps/pep-0257/: this line should be imperative (the same goes for line 32 below).
See D401: http://www.pydocstyle.org/en/2.1.1/error_codes.html

I know we are not following PEP 8 and PEP 257, and it will take quite an effort to go over the previous code and format it according to the rules; however, I would like the "newer" code (what "new" means is a bit relative here) to follow these guides.
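For instance, a D401-compliant (imperative-mood) wording could be:

```python
def retry_loop(func):
    """Retry the decorated function ``try`` times every ``try_interval`` seconds."""
```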

@blankdots commented Sep 6, 2018

@silverdaz regarding this #341 (comment): you did kind of touch Eureka in the config (`try` and `try_interval`).
You can set them as env variables in the tests using https://docs.python.org/3.6/library/test.html#test.support.EnvironmentVarGuard
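For example (the variable names here are assumptions, not the actual config keys):

```python
import unittest
from test.support import EnvironmentVarGuard  # location in Python 3.6

class RetryConfigTestCase(unittest.TestCase):
    def setUp(self):
        self.env = EnvironmentVarGuard()
        self.env.set('RETRY_TRY', '1')           # hypothetical variable name
        self.env.set('RETRY_TRY_INTERVAL', '0')  # hypothetical variable name

    def test_retry_loop(self):
        with self.env:
            pass  # exercise the code that reads ``try``/``try_interval`` here
```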

setup.py Outdated
```diff
@@ -3,7 +3,7 @@

 setup(name='lega',
       version=__version__,
-      url='http://lega.nbis.se',
+      url='http://LocalEGA.nbis.se',
```


The URL is not valid; why not put http://localega.readthedocs.io/ there, or the GitHub repo URL?

lega/verify.py Outdated
```diff
 LOG.info('Verification completed. Updating database.')
-db.set_status(file_id, db.Status.Completed)
+# Updating the database
+db.mark_completed(file_id)

 # Send to QC
 data.pop('status', None)
 data['key_id'] = key_id
 LOG.debug(f'Sending message to QC: {data}')
 publish(data, channel, 'lega', 'qc') # We keep the org msg in there
```


Isn't the qc queue deleted from defs.json? Maybe an oversight, or does it serve some purpose?

@viklund (Member) left a comment

I looked at the logic of the Python script and I think it is OK.

I do think that the name mapper is a bit confusing. Maybe just stable-id-receiver?

@blankdots blankdots removed this from the Sprint 35 milestone Sep 7, 2018
@blankdots blankdots added this to the Sprint 36 milestone Sep 7, 2018
@silverdaz silverdaz removed the request for review from dtitov September 10, 2018 13:34
@blankdots left a comment

ID Mapper is indeed not the best name.

Code has been reviewed. Changes to the code base imply changes to the deployment specified in these issues:
