@blankdots: When you are back from holidays and read this, could you please give me a hand with the tests? (My local tests worked fine.) The tests fail on eureka, but I have touched neither the eureka code nor its tests.
e990a63 to 2ebc4b1
@silverdaz Could you, please, elaborate why not request the stable ID in the In my mind, the logic of
Will it work like that or am I missing something?
Ah, yes, sure, I had not thought about putting that code inside But this is asynchronous: We do not "query" CentralEGA and wait for the response synchronously. |
Are we making our system more complicated just to make CEGA's life easier? :) To be honest, I would even include stable ID querying to the And also, I don't think that generating an ID is such a heavy, CPU-consuming operation, or that it's hard to implement: it's basically a call to So, in conclusion, I would definitely not overload LocalEGA's setup with yet another microservice just because CentralEGA cannot or does not want to generate UUIDs for us upon request: neither they nor we should be concerned about running out of IDs or about the cost of a synchronous approach. Can you discuss this with Oscar and Sabela?
@dtitov we are talking about Stable IDs, not UUIDs. Stable IDs have a different life cycle that includes tracking them through creation, updates (with versions) and eventual deprecation or obsolescence. Thus, we prefer not to "waste" them, if possible.
I'm not sure about what we are discussing here: async vs sync communication and CEGA IDs vs LEGA IDs (=UUIDs)?
@jrambla Thank you for the clarification, I was not aware of StableIDs lifecycle - that, of course, makes things a bit more difficult. At the moment, after your comment, I see two possible ways:
In other words, one of the sides has to become more complicated. And, obviously, as a representative of the LocalEGA team, I would prefer not to add any extra complexity to LocalEGA's architecture if it can be avoided. @sdelatorrep I see your point and it makes perfect sense to me, but any communication channel has its own benefits and downsides. What if we send you a message, you receive it and then, during ID generation, CentralEGA goes down for some reason (e.g. a power outage)? Then we will never receive the message back because, after the restart, your side will simply "forget" that it actually needs to respond (no messages are queued, so no state is preserved). In this rare but still possible case, the archived file will never get its ID. With an HTTP request we would at least know that ID retrieval failed, so that we could fall back to some backup mechanism. Look, I am proposing another variant not only because of my personal preferences but mostly for practical reasons that relate to our local infrastructure limitations. But I guess we had better discuss it in a meeting tomorrow. Anyway, thanks for the explanations.
This is what RabbitMQ handles. The message is not forgotten, and it will be (re)processed. This PR is a solution to issue #260.
@silverdaz I have read more about acknowledgments in RabbitMQ and yes, the case above can be handled.
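For readers unfamiliar with the acknowledgment behaviour discussed above, here is a minimal, broker-free sketch in plain Python. It is a toy model, not RabbitMQ or client-library code: `ToyQueue`, `deliver`, `ack`, `consumer_crashed` and the message content are hypothetical names chosen for illustration. It shows the idea that a message which was delivered but never acknowledged goes back to the queue when the consumer dies, so the request is not forgotten.

```python
from collections import deque


class ToyQueue:
    """Toy in-memory model of a queue with manual acknowledgments."""

    def __init__(self):
        self._ready = deque()   # messages waiting to be delivered
        self._unacked = {}      # delivery tag -> message delivered but not yet acked
        self._next_tag = 1

    def publish(self, message):
        self._ready.append(message)

    def deliver(self):
        """Hand the next message to a consumer; return (tag, message) or None."""
        if not self._ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        message = self._ready.popleft()
        self._unacked[tag] = message
        return tag, message

    def ack(self, tag):
        """Confirm processing; only now is the message dropped for good."""
        del self._unacked[tag]

    def consumer_crashed(self):
        """Requeue every message that was delivered but never acknowledged."""
        for tag in sorted(self._unacked, reverse=True):
            self._ready.appendleft(self._unacked.pop(tag))

    def pending(self):
        return len(self._ready) + len(self._unacked)


queue = ToyQueue()
queue.publish({"file_id": 42})      # hypothetical request for a stable ID

tag, msg = queue.deliver()          # the consumer receives the message...
queue.consumer_crashed()            # ...and goes down before acknowledging it

tag, redelivered = queue.deliver()  # after the restart the same message comes back
queue.ack(tag)                      # processed successfully this time
```

In RabbitMQ itself, surviving a restart of the broker (rather than of the consumer) additionally depends on the queue being durable and the messages being published as persistent.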
docker/bootstrap/lega.sh
Outdated
container_name: id-mapper
volumes:
- ./lega/conf.ini:/etc/ega/conf.ini:ro
- ~/_ega/lega:/home/lega/.local/lib/python3.6/site-packages/lega |
This should only be for local development (all the more so since it is specific to @silverdaz's env :) ). I'm not particularly fond of having this sort of path in the public repo (the same goes for the other services).
Ah... oops... I thought I had removed all of them...
@@ -25,12 +25,11 @@

def retry_loop(func):
    """Retry connection for ``try`` times every ``try_interval`` seconds."""
    """Decorator retry something ``try`` times every ``try_interval`` seconds."""
I am using https://www.python.org/dev/peps/pep-0257/ - this line should be imperative (the same goes for line 32 below).
see D401 http://www.pydocstyle.org/en/2.1.1/error_codes.html
I know we are not following pep8 and pep257, and it would take quite an effort to go over the previous code and format it according to the rules; however, I would like the "newer" code (what "new" means is a bit relative here) to follow these guides.
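To illustrate the D401 point, here is a sketch of a retry decorator whose docstrings start with an imperative verb. It is not the actual `retry_loop` from lega: the factory name `retry`, the parameters `tries` and `interval`, and the `flaky` function are assumptions made for this example only.

```python
import functools
import time


def retry(tries=3, interval=0.0):
    """Return a decorator that retries a function when it raises."""
    if tries < 1:
        raise ValueError("tries must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Call func up to `tries` times, sleeping `interval` seconds in between.
            last_error = None
            for attempt in range(tries):
                try:
                    return func(*args, **kwargs)
                except Exception as error:  # deliberately broad in this sketch
                    last_error = error
                    if attempt < tries - 1:
                        time.sleep(interval)
            raise last_error
        return wrapper
    return decorator


attempts = []


@retry(tries=3, interval=0.0)
def flaky():
    """Fail twice, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("not yet")
    return "connected"


result = flaky()
```

Both docstrings read as commands ("Return ...", "Fail ..."), which is the form D401 asks for, as opposed to a descriptive opening such as "Decorator that ...".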
@silverdaz regarding this #341 (comment), you kinda touched eureka in the config ( |
setup.py
Outdated
@@ -3,7 +3,7 @@

setup(name='lega',
      version=__version__,
      url='http://lega.nbis.se',
      url='http://LocalEGA.nbis.se',
The URL is not valid. Why not put http://localega.readthedocs.io/ there, or the GitHub repo URL?
lega/verify.py
Outdated
LOG.info('Verification completed. Updating database.')
db.set_status(file_id, db.Status.Completed)
# Updating the database
db.mark_completed(file_id)

# Send to QC
data.pop('status', None)
data['key_id'] = key_id
LOG.debug(f'Sending message to QC: {data}')
publish(data, channel, 'lega', 'qc') # We keep the org msg in there |
Isn't the qc queue deleted from defs.json? Maybe an oversight, or does it serve some purpose?
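One quick way to check this kind of concern is to list the queues declared in the broker definitions file. The sketch below assumes the usual RabbitMQ definitions layout with a top-level `queues` array; the sample content and the queue names in it are hypothetical, since the real defs.json is not shown in this thread.

```python
import json

# Hypothetical excerpt; the actual defs.json of the project is not reproduced here.
SAMPLE_DEFS = """
{
  "queues": [
    {"name": "files", "vhost": "/", "durable": true},
    {"name": "archived", "vhost": "/", "durable": true}
  ]
}
"""


def declared_queues(definitions_text):
    """Return the set of queue names declared in a definitions document."""
    definitions = json.loads(definitions_text)
    return {queue["name"] for queue in definitions.get("queues", [])}


names = declared_queues(SAMPLE_DEFS)
qc_declared = "qc" in names  # False for this sample: publishing to 'qc' would have no target queue
```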
I looked at the logic of the python script and I think it is ok.
I do think that the name mapper is a bit confusing. Maybe just stable-id-receiver?
ID Mapper is indeed not the best name.
Code has been reviewed. Changes to the code base imply changes to the deployment specified in these issues:
According to Oscar, stable ID generation for CentralEGA is handled by a process at EBI, and it depends on the content of the original file.
We use the generated stable ID after a successful ingestion, which allows us:
completed, but not yet ready for data-out. Once the stable ID comes in, we flag the file as ready for download.
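The two-step status flow described above can be sketched as a small state machine. This is an illustration under stated assumptions, not the project's database code: `Status`, `FileRecord`, `set_stable_id` and `downloadable` are hypothetical names (only `mark_completed` echoes a call visible in the verify.py diff), and the stable ID value is a placeholder.

```python
from enum import Enum


class Status(Enum):
    COMPLETED = "completed"  # ingestion and verification done, no stable ID yet
    READY = "ready"          # stable ID received, file may be served by data-out


class FileRecord:
    """Hypothetical in-memory stand-in for a row in the files table."""

    def __init__(self, file_id):
        self.file_id = file_id
        self.status = None
        self.stable_id = None

    def mark_completed(self):
        self.status = Status.COMPLETED

    def set_stable_id(self, stable_id):
        # A stable ID is only accepted for a file whose ingestion has completed.
        if self.status is not Status.COMPLETED:
            raise ValueError("stable ID received before ingestion completed")
        self.stable_id = stable_id
        self.status = Status.READY

    def downloadable(self):
        return self.status is Status.READY


record = FileRecord(file_id=1)
record.mark_completed()
before = record.downloadable()            # completed, but not yet ready

record.set_stable_id("STABLE-ID-PLACEHOLDER")
after = record.downloadable()             # ready once the stable ID has arrived
```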