Evidence index creation threading issue #350

Closed
afaulconbridge opened this issue Dec 5, 2018 · 2 comments

@afaulconbridge

commented Dec 5, 2018

Currently, if the evidence index does not exist before the index step is run, each worker will try to create it the first time it saves to the index. This can cause collisions, because every worker tries to create the index at the same time and there is no inter-process communication. For example:

2018-12-05 14:28:05,908 - mrtarget.common.EvidencesHelpers_255 - ERROR - creation of index master_validated-data was not acknowledged. ERROR:{u'index_uuid': u'uR-LG7JxRDe6r1uSsULFmg', u'index': u'master_validated-data', u'root_cause': [{u'index_uuid': u'uR-LG7JxRDe6r1uSsULFmg', u'index': u'master_validated-data', u'reason': u'index [master_validated-data/uR-LG7JxRDe6r1uSsULFmg] already exists', u'type': u'index_already_exists_exception'}], u'type': u'index_already_exists_exception', u'reason': u'index [master_validated-data/uR-LG7JxRDe6r1uSsULFmg] already exists'}
Traceback (most recent call last):
  File "mrtarget/modules/Evidences.py", line 306, in write_evidences
    process_context.put(x)
  File "mrtarget/common/EvidencesHelpers.py", line 116, in put
    auto_optimise=True)
  File "mrtarget/common/ElasticsearchLoader.py", line 108, in put
    self.create_new_index(index_name)
  File "mrtarget/common/ElasticsearchLoader.py", line 297, in create_new_index
    self._safe_create_index(index_name, mapping)
  File "mrtarget/common/ElasticsearchLoader.py", line 248, in _safe_create_index
    raise ValueError('creation of index %s was not acknowledged. ERROR:%s'%(index_name,str(res['error'])))
ValueError: creation of index master_validated-data was not acknowledged. ERROR:{u'index_uuid': u'uR-LG7JxRDe6r1uSsULFmg', u'index': u'master_validated-data', u'root_cause': [{u'index_uuid': u'uR-LG7JxRDe6r1uSsULFmg', u'index': u'master_validated-data', u'reason': u'index [master_validated-data/uR-LG7JxRDe6r1uSsULFmg] already exists', u'type': u'index_already_exists_exception'}], u'type': u'index_already_exists_exception', u'reason': u'index [master_validated-data/uR-LG7JxRDe6r1uSsULFmg] already exists'}

To solve this, we need to ensure the relevant indexes are created before the worker processes are started, in the __init__ method of ProcessContextESWriter in EvidencesHelpers.py.
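
A minimal sketch of that idea follows. It is not the pipeline's code. `FakeIndexStore`, `IndexAlreadyExists`, `ensure_index` and `run_workers` are hypothetical names, the store is an in-memory stand-in for Elasticsearch, and threads are used in place of worker processes to keep the example self-contained. The parent creates the index once before any worker starts, and an "already exists" response is treated as success rather than as an error.

```python
import threading


class IndexAlreadyExists(Exception):
    """Hypothetical stand-in for Elasticsearch's index_already_exists_exception."""


class FakeIndexStore:
    """Hypothetical in-memory stand-in for the Elasticsearch cluster."""

    def __init__(self):
        self._indexes = set()
        self._lock = threading.Lock()

    def create(self, name):
        # Like the real server, refuse to create an index twice.
        with self._lock:
            if name in self._indexes:
                raise IndexAlreadyExists(name)
            self._indexes.add(name)

    def exists(self, name):
        with self._lock:
            return name in self._indexes


def ensure_index(store, name):
    """Create the index; return False if it was already there."""
    try:
        store.create(name)
        return True
    except IndexAlreadyExists:
        # Someone else created it first, which is fine for our purposes.
        return False


def run_workers(store, index_name, n_workers=4):
    # Create the index once, in the parent, before any worker is started.
    ensure_index(store, index_name)

    errors = []

    def worker():
        # Workers only use the index; they never try to create it.
        if not store.exists(index_name):
            errors.append("index %s is missing" % index_name)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

With this ordering the workers never race to create the index. Making `ensure_index` tolerant of an existing index also keeps a rerun against an already populated cluster from failing.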

@afaulconbridge


commented Jan 8, 2019

Work in progress in data_pipeline branch af-350-evidence-index-threading

@afaulconbridge


commented Jan 8, 2019

Should be solved by PR opentargets/data_pipeline#420
