[21.09] Fix creating tags within sessionless context #13552

@mvdbeek mvdbeek commented Mar 16, 2022

xref: #13551, fixes test_collection_tools_tag_propagation

```
galaxy.jobs ERROR 2022-03-15 00:21:29,515 [pN:main,p:2180,tN:LocalRunner.work_thread-3] problem importing job outputs. stdout [] stderr [
Traceback (most recent call last):
  File "metadata/set.py", line 1, in <module>
    from galaxy_ext.metadata.set_metadata import set_metadata; set_metadata()
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/metadata/set_metadata.py", line 121, in set_metadata
    set_metadata_portable()
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/metadata/set_metadata.py", line 291, in set_metadata_portable
    collect_dynamic_outputs(job_context, output_collections)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/job_execution/output_collect.py", line 147, in collect_dynamic_outputs
    persist_elements_to_hdca(job_context, elements, hdca, collector=DEFAULT_DATASET_COLLECTOR)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/store/discover.py", line 719, in persist_elements_to_hdca
    filenames,
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/store/discover.py", line 292, in populate_collection_elements
    final_job_state=final_job_state,
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/store/discover.py", line 354, in _populate_elements
    self.add_tags_to_datasets(datasets=element_datasets["datasets"], tag_lists=element_datasets["tag_lists"])
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/store/discover.py", line 576, in add_tags_to_datasets
    self.tag_handler.add_tags_from_list(user, dataset, tags, flush=False)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/tags.py", line 61, in add_tags_from_list
    return self.set_tags_from_list(user, item, new_tags_set, flush=flush)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/tags.py", line 75, in set_tags_from_list
    self.apply_item_tags(user, item, unicodify(new_tags_str, "utf-8"), flush=flush)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/tags.py", line 207, in apply_item_tags
    self.apply_item_tag(user, item, name, value, flush=flush)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/tags.py", line 179, in apply_item_tag
    tag = self._get_or_create_tag(lc_name)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/tags.py", line 281, in _get_or_create_tag
    tag = self._create_tag(scrubbed_tag_str)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/tags.py", line 246, in _create_tag
    tag = self._create_tag_instance(tag_name)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/tags.py", line 428, in _create_tag_instance
    tag = super()._create_tag_instance(tag_name)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/tags.py", line 259, in _create_tag_instance
    Session = sessionmaker(self.sa_session.bind)
AttributeError: 'NoneType' object has no attribute 'bind'
]
Traceback (most recent call last):
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/jobs/__init__.py", line 1828, in finish
    import_model_store.perform_import(history=job.history, job=job)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/store/__init__.py", line 209, in perform_import
    datasets_attrs = self.datasets_properties()
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/store/__init__.py", line 930, in datasets_properties
    datasets_attrs = load(open(datasets_attrs_file_name))
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp19s6x278/tmpt0rr8vmw/tmp2mjq6ysx/database/job_working_directory1/000/7/metadata/outputs_populated/datasets_attrs.txt'
galaxy.jobs.runners ERROR 2022-03-15 00:21:29,516 [pN:main,p:2180,tN:LocalRunner.work_thread-3] (7/) Job wrapper finish method failed
Traceback (most recent call last):
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/jobs/runners/__init__.py", line 594, in _finish_or_resubmit_job
    job_stderr=job_stderr,
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/jobs/__init__.py", line 1828, in finish
    import_model_store.perform_import(history=job.history, job=job)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/store/__init__.py", line 209, in perform_import
    datasets_attrs = self.datasets_properties()
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/model/store/__init__.py", line 930, in datasets_properties
    datasets_attrs = load(open(datasets_attrs_file_name))
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp19s6x278/tmpt0rr8vmw/tmp2mjq6ysx/database/job_working_directory1/000/7/metadata/outputs_populated/datasets_attrs.txt'
```
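The last frame of the first traceback dereferences `.bind` on a session that is `None`. The sketch below reproduces that failure mode and shows one way a sessionless handler could avoid it. It is only an illustration under stated assumptions: `TagHandler`, `SessionlessTagHandler`, and `try_create` are hypothetical stand-ins, not Galaxy's classes, and this is not the change merged in this pull request.

```python
class TagHandler:
    """Hypothetical stand-in for a session-backed tag handler."""

    def __init__(self, sa_session):
        self.sa_session = sa_session

    def _create_tag_instance(self, tag_name):
        # Mirrors the failing frame: `.bind` is read from the session,
        # which raises AttributeError when the session is None.
        bind = self.sa_session.bind
        return {"name": tag_name, "bind": bind}


class SessionlessTagHandler(TagHandler):
    """Hypothetical variant that never touches the session."""

    def __init__(self):
        super().__init__(sa_session=None)
        self.created_tags = {}

    def _create_tag_instance(self, tag_name):
        # Assumption: without a session, the tag only needs to live in memory.
        tag = {"name": tag_name, "bind": None}
        self.created_tags[tag_name] = tag
        return tag


def try_create(handler, tag_name):
    """Return the created tag, or the error message if creation fails."""
    try:
        return handler._create_tag_instance(tag_name)
    except AttributeError as exc:
        return str(exc)


print(try_create(TagHandler(None), "group:a"))
print(try_create(SessionlessTagHandler(), "group:a"))
```

The first call reports the same kind of `AttributeError` as the traceback; the second succeeds because the override never delegates to the session-dependent base method.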
@mvdbeek mvdbeek added kind/bug area/database Galaxy's database or data access layer labels Mar 16, 2022
@github-actions github-actions bot added this to the 22.01 milestone Mar 16, 2022
@nsoranzo

While you are here, I noticed that `GalaxySessionlessTagHandler.get_tag_by_name()` is missing the `return` before `self.created_tags.get(tag_name)`.

Also, it's quite confusing that these classes have both `_get_tag()` and `get_tag_by_name()` methods, which are almost identical.
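For readers unfamiliar with the pattern, here is a minimal sketch of what a missing `return` does in Python. `BuggyLookup` and `FixedLookup` are hypothetical classes written for illustration; only the `created_tags` attribute and the `get_tag_by_name()` method name come from the comment above.

```python
class BuggyLookup:
    def __init__(self):
        # Hypothetical in-memory cache of tags created so far.
        self.created_tags = {"group:a": "tag-object"}

    def get_tag_by_name(self, tag_name):
        # The lookup result is computed and then discarded,
        # so the method implicitly returns None.
        self.created_tags.get(tag_name)


class FixedLookup(BuggyLookup):
    def get_tag_by_name(self, tag_name):
        # With the return in place, the cached tag is handed back.
        return self.created_tags.get(tag_name)


print(BuggyLookup().get_tag_by_name("group:a"))
print(FixedLookup().get_tag_by_name("group:a"))
```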

@mvdbeek

mvdbeek commented Mar 16, 2022

> While you are here, I noticed that `GalaxySessionlessTagHandler.get_tag_by_name()` is missing the `return` before `self.created_tags.get(tag_name)`.

Yeah, that looks odd, but I don't think it has any effect. I don't want to touch this right now, but I can follow up on dev. The only place where this is used (in a sessionless context) is https://github.com/mvdbeek/galaxy/blob/86f696cc0b7f58651a59670b9909ce011fe7fb0c/lib/galaxy/model/tags.py#L273, so this should be OK.

> Also, it's quite confusing that these classes have both `_get_tag()` and `get_tag_by_name()` methods, which are almost identical.

I agree, those could be merged, but I would do this on dev as well.

@nsoranzo

> > While you are here, I noticed that `GalaxySessionlessTagHandler.get_tag_by_name()` is missing the `return` before `self.created_tags.get(tag_name)`.
>
> Yeah, that looks odd, but I don't think it has any effect. I don't want to touch this right now, but I can follow up on dev. The only place where this is used (in a sessionless context) is https://github.com/mvdbeek/galaxy/blob/86f696cc0b7f58651a59670b9909ce011fe7fb0c/lib/galaxy/model/tags.py#L273, so this should be OK.

Both `GalaxySessionlessTagHandler` methods were introduced in commit f3e48cf, I think for this sessionless code path: `add_tags_from_list()` -> `set_tags_from_list()` -> `apply_item_tags()` -> `apply_item_tag()` -> `_get_or_create_tag()` -> `get_tag_by_name()`, which seems to be broken because `get_tag_by_name()` always returns `None`.

> > Also, it's quite confusing that these classes have both `_get_tag()` and `get_tag_by_name()` methods, which are almost identical.
>
> I agree, those could be merged, but I would do this on dev as well.

Agreed, that's a refactoring that can target dev.

@mvdbeek

mvdbeek commented Mar 16, 2022

> Both `GalaxySessionlessTagHandler` methods were introduced in commit f3e48cf, I think for this sessionless code path: `add_tags_from_list()` -> `set_tags_from_list()` -> `apply_item_tags()` -> `apply_item_tag()` -> `_get_or_create_tag()` -> `get_tag_by_name()`, which seems to be broken because `get_tag_by_name()` always returns `None`.

But that's not broken when we're sessionless: we'll create a new tag, and all we serialize back is the tag string. This is covered by tests, and I don't want to change this on a stable release.
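The argument above can be sketched as follows: when the lookup always returns `None`, the get-or-create step simply builds a fresh tag object each time, and if only tag strings are serialized back, the result is unchanged. Everything here is a hypothetical simplification (`Tag`, `SessionlessTags`, `instances_created`, and `serialize` are made-up names, and serializing as a sorted list of unique names is an assumption), not Galaxy's actual implementation.

```python
class Tag:
    """Hypothetical minimal tag object."""

    def __init__(self, name):
        self.name = name


class SessionlessTags:
    """Hypothetical get-or-create flow with the missing return discussed above."""

    def __init__(self):
        self.created_tags = {}
        self.instances_created = 0

    def get_tag_by_name(self, tag_name):
        # Missing return: this always yields None.
        self.created_tags.get(tag_name)

    def _get_or_create_tag(self, tag_name):
        tag = self.get_tag_by_name(tag_name)
        if tag is None:
            # Because the lookup never succeeds, a new object is built every time.
            tag = Tag(tag_name)
            self.created_tags[tag_name] = tag
            self.instances_created += 1
        return tag


def serialize(tags):
    # Assumption: only the tag strings are written back, deduplicated by name.
    return sorted({tag.name for tag in tags})


handler = SessionlessTags()
applied = [handler._get_or_create_tag(name) for name in ["group:a", "group:a", "name:x"]]
print(handler.instances_created)
print(serialize(applied))
```

In this sketch the redundant object for the repeated `group:a` costs an extra allocation but does not change the serialized output, which is consistent with the claim that the missing `return` has no visible effect in a sessionless context.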

@mvdbeek

mvdbeek commented Mar 18, 2022

Alright, I ran the API tests with the additional fix and `GALAXY_CONFIG_OVERRIDE_METADATA_STRATEGY=extended`, and that seems to have worked.

Seems like the only way you could previously include tags
was if you included them in the data fetch tool.
@mvdbeek mvdbeek force-pushed the fix_tag_creation_metadata_strategy_extended branch from bc22aad to 7aae40d Compare March 18, 2022 11:16
@mvdbeek mvdbeek merged commit 74a51d9 into galaxyproject:release_21.09 Mar 18, 2022
nsoranzo added a commit to nsoranzo/galaxy that referenced this pull request Mar 18, 2022
that I broke when merging
galaxyproject#13552 forward.

Fix:

```
app = <galaxy.model.unittest_utils.data_app.GalaxyDataTestApp object at 0x7f0e00dd3640>, target = '/tmp/tmpyj5d8268', work_directory = '/tmp/tmp_a7e191v'

    def _import_directory_to_history(app, target, work_directory):
        sa_session = app.model.context

        u = model.User(email="collection@example.com", password="password")
        import_history = model.History(name="Test History for Import", user=u)

        sa_session = app.model.context
        sa_session.add_all([u, import_history])
        sa_session.flush()

        assert len(import_history.datasets) == 0

        import_options = store.ImportOptions(allow_dataset_object_edit=True)
>       import_model_store = store.get_import_model_store_for_directory(target, app=app, user=u, import_options=import_options, tag_handler=app.tag_handler.create_tag_handler_session())
E       AttributeError: 'GalaxyDataTestApp' object has no attribute 'tag_handler'

test/unit/data/model/test_model_discovery.py:235: AttributeError
```