Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge ShedToolLineage and StockLineage to ToolLineage #4119

Merged
merged 14 commits into from Jul 5, 2017

Conversation

Projects
None yet
4 participants
@mvdbeek
Copy link
Member

commented May 29, 2017

This means we don't rely on the Tool Shed metadata anymore and instead
deduce the tool lineage via the tool_id, which contains the version.
Comparing versions is done by comparing LooseVersions.
This works for toolshed and non-toolshed tools, restoring proper
grouping in the tool panel and version switches, even for tool shed
tools whose lineage is broken.

Should fix #4097 and would make fixing #1532 unnecessary (IMHO).

@mvdbeek mvdbeek added this to the 17.09 milestone May 29, 2017

@mvdbeek mvdbeek force-pushed the mvdbeek:version_switch_fix branch 3 times, most recently from f376bd7 to c2ca262 May 29, 2017

@mvdbeek mvdbeek changed the title [WIP] Replace ShedToolLineage with enhanced StockLineage Replace ShedToolLineage with enhanced StockLineage May 30, 2017

@mvdbeek mvdbeek added status/review and removed status/WIP labels May 30, 2017

@jmchilton

This comment has been minimized.

Copy link
Member

commented May 30, 2017

I couldn't love the ambition of this pull request more - this is fantastic. We should probably stipulate that there potentially degenerate cases where this provides a different lineage than what the tool shed would have provided before - which only effectively matters if a different tool really appears at the top of the lineage. I don't know if any actual repository exhibits such a lineage - but the fix in that case is pretty easy I think - just push out a new version of that tool with a new corrected absolute greater version and upgrade to that.

We can potentially delete even more client code that sets up those lineages and reads them from the server now right? That doesn't have to be done as part of this PR - but we should create an issue stipulate that we can do that I think.

@mvdbeek

This comment has been minimized.

Copy link
Member Author

commented May 30, 2017

We should probably stipulate that there potentially degenerate cases where this provides a different lineage than what the tool shed would have provided before

Right, that could potentially happen. We could include this, and the solution to that problem in the release notes, right? And/Or we could go through the toolshed and see if that would happen -- I think the API exposes everything we need to do this remotely.

A corner-case may be versions like 1.0beta-1 appearing after 1.0 in the lineage, but we could solve this by switching to a version parser that implements pep440, like https://packaging.pypa.io/en/latest/version/#packaging.version.parse instead of LooseVersion.

We can potentially delete even more client code that sets up those lineages and reads them from the server now right? That doesn't have to be done as part of this PR - but we should create an issue stipulate that we can do that I think.

We can definitely remove the lineage mako in the admin settings and remove the ToolVersionManager on the galaxy side. For now I think we may need to keep the ToolVersion and ToolVersionAssociation on the Tool Shed side for older Galaxy versions.

@jmchilton

This comment has been minimized.

Copy link
Member

commented Jun 5, 2017

@davebx do you know if there are any tool shed tests for tool migrations or put another way if this were to interfere with tool migrations do you think the tests would catch it?

@nsoranzo

This comment has been minimized.

Copy link
Member

commented Jun 20, 2017

You should also remove tool_version_cache from test/unit/tools_support.py .

@@ -436,7 +436,7 @@ def sa_session( self ):
@property
def tool_version( self ):
"""Return a ToolVersion if one exists for our id"""

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 20, 2017

Member

I think now it returns a StockLineage object, so s/ToolVersion/StockLineage/ , right?

mvdbeek added some commits May 26, 2017

Replace ShedToolLineage with enhanced StockLineage
This means we don't rely on the Tool Shed metadata anymore and instead
deduce the tool lineage via the tool_id, which contains the version.
Version comparison then happens via LooseVersions.
This works for toolshed and non-toolshed tools, restoring proper
grouping in the tool panel and version switches, even for tool shed
tools whose lineage is broken.

@mvdbeek mvdbeek force-pushed the mvdbeek:version_switch_fix branch from 142ccb0 to 6864187 Jun 20, 2017

@nsoranzo
Copy link
Member

left a comment

Amazing refactoring @mvdbeek! Hopefully this will solve the duplications of TS tools in the toolbox.

if reverse:
version_ids.reverse()
version_ids = reversed(version_ids)

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 21, 2017

Member

I think that's not correct, reversed returns an iterator, while I think the expected return type of this method is a list.

@@ -0,0 +1,19 @@
def remove_version_from_guid(guid):

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 21, 2017

Member

I'd remove this empty line.

if from_toolshed:
lineage = ToolShedLineage.from_tool(self.app, tool)
else:
versionless_tool_id = remove_version_from_guid( tool_id)

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 21, 2017

Member

I think you meant to also remove the space after the (

lineage = StockLineage.from_tool( tool )
if versionless_tool_id and versionless_tool_id not in self.lineage_map:
self.lineage_map[versionless_tool_id] = lineage
if tool_id not in self.lineage_map:
self.lineage_map[tool_id] = lineage
lineage.register_version(tool.version)

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 21, 2017

Member

I don't think that's needed: register_version(tool.version) is already called inside StockLineage.from_tool( tool ), which is the only method used to create the StockLineage objects added to the self.lineage_map dictionary.

This comment has been minimized.

Copy link
@mvdbeek

mvdbeek Jun 21, 2017

Author Member

At first glance I agree with you, but removing this line breaks the lineages ... I'll have to dig into why that happens

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 22, 2017

Member

Maybe a threading issue? See my comment below.

This comment has been minimized.

Copy link
@mvdbeek

mvdbeek Jun 22, 2017

Author Member

OK, I think this is what happens:
When a tool is loaded the first time, it will enter self.lineage_map, and have its version registered.
When a related tool is added, it will be fetched from self.lineage_map via the versionless_tool_id, and not run through ToolLineage.from_tool. So we need to add the current tool's version via lineage.register_version(tool.version). I'll add a comment to clarify this.

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 22, 2017

Member

I see! Then you can move this up a bit changing lines 18-19 above to:

        if lineage:
            lineage.register_version(tool.version)
        else:
            lineage = ToolLineage.from_tool(tool)
return guid[:last_slash]


def get_version_from_guid(guid):

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 21, 2017

Member

This method is unused.

tool_versions.append( self )
__descendants( app, self )
return tool_versions

def get_version_ids( self, app, reverse=False ):

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 21, 2017

Member

This method is not used any more.

This comment has been minimized.

Copy link
@mvdbeek

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 21, 2017

Member

Sorry I missed that! Up to you if you want to replace it or leave it as it is.

if tool:
tool_id = tool.id
lineage = trans.app.toolbox._lineage_map.get(tool_id)
if lineage:

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 21, 2017

Member

If lineage is None, the method returns an empty string. I don't think that should ever happen, so it may be worth to remove this if and let an Exception be raised on the next line if something is really broken.

mvdbeek added some commits Jun 21, 2017

Remove duplication between ToolLineage and TVM
Getting tool versions and tool ids is now done through the tool lineage.
Also assume that the lineage can always be found for a given loaded tool in
the ToolVersionListGrid class. Thanks @nsoranzo for these suggestions!
@nsoranzo

This comment has been minimized.

Copy link
Member

commented Jun 21, 2017

Should lib/galaxy/tools/toolbox/lineages/stock.py be merged in lib/galaxy/tools/toolbox/lineages/interface.py leaving only the ToolLineage class?

tool_ids_str += '%s<br/>' % tool_id
tool = toolbox._tools_by_id.get(tool_version.tool_id)
if tool:
tool_id = tool.id

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 21, 2017

Member

This line can be removed.

@mvdbeek mvdbeek changed the title Replace ShedToolLineage with enhanced StockLineage Merge ShedToolLineage and StockLineage to ToolLineage Jun 22, 2017

@nsoranzo

This comment has been minimized.

Copy link
Member

commented Jun 22, 2017

I think we can also remove line 418-419 in lib/galaxy/tools/__init__.py

@nsoranzo
Copy link
Member

left a comment

Thanks @mvdbeek for the changes!

lineage = StockLineage.from_tool( tool )
if versionless_tool_id and versionless_tool_id not in self.lineage_map:
self.lineage_map[versionless_tool_id] = lineage
if tool_id not in self.lineage_map:
self.lineage_map[tool_id] = lineage
lineage.register_version(tool.version)

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 22, 2017

Member

Maybe a threading issue? See my comment below.

if tool_id not in lineages_by_id:
lineages_by_id[ tool_id ] = ToolLineage( tool_id )
lineage = lineages_by_id[ tool_id ]
lineage.register_version( tool.version )

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 22, 2017

Member

I'm not sure that modifying lineage here by calling its register_version() method is thread-safe. What about rewriting lines 67-71 as:

        with ToolLineage.lock:
            lineage = lineages_by_id.get(tool_id, ToolLineage(tool_id))
            lineage.register_version( tool.version )
            lineages_by_id[ tool_id ] = lineage

?

This comment has been minimized.

Copy link
@mvdbeek

mvdbeek Jun 22, 2017

Author Member

Hmm, that didn't seem to be enough ...

@@ -441,8 +441,8 @@ def sa_session( self ):

@property
def tool_version( self ):

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 22, 2017

Member

Crazy idea: would it make sense to rename this property to lineage and remove:

tool.lineage = tool_lineage

from AbstractToolBox._load_tool_tag_set() method in lib/galaxy/tools/toolbox/base.py ?

I think this property is only used and would need to be renamed:

  • in the next method tool_versions
  • in this file at line1835
  • in lib/galaxy/webapps/galaxy/controllers/admin_toolshed.py at line 2064.

This comment has been minimized.

Copy link
@mvdbeek

mvdbeek Jun 22, 2017

Author Member

absolutely, I think that makes it clearer.

@nsoranzo
Copy link
Member

left a comment

Not sure if you also want to implement my "crazy idea" before merge, but 👍 from me anyway.

@mvdbeek

This comment has been minimized.

Copy link
Member Author

commented Jun 22, 2017

I have been playing with this for a while now, the problem is that if we do the renaming operation we need to have a working app.toolbox._lineage_map object in the unit tests ... and that seems to be tricky.

@mvdbeek mvdbeek force-pushed the mvdbeek:version_switch_fix branch from 639ff55 to bfa24ce Jun 23, 2017

Move tool.tool_version to tool.lineage
This change makes it obvious that a Lineage object will be returned.

Also removes two unused methods from ToolLineageVersion.

@mvdbeek mvdbeek force-pushed the mvdbeek:version_switch_fix branch from bfa24ce to 41d739b Jun 23, 2017

@mvdbeek

This comment has been minimized.

Copy link
Member Author

commented Jun 23, 2017

@nsoranzo I have tried to go with the spirit of your 'crazy idea'. If 41d739b is fine for you this is ready to be merged.

tool = trans.app.toolbox.load_tool( os.path.abspath( tool_config ), guid=tool_metadata[ 'guid' ] )
if tool:
tool._lineage = trans.app.toolbox._lineage_map.register( tool )
tool_lineage = tool.lineage.get_version_ids(reverse=True)

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 23, 2017

Member

This line should be indented as the line above, i.e. executed only if tool exists.

This comment has been minimized.

Copy link
@mvdbeek

mvdbeek Jun 24, 2017

Author Member

Then we wouldn't get the lineage if the tool was already in the toolbox.
I have added another if tool: statement in 1e11c46

@@ -1,4 +1,5 @@
import json
import time
import tarfile

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 24, 2017

Member

Travis complains about the import order here (assuming you added the changes to this file on purpose).

This comment has been minimized.

Copy link
@mvdbeek

mvdbeek Jun 24, 2017

Author Member

nope ... :(, that came from #4222

@mvdbeek mvdbeek force-pushed the mvdbeek:version_switch_fix branch from 1e11c46 to 84e0e7e Jun 24, 2017

@nsoranzo
Copy link
Member

left a comment

@jmchilton Ready for merge?

@jmchilton jmchilton merged commit a42abc0 into galaxyproject:dev Jul 5, 2017

5 checks passed

api test Build finished. 279 tests run, 0 skipped, 0 failed.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
framework test Build finished. 150 tests run, 0 skipped, 0 failed.
Details
integration test Build finished. 34 tests run, 0 skipped, 0 failed.
Details
toolshed test Build finished. 579 tests run, 0 skipped, 0 failed.
Details
@jmchilton

This comment has been minimized.

Copy link
Member

commented Jul 5, 2017

Great - thanks a bunch for the excellent implementation @mvdbeek and the detailed review @nsoranzo!

@guerler

This comment has been minimized.

Copy link
Contributor

commented Jul 6, 2017

I like this PR a lot but I noticed an issue when loading a previously imported workflow and I think it might be related.

`File '/Users/guerler/galaxy/lib/galaxy/web/framework/decorators.py', line 54 in call_and_format
  rval = func( self, trans, *args, **kwargs )
File '/Users/guerler/galaxy/lib/galaxy/webapps/galaxy/controllers/workflow.py', line 639 in load_workflow
  return workflow_contents_manager.workflow_to_dict( trans, stored, style="editor" )
File '/Users/guerler/galaxy/lib/galaxy/managers/workflows.py', line 324 in workflow_to_dict
  return self._workflow_to_dict_editor( trans, stored )
File '/Users/guerler/galaxy/lib/galaxy/managers/workflows.py', line 448 in _workflow_to_dict_editor
  config_form = module.get_config_form()
File '/Users/guerler/galaxy/lib/galaxy/workflow/modules.py', line 708 in get_config_form
  return self.tool.to_json( self.trans, incoming, workflow_building_mode=True )
File '/Users/guerler/galaxy/lib/galaxy/tools/__init__.py', line 1834 in to_json
  tool_versions = self.lineage.tool_versions
AttributeError: 'NoneType' object has no attribute 'tool_versions'`

I can debug this more and provide additional information if needed.

@mvdbeek

This comment has been minimized.

Copy link
Member Author

commented Jul 6, 2017

Hmm, that is interesting. Once a tool is loaded into the tool panel it should get a lineage. Any chance this workflow is missing a tool ?

@guerler

This comment has been minimized.

Copy link
Contributor

commented Jul 6, 2017

Yes it is missing tools.

@mvdbeek

This comment has been minimized.

Copy link
Member Author

commented Jul 6, 2017

OK, I think we can just be a bit more defensive in determining the tool_versions. I'll try to reproduce this.

@mvdbeek

This comment has been minimized.

Copy link
Member Author

commented Jul 6, 2017

@guerler can you try mvdbeek@ad5431c and see if that is all that is needed? Also, any idea how I can reproduce this? If I just uninstall a tool the workflow export fails with

File '/Users/mvandenb/src/galaxy/lib/galaxy/managers/workflows.py', line 585 in _workflow_to_dict_export
  if module.tool.tool_shed_repository:
AttributeError: 'NoneType' object has no attribute 'tool_shed_repository'

but that is not a new error ...

@guerler

This comment has been minimized.

Copy link
Contributor

commented Jul 6, 2017

Yes, this fixes the issue I mentioned above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.