Scottx611x/is tool based cleanup #2222

Merged
merged 19 commits into develop from scottx611x/is_tool_based_cleanup on Oct 17, 2017

Conversation

@scottx611x (Member) commented Oct 10, 2017

Fixes #2194

TO DO:

  • Make sure code coverage is up to par
  • Manual testing

@codecov-io commented Oct 10, 2017

Codecov Report

Merging #2222 into develop will increase coverage by 0.59%.
The diff coverage is 88.4%.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop   #2222      +/-   ##
==========================================
+ Coverage     48.8%   49.4%   +0.59%     
==========================================
  Files          416     416              
  Lines        28222   27477     -745     
  Branches      1307    1307              
==========================================
- Hits         13773   13574     -199     
+ Misses       14449   13903     -546
Impacted Files Coverage Δ
refinery/galaxy_connector/galaxy_workflow.py 100% <ø> (+86.9%) ⬆️
refinery/analysis_manager/models.py 83.09% <ø> (+0.9%) ⬆️
refinery/analysis_manager/urls.py 100% <ø> (ø) ⬆️
refinery/core/tests.py 99.64% <ø> (ø) ⬆️
refinery/analysis_manager/tests.py 97.32% <100%> (-0.78%) ⬇️
refinery/analysis_manager/views.py 46.78% <100%> (+10.71%) ⬆️
refinery/tool_manager/models.py 95.89% <100%> (ø) ⬆️
refinery/tool_manager/utils.py 84.21% <100%> (-0.2%) ⬇️
refinery/tool_manager/tests.py 99.84% <100%> (-0.01%) ⬇️
refinery/analysis_manager/utils.py 58.55% <80%> (+3.4%) ⬆️
... and 4 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 986397c...a63fcec.

@@ -20,176 +20,45 @@

logger = logging.getLogger(__name__)

# Allow JSON Schema to find the JSON pointers we define in our schemas
JSON_SCHEMA_FILE_RESOLVER = RefResolver(
Member

If you kept this, could pieces like the UUID RE be defined separately? Would that be useful?

@@ -200,9 +69,9 @@ def fetch_objects_required_for_analysis(validated_analysis_config):
:return: dict w/ mapping to the commonly used objects
:raises: RuntimeError
"""
study_uuid = validated_analysis_config["studyUuid"]
study_uuid = validated_analysis_config["study_uuid"]
Member

Thinking about the dependency between this string and the JSON schema, it would probably be overkill to have the JSON generated by Python code?

tool = workflow_tool.objects.get(analysis__uuid=self.uuid)
except(workflow_tool.DoesNotExist,
workflow_tool.MultipleObjectsReturned) as e:
logger.error("Could not properly fetch Tool: %s", e)
@mccalluc (Member) Oct 12, 2017

Is logging an error sufficient? We'd then skip the else and drop to if self.history_id:

Member Author

@mccalluc Now that I look at the surrounding code, the exceptions would be pretty odd to hit, but I don't think we should re-raise anything and prevent the histories from being deleted as well.

@@ -73,9 +73,6 @@ def refinery_import_state(self):
return get_task_group_state(self.refinery_import_task_group_id)

def galaxy_file_import_state(self):
return get_task_group_state(self.galaxy_import_task_group_id)

def tool_based_galaxy_file_import_state(self):
if self.galaxy_import_state and self.galaxy_import_progress != 0:
Member

galaxy_import_state is a field of AnalysisStatus. Is it necessary to check whether it exists?

Member Author

Yes, you can still construct an AnalysisStatus like so w/o a galaxy_import_state: AnalysisStatus.objects.create(analysis=analysis)

Member

Right, but the field would actually exist with a default value, no?

Member Author

The default value for a blank=True field is u'', which is still falsey.

For example, this test case passes:

def test_analysis_status_galaxy_import_state_is_falsey_by_default(self):
    analysis_status = AnalysisStatus.objects.create(analysis=self.analysis)
    self.assertFalse(analysis_status.galaxy_import_state)

Member

Yes, but the field is there, so checking for its existence is not necessary.

Member Author

It's necessary as far as I can tell.

One would need to assert that there has been a galaxy_import_state set and that some galaxy_import_progress has happened before going ahead and handing a response over to the front end to display.
Otherwise there's nothing meaningful for the UI to display yet.

I could imagine a scenario where galaxy_import_state is still equal to u''; even though the galaxy_import_state field is present, passing that information along to the frontend without checking it first would yield odd results.

Member

OK, but you are already checking it with self.galaxy_import_progress != 0.

Member Author

Just because there is positive galaxy_import_progress doesn't mean that a valid galaxy_import_state has been set as well?

Something like AnalysisStatus.objects.create(analysis=<analysis>, galaxy_import_progress=2) would break the UI with your proposed change.

else:
input_file_uuid_list = analysis.get_input_file_uuid_list()
tool = _get_workflow_tool(analysis_uuid)
input_file_uuid_list = tool.get_input_file_uuid_list()

for input_file_uuid in input_file_uuid_list:
Member

input_file_uuid_list is used only once. Replace with for input_file_uuid in tool.get_input_file_uuid_list() and remove the line above?
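A minimal sketch of the suggested inlining (the loop body is elided here):

for input_file_uuid in tool.get_input_file_uuid_list():
    ...  # handle each input file UUID as before, no intermediate variable needed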

@@ -486,121 +419,9 @@ def _run_tool_based_galaxy_workflow(analysis_uuid):
return
Member

Unnecessary? Same as lines 125, 141, 223, 291 and 365?

Member Author

I've just continued using these returns as they were being used beforehand in this area of the codebase. If you are confident that I can remove them, then I can.

Member

They were necessary previously because they provided exit points in the middle of the function. Now they are all at the end of their respective functions, so they don't have any effect.

def _tool_based_galaxy_file_import(analysis_uuid, file_store_item_uuid,
history_dict, library_dict):
def _galaxy_file_import(analysis_uuid, file_store_item_uuid, history_dict,
library_dict):
tool = _get_workflow_tool(analysis_uuid)

file_store_item = FileStoreItem.objects.get(uuid=file_store_item_uuid)
Member

Exception handling?

file_type = 'zip'
# TODO: when changing permanent=True, fix update of % download
# of file
filestore_uuid = create(
Member

Should fit on one line.


common_analysis_objects = fetch_objects_required_for_analysis(
validated_analysis_config
common_analysis_objects = (
Member

Why wrap this function call in parentheses?

name="{} - {} - {}".format(
tool.get_tool_name(),
get_aware_local_time().strftime("%Y/%m/%d %H:%M:%S"),
tool.get_owner_username().title()
Member

Username has a title?

Member Author

I think title() is just a method available to strings in Python.
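For reference, str.title() uppercases the first letter of each alphabetic run in a string, so here it simply capitalizes the username:

>>> "admin".title()
'Admin'
>>> "jane_doe".title()
'Jane_Doe'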

Validate incoming Tool Launch Configurations
:param tool_launch_config: json data containing a ToolLaunchConfiguration
Validate incoming Analysis Configurations
:param analysis_config: json data containing an Analysis configuration
"""
with open(
Member

Exception handling?

Member Author

All the JsonSchema validations are within the scope of atomic transactions, so I want any exceptions to propagate. I'll make a note.

Member

OK, a comment at the top of this function would be helpful. Btw, it looks like AnalysisConfig.json is essentially a configuration file, so I would move the reading operation out into base settings (similar to tutorial_steps.json, etc). This would make validate_analysis_config() cleaner and more testable. Also, should this file be named using snake case or camel case?
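A hedged sketch of that suggestion, assuming an illustrative schema path and setting name (neither is necessarily the project's actual layout): parse AnalysisConfig.json once in base settings so that validate_analysis_config() only performs the validation itself.

import json
import os

# BASE_DIR is assumed to already be defined in the settings module, as is typical for Django.
with open(
    os.path.join(BASE_DIR, "refinery", "analysis_manager", "schemas", "AnalysisConfig.json")
) as schema_file:
    ANALYSIS_CONFIG_SCHEMA = json.load(schema_file)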

# original galaxy workflow which we don't want to delete
if not self.is_tool_based:
if self.workflow_galaxy_id:
workflow_tool = tool_manager.models.WorkflowTool
Member

Assigning a model to a variable is an anti-pattern.
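A hedged sketch of the alternative implied here, reusing names from the excerpts above (the surrounding method context is assumed): reference the model directly instead of binding the class to a local variable.

try:
    tool = tool_manager.models.WorkflowTool.objects.get(analysis__uuid=self.uuid)
except (tool_manager.models.WorkflowTool.DoesNotExist,
        tool_manager.models.WorkflowTool.MultipleObjectsReturned) as e:
    logger.error("Could not properly fetch Tool: %s", e)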

@scottx611x added this to the Release 1.6.1 milestone Oct 16, 2017
@scottx611x (Member Author)

@hackdna I'm going to take care of your schema concerns here: #2236 rather than diverging from this PR's original intent.

@scottx611x (Member Author)

@hackdna I think that PRs grow too large and become harder for reviewers when this exact scenario occurs (being asked to provide changes that were not part of a PR's original motivation). I don't see how having concise, directed PRs complicates things at all.

@hackdna (Member) commented Oct 16, 2017

My comment was specifically about 2f1419f

@scottx611x (Member Author)

Oops, didn't see that. I believe @mccalluc mentioned in his review that it would be nice to have things like regexes that are reused across schemas defined in a single location. 2f1419f created core/schemas/Base.json so that schemas in other apps could reuse values/patterns from there.
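A hedged sketch of that idea (the actual contents of core/schemas/Base.json may differ), expressed as Python dicts for illustration: define shared patterns such as a UUID once and reference them from other schemas via $ref.

# Shared definitions, roughly as they might appear in core/schemas/Base.json
BASE_SCHEMA = {
    "definitions": {
        "uuid": {
            "type": "string",
            "pattern": "^[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}$"
        }
    }
}

# A schema in another app could then reuse the pattern:
ANALYSIS_CONFIG_FRAGMENT = {
    "properties": {
        "study_uuid": {"$ref": "core/schemas/Base.json#/definitions/uuid"}
    }
}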


from django.conf import settings
from django.contrib.auth.models import User
from django.http import HttpResponseServerError
from django.utils import timezone

from jsonschema import RefResolver, ValidationError, validate
from jsonschema import ValidationError, validate
Member

Should be import jsonschema
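A minimal sketch of the suggested import style (the validate call is illustrative):

import jsonschema

jsonschema.validate(analysis_config, schema)  # module-qualified call, instead of importing validate directly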

JSON_SCHEMA_FILE_RESOLVER = RefResolver(
"file://{}/".format(os.path.join(BASE_DIR, "refinery")),
None
)
Member

An instance of RefResolver should not be a Django setting. Also, it looks like base_uri doesn't need to be a file path but could be any URI: something like http://refinery-platform.org to make things simpler.
Finally, a better approach might be to add the $id keyword to the schema and then use from_schema class method to create a resolver?
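A hedged sketch of that suggestion, with an illustrative schema and $id (not the project's actual AnalysisConfig.json): give the schema an $id and derive the resolver from it, instead of configuring a RefResolver in Django settings.

import jsonschema

analysis_config_schema = {
    # note: draft-04 schemas spell this keyword "id"; "$id" is the draft-06+ spelling
    "$id": "http://refinery-platform.org/schemas/AnalysisConfig.json",
    "type": "object",
    "properties": {"study_uuid": {"type": "string"}},
    "required": ["study_uuid"]
}

resolver = jsonschema.RefResolver.from_schema(analysis_config_schema)
jsonschema.validate(
    {"study_uuid": "2b939cb3-5b40-48c2-8df7-e5472c3bdcca"},  # illustrative instance
    analysis_config_schema,
    resolver=resolver
)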

@scottx611x (Member Author)

@hackdna I appreciate your insight into a better solution for the JSONSchema changes. I'm going to revert 2f1419f and address your concerns in #2236 as well, since these changes aren't part of #2222's original motivation.

@scottx611x merged commit bfa780d into develop Oct 17, 2017
@scottx611x deleted the scottx611x/is_tool_based_cleanup branch October 17, 2017 15:48