Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[21.05] Do not persist tool state when invoking workflow #12141

Conversation

@mvdbeek
Copy link
Member

@mvdbeek mvdbeek commented Jun 14, 2021

If the tool_inputs column of the workflow_step is loaded as
MutableJSONType any "accidental" change of values in the dict marks the
object as "dirty" in the sqlalchemy session it is attached to and will
be persisted when the session is being flushed.
Making the column type JSONType instead of MutableJSONType means
that the object is not marked as dirty and will not be persisted in that
case. The correct method to use to actually persist step inputs is
WorkflowModule.save_to_step or setting step.tool_inputs directly. This
fixes #9886 and (probably, wasn't able to reproduce myself) #12081 (comment)

The issue in #9886 is that
when you submit a workflow the step state is converted to a tool state
in
https://github.com/mvdbeek/galaxy/blob/90dff9af5bec8bed1258620e5351d0350e284206/lib/galaxy/workflow/modules.py#L661:

    def step_state_to_tool_state(self, state):
        state = safe_loads(state)
        if "format" in state:
            formats = state["format"]
            if formats:
                formats = ",".join(listify(formats))
                state["format"] = formats
        state = json.dumps(state)
        return state

and safe_loads just returns the same object if the object that is passed
in is not actually a JSON string. So state["format"] = formats
manipulates the step state in place and that gets committed when
flushing the new invocation.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:

This is tricky to write good tests for. You could revert #9884 and run @dannon's workflow in https://gist.github.com/dannon/d8abf15c848405a7dc15a792149087d3 and make sure that after running it once you don't see "tool_state": "{\"optional\": false, \"format\": [\"f\", \"\", \"\", \"a\", \"\", \"\", \"s\", \"\", \"\", \"t\", \"\", \"\", \"a\"]}", in the tool state. That's what I used to figure out how to fix this.

License

If the `tool_inputs` column of the `workflow_step` is loaded as
MutableJSONType any "accidental" change of values in the dict marks the
object as "dirty" in the sqlalchemy session it is attached to and will
be persisted when the session is being flushed.
Making the column type `JSONType` instead of `MutableJSONType` means
that the object is not marked as dirty and will not be persisted in that
case. The correct method to use to actually persist step inputs is
WorkflowModule.save_to_step or setting step.tool_inputs directly.  This
fixes galaxyproject#9886 and
galaxyproject#12081 (comment)

The issue in galaxyproject#9886 is that
when you submit a workflow the step state is converted to a tool state
in
https://github.com/mvdbeek/galaxy/blob/90dff9af5bec8bed1258620e5351d0350e284206/lib/galaxy/workflow/modules.py#L661:
```
    def step_state_to_tool_state(self, state):
        state = safe_loads(state)
        if "format" in state:
            formats = state["format"]
            if formats:
                formats = ",".join(listify(formats))
                state["format"] = formats
        state = json.dumps(state)
        return state
```
and safe_loads just returns the same object if the object that is passed
in is not actually a JSON string. So `state["format"] = formats`
manipulates the step state in place and that gets committed when
flushing the new invocation.
It's not used in many places, this shouldn't have much impact on
performance, and I guess it's a fair assumption that something called
json.safe_loads would return a deepcopy. That's complementary to the
previous commit and would also separately fix the issues referenced
there.
@mvdbeek mvdbeek changed the title [21.05] Fix persisting tool state when invoking workflow [21.05] Do not persist tool state when invoking workflow Jun 14, 2021
@mvdbeek mvdbeek requested a review from jmchilton Jun 14, 2021
@jmchilton jmchilton merged commit 06183bd into galaxyproject:release_21.05 Jun 15, 2021
31 checks passed
Loading
@jmchilton
Copy link
Member

@jmchilton jmchilton commented Jun 15, 2021

Thank you!

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants