Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove inputs special keyword argument #1790

Open
annawoodard opened this issue Jul 6, 2020 · 10 comments
Open

Remove inputs special keyword argument #1790

annawoodard opened this issue Jul 6, 2020 · 10 comments

Comments

@annawoodard
Copy link
Collaborator

@makslevental recently reported a serious bug #1788 and proposed a solution in #1789. One reservation I have with that approach is that it would change current Parsl behavior in a way that might be confusing for users: if a list or tuple of futures is passed as a positional arg, Parsl would wait for each component future to be resolved before executing the app. This might lead users to expect that other structures (like dicts, etc) would also be introspected.

@benclifford pointed out on Slack that because we now check args, keyword args, and the inputs special keyword arg for files to potentially stage in, the inputs special keyword argument is redundant. Borrowing Max's example:

import parsl
from parsl import python_app

@python_app
def monte_carlo():
    from random import random
    from math import sin, cos
    u = random()
    return (sin(u)) / (u + cos(u) ** 2)

@python_app
def mean1(inputs=[]):
    return sum(inputs) / len(inputs)

@python_app
def mean2(inputs=[]):
    return (inputs[0] + inputs[1] + inputs[2]) / len(inputs)

could be replaced by:

import parsl
from parsl import python_app

@python_app
def monte_carlo():
    from random import random
    from math import sin, cos
    u = random()
    return (sin(u)) / (u + cos(u) ** 2)

@python_app
def mean1(*args):
    return sum(args) / len(args)

@python_app
def mean2(*args):
    return (args[0] + args[1] + args[2]) / len(args)

And to clarify how this plays with other arguments and give a full example:

import parsl
from parsl import python_app

@python_app
def monte_carlo():
    from random import random
    from math import sin, cos
    u = random()
    return (sin(u)) / (u + cos(u) ** 2)

@python_app
def mean(foo, *futures, baz='zap'):
    return foo, baz, (futures[0] + futures[1] + futures[2]) / len(futures)

futures = [monte_carlo() for _ in range(3)]
foo, baz, average = mean('bar', *futures, baz='zapzap').result()
print("Foo: {}, Baz: {}, Average: {:.5f}".format(foo, baz, average))

One advantage of removing inputs is that it would be more pythonic, and less Parsl-specific jargon for people to grok. Removing it would also be a fix for #1788.

@benclifford
Copy link
Collaborator

I support this.

@makslevental
Copy link

@annawoodard I think the behavior I was naively expecting is exactly that which you rightly pointed out others might end up expecting - that any Future anywhere would be identified as a dependency and waited on. For peak pythonicity I think in fact that's how it should work but I'm not sure if it's possible in Python.

Anyway this is a breaking change (removing inputs) but if this is what you guys think is the right thing to do then I can take it on.

@benclifford
Copy link
Collaborator

benclifford commented Jul 13, 2020

When we've removed features before, we've made them deprecated for one release, and then removed the code in the next release.

What that would look like here, I think, is that from now until parsl (1.0.0 + 1 version) is released, the code should log a warning when inputs is used, such as "inputs is deprecated and will be removed in the next parsl release. See #1790 for more information."

Perhaps that warning could be placed in parsl.app.app.AppBase.__init__ which already has a test to see if inputs is used.

Then this issue should be left open until sometime actually removes all the code, sometime after (1.0.0 + 1 version) is released

@kylechard
Copy link
Collaborator

I don't think this will be a breaking change? The change here would expand the way we introspect arguments such that any list/tuple would be treated the same way that inputs are currently treated. So those that want to use inputs can continue to do so. We can simply remove inputs from the documentation as a special keyword.

@benclifford
Copy link
Collaborator

benclifford commented Jul 13, 2020

I think @annawoodard 's proposal is to remove inputs, but not add in new deep data structure inspection for parsl-magic (that is, not do #1788, #1789). If so, that makes inputs into a regular argument that is not deep-inspected for parsl-magic, in the same way that any other argument which happens to be a list is also not inspected. (parsl-magic = Futures and Files). That breaks almost any use of inputs.

@annawoodard
Copy link
Collaborator Author

@kylechard Ben's summary of the proposal is correct. I favor logging a warning, changing docs, and removing code in next release. Alternatively we could retain the current behavior for backwards compatibility but just change the docs.

@yadudoc @ZhuozhaoLi @danielskatz interested if you have any opinions on this

@danielskatz
Copy link
Member

We talked about this in the Parsl meeting Friday, and were leaning towards your

retain the current behavior for backwards compatibility but just change the docs.

We also talked about removing outputs but no one but @kylechard really seemed to care

@yadudoc
Copy link
Member

yadudoc commented Jul 13, 2020

Summarizing discussion from Friday:

  1. There's general interest in removing inputs and outputs as special kwargs.
  2. One proposal is to move towards inspecting all params at call time to capture Futures, and supporting inspection of lists, tuples and dicts would cover most cases.
  3. Another proposal was to have deep inspection of all data-structures. This comes with a performance penalty and practical limitations from complex data structures within which Futures can be placed.
  4. One proposal regarding outputs was to create a new class that indicates that a param is specifying an output File object :
output_file = File("foo.txt")
app(input_file, ParslOutput(output_file))

Fixing outputs is a less pressing issue, and is mostly likely always going to be jarring given the idea of specifying outputs of a function ahead of time is a foreign concept in python.

@annawoodard
Copy link
Collaborator Author

Thanks for the summary, @yadudoc!

  1. There's general interest in removing inputs and outputs as special kwargs.

Yay! Really we should remove all special keyword args as they all suffer from the bug @makslevental reported in #1788.

  1. One proposal is to move towards inspecting all params at call time to capture Futures, and supporting inspection of lists, tuples and dicts would cover most cases.

I think inspecting some data structures and not others would be confusing to users. In my view, we should either only resolve bare Futures, or do deep inspection of all data structures. Since I don't think deep inspection is practical, I favor only resolving bare Futures.

  1. Another proposal was to have deep inspection of all data-structures. This comes with a performance penalty and practical limitations from complex data structures within which Futures can be placed.

I am strongly against this proposal. I think properly implementing it, if it can be done at all, it would be a huge amount of work and even then would be riddled with bugs and unexpected behavior. And the only thing we gain is a bit of syntactic sugar. Users are always free to write an app which takes the component futures and returns their data structure of choice.

  1. One proposal regarding outputs was to create a new class that indicates that a param is specifying an output File object :
output_file = File("foo.txt")
app(input_file, ParslOutput(output_file))

Fixing outputs is a less pressing issue, and is mostly likely always going to be jarring given the idea of specifying outputs of a function ahead of time is a foreign concept in python.

That proposal would need tweaking, as users would need a way to map the output passed as an argument to the returned DataFuture as is currently done with the index in the list:

future = app(stderr='baz', outputs=[foo, bar])
foo, bar = future.outputs

Instead, to solve the general problem with special keyword args, I favor a general class:

future = app(RuntimeConfig(stderr='baz', outputs=[foo, bar]))
foo, bar = future.outputs

But specifically for outputs, I'd like to see us move away from the "specify outside of the app" model in favor of the approach proposed in #483.

@ZhuozhaoLi
Copy link
Contributor

For inputs

  1. I favor only inspecting list, dict, and tuple as the first attempt, as most people would use these structures. We can see what users think (e.g., in terms of usability and performance slowdown) after we change. Alternatively, we can allow the users to specify what data structure(s) they want a deep inspection (e.g., in the decorator).

  2. In the meeting, we mentioned that we can retain the capability of input for backward compatibility, and only change docs and add deprecation messages in the next release. Finally, after a few releases, we delete the capability. I think this is the way we should proceed.

For output, I think adding a container is generally fine, though it requires the users to go through the Docs again. But like input, we should add deprecation messages in the next release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants