Can't map over set #1821

Closed
bnaul opened this issue Dec 10, 2019 · 2 comments

Comments

@bnaul
bnaul commented Dec 10, 2019

Example:

from prefect import task, Flow

@task
def double(x):
    return 2 * x

with Flow('set_map') as flow:
    set_double = double.map({1, 2})  # double.map([1, 2]) works

flow.run()

Currently fails on 0.7.3 with:

[2019-12-10 07:01:48,973] ERROR - prefect.TaskRunner | Task 'double': unexpected error while running task: TypeError("'set' object is not subscriptable")
Traceback (most recent call last):
  File "/Users/brett/Dropbox/Documents/model/.venv/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 260, in run
    executor=executor,
  File "/Users/brett/Dropbox/Documents/model/.venv/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 720, in run_mapped_task
    upstream_state.result[i],
TypeError: 'set' object is not subscriptable
ERROR:prefect.TaskRunner:Task 'double': unexpected error while running task: TypeError("'set' object is not subscriptable")
Traceback (most recent call last):
  File "/Users/brett/Dropbox/Documents/model/.venv/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 260, in run
    executor=executor,
  File "/Users/brett/Dropbox/Documents/model/.venv/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 720, in run_mapped_task
    upstream_state.result[i],
TypeError: 'set' object is not subscriptable

<Failed: "Some reference tasks failed.">

Maybe this is expected behavior, but I would have thought mapping would just work as long as the thing being mapped has a len (even if the output isn't guaranteed to stay a set).
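A minimal sketch of why having a length is not enough here, using only the Python standard library and not Prefect itself: the traceback shows the runner indexing the upstream result by position (`upstream_state.result[i]`), and a set supports `len()` but not positional indexing. The helper name `first_item` is hypothetical and only stands in for that indexing step.

```python
from collections.abc import Sequence, Sized

values = {1, 2}

# A set reports a length, so it satisfies the Sized interface...
is_sized = isinstance(values, Sized)
# ...but it is not a Sequence, so values[i] is not supported.
is_sequence = isinstance(values, Sequence)


def first_item(collection):
    """Hypothetical stand-in for the runner's `upstream_state.result[i]` with i = 0."""
    return collection[0]


try:
    first_item(values)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Converting to a sorted list gives a stable order that can be indexed.
as_list = sorted(values)
```

Under that assumption, a likely workaround is to pass an ordered collection instead, for example `double.map(sorted({1, 2}))`, since the issue notes that mapping over a list works.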

@cicdw
Member

cicdw commented Dec 15, 2019

Sorry for the delayed response here; the biggest reason that non-ordered collections don't play well with mapping is that in Cloud (and also somewhat in Core) we identify mapped task runs as a pair (task_run_id, map_index) where the map index corresponds to the position of the argument being mapped over. This becomes more important in situations such as retries when we need to make sure we aren't mismatching state updates.
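The role of the map index can be illustrated with a small sketch. This is not Prefect's implementation; `child_run_keys` and the run id `"run-1"` are hypothetical names used only to show how a `(task_run_id, map_index)` pair depends on the position of each mapped argument.

```python
def child_run_keys(task_run_id, mapped_values):
    """Hypothetical helper: pair a parent run id with each positional index."""
    return {(task_run_id, i): value for i, value in enumerate(mapped_values)}


# With a list, the position of each argument is fixed, so a retry
# addresses the same (task_run_id, map_index) pair for the same value.
first_attempt = child_run_keys("run-1", ["a", "b", "c"])
retry_attempt = child_run_keys("run-1", ["a", "b", "c"])
same_keys = first_attempt == retry_attempt

# A set has no defined positions, so it must be given an explicit order
# (here via sorted) before an index can identify an element.
ordered = sorted({"c", "a", "b"})
from_set = child_run_keys("run-1", ordered)
```

Because a set's iteration order is not part of its contract, an index assigned during one run need not refer to the same element in another, which is the mismatch risk described above for retries.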

There's a decent chance we'll be refactoring mapping in the near future for performance reasons, and when we do we can revisit some of these assumptions and make sure they aren't overly stringent!

@bnaul
Author

bnaul commented Dec 16, 2019

Makes sense :) Maybe a different error could be raised that makes it clearer that the behavior is intentional? Regardless, feel free to close if you think this won't be addressed except as part of a larger refactor.
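One possible shape for such an error, as a sketch only: the function `check_mappable` and its message are hypothetical and not part of Prefect's API. It assumes that checking for `collections.abc.Sequence` is an acceptable test for "ordered and indexable", which is stricter than a real implementation might want (some indexable types are not registered as sequences).

```python
from collections.abc import Sequence


def check_mappable(value, task_name):
    """Hypothetical validation: reject unordered inputs before mapping starts."""
    if not isinstance(value, Sequence):
        raise TypeError(
            f"Task '{task_name}' cannot be mapped over a {type(value).__name__}: "
            "mapping requires an ordered, indexable collection such as a list or tuple."
        )
    return value


# A list passes through unchanged.
accepted = check_mappable([1, 2], "double")

# A set is rejected with a message that names the task and the offending type.
try:
    check_mappable({1, 2}, "double")
    rejection = None
except TypeError as exc:
    rejection = str(exc)
```

A check like this would fail at the point where the input is inspected and name the cause directly, rather than surfacing as `'set' object is not subscriptable` from inside the task runner.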

@cicdw cicdw added this to the 0.9.2 milestone Jan 27, 2020
@cicdw cicdw self-assigned this Jan 27, 2020
@joshmeek joshmeek modified the milestones: 0.9.2, 0.9.3, 0.9.4 Jan 30, 2020
@joshmeek joshmeek modified the milestones: 0.9.4, 0.9.5 Feb 13, 2020
@joshmeek joshmeek modified the milestones: 0.9.5, 0.9.6 Feb 25, 2020
@cicdw cicdw removed their assignment Nov 29, 2021
@kvnkho kvnkho closed this as completed Dec 8, 2021
zanieb pushed a commit that referenced this issue Jun 7, 2022