fix: prefer known to unknown lengths in broadcasting #2561

agoose77 · 2023-07-03T15:05:34Z

This PR is a first pass over the broadcasting logic to change direction from "unknown lengths are infectious" to "known lengths are infectious".

As we've discussed here previously, typetracer should only fail for errors that are known ahead of time. We defer validation of the unknown data to runtime, via a second pass using a known-data backend. As such, we can replace assertions of the form `assert unknown_value == x` with assignments of the form `unknown_value = x`, rather than propagating unknown values everywhere.

i.e. operations like

require_equal_lengths(contents)
next_length = unknown_length

become

require_equal_lengths(contents)
next_length = contents[0].length
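To make the before/after concrete, here is a minimal, self-contained sketch of the "known lengths are infectious" rule. The `UnknownLength` sentinel and the `resolve_equal_lengths` helper are hypothetical names invented for this illustration; they are not the code in `src/awkward/_broadcasting.py`.

```python
class UnknownLength:
    """Hypothetical stand-in for a length that is not known ahead of time."""

    def __repr__(self):
        return "unknown_length"


unknown_length = UnknownLength()


def resolve_equal_lengths(lengths):
    """Return the one length shared by all inputs (illustrative helper).

    Unknown lengths are assumed to match whatever is known; only two
    different *known* lengths count as an ahead-of-time error.
    """
    known = [n for n in lengths if n is not unknown_length]
    if not known:
        # Nothing is known, so the result stays unknown.
        return unknown_length
    first = known[0]
    for n in known[1:]:
        if n != first:
            raise ValueError(f"cannot broadcast lengths {first} and {n}")
    # A single known length "infects" the result.
    return first


print(resolve_equal_lengths([unknown_length, 5, unknown_length]))  # 5
print(resolve_equal_lengths([unknown_length, unknown_length]))  # unknown_length
```

Whether an unknown length really equals the known one is left to the later pass over concrete data, as described above.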

I think we might have assumed this before; I'm just getting around to changing the code after re-orienting my thinking a while back.

Relatedly, slicing typetracer arrays should assume that the slice succeeds, and use the concrete length, for obvious reasons.
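Under one reading of the slicing remark, a slice with explicit bounds applied to an array of unknown length can report the length implied by the bounds, on the assumption that the array turns out to be long enough. The sketch below illustrates that reading only; `sliced_length` is a hypothetical helper and `None` stands in for an unknown length.

```python
def sliced_length(length, start, stop):
    """Length of array[start:stop] for a step-1 slice (illustrative helper).

    `length` is an int, or None when it is not known ahead of time.
    """
    if length is not None:
        # Known length: let Python's own slicing rules clip the bounds.
        return len(range(length)[start:stop])
    if start >= 0 and stop >= 0:
        # Unknown length: assume the slice succeeds in full.
        return max(stop - start, 0)
    # Negative bounds depend on the real length, so the result stays unknown.
    return None


print(sliced_length(10, 2, 5))    # 3
print(sliced_length(None, 0, 3))  # 3
print(sliced_length(None, -2, 3)) # None
```

If the real array is shorter than the slice assumes, that mismatch would surface only when the operation runs on concrete data.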

src/awkward/_broadcasting.py

jpivarski

Yes, "known lengths are infectious" is the right direction: if we try to broadcast length N with length unknown, then the broadcasted result is length N because unknown might be equal to 1. If it's not, we'll find out when a Dask worker tries to actually do it with no unknown lengths.
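A small sketch of that rule for a pair of lengths, with NumPy-style length-1 broadcasting. The names `UNKNOWN` and `broadcast_length` are hypothetical, and the handling of "1 versus unknown" is an assumption of this sketch, not something stated in the PR.

```python
UNKNOWN = object()  # hypothetical marker for a length not known ahead of time


def broadcast_length(a, b):
    """Length of the result of broadcasting lengths a and b (illustrative)."""
    if a is UNKNOWN and b is UNKNOWN:
        return UNKNOWN
    if a is UNKNOWN or b is UNKNOWN:
        known = b if a is UNKNOWN else a
        # Assumption of this sketch: a known length of 1 tells us nothing,
        # because the unknown side could have any length.
        if known == 1:
            return UNKNOWN
        # Otherwise the unknown side must be 1 or N, so the result is N.
        return known
    if a == b:
        return a
    if a == 1:
        return b
    if b == 1:
        return a
    # Two different known lengths, neither equal to 1: an ahead-of-time error.
    raise ValueError(f"cannot broadcast lengths {a} and {b}")


print(broadcast_length(5, UNKNOWN))  # 5
print(broadcast_length(5, 1))        # 5
```

If the unknown length later turns out to be neither 1 nor N, the error appears when the computation runs on concrete data.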

agoose77 added 7 commits July 3, 2023 15:07

refactor: add type hints for broadcasting

4317243

refactor: cleanup record broadcasting

61abff6

refactor: detect length using checklength

134bb74

refactor: don't check lengths twice

d2262fa

refactor: add more type hints

2b1fc0d

fix: prefer known lengths

d71950f

fix: better handling of unknown lengths

c873b98

agoose77 commented Jul 3, 2023

src/awkward/_broadcasting.py (outdated, resolved)

agoose77 commented Jul 3, 2023

src/awkward/_broadcasting.py (resolved)

chore: appease pylint

b50d286

agoose77 requested a review from jpivarski July 3, 2023 15:17

agoose77 temporarily deployed to docs-preview July 3, 2023 15:22 — with GitHub Actions Inactive

jpivarski approved these changes Jul 3, 2023

agoose77 merged commit eaf60bf into main Jul 3, 2023
36 checks passed

agoose77 deleted the agoose77/refactor-broadcasting-lengths branch July 3, 2023 17:42
