Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Local sparsity support with ADTypes #72

Closed
gdalle opened this issue May 18, 2024 · 11 comments · Fixed by #81
Closed

Local sparsity support with ADTypes #72

gdalle opened this issue May 18, 2024 · 11 comments · Fixed by #81
Assignees

Comments

@gdalle
Copy link
Collaborator

gdalle commented May 18, 2024

I think our TracerSparsityDetector should offer the option to use either local or global patterns.

I don't think it should be the job of ADTypes to provide that distinction, and here's why: the notion of "sparsity" is intrinsically ill-defined.
Our concept sparsity is not the same as that of Symbolics (for instance wrt dependency on control flow if/else), which may not be the same as that of a pre-specified pattern.
It's not up to ADTypes to define a complete classification of the various types of sparsity. Rather, the sparsity detectors themselves should document their approach and limitations, and offer all the necessary toggles

@adrhill
Copy link
Owner

adrhill commented May 18, 2024

We could add local tracing as a kwarg to TracerSparsityDetector, or add a separate LocalTracerSparsityDetector, but this might cause downstream packages like DI to not recompute sparsity patterns when they should.

For this reason, I would argue that local and global sparsity detection should be distinguished between on the level of ADTypes.

@gdalle
Copy link
Collaborator Author

gdalle commented May 18, 2024

We could add local tracing as a kwarg to TracerSparsityDetector()

First of all, if we add it it should be as a type parameter to enable dispatch.

this might cause downstream packages like DI to not recompute sparsity patterns when they should.

It's not "packages" who decide when to recompute the sparsity pattern, it's users. They make the call as to whether prepare_jacobian should be run again or not when x changes.

Even what we call "global sparsity" depends on the control flow, so it's not robust to any change in x. Sure, we throw an error "primal value missing" in some cases when we see a comparison, but there must be more cases we missed.

@adrhill
Copy link
Owner

adrhill commented May 18, 2024

Our concept sparsity is not the same as that of Symbolics (for instance wrt dependency on control flow if/else), which may not be the same as that of a pre-specified pattern.

This is not the issue. The issue is that in DI, LocalTracerSparsityDetector should only be used in combination with prepare_*_same_point and never with prepare_*.

@gdalle
Copy link
Collaborator Author

gdalle commented May 18, 2024

The issue is that in DI, LocalTracerSparsityDetector should only be used in combination with prepare_same_point and never with prepare.

Okay but that is something we want to warn users about, not explicitly forbid.

The same warning holds for other types of preparation, like the tape in ReverseDiff: it depends on the control flow, and thus possibly on some values inside x. We still allow this preparation, we just let users decide how many times they can safely reuse extras

@adrhill
Copy link
Owner

adrhill commented May 18, 2024

It's not "packages" who decide when to recompute the sparsity pattern, its users. They make the call as to whether prepare_jacobian should be run again or not when x changes.

My point is basically holding the hand of the user: there is never any use to call prepare_jacobian with LocalTracerSparsityDetector, so I personally just wouldn't allow that footgun.

@gdalle
Copy link
Collaborator Author

gdalle commented May 18, 2024

there is never any use to call prepare_jacobian with LocalTracerSparsityDetector

Yes there is, because even when you compute a one-time Jacobian,

jacobian(f, backend, x) = jacobian(f, backend, x, prepare_jacobian(f, backend, x))

@gdalle
Copy link
Collaborator Author

gdalle commented May 18, 2024

In such cases, the user wouldn't call prepare_jacobian explicitly, but the preparation still happens even for one shot

@adrhill
Copy link
Owner

adrhill commented May 18, 2024

Yeah, now I see the issue.

@gdalle
Copy link
Collaborator Author

gdalle commented May 18, 2024

Basically

  • ADTypes cannot explicitly distinguish every possible brand of sparsity
  • We need local sparsity to be supported by ADTypes because that's what DI uses
  • Thus we need the option in our TracerSparsityDetector, with all the necessary red flags

@adrhill
Copy link
Owner

adrhill commented May 18, 2024

Then let's make it explicit and add a separate LocalTracerSparsityDetector (or TracerLocalSparsityDetector).

@gdalle
Copy link
Collaborator Author

gdalle commented May 18, 2024

Sure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants