Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Composing uncomposable foreign keys #16

Closed
kris-brown opened this issue Apr 22, 2022 · 12 comments
Closed

Composing uncomposable foreign keys #16

kris-brown opened this issue Apr 22, 2022 · 12 comments
Labels
bug Something isn't working

Comments

@kris-brown
Copy link
Collaborator

It should be an error to compose uncomposable foreign keys.

using Catlab, Catlab.Graphs
G = complete_graph(Graph, 2)
G[[:src, :tgt]] # [2,1] no error
@epatters epatters transferred this issue from AlgebraicJulia/Catlab.jl May 30, 2023
@epatters epatters added the bug Something isn't working label May 30, 2023
@jpfairbanks
Copy link
Member

I looked at the source for subpart and noticed this:

collect_or_id(xs::AbstractVector{T}) where {T} = T[xs...]

Is it a performance problem?

@slwu89
Copy link
Member

slwu89 commented Feb 9, 2024

Also happens with attributes, the below code will erroneously return a value.

g = WeightedGraph{Float64}(4)
add_edges!(g, 1:3, 2:4, weight=[0.25, 0.5, 0.75])
subpart(g, 1, [:src, :weight])

Happy to take this small issue on to fix. @jpfairbanks what do you mean by performance problem?

@jpfairbanks
Copy link
Member

I'm not sure what the performance of splatting into a T[] is. Maybe it should be T.()

@slwu89
Copy link
Member

slwu89 commented Feb 10, 2024

I am thinking of how to do this check because I was showing a friend around ACSets today and they did something like the example I showed above and it really threw them when I said "uh oh, that shouldn't work" (because, look, it actually does work! sort of). So I think it's good to get this solved soon.

It appears like the check would have to be done before this method is called https://github.com/AlgebraicJulia/ACSets.jl/blob/main/src/ACSetInterface.jl#L168-L172, however I don't see a way to do it at the ACSetInterface level. It seems to me like we'd have to move that logic down into DenseACSets, and check before calling foldl that the path given by the user actually composes, using comptime to do it compiled for StructACSets and at runtime for DynamicACSets. Does that seem like the best path @kris-brown?

@kris-brown
Copy link
Collaborator Author

So unrelated to the potential performance problem James notes is the question of whether we want to always check sequences that they are composable, e.g. I'm imagining a hot loop where we're repeatedly accessing some composite. One idea is that the ACSet could have a cache of previously seen composites and we only run the check if that composite has not been seen before - I'm wondering if there are downsides to that option.

@jpfairbanks
Copy link
Member

Of course checking the cache would also be in the hot loop in that scenario. Is it worth introducing a manual check that you could ask a schema "is this an acceptable path?" And then you could rely on that check in the loop.

That would at least help with debug ability without introducing a potential performance regression

@epatters
Copy link
Member

@kris-brown and I just discussed this briefly. Turns out I made an issue long ago with a suggested strategy but never got around to implementing it: #9. Still seems like a reasonable approach that could now be done cleanly using CompTime.

@jpfairbanks
Copy link
Member

Yes that seems reasonable and for most schemas the number of actually used paths is small so the code gen cost will small relative to the usage. We could also do the check at compile time so that it is only performed once, lifting it out of the hot loop.

Moar codegen!

@slwu89
Copy link
Member

slwu89 commented Feb 10, 2024

@epatters I see the ideas in #9, very nice idea to separate behavior by tuple vs vector input. For the tuple based version, it looks to me like we ought to have a forwarding method in ACSetInterface and then use runtime vs compile time versions defined in DenseACSets depending on if we get a DynamicACSet or a StructACSet. In that case, would the suggested use be to use the tuple version unless one had some strong reason to be sure that everything was runtime computed regardless of the type of input acset?

@epatters
Copy link
Member

The use case for the tuple method would be the situation where (1) the tuple is known at compile time, usually because it is a tuple literal in the source code and (2) the access will happen in a hot loop or at least more than once. Then, as James said, the checking of domains and codomain can be done just once, when the code is generated, and each runtime access can be fast.

This is relevant only to struct acsets. For dynamic acsets, passing a tuple or vector would be the same.

@slwu89
Copy link
Member

slwu89 commented Feb 11, 2024

Great, yep, we're on the same page. Thanks.

@slwu89
Copy link
Member

slwu89 commented Feb 16, 2024

I think this can be closed with #109, although @jpfairbanks brought up the potential performance issue of

collect_or_id(xs::AbstractVector{T}) where {T} = T[xs...]
, which it seems like should be moved into a new discussion, before closing this one.

@slwu89 slwu89 closed this as completed Feb 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants