Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multicolumn indexes #17

Open
kris-brown opened this issue Aug 2, 2022 · 2 comments
Open

Multicolumn indexes #17

kris-brown opened this issue Aug 2, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@kris-brown
Copy link
Collaborator

Multicolumn indexes are useful in databases and also would be useful for ACSets. For example, a mapping table (i.e. a span A<-R->B, encoding a relation) is naturally indexed by its two outgoing morphisms to quickly check if (a,b) are related.

I'm not sure what needs to change in the underlying data structure, but an interface might look like this:

@present SchTest(FreeSchema) begin
 (A,B,C,R)::Ob
 f::Hom(R,A)
 g::Hom(R,B)
 h::Hom(R,C)
end

@acset_type Test(SchTest, index=[:h, (:f,:g)])

X = apex(terminal(Test))

incident(X, (1,1), (:f,:g)) # == [1]
@kris-brown kris-brown added the enhancement New feature or request label Aug 2, 2022
@epatters epatters transferred this issue from AlgebraicJulia/Catlab.jl May 30, 2023
@slwu89
Copy link
Member

slwu89 commented Sep 27, 2023

@kris-brown just want to note that I think this would be fantastically useful. In trying to use acsets to store data to parameterize practical optimization tasks, I have found myself wishing this feature existed.

I wonder also if it might work nicely with the ideas in AlgebraicJulia/Catlab.jl#844, certainly it would be very nice to index products this way.

@slwu89
Copy link
Member

slwu89 commented Oct 11, 2023

In addition to being able to do something similar to @kris-brown's proposed interface as above, I wonder if sequences of morphisms could also be supported in some future implementation of this feature. The following example is an example of a pattern that seems to come up very often in working with data in applied settings:

using ACSets

MySch = BasicSchema(
    [:X,:Y,:Rel],
    [(:proj_x,:Rel,:X),(:proj_y,:Rel,:Y)],
    [:NameType,:NumberType],
    [(:xname,:X,:NameType),(:yname,:Y,:NameType),(:cost,:Rel,:NumberType)]
)

@acset_type MyDataType(MySch, index=[:proj_x,:proj_y], unique_index=[:xname,:yname])

mydata = @acset MyDataType{Symbol,Float64} begin
    X = 4
    Y = 3
    xname = [:widget1,:widget2,:widget3,:widget4]
    yname = [:place1,:place2,:place3]
end

costs = rand(4,3)

add_parts!(
    mydata, :Rel, prod(size(costs)),
    proj_x = repeat(1:4,3),
    proj_y = vcat(map(i->fill(i,4),1:3)...),
    cost = vcat(costs...)
)

# a pattern that comes up very, very often
mydata[
    intersect(
        incident(mydata, :widget3, [:proj_x,:xname]),
        incident(mydata, :place2, [:proj_y,:yname])
    ),
    :cost
]

In addition, I believe it was previously discussed that at least in the case where the relation is a binary product, and where we want to be able to efficiently do this with caching, one would store a matrix to get these preimages, or an multidimensional array for the n-ary product. I'm not sure if the initial implementation of this feature also needs to consider the general case of a relation that may not be a (multi)product, where I guess one would store a sparse array.

I think that a prototype that nontheless would be useful immediately upon release would be to get the "non cached" version working that computes everything on the fly. This would also help work out the general methods and or structs that need to be implemented/modified for the overall case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants