Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add topk features #260

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

aurorarossi
Copy link
Collaborator

With this PR I add the topk_nodes and topk_edges features (see issue #41).

src/utils.jl Outdated Show resolved Hide resolved
@CarloLucibello
Copy link
Owner

I think it is more useful and general to have a function with signature

topk_nodes(g::GNNGraph, x::AbstractArray, k::Int; rev::Bool = true, sortby::Union{Nothing, Int} = nothing)

instead of passing the tensor name.

Also, the docstring needs much more explanation of the purpose of this function, the various arguments, and the returned objects.

Following DGL, I think it is useful to return both the sorted array and permutation.
So if we call (y, partialperm) the output:

  • y should be an array of size (num_feat, k, num_graphs)
  • partialperm should be an array of integers of size (k, num_graphs) if sortby === nothing and of size (num_feat, k, num_graphs) if sortby is an integer.

@aurorarossi aurorarossi reopened this Mar 28, 2023
@aurorarossi aurorarossi marked this pull request as draft March 28, 2023 12:09
@aurorarossi aurorarossi marked this pull request as ready for review March 28, 2023 18:55
@aurorarossi aurorarossi marked this pull request as draft March 9, 2024 20:34
@aurorarossi aurorarossi marked this pull request as ready for review March 10, 2024 10:41
@CarloLucibello
Copy link
Owner

Does it work on gpu as well?

@aurorarossi
Copy link
Collaborator Author

Yes, it works on the GPU.

@CarloLucibello
Copy link
Owner

I think it would be nice to contribute topk_matrix as a topk function in MLUtils.jl and then base this PR on top of that.

Then I would have here two functions, topk_nodes and topk_edges as in Deep Graph Library (https://docs.dgl.ai/generated/dgl.topk_nodes.html#dgl.topk_nodes)

- `feat`: a feature array of size `(number_features, g.num_nodes)` or `(number_features, g.num_edges)` of the graph `g`.
- `k`: the number of top features to return.
- `rev`: if `true`, sort in descending order otherwise returns the `k` smallest elements.
- `sortby`: the index of the feature to sort by. If `nothing`, every row independently.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this sentence is not clear

function _topk_matrix(matrix::AbstractArray, k::Int; rev::Bool = true, sortby::Union{Nothing, Int} = nothing)
if sortby === nothing
sorted_matrix = sort(matrix, dims = 2; rev)[:, 1:k]
vector_indices = map(x -> sortperm(x; rev), eachrow(matrix))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of sorting the whole matrix, it would be more efficient to use partialsortperm. I'm not sure is supported by CUDA.jl though

if g.num_graphs == 1
return _topk_matrix(feat, k; rev, sortby)
else
matrices = [feat[:, g.graph_indicator .== i] for i in 1:(g.num_graphs)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the masking would be different for edge feature

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants