Hello authors. I have a question about the construction of the semantic view in Section 4.1 of the paper, where an aggregation process is first used to build representations of item and entity nodes (Equation 1). I am a bit confused about the relationship between Equation 1 and Equation 5, since e_i^{(k+1)} appears in both of them. Is Equation 1 involved in the calculation of the losses later?
In addition, I noticed that the authors use the top-k function in their code to get the highest scoring edges:
knn_val, knn_ind = torch.topk(sim, topk, dim=-1)
But can the top-k process be differentiated? Wouldn't this break the gradient of the preceding computation?
Thank you for your interest.
Regarding the relationship between Equation 1 and Equation 5: Equation 1 describes the aggregation used to construct the item-item semantic graph (i.e., to generate the item embeddings), whereas Equation 5 describes the aggregation on the collaborative graph (i.e., the user-item graph).
As for the kNN sparsification: it remains differentiable because the similarities are computed from the item embeddings aggregated by Equation 1. The top-k function then uses those aggregated embeddings for graph construction, and selecting only the top-k highest-scoring edges serves to reduce the computational cost. This does not cause the "gradient break" you mention, because gradients still propagate through the values of the selected edges; only the index selection itself is non-differentiable. (You could refer to this paper for kNN sparsification: Mining Latent Structures for Multimedia Recommendation, MM 2021, or other related kNN papers.)
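To make this concrete, here is a minimal, self-contained sketch (with hypothetical toy tensor sizes, not the paper's actual code) showing that gradients flow back through the values returned by torch.topk, even though the returned indices themselves carry no gradient:

```python
import torch

# Toy item embeddings that require gradients (sizes are illustrative).
emb = torch.randn(5, 4, requires_grad=True)

# Pairwise similarity matrix between items.
sim = emb @ emb.t()

# kNN sparsification: keep the top-2 highest-scoring edges per item.
# knn_val carries gradients; knn_ind is an integer tensor with none.
knn_val, knn_ind = torch.topk(sim, 2, dim=-1)

# Backpropagate through the selected edge weights only.
knn_val.sum().backward()

# emb.grad is populated: the selection acts like a (piecewise-constant)
# mask, so the chain of computation before topk is not broken.
print(emb.grad is not None)
```

In other words, topk behaves like indexing with a stop-gradient on the indices: the k selected entries pass gradients through unchanged, and the discarded entries simply receive zero gradient.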