Model diagrams for the GNN examples #556

Closed
code-rex1 opened this issue Apr 21, 2022 · 16 comments

Comments


code-rex1 commented Apr 21, 2022

❓ Questions and Help

This repo presents a couple of nice examples for GNNs.

I am particularly interested in the following:

Do you have the model architecture described somewhere as part of the tutorial or documentation?
Alternatively, do you have a canonical architecture described somewhere for these Graph2Seq-based models?
Is the model the same as in Graph2Seq: A Generalized Seq2Seq Model for Graph Inputs?

@code-rex1 changed the title from "Reference model diagrams for sample GNN example" to "Model diagrams for the GNN examples" on Apr 21, 2022
@code-rex1
Author

@AlanSwift @hugochan can you please help?

@AlanSwift
Contributor

Currently, we don't provide architecture diagrams for the specific applications. But we have visualized specific graph types, such as dependency graphs, in our survey paper.

@AlanSwift
Contributor

There are some differences.
We apply RNN or BERT encoding before the GNN to initialize the node embeddings. And we use separate attention: 1. attention on the node embeddings, 2. attention on the initial node embeddings. This is just an example. For more details, please refer to our docs.

@code-rex1
Author

@AlanSwift thanks for your response. But I can't find that much detail in the documentation.

I see that at first you generate the initial node embeddings using word2vec or BERT.

But your statement about the separate attention (1. attention on the node embeddings, 2. attention on the initial node embeddings) is not clear. Can you please elaborate? 🙏


smith-co commented Apr 22, 2022

@AlanSwift I am also a bit confused here. You said you use separate attention:

  1. attention on the node embeddings
  2. attention on the initial node embeddings

But the NMT example uses a GCN, and a GCN does not use attention. So I am lost here. Please elaborate so that I can understand the model a little better.

Thanks in advance for your help.

@AlanSwift
Contributor

Just an example:
encoder pipeline: RNN encoder --> GNN encoder
decoder pipeline: 1. attention on the RNN encoder results, 2. attention on the GNN encoder results, 3. fuse them
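
For concreteness, here is a minimal PyTorch sketch of that decoder step: one attention over the RNN-encoder (initial node) embeddings, a second attention over the GNN-encoder node embeddings, and a fusion of the two context vectors. The bilinear scoring, the concat-plus-linear fusion, and all names are illustrative assumptions, not Graph4NLP's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualAttentionDecoderStep(nn.Module):
    """One decoder step attending separately to two memories, then fusing them."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Separate (bilinear) score functions for the two memories.
        self.score_init = nn.Linear(hidden_size, hidden_size, bias=False)
        self.score_gnn = nn.Linear(hidden_size, hidden_size, bias=False)
        # Fuse the decoder state with both context vectors.
        self.fuse = nn.Linear(3 * hidden_size, hidden_size)

    @staticmethod
    def _attend(query, memory, score_proj):
        # query: (batch, hidden); memory: (batch, num_nodes, hidden)
        scores = torch.bmm(memory, score_proj(query).unsqueeze(-1)).squeeze(-1)
        weights = F.softmax(scores, dim=-1)                        # (batch, num_nodes)
        return torch.bmm(weights.unsqueeze(1), memory).squeeze(1)  # (batch, hidden)

    def forward(self, dec_state, init_node_emb, gnn_node_emb):
        # 1. attention on the RNN-encoder (initial node) embeddings
        ctx_init = self._attend(dec_state, init_node_emb, self.score_init)
        # 2. attention on the GNN-encoder node embeddings
        ctx_gnn = self._attend(dec_state, gnn_node_emb, self.score_gnn)
        # 3. fuse the two contexts with the current decoder state
        return torch.tanh(self.fuse(torch.cat([dec_state, ctx_init, ctx_gnn], dim=-1)))
```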

@smith-co

@AlanSwift this RNN encoder comes after the word2vec or BERT embedding. The documentation states:

For instance, for single-token item, w2v_bilstm strategy means we first use word2vec embeddings to initialize each item, and then apply a BiLSTM encoder to encode the whole graph (assuming the node order reserves the sequential order in raw text). 

I do not understand why/how the BiLSTM encoder is used to encode the whole graph. Can you please explain?

@smith-co

@AlanSwift this part is quite confusing. Why/how do you encode the whole graph with the BiLSTM?

Also for the decoder pipeline, you mentioned:

  1. attention on the RNN encoder results
  2. attention on the GNN encoder results
  3. fuse them

Has any other paper used this approach? Can you please point me to a reference paper?

I would also appreciate pointers to how it is done in the code.


AlanSwift commented Apr 25, 2022

word2vec, BiLSTM, BERT, etc. are used to initialize the node embeddings, which can enrich contextual information. This trick is widely used in NLP & GNN research, e.g. https://arxiv.org/pdf/1908.04942.pdf (only an example).

For technical details, please refer to the implementations.
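
For readers following along, a minimal sketch of the "w2v_bilstm" initialization described in the docs: look up a (word2vec-style) embedding for each node, then run a BiLSTM over the nodes in their raw-text order so each initial node embedding carries bidirectional context. The class and parameter names are illustrative, not the repo's actual API.

```python
import torch
import torch.nn as nn


class W2VBiLSTMNodeInit(nn.Module):
    """word2vec lookup followed by a BiLSTM over the node sequence."""

    def __init__(self, vocab_size: int, emb_dim: int, hidden_size: int):
        super().__init__()
        # In the real pipeline these weights would be loaded from pretrained word2vec.
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_size // 2,
                              batch_first=True, bidirectional=True)

    def forward(self, node_token_ids):
        # node_token_ids: (batch, num_nodes), nodes kept in their raw-text order.
        emb = self.word_emb(node_token_ids)   # (batch, num_nodes, emb_dim)
        init_node_emb, _ = self.bilstm(emb)   # (batch, num_nodes, hidden_size)
        # These contextualized vectors are the *initial* node embeddings
        # that the GNN encoder consumes next.
        return init_node_emb
```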


smith-co commented Apr 25, 2022

@AlanSwift I am not asking about using word2vec or BERT to initialize the node embeddings. I am asking why the BiLSTM is used after the word2vec or BERT embeddings are learned.

As you can see, the documentation states:

For instance, for single-token item, w2v_bilstm strategy means we first use word2vec embeddings to initialize each item, and then apply a BiLSTM encoder to encode the whole graph (assuming the node order reserves the sequential order in raw text). 

As per the documentation:

  1. learn word2vec embeddings to initialize each item
  2. then apply a BiLSTM encoder to encode the whole graph

I am asking about step 2.

@AlanSwift
Contributor

Because taking the bidirectional sequential information into account is beneficial for most NLP tasks.

@smith-co

@AlanSwift got it. But why do you say the BiLSTM encoder encodes the whole graph? I would think it is used to update the node embeddings. Isn't it?

Is the description incorrect?

@smith-co

@AlanSwift so the BiLSTM encoder is used to update the node embeddings. It appears to me the pipeline is:

  1. initialize node embedding with word2vec or BERT
  2. update using BiLSTM

Then feed them to the GCN encoder.

Is this understanding correct?

@smith-co

@AlanSwift I understand that bidirectional sequential information is beneficial for NLP tasks. But the BiLSTM encoder updates the initial word2vec/BERT word embeddings before they are fed to the GCN encoder.

So I am confused when you state that the BiLSTM encoder encodes the whole graph.

Would you please assist me with this question?

@AlanSwift
Contributor

Yes. It is correct.
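
Putting the confirmed pipeline together (word2vec/BERT lookup, BiLSTM update, then the GCN encoder), here is a minimal sketch. The dense-adjacency GCN layer and the helper names below are textbook-style assumptions for illustration, not Graph4NLP's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleGCNLayer(nn.Module):
    """Textbook GCN propagation, ReLU(A_hat @ X @ W), with a dense adjacency."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, node_emb, adj):
        # adj: (batch, num_nodes, num_nodes), assumed normalized with self-loops.
        return F.relu(torch.bmm(adj, self.linear(node_emb)))


def encode_graph(node_token_ids, adj, init_module, gcn_layers):
    # Steps 1 + 2: word2vec/BERT lookup and BiLSTM update
    # (e.g. the W2VBiLSTMNodeInit sketch earlier in this thread).
    h = init_module(node_token_ids)
    # Step 3: feed the BiLSTM-updated node embeddings to the GCN encoder.
    for layer in gcn_layers:
        h = layer(h, adj)
    return h  # final node embeddings, consumed by the Graph2Seq decoder's attention
```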

@AlanSwift
Contributor

This issue will be closed. Feel free to reopen it if needed.
