
Do the last FC layers get out of the Hyperbolic space? #3

Closed
dragon9001 opened this issue Sep 14, 2019 · 3 comments

@dragon9001

The network architecture begins with a few hyperbolic layers but ends with Euclidean layers, as stated in:

self.dense_1 = nn.Linear(feature_num, int(feature_num/2))

The full architecture was:

    def __init__(self, feature_num, word_embed, label_embed, hidden_size=5, if_gru=True, 
                 default_dtype=th.float64, **kwargs):
        super().__init__(**kwargs)
        
        self.hidden_size = hidden_size
        
        self.word_embed = gt.ManifoldParameter(word_embed, manifold=gt.PoincareBall())
        self.label_embed = gt.ManifoldParameter(label_embed, manifold=gt.PoincareBall())
        self.default_dtype = default_dtype
        
        if(if_gru):
            self.rnn = hyperGRU(input_size=word_embed.shape[1], hidden_size=self.hidden_size, 
                                default_dtype=self.default_dtype)
        else:
            self.rnn = hyperRNN(input_size=word_embed.shape[1], hidden_size=self.hidden_size, 
                                default_dtype=self.default_dtype)
        
        self.dense_1 = nn.Linear(feature_num, int(feature_num/2))
        self.dense_2 = nn.Linear(int(feature_num/2), 1)

How can we optimize with Riemannian SGD if not all of the parameters live in hyperbolic space?

@bcol23 (Owner) commented Sep 14, 2019

The final MLP uses the similarity scores to make the prediction, so its parameters are in Euclidean space and are updated via vanilla optimization methods.

Only the word embeddings and label embeddings (and the bias in the hyperbolic RNN/GRU) live in hyperbolic space and are updated with Riemannian optimization methods.
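
A minimal sketch of that split, assuming geoopt is imported as gt (as in the model code above) and net is a HyperIM instance; the helper name split_parameters is made up for illustration:

    import geoopt as gt

    # Hypothetical helper: report which of a module's parameters carry a
    # manifold. ManifoldParameters (the word/label embeddings and the
    # hyperbolic RNN/GRU bias) get Riemannian updates; the rest are Euclidean.
    def split_parameters(net):
        hyperbolic, euclidean = [], []
        for name, p in net.named_parameters():
            if isinstance(p, gt.ManifoldParameter):
                hyperbolic.append(name)
            else:
                euclidean.append(name)  # e.g. dense_1.weight, dense_2.bias
        return hyperbolic, euclidean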

bcol23 closed this as completed Sep 14, 2019
@dragon9001 (Author)

Does the Riemannian optimizer distinguish Euclidean parameters from hyperbolic parameters?
Because only one optimizer is used:

HyperIM/HyperIM.py, lines 41 to 45 in c257d1c:

net = HyperIM(word_num, word_embed, label_embed, hidden_size=embed_dim, if_gru=if_gru)
net.to(cuda_device)
loss = nn.BCEWithLogitsLoss()
optim = gt.optim.RiemannianAdam(net.parameters(), lr=lr)

@bcol23 (Owner) commented Sep 15, 2019

For parameters on the hyperbolic manifold, RiemannianAdam applies a Riemannian update, while the remaining parameters, which live in Euclidean space, effectively receive the vanilla Adam update.
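
A minimal standalone sketch of that dispatch (not HyperIM itself; the tensors and learning rate are made up for illustration), showing one RiemannianAdam instance updating both kinds of parameters:

    import torch
    import torch.nn as nn
    import geoopt as gt

    ball = gt.PoincareBall()
    # ManifoldParameter: a point on the Poincare ball (scaled to start inside it)
    hyp = gt.ManifoldParameter(torch.randn(4, 2) * 1e-2, manifold=ball)
    # Plain nn.Parameter: an ordinary Euclidean tensor
    euc = nn.Parameter(torch.randn(4, 2))

    # One optimizer handles both; it dispatches on each parameter's manifold.
    optim = gt.optim.RiemannianAdam([hyp, euc], lr=1e-2)

    loss = hyp.pow(2).sum() + euc.pow(2).sum()
    loss.backward()
    optim.step()  # hyp: Riemannian Adam step (retraction on the ball);
                  # euc: reduces to the standard Euclidean Adam step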

Repository owner locked and limited conversation to collaborators Apr 6, 2021