
Do the last FC layers get out of the Hyperbolic space? #3

Closed
dragon9001 opened this issue Sep 14, 2019 · 3 comments

@dragon9001

The network architecture begins with a few hyperbolic layers but ends with Euclidean layers, as stated in:

self.dense_1 = nn.Linear(feature_num, int(feature_num/2))

The full architecture was:

    def __init__(self, feature_num, word_embed, label_embed, hidden_size=5, if_gru=True, 
                 default_dtype=th.float64, **kwargs):
        super().__init__(**kwargs)
        
        self.hidden_size = hidden_size
        
        self.word_embed = gt.ManifoldParameter(word_embed, manifold=gt.PoincareBall())
        self.label_embed = gt.ManifoldParameter(label_embed, manifold=gt.PoincareBall())
        self.default_dtype = default_dtype
        
        if(if_gru):
            self.rnn = hyperGRU(input_size=word_embed.shape[1], hidden_size=self.hidden_size, 
                                default_dtype=self.default_dtype)
        else:
            self.rnn = hyperRNN(input_size=word_embed.shape[1], hidden_size=self.hidden_size, 
                                default_dtype=self.default_dtype)
        
        self.dense_1 = nn.Linear(feature_num, int(feature_num/2))
        self.dense_2 = nn.Linear(int(feature_num/2), 1)

How can we optimize with Riemannian SGD if not all of the parameters live in hyperbolic space?

@bcol23 (Owner) commented Sep 14, 2019

The final MLP uses the similarity scores to make the prediction, so its parameters are in Euclidean space and are updated via vanilla optimization methods.

Only the word embeddings and label embeddings (and the bias in the hyperbolic RNN/GRU) live in hyperbolic space and are updated with Riemannian optimization methods.
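
A minimal sketch of that split, assuming geoopt is imported as gt (as in the model code above) and net is a HyperIM instance; the helper name split_parameters is made up for illustration:

    import geoopt as gt

    # Hypothetical helper: report which of a module's parameters carry a
    # manifold. ManifoldParameters (the word/label embeddings and the
    # hyperbolic RNN/GRU bias) get Riemannian updates; the rest are Euclidean.
    def split_parameters(net):
        hyperbolic, euclidean = [], []
        for name, p in net.named_parameters():
            if isinstance(p, gt.ManifoldParameter):
                hyperbolic.append(name)
            else:
                euclidean.append(name)  # e.g. dense_1.weight, dense_2.bias
        return hyperbolic, euclidean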

bcol23 closed this as completed Sep 14, 2019
@dragon9001 (Author)

Does the Riemannian optimizer distinguish Euclidean parameters from hyperbolic parameters?
Because only one optimizer is used:

HyperIM/HyperIM.py, lines 41 to 45 in c257d1c:

net = HyperIM(word_num, word_embed, label_embed, hidden_size=embed_dim, if_gru=if_gru)
net.to(cuda_device)
loss = nn.BCEWithLogitsLoss()
optim = gt.optim.RiemannianAdam(net.parameters(), lr=lr)

@bcol23 (Owner) commented Sep 15, 2019

For parameters on the hyperbolic manifold, RiemannianAdam applies a Riemannian update, while the remaining parameters, which live in Euclidean space, effectively receive the vanilla Adam update.
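
A minimal standalone sketch of that dispatch (not HyperIM itself; the tensors and learning rate are made up for illustration), showing one RiemannianAdam instance updating both kinds of parameters:

    import torch
    import torch.nn as nn
    import geoopt as gt

    ball = gt.PoincareBall()
    # ManifoldParameter: a point on the Poincare ball (scaled to start inside it)
    hyp = gt.ManifoldParameter(torch.randn(4, 2) * 1e-2, manifold=ball)
    # Plain nn.Parameter: an ordinary Euclidean tensor
    euc = nn.Parameter(torch.randn(4, 2))

    # One optimizer handles both; it dispatches on each parameter's manifold.
    optim = gt.optim.RiemannianAdam([hyp, euc], lr=1e-2)

    loss = hyp.pow(2).sum() + euc.pow(2).sum()
    loss.backward()
    optim.step()  # hyp: Riemannian Adam step (retraction on the ball);
                  # euc: reduces to the standard Euclidean Adam step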

Repository owner locked and limited conversation to collaborators Apr 6, 2021