
CudnnRNN is not differentiable twice #1

Closed
erschmidt opened this issue Dec 18, 2017 · 4 comments

@erschmidt

Hi,

First of all thank you very much for providing this great repository!

I am currently implementing GAIL using an MLP as well as an RNN policy net (two separate experiments). The MLP network works as intended, but if I switch to the RNN policy I get a RuntimeError: CudnnRNN is not differentiable twice during execution of this line in core.trpo.Fvp_fim:

Jv = torch.autograd.grad(Jtv, t, retain_graph=True)[0]

The only difference between my MLP and RNN policy implementation is the initialization of the hidden state during get_log_prob and get_fim within my policy class.

Given your recent commit (d66765eecad38ddc3f6e0f33d35ef70a7ed11892) I thought the network would only be differentiated once during TRPO.

Am I doing something wrong, or is the network still being differentiated twice?

Thank you very much!
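[Editor's note: a common workaround for this error, not mentioned in this thread, is to disable cuDNN around the RNN so PyTorch falls back to its pure-Python RNN kernels, which do support double backward. A minimal sketch with an illustrative GRU (shapes and names are assumptions, not this repo's API):

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=4, hidden_size=8, batch_first=True)
x = torch.randn(2, 5, 4, requires_grad=True)

# Disabling cuDNN forces the non-fused RNN implementation, which
# supports double backward (the fused cuDNN kernels do not).
with torch.backends.cudnn.flags(enabled=False):
    out, h = gru(x)

loss = out.pow(2).sum()
# First derivative, keeping the graph so we can differentiate again.
g = torch.autograd.grad(loss, x, create_graph=True)[0]
# Second derivative: on GPU with cuDNN enabled, this is the step that
# raises "CudnnRNN is not differentiable twice".
g2 = torch.autograd.grad(g.sum(), x)[0]
```

This trades cuDNN's speed for differentiability, so it is only worth it on the pass that actually needs second derivatives.]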

@Khrylx
Owner

Khrylx commented Dec 18, 2017

Hi,

I solved this problem by switching to CUDA 9.0 and reinstalling PyTorch. Another thing is to use LSTMCell instead of LSTM.

Best,
Ye
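[Editor's note: switching from nn.LSTM to nn.LSTMCell means unrolling the sequence manually, one timestep at a time. A minimal sketch (sizes are illustrative):

```python
import torch
import torch.nn as nn

cell = nn.LSTMCell(input_size=4, hidden_size=8)
x = torch.randn(2, 5, 4)      # (batch, seq, features)
h = torch.zeros(2, 8)         # initial hidden state
c = torch.zeros(2, 8)         # initial cell state

outputs = []
for t in range(x.size(1)):
    # One step of the recurrence per timestep.
    h, c = cell(x[:, t], (h, c))
    outputs.append(h)
out = torch.stack(outputs, dim=1)  # (batch, seq, hidden)
```

The cell-based unroll avoids the fused multi-step kernels, at the cost of a Python loop over the sequence.]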

@erschmidt
Author

Thanks for the fast feedback.

Sadly, still no luck. I switched my GRU layer to a GRUCell, which only changed the error to RuntimeError: GRUFused is not differentiable twice.

Since I'm working in an environment where I can't easily change the CUDA version (currently 7.5), using a different CUDA version is not an option. Are you sure this would solve the problem?
The relevant functions of my policy look like this:

def forward(self, inputs):
    x = self.hidden_activation(self.input_layer(inputs))

    # Hidden layers
    for hidden_layer in self.hidden_layers:
        x = self.hidden_activation(hidden_layer(x))

    # GRUCell: unroll manually over the sequence dimension
    outputs = []
    for seq in range(x.size(1)):
        self.hidden = self.gru(x[:, seq], self.hidden)
        outputs.append(self.hidden)
    x = torch.stack(outputs, 1)

    # Output layer
    action_mean = self.output_layer(x)
    action_log_std = self.a_logstd.expand_as(action_mean)
    action_std = torch.exp(action_log_std)

    return action_mean, action_log_std, action_std

def get_log_prob(self, x, actions):
    self.hidden = self.init_hidden(x.size(0))
    action_mean, action_log_std, action_std = self.forward(x)
    return normal_log_density(actions, action_mean, action_log_std, action_std, is_recurrent=True)

def get_fim(self, x):
    self.hidden = self.init_hidden(x.size(0))
    mean, _, _ = self.forward(x)
    cov_inv = self.a_logstd.data.exp().pow(-2).squeeze(0).repeat(x.size(0))
    param_count = 0
    std_index = 0
    std_id = 0  # initialize so it is defined even if "a_logstd" is not found
    for i, (name, param) in enumerate(self.named_parameters()):
        if name == "a_logstd":
            std_id = i
            std_index = param_count
        param_count += param.data.view(-1).shape[0]
    return cov_inv, mean, {'std_id': std_id, 'std_index': std_index}

I'm really hoping to solve this issue since I need to implement this using an RNN policy.
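[Editor's note: for context on why TRPO needs double backward at all (an assumption about core.trpo.Fvp_fim's role, based on the line quoted above): the Fisher-vector product is computed by differentiating through gradients, so every op in the policy must support second derivatives. A toy Hessian-vector product showing the pattern:

```python
import torch

# Toy "parameters" and a direction vector v.
params = torch.randn(3, requires_grad=True)
v = torch.randn(3)

loss = (params ** 2).sum()
# First backward with create_graph=True so the gradient itself is
# part of the autograd graph.
grads = torch.autograd.grad(loss, params, create_graph=True)[0]
gv = (grads * v).sum()
# Second backward: this is the step that fused RNN kernels cannot do.
hv = torch.autograd.grad(gv, params)[0]
```

Here loss = sum(p^2) has Hessian 2I, so hv should equal 2v.]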

@Khrylx
Owner

Khrylx commented Dec 18, 2017

I’m pretty positive that changing the CUDA version will solve the problem if you are using GRUCell since that was my case and I didn’t change a single line of code. Alternatively, you can use PPO instead of TRPO, which should give you similar performance.
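[Editor's note: PPO sidesteps the problem entirely because its clipped surrogate objective needs only first derivatives, with no Fisher-vector products. A minimal sketch of the standard clipped loss (function name and signature are illustrative, not this repo's API):

```python
import torch

def ppo_clip_loss(log_probs, old_log_probs, advantages, eps=0.2):
    # Probability ratio between the new and old policy.
    ratio = torch.exp(log_probs - old_log_probs)
    surr1 = ratio * advantages
    surr2 = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Negate: we minimize the loss to maximize the clipped objective.
    return -torch.min(surr1, surr2).mean()
```

With identical old and new log-probs the ratio is 1, so the loss reduces to minus the mean advantage.]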

@erschmidt
Author

I did change from TRPO to PPO. I will compare results, but it seems to train fine. Thank you!
