
Issue search results · repo:sweetice/Deep-reinforcement-learning-with-pytorch language:Python


35 results


I met the error as follows: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 1]], which is output 0 of AsStridedBackward0, ...
  • VansWaston
  • 2
  • Opened on Jan 15
  • #50
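A minimal, hypothetical sketch of the usual cause of this RuntimeError (not taken from the repo's code): a tensor that autograd saved for the backward pass is overwritten in place before backward() runs. Replacing the in-place op with an out-of-place one (or cloning first) typically resolves it.

```python
import torch

# Minimal reproduction of this class of error (hypothetical example):
x = torch.randn(256, 1, requires_grad=True)
y = torch.sigmoid(x)   # sigmoid saves its output for the backward pass
# y += 1               # an in-place update here would trigger the RuntimeError
y = y + 1              # out-of-place version keeps the saved tensor intact
y.sum().backward()
```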

Hello everyone. I am planning to create my own environment in Python using my aircraft's specifications. Most of the code I see on GitHub uses pre-prepared environments from Gym. How can I use my own 6DOF ...
  • aminrbspace
  • 2
  • Opened on Jul 12, 2024
  • #48
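A hedged skeleton of a custom Gym environment for this kind of use case; the class name, the 12-dimensional state, and the 4 control inputs are illustrative assumptions, not anything from the repo or the issue. The old-style Gym API (step returning a 4-tuple) matches the era of this repo's scripts.

```python
import gym
import numpy as np
from gym import spaces

class Aircraft6DofEnv(gym.Env):
    """Hypothetical skeleton for a custom 6-DOF aircraft environment."""

    def __init__(self):
        super().__init__()
        # 12-D state (position, velocity, attitude, angular rates): an assumption
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(12,), dtype=np.float32)
        # 4 normalized control inputs (e.g. surfaces + throttle): an assumption
        self.action_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.state = np.zeros(12, dtype=np.float32)

    def reset(self):
        self.state = np.zeros(12, dtype=np.float32)
        return self.state

    def step(self, action):
        # plug the aircraft's own 6-DOF dynamics and reward shaping in here
        reward, done = 0.0, False
        return self.state, reward, done, {}
```

An instance of such a class can be used wherever the scripts currently call gym.make(...), since the agent code mainly needs reset(), step(), and the space shapes.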

state = torch.from_numpy(state).float().unsqueeze(0) reports this bug. The input state is shown in the attached image. The torch version is 1.11.0+cu113. Any suggestion is appreciated.
  • zhang-qiang-github
  • 1
  • Opened on Jun 3, 2024
  • #47
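The snippet above is truncated, but a common reason torch.from_numpy fails on an environment state is that env.reset() returns an (obs, info) tuple in newer gym versions. A hedged workaround sketch; the environment name is just an example, not from the issue.

```python
import gym
import numpy as np
import torch

env = gym.make("CartPole-v1")        # example environment, not from the issue
reset_result = env.reset()
# newer gym returns (obs, info) from reset(); older versions return obs only
state = reset_result[0] if isinstance(reset_result, tuple) else reset_result
state = torch.from_numpy(np.asarray(state, dtype=np.float32)).unsqueeze(0)
```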

In gridworld.py, line 77: self.position = [np.random.randint(tot_row), np.random.randint(tot_col)]. I think it should be modified to self.position = [np.random.randint(self.world_row), np.random.randint(self.world_col)] ...
  • Wei-yao-Cheng
  • Opened on Apr 7, 2024
  • #46

Screenshot attachment (Screenshot 2023-03-30 at 8 46 07 PM): https://user-images.githubusercontent.com/37682760/229017694-94dc0496-6a74-4576-963b-b360778476a8.png
  • CajetanRodrigues
  • 1
  • Opened on Mar 31, 2023
  • #45

If NotImplementedError is raised, just rename the methods of the NormalizedActions class: change _action to action and _reverse_action to reverse_action.
  • QinwenLuo
  • 2
  • Opened on Dec 6, 2022
  • #43
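For context, a sketch of what such a wrapper typically looks like after the rename; the rescaling body is illustrative and not necessarily the repo's exact code. Newer gym versions of ActionWrapper dispatch to action()/reverse_action(), so the old underscore-prefixed names are never called and the base class raises NotImplementedError.

```python
import gym
import numpy as np

class NormalizedActions(gym.ActionWrapper):
    # usage example (assumed): env = NormalizedActions(gym.make("Pendulum-v1"))

    def action(self, action):
        low, high = self.action_space.low, self.action_space.high
        # map an agent action in [-1, 1] to the environment's [low, high]
        return np.clip(low + (action + 1.0) * 0.5 * (high - low), low, high)

    def reverse_action(self, action):
        low, high = self.action_space.low, self.action_space.high
        # map an environment action in [low, high] back to [-1, 1]
        return np.clip(2.0 * (action - low) / (high - low) - 1.0, -1.0, 1.0)
```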

In sac.py, s = torch.tensor([t.s for t in self.replay_buffer]).float().to(device) fails with: Traceback (most recent call last): File D:\PycharmProject\Deep-reinforcement-learning-with-pytorch-master\Char09 SAC\SAC.py ...
  • aut6620
  • 3
  • Opened on May 23, 2022
  • #38
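The traceback is cut off above, but building a tensor directly from a Python list of per-transition numpy arrays is a frequent source of errors and warnings at exactly this line. A hedged sketch of the usual workaround: stack with numpy first. The Transition stand-in below is illustrative, not the repo's class.

```python
from collections import namedtuple
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Transition = namedtuple("Transition", ["s"])                    # stand-in for the buffer entries
replay_buffer = [Transition(np.zeros(3, dtype=np.float32)) for _ in range(4)]

# stack into a single ndarray first instead of torch.tensor(list_of_arrays)
s = torch.from_numpy(np.array([t.s for t in replay_buffer], dtype=np.float32)).to(device)
```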

The update of the value network should be: alpha_w = 1e-3 # initialization optimizer_w = optim.Adam(s_value_func.parameters(), lr=alpha_w) optimizer_w.zero_grad() policy_loss_w = -delta policy_loss_w.backward(retain_graph ...
  • hlhang9527
  • 3
  • Opened on Mar 21, 2022
  • #37
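The suggestion above is cut off. For readers who want the general shape of such an update, here is a generic, hedged sketch of a state-value (critic) update with Adam, using the squared TD error rather than the issue's exact (truncated) loss; all module and variable names are placeholders.

```python
import torch
import torch.nn as nn
import torch.optim as optim

s_value_func = nn.Linear(4, 1)            # placeholder state-value network v(s; w)
alpha_w = 1e-3                            # learning rate, as in the issue's snippet
optimizer_w = optim.Adam(s_value_func.parameters(), lr=alpha_w)

state = torch.randn(1, 4)                 # placeholder state
td_target = torch.tensor([[1.0]])         # r + gamma * v(s'), assumed computed elsewhere

delta = td_target - s_value_func(state)   # TD error
value_loss = delta.pow(2).mean()          # squared TD error (generic choice, not the issue's -delta)
optimizer_w.zero_grad()
value_loss.backward()
optimizer_w.step()
```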

log_prob should be multiplied by temperature factor (alpha) when calculating pi_loss in ALL implementations of SAC.
  • Darkness-hy
  • 1
  • Opened on Mar 11, 2022
  • #36
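A minimal sketch of the standard SAC actor loss the issue refers to; the tensor names and the alpha value are illustrative, not taken from the repo.

```python
import torch

alpha = 0.2                                    # temperature factor (illustrative value)
log_prob = torch.randn(256, 1)                 # log pi(a|s) for sampled actions (placeholder)
q_value = torch.randn(256, 1)                  # Q(s, a), e.g. min of twin critics (placeholder)

# standard SAC actor objective: minimize E[alpha * log_prob - Q(s, a)]
pi_loss = (alpha * log_prob - q_value).mean()  # omitting alpha leaves the entropy term unscaled
```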

In dist = Normal(mu, sigma), sigma should be a positive value, but the actor_net output can be negative, so action_log_prob = dist.log_prob(action) can be nan. Try: import torch a = torch.FloatTensor([1]).cuda() ...
  • Vinson-sheep
  • 3
  • Opened on Feb 16, 2022
  • #35
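The issue's own snippet is truncated; below is a small sketch of the fix it implies, i.e. forcing the scale to be strictly positive before building the Normal. Softplus is one common choice; exponentiating a log-std head is another. Values are placeholders.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Normal

mu = torch.tensor([0.3])                  # placeholder actor_net mean output
raw_sigma = torch.tensor([-1.2])          # raw network output; may be negative
sigma = F.softplus(raw_sigma) + 1e-6      # strictly positive scale avoids NaN log-probs
dist = Normal(mu, sigma)
action = dist.sample()
action_log_prob = dist.log_prob(action)   # finite now that sigma > 0
```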