Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Examples/minimal rl py #1205

Merged
merged 10 commits into from May 10, 2021
Merged

Examples/minimal rl py #1205

merged 10 commits into from May 10, 2021

Conversation

Scitator
Copy link
Member

@Scitator Scitator commented May 7, 2021

Before submitting (checklist)

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contribution guide?
  • Did you check the code style? catalyst-make-codestyle && catalyst-check-codestyle (pip install -U catalyst-codestyle).
  • Did you make sure to update the docs? We use Google format for all the methods and classes.
  • Did you check the docs with make check-docs?
  • Did you write any new necessary tests?
  • Did you check that your code passes the unit tests pytest . ?
  • Did you add your new functionality to the docs?
  • Did you update the CHANGELOG?
  • Did you run colab minimal CI/CD with latest and minimal requirements?

Description

Related Issue

Type of Change

  • Examples / docs / tutorials / contributors update
  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves an existing feature)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

FAQ

Please review the FAQ before submitting an issue:

@Scitator Scitator requested review from bagxi and ditwoo as code owners May 7, 2021 05:48
@pep8speaks
Copy link

pep8speaks commented May 7, 2021

Hello @Scitator! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 17:76: E231 missing whitespace after ','

Comment last updated at 2021-05-10 06:14:18 UTC

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit

pep8

examples/reinforcement_learning/dqn.py|45 col 1| D101 Missing docstring in public class
examples/reinforcement_learning/dqn.py|46 col 64| B008 Do not perform function calls in argument defaults. The call is performed only once at function definition time. All calls to your function will reuse the result of that definition-time function call. If this is intended, assign the function call to a module-level variable and use that variable as a default value.
examples/reinforcement_learning/dqn.py|50 col 1| D105 Missing docstring in magic method
examples/reinforcement_learning/dqn.py|55 col 1| D105 Missing docstring in magic method
examples/reinforcement_learning/dqn.py|68 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/dqn.py|79 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/dqn.py|89 col 9| B007 Loop control variable 't' not used within the loop body. If this is intended, start the name with an underscore.
examples/reinforcement_learning/dqn.py|105 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/dqn.py|114 col 9| B007 Loop control variable 'i_episone' not used within the loop body. If this is intended, start the name with an underscore.
examples/reinforcement_learning/dqn.py|123 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/dqn.py|144 col 1| D101 Missing docstring in public class
examples/reinforcement_learning/dqn.py|164 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/dqn.py|185 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/dqn.py|204 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/dqn.py|222 col 1| D101 Missing docstring in public class
examples/reinforcement_learning/dqn.py|243 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/dqn.py|251 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/reinforce.py|16 col 76| E231 missing whitespace after ','
examples/reinforcement_learning/reinforce.py|16 col 77| C819 trailing comma prohibited
examples/reinforcement_learning/reinforce.py|19 col 1| D101 Missing docstring in public class
examples/reinforcement_learning/reinforce.py|24 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/reinforce.py|27 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/reinforce.py|36 col 1| D105 Missing docstring in magic method
examples/reinforcement_learning/reinforce.py|42 col 1| D101 Missing docstring in public class
examples/reinforcement_learning/reinforce.py|46 col 1| D105 Missing docstring in magic method
examples/reinforcement_learning/reinforce.py|52 col 1| D105 Missing docstring in magic method
examples/reinforcement_learning/reinforce.py|59 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/reinforce.py|60 col 6| N806 variable 'G' in function should be lowercase
examples/reinforcement_learning/reinforce.py|67 col 1| D210 No whitespaces allowed surrounding docstring text
examples/reinforcement_learning/reinforce.py|75 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/reinforce.py|83 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/reinforce.py|94 col 9| B007 Loop control variable 't' not used within the loop body. If this is intended, start the name with an underscore.
examples/reinforcement_learning/reinforce.py|113 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/reinforce.py|122 col 9| B007 Loop control variable 'i_episone' not used within the loop body. If this is intended, start the name with an underscore.
examples/reinforcement_learning/reinforce.py|131 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/reinforce.py|152 col 1| D101 Missing docstring in public class
examples/reinforcement_learning/reinforce.py|158 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/reinforce.py|167 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/reinforce.py|180 col 1| D101 Missing docstring in public class
examples/reinforcement_learning/reinforce.py|188 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/reinforce.py|204 col 10| N806 variable 'J_hat' in function should be lowercase
examples/reinforcement_learning/reinforce.py|232 col 88| C819 trailing comma prohibited
examples/reinforcement_learning/reinforce.py|247 col 72| C819 trailing comma prohibited

)


class ReplayBuffer:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D101 Missing docstring in public class

def __init__(self, capacity: int):
self.buffer = deque(maxlen=capacity)

def append(self, transition: Transition):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D102 Missing docstring in public method

def append(self, transition: Transition):
self.buffer.append(transition)

def sample(self, size: int) -> Sequence[np.array]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D102 Missing docstring in public method

)
return states, actions, rewards, dones, next_states

def __len__(self) -> int:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D105 Missing docstring in magic method


# as far as RL does not have some predefined dataset,
# we need to specify epoch length by ourselfs
class ReplayDataset(IterableDataset):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D101 Missing docstring in public class

),
}

runner = CustomRunner(gamma=gamma, tau=tau, tau_period=tau_period,)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
C819 trailing comma prohibited

)


class ReplayBuffer:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D101 Missing docstring in public class

def __init__(self, capacity: int):
self.buffer = deque(maxlen=capacity)

def append(self, transition: Transition):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D102 Missing docstring in public method

def append(self, transition: Transition):
self.buffer.append(transition)

def sample(self, size: int) -> Sequence[np.array]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D102 Missing docstring in public method

)
return states, actions, rewards, dones, next_states

def __len__(self) -> int:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D105 Missing docstring in magic method

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit

pep8

examples/reinforcement_learning/reinforce.py|83 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/reinforce.py|94 col 9| B007 Loop control variable 't' not used within the loop body. If this is intended, start the name with an underscore.
examples/reinforcement_learning/reinforce.py|113 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/reinforce.py|122 col 9| B007 Loop control variable 'i_episone' not used within the loop body. If this is intended, start the name with an underscore.
examples/reinforcement_learning/reinforce.py|131 col 1| D103 Missing docstring in public function
examples/reinforcement_learning/reinforce.py|152 col 1| D101 Missing docstring in public class
examples/reinforcement_learning/reinforce.py|158 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/reinforce.py|167 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/reinforce.py|180 col 1| D101 Missing docstring in public class
examples/reinforcement_learning/reinforce.py|188 col 1| D102 Missing docstring in public method
examples/reinforcement_learning/reinforce.py|204 col 10| N806 variable 'J_hat' in function should be lowercase
examples/reinforcement_learning/reinforce.py|232 col 88| C819 trailing comma prohibited
examples/reinforcement_learning/reinforce.py|247 col 72| C819 trailing comma prohibited


# as far as RL does not have some predefined dataset,
# we need to specify epoch length by ourselfs
class ReplayDataset(IterableDataset):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D101 Missing docstring in public class

# as far as RL does not have some predefined dataset,
# we need to specify epoch length by ourselfs
class ReplayDataset(IterableDataset):
def __init__(self, buffer: ReplayBuffer, epoch_size: int = int(1e3)):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
B008 Do not perform function calls in argument defaults. The call is performed only once at function definition time. All calls to your function will reuse the result of that definition-time function call. If this is intended, assign the function call to a module-level variable and use that variable as a default value.

self.buffer = buffer
self.epoch_size = epoch_size

def __iter__(self) -> Iterator[Sequence[np.array]]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D105 Missing docstring in magic method

for i in range(len(dones)):
yield states[i], actions[i], rewards[i], dones[i], next_states[i]

def __len__(self) -> int:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D105 Missing docstring in magic method

# DQN


def get_action(env, network: nn.Module, state: np.array, epsilon: float = -1) -> int:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D103 Missing docstring in public function

yield states, actions, rewards
self.buffer.buffer.clear()

def __len__(self) -> int:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D105 Missing docstring in magic method

# REINFORCE


def get_cumulative_rewards(rewards, gamma=0.99):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D103 Missing docstring in public function



def get_cumulative_rewards(rewards, gamma=0.99):
G = [rewards[-1]]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
N806 variable 'G' in function should be lowercase



def to_one_hot(y, n_dims=None):
""" Takes an integer vector and converts it to 1-hot matrix. """
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D210 No whitespaces allowed surrounding docstring text

return y_one_hot


def get_action(env, network: nn.Module, state: np.array, epsilon: float = -1) -> int:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D103 Missing docstring in public function

return int(action)


def generate_session(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D103 Missing docstring in public function

states, actions, rewards = [], [], []
state = env.reset()

for t in range(t_max):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
B007 Loop control variable 't' not used within the loop body. If this is intended, start the name with an underscore.

return total_reward, t


def generate_sessions(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D103 Missing docstring in public function

num_sessions: int = 100,
) -> tp.Tuple[float, int]:
sessions_reward, sessions_steps = 0, 0
for i_episone in range(num_sessions):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
B007 Loop control variable 'i_episone' not used within the loop body. If this is intended, start the name with an underscore.

return sessions_reward, sessions_steps


def get_network(env, num_hidden: int = 128):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D103 Missing docstring in public function

runner.epoch_metrics["_epoch_"]["v_reward"] = valid_rewards


class CustomRunner(dl.Runner):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D101 Missing docstring in public class

self.gamma: float = gamma
self._initialized = False

def handle_batch(self, batch: tp.Sequence[np.array]):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
D102 Missing docstring in public method

n_actions = probas.shape[1]
logprobas_for_actions = torch.sum(logprobas * to_one_hot(actions, n_dims=n_actions), dim=1)

J_hat = torch.mean(logprobas_for_actions * cumulative_returns)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
N806 variable 'J_hat' in function should be lowercase

model = get_network(env)
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
loaders = {
"train_game": DataLoader(RolloutDataset(rollout_buffer), batch_size=batch_size,),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
C819 trailing comma prohibited

valid_metric="v_reward",
minimize_valid_metric=False,
load_best_on_end=True,
callbacks=[GameCallback(env=env, rollout_buffer=rollout_buffer,)],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
C819 trailing comma prohibited

examples/reinforcement_learning/ddpg.py Show resolved Hide resolved
examples/reinforcement_learning/ddpg.py Outdated Show resolved Hide resolved
examples/reinforcement_learning/ddpg.py Outdated Show resolved Hide resolved
examples/reinforcement_learning/ddpg.py Outdated Show resolved Hide resolved
def get_action(
env, network: nn.Module, state: np.array, sigma: Optional[float] = None
) -> np.array:
state = torch.tensor(state, dtype=torch.float32).unsqueeze(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add info about the expected shapes of the inputs?

class GameCallback(dl.Callback):
def __init__(
self,
*,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the main reasons to force users to use key-value?

examples/reinforcement_learning/ddpg.py Outdated Show resolved Hide resolved
examples/reinforcement_learning/ddpg.py Show resolved Hide resolved
examples/reinforcement_learning/ddpg.py Outdated Show resolved Hide resolved
examples/reinforcement_learning/ddpg.py Outdated Show resolved Hide resolved
@Scitator Scitator merged commit 6aaf794 into master May 10, 2021
@mergify mergify bot deleted the examples/minimal-rl-py branch May 10, 2021 06:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants