Easier environment definition #143
fix __repr__ of modules
…mplemented, and log_reward simply takes the log of reward() and clips by default.
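A minimal sketch of that default, assuming the `log_reward_clip` attribute added later in this PR (illustrative, not the exact library code):

```python
import torch

# Default log_reward as described above: take the log of reward() and
# clamp at log_reward_clip (defaulting to -100.0 per this PR) so that
# zero rewards do not produce -inf.
def log_reward(self, final_states):
    log_r = torch.log(self.reward(final_states))
    return log_r.clamp_min(self.log_reward_clip)
```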
Great PR! Thanks. Could you also run `pre-commit run --all` at the end? Some files will get modified.
src/gfn/states.py (outdated):
```diff
@@ -371,3 +384,55 @@ def _extend(masks, first_dim):

         self.forward_masks = _extend(self.forward_masks, required_first_dim)
         self.backward_masks = _extend(self.backward_masks, required_first_dim)

+    # The helper methods are convienience functions for common mask operations.
```
typo
src/gfn/states.py (outdated):
```python
def set_nonexit_masks(self, cond, allow_exit: bool = False):
    """Sets the allowable actions according to cond, appending the exit mask.

    A convienience function for common mask operations.
```
ditto
i just apparently can't spell this one :)
src/gfn/states.py (outdated):
```python
    A convienience function for common mask operations.

    Args:
        cond: a boolean of shape (batch_shape,) + (state_shape,), which
```
isn't this true only when `state_shape = n_actions - 1`? I think you meant `n_actions - 1` rather than `state_shape`.
you're right -- good catch
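A toy check of the corrected shape (illustrative values, not from the library):

```python
import torch

# With n_actions = 5 (4 non-exit actions + 1 exit action) and
# batch_shape = (3,), cond covers only the non-exit actions:
batch_shape, n_actions = (3,), 5
cond = torch.ones(batch_shape + (n_actions - 1,), dtype=torch.bool)
assert cond.shape == (3, 4)
```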
src/gfn/env.py (outdated):
```diff
@@ -184,12 +190,12 @@ def backward_step(
         return new_states

     def reward(self, final_states: States) -> TT["batch_shape", torch.float]:
-        """Either this or log_reward needs to be implemented."""
-        return torch.exp(self.log_reward(final_states))
+        """This (and potentially log_reward) needs to be implemented."""
```
why "and"?
fixed the docs
src/gfn/gym/hypergrid.py (outdated):
```python
            self.backward_masks,
        )

        self.set_default_typing()
        self.forward_masks[..., :-1] = self.tensor != env.height - 1
```
Do you think we should use `set_nonexit_masks` for this line?
makes sense, good catch!
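Something like the following, assuming `set_nonexit_masks(cond)` writes `cond` into the non-exit entries of `forward_masks` per its docstring (a sketch, not the final code):

```python
# Hypothetical replacement for the flagged line: set all non-exit
# forward masks from the condition, leaving the exit action to the
# allow_exit default.
self.set_nonexit_masks(self.tensor != env.height - 1)
```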
Also, it seems like there are GitHub actions now. Do you happen to know why the checks fail?
no, it's some kind of environment thing -- I'm going to fix that as part of this PR once and for all :)
… longer (I am not sure what introduced this bug)
…chgfn into easier_environment_definition
All comments fixed -- just waiting to see if the checks pass :)
LGTM
```diff
@@ -23,16 +23,21 @@ def __init__(
         sf: Optional[TT["state_shape", torch.float]] = None,
         device_str: Optional[str] = None,
         preprocessor: Optional[Preprocessor] = None,
+        log_reward_clip: Optional[float] = -100.0,
```
good idea!
```python
        self.is_discrete = True
        self.log_reward_clip = log_reward_clip
```
unnecessary
```diff
@@ -303,6 +303,19 @@ def __init__(
         self.forward_masks = cast(torch.Tensor, forward_masks)
         self.backward_masks = cast(torch.Tensor, backward_masks)

+        self.set_default_typing()

+    def set_default_typing(self) -> None:
```
great idea!
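Presumably `set_default_typing` factors out the `cast` calls shown in the hunk above; a sketch under that assumption (not the actual library code):

```python
from typing import cast

import torch

# Hypothetical body: re-cast the masks so static type checkers treat
# them as plain torch.Tensor, matching the cast() lines in __init__.
def set_default_typing(self) -> None:
    self.forward_masks = cast(torch.Tensor, self.forward_masks)
    self.backward_masks = cast(torch.Tensor, self.backward_masks)
```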
```python
        super().__init__(**kwargs)
        self.logZ_value = nn.Parameter(logZ_value)
```
The only place `BoxStateFlowModule` is used is in `train_box.py`:

```python
logZ = torch.tensor(0.0, device=env.device, requires_grad=True)

# We need a LogStateFlowEstimator
module = BoxStateFlowModule(
    input_dim=env.preprocessor.output_dim,
    output_dim=1,
    hidden_dim=args.hidden_dim,
    n_hidden_layers=args.n_hidden,
    torso=None,  # We do not tie the parameters of the flow function to PF
    logZ_value=logZ,
)
```

Naive pytorch question: why do we need `nn.Parameter`?
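For what it's worth, a minimal standalone illustration of the difference (toy classes, not from the library):

```python
import torch
from torch import nn

# nn.Parameter registers the tensor with the module, so optimizers built
# from module.parameters() will update it; a plain tensor attribute,
# even with requires_grad=True, is not registered.
class WithParam(nn.Module):
    def __init__(self):
        super().__init__()
        self.logZ = nn.Parameter(torch.tensor(0.0))

class WithTensor(nn.Module):
    def __init__(self):
        super().__init__()
        self.logZ = torch.tensor(0.0, requires_grad=True)

print(list(WithParam().parameters()))   # [Parameter containing tensor(0.)]
print(list(WithTensor().parameters()))  # [] -- invisible to the optimizer
```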
```diff
-    def true_reward(
-        self, final_states: DiscreteStates
-    ) -> TT["batch_shape", torch.float]:
+    def reward(self, final_states: DiscreteStates) -> TT["batch_shape", torch.float]:
```
Right! `true_reward` was useless.
src/gfn/states.py (outdated):
```diff
@@ -371,3 +384,55 @@ def _extend(masks, first_dim):

         self.forward_masks = _extend(self.forward_masks, required_first_dim)
         self.backward_masks = _extend(self.backward_masks, required_first_dim)

+    # The helper methods are convenience functions for common mask operations.
+    def set_nonexit_masks(self, cond, allow_exit: bool = False):
```
are there other places than `hypergrid.py` where this is used?
```python
            dim=-1,
        ).bool()

    def init_forward_masks(self, set_ones: bool = True):
```
perfect
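A sketch of what `init_forward_masks` plausibly does, assuming `batch_shape` and `n_actions` attributes (illustrative only, not the library code):

```python
import torch

# Hypothetical body: initialize forward masks to all-True (every action
# allowed) or all-False, before environment-specific constraints are set.
def init_forward_masks(self, set_ones: bool = True):
    shape = self.batch_shape + (self.n_actions,)
    self.forward_masks = torch.full(shape, set_ones, dtype=torch.bool)
```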
Tests pass!
This is a very important PR. Thank you @josephdviviano. Should this be merged to master or to stable?
- Changes to DiscreteEnv to make mask definition easier.
- Changes to the log_reward and reward methods.