Hey guys, thanks for a super approachable GFN implementation!

Background

Our state space is 3-dimensional, with two of the dimensions being quite large. A state looks something like this: [[Treatment, Start, End], [Treatment, Start, End] ... ], where Treatment is the choice of treatment method, and Start and End define the range over which to apply that treatment. The problem seemed similar to the scrabble example, so the proxy and environment implementations are based on that.

Challenge

All dimensions are discrete, but Start and End have the following relation: I originally took inspiration from the scrabble example, but it seems that I may need a different approach. My concern is that the choices of Start and End are in practice continuous as
Replies: 2 comments 1 reply
Hi, I'm glad to read that you're finding this implementation helpful! To give a helpful answer, I still need to understand a couple of things:
That said, I would still take advantage of the compositionality that GFlowNets provide and break the selection of that number down into multiple steps. For example, divide the range [1, 2048] into a hierarchical selection with N steps, each step refining the number. I would also take a look at the Stack meta-environment, which might be useful in your case if you want to stack multiple sub-environments together.
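To make the hierarchical idea concrete, here is a minimal sketch of how a number in [1, 2048] could be built from N small choices instead of one choice among 2048 options. It assumes a fixed branching factor (2 choices per step, 11 steps, since 2^11 = 2048); the function names `encode_steps` and `decode_steps` are hypothetical and are not part of the gflownet library.

```python
def decode_steps(choices, base):
    """Turn a sequence of per-step choices (each in 0..base-1) into a number.

    Each step refines the number: the first choice picks the coarsest
    sub-range, the last choice picks the exact value inside it.
    The result is shifted by 1 so it lies in [1, base ** len(choices)].
    """
    value = 0
    for choice in choices:
        if not 0 <= choice < base:
            raise ValueError("each choice must be in [0, base)")
        value = value * base + choice
    return value + 1


def encode_steps(number, base, n_steps):
    """Inverse of decode_steps: the sequence of choices that yields `number`."""
    if not 1 <= number <= base ** n_steps:
        raise ValueError("number is outside the representable range")
    value = number - 1
    choices = []
    for _ in range(n_steps):
        choices.append(value % base)
        value //= base
    # Digits were produced least-significant first, so reverse them.
    return choices[::-1]


# Hypothetical setup: 11 binary steps cover exactly the range [1, 2048].
lowest = decode_steps([0] * 11, base=2)
highest = decode_steps([1] * 11, base=2)
```

With this kind of encoding, each action only chooses among `base` options, at the cost of longer trajectories. Other branching factors trade trajectory length against action-space size in the same way.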
So the solution was to split the distribution into subsets smaller than the maximum allowed size, calculate the total probability of each subset, and normalize it. Then sample within each subset using that subset's categorical distribution, and pick the final result among the subsets based on the original probability of each subset.
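The steps above can be sketched roughly as follows. This is a plain-Python illustration that uses only the standard library, not the code actually used in the project; the function name `chunked_categorical_sample` and the `max_size` parameter are hypothetical, and `max_size` stands in for whatever category limit the sampler imposes.

```python
import random


def chunked_categorical_sample(weights, max_size, rng=random):
    """Draw one index from an unnormalized categorical distribution.

    The distribution is split into chunks of at most `max_size` entries,
    one candidate is sampled inside each chunk, and the final result is
    chosen among the candidates according to each chunk's total mass.
    """
    if max_size < 1:
        raise ValueError("max_size must be at least 1")

    # Split the weights into consecutive chunks and record each chunk's mass.
    chunks = [weights[i:i + max_size] for i in range(0, len(weights), max_size)]
    masses = [sum(chunk) for chunk in chunks]

    # Sample one candidate per chunk (skipping chunks with zero mass,
    # which can never be selected in the final step anyway).
    candidates = []
    for k, chunk in enumerate(chunks):
        if masses[k] > 0:
            local = rng.choices(range(len(chunk)), weights=chunk, k=1)[0]
            candidates.append(k * max_size + local)
        else:
            candidates.append(None)

    # Pick which chunk's candidate to keep, weighted by the chunk masses.
    winner = rng.choices(range(len(chunks)), weights=masses, k=1)[0]
    return candidates[winner]


# Hypothetical usage: six categories, chunks of at most two entries each.
sample = chunked_categorical_sample([0.1, 0.2, 0.3, 0.1, 0.2, 0.1], max_size=2)
```

Choosing the chunk first and then sampling only inside that chunk gives the same distribution and avoids sampling from the chunks that are discarded.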