Too large state space, should i switch to continuous SS? #404

jensb-sdu · 2025-10-15T08:59:45Z


Hey guys, thanks for a super approachable GFN implementation!
I'm currently working on a PhD where we want to use GFNs to explore how best to apply treatments to a signal in order to classify it.

Background

Our state space is three-dimensional, with two of the dimensions being quite large. A state looks something like this: [[Treatment, Start, End], [Treatment, Start, End] ... ], where Treatment is the choice of treatment method, and Start and End define the range over which to apply that treatment. The problem seemed similar to the Scrabble example, so the proxy and environment implementations are based on that.

Challenge

All dimensions are discrete, but start and end have the following relation: $0 \leq start \lt end \leq N$, where $N \approx 2048$ for testing purposes but in reality could be higher.
This makes the PyTorch Categorical sampling function fail, as it "only" supports distributions with at most $2^{24}$ categories.

I originally took inspiration from the Scrabble example, but it seems that I may need a different approach. My concern is that the choices of start and end are in practice continuous as $N \rightarrow \infty$.


alexhernandezgarcia · 2025-10-15T19:07:58Z

Maintainer

Hi,

I'm glad to read that you're finding this implementation helpful!

To be able to try to provide a helpful answer, I still need to understand a couple of things:

You say that the state looks like [[Treatment, Start, End], [Treatment, Start, End] ... ]. Does this mean that each trajectory (each sample) consists of multiple triplets [Treatment, Start, End]? How many?
If N is about 2048, why do you need a categorical distribution on a support larger than $2^{24}$? It seems to me that the action space to select an action in 1, 2, ..., 2048 has size 2048 and the Categorical distribution would not be too large.

That said, I would still take advantage of the compositionality that GFlowNets can provide and break down the selection of that number into multiple steps. For example, divide the range [1, 2048] into a hierarchical selection with $k$ steps, each step refining the number.
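One possible reading of the hierarchical selection, as a minimal sketch rather than anything taken from the gflownet code base: write the number in mixed-radix digits and choose one digit per step, so that every step only needs a small categorical. The radices (16, 16, 8) and all function names are assumptions made for illustration, the values are 0-based (add 1 to cover [1, 2048]), and the uniform draw is only a stand-in for a learned policy.

```python
import random
from typing import List, Sequence

# Hypothetical split of 2048 values into three steps: 16 * 16 * 8 = 2048.
RADICES = (16, 16, 8)


def digits_to_index(digits: Sequence[int], radices: Sequence[int] = RADICES) -> int:
    """Combine per-step choices (most significant first) into one integer."""
    index = 0
    for digit, radix in zip(digits, radices):
        if not 0 <= digit < radix:
            raise ValueError(f"digit {digit} out of range for radix {radix}")
        index = index * radix + digit
    return index


def index_to_digits(index: int, radices: Sequence[int] = RADICES) -> List[int]:
    """Inverse of digits_to_index: recover the per-step choices."""
    digits = []
    for radix in reversed(radices):
        index, digit = divmod(index, radix)
        digits.append(digit)
    return digits[::-1]


def sample_index(rng: random.Random, radices: Sequence[int] = RADICES) -> int:
    """Choose one digit per step.

    The uniform draw is a placeholder: a trained policy would supply the
    probabilities of each digit, conditioned on the digits chosen so far.
    """
    digits = [rng.randrange(radix) for radix in radices]
    return digits_to_index(digits, radices)
```

With this split, the largest distribution any single step has to handle has 16 entries instead of 2048, at the cost of three steps per number.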

Also, I would take a look at the Stack meta-environment, which might be useful in your case, if you want to stack multiple sub-environments together.

1 reply

jensb-sdu Oct 16, 2025
Author

Hi,
Yes, each trajectory consists of multiple triplets, currently limited to 16 per trajectory. My specific issue is that when sample_actions_batch flattens the state space in order to sample from it, PyTorch throws an error regarding the size of the distribution when the Categorical object is sampled from.

As in the Scrabble example, I construct the action space as follows:

from typing import List, Tuple

def get_action_space(self) -> List[Tuple]:
    """
    Constructs list with all possible actions, including eos.

    An action is represented by a 3-element tuple indicating the index of the
    function to be added to the current sequence (state) and over what interval
    to apply that function.
    """
    function_actions = [self.token2idx[token] for token in self.functions]

    actions = []
    # window_size is the internal representation of N (looking at a windowed signal).
    for func in function_actions:
        for start in range(self.window_size - self.min_function_width):
            for end in range(start + self.min_function_width, self.window_size):
                actions.append((func, start, end))

    # The end-of-sequence action carries placeholder interval bounds.
    actions.append((self.eos_idx, -1, -1))
    return actions

This obviously becomes very large quickly, as demonstrated by:

>>> import math
>>> combs = [(func, start, end) for func in range(16) for start in range(0, 2048-64) for end  in range(start + 64, 2048)]
>>> len(combs)
31505920
>>> len(combs) - math.pow(2,24)
14728704.0
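The count printed above can also be checked in closed form. The symbols $F$ (number of functions) and $w$ (minimum function width) are labels introduced here for illustration; $N$ is the window size from the original post. For each start $s$ the loop admits $N - w - s$ values of end, so including the single eos action:

$$
|\mathcal{A}| = F \sum_{s=0}^{N-w-1} (N - w - s) + 1 = F \, \frac{(N-w)(N-w+1)}{2} + 1
$$

With $F = 16$, $N = 2048$ and $w = 64$:

$$
16 \cdot \frac{1984 \cdot 1985}{2} = 16 \cdot 1\,969\,120 = 31\,505\,920, \qquad 2^{24} = 16\,777\,216
$$

This matches the list length above (plus one once eos is appended) and exceeds $2^{24}$ by $14\,728\,704$. The count grows as $F N^2 / 2$, so doubling $N$ roughly quadruples the flat action space.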

I tried looking into the Stack environment, but it was a bit unclear how to properly write up the environment, so I wanted to start with a simple Scrabble-like implementation in order to better understand the repo and how it functions. I realize this isn't the optimal approach, but it was supposed to be quick and dirty, just to gain some understanding. (Sorry if that is a dumb move, but this is my first time implementing GFNs :) )

I would very much prefer doing the selection sequentially, as it reduces the search space and takes up less space since the action space is a lot smaller if each dimension is kept separate.
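To illustrate the size difference, here is a minimal sketch of selecting a triplet one dimension at a time. The constants mirror the REPL example above; the function names are hypothetical and not part of the gflownet API, and the uniform draws only stand in for a learned policy (uniform per-step draws are not uniform over triplets).

```python
import random
from typing import Tuple

N_FUNCTIONS = 16    # as in the REPL example above
WINDOW_SIZE = 2048  # N
MIN_WIDTH = 64      # minimum function width


def valid_starts() -> range:
    """Start positions that leave room for at least MIN_WIDTH samples."""
    return range(WINDOW_SIZE - MIN_WIDTH)


def valid_ends(start: int) -> range:
    """End positions allowed once the start has been fixed (acts as a mask)."""
    return range(start + MIN_WIDTH, WINDOW_SIZE)


def sample_triplet(rng: random.Random) -> Tuple[int, int, int]:
    """Pick function, then start, then end, each from its own small set."""
    func = rng.randrange(N_FUNCTIONS)
    start = rng.choice(valid_starts())
    end = rng.choice(valid_ends(start))  # only ends compatible with start
    return func, start, end


def largest_step_size() -> int:
    """Size of the largest distribution any single step has to sample from."""
    return max(N_FUNCTIONS, len(valid_starts()), len(valid_ends(0)))
```

Under these assumptions the largest per-step distribution has 1984 entries, compared with roughly 31.5 million for the flat action space.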

jensb-sdu · 2025-11-12T11:08:18Z

Author

So the solution was to split the distribution into subsets, each smaller than the maximum allowed size: compute the total probability of each subset and normalize within it, sample one candidate from each subset using a categorical distribution, and then pick among those candidates according to the original total probability of each subset.
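A minimal pure-Python sketch of this idea follows; it is not the code actually used, and the function name and signature are hypothetical. It picks the subset first and then samples inside it, which gives the same distribution as drawing one candidate from every subset and then choosing among the candidates by subset mass, while needing only two draws. With PyTorch tensors, the two draws could presumably each be a torch.multinomial call on a tensor with fewer than $2^{24}$ categories.

```python
import random
from typing import Sequence


def sample_chunked(weights: Sequence[float], max_chunk: int, rng: random.Random) -> int:
    """Sample an index with probability proportional to weights[index].

    No single categorical draw ever covers more than max_chunk entries
    (assuming the number of chunks itself stays below that limit).
    """
    if max_chunk < 1:
        raise ValueError("max_chunk must be at least 1")
    # Split the flat distribution into consecutive chunks.
    chunks = [weights[i:i + max_chunk] for i in range(0, len(weights), max_chunk)]
    # Total (unnormalized) probability mass of each chunk.
    chunk_mass = [sum(chunk) for chunk in chunks]
    if sum(chunk_mass) <= 0:
        raise ValueError("weights must have positive total mass")
    # Draw 1: choose a chunk in proportion to its total mass.
    chunk_id = rng.choices(range(len(chunks)), weights=chunk_mass, k=1)[0]
    # Draw 2: choose an entry inside that chunk; random.choices normalizes
    # the weights internally, so no explicit division is needed.
    chunk = chunks[chunk_id]
    offset = rng.choices(range(len(chunk)), weights=chunk, k=1)[0]
    return chunk_id * max_chunk + offset
```

The result is correct because the probability of index $i$ in chunk $c$ factorizes as $P(c) \cdot P(i \mid c)$, which is exactly what the two draws compute.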
