Patching code unsuitable for batch inference #123

@gahdritz

Hello! The code that runs the entropy model when computing patch boundaries appears to have a bug.

I ran into the issue while testing a slight modification of demo.py for batch inference. I essentially just changed the line:

prompts = [prompt]

to

prompts = [prompt] * 10

I then ran python3 demo.py "A BLT has" after downloading the usual HF weights. Even though greedy decoding is enabled by default, I unexpectedly ended up with 10 very different generations.

The issue seems to originate in this function in patcher.py, e.g. in these lines:

        max_length = getattr(entropy_model, "max_length", 8192)
        batch_numel = max_length * patching_batch_size
        splits = torch.split(tokens.flatten(), batch_numel)

The code generally does not appear to respect boundaries between sequences in a batch, except potentially in a "soft" way via EOS tokens, and will readily flatten different sequences into the same chunk. Since entropy_model.max_length is not defined by default, the getattr fallback of 8192 applies, so in this case all of the sequences land in the same chunk with nothing to differentiate them. The result is that the entropy model, which presumably sees the earlier copies as context, assigns lower entropy scores to every subsequent repetition of the same prompt in a batch, so that sequences later in the batch end up with larger and larger patches.
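To make the chunking behaviour concrete, here is a minimal pure-Python sketch of what flatten-then-split does to a batch of repeated prompts. It is an illustration only, not the code in patcher.py: `flatten_and_split` is a hypothetical stand-in for `torch.split(tokens.flatten(), batch_numel)` on nested lists, `split_per_sequence` is a hypothetical batch-respecting alternative, raw UTF-8 bytes stand in for the real token IDs, and `patching_batch_size = 1` is an assumed value.

```python
def flatten_and_split(batch, chunk_numel):
    """Stand-in for torch.split(tokens.flatten(), chunk_numel) on nested lists."""
    flat = [tok for seq in batch for tok in seq]
    return [flat[i:i + chunk_numel] for i in range(0, len(flat), chunk_numel)]


def split_per_sequence(batch, max_length):
    """Hypothetical alternative: chunk each sequence on its own, so that
    no chunk ever crosses a sequence boundary."""
    return [
        [seq[i:i + max_length] for i in range(0, len(seq), max_length)]
        for seq in batch
    ]


# Simplification: raw bytes instead of the tokenizer's actual IDs.
prompt = list(b"A BLT has")
batch = [prompt] * 10

max_length = 8192          # the getattr fallback quoted above
patching_batch_size = 1    # assumed value, for illustration only

chunks = flatten_and_split(batch, max_length * patching_batch_size)
per_sequence = split_per_sequence(batch, max_length)

# All ten copies of the prompt share a single chunk.
print(len(chunks), len(chunks[0]))
# Each sequence gets its own chunk(s) in the alternative.
print(len(per_sequence), [len(c) for c in per_sequence[0]])
```

With the flattened version, the ten prompts form one 90-element chunk, so a model run over that chunk would treat copies 1 through 9 as context for copy 10. The per-sequence variant keeps ten independent chunks, at the cost of needing padding or a loop to batch them.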

This is obviously a silly example, but in general one could imagine repeated instructions during batch inference being unreasonably squashed into gigantic patches.

Is this unintentional (the shape annotations in this function match the actual code, so I'm unsure), and were the HF weights trained this way? This sort of thing makes more sense during training, when subsequent documents are on average less closely related to each other, but even there, this could change the distribution of patch sizes to one not seen even during single-sequence inference. Happy to write a PR once this is clarified.
