Add Particle Filter #673

Closed
wants to merge 1 commit into from

Conversation

rlouf
Member

@rlouf rlouf commented Feb 16, 2024

See Nicolas Chopin's book for a really nice introduction to the topic. The algorithm consists of carrying $N$ particles for each sequence in the batch and, at each step:

  1. Sample a new token for each particle using the next-token logits.
  2. Resample the particles.

We use the multinomial resampling function in this first PR, although it is known to have very large variance. To make the implementation easier we combine (1) and (2) in a single step, similarly to what we do with beam search.
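As an illustration of the resampling step, here is a minimal sketch of multinomial resampling over log-weights, using only the standard library. The function name `multinomial_resample` and the toy particles are hypothetical and are not taken from this PR's code; the sketch also keeps sampling and resampling separate rather than combining them.

```python
import math
import random


def multinomial_resample(log_weights, rng):
    """Draw one ancestor index per particle, with probability proportional to exp(log_weight)."""
    max_lw = max(log_weights)
    if max_lw == float("-inf"):
        raise ValueError("all particles have zero weight")
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(lw - max_lw) for lw in log_weights]
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)


rng = random.Random(0)
# Three toy particles (token id sequences) with illustrative log-weights.
particles = [[5, 9], [5, 2], [7, 1]]
log_weights = [0.0, float("-inf"), -1.0]

ancestors = multinomial_resample(log_weights, rng)
# Copy the selected ancestors so that duplicated particles do not share state.
particles = [list(particles[i]) for i in ancestors]
```

Each ancestor is drawn independently, which is what makes this scheme simple but noisy: a particle with a large weight can still be dropped by chance.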

Note that there is a subtlety when doing structured generation. We can think of the following simple scheme to sample from the distribution of sequences that follow the structure:

  1. Move particles by one step using the unbiased next-token logits;
  2. Set the weight of each invalid particle to $-\infty$
  3. Resample
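The three steps above can be sketched for a single generation step as follows. This is a toy version under stated assumptions: the structure is represented by a boolean `allowed` mask over the vocabulary, and `naive_constrained_step` is a hypothetical name, not a function from this PR.

```python
import math
import random


def naive_constrained_step(logits, allowed, n_particles, rng):
    """Move, weight, and resample particles for one step; returns (tokens, number of valid particles)."""
    # 1. Move: sample one token per particle from the unbiased softmax.
    m = max(logits)
    probs = [math.exp(l - m) for l in logits]
    tokens = rng.choices(range(len(logits)), weights=probs, k=n_particles)
    # 2. Weight: invalid particles get log-weight -inf, valid ones keep log-weight 0.
    log_w = [0.0 if allowed[t] else float("-inf") for t in tokens]
    survivors = [t for t, lw in zip(tokens, log_w) if lw == 0.0]
    if not survivors:
        raise RuntimeError("every particle was invalid; nothing to resample")
    # 3. Resample: valid particles share the same weight, so multinomial
    #    resampling reduces to a uniform draw among them.
    resampled = rng.choices(survivors, k=n_particles)
    return resampled, len(survivors)


rng = random.Random(0)
logits = [0.0, 0.0, 2.0]
allowed = [True, True, False]  # the most likely token is forbidden
tokens, n_valid = naive_constrained_step(logits, allowed, 200, rng)
```

In this toy setup the allowed tokens carry only about 21% of the probability mass, so most particles are expected to be discarded before resampling.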

But this can be very inefficient. Instead, we move particles using a specific proposal: the biased next-token logits. Since this is not exactly sampling from the original distribution, we need to resample the particles using the factor $P_i / \tilde{P}_i$ as a weight, where $P_i$ is the unbiased probability of token $i$ and $\tilde{P}_i$ is the biased probability of token $i$ (importance sampling).
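A minimal sketch of this importance-weighted proposal, assuming the bias is a hard mask that sets the logits of disallowed tokens to $-\infty$. The names `log_softmax` and `propose_with_weight` are hypothetical helpers written for this example, not part of the PR.

```python
import math
import random


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return [l - log_z for l in logits]


def propose_with_weight(logits, allowed, rng):
    """Sample a token from the biased distribution; return (token, log of P_i / P~_i)."""
    if not any(allowed):
        raise ValueError("at least one token must be allowed")
    log_p = log_softmax(logits)  # unbiased log P_i
    masked = [l if ok else float("-inf") for l, ok in zip(logits, allowed)]
    log_p_tilde = log_softmax(masked)  # biased log P~_i
    probs = [math.exp(lp) for lp in log_p_tilde]
    token = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    return token, log_p[token] - log_p_tilde[token]


rng = random.Random(0)
logits = [1.0, 2.0, 0.5, -1.0]
allowed = [True, False, True, False]
token, log_weight = propose_with_weight(logits, allowed, rng)
```

Under this masking assumption, $\tilde{P}_i = P_i / Z$ for every allowed token, where $Z$ is the total unbiased probability of the allowed tokens, so the weight $P_i / \tilde{P}_i$ equals $Z$ whichever token is drawn. Every particle stays valid, and the weight records how much probability mass the constraint removed at that step.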

Note: I am wondering if we should correct the Beam Search algorithm as well.

@rlouf rlouf added enhancement transformers Linked to the `transformers` integration samplers labels Feb 16, 2024
@rlouf rlouf marked this pull request as draft February 16, 2024 10:56
@rlouf rlouf changed the title Add Sequential Monte Carlo Sampler Add Particle Filter Feb 28, 2024
@dottxt-ai dottxt-ai deleted a comment from lapp0 Feb 28, 2024
@rlouf
Member Author

rlouf commented Feb 29, 2024

Doing this I started to wonder if we shouldn't see and implement greedy and multinomial sampling as particular cases of more general samplers (resp. a form of beam search and a form of particle filtering).

@rlouf rlouf force-pushed the smc-sampler branch 3 times, most recently from b78e8b3 to b33e645 Compare February 29, 2024 13:46
@rlouf rlouf closed this Jun 19, 2024
@rlouf rlouf deleted the smc-sampler branch November 4, 2024 18:38