Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix samplers documentation #980

Merged
merged 2 commits into from
Jun 18, 2024
Merged

Fix samplers documentation #980

merged 2 commits into from
Jun 18, 2024

Conversation

jrinder42
Copy link
Contributor

This fixes issue #978. Specifically for the classification example in the outlines cookbook section here, it fixes the piece of code in outlines.generate.api.SequenceGeneratorAdapter that keeps returning the wrong result (i.e. returning False even if a specific sampler is passed (greedy, multinomial, or beam search).

@lapp0
Copy link
Contributor

lapp0 commented Jun 17, 2024

Thanks for identifying and working towards fixing this issue!

Could we fix the docs instead of changing the code though? Some lines in the documentation and tests rely on samplers being callable, e.g.

outlines/docs/reference/samplers.md:sampler = samplers.beam_search(beams=5)
tests/generate/test_integration_llamacpp.py:    sampler = samplers.beam_search(1)
tests/generate/test_integration_transformers.py:    sampler = beam_search(5)
tests/test_samplers.py:    beam_search,
tests/test_samplers.py:    assert beam_search == BeamSearchSampler

@jrinder42
Copy link
Contributor Author

I understand that there are tests written regarding this, but something doesn't seem to be working correctly. The first line of the classification docs is as follows:

from outlines.samplers import greedy

This is where the issue begins. A potential solution would be to do the following in the docs:

from outlines.samplers import BeamSearchSampler, GreedySampler, MultinomialSampler

generator = outlines.generate.choices(model, ["URGENT", "STANDARD"], sampler=GreedySampler())

However, then that begs the question of whether greedy, multinomial, and beam_search from outlines.samplers would be useless (I honestly have not looked to see if these are used in the rest of the codebase). I still think that some code should be changed, I suppose the question is what code / behavior should be changed...

@jrinder42
Copy link
Contributor Author

And one thing about the about a test like:

tests/test_samplers.py:    assert beam_search == BeamSearchSampler

In this context, == and isinstance behave differently. There is an example demonstrating this in the description for issue #978. So the test above is technically not checking for the correct thing (unless I am misunderstanding something).

@lapp0
Copy link
Contributor

lapp0 commented Jun 17, 2024

I understand that there are tests written regarding this, but something doesn't seem to be working correctly. The first line of the classification docs is as follows:

from outlines.samplers import greedy

This is where the issue begins. A potential solution would be to do the following in the docs:

from outlines.samplers import BeamSearchSampler, GreedySampler, MultinomialSampler

generator = outlines.generate.choices(model, ["URGENT", "STANDARD"], sampler=GreedySampler())

However, then that begs the question of whether greedy, multinomial, and beam_search from outlines.samplers would be useless (I honestly have not looked to see if these are used in the rest of the codebase). I still think that some code should be changed, I suppose the question is what code / behavior should be changed...

samplers.greedy is a helper. It's equivalent to samplers.GreedySampler, but less verbose. I think we only need a docs change here.

tests/test_samplers.py: assert beam_search == BeamSearchSampler

In this context, == and isinstance behave differently. There is an example demonstrating this in the description for issue #978. So the test above is technically not checking for the correct thing (unless I am misunderstanding something).

The test is correct, it's asserting that beam_search is a convenience variable equivalent to the class itself.

@jrinder42
Copy link
Contributor Author

Ok, sounds good. I will make the necessary changes.

@jrinder42 jrinder42 closed this Jun 17, 2024
@jrinder42 jrinder42 reopened this Jun 17, 2024
@jrinder42
Copy link
Contributor Author

It should be good to go

@rlouf rlouf added the documentation Linked to documentation and examples label Jun 18, 2024
@rlouf rlouf changed the title Fix for samplers isinstance oddity in the generate api Fix samplers documentation Jun 18, 2024
@rlouf rlouf merged commit 63ac124 into dottxt-ai:main Jun 18, 2024
6 checks passed
@rlouf
Copy link
Member

rlouf commented Jun 18, 2024

Thank you!

@rlouf rlouf linked an issue Jun 18, 2024 that may be closed by this pull request
@jrinder42 jrinder42 deleted the fix-samplers branch June 19, 2024 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Linked to documentation and examples
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: Samplers error from the Outline docs classification example
3 participants