
Extra generations cause max_tokens of AWSModels to halve permanently each time #1465

@jisikoff

Description


Extra generations in the Predict module here:

https://github.com/stanfordnlp/dspy/blob/main/dsp/primitives/predict.py#L96

halve the max_tokens value in the kwargs and try again. This is, I believe, supposed to be a temporary halving: it reads the current value from the global settings via dsp.settings.lm.kwargs["max_tokens"] (which lives on the lm model object) and passes the halved value in as kwargs for that generation only.
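
The intent, as I read it, is a purely per-call override, roughly like this simplified sketch (the function and variable names here are illustrative, not the actual predict.py code):

```python
# Simplified sketch of the intended behavior: halve max_tokens for one
# retry without touching the model's stored defaults.
def retry_with_halved_tokens(lm, prompt, **kwargs):
    halved = lm.kwargs["max_tokens"] // 2
    # Merge into a fresh dict so the override applies to this call only.
    call_kwargs = {**kwargs, "max_tokens": halved}
    return lm(prompt, **call_kwargs)
```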

However, this halving is made permanent going forward by code in the AWS models module, which takes that lm.kwargs dictionary as a reference off self and writes the halved max_tokens back into it. All future generations then start from the halved value, and after enough runs every generation bottoms out at a limit of max_tokens=75.

The code that writes max_tokens back onto the lm model at the halved value is, I believe, here:

https://github.com/stanfordnlp/dspy/blob/main/dsp/modules/aws_models.py#L221-L223
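
Here is a minimal standalone reproduction of the aliasing behavior (FakeAWSModel is a hypothetical stand-in for the real class, not the actual dspy code):

```python
# Hypothetical stand-in mirroring the buggy write-back pattern; class
# and method names are illustrative, not the real dspy API.
class FakeAWSModel:
    def __init__(self, max_tokens=800):
        self.kwargs = {"max_tokens": max_tokens}

    def generate(self, prompt, **kwargs):
        # Bug: persisting the per-call override into self.kwargs means
        # every future call starts from the already-halved value.
        self.kwargs["max_tokens"] = kwargs.get(
            "max_tokens", self.kwargs["max_tokens"]
        )
        return f"completion capped at {self.kwargs['max_tokens']} tokens"

lm = FakeAWSModel(max_tokens=800)

# Each "temporary" halved retry permanently shrinks the stored limit.
for _ in range(3):
    lm.generate("prompt", max_tokens=lm.kwargs["max_tokens"] // 2)

print(lm.kwargs["max_tokens"])  # 100, not the configured 800
```

A fix along these lines would be to merge the per-call kwargs into a local copy (e.g. call_kwargs = {**self.kwargs, **kwargs}) instead of assigning the halved value back into self.kwargs.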
