ValueError: Required 'max_tokens' or 'max_output_tokens' not specified in settings when using meta.llama2-13b-chat-v1 in AWS SageMaker

I'm trying to run the intro notebook (https://github.com/stanfordnlp/dspy/blob/main/intro.ipynb) with a `meta.llama2-13b-chat-v1` model instead of the default LM.

```
import dspy

# `bedrock` is the AWS provider object created earlier in the notebook (not shown here).
lm_meta = dspy.AWSMeta(bedrock, "meta.llama2-13b-chat-v1", max_tokens=1024)
colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.settings.configure(lm=lm_meta, rm=colbertv2_wiki17_abstracts)
```

`dspy.Predict` works without problems, but `dspy.ChainOfThought` fails when used as follows:

```
# Define the predictor. Notice we're just changing the class. The signature BasicQA is unchanged.
generate_answer_with_chain_of_thought = dspy.ChainOfThought(BasicQA)

# Call the predictor on the same input.
pred = generate_answer_with_chain_of_thought(question=dev_example.question)

# Print the input, the chain of thought, and the prediction.
print(f"Question: {dev_example.question}")
print(f"Thought: {pred.rationale.split('.', 1)[1].strip()}")
print(f"Predicted Answer: {pred.answer}")
```

This raises `ValueError: Required 'max_tokens' or 'max_output_tokens' not specified in settings`, even though `max_tokens=1024` was passed to `dspy.AWSMeta`. Full traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[53], line 5
      2 generate_answer_with_chain_of_thought = dspy.ChainOfThought(BasicQA)
      4 # Call the predictor on the same input.
----> 5 pred = generate_answer_with_chain_of_thought(question=dev_example.question)
      7 # Print the input, the chain of thought, and the prediction.
      8 print(f"Question: {dev_example.question}")

File ~/dspy/dspy/predict/predict.py:61, in Predict.__call__(self, **kwargs)
     60 def __call__(self, **kwargs):
---> 61     return self.forward(**kwargs)

File ~/dspy/dspy/predict/chain_of_thought.py:59, in ChainOfThought.forward(self, **kwargs)
     57     signature = new_signature
     58     # template = dsp.Template(self.signature.instructions, **new_signature)
---> 59 return super().forward(signature=signature, **kwargs)

File ~/dspy/dspy/predict/predict.py:103, in Predict.forward(self, **kwargs)
    100 template = signature_to_template(signature)
    102 if self.lm is None:
--> 103     x, C = dsp.generate(template, **config)(x, stage=self.stage)
    104 else:
    105     # Note: query_only=True means the instructions and examples are not included.
    106     # I'm not really sure why we'd want to do that, but it's there.
    107     with dsp.settings.context(lm=self.lm, query_only=True):

File ~/dspy/dsp/primitives/predict.py:124, in _generate.<locals>.do_generate(example, stage, max_depth, original_example)
    121         finished_completions.append(completion)
    122         continue
    123     finished_completions.append(
--> 124         extend_generation(completion, field_names, stage, max_depth, original_example),
    125     )
    127 completions = Completions(finished_completions, template=template)
    128 example = example.copy(completions=completions)

File ~/dspy/dsp/primitives/predict.py:79, in _generate.<locals>.extend_generation(completion, field_names, stage, max_depth, original_example)
     72 max_tokens = (kwargs.get("max_tokens") or 
     73             kwargs.get("max_output_tokens") or
     74             dsp.settings.lm.kwargs.get("max_tokens") or 
     75             dsp.settings.lm.kwargs.get('max_output_tokens'))
     78 if max_tokens is None:
---> 79     raise ValueError("Required 'max_tokens' or 'max_output_tokens' not specified in settings.")
     80 max_tokens = min(max(75, max_tokens // 2), max_tokens)
     81 keys = list(kwargs.keys()) + list(dsp.settings.lm.kwargs.keys()) 

ValueError: Required 'max_tokens' or 'max_output_tokens' not specified in settings.
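
To make the failing check easier to reason about, here is a minimal standalone sketch of the lookup shown in the last traceback frame (`dsp/primitives/predict.py`, lines 72-80). It uses plain dicts instead of DSPy objects. The function name `resolve_max_tokens` is made up for illustration, and the `max_gen_len` key in the second call is only an assumption about how the limit might end up stored under a different name (it is the parameter name Bedrock's Llama API uses); I have not confirmed that this is what `AWSMeta` does.

```python
def resolve_max_tokens(call_kwargs, lm_kwargs):
    """Mirror the lookup from the traceback, using plain dicts (illustrative only)."""
    # Same precedence as the traceback: per-call kwargs first, then the LM's kwargs.
    max_tokens = (
        call_kwargs.get("max_tokens")
        or call_kwargs.get("max_output_tokens")
        or lm_kwargs.get("max_tokens")
        or lm_kwargs.get("max_output_tokens")
    )
    if max_tokens is None:
        raise ValueError(
            "Required 'max_tokens' or 'max_output_tokens' not specified in settings."
        )
    # Half of the limit, raised to at least 75, but never above the limit itself.
    return min(max(75, max_tokens // 2), max_tokens)


# Limit stored under the expected key: the check passes.
print(resolve_max_tokens({}, {"max_tokens": 1024}))  # 512

# Assumed scenario: the limit is stored under another key, so the lookup finds nothing.
try:
    resolve_max_tokens({}, {"max_gen_len": 1024})
except ValueError as err:
    print(f"ValueError: {err}")
```

If this is what happens, the error would only appear on code paths that reach `extend_generation`, which would be consistent with `ChainOfThought` failing here while the basic predictor does not.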
