tokenizer.batch_decode(
model.generate(**inputs, max_new_tokens=max_new_tokens, use_cache=False),
)[0]
I have a very simple problem: how can I get a different output from the same model? Changing `temperature`, `top_k`, etc. doesn't change the LLM's output, and setting a seed doesn't help either (at the moment I'm not setting any seeds). I'm using gemma-3-1b-it, fine-tuned with SFTTrainer.
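To illustrate what I mean by "doesn't change": my suspicion is that `generate` is doing greedy (argmax) decoding, which ignores temperature entirely, and that sampling would have to be enabled for these parameters to matter. Here is a toy, transformers-free sketch of that idea (the logits values are made up for illustration):

```python
import math
import random

def softmax(logits, temperature):
    # Scale logits by temperature, then normalize into probabilities.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical next-token logits

# Greedy decoding: always pick the argmax, so temperature has no effect.
for t in (0.1, 1.0, 2.0):
    probs = softmax(logits, t)
    print(t, probs.index(max(probs)))  # always index 0

# Sampling: temperature reshapes the distribution, so outputs vary.
random.seed(0)
samples = [random.choices(range(3), weights=softmax(logits, 2.0))[0]
           for _ in range(10)]
print(samples)  # a mix of indices, not always 0
```

If that's the right diagnosis, I assume I'd need something like `do_sample=True` in my `generate` call, but I'd appreciate confirmation.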