I'm suffering from repeated output, so I'm trying to add some rules to the postprocessing model to avoid duplication. I want to stop the service early when duplication occurs, so I think this should be a wrapper on top of the streaming mode.
So the question is: how can I get the tokens one at a time in the ensemble model so that I can do early stopping?
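
To make the idea concrete, here is a rough sketch of the kind of wrapper I have in mind. It is framework-agnostic and assumes the tokens are already available as a plain Python iterable of token IDs, which is exactly the part I don't know how to get from the ensemble. The names `has_repeated_ngram` and `stream_with_early_stop`, and the `ngram` / `max_repeats` thresholds, are hypothetical and only for illustration.

```python
from typing import Iterable, Iterator, List


def has_repeated_ngram(tokens: List[int], ngram: int, max_repeats: int) -> bool:
    """Return True if the last `ngram` tokens occur `max_repeats` times back to back."""
    needed = ngram * max_repeats
    if len(tokens) < needed:
        return False
    tail = tokens[-needed:]
    pattern = tail[-ngram:]
    # Every consecutive chunk of the tail must equal the most recent n-gram.
    return all(tail[i:i + ngram] == pattern for i in range(0, needed, ngram))


def stream_with_early_stop(
    token_stream: Iterable[int], ngram: int = 3, max_repeats: int = 3
) -> Iterator[int]:
    """Yield tokens one at a time and stop as soon as a repetition loop is detected.

    `token_stream` is an assumption: any iterable that produces one token ID
    at a time, for example the responses of a streaming request.
    """
    history: List[int] = []
    for token in token_stream:
        history.append(token)
        if has_repeated_ngram(history, ngram, max_repeats):
            # Stop consuming the stream; the token that completed the loop is dropped.
            break
        yield token


if __name__ == "__main__":
    fake_stream = [1, 2, 3, 4, 5, 6, 4, 5, 6, 4, 5, 6, 7, 8]
    print(list(stream_with_early_stop(fake_stream)))
```

Breaking out of the loop only stops reading on the wrapper side. Whether the backend also stops generating would still depend on the request being cancelled or an end signal being sent, which this sketch does not cover.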