Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quality scores when responseOptions.qualityScores:true but skip-cost:true #361

Open
jelmervdl opened this issue Feb 25, 2022 · 6 comments
Open
Labels
low-priority Things work, this could make things better.

Comments

@jelmervdl
Copy link
Member

When responseOptions.qualityScores:true is used with a model that has skip-cost:true, the quality scores output is still there and there is no error message. However, they average around 20 instead of the -0.1…-0.9 ish?

I'd expected that it would fail on an ABORT, instead of return completely different scores.

@jerinphilip
Copy link
Contributor

When responseOptions.qualityScores:true is used with a model that has skip-cost:true, the quality scores output is still there.

There is also a solution where the client does not try to do the above. In the short term, @abhi-agg is expected to know this caveat. Values being not within limits should be signal enough.

A neat way to solve the general problem is perhaps to leak model information (show-alignment, skip-cost) etc through Request to Response, and during Response construction use that to trigger ABORTS.

Use the following information:

const TranslationModel &model_;

Pass required stuff indicating capability of alignment, quality along below:

responseBuilder_(std::move(histories_));

There is also a config UI problem: show-alignment, skip-cost being completely alien vocabulary to HTML or quality. I'm not particularly a fan of Config = std::string, the TranslationModelConfig from https://github.com/browsermt/bergamot-translator/blob/fe3f3982debbd88e94f1909c313d1a611d2ff1c9/doc/Unified_API.md was super hard to modify, so I just took the std::string proposal. On the plus side, the std::string based approach has made correspondences to marian's and switching configuration quite fast allowing experimentation.

@jelmervdl
Copy link
Member Author

There is also a solution where the client does not try to do the above.

I know, but that feels like a you're holding it wrong type of solution.

I think I like most of your solution. If we could expose the (parsed?!) model config options through public interface of TranslationModel, and then pass reponseBuilder_ and eventually the QualityEstimator a reference to the model that was used, they could figure out themselves whether the model was okay and ABORT if that was not the case. QualityEstimator is specialised enough code that it is okay to specifically query skip-cost in there. It's not like it needs to be compatible with another Marian-like setup that doesn't use that parameter.

I'm pretty happy with having a std::string (or the Yaml datastructure that marian uses) as the configuration. That way bergamot-translator doesn't need code changes to be synced with every update to marian.

My initial idea was to figure out where QualityEstimator gets its values from from the History object, and then see whether I can make it detect that those values are uninitialised. Similar to how HTML detects that response.alignment is empty. I got as far as finding that skip-cost affects the mode or use parameter when building the graph, but could not yet figure out why QualityEstimator still has access to something that is supposedly skipped during construction of the graph.

@abhi-agg
Copy link
Contributor

A neat way to solve the general problem is perhaps to leak model information (show-alignment, skip-cost) etc through Request to Response, and during Response construction use that to trigger ABORTS.

This information doesn't need to leak via Request because the translate API expects TranslationModel object as a parameter (same goes for Async version of translate API as well) and each TranslationModel object is created with a specific config that is stored internally. So you have that information already.

@jerinphilip
Copy link
Contributor

I know, but that feels like a you're holding it wrong type of solution.

Do not disagree. However, setting the correct YAML + ResponseOptions is necessary at the client. Is a 2 line change. The C++ change proposed here to trigger an ABORT is a larger change. The ABORT is intended for a developer to know and set the right config, there's no recovery given the WASM exception state. Is this really what we want to wait on for implementing QE? A competent developer should be able to navigate and simplify all our lives, allowing an eventual fix?

This information doesn't need to leak via Request...

This is a feasible idea. For a checkConsistent(model, responseOptions) -> ABORT before queueing, it need to be written at 4 places - {translate, pivot} x {blocking,async}service. I believe the above is enough for this issue to be not deemed a "blocker". Please get started on implementing QE - such a fix will be brought in independently.

Since I'm assigned this, please treat this as intimation, that I'm not going to get around to solving this until ARM and separate graphs from workspaces are solved (particularly to help @abhi-agg schedule).

@jelmervdl
Copy link
Member Author

Oh no this is in no way blocking. Just intended as a convenience feature and a way to prevent wrong output( as opposed to an error message) in case of a mistake

@jerinphilip jerinphilip added the low-priority Things work, this could make things better. label Feb 28, 2022
@abhi-agg
Copy link
Contributor

There is also a solution where the client does not try to do the above. In the short term, @abhi-agg is expected to know this caveat.

I am completely aware of the caveats.

This is a feasible idea. For a checkConsistent(model, responseOptions) -> ABORT before queueing, it need to be written at 4 places - {translate, pivot} x {blocking,async}service. I believe the above is enough for this issue to be not deemed a "blocker". Please get started on implementing QE - such a fix will be brought in independently.

This issue is not blocking QE integration. I just left my opinion regarding the solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
low-priority Things work, this could make things better.
Projects
None yet
Development

No branches or pull requests

3 participants