QualityEstimation Implementation [RFC] #56
Now adding the implementation details, to conveniently prototype (as @ugermann mentioned in the meeting) the class called There is a command-line test app which already prints quality scores and accesses the response. There are some bergamot-translator/app/service-cli.cpp Lines 58 to 76 in f654ab0
The output looks like this, for a sample text:
These values are believed to be on a -logprobs scale (which are the requested input values to QualityEstimator). You can pick up from here and implement the In When ready and complete, (hopefully) it is not going to be much work to induct Instructions to build the command-line app are here.
@jerinphilip has this alignment PR been merged? If that's the case, I don't see |
The alignment PR is merged. The file was renamed to service-cli.cpp in a parallel attempt to sanitize filenames (7fd5d0f). Docs to build it are at doc/marian-integration.md. There will be some more shuffling and moving around to clean up and better structure the source. However, if you stick to providing the
Hi there, I managed to build the project locally by following the provided docs. However, when I tried to run:
I am getting the following error: Here is the full stack trace of the error:
I have tried to investigate a little and found that the shortlist feature was merged from here, but I don't know if they are related. @jerinphilip do you have a clue about what I might be doing wrong? Thanks!
I have tried both
And
|
@abarbosa94 We know for sure the tests pass from a clean install (because of CI), so can you try the script over there? It should be similar enough for you to know which variables to update. I will update the documentation shortly. This is my bad, sorry.
There are a few issues here for @qianqianzhu to fix.
then file an issue against https://github.com/browsermt/marian-dev. It should work if you patch the config file: |
Hi guys, thanks for the quick feedback. Indeed, the instructions provided by @kpu worked :) As I'm using a CPU-only machine, there was a little tweak that I had to make for this "hello world" to run smoothly. For future reference:
Sample output:
Again, I much appreciate the quick assistance :)
I think Mozilla were keeping their own fork of some models, which needs to be updated/synced with master. @abhi-agg?
The instructions point to http://data.statmt.org/bergamot/models/deen/ende.student.tiny11.tar.gz ; don't poke Mozilla to fix things until we've fixed upstream. |
Hey guys, there is a first attempt at this solution in #173. Feel free to analyze and criticize. I decided to implement it with ONNX because I think it would treat models agnostically, and it already has a lot of good features aimed at inference performance.
Input: translated text, source text, model scores for tokens, and tokenization information to make sense of the model scores.
Output is expected to contain the following for each sentence: a float, and a vector<float> whose entries correspond to each Word, where a Word is a space-separated word of the sentence (Mozilla prefers word-level, not subword-level, scores). Continuous values are preferred because they leave more room for experimentation.
Let the output be a struct called QualityEstimate. An implementation that starts from the skeleton below is tentatively going to be used by Service to make QualityEstimate a member in Response. (The layer above in UnifiedAPI doesn't have access to logprobs, so). This is to be built native first and, when ready, exported to WASM.
@ugermann @abhi-agg @fredblain @mfomicheva /cc @kpu