Options covered by TranslationRequest #89

Closed
jerinphilip opened this issue Apr 6, 2021 · 4 comments

Comments

@jerinphilip
Contributor

jerinphilip commented Apr 6, 2021

Meta issue to discuss and complete the docs for the keys and possible values of the message passed in, which determines what goes into the Response and how it is constructed.

@abhi-agg

We will have to add documentation listing all the keys and the corresponding values that can be provided in a translation request.

@motin This is where I want your input. This is not API design; it is a slight change to, and a discussion of, what you communicate to me and what I respond with. Unified API is a wall which changes the objectives to something else and is an unnecessary time sink. I put forth the following configurable parameters.

alignment: true # true | false
alignment-threshold: 0.2f # Float value
quality: false # true | false
quality-score-type: free # free | expensive
concat-strategy: faithful # faithful | space 
Explanation
  1. alignment-threshold: Alignments are a (dense) matrix per the Unified API example. This is wasteful, as the matrix is often sparse and your algorithm is expected to operate only on the high-match alignments. I'd therefore like to provide this additional configurability: set it to 0.0f where you need the full alignment (the dense matrix), or to some other tuned value where you want to experiment with different configurations.
  2. quality-score-type: I can offer you a free quality score as of now, which should help you develop UI components. However, I cannot guarantee the API remains the same as we accommodate both Mozilla and Sheffield requirements. We're effectively parallelizing development with a bit of overhead here. I have some background developing UIs, particularly with quality scores, which I mention here to establish credentials. You should be able to reuse UI components and run a few iterations while we make slight tweaks in the backend to produce quality structures that are different from, but close to, these.
  3. concat-strategy: I am not sure if you want this, but you might already be aware that bergamot-translator has issues around newlines (newline vs. no newline, etc.). You can ask me here to translate text faithfully to its source structure or not, if such provisions are present. Say you're translating a .txt: you can offload everything down and print back what we provide, in which case you'd want faithful. Not so much if you're working with sentences picked up from HTML nodes.
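
To make point 1 concrete, here is a minimal sketch of what thresholding a dense alignment matrix could look like. The helper name `sparsify_alignments`, the row/column orientation (rows as target tokens, columns as source tokens) and the triple layout are assumptions for illustration only, not what bergamot-translator actually returns.

```python
from typing import List, Tuple


def sparsify_alignments(
    dense: List[List[float]], threshold: float
) -> List[Tuple[int, int, float]]:
    """Hypothetical helper: keep only alignment entries at or above `threshold`.

    Assumes dense[t][s] is the alignment probability between target token t
    and source token s. A threshold of 0.0 keeps every entry, i.e. the full
    dense matrix in triple form.
    """
    return [
        (t, s, p)
        for t, row in enumerate(dense)
        for s, p in enumerate(row)
        if p >= threshold
    ]


# Toy matrix: 2 target tokens x 3 source tokens (made-up numbers).
dense = [
    [0.90, 0.05, 0.05],
    [0.10, 0.15, 0.75],
]

sparse = sparsify_alignments(dense, 0.2)  # only the high-match entries
full = sparsify_alignments(dense, 0.0)    # every entry of the dense matrix
```

With the toy matrix above, the 0.2 threshold leaves two of the six entries, so a consumer never has to traverse the low-probability cells.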

We can add many more as we go ahead. With a dict, the possibilities increase. We'll also need some place to document these, maybe the wiki here or the Sphinx docs being generated. Let me know your suggestions, or any further configurability you want.

Edit: Added quality score yes/no option.
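
As a rough illustration of the dict-based approach, the sketch below merges a request's options over a set of defaults and rejects unknown keys or values. It assumes the values listed above act as defaults, which the thread does not state; `resolve_options`, `DEFAULTS` and `ALLOWED` are hypothetical names.

```python
# Assumed defaults, copied from the parameter listing above.
DEFAULTS = {
    "alignment": True,
    "alignment-threshold": 0.2,
    "quality": False,
    "quality-score-type": "free",
    "concat-strategy": "faithful",
}

# Enumerated values for the keys that are not plain booleans or floats.
ALLOWED = {
    "quality-score-type": {"free", "expensive"},
    "concat-strategy": {"faithful", "space"},
}


def resolve_options(request: dict) -> dict:
    """Hypothetical helper: overlay request options on the defaults."""
    unknown = set(request) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    options = {**DEFAULTS, **request}
    for key, choices in ALLOWED.items():
        if options[key] not in choices:
            raise ValueError(f"{key} must be one of {sorted(choices)}")
    return options


# A request that only overrides the threshold to get the full dense matrix.
options = resolve_options({"alignment-threshold": 0.0})
```

Rejecting unknown keys up front is one way to keep typos in option names from being silently ignored, which matters more as the dict grows.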

@motin
Contributor

motin commented Apr 8, 2021

@jerinphilip

  1. I am on board with the ability to configure an alignment threshold on a per-request basis. Not having to traverse the alignments matrix makes the integration easier as a consumer of the API.
  2. I am on board with the ability to configure which type of quality score should be returned (as well as configuring that no quality scores are returned at all for a particular request).
  3. The text picked up from HTML nodes can be paragraphs of text with several newlines at various points. As a consumer, I would expect these to be preserved by the API, so no need for a configuration parameter here. Stay faithful all the time. :)
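
For reference, the difference between the two proposed concat-strategy values could look roughly like the sketch below. The reading of `space` as "join sentences with a single space" is an assumption, as are the helper name `join_sentences` and the idea that the whitespace following each source sentence is tracked separately.

```python
from typing import List


def join_sentences(
    sentences: List[str], gaps: List[str], strategy: str = "faithful"
) -> str:
    """Hypothetical helper: reassemble translated sentences into one text.

    `gaps[i]` is assumed to be the whitespace that followed sentence i in the
    source text, so `sentences` and `gaps` have the same length.
    """
    if strategy == "faithful":
        # Re-insert the source's own whitespace after each sentence.
        return "".join(s + g for s, g in zip(sentences, gaps))
    if strategy == "space":
        # Ignore the source layout and separate sentences by one space.
        return " ".join(sentences)
    raise ValueError(f"unknown concat-strategy: {strategy}")


sentences = ["Hallo Welt.", "Wie geht es dir?"]
gaps = ["\n\n", "\n"]  # the source had a blank line between the sentences

faithful = join_sentences(sentences, gaps, "faithful")
spaced = join_sentences(sentences, gaps, "space")
```

Under the faithful strategy the paragraph break from the source survives, which is the behaviour asked for in point 3 for text taken from HTML nodes.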

@jerinphilip
Contributor Author

I added quality = true | false. Most of the returned stuff will be made optional according to provided parameters.

@abhi-agg I've started a page for comprehensive documentation of options at:

I'm guessing it's best to move this into a .md in source, to be picked up by doc tooling, once we reach consensus on the parameters and the documentation and it's implemented in source. Until then, all of us can enjoy fast, WYSIWYG editing.

@motin
Contributor

motin commented Apr 13, 2021

I made the slight tweak quality -> quality-scores since the flag covers quality scores and does not affect the quality of the returned translation.

@kpu
Member

kpu commented Apr 26, 2021

I don't understand why we are using a weirdly typed key-value map for what should be a struct? Want to make it easy to add new keys? You can add member variables to a struct with default construction.
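
The struct-with-defaults idea can be sketched as follows. This uses a Python dataclass as an analog of a C++ struct with default member initializers; the class name `TranslationRequestOptions`, the field names and the default values are assumptions drawn from the keys discussed earlier in the thread, not from the project's source.

```python
from dataclasses import dataclass


@dataclass
class TranslationRequestOptions:
    """Hypothetical typed options; every field has a default value."""

    alignment: bool = True
    alignment_threshold: float = 0.2
    quality_scores: bool = False
    quality_score_type: str = "free"  # "free" | "expensive"


# Callers override only what they need; the rest is default-constructed.
opts = TranslationRequestOptions(alignment_threshold=0.0)

# A misspelled field fails immediately instead of being silently ignored.
try:
    TranslationRequestOptions(qualty_scores=True)
    typo_rejected = False
except TypeError:
    typo_rejected = True
```

Adding a new option is then a matter of adding one field with a default, and existing callers keep working unchanged.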

3 participants