Issue 157 warn about missing translations #564
Conversation
Force-pushed from 8b8de98 to 5670deb
@lindsay-stevens Thanks so much for that big PR fixing up the tests! It got CI to pass! @lognaturel I was able to clean up and shorten this PR; hoping it can get a look soon?
```python
return path[path.find(":") + 1 :]

# if the ":" is not present, there may be a choice label present.
# First check for a "-"
```
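The quoted slice relies on a quirk of `str.find`; a minimal sketch (the helper name `after_colon` is hypothetical) of why it is safe even when no ":" is present:

```python
def after_colon(path: str) -> str:
    # A standalone version of the quoted slice. str.find returns -1
    # when ":" is absent, and -1 + 1 == 0, so the slice then returns
    # the whole string unchanged rather than raising an error.
    return path[path.find(":") + 1 :]

print(after_colon("choices:yes_no"))  # "yes_no"
print(after_colon("yes_no"))          # no ":", unchanged: "yes_no"
```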
This relates to the regression that occurred on my first try
- As noted in the comment, the check doesn't apply to questions anymore since PR XLSForm#543 makes pyxform always emit a label, and the only case that is not permissible is a guidance_hint with no label or hint.
- Nesting this check inside the "control group" processing branch seems the only neat+easy way about this, since:
  - the context is a loop from which the various branches continue, so it can't check at the end,
  - each branch parses / determines the actual item type, which is not apparent in the raw "type" data from the row dict, and
  - after the loop context exits, the data is in an array with nested dicts, so it's no longer possible to determine the sheet row number for the error message to be as useful to users.
- Refactored an old test which covers a lot of question types into the tests_v1 / md style, since this test is affected by the changes and having md style is more transparent.
- Added some tests to verify the assumption that, for the label check, it's not going to encounter a None or empty value for "control_type".
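The permissibility rule described above (a guidance_hint with neither label nor hint) can be sketched as follows; the function name and flat row dicts are hypothetical simplifications of pyxform's real row data:

```python
def is_invalid_guidance_hint(row: dict) -> bool:
    # Invalid only when a guidance_hint is present but both
    # label and hint are missing or empty.
    return bool(row.get("guidance_hint")) and not (
        row.get("label") or row.get("hint")
    )

print(is_invalid_guidance_hint({"guidance_hint": "press firmly"}))  # True
print(is_invalid_guidance_hint({"guidance_hint": "press firmly",
                                "hint": "use your thumb"}))         # False
```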
Force-pushed from 5670deb to 2bd90f6
```diff
@@ -703,8 +705,49 @@ def _add_empty_translations(self):
             self._translations[lang][path] = {}
             for content_type in content_types:
```
The real "meat" of the PR is here: attempt to figure out which column we are working with, then create the appropriate warning for that column.
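The per-column warning step described above can be sketched like this; the column list and message wording are hypothetical, not pyxform's actual implementation:

```python
# Illustrative subset of translatable column types.
TRANSLATABLE_COLUMNS = ("label", "hint", "guidance_hint", "media")

def missing_translation_warning(content_type: str, language: str, path: str) -> str:
    # Hypothetical message format; the real wording lives in pyxform.
    return (
        f"Missing translation: '{content_type}' has no value for "
        f"language '{language}' at {path}"
    )

print(missing_translation_warning("hint", "French (fr)", "survey:q1"))
```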
```diff
@@ -326,43 +333,28 @@ def get_translations(self, default_language):
         for display_element in ["label", "hint", "guidance_hint"]:
             label_or_hint = self[display_element]

             if (
```
This is a small but, I think, helpful refactor that better organizes the logic in get_translations.
Force-pushed from 04fe844 to 1fa9f96
Codecov Report
```diff
@@            Coverage Diff             @@
##           master     #564      +/-   ##
==========================================
- Coverage   84.20%   84.18%   -0.02%
==========================================
  Files          25       25
  Lines        3672     3687      +15
  Branches      865      871       +6
==========================================
+ Hits         3092     3104      +12
- Misses        440      441       +1
- Partials     140      142       +2
```
Continue to review full report at Codecov.
@KeynesYouDigIt If I understand the issue correctly, the goal is to warn users when there's one or more "missing" translation columns in the survey (or choices) sheet. A "missing" translation is one where that language appears for one or more of the translatable content types (label, media, etc.) specified on the form, but not all translatable content types specified on the form. If no translations specify a certain type then there's no problem, e.g. it's fine that there is no French image if no other languages have an image column. I would expect to find this kind of warning near the top of the processing pipeline, at the level of

About the refactoring in SurveyElement.get_translations. Before a refactor like this I would prefer to have unit tests on the block to establish that the before/after produce the same result and ensure regressions are not introduced. At the moment the refactor does not appear to be equivalent, mainly on the point that the old code for guidance_hint only wants

About the introduction of the

Lastly, for me it would help with reviews if the commit messages for material changes were a bit more detailed, or included in-line comments. For example in the commit
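The definition of a "missing" translation above can be sketched as set arithmetic over the observed (language, content type) pairs; the helper name and data shape here are hypothetical, not pyxform's internals:

```python
from collections import defaultdict

def find_missing_translations(seen_pairs):
    """seen_pairs: (language, content_type) tuples observed in the
    sheet headers. Returns {language: sorted missing content types}."""
    by_lang = defaultdict(set)
    for language, content_type in seen_pairs:
        by_lang[language].add(content_type)
    # Only content types used by at least one language matter: a type
    # nobody declares is never "missing" for anyone.
    all_types = set().union(*by_lang.values()) if by_lang else set()
    return {
        language: sorted(all_types - types)
        for language, types in by_lang.items()
        if all_types - types
    }

pairs = [
    ("English", "label"), ("English", "image"),
    ("French", "label"),  # French has no image column
]
print(find_missing_translations(pairs))  # {'French': ['image']}
```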
Thank you so much for the comment and feedback! I have been needing some guidance and collaboration badly here, as you can see 😄 Sorry in advance for so much text; the TL;DR is:
Please feel free to @ me on Slack if you would rather decline the PR and move me to a simpler or more urgent set of work (I'll defer to you since you use this code much more in your day-to-day work), or if you want to chat more about what I've done here. I mostly only have time for this on weekends, but could get away from work on Fridays, Thursdays, and maybe Tuesdays.

Longer version:
I considered and even preferred the upstream approach here, and to be honest, now that you bring it up again I think you might be correct. But let me explain why I went downstream: what caused me to pick this downstream approach is that we actually handle the empty translations themselves downstream (

I am going to spend some time today reconsidering that upstream path. What I would really like to do is move the translations upstream; when I first had that thought it sounded like too big of a change, but looking at it again I think it would at least be worth spending a few hours seeing what that would look like. Now, if we calculate missing translations upstream, but only create translations downstream, that feels redundant, but I suppose it might not be (and redundancy might be better than reverse engineering?).
It is; at my company we follow the "boy scout rule" (leave it better than you found it) as a way of packing refactors into PRs. I think the tests cover the case you had mentioned, but I am happy to just back it out for simplicity's sake. The "boy scout rule" might not make sense for this project? (small team, less frequent updates, testing that's more situationally driven and doesn't cover small units of work like this particular chunk of code)
I don't think there's a way to access that warnings object in the "guts" of where I do this work (this is another hazard of the downstream approach I am using). I do think it's a bit odd that the internal workings can't pass information back up to the user, so while the approach of adding another mutable/stateful attribute was hardly ideal, this was the quickest way I could make it work. If I can make the more "upstream" approach work we can remove this for now. My guess is that we will need something that achieves the same later, but I could be wrong. This is a pretty mature codebase and it hasn't needed it before 🤷
Reviewing the sections you linked me to: dealias_and_group_headers is very agnostic as to what the "tokens" it works with actually are, which makes it hard to detect if a token is a language. It also works row-by-row, but translations are added column-by-column, so we would have to review the aggregated data again. I could certainly add something like

I haven't done enough work to see if it's possible, but what do you think of trying to move some or all of the work in _setup_translations and/or _add_empty_translations to this upstream point? Perhaps we could build JSON elements that are much easier to work with downstream. Before I spend time trying to code a proof of concept, is that what you were getting at with your earlier comment? Or are we strictly trying to detect unbalanced translations and move on? Again, I think that would require duplication of logic, but I am not sure. I'll keep mulling this over in my mind; still not sure what the best way forward is. LMK if you have any thoughts or if I've missed something here.
The

There are finite translatable column types, as noted in this comment. So after splitting column names on

About the duplication concerns, I think these may overlap code-wise but the purpose is different. The goal for #157 is a warning to the user, which they may ignore at their peril. The existing

I would recommend aiming to address #157 as directly as possible. If it looks like a refactor is needed or potentially beneficial, then please create a separate issue/PR to describe/discuss/implement it. My concern is that there aren't many tests around translations behaviour, so embarking on that at the same time could stymie the PR or cause regressions.
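The "finite translatable column types" point above can be illustrated with the XLSForm `column::Language` header convention; this helper is a hypothetical sketch, not pyxform's actual header parsing:

```python
def split_translation_header(header: str):
    """Split an XLSForm header like 'label::French (fr)' into
    (column_type, language); language is None when untranslated.
    Hypothetical helper; pyxform's real parsing also handles aliases."""
    if "::" in header:
        column_type, language = header.split("::", 1)
        return column_type.strip(), language.strip()
    return header.strip(), None

print(split_translation_header("label::French (fr)"))  # ('label', 'French (fr)')
print(split_translation_header("hint"))                # ('hint', None)
```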
Thanks @KeynesYouDigIt for sticking with this one and exploring from multiple angles. I'd like to propose @lindsay-stevens take this one over and put up a new PR. Then you can use what you've learned to do an initial review and evaluate his approach. As @lindsay-stevens says, if there's an area you feel really needs some refactor, please file an issue for now.
@lognaturel Sounds good, I'll take a look at this in a new PR.
@lindsay-stevens / @lognaturel Sounds good to me as well, thanks for all the help!
Closing in favour of #571.
Closes #157
Why is this the best possible solution? Were any other approaches considered?
This approach ensures that the warning work is done alongside the work that puts in placeholders for missing translations. It is focused on making sure we never re-evaluate the form for missing languages, instead using what is already implied in the form build.
What are the regression risks?
Last time I attempted to solve this issue, I assumed the existence of certain pieces of the `path` string and tried to build warning messages using those pieces. When the warnings function tried to parse paths without what I assumed to be present, it broke. I do not believe this risk is still present, but I do assume things like path and content type are not null in this function. This feels like a safe assumption, so I don't think the risks are great.

Does this change require updates to documentation? If so, please file an issue here and include the link below.
No
Before submitting this PR, please make sure you have:
- run the tests in `tests_v1` with `nosetests` and verified all tests pass
- run `black pyxform` to format code