You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This means translation checks are only run on the first row of a form. Unfortunately, that will not always use all translation columns. For example, the following test fails because no warning is produced:
If the deviceid row of the form is omitted, the check works as intended.
I'm not entirely sure what to do about this. I think the ideal would be to check the actual column headers. I believe that would mean adding header validation to xlsx_to_dict/xls_to_dict/csv_to_dict or passing on header information from there. Those currently don't know about the different expected sheets (which seems like a good thing).
Another option would be to do the translation check while processing each row. That might not be great performance-wise but I'm not sure.
The text was updated successfully, but these errors were encountered:
translation checks are only run on the first row of a form
There was a performance concern (link) that lead to that. In that thread I noted having checked validating all items and it took "+/- a few ms at 5k questions".
check the actual column headers
I think I looked at getting the headers into workbook_to_json and it was one of those things that seems simple initially but kinda isn't (maybe in the context of the rest of #157).
Thanks, I had forgotten that conversation and that change.
it took "+/- a few ms at 5k questions".
That seems acceptable after all? Right now I expect most forms are missing out on this check. It's really common to get deviceid, starttime, etc at the start of a form and/or to start with a group or repeat. An alternative might be to check the first ~15 rows or something.
My current preference is to go back to checking every row. Maybe we could add the check to the main loop over all rows? I'm not entirely sure that would make a difference, even 5k is not a big number of list entries to loop over.
I think I looked at getting the headers into workbook_to_json and it was one of those things that seems simple initially but kinda isn't
I agree, that was what I set out to do with #601 and it would really require a significant restructuring.
Since #666 the translations check uses the column headers, which addresses this bug. If it's desirable to warn users about any missing translation in any row, I think that's a different feature for a new ticket.
While taking a look at #601, I noticed the use of
sheet_data[0]
athttps://github.com/XLSForm/pyxform/blob/master/pyxform/validators/pyxform/missing_translations_check.py#L90
This means translation checks are only run on the first row of a form. Unfortunately, that will not always use all translation columns. For example, the following test fails because no warning is produced:
If the
deviceid
row of the form is omitted, the check works as intended.I'm not entirely sure what to do about this. I think the ideal would be to check the actual column headers. I believe that would mean adding header validation to
xlsx_to_dict
/xls_to_dict
/csv_to_dict
or passing on header information from there. Those currently don't know about the different expected sheets (which seems like a good thing).Another option would be to do the translation check while processing each row. That might not be great performance-wise but I'm not sure.
The text was updated successfully, but these errors were encountered: