Support repeats in Google Sheets #1851
Form used for testing :
Observations :
|
@shobhitagarwal1612 I fixed the issue. |
I can verify that both of the above issues are fixed.
Can we link the cell from the first sheet to the second sheet's cell? That way we will know which repeat belongs to which row in the first sheet.
This is really important. I think we should match what Briefcase does -- include the I believe what is going on now is that the first question is being used as a link, is that right, @grzesiek2010? That will work in most cases if |
yes and I agree that it's important but
I think it might be tricky, because someone might delete something manually and then such a link would be wrong. Also, Aggregate and Briefcase use instanceId for that, so I think my current solution is OK. Maybe we should just check if an |
Agreed 👍 |
@lognaturel so what should I do? Leave it as it is, or add this?:
|
Correct. XLSForm puts it at the end of the form and in general, there is no guarantee about the location of

I have tried two forms that fail to upload the main form body. Some of the repeats are uploaded but not all. groups-repeats.xml.txt fails with "No data found" or "Sorry, no form was uploaded" -- I wasn't able to quickly tell what explains why it's not a consistent error. All student names repeats were uploaded but not all student grades. The other form which failed I unfortunately can't share but the failure mode seems to be the same. I can see that there's an

At a high-level, I recommend spending some time looking for opportunities to make the code more readable and maintainable. That line above is a great example where it's unclear what is going on. A combination of additional comments and breaking 7b17a46 with commit messages that explain some of the decisions would also help a lot. For example, is using a guava

There are a few methods that take in blank collections and then populate them that should instead return (https://books.google.com/books?id=_i6bDeoCQzsC&pg=PA45&lpg=PA45&dq=Output+arguments+should+be+avoided&source=bl&ots=ep3SFo9d46&sig=GkBX61M8oLwgh6wrxx0BBtgKqU0&hl=en&sa=X&ved=0ahUKEwjz2JLo3q3ZAhUOSK0KHZ_TAkQ4ChDoAQgoMAE#v=onepage&q=Output%20arguments%20should%20be%20avoided&f=false). A clear example is

Breaking |
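To illustrate the point about output arguments, here is a minimal sketch contrasting the two styles. The class and method names (`ColumnTitles`, `collectTitlesInto`, `nonBlankTitles`) are hypothetical and do not come from the uploader code; they only show the shape of the refactoring.

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnTitles {
    // Output-argument style: the caller must create an empty list and
    // remember that this method fills it in as a side effect.
    static void collectTitlesInto(List<String> rawTitles, List<String> out) {
        for (String title : rawTitles) {
            String trimmed = title.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
    }

    // Return-value style: the method builds and returns its own result,
    // so the data flow is visible at the call site.
    static List<String> nonBlankTitles(List<String> rawTitles) {
        List<String> titles = new ArrayList<>();
        for (String title : rawTitles) {
            String trimmed = title.trim();
            if (!trimmed.isEmpty()) {
                titles.add(trimmed);
            }
        }
        return titles;
    }
}
```

With the second form, a call reads as `List<String> titles = ColumnTitles.nonBlankTitles(raw);` and no blank collection has to be threaded through the caller.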
@dcbriccetti could you review the code? I changed almost everything in comparison with the old version of In the end, I investigated everything again and I noticed that some functions don't work and some are not needed:
|
There are a lot of really good improvements here. I've provided some more suggestions in line. In particular, it would be great to reduce the amount of state kept to keep things easier to reason about.
There are still a number of methods that throw exceptions and a `catch (Exception e)` catch-all in `uploadOneInstance`. @dcbriccetti do you have any best practices to share in this case? The scenario is that a task has multiple subtasks and the failure of any one of them should result in the failure of the overall task. Uploading one instance entails making sure the spreadsheet is set up to receive that instance, gathering all the relevant answers, writing those answers to the spreadsheet, etc. A lot of the calls involved can throw exceptions. Remove the `throws Exception` in `insertRow` to see that basically every single line can throw an exception. We don't particularly care about those exceptions and just want to stop (though in an ideal world we would clean up the partial submission).
The original solution included giant methods with lots of try-catches.
}

public boolean isAuthFailed() {
return authFailed;
}

public void setAuthFailed() {
public void setAuthFailedForFalse() {
setAuthFailedToFalse
Or `unsetAuthFailed`, or `resetAuthFailed`, or `setAuthFailed(boolean authFailed)`.
@@ -199,12 +199,19 @@ private boolean uploadOneSubmission(String id, File instanceFile, String jrFormI
FormDef formDefFromXml = XFormUtils.getFormFromInputStream(new FileInputStream(new File(formFilePath)));

List<TreeElement> mainLevelColumnElements = getColumnElements(formDefFromXml.getMainInstance().getRoot());
TreeElement instanceIDColumn = getInstanceIDColumn(mainLevelColumnElements);
Can you get this from the database to avoid doing a full pass through the form? It's a small optimization but could make a difference when a form has 255 questions.
@@ -113,23 +112,20 @@ public static boolean isLocationValid(String answer) {
.matches();
}

private String getGoogleSheetsUrl(Cursor cursor) {
private String getSpreadsheetUrl(Cursor cursor) {
`getGoogleSeetsUrl` seems more accurate!
if (!uploadOneSubmission(id, instance, jrformid, token, formFilePath, urlString)) {
cv.put(InstanceColumns.STATUS,
InstanceProviderAPI.STATUS_SUBMISSION_FAILED);
if (token == null) {
It seems `token` doesn't change in the method, so this check should come before any other work is done.
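A minimal sketch of that reordering, assuming a simplified method in which the token is checked once as a guard clause before the per-instance loop. `GuardClauseSketch` and `uploadAll` are hypothetical names, and the loop body is only a stand-in for the real upload.

```java
import java.util.ArrayList;
import java.util.List;

final class GuardClauseSketch {
    /** Returns the ids that were "uploaded"; empty when there is no token. */
    static List<String> uploadAll(String token, List<String> instanceIds) {
        // The token never changes below, so check it once, before any work.
        if (token == null) {
            return new ArrayList<>();
        }
        List<String> uploaded = new ArrayList<>();
        for (String id : instanceIds) {
            uploaded.add(id); // stand-in for the real per-instance upload
        }
        return uploaded;
    }
}
```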
TreeElement instanceIDElement = getInstanceIDElement(getChildElements(instanceElement));
if (hasRepeatableGroups(instanceElement)) {
if (instanceIDElement == null) {
outcome.results.put(id, "This form contains repeatable group so it should contain an instanceID!");
This will show up in the UI so it should be translated. More concise version: "Forms with repeats must define an instanceID."
private String googleSheetsUrl = "";

private String mainSheetTitle;
private String googleSheetsUrl;
private String spreadsheetId;
You should always be able to get this from a `Spreadsheet` object.
private void insertRow(List<Object> rowElements, String sheetName) throws IOException {
I don't think this method is helpful. If a method name is overloaded, it should be to have different variants with different parameter lists. Here, the two `insertRow` methods are not related in this way. Making the helper call and catching the exception in the primary `insertRow` would be clearer.
} else {
String answer = element.getValue() != null ? element.getValue().getDisplayText() : "";
if (new File(instanceFile.getParentFile() + "/" + answer).isFile()) {
answers.put(elementTitle, uploadMediaFile(instanceFile, answer));
It's easy to miss that the attachment is actually uploaded here. Clearer:
String mediaHyperlink = uploadMediaFile(instanceFile, answer);
answers.put(elementTitle, mediaHyperlink);
private String spreadsheetId;
private String id;
`id` is only used for the outcome text. Could `jrFormId` be used in the outcome instead so that this field isn't needed?
I’ve eliminated it in my proposed changes.
private Outcome outcome;
private boolean hasWritePermissionToSheet = false;

private boolean hasWritePermissionToSheet;
Is this field actually useful? It's only ever set to true, so I think that means that even if you're uploading a bunch of different forms to different sheets, the field is true as soon as you hit the first sheet you have write access to. I think it can be removed.
Sorry, I missed this way back when. One comment right away is that many of the commits could be combined, and that would help reviewers. |
Here are the changes I recommend, to simplify the error handling: https://github.com/dcbriccetti/collect/commit/fd36388ef0ffcdb91c093ff1003b413da4d1fd5e |
@lognaturel @dcbriccetti I implemented all improvements. |
Thanks for getting those changes in so quickly, @grzesiek2010! @dcbriccetti to summarize, you're not against using exceptions for control flow in a situation like this but introducing a custom exception means that at least we can be sure that it's an "expected" exception. Did I interpret your suggestions right? I think this is now clear enough that we can move to QA. @dcbriccetti? |
Sure, we can move to QA. If this were in Scala (or possibly with more modern Java features available), I would probably not recommend the exceptions. (Scala has What I suggested creates a single exception Here’s my commit comment, since it didn’t survive: Improve InstanceGoogleSheetsUploader error handling Rename Outcome.results to messagesByInstanceId |
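As a rough sketch of the approach discussed here (one custom exception for "expected" failures, caught in a single place), consider the following. `UploadException`, `Step`, and `runStep` are hypothetical names chosen for illustration; the exception and method names in the linked commit may differ.

```java
import java.io.IOException;

final class UploadSketch {
    /** Hypothetical checked exception marking failures we expect and handle. */
    static final class UploadException extends Exception {
        UploadException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** One subtask of an upload, e.g. preparing the sheet or inserting a row. */
    interface Step {
        void run() throws IOException;
    }

    // Translate low-level failures into the single expected exception type.
    static void runStep(String name, Step step) throws UploadException {
        try {
            step.run();
        } catch (IOException e) {
            throw new UploadException(name + " failed: " + e.getMessage(), e);
        }
    }

    /** Returns null on success, or a message describing the first failed step. */
    static String uploadOneInstance(Step prepareSheet, Step insertRow) {
        try {
            runStep("prepare sheet", prepareSheet);
            runStep("insert row", insertRow);
            return null;
        } catch (UploadException e) {
            // Only expected failures land here; anything else still propagates.
            return e.getMessage();
        }
    }
}
```

Because the handler catches `UploadException` rather than `Exception`, a programming error such as a `NullPointerException` is not silently reported as a failed submission.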
if (!sheetCols.contains(col)) {
missingColumns.add(col);
}
// This method builds a column name by joining all of the containing group names using "-" as a separator
We might as well enjoy the benefits of having this be a doc comment, such as pop-up help in the IDE:
`/** This method builds a column name by joining all of the containing group names using "-" as a separator */`
(Please notice the removed extra space.)
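For readers unfamiliar with the naming scheme the comment describes, here is a minimal sketch. `ColumnNames.columnName` is a hypothetical helper, and appending the question's own name after the group names is an assumption rather than something confirmed by the diff.

```java
import java.util.List;

final class ColumnNames {
    /**
     * Builds a column name by joining all of the containing group names,
     * followed by the question name, using "-" as a separator.
     */
    static String columnName(List<String> groupNames, String questionName) {
        StringBuilder name = new StringBuilder();
        for (String group : groupNames) {
            name.append(group).append('-');
        }
        return name.append(questionName).toString();
    }
}
```

Under these assumptions, a question `city` nested in groups `household` and `address` would map to the column `household-address-city`.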
// check if root folder exists, if not then create one
driveHelper.getIDOfFolderWithName(GOOGLE_DRIVE_ROOT_FOLDER, null, true);
private String getGoogleSeetsUrl(Cursor cursor) {
`Sheets`, not `Seets`
@@ -167,7 +167,7 @@ protected void onDestroy() {

@Override
public void uploadingComplete(HashMap<String, String> result) {
Timber.i("uploadingComplete: Processing results (%d) from upload of %d instances!",
Timber.i("uploadingComplete: Processing messagesByInstanceId (%d) from upload of %d instances!",
This looks like a search and replace error.
Regression has been found! I checked this case on v1.13.2. Problem was not visible. Steps to reproduce:
Current behavior:
|
@mmarciniak90 I fixed the problem and you can continue testing once you update your branch. |
selectionArgs, null);
if (cursor != null && cursor.getCount() != 1) {
cursor.close();
if (!new File(filePath).exists()) {
To help reviewers, would you please explain this change and how it fixes the problem? Thanks.
Tested with success! Verified on Android: 4.1, 4.2, 5.1, 6.0, 7.0, 8.0 Verified cases:
@opendatakit-bot unlabel "needs testing" |
👏 👏 👏 |
* Extracted areHeadersEmpty() method
* Extracted areColumnNamesLegal() method
* Extracted areWritePermissionsGranted() method
* Extracted uploadMedia() method
* Extracted resizeSpreadSheet() method
* Extracted addHeaders() method
* Extracted isColumnLengthValid() method
* Extracted readColumnNames() method
* Added instanceFile as a method parameter
* Extracted readAnswers() method
* Extracted areSubmissionColumnNamesLegal() method
* Extracted insertRow() method
* Extracted areEmptyColumns() method
* Extracted handleBlankColumnNames() method
* Extracted addMissingColumns() method
* Extracted checkForMissingColumns() method
* Extracted addPhotos() method
* Extracted updateValues() method
* Extracted getSheetCols() method
* Extracted prepareListOfValues() method
* Extracted getRowFromList() method
* Use insertRow() method in other places too
* Extracted sleepThread() method
* Removed unnecessary comments
* Reduced the code using exceptions
* Fixed bug in updateValues() method
* Removed unnecessary areHeadersEmpty() method
* Removed unnecessary code
* Improved naming
* Moved token check up
* Improved comments
* Implemented addSheet() method
* Support repeat groups
* Code improvements
* Fixed token check
* Removed unnecessary validation
* Fixed problem with empty groups
* Fixed problem with doubled submissions
* Fixed problem with empty answers
* Fixed naming
* Fixed lints
* Fixed bug with duplicated repeat groups
* Check for instanceID
* Support nested repeat groups
* Removed using Exceptions for flow control
* Code improvements
* Removed unnecessary check - Google Sheets allow adding other chars too
* Code improvements
* Improved previous solution - read answers from treeElements instead of parsing submission file
* Ignore empty rows
* Code improvements
* Fixed max columns limit - it's 256 not 255
* Fixing blank headers didn't work
* Ignore template elements
* Code improvements
* Code improvements
* Fixed problem with sending media files
Closes #1385
What has been done to verify that this works as intended?
I tested many different forms:
A form with one repeat,
A form with multiple repeats,
Media files in repeats,
Empty answers etc.
Why is this the best possible solution? Were any other approaches considered?
It works similarly to Briefcase and Aggregate: repeats are stored in a separate sheet.
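The layout described above can be sketched as follows. This is a simplified illustration, not the uploader's actual code: `RepeatSheets.toSheets` is a hypothetical helper, the main sheet is titled `main` only for the example, and the position of the instanceID in each row is an assumption. The idea it shows is that every repeat row carries the parent submission's instanceID, so rows in the repeat sheet can be matched back to the main sheet.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RepeatSheets {
    /** Maps sheet title to rows; every repeat row starts with the parent instanceID. */
    static Map<String, List<List<String>>> toSheets(
            String instanceId, List<String> mainAnswers,
            String repeatName, List<List<String>> repeatAnswers) {
        Map<String, List<List<String>>> sheets = new LinkedHashMap<>();

        // One row in the main sheet: the top-level answers plus the instanceID.
        List<String> mainRow = new ArrayList<>(mainAnswers);
        mainRow.add(instanceId);
        List<List<String>> mainRows = new ArrayList<>();
        mainRows.add(mainRow);
        sheets.put("main", mainRows);

        // One row per repeat instance in a separate sheet named after the repeat.
        List<List<String>> repeatRows = new ArrayList<>();
        for (List<String> answers : repeatAnswers) {
            List<String> row = new ArrayList<>();
            row.add(instanceId);
            row.addAll(answers);
            repeatRows.add(row);
        }
        sheets.put(repeatName, repeatRows);
        return sheets;
    }
}
```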
Are there any risks to merging this code? If so, what are they?
Of course, uploading to Google Sheets should be tested carefully.
Do we need any specific form for testing your changes? If so, please attach one.
Any form with a repeat section.