Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebUtils.createOpenRosaHttpHead can throw uncaught IllegalArgumentException #175

Closed
getodk-bot opened this issue Sep 5, 2016 · 8 comments
Assignees
Labels
bug help wanted Issues that are well-specified and don't require too much context.

Comments

@getodk-bot
Copy link
Member

Issue by mitchellsundt
Thursday Jul 09, 2015 at 19:46 GMT
Originally opened as getodk/getodk#1147 (0 comment(s))


Originally reported on Google Code with ID 1146

What steps will reproduce the problem?
1. Enter an ODK Collect URL with a leading space (e.g. " https://foobar.com")
2. Attempt to submit a form

What is the expected output? What do you see instead?
Expected outcome: ODK Collect should indicate that the URL is invalid
Actual outcome: ODK Collect crashes

Please provide any additional information below.
WebUtils.createOpenRosaHttpHead uses java.net.URI.create without catching the thrown
IllegalArgumentException.  This only manifests itself when forms are uploaded.  This
bit us during a training exercise when half our users were unable to upload forms from
the field with ODK crashing when they tried.  When I got the devices back, I found
this in the logfile:
E/AndroidRuntime(16668): Caused by: java.lang.IllegalArgumentException: Illegal character
in scheme at index 0:  https://<URI removed>
E/AndroidRuntime(16668):    at java.net.URI.create(URI.java:733)
E/AndroidRuntime(16668):    at org.odk.collect.android.utilities.WebUtils.createOpenRosaHttpHead(WebUtils.java:204)
etc

Fix is simple:
1) trim() the URI string before attempting to create a URI, to make this more robust
2) Delcare that the method throws IllegalArgumentException, and catch it in InstanceUplaoderTask.uploadOneSubmission,
along with the other exceptions that are caught and logged properly there.


Reported by inverarity on 2015-05-26 10:01:32

@batkinson
Copy link
Contributor

This issue description lacks enough detail to determine scope, but I believe the root cause implies multiple defects:

  • Collect should prevent saving malformed URLs like this
  • The UI for various platform settings are not consistent in how they validate URL input
  • Submission URLs embedded in xform definitions are not well-validated by hosting server
  • Submission URLs embedded in xform definitions are not validated in Collect

This issue, as currently written, assumes that unencoded spaces in URLs should be treated as valid input and Collect should fix them just before use. This may not be the best way to handle these since, in general, invalid URLs should be detected as early as possible and user feedback given so the problem can be identified and fixed at its source.

Testing using latest release (Collect v1.4.10 1061), it is not possible to enter server URL with leading spaces, depending on which platform is being used. When using ODK Aggregate, the user is allowed to type leading spaces, but the value is not applied and the user is notified that "whitespace is not allowed in URLs". When using Google Drive, Google Sheets, the user can type and save a URL with whitespace. When using Other, user input appears to be filtered; hitting space does not modify the URL text. These should probably be made consistent, if possible. However, it looks like Google Drive, Sheets is the only one allowing URLS to contain leading spaces currently.

Submission URLs embedded in the xform definition can't be fixed at input by Collect, since they are either downloaded from a remote server, or loaded from the file system. The best option here is better validation on servers (like Aggregate) to validate them prior to accepting them. Probably the best way to handle the file system case would be to validate forms on scan and show some indication in Collect that the submission URL is broken.

With all of that said, following the Robustness Principle, it makes sense to prevent leading spaces from being interpreted as part of the URL. Trailing spaces, since they actually could be intentional input, may be different. To accept them and also create a valid URL they would have to be URL-encoded. Requiring URL-encoded input from the user may not be the most user-friendly option, but it would make trailing spaces safe to remove with trim().

I was able to reproduce the reported behavior using this test form. The form specifies a submission URL with a leading space. When attempting to submit the form, the result is an app crash with the full stack trace:

09-20 13:07:36.153 25094-25417/org.odk.collect.android E/AndroidRuntime: FATAL EXCEPTION: AsyncTask #1
                                                                         Process: org.odk.collect.android, PID: 25094
                                                                         java.lang.RuntimeException: An error occured while executing doInBackground()
                                                                             at android.os.AsyncTask$3.done(AsyncTask.java:300)
                                                                             at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:355)
                                                                             at java.util.concurrent.FutureTask.setException(FutureTask.java:222)
                                                                             at java.util.concurrent.FutureTask.run(FutureTask.java:242)
                                                                             at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)
                                                                             at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
                                                                             at java.lang.Thread.run(Thread.java:841)
                                                                          Caused by: java.lang.IllegalArgumentException: Illegal character in scheme at index 0:  https://example.org/submit?deviceID=android_id%3A471d0be9b4a82550
                                                                             at java.net.URI.create(URI.java:727)
                                                                             at org.odk.collect.android.utilities.WebUtils.createOpenRosaHttpHead(WebUtils.java:193)
                                                                             at org.odk.collect.android.tasks.InstanceUploaderTask.uploadOneSubmission(InstanceUploaderTask.java:130)
                                                                             at org.odk.collect.android.tasks.InstanceUploaderTask.doInBackground(InstanceUploaderTask.java:583)
                                                                             at org.odk.collect.android.tasks.InstanceUploaderTask.doInBackground(InstanceUploaderTask.java:71)
                                                                             at android.os.AsyncTask$2.call(AsyncTask.java:288)
                                                                             at java.util.concurrent.FutureTask.run(FutureTask.java:237)
                                                                             at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112) 
                                                                             at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587) 
                                                                             at java.lang.Thread.run(Thread.java:841) 

@batkinson
Copy link
Contributor

For some reason, I was able to run the test form locally. However, when running against a fresh build of master attempting to fill a blank form crashed the app. I tracked the problem back to the auto-save serialization for the submission element - the method can't be null. Attached is
a fixed version of the test form that includes a method attribute.

@batkinson
Copy link
Contributor

The pull request #203 addresses the ultimate cause of the runtime exceptions, but I am not sure how we want to handle this issue. As I mention on #203, I would be reluctant to resolve this per the description. It seems that the immediate issue would be fixed by handling the URLs properly. Changing the upload process to gracefully handle submissions seems like another issue, and it would require more than simply catching any RuntimeException.

@yanokwa
Copy link
Member

yanokwa commented Sep 22, 2016

Thanks for the detailed write up!

I agree that leading spaces should be removed and I agree that consistency across the various platform setting is important. Your changes there are great!

I think trailing spaces should be trimmed. Safari and Chrome (at least recent versions on Mac) trim trailing spaces and I think most people expect that behavior. How strongly do you feel about this?

As to the submission behavior, one way forward is that we can file issues to validate submission URLs at the source in pyxform and in Aggregate/Validate/Collect (probably best to do this in JavaRosa). And then because we don't want perfect to be the enemy of good, we create an issue in Collect to scan new forms and warn about bad URLs and provide some way to override an in-form submission URL (I'm less excited about this).

@meletis @ukanga How have you handled this in your respective platforms?

@meletis
Copy link
Contributor

meletis commented Sep 22, 2016

Well, here in SurveyCTO, we are not suffering from this since our users do not enter the full URL in the settings. They just enter the server name, and since that name represents the sub-domain we construct the URL internally (and safely) in Collect. But we also trim the server name first. :)

@batkinson
Copy link
Contributor

batkinson commented Sep 22, 2016

Hi @yanokwa,

I think trailing spaces should be trimmed. Safari and Chrome (at least recent versions on Mac) trim trailing spaces and I think most people expect that behavior. How strongly do you feel about this?

This is a design preference, but I would do everything I could to give direct feedback to the user as quickly as possible. The pull request I submitted handles this by applying the filter that was being used for the Other platform to all platforms. This makes the input handling consistent, and should prevent the server URL from containing spaces to begin with.

As to the submission behavior, one way forward is that we can file issues to validate submission URLs at the source in pyxform and in Aggregate/Validate/Collect (probably best to do this in JavaRosa). And then because we don't want perfect to be the enemy of good, we create an issue in Collect to scan new forms and warn about bad URLs and provide some way to override an in-form submission URL (I'm less excited about this).

I agree with the spirit of this. Anything that generates a form should provide user feedback, and/or prevent invalid input. While it doesn't solve the general case of a malformed URL, the pull request I submitted trims white space of form submission URLs since they can not be assumed to be validated (could have been hand-coded XML).

@mitchellsundt
Copy link
Contributor

I recommend that the patch should prevent both leading and trailing whitespace.

If there is ever a need to allow a trailing space, then the user can append a trailing slash (/) to indicate that the space should be retained.

If you are working off of a web page or word or other document, it is very easy to accidentally copy-and-paste URLs with leading or trailing whitespace.

@batkinson
Copy link
Contributor

batkinson commented Sep 23, 2016

Thanks for the feedback. Three things:

  • The pull request trims both leading and trailing whitespace for URLs from forms
  • For the server URL from settings, all platforms are treated the way only platform type Other was previously - control characters and spaces are not allowed as input
  • I don't think a trailing slash works in general, for example http://example.com?delim=[space]

The reason I went forward trimming both sides is that the user can use URL-encoded spaces. For example, http://example.com?delim=+ or http://example.com?delim=%20.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug help wanted Issues that are well-specified and don't require too much context.
Projects
None yet
Development

No branches or pull requests

5 participants