Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forms don't download on slow connections #2134

Closed
sglangevin opened this issue Mar 23, 2016 · 27 comments
Closed

Forms don't download on slow connections #2134

sglangevin opened this issue Mar 23, 2016 · 27 comments
Labels
Priority: 1 - High Blocks the next release. Type: Bug Fix something that isn't working as intended

Comments

@sglangevin
Copy link

Requests for form definitions seem to be cancelled after 10 seconds. On slow connections, this isn't enough time to download the form definition. Can we extend this timeout?

@SCdF
Copy link
Contributor

SCdF commented Mar 23, 2016

It shouldn't cancel a download after 10 seconds if it's managed to actually connect (I hope!). What is more likely is that it's not even managing to connect in those 10 seconds.

Regardless, we should be able to change that to something like 30 seconds pretty easily I think.

@sglangevin
Copy link
Author

That makes sense. Changing it to something higher would help a lot and make it easier to load phones on slower connections. We had to ask everyone in this training to turn off WiFi so that I can update some phones :)

@SCdF
Copy link
Contributor

SCdF commented Mar 23, 2016

@sglangevin can you go into a bit more detail about when these timeouts are occurring?

Currently it looks like for the initial replication we have a timeout of 30 seconds, and for continuous replication we don't have a timeout at all!

Having said that, it's unclear to me which timeout we are talking about in our code, and whether or not those values are also used for individual document requests.

@sglangevin
Copy link
Author

The 30 second timeout for initial replication isn't long enough. It almost always fails on the WiFi at LG HQ. I've also seen it fail on better connections (WiFi in Nairobi). Haven't tested thoroughly on an SF-speed connection.

The easiest way to observe this is to try a clean install on your Tecno on a slow internet connection using developer tools so that you see the network requests and console. I will take some screen shots here as well.

@SCdF
Copy link
Contributor

SCdF commented Mar 23, 2016

OK ignore me I've found it. There is a separate timeout that you set specifically for ajax requests.

@sglangevin
Copy link
Author

Also, just as a note. If initial replication fails, forms are still downloaded during continuous replication. But it only seems to try to download form definitions once, so if it doesn't work the first time you have to clear data and start again. I may also not be understanding what is happening correctly and the forms would come later, but I haven't seen that happen before.

SCdF added a commit that referenced this issue Mar 23, 2016
By default PouchDB has a 10 second timeout for HTTP requests. To improve
reliability on bad connections this ups the value to 30 seconds.

NB: This is a separate value to the general replication timeout.

Issue: #2134
@SCdF
Copy link
Contributor

SCdF commented Mar 23, 2016

AFAIK forms are just documents like everything else, so I'd hope it would try again.

@sglangevin
Copy link
Author

Maybe someone can tell me when that would happen - it doesn't seem like reload helps with that and I haven't seen it retry on forms or icons before. But maybe that would happen the next time I re-open the app? I haven't tested thoroughly enough to know but was wondering if anyone can confirm how this is supposed to work.

@garethbowen
Copy link
Member

I don't think that's the right timeout. That changes the timeout for stuff we request from the remote db which isn't very common. The timeout we need to set is buried deep inside pouch replication. I don't think there's any way currently to change this without patching pouch.

@SCdF
Copy link
Contributor

SCdF commented Mar 24, 2016

That changes the timeout for stuff we request from the remote db which isn't very common.

Yep that was the one I was intent on changing, based on this comment:

Requests for form definitions seem to be cancelled after 10 seconds.

So I figure (@sglangevin correct me if I'm wrong) that she meant that she's watching the network tab and Pouch does a GET for a form and then it times out. FWIW I've also seen this before.

@SCdF
Copy link
Contributor

SCdF commented Mar 24, 2016

And then 10 seconds later after reading the code you linked…

Oh right. Darn.

@SCdF
Copy link
Contributor

SCdF commented Mar 24, 2016

So yeah, I've also looked through the code and you're totally right, it's not clear you can actually change this value at all. What is confusing to me is that it's not clear why the replication timeout (what you specify when kicking off a replication) isn't also used in this situation: that would make sense to me.

@sglangevin
Copy link
Author

Too bad. There are 2 timeouts that are affecting initial replication. One is the timeout for initial replication, which seems to be 30 seconds. Initial replication completes but has status failed, then continuous replication begins and when the series of GET requests go out for forms and icons, if there is no response within 10 seconds, they cancel and forms are never downloaded. The only solution is to clear the app's data, try again, and hope for the best.

FWIW I'm also seeing this on slow connections when using the webapp. If you are trying to create a CHP area, there is a person dropdown for CHP and for supervisor. Often, these also timeout as the timeout is still 10 seconds when loading things from the DB into the webapp. From the thread on the other issue, it sounds like Couch 2.0 might help, but I do see this being a consistent problem on slower connections, especially when there are a lot of people in the database.

@SCdF
Copy link
Contributor

SCdF commented Mar 24, 2016

Hrm, reading the code, we actually manage to pass down timeout: false when doing continuous replication (to reproduce devs, find the function ajax(opts, callback) { function and put a breakpoint on it, and then change a known contact via admin), which makes it look like pouch is not setting a timeout. This may function differently when doing the first replication, I might clear my cache and take a look at that.

Reports of something failing to download and then never downloading is worrying. I'm wondering if it's a bug in pouch around attachments (we have documents for forms but it's just metadata, the actual form is an attachment on that document), because everything I know about Pouch suggests that if it tries to replicate a document and it fails it should try again until it succeeds, and only then mark that document as replicated.

I will try to reproduce it if possible.

@alxndrsn alxndrsn self-assigned this Mar 30, 2016
@alxndrsn
Copy link
Contributor

@sglangevin next time you have an instance which doesn't work, could you run the following in the javascript console?

angular.element(document.body).injector().get('DB').get()
  .query('medic/doc_by_type', { startkey:['form'], endkey:['form'], include_docs:true, attachments:true })
  .then(function(res) {
    console.log('RAW RESULT', res);
    return res.rows;
  })
  .then(function(rows) {
    var report = {
      xml: [],
      no_xml: [],
    };
    rows.forEach(function(row) {
      if(row.doc._attachments && row.doc._attachments.xml) {
        report.xml.push(row.id);
      } else {
        report.no_xml.push(row.id)
      }
    });
    return report;
  })
  .then(function(report) {
    console.log('----- REPORT');
    console.log('- XML:   ' + report.xml.length);
    console.log('---> ' + report.xml.join(', '));
    console.log('- OTHER: ' + report.no_xml.length);
    console.log('---> ' + report.no_xml.join(', '));
  })
  .catch(function(err) { console.log('error', err); });

I'm interested to know if the form docs are downloading without attachments, or if the whole doc is missing.

@alxndrsn alxndrsn assigned sglangevin and unassigned alxndrsn Mar 30, 2016
@sglangevin
Copy link
Author

@alxndrsn I ran into this issue again today due to slow network here in NBO. Here is the output from running that code, forms that are definitely missing are contact:clinic:edit, contact:health_center:create, contact:health_center:edit:
image

@alxndrsn
Copy link
Contributor

alxndrsn commented Apr 1, 2016

For more detailed output of the forms you have locally, run this:

angular.element(document.body).injector().get('DB').get()
  .query('medic/doc_by_type', { startkey:['form'], endkey:['form'], include_docs:true, attachments:true })
  .then(function(res) {
    console.log('RAW RESULT', JSON.stringify(res));
  });

@alxndrsn
Copy link
Contributor

alxndrsn commented Apr 1, 2016

From @sglangevin, server alpha.dev with a fresh browser (desktop):

Local forms:

form:assessment
form:assessment_follow_up
form:assessment_image
form:contact:clinic:create
form:contact:person:create
form:contact:person:edit
form:effective_communication
form:ovc_survey
form:postnatal_care
form:pregnancy
form:pregnancy_referral_follow_up
form:pregnancy_visit
form:stock

All of these forms seem to have valid XML, except for stock, which was not an XML form.

Forms on server

form:assessment_follow_up
form:assessment_image
form:contact:clinic:create
form:contact:clinic:edit
form:contact:health_center:create
form:contact:health_center:edit
form:contact:person:create
form:contact:person:edit
form:effective_communication
form:family_survey
form:it_support
form:observation
form:ovc_survey
form:postnatal_care
form:pregnancy
form:pregnancy_referral_follow_up
form:pregnancy_visit
form:preparation

NB

Does not update

This report was originally done 40 minutes ago. No new forms have arrived in the intervening period, despite working internet connection.

Deleted forms

Forms appear client-side which were deleted and no longer appear on the server:

form:assessment
form:stock

Missing forms

The following are missing client-side:

form:contact:clinic:edit
form:contact:health_center:create
form:contact:health_center:edit
form:family_survey
form:it_support
form:observation
form:preparation

@alxndrsn
Copy link
Contributor

alxndrsn commented Apr 1, 2016

Would be interesting to see a comparison of the XML sizes for the different forms - perhaps the missing ones are really big.

@alxndrsn
Copy link
Contributor

alxndrsn commented Apr 1, 2016

Forms size comparison:

form size (bytes)
form:assessment 58581
form:assessment_follow_up 19055
form:contact:health_center:create 9020
form:contact:health_center:edit 2032
form:effective_communication 4085
form:contact:clinic:create 16699
form:contact:clinic:edit 2251
form:family_survey 5616
form:it_support 2738
form:observation 3303
form:ovc_survey 24664
form:contact:person:create 4882
form:contact:person:edit 2860
form:postnatal_care 22999
form:pregnancy 27185
form:pregnancy_referral_follow_up 9796
form:pregnancy_visit 23207
form:preparation 3444
form:assessment_image 67339

@alxndrsn
Copy link
Contributor

alxndrsn commented Apr 1, 2016

form:contact:health_center:edit is missing, but only 2KB.

@alxndrsn
Copy link
Contributor

alxndrsn commented Apr 1, 2016

@sglangevin reports that restarting the app does not re-trigger the form download - local storage must be wiped and process restarted from scratch.

@garethbowen garethbowen added Type: Bug Fix something that isn't working as intended and removed Mobile Only labels Apr 3, 2016
@garethbowen
Copy link
Member

Yeah, looks like a pouch regression. seq is updated the same whether or not all docs replicated successfully.

@alxndrsn
Copy link
Contributor

alxndrsn commented Apr 6, 2016

  • fix pushed to develop and testing
  • new issue created for updating pouch once an official fix is available

@garethbowen
Copy link
Member

Looks good.

@garethbowen
Copy link
Member

Related: #2167

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: 1 - High Blocks the next release. Type: Bug Fix something that isn't working as intended
Projects
None yet
Development

No branches or pull requests

5 participants