Filter_loaded_trips #46

Conversation

allenmichael099
Contributor

- made mod_load_trips
- queried trips by timestamp
- made a new summarise_trips that doesn't require all of trips data
- set the map to use trips_with_trajectories
- uncommented locations and trajectories within mod_load_trips
- locations are now filtered by the same dates as trips
- set max trip date window to 31 days
- worked on handling empty query but no luck yet
- added functionality to prevent the user from getting an empty query
- the checks are done in mod_load_trips in the load_allowed reactive
- removed remaining traces of trips from mod_load_data
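
The checks listed in these commits (a 31-day cap on the trip date window and a guard against empty queries) could be expressed as a small pure function that a reactive such as load_allowed calls. This is only a sketch: `check_load_allowed`, its arguments, and the way the window is measured are assumptions for illustration, not code from this PR.

```r
# Hypothetical helper: decide whether a trip query should run.
# `n_trips` is assumed to come from a cheap count query for the date range.
check_load_allowed <- function(start_date, end_date, n_trips,
                               max_window_days = 31) {
  start_date <- as.Date(start_date)
  end_date <- as.Date(end_date)
  # Assumption: the window is the number of days between the two dates.
  window_width <- as.integer(end_date - start_date)
  if (window_width < 0) {
    return(list(allowed = FALSE, reason = "end date is before start date"))
  }
  if (window_width > max_window_days) {
    return(list(allowed = FALSE, reason = "date window exceeds the maximum"))
  }
  if (n_trips == 0) {
    return(list(allowed = FALSE, reason = "no trips in the selected range"))
  }
  list(allowed = TRUE, reason = NA_character_)
}
```

Keeping the decision in a plain function like this makes it testable without starting a Shiny session; the reactive would only pass in the current inputs and act on `allowed`.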
@allenmichael099
Contributor Author

There’s more to improve after I get back but I think this is good to go for the moment.

@allenmichael099 allenmichael099 marked this pull request as ready for review June 19, 2021 03:32
@shankari
Collaborator

@asiripanich I am going to try to merge this to a branch for now so I can deploy it.

@asiripanich
Owner

> @asiripanich I am going to try to merge this to a branch for now so I can deploy it.

Hi Shankari, sure, but please make sure it works on your local machine. Unfortunately, I don't have time to review the PRs this week.

@shankari
Collaborator

While testing with this PR on my local laptop and a dataset with ~ 32 trips from one user in the past month, the process was still killed

Running: mod_load_data_server
Running: mod_load_data_server
About to load server calls
Finished query, about to tidy server calls
Finished tidying server calls
Finished loading server calls
About to load participants
merging
Killed

On retrying, it worked

Running: mod_load_data_server
Running: mod_load_data_server
About to load server calls
Finished query, about to tidy server calls
Finished tidying server calls
Finished loading server calls
About to load participants
merging
Finished loading participants
Window_width is 31
About to load trips
Finished query, about to clean trips
Finished cleaning trips
Finished loading trips
About to load locations
Finished query, about to clean locations
Finished cleaning locations
Finished loading locations
About to create trajectories within trips
About to generate trajectories
Trajectories created
Finished creating trajectories within trips
Warning in grSoftVersion() :
  unable to load shared object '/usr/local/lib/R/modules//R_X11.so':
  libXt.so.6: cannot open shared object file: No such file or directory
Warning: Removed 15 rows containing non-finite values (stat_count).

The map still didn't load because none of the entries had labels, so the mode_confirm column was not in the table. Not sure whether I should deploy this yet.

@shankari
Collaborator

@allenmichael099 how much data did you test this out on?

@shankari
Collaborator

After switching to the month of January, it was killed again.

Window_width is 59
Window_width is 31
About to load trips
Finished query, about to clean trips
Finished cleaning trips
Finished loading trips
About to load locations
Finished query, about to clean locations
Killed

I'm going to hold off on deploying this fix.

@shankari
Collaborator

Re-disabling the map view using the following patch for deployment since I still want to get the fix for #18

@shankari
Collaborator

Still getting some kills. Looking at it a bit further - the process appears to be killed at this point:

About to load participants
merging
Killed

on a successful load, we see

About to load participants
merging
Finished loading participants

The related code is

      message("About to load participants")
      data_r$participants <-
        tidy_participants(query_stage_profiles(cons), query_stage_uuids(cons)) %>%
        summarise_trips_without_trips(., cons) %>%
        summarise_server_calls(., data_r$server_calls)
      message("Finished loading participants")

The last merging message is from summarise_server_calls

  message("merging ")
  # merge(participants, usercache_get_summ, usercache_put_summ, by = "user_id", all.x = TRUE)
  merge(participants, usercache_get_summ, by = "user_id", all.x = TRUE) %>%
    merge(., usercache_put_summ, by = "user_id", all.x = TRUE) %>%
    merge(., diary_summ, by = "user_id", all.x = TRUE)

Why is merging with the summaries so data intensive? Each summary only has the first and last entry.
One potential fix would be to filter the server calls also by the date range.
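
The suggested fix (filtering server calls by the same date range before summarising) might look like the sketch below. It uses base R only; the function name `summarise_calls_in_range` and the column names `user_id` and `ts` are assumptions for illustration and are not taken from the emdash code.

```r
# Hypothetical sketch: restrict server calls to the query window, then keep
# only the first and last call per user before merging with participants.
# Assumed columns: `user_id` and `ts` (epoch seconds).
summarise_calls_in_range <- function(server_calls, start_ts, end_ts) {
  keep <- server_calls$ts >= start_ts & server_calls$ts < end_ts
  in_range <- server_calls[keep, , drop = FALSE]
  if (nrow(in_range) == 0) {
    # Return an empty summary with the expected columns.
    return(data.frame(user_id = character(0),
                      first_call = numeric(0),
                      last_call = numeric(0)))
  }
  first <- aggregate(ts ~ user_id, data = in_range, FUN = min)
  last <- aggregate(ts ~ user_id, data = in_range, FUN = max)
  names(first)[2] <- "first_call"
  names(last)[2] <- "last_call"
  merge(first, last, by = "user_id")
}

participants <- data.frame(user_id = c("a", "b", "c"),
                           stringsAsFactors = FALSE)
server_calls <- data.frame(
  user_id = c("a", "a", "b", "a"),
  ts = c(100, 200, 150, 900),
  stringsAsFactors = FALSE
)

calls_summ <- summarise_calls_in_range(server_calls, start_ts = 0, end_ts = 500)
# Left join, so participants without calls in the window are kept with NA.
merged <- merge(participants, calls_summ, by = "user_id", all.x = TRUE)
```

With this shape the merge only ever sees one row per user, so its cost no longer depends on how many raw server calls exist outside the selected window.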

shankari added a commit to shankari/emdash that referenced this pull request Jun 24, 2021
@shankari
Collaborator

shankari commented Jun 24, 2021

This also seems to fail on production, maybe because the range is too short?

Listening on http://0.0.0.0:80
2021-06-24
2021-06-16

No further log messages, no data visible in the dashboard

[Image: Screen Shot 2021-06-23 at 10 57 07 PM]

@shankari
Collaborator

@allenmichael099 I have shared a mongodump of the data in the environment that generates this
#46 (comment)

This is 100% reproducible. We should also talk a bit about thinking through various scenarios and adding unit tests

- Fixed minimum trip date
- moved trip_trend inside observeEvent so it reacts to a change in trips
- commented out maps, locations, and trajectories
- altered dates from input so the timestamps include all days that they should.
- EX: When both dates are 5-06-21, the final query timestamp should be the start of 5-07-21
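
The date handling described in the last two bullets can be read as converting an inclusive date range into a half-open timestamp interval. A minimal sketch follows, assuming dates are interpreted in UTC and timestamps are epoch seconds; `date_range_to_ts` is a hypothetical name, and the real app may use a different time zone.

```r
# Hypothetical helper: turn an inclusive date range from the UI into a
# half-open [start_ts, end_ts) interval in epoch seconds.
# Assumption: dates are interpreted as UTC midnights.
date_range_to_ts <- function(start_date, end_date) {
  start_date <- as.Date(start_date)
  end_date <- as.Date(end_date)
  list(
    start_ts = as.numeric(as.POSIXct(start_date, tz = "UTC")),
    # Start of the day after end_date, so the whole end date is included.
    end_ts = as.numeric(as.POSIXct(end_date + 1, tz = "UTC"))
  )
}

# Both dates equal: the interval still covers one full day.
rng <- date_range_to_ts("2021-05-06", "2021-05-06")
rng$end_ts - rng$start_ts
```

Querying with `ts >= start_ts` and `ts < end_ts` then includes every trip on the end date without double counting trips at the boundary.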
@shankari
Collaborator

Potential unit tests:

  • No trips
  • Trips for only one week
  • Trips without any labels
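
The three scenarios above could be sketched as follows. The function `count_confirmed_modes` and the table layouts are hypothetical stand-ins for whatever the dashboard actually summarises, and plain `stopifnot` is used to keep the example dependency-free; the project's real tests may use a different framework.

```r
# Hypothetical function under test: count trips per confirmed mode.
# It must cope with empty tables and with tables that have no label columns.
count_confirmed_modes <- function(trips) {
  if (nrow(trips) == 0 || !"mode_confirm" %in% names(trips)) {
    return(data.frame(mode_confirm = character(0), n = integer(0)))
  }
  counts <- table(trips$mode_confirm)
  data.frame(
    mode_confirm = names(counts),
    n = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

# Scenario: no trips.
no_trips <- data.frame(user_id = character(0), start_ts = numeric(0))
stopifnot(nrow(count_confirmed_modes(no_trips)) == 0)

# Scenario: trips for only one week.
one_week <- data.frame(
  user_id = "a",
  start_ts = 1:3 * 86400,
  mode_confirm = c("walk", "bike", "walk"),
  stringsAsFactors = FALSE
)
week_counts <- count_confirmed_modes(one_week)
stopifnot(week_counts$n[week_counts$mode_confirm == "walk"] == 2)

# Scenario: trips without any labels (no mode_confirm column at all).
unlabeled <- one_week[, c("user_id", "start_ts")]
stopifnot(nrow(count_confirmed_modes(unlabeled)) == 0)
```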

@allenmichael099
Contributor Author

@shankari I'm having trouble replicating the 'killed' error. Using the January 2021 data with 36 participants the server calls merged fine.

@shankari
Collaborator

shankari commented Jul 1, 2021

ok, I will deploy this version and see if I run into the killed issue on production.

allenmichael099 and others added 7 commits July 2, 2021 10:22
- made functions to query get/put/diary first and last server calls
- used those in an altered summarise_server_calls
- removed query_server_calls lines from mod_load_data
- uncommented map, locations, and trajectories sections
- added details to merge and timestamp messages
- when there are no user inputs, no user input columns will be added.
- Wrote a message to notify when there are no user inputs
- Moved functions following the query out of query_cleaned_trips_by_timestamp
Made tests for:
- tidy_cleaned_trips_by_timestamp
- get_query_size
- summarise_trips_without_trips
- summarise_server_calls
@asiripanich asiripanich merged commit 1c45f0f into asiripanich:master Jul 8, 2021