Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deploy new national commute data #703

Open
Robinlovelace opened this issue Apr 1, 2019 · 90 comments

Comments

Projects
None yet
7 participants
@Robinlovelace
Copy link
Member

commented Apr 1, 2019

As discussed with @AnnaGoodman1 the data is now ready to go. The only data that will change is the vector route network. As of npct/pct-outputs-regional-R@cb9f767 we now have test data for the isle-of-wight, which we can use to develop on the test server.

The plan for the new data is to:

The person mentioned in each task is the best person for the job I think - let me know if you think any of these should be changed. A rough ETA of when these can be done would be useful (I can get the remaining data up by the end of the week). Looking forward to getting the updated national commute data out there!

Suggested target date for deployment: Friday 19th April, to provide 2 weeks for smoothing any wrinkles, but happy to push back if the tasks outlined above take longer.

@JDWoodcock

This comment has been minimized.

Copy link
Contributor

commented Apr 2, 2019

thanks @Robinlovelace what else needs to go out with near market in terms of documentation @RachelAldred and @AnnaGoodman1 . Our old result will change to- does anyone see any issues with this?

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 2, 2019

Good point about the documentation - I've added another point on that tagging Rachel for now.

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 4, 2019

@Robinlovelace , it looks like you have uploaded Isle of Wight data to the git server? please don't do that for other regions, otherwise you are doing a half launch, since that is where the data the server is pointing to. But I guess doing it just for IOW not terrible. otherwise looks good!

re. documentation, I think I have written what we need (At least a minimally acceptable version), currently here: Dropbox\PCT\2_WorkInProgress\Anna\NearMarket

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 4, 2019

ALSO please make all these shiny changes on the branch #668 !

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

Great. @AnnaGoodman1 these recent commits have no impact on the results because the new rnet files live in the LSOA not the MSOA folder. If you want to test the data, as I have done, you can do this:

file.copy(paste0("../pct-outputs-regional-R/commute/lsoa/", r, "/rnet.Rds"),
          paste0("../pct-outputs-regional-R/commute/msoa/", r, "/rnet.Rds"), overwrite = TRUE)

But please revert that change and do not push it. Of course I'm not planning to do a 'half launch' but thanks for confirming! Note: the production server only updates when we change the 'data sha'.

Good point about making the changes on that branch. @usr110 are you up for testing that on the test LSOA commute rnet data that I've pushed for isle-of-wight for testing purposes? Good to keep the ball rolling on this. Working on the rasters.

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

BTW many thanks for updating the 'to-do list' @AnnaGoodman1, just noticed that. Very useful.

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 4, 2019

@Robinlovelace , oh yes , good point that they are lsoa files not previously there -phew!

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 4, 2019

@Robinlovelace , I am updating region stats info page. Do you yet know what the minimum go dutch route segment number will be for each region? I can see standard is 10, but don't know how to tell which regions exceed that threshold...

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

Good question that raises another one: where should we save this data? In the main national build file, or in a new one. This time the values are based on an upper limit of 20,000 features (~2/3rds of the schools rnet in London which is quite slow) and I rounded to the nearest 10, resulting in these numbers so far:

   region_name dutch_slc_min n_row
1                        london           150 19841
2            greater-manchester            30 19387
3         liverpool-city-region             0 17648
4               south-yorkshire             0 13959
5                    north-east            20 19535
6                 west-midlands            30 19230
7                  warwickshire            NA    NA
8                west-yorkshire            NA    NA
9                      cheshire            NA    NA
10                   lancashire            NA    NA
11                   humberside            NA    NA
12              north-yorkshire            NA    NA
13                   derbyshire            NA    NA
14               leicestershire            NA    NA
15              nottinghamshire            NA    NA
16       hereford-and-worcester            NA    NA
17                   shropshire            NA    NA
18                staffordshire            NA    NA
19                         avon             0 13829
20                        devon            NA    NA
21                       dorset            NA    NA
22                    wiltshire            NA    NA
23               cambridgeshire            NA    NA
24                 bedfordshire            NA    NA
25                        essex            NA    NA
26                         kent            NA    NA
27                    berkshire            NA    NA
28              buckinghamshire            NA    NA
29                  east-sussex            NA    NA
30                    hampshire            NA    NA
31                isle-of-wight             0  1311
32 cornwall-and-isles-of-scilly            NA    NA
33                      cumbria            NA    NA
34              gloucestershire            NA    NA
35                hertfordshire            NA    NA
36                 lincolnshire            NA    NA
37                      norfolk            NA    NA
38             northamptonshire            NA    NA
39               northumberland            NA    NA
40                  oxfordshire            NA    NA
41                     somerset            NA    NA
42                      suffolk            NA    NA
43                       surrey            NA    NA
44                  west-sussex            NA    NA
45                        wales            NA    NA
@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 4, 2019

@Robinlovelace : As for schools layer, the current bodge is that I paste the numbers into the building stats files, i.e. https://github.com/npct/pct-scripts/blob/master/commute/lsoa/build_params_pct_region.csv . You ideally could set up the script to export this number into that file....? (Or we can leave it as a bodge.)

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

Thanks, yes that's the ideal solution, and I'll endeavour to add columns with the appropriate info. Many thanks!

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 4, 2019

example code is e.g.
build_params[build_params$region_name == region, ]$n_flow_rnet <- nrow(rf_rnet)

in 07b.2, line 45

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

I'll be able to add them all in one go soon: the regional rnets are half way done (for the second time: the first time I missed the ebike scenario).

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

Update on this: the national rnet has been uploaded! See: https://github.com/npct/pct-scripts/releases

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

@mem48 can you do the rasterisation?

Update on methods: I suspect gdal may be faster than the previously used method - see here for documentation: https://gdal.org/gdal_rasterize.html

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 5, 2019

Ok great, either send me the full CSV or update the build Param file, with minimum go Dutch flows. Thanks Anna

@usr110

This comment has been minimized.

Copy link
Member

commented Apr 5, 2019

Good point about making the changes on that branch. @usr110 are you up for testing that on the test LSOA commute rnet data that I've pushed for isle-of-wight for testing purposes? Good to keep the ball rolling on this. Working on the rasters.

Sure. Will have to be next week - same goes for the the clickable rnet option.
Glad you and @AnnaGoodman1 mentioned the branch. It was slightly confusing otherwise.

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2019

Hi guys, just found a faster way of generating the national rasters and the results are looking good! Question: what resolution? The results at the highest resolution (10 m) no longer look great because the new method doesn't convert lines into polygons before rasterizing. I think the polygons were ~30 wide in any case so I've tried rastering them at this resolution and I think the results look good. There are no more high value 'artefacts' at junctions. The results are more 'blocky' but I think this may be no bad thing, as we want people to understand that the results are not saying 'build on this specific path' but 'build along this desire line and in these areas'. Result below show before, after (30m) and after (10m). Cc @AnnaGoodman1 @JDWoodcock @RachelAldred @mem48 @nikolai-b @usr110 - thoughts welcome!

Another thought I had: we could make the visible raster values semi-transparent so they don't hide road names etc. I think that would be beneficial for users if not too hard to do.

Before:

image

After (30 m):

image

After (10 m):

image

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2019

Also progress on getting the stats for the regional rnets, apologies this is slow in coming @AnnaGoodman1 : https://github.com/npct/pct-scripts/blob/755ff49728409b598e69b95570885631c3e109f0/commute/lsoa/build_params_pct_region.csv.

Any objections if I upload the lsoa route networks? Want to keep momentum up on this. Any update on the UI changes that can support this new data (already there for the test region of isle-of-wight)?

I think @usr110 or @nikolai-b are well placed to do the UI changes and that sooner is probably better than later from a testing perspective.

@RachelAldred

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

Just a heads up that I am demonstrating the pct in Maidstone Tuesday afternoon at an event so would be good not to be doing anything then that might cause instability :) thanks.

@RachelAldred

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

That's Tues 15th.

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2019

UI changes will only affect the test server so shouldn't be an issue. Thanks for heads-up in any case. Still aiming for deployment on 19th, but if UI/other changes are not ready by then we can always delay.

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

For raster resolution, can we not use the same method that was recently used in the schools layer? That looks very good, see below. I don't like the look of the 30m one, it is much less attractive than the current commute or schools layers

Untitled

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

@usr110 , @Robinlovelace : In terms of the UI, , Robin and I had a conversation about this. My suggestion was that the lsoa rnet RDS files simply get saved in the relevant place in the msoa file structure too. I appreciate it is not DRY ideal to save the same file in two places, but that would mean that we would need pretty much no changes to pct-shiny code. I think the alternative may be to introduce lots of messy 'if' statements into pct-shiny server.R file. I'm very happy to be proved wrong on this, but if it does look like it will be a coding nightmare then I suggest we do consider saving the LSOA file in two places

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2019

Hi @AnnaGoodman1 yes it's possible, but perhaps not within the time-scales required for the national build. Agree that image, which uses polygons representing roads, looks good. @mem48 tried to re-build it but lacks computer and perhaps time capacity to do it within the time constraints I believe. Can have a play. I think the optimum would be:

  • To have high (10 m) underlying resolution
  • To have partial transparency, so that any road names currently hidden in the image pasted above can be seen
  • To remove the artefacts which provide incorrect results at junctions
@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2019

Update, especially for @AnnaGoodman1 and @mem48, I've found a solution that hits all of those criteria, building on the code written by Malcolm, but using GDAL rather than fasterize to do the actual rasterisation, which seems much (100+ times) faster.

Result (note the transparency, I think this is possible in leaflet) below. Thanks for the feedback, which has led to useful iteration.

image

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2019

Note: this is with the previous method, with artefacts. I think it's best to follow #704 and update the school raster rather than make the new commute raster like the old school raster based on this, and the fact it's quicker to produce:

image

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2019

A better image of the old raster file:
image

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 17, 2019

I'm confident that ignoring geography and hardcoding the LSOA folder for the vector rnet in this repo will be less work and better practice than copying all those MB of data in the pct-outputs-regional-R repo.

Also reduces the chances of confusion among people who find the data repos. So yes, fairly strong preference. I'm happy to do this. Heads-up also @nikolai-b.

usr110 pushed a commit to npct/pct-outputs-regional-R that referenced this issue Apr 17, 2019

usr110 added a commit that referenced this issue Apr 17, 2019

@usr110

This comment has been minimized.

Copy link
Member

commented Apr 17, 2019

Thanks both @Robinlovelace and @AnnaGoodman1 for the insight. I understand the issue of rnet for MSOA much better now. I was siding with you on @AnnaGoodman1 that changing the code at the shiny end would be rather fiddly, instead of easier way out of copying rnet from LSOA to MSOA. However it turns out it was rather simple. I made the changes here: ca2dea5

With a bit of testing, it seems to work fine. You may like to test it yourself.

Re rnet_full, as you say that @Robinlovelace that for some of the smaller regions it will be identical to rnet. In that case, I would create a small file, say called identical, indicating that it's identical to rnet, so rnet should be read instead.
For bigger regions, I would create separate rnet and rnet_full. This is more efficient, I think. What do you think?

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 17, 2019

Hi @usr110 great solution in ca2dea5, I guessed it would be few lines of code and think the simple if() statement is a future-proof fix that adheres to the existing structure of the data. Fantastic work!

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 17, 2019

Regarding the separation of rnet and rnet_full, I agree with your idea in principle: even more DRY. However, given that the data has already been pushed, I'm not sure that implementing the idea is a good use of time. Happy to consider a PR that implements your suggestion if you can find a very fast fix. One other idea: use symbolic links to avoid unnecesary duplication of data. Question for @nikolai-b would symbolic links work like that in git-lfs?

usr110 added a commit that referenced this issue Apr 17, 2019

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 17, 2019

Part of the issue is that rnet_full is accessed in the data download page. So you would need also to implement some statement there, saying 'do rnet full if exists, otherwise rnet'. Fine if not fiddly, otherwise also fine by me to leave as is. Well done @usr110 (and @Robinlovelace )

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 17, 2019

Good point Anna, I was thinking only from shiny's narrow perspective on the world 💻. From the downloads perspective, it makes sense to keep them as separate files IMO.

@nikolai-b

This comment has been minimized.

Copy link
Contributor

commented Apr 17, 2019

@Robinlovelace and @usr110 instead of identical just use symbolic links like @Robinlovelace suggests #703 (comment). believe they should work fine.

nikolai-b added a commit that referenced this issue Apr 17, 2019

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 17, 2019

Good to know it's possible to use symbolic links with lfs. Not a priority though so I suggest we keep it as is for now.

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 17, 2019

Now that #710 is fixed thanks to uploaded data from @AnnaGoodman1 I think that is everything. Am I right in thinking that we just need to test the new data on the test server before deploying? I've checked in a few random regions and all seems good, and I think that #708 makes the data load quicker too, great work @nikolai-b. Please @JDWoodcock and @RachelAldred have a play and report back if you find any issues. This is almost ready to go!

@JDWoodcock

This comment has been minimized.

Copy link
Contributor

commented Apr 18, 2019

@Robinlovelace what specifically should we check

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 18, 2019

Please check each of the scenarios works for zones, lines, routes, route network and route network LSOA options for a selection of regions, e.g.:

  • Isle of Wight
  • London
  • GM
  • Cheshire
  • Liverpool City Region
  • West Midlands
  • Devon

I think if we also try jumping between regions and type of trip that would be good - let us know of any bugs here. And of course any more issues with the landing page.

I've a few more ideas on next steps, discussed with @mem48 (putting the raster tiles on the landing page national map and, more tentatively, making the LSOA route network more prominent, maybe even on by default in the app) but will save that for another day.

@JDWoodcock

This comment has been minimized.

Copy link
Contributor

commented Apr 18, 2019

some quick testing in central England and working well with minimal lags :)
only issue is colouring of image route network. With tiles on it is often hard to make out. I think we need stronger colours. That said at the moment the clickable network is working pretty well- a few second delay when looking at whole of WM but not bad at all. I tested with chrome.

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 18, 2019

working well for me. Agree image rnet only looks v. good with black and white background and no zones - but that seems ok to me?

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 18, 2019

I think the fact that the basemap is B+W by default saves it. Disadvantages of disrupting user experience by changing the colourscheme outweigh any advantages of selecting stronger colours is my thinking. Great to hear there are no bugs, and great to hear it's nippy!

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 18, 2019

One question @AnnaGoodman1 have the downloadable geojsons been updated? Looking here: https://github.com/npct/pct-outputs-regional-notR

@RachelAldred

This comment has been minimized.

Copy link
Contributor

commented Apr 18, 2019

Looked at London. The image rnet looks generally good to me - I like the way it gives a picture of where the various little feeder routes are coming from, by contrast the clickable route network looks quite disconnected in NE Outer London even at 90% (as you are not seeing all those joining streets that merge to make the higher numbers). Super exciting to have the downloadable LSOA vector network of course so people can visualise more if they want :)
It's sometimes hard to interpret the colours on the image - something I assumed was 1000-1999 in the image layer turned out to be 540 in clickable. But I think that's probably inevitable result of different monitors etc. London is a bit slow as expected with the amount of data but only one actual crash. I think it is ok.
We need to decide when the two blogs go up and I need to tweak the near market one and put on wordpress (thanks for comments).

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 18, 2019

@Robinlovelace , geojsons in progress - about 50% done after 30 hours. plus some national ones to do. hopefully all done by end of lond weekend

@mem48

This comment has been minimized.

Copy link
Contributor

commented Apr 18, 2019

Perhaps we should colour code the clickable route network to match the image, might make comparison easier?

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 22, 2019

all new files now uploaded to git from me

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 24, 2019

Perhaps we should colour code the clickable route network to match the image, might make comparison easier?

That's possible, happy to take a look how it looks, after the new data has been deployed though. Pull requests implementing this welcome for review/testing.

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 24, 2019

Now that the front page map has been fixed (see image below), great work @nikolai-b and the new files have all been uploaded I think this is ready to go. I suggest we test over the next few days and raise any further issues here, then merge master into production on Friday, test over the weekend and publicise the new data next week.

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 24, 2019

image

No lines on Firefox, finally!

@AnnaGoodman1

This comment has been minimized.

Copy link
Contributor

commented Apr 29, 2019

looking good on live server. @Robinlovelace , do you want to publish your blog and link to it from the whats new sextion of front page?

@Robinlovelace

This comment has been minimized.

Copy link
Member Author

commented Apr 29, 2019

Yes will do, thanks @AnnaGoodman1 . The permalink is https://blogpctbike.wordpress.com/?p=360 - can add that but happy for you to do so also via PR. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.