Choropleth - stroke by postcode #3

mdsumner · 2018-10-08T20:04:22Z

(I'll make some numbered suggestions for discussion, if you agree just thumbs-up and I will modify the example accordingly, otherwise please discuss!)

How about this for the stroke_count_estimate?

It simplifies the code somewhat by putting the array of incidence constants in directly against the right age-specific column, and doesn't require an indirect use of apply, or the extra function ComputeStrokeCount.

The current code https://github.com/SymbolixAU/GeospatialStroke/blob/ChoroplethMMC/mmc_surrounds.R#L54-L77 then becomes:

#################
## --- 1) 
basicDemographicsVIC <- mutate(basicDemographicsVIC, 
                       Age_0_24_yr_P = Age_0_4_yr_P + Age_5_14_yr_P + 
                         Age_15_19_yr_P + Age_20_24_yr_P)
basicDemographicsVIC <- mutate(basicDemographicsVIC, stroke_count_estimate = ( 
                   Age_0_24_yr_P  * 5 +  
                   Age_25_34_yr_P * 30 +   
                   Age_35_44_yr_P * 44 +  
                   Age_45_54_yr_P * 111 +  
                   Age_55_64_yr_P * 299 +  
                   Age_65_74_yr_P * 747 + 
                   Age_75_84_yr_P * 1928 +  
                   Age_85ov_P     * 3976) / 100000)

The other "tidyverse" way to map those values to the right column would require a named vector of the constants, and conversion to long form - which I think is less clear. The text might say that though, "in a database context these columns would be stored in long-form, with incidence as a join-able second table" - we can move from interactive exploratory mode to production mode.

The text was updated successfully, but these errors were encountered:

mdsumner · 2018-10-08T20:08:11Z

Use read_sf rather than st_read to avoid factor creation from strings by default in https://github.com/SymbolixAU/GeospatialStroke/blob/ChoroplethMMC/mmc_surrounds.R#L44

#################
## --- 2) 
postcodeboundariesAUS <- sf::read_sf("ABSData/Boundaries/POA_2016_AUST.shp")

This avoids the downstream message from the right_join

basicDemographicsVIC <- right_join(postcodeboundariesAUS, basicDemographicsVIC, 
                                   by=c("POA_CODE" = "POA_CODE_2016"))
Warning message:
Column `POA_CODE`/`POA_CODE_2016` joining factor and character vector, coercing into character vector

mdsumner · 2018-10-08T20:11:30Z

Consider plot of DistanceToMMC at https://github.com/SymbolixAU/GeospatialStroke/blob/ChoroplethMMC/mmc_surrounds.R#L93

#################
## --- 3) 
plot(basicDemographicsVIC["DistanceToMMC"])

Note that we didn't have to worry about the map projection, st_distance is correct as per great circle assumptions in longlat data.

This is where it needs to be said that crow-distance and transport-distance are obviously different.

richardbeare · 2018-10-08T21:22:28Z

Stroke count computation looks nicer.

I think we'll need comments on projections in the main text, with reference to st_distance.

richardbeare · 2018-10-09T01:09:09Z

I think you can merge your other suggestions into the code.

We'll cover crow-distance and transport-distance and transport time in a separate example.

richardbeare · 2018-10-24T02:50:32Z

@mdsumner - a query on crs. I'm experimenting with including osm geocoding (with tmap_tools, since we're using tmap anyway) to avoid the need for keys in this example. If I do:

MMCLocation <- tmaptools::geocode_OSM("Monash Medical Centre, Clayton, Victoria, Australia", as.sf=TRUE)
MMCLocation

Simple feature collection with 1 feature and 7 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 145.1207 ymin: -37.92093 xmax: 145.1207 ymax: -37.92093
epsg (SRID):    4326
proj4string:    +proj=longlat +datum=WGS84 +no_defs
                                                query       lat      lon   lat_min   lat_max  lon_min  lon_max                   geometry
1 Monash Medical Centre, Clayton, Victoria, Australia -37.92093 145.1207 -37.92098 -37.92088 145.1207 145.1208 POINT (145.1207 -37.92093)

Whereas the census postcode data is:

postcodeboundariesAUS

Simple feature collection with 2670 features and 4 fields (with 2 geometries empty)
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 96.81694 ymin: -43.74051 xmax: 167.998 ymax: -9.142176
epsg (SRID):    4283
proj4string:    +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
# A tibble: 2,670 x 5
   POA_CODE POA_CODE16 POA_NAME  AREA_SQKM                                                                                geometry
   <chr>    <chr>      <chr>         <dbl>                                                                      <MULTIPOLYGON [°]>
 1 POA0800  0800       0800          3.17  (((130.8502 -12.45301, 130.8496 -12.45622, 130.8504 -12.45637, 130.8501 -12.45824, 1...
 2 POA0810  0810       0810         23.8   (((130.8715 -12.39532, 130.8708 -12.39539, 130.8628 -12.39544, 130.8609 -12.39645, 1...

What is the correct way of ensuring crs are consistent?

mdsumner · 2018-10-24T03:30:36Z

Strictly, the right way is to transform one to the other (you could use sf::st_crs(postcodeboundariesAUS) rather than prj, or you could enter 4283 directly):

prj <- "+proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs"
 format(unclass(sf::st_geometry(sf::st_transform(MMCLocation, prj))), digits = 15)
[1] "145.1207033, -37.9209311"
> format(unclass(sf::st_geometry(MMCLocation)),  digits = 15)
[1] "145.1207033, -37.9209311"

But, as you can see they aren't different. I don't know exactly on this stuff, but the +towgs84=<all zeros> to me means that it's equivalent to WGS84 in 2D, ( if surveyor accuracy is required we'd be using GDA94 (the datum)and a real 3D transformation grid would be in use).

I'm not quite sure what the advice should be generally, use "+proj=longlat +datum=WGS84 +no_defs" ("+init=epsg:4326", or st_crs(4326)) unless you have source data that uses this locally specific one. I think probably we should use 4283 generally, but it never actually matters in my work these days.

Take home: keep the metadata strictly for use by the sf tools, and always check if you have to apply it or modify it manually.

richardbeare · 2018-10-24T03:40:13Z

Ah, I'll use st_transform to spread good practices.

Thanks

mdsumner mentioned this issue Oct 8, 2018

API keys query #2

Closed

mdsumner added a commit that referenced this issue Oct 10, 2018

apply #3

ed08196

richardbeare closed this as completed Mar 21, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Choropleth - stroke by postcode #3

Choropleth - stroke by postcode #3

mdsumner commented Oct 8, 2018 •

edited

Loading

mdsumner commented Oct 8, 2018 •

edited

Loading

mdsumner commented Oct 8, 2018 •

edited

Loading

richardbeare commented Oct 8, 2018

richardbeare commented Oct 9, 2018

richardbeare commented Oct 24, 2018

mdsumner commented Oct 24, 2018

richardbeare commented Oct 24, 2018

Choropleth - stroke by postcode #3

Choropleth - stroke by postcode #3

Comments

mdsumner commented Oct 8, 2018 • edited Loading

mdsumner commented Oct 8, 2018 • edited Loading

mdsumner commented Oct 8, 2018 • edited Loading

richardbeare commented Oct 8, 2018

richardbeare commented Oct 9, 2018

richardbeare commented Oct 24, 2018

mdsumner commented Oct 24, 2018

richardbeare commented Oct 24, 2018

mdsumner commented Oct 8, 2018 •

edited

Loading

mdsumner commented Oct 8, 2018 •

edited

Loading

mdsumner commented Oct 8, 2018 •

edited

Loading