
Review of ch12 #782

Merged: 8 commits merged on Apr 24, 2022

Conversation

@Nowosad (Member) commented Apr 21, 2022:

Hi @jannes-m -- I read the whole chapter (before the most recent changes) and learned a lot!

I also added a number of rather small changes and have a few comments:

  1. I would suggest cleaning up most of the commented-out lines in the chapter.
  2. Line 145 -- contains a link to DataCamp materials. Given the problematic history of the company, I suggest replacing it.
  3. Line 463 -- You say "...as expected...". I think it would not be expected by many people unfamiliar with this topic. I would suggest adding a sentence or two explaining why this is expected.
  4. Line 631 -- I would suggest adding an example runtime (e.g., "on a modern laptop it took XX seconds").

Merge remote-tracking branch 'origin/main' into rev12

# Conflicts:
#	12-spatial-cv.Rmd
@jannes-m (Collaborator):

Jakub, thanks a lot for taking the time to review and improve the chapter, very much appreciated! I like your comments, and will address them in this PR - hopefully within the next week.

@Robinlovelace (Collaborator):

On a different but related (see CI checks) note, any idea what is causing this message?

Quitting from lines 403-408 (15-eco.Rmd) 
Error in sc[, 1] : incorrect number of dimensions
Calls: local ... eval_with_user_handlers -> eval -> eval -> data.frame
In addition: Warning message:
In `$.crs`(attr(geom, "crs"), "wkt") :
  CRS uses proj4string, which is deprecated.

Merge branch 'main' into rev12

# Conflicts:
#	12-spatial-cv.Rmd
Merge branch 'rev12' of github.com:Robinlovelace/geocompr into rev12

# Conflicts:
#	12-spatial-cv.Rmd
@jannes-m (Collaborator):

On a different but related (see CI checks) note, any idea what is causing this message?

Quitting from lines 403-408 (15-eco.Rmd) 
Error in sc[, 1] : incorrect number of dimensions
Calls: local ... eval_with_user_handlers -> eval -> eval -> data.frame
In addition: Warning message:
In `$.crs`(attr(geom, "crs"), "wkt") :
  CRS uses proj4string, which is deprecated.

I have just merged the main branch into this PR branch, and I guess this should resolve the build issue.

@jannes-m (Collaborator):

Mmh, Jakub had already merged the main branch and I should have pulled that earlier. In any case, I can build chapter 15 locally, so I am not quite sure what the problem is...

@Robinlovelace (Collaborator):

No problem, we can merge this and fix any issues later if there are any.

@jannes-m (Collaborator):

I couldn't really solve the problem and have no idea why `sc[, 1]` did not work in the build process. But at least I could work around it by saving the response-predictor matrix as an .rds file and reading it back in again.
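For reference, the workaround boils down to something like the following sketch (the object name and file path here are illustrative, not necessarily the ones used in the chapter):

# save the response-predictor matrix once, outside the problematic chunk ...
saveRDS(rp, "extdata/15-rp.rds")
# ... and read it back in during the book build instead of recomputing it
rp = readRDS("extdata/15-rp.rds")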

@Robinlovelace (Collaborator) left a review:

These are good changes. I suggest making any final tweaks and merging, or we can merge now and do post-merge fixes. Sound reasonable, @jannes-m? Thanks, the chapter is looking really good.

@@ -30,9 +30,9 @@ Required data will be attached in due course.

Statistical learning\index{statistical learning} is concerned with the use of statistical and computational models for identifying patterns in data and predicting from these patterns.
Due to its origins, statistical learning\index{statistical learning} is one of R's\index{R} great strengths (see Section \@ref(software-for-geocomputation)).^[
-Applying statistical techniques to geographic data has been an active topic of research for many decades in the fields of Geostatistics, Spatial Statistics and point pattern analysis [@diggle_modelbased_2007; @gelfand_handbook_2010; @baddeley_spatial_2015].
+Applying statistical techniques to geographic data has been an active topic of research for many decades in the fields of geostatistics, spatial statistics and point pattern analysis [@diggle_modelbased_2007; @gelfand_handbook_2010; @baddeley_spatial_2015].
Collaborator:

👍

]
-Statistical learning\index{statistical learning} combines methods from statistics\index{statistics} and machine learning\index{machine learning} and its methods can be categorized into supervised and unsupervised techniques.
+Statistical learning\index{statistical learning} combines methods from statistics\index{statistics} and machine learning\index{machine learning} and can be categorized into supervised and unsupervised techniques.
Collaborator:

One question: could we rename the chapter Geostatistical learning?

Collaborator:

No, I wouldn't do that; geostatistics is basically a field of its own and we are not doing geostatistics here.

@@ -79,7 +79,7 @@ data("lsl", "study_mask", package = "spDataLarge")
ta = terra::rast(system.file("raster/ta.tif", package = "spDataLarge"))
```

-This should load three objects: a `data.frame` named `lsl`, an `sf` object named `study_mask` and a `SpatRaster` (see Section \@ref(raster-classes)) named `ta` containing terrain attribute rasters.
+The above code loads three objects: a `data.frame` named `lsl`, an `sf` object named `study_mask` and a `SpatRaster` (see Section \@ref(raster-classes)) named `ta` containing terrain attribute rasters.
Collaborator:

👍
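As a quick illustration of the sentence above, one could check the classes of the loaded objects (a sketch, assuming the chunk shown in the diff has been run):

class(lsl)        # should report a plain data.frame
class(study_mask) # should report an sf object ("sf" "data.frame")
class(ta)         # should report a terra SpatRaster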

@@ -90,28 +90,26 @@ The 175 non-landslide points were sampled randomly from the study area, with the
# library(tmap)
# data("lsl", package = "spDataLarge")
# ta = terra::rast(system.file("raster/ta.tif", package = "spDataLarge"))
# lsl_sf = sf::st_as_sf(lsl, coords = c("x", "y"), crs = 32717)
# lsl_sf = sf::st_as_sf(lsl, coords = c("x", "y"), crs = "EPSG:32717")
Collaborator:

👍
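The change from the numeric EPSG code to the "EPSG:32717" string does not alter the CRS itself; both spellings resolve to the same definition, as this small check (not from the chapter) illustrates:

sf::st_crs(32717) == sf::st_crs("EPSG:32717")  # TRUE -- WGS 84 / UTM zone 17S in both cases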

- `cprof`: profile curvature (rad m^-1^) as a measure of flow acceleration, also known as downslope change in slope angle.
- `elev`: elevation (m a.s.l.) as the representation of different altitudinal zones of vegetation and precipitation in the study area.
- `log10_carea`: the decadic logarithm of the catchment area (log10 m^2^) representing the amount of water flowing towards a location.
- `slope` - slope angle (°)
Collaborator:

I prefer the previous style here, but without the full stops. So I would make it:

- `slope`: slope angle (°)

Member Author:

@Robinlovelace The new style is consistent with the style in chapters 1-8

Collaborator:

@Robinlovelace The new style is consistent with the style in chapters 1-8

Which parts, @Nowosad? I had a quick look at the output below.

rg ' - ' *.Rmd
README.Rmd
5:<!-- README.md is generated from README.Rmd. Please edit that file - rmarkdown::render('README.Rmd', output_format = 'github_document', output_file = 'README.md') -->
55:<!-- - Next issue  -->

index.Rmd
11:  - geocompr.bib
12:  - packages.bib

_15-ex.Rmd
145:  mtry = paradox::p_int(lower = 1, upper = ncol(task$data()) - 1),

14-location.Rmd
332:    iter = iter - 1

15-eco.Rmd
65:For this, we will make use of a random forest model\index{random forest} - a very popular machine learning\index{machine learning} algorithm [@breiman_random_2001].
520:<!-- (`r # ncol(rp) - 1`), -->
526:  mtry = paradox::p_int(lower = 1, upper = ncol(task$data()) - 1),
650:all(values(pred - pred_2) == 0)

13-transport.Rmd
66:# code that generated the input data - see also ?bristol_ways
576:# online figure - backup
609:    - Bonus: find two ways of arriving at the same answer.
617:    - Bonus: what proportion of trips cross the proposed routes?
618:    - Advanced: write code that would increase this proportion.
633:    - Bonus: develop a raster layer that divides the Bristol region into 100 cells (10 by 10) and provide a metric related to transport policy, such as number of people trips that pass through each cell by walking or the average speed limit of roads, from the `bristol_ways` dataset (the approach taken in Chapter \@ref(location)).

_12-ex.Rmd
22:    - Slope
23:    - Plan curvature
24:    - Profile curvature
25:    - Catchment area

11-algorithms.Rmd
193:abs(T1[1, 1] * (T1[2, 2] - T1[3, 2]) +
194:  T1[2, 1] * (T1[3, 2] - T1[1, 2]) +
195:  T1[3, 1] * (T1[1, 2] - T1[2, 2]) ) / 2
213:i = 2:(nrow(poly_mat) - 2)
222:  abs(x[1, 1] * (x[2, 2] - x[3, 2]) +
223:        x[2, 1] * (x[3, 2] - x[1, 2]) +
224:        x[3, 1] * (x[1, 2] - x[2, 2]) ) / 2
294:    x[1, 1] * (x[2, 2] - x[3, 2]) +
295:    x[2, 1] * (x[3, 2] - x[1, 2]) +
296:    x[3, 1] * (x[1, 2] - x[2, 2])
327:  i = 2:(nrow(poly_mat) - 2)
343:  i = 2:(nrow(poly_mat) - 2)
412:    - Which of the best practices covered in Section \@ref(scripts) does it follow?
413:    - Create a version of the script on your computer in an IDE\index{IDE} such as RStudio\index{RStudio} (preferably by typing-out the script line-by-line, in your own coding style and with your own comments, rather than copy-pasting --- this will help you learn how to type scripts). Using the example of a square polygon (e.g., created with `poly_mat = cbind(x = c(0, 0, 9, 9, 0), y = c(0, 9, 9, 0, 0))`) execute the script line-by-line.
414:    - What changes could be made to the script to make it more reproducible?
415:    <!-- - Answer: The script could state that it needs a an object called `poly_mat` to be present and, if none is present, create an example dataset at the outset for testing. -->
417:<!--     - Try to reproduce the results: how many significant earthquakes were there last month? -->
418:<!--     - Modify the script so that it provides a map with all earthquakes that happened in the past hour. -->
421:    - How could the documentation be improved?
422:  <!-- It could document the source of the data better - e.g. with `data from https://earthquake.usgs.gov/earthquakes/feed/v1.0/geojson.php` -->
424:    - Reproduce the results on your own computer with reference to the script `10-centroid-alg.R`, an implementation of this algorithm (bonus: type out the commands - try to avoid copy-pasting).
426:    - Are the results correct? Verify them by converting `poly_mat` into an `sfc` object (named `poly_sfc`) with `st_polygon()` (hint: this function takes objects of class `list()`) and then using `st_area()` and `st_centroid()`.
434:     - Bonus 1: Think about why the method only works for convex hulls and note changes that would need to be made to the algorithm to make it work for other types of polygon.
436:     - Bonus 2: Building on the contents of `10-centroid-alg.R`, write an algorithm\index{algorithm} only using base R functions that can find the total length of linestrings represented in matrix form.
439:    - Verify it works by running `poly_centroid_sf(sf::st_sf(sf::st_sfc(poly_sfc)))`
440:    - What error message do you get when you try to run `poly_centroid_sf(poly_mat)`?

10-gis.Rmd
223:In our case, three arguments seem important - `INPUT`, `OVERLAY`, and `OUTPUT`.
426:The U.S. Army - Construction Engineering Research Laboratory (USA-CERL) created the core of the Geographical Resources Analysis Support System (GRASS)\index{GRASS} [Table \@ref(tab:gis-comp); @neteler_open_2008] from 1982 to 1995. 
430:Here, we introduce **rgrass**\index{rgrass (package)} with one of the most interesting problems in GIScience - the traveling salesman problem\index{traveling salesman}.
434:In our case, the number of possible solutions correspond to `(25 - 1)! / 2`, i.e., the factorial of 24 divided by 2 (since we do not differentiate between forward or backward direction).
435:Even if one iteration can be done in a nanosecond, this still corresponds to `r format(factorial(25 - 1) / (2 * 10^9 * 3600 * 24 * 365))` years.
627:<!-- - We could have used GRASS's spatial database\index{spatial database} (based on SQLite) which allows faster processing.  -->
631:<!-- - We could have also accessed an already existing GRASS spatial database from within R. -->
635:<!-- - You can also start R from within a running GRASS\index{GRASS} session [for more information please refer to @bivand_applied_2013 and this [wiki](https://grasswiki.osgeo.org/wiki/R_statistics/rgrass7)]. -->
636:<!-- - Refer to the excellent [GRASS online help](https://grass.osgeo.org/grass77/manuals/) or `execGRASS("g.manual", flags = "i")` for more information on each available GRASS geoalgorithm\index{geoalgorithm}. -->
637:<!-- - If you would like to use GRASS 6 from within R, use the R package **spgrass6**. -->
651:<!-- source code (or docker) - https://github.com/jblindsay/whitebox-tools -->
734:#> Extent: (-180.000000, -89.900000) - (179.999990, 83.645130)
896:As a final note, if your data is getting too big for PostgreSQL/PostGIS and you require massive spatial data management and query performance, then the next logical step is to use large-scale geographic querying on distributed computing systems, as for example, provided by GeoMesa (http://www.geomesa.org/) or Apache Sedona [https://sedona.apache.org/; formermly known as GeoSpark - @huang_geospark_2017].
908:    - **RQGIS**, **RSAGA** and **rgrass7**
909:    - **sf**

08-read-write-plot.Rmd
97:<!-- - elevatr - https://github.com/jhollist/elevatr/issues/64 -->
101:<!-- - https://github.com/ErikKusch/KrigR -->
105:<!-- - https://github.com/ropensci/MODIStsp -->
213:<!-- rgee - see https://github.com/loreabad6/30DayMapChallenge/blob/main/scripts/day08_blue.R -->
223:<!-- potentially useful package - https://github.com/eblondel/geosapi -->
224:<!-- rstac - https://gist.github.com/h-a-graham/420434c158c139180f5eb82859099082, -->
413:<!-- - KEA - https://gdal.org/drivers/raster/kea.html -->
414:<!-- - sfarrow & geoparquet/pandas/GeoFeather -->
604:A KML file stores geographic information in XML format - a data format for the creation of web pages and the transfer of data in an application-independent way [@nolan_xml_2014].

09-mapping.Rmd
314:    rect(xleft = 0:(n - 1), ybottom = i - 1, xright = 1:n, ytop = i - 0.2,
317:  text(rep(-0.1, n_colors), (1: n_colors) - 0.6, labels = titles, xpd = TRUE, adj = 1)
344:```{r na-sb, message=FALSE, fig.cap="Map with additional elements - a north arrow and scale bar.", out.width="50%", fig.asp=1, fig.scap="Map with a north arrow and scale bar."}
481:```{r insetmap1, message=FALSE, fig.cap="Inset map providing a context - location of the central part of the Southern Alps in New Zealand.", fig.scap="Inset map providing a context."}
487:Inset map can be saved to file either by using a graphic device (see Section \@ref(visual-outputs)) or the `tmap_save()` function and its arguments - `insets_tm` and `insets_vp`.
809:  # abort old way of including - mixed content issues
1011:Additionally, it is possible to modify the `intermax` argument - maximum number of iterations for the cartogram transformation.
1080:    - Name two advantages of each based on the experience.
1081:    - Name three other mapping packages and an advantage of each.
1082:    - Bonus: create three more maps of Africa using these three packages.
1084:    - Bonus: improve the map aesthetics, for example by changing the legend title, class labels and color palette.
1089:    - Change the default colors to match your perception of the land cover categories
1090:    - Add a scale bar and north arrow and change the position of both to improve the map's aesthetic appeal
1091:    - Bonus: Add an inset map of Zion National Park's location in the context of the Utah state. (Hint: an object representing Utah can be subset from the `us_states` dataset.) 
1093:    - With one facet showing HDI and the other representing population growth (hint: using variables `HDI` and `pop_growth`, respectively)
1094:    - With a 'small multiple' per country
1097:    - Showing first the spatial distribution of HDI scores then population growth
1098:    - Showing each country in order
1100:    - With **tmap**
1101:    - With **mapview**
1102:    - With **leaflet**
1103:    - Bonus: For each approach, add a legend (if not automatically provided) and a scale bar
1105:    - In the city you live, for a couple of users per day
1106:    - In the country you live, for dozens of users per day
1107:    - Worldwide for hundreds of users per day and large data serving requirements
1109:    - Using `textInput()`
1110:    - Using `selectInput()`

_05-ex.Rmd
50:nrow(nz_height_near_cant) # 75 - 5 more
146:plot(srtm_resampl_all - srtm_resampl1, range = c(-300, 300))
147:plot(srtm_resampl_all - srtm_resampl2, range = c(-300, 300))
148:plot(srtm_resampl_all - srtm_resampl3, range = c(-300, 300))
149:plot(srtm_resampl_all - srtm_resampl4, range = c(-300, 300))
150:plot(srtm_resampl_all - srtm_resampl5, range = c(-300, 300))

05-geometry-operations.Rmd
235:To achieve that, each object is firstly shifted in a way that its center has coordinates of `0, 0` (`(nz_sfc - nz_centroid_sfc)`). 
241:nz_scale = (nz_sfc - nz_centroid_sfc) * 0.5 + nz_centroid_sfc
264:The `rotation` function accepts one argument `a` - a rotation angle in degrees.
269:nz_rotate = (nz_sfc - nz_centroid_sfc) * rotation(30) + nz_centroid_sfc
289:nz_scale_rotate = (nz_sfc - nz_centroid_sfc) * 0.25 * rotation(90) + nz_centroid_sfc
295:nz_shear = (nz_sfc - nz_centroid_sfc) * shearing(1.1, 0) + nz_centroid_sfc
776:- Nearest neighbor - assigns the value of the nearest cell of the original raster to the cell of the target one.
778:- Bilinear interpolation - assigns a weighted average of the four nearest cells from the original raster to the cell of the target one (Figure \@ref(fig:bilinear)). The fastest method for continuous rasters
779:- Cubic interpolation - uses values of 16 nearest cells of the original raster to determine the output cell value, applying third-order polynomial functions. Used for continuous rasters. It results in a more smoothed surface than the bilinear interpolation, but is also more computationally demanding
780:- Cubic spline interpolation - also uses values of 16 nearest cells of the original raster to determine the output cell value, but applies cubic splines (piecewise third-order polynomial functions) to derive the results. Used for continuous rasters
781:- Lanczos windowed sinc resampling - uses values of 36 nearest cells of the original raster to determine the output cell value. Used for continuous rasters^[More detailed explanation of this method can be found at https://gis.stackexchange.com/a/14361/20955.]
840:<!-- gdalUtils - https://cran.r-project.org/web/packages/gdalUtils/index.html - we mentioned it in geocompr 1; however it seems abandoned -->
841:<!-- gdalUtilities - https://cran.r-project.org/web/packages/gdalUtilities/index.html -->
842:<!-- also - add some reference to GDAL functions! -->
851:- `gdalinfo` - lists various information about a raster file, including its resolution, CRS, bounding box, and more
852:- `gdal_translate` - converts raster data between different file formats
853:- `gdal_rasterize` - converts vector data into raster files
854:- `gdalwarp` - allows for raster mosaicing, resampling, cropping, and reprojecting

_04-ex.Rmd
52:# Calculate n. points in each region - this contains the result
169:E7. Calculate the Normalized Difference Water Index	(NDWI; `(green - nir)/(green + nir)`) of a Landsat image. 
178:  (nir - red) / (nir + red)
184:    (green - nir) / (green + nir)
233:plot(distance_to_coast_km - distance_to_coast_km2)

04-spatial-operations.Rmd
1003:NDVI&= \frac{\text{NIR} - \text{Red}}{\text{NIR} + \text{Red}}\\
1014:The raster object has four satellite bands - blue, green, red, and near-infrared (NIR).
1019:  (nir - red) / (nir + red)
1064:```{r focal-example, echo = FALSE, fig.cap = "Input raster (left) and resulting output raster (right) due to a focal operation - finding the minimum value in 3-by-3 moving windows.", fig.scap="Illustration of a focal operation."}

03-attribute-operations.Rmd
550:Alternatively, we can use one of **dplyr** functions - `mutate()` or `transmute()`.

_02-ex.Rmd
16:# - Its geometry type?
18:# - The number of countries?
20:# - Its coordinate reference system (CRS)?
36:# - What does the `cex` argument do (see `?plot`)?
38:# - Why was `cex` set to the `sqrt(world$pop) / 10000`?
40:# - Bonus: experiment with different ways to visualize the global population.

01-introduction.Rmd
375:<!-- add info about specialized packages - sfnetworks, landscapemetrics, gdalcubes, rgee, etc. -->

02-spatial-data.Rmd
117:<!-- - [liblwgeom](https://github.com/postgis/postgis/tree/master/liblwgeom), a geometry engine used by PostGIS, via the [**lwgeom**](https://r-spatial.github.io/lwgeom/) package -->
423:# not printed - enough of these figures already (RL)
450:# Plotted - it is referenced in ch5 (st_cast)
1042:The degree of compression is often referred to as *flattening*, defined in terms of the equatorial radius ($a$) and polar radius ($b$) as follows: $f = (a - b) / a$. The terms *ellipticity* and *compression* can also be used.
1055:Both datums in Figure \@ref(fig:datum-fig) are put on top of a geoid - a model of global mean sea level.^[Please note that the geoid on the Figure exaggerates the bumpy surface of the geoid by a factor of 10,000 to highlight the irregular shape of the planet.]
1075:There are three main groups of projection types - conic, cylindrical, and planar (azimuthal).

_03-ex.Rmd
131:  mutate(pop_dens_diff_10_15 = pop_dens_15 - pop_dens_10,
136:E11. Change the columns' names in `us_states` to lowercase. (Hint: helper functions - `tolower()` and `colnames()` may help.)
144:The new object should have only two variables - `median_income_15` and `geometry`.
159:  mutate(pov_change = poverty_level_15 - poverty_level_10)
165:  mutate(pov_pct_change = pov_pct_15 - pov_pct_10)
196:r[c(1, 9, 81 - 9 + 1, 81)]

Member Author:

I based it on the example from ch5.

Collaborator:

I think the style is fine, but the dash in

- `slope` - slope angle (°)

does not seem standard. Same with

- Nearest neighbor - assigns the value of the nearest cell of the original raster to the cell of the target one. It is fast and usually suitable for categorical rasters

Also, if there are full stops (periods) in the bullet points, there should be a full stop at the end: https://www.instructionalsolutions.com/blog/bulleted-list-punctuation

Regarding colons vs dashes, I think both

- `slope` --- slope angle (°)

and

- `slope`: slope angle (°)

would be right, with the former being an 'em dash'.

Member Author:

@Robinlovelace I am fine with colons -- we just need to use them consistently.

@jannes-m (Collaborator):

These are good changes. I suggest making any final tweaks and merging, or we can merge now and do post-merge fixes. Sound reasonable, @jannes-m? Thanks, the chapter is looking really good.

Good plan!

@jannes-m (Collaborator):

Hey @Nowosad,
first of all, thanks again for reviewing and improving the chapter. I am finally addressing your comments -- better late than never.

I would suggest cleaning up most of the commented-out lines in the chapter.

I have deleted most of the comments.

Line 145 -- contains a link to DataCamp materials. Given the problematic history of the company, I suggest replacing it.

I have deleted the link.

Line 463 -- You say "...as expected...". I think it would not be expected by many people unfamiliar with this topic. I would suggest adding a sentence or two explaining why this is expected.

I have pointed to the corresponding section where we explain in detail why it is to be expected that a non-spatial CV will yield higher AUROC values than a spatial CV: ignoring spatial autocorrelation basically amounts to overfitting.
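To sketch the point (a minimal, assumed setup: the chapter's `task` and `learner` objects -- a spatial classification task and a learner with `predict_type = "prob"` -- plus the resampling identifiers provided by mlr3 and mlr3spatiotempcv):

library(mlr3)
library(mlr3spatiotempcv)
# non-spatial CV: training and test points can be close neighbours,
# so spatial autocorrelation leaks information into the test folds
rsmp_nsp = rsmp("repeated_cv", folds = 5, repeats = 100)
# spatial CV: folds are built from spatially separated groups of observations
rsmp_sp = rsmp("repeated_spcv_coords", folds = 5, repeats = 100)
rr_nsp = resample(task, learner, rsmp_nsp)
rr_sp = resample(task, learner, rsmp_sp)
rr_nsp$aggregate(mlr3::msr("classif.auc"))  # typically higher -- over-optimistic
rr_sp$aggregate(mlr3::msr("classif.auc"))   # typically lower -- more realistic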

Line 631 -- I would suggest adding an example runtime (e.g., "on a modern laptop it took XX seconds").

OK, I have added that the code can easily run for half a day on a modern laptop.

Finally, I have deleted your to-dos.

And I think you are referring to this pipe here:

# compute the AUROC as a data.table
score_spcv_glm = rr_spcv_glm$score(measure = mlr3::msr("classif.auc")) %>%
  # keep only the columns you need
  .[, .(task_id, learner_id, resampling_id, classif.auc)]

You probably have to use %>% here, since |> does not support the dot notation -- or rather it does, but only via an anonymous (lambda) function, which I think makes the syntax overly complicated (if it works at all here in combination with data.table).
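For comparison, the base-pipe version would need an anonymous function in place of the dot -- a sketch using the same objects as above, which to me looks clumsier:

score_spcv_glm = rr_spcv_glm$score(measure = mlr3::msr("classif.auc")) |>
  # anonymous function stands in for the dot of the magrittr pipe
  (\(x) x[, .(task_id, learner_id, resampling_id, classif.auc)])()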

@Nowosad (Member Author) commented Oct 21, 2022:

Thanks @jannes-m. The only thing I would suggest changing is related to your last answer. This is the only place in the book where we use %>%. Thus, I just think it would be better to remove the pipe entirely and just split the code into two lines here...
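For reference, the pipe-free version being suggested would look something like this (a sketch using the object names from the snippet above):

score_spcv_glm = rr_spcv_glm$score(measure = mlr3::msr("classif.auc"))
# keep only the columns you need (data.table syntax)
score_spcv_glm = score_spcv_glm[, .(task_id, learner_id, resampling_id, classif.auc)]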

@jannes-m (Collaborator):

OK, fair enough, will do so!
