Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ways to combine categorical rasters [Feature Request] #663

Closed
ailich opened this issue Jun 8, 2022 · 7 comments
Closed

Ways to combine categorical rasters [Feature Request] #663

ailich opened this issue Jun 8, 2022 · 7 comments

Comments

@ailich
Copy link
Contributor

ailich commented Jun 8, 2022

I think having some built in ways to combine multiple categorical maps into a single thematic map could be really useful. Possibly by having the + operator behave differently if the raster is detected to be a factor or by having a dedicated function. I've included an example below showing how this can be done with data frames, how it can currently be done in terra (if you don't have more than 10 classes; though maybe there's a better way I'm not aware of), as well as some proposed syntax. The example I have is just combining two variables, but could be extended out to any number of factor variables.

library(dplyr)
library(forcats)

landcover<- factor(c("forest", "wetland", "lake", "developed", "agriculture"))
geomorph<- factor(c("sloping", "flat", "depression", "peak"))

set.seed(5)
df<- data.frame(landcover= sample(landcover, size = 10, replace = TRUE), geomorph= sample(geomorph, size = 10, replace = TRUE))

df<- df %>% mutate(combined= fct_cross(landcover, geomorph, sep = "_", keep_empty = TRUE))
levels(df$combined)
# [1] "agriculture_depression" "developed_depression"   "forest_depression"      "lake_depression"        "wetland_depression"     "agriculture_flat"      
# [7] "developed_flat"         "forest_flat"            "lake_flat"              "wetland_flat"           "agriculture_peak"       "developed_peak"        
# [13] "forest_peak"            "lake_peak"              "wetland_peak"           "agriculture_sloping"    "developed_sloping"      "forest_sloping"        
# [19] "lake_sloping"           "wetland_sloping"       

df
# landcover   geomorph           combined
# 1      wetland depression wetland_depression
# 2         lake    sloping       lake_sloping
# 3       forest       peak        forest_peak
# 4         lake       peak          lake_peak
# 5       forest       flat        forest_flat
# 6       forest    sloping     forest_sloping
# 7  agriculture       peak   agriculture_peak
# 8         lake depression    lake_depression
# 9         lake    sloping       lake_sloping
# 10     wetland       flat       wetland_flat

# The current way in terra
library(terra)
landcover_rast<- rast(nrow=10, ncol=10)
set.seed(5)
values(landcover_rast)<- sample(seq(from =0, to=40, by=10), size = 100, replace = TRUE) # represent factor levels shown above in that order

geomorph_rast<- rast(nrow=10, ncol=10)
set.seed(5)
values(geomorph_rast)<- sample(0:3, size = 100, replace = TRUE) # represent factor levels shown above in that order

combined_rast<- landcover_rast + geomorph_rast
all_levels_df<- data.frame(ID = c(0, 1, 2, 3,
                                  10, 11, 12, 13,
                                  20, 21, 22, 23,
                                  30, 31, 32, 33,
                                  40, 41, 42, 43),
                  combined = unlist(lapply(landcover, FUN = paste, geomorph, sep="_")))
all_levels_df
# ID               combined
# 0         forest_sloping
# 1            forest_flat
# 2      forest_depression
# 3            forest_peak
# 10        wetland_sloping
# 11           wetland_flat
# 12     wetland_depression
# 13           wetland_peak
# 20           lake_sloping
# 21              lake_flat
# 22        lake_depression
# 23              lake_peak
# 30      developed_sloping
# 31         developed_flat
# 32   developed_depression
# 33         developed_peak
# 40    agriculture_sloping
# 41       agriculture_flat
# 42 agriculture_depression
# 43       agriculture_peak

combined_rast<- as.factor(combined_rast)
levels(combined_rast)<- all_levels_df

# Proposed terra syntax
landcover_rast<- rast(nrow=10, ncol=10)
set.seed(5)
values(landcover_rast)<- sample(0:4, size = 100, replace = TRUE)
landcover_rast<- as.factor(landcover_rast)
levels(landcover_rast)<- data.frame(ID= 0:4, landcover = factor(c("forest", "wetland", "lake", "developed", "agriculture")))

geomorph_rast<- rast(nrow=10, ncol=10)
set.seed(5)
values(geomorph_rast)<- sample(0:3, size = 100, replace = TRUE)
levels(geomorph_rast)<- data.frame(ID= 0:3, geomorph = factor(c("sloping", "flat", "depression", "peak")))

combined_rast<- landcover_rast + geomorph_rast # Have + operator automatically detect that these layers are factors and instead of adding the literal values use it to combine factor levels (including empty levels).
combined_rast<- combine_cats(landcover_rast, geomorph_rast,  keep_empty = TRUE) # Or instead of the plus operator have a function
# These functions could also potentially be applied across a raster stack
combined_rast<- sum(c(landcover_rast, geomorph_rast))
combined_rast<- app(c(landcover_rast, geomorph_rast), fun= sum)
combined_rast<- combine_cats(c(landcover_rast, geomorph_rast),  keep_empty = TRUE)
rhijmans added a commit that referenced this issue Jun 26, 2022
@rhijmans
Copy link
Member

rhijmans commented Jun 26, 2022

If I understand you well, I do not think that +, sum etc. are clear, because that does not specify what you would do with the values from the different rasters. Instead, we can do this with cover (now implemented) and merge (not yet implemented). But let me know if I misunderstood you

library(terra)
set.seed(0)
r <- rast(nrows=10, ncols=10)
values(r) <- sample(3, ncell(r), replace=TRUE)
levels(r) <- data.frame(id=1:3, cover=c("forest", "water", "urban"))
r[1:5, ] <- NA

rr <- rast(r)
values(rr) <- sample(c(2,5,6), ncell(rr), replace=TRUE)
levels(rr) <- data.frame(id=c(2,5:6), cover=c("water", "green", "blue"))

a <- cover(r, rr)
levels(a)
levels(a)[[1]]
#  id  cover
#1  1 forest
#2  2  water
#3  3  urban
#4  5  green
#5  6   blue


plot(a)
text(a)

But it fails if there are conflicting levels (1 is "forest" in r and "red" in rr)

rr <- rast(r)
values(rr) <- sample(c(1,5,6), ncell(rr), replace=TRUE)
levels(rr) <- data.frame(id=c(1,5:6), cover=c("red", "green", "blue"))
b <- cover(r, rr)
#Warning message:
#[cover] cannot merge categories of layer 1
levels(b)[[1]]
#[1] ""
`
``

@rhijmans
Copy link
Member

I now see that you are asking for something else. You would like to combine the (active) categories, creating new categories. I will look at that too.

rhijmans added a commit that referenced this issue Jun 27, 2022
@rhijmans
Copy link
Member

rhijmans commented Jun 27, 2022

Now I can do:


library(terra)
set.seed(0)
r <- rast(nrows=10, ncols=10)
values(r) <- sample(3, ncell(r), replace=TRUE)
levels(r) <- data.frame(id=1:3, cover=c("forest", "water", "urban"))

rr <- rast(r)
values(rr) <- sample(1:3, ncell(rr), replace=TRUE)
levels(rr) <- data.frame(id=c(1:3), color=c("red", "green", "blue"))

x <- concats(r, rr)
x
#class       : SpatRaster 
#dimensions  : 10, 10, 1  (nrow, ncol, nlyr)
#resolution  : 36, 18  (x, y)
#extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 
#source      : memory 
#categories  : cover_color, idx, idy 
#name        : cover_color 
#min value   :  forest_red 
#max value   :  urban_blue 
 
levels(x)[[1]]
#  ID  cover_color
#1  0   forest_red
#2  1 forest_green
#3  2  forest_blue
#4  3    water_red
#5  4  water_green
#6  5   water_blue
#7  6    urban_red
#8  7  urban_green
#9  8   urban_blue

And note that with numeric values, you can do

u <- unique(c(r,rr)*1, as.raster=TRUE)
levels(u)[[1]]
#  ID label
#1  0   1_1
#2  1   1_2
#3  2   1_3
#4  3   2_1
#5  4   2_2
#6  5   2_3
#7  6   3_1
#8  7   3_2
#9  8   3_3

@ailich
Copy link
Contributor Author

ailich commented Jun 27, 2022

@rhijmans, awesome! And thanks for the tip with unique too!

@ailich ailich closed this as completed Jun 27, 2022
@Jackson2601
Copy link

If I understand you well, I do not think that +, sum etc. are clear, because that does not specify what you would do with the values from the different rasters. Instead, we can do this with cover (now implemented) and merge (not yet implemented). But let me know if I misunderstood you

library(terra)
set.seed(0)
r <- rast(nrows=10, ncols=10)
values(r) <- sample(3, ncell(r), replace=TRUE)
levels(r) <- data.frame(id=1:3, cover=c("forest", "water", "urban"))
r[1:5, ] <- NA

rr <- rast(r)
values(rr) <- sample(c(2,5,6), ncell(rr), replace=TRUE)
levels(rr) <- data.frame(id=c(2,5:6), cover=c("water", "green", "blue"))

a <- cover(r, rr)
levels(a)
levels(a)[[1]]
#  id  cover
#1  1 forest
#2  2  water
#3  3  urban
#4  5  green
#5  6   blue


plot(a)
text(a)

But it fails if there are conflicting levels (1 is "forest" in r and "red" in rr)

rr <- rast(r)
values(rr) <- sample(c(1,5,6), ncell(rr), replace=TRUE)
levels(rr) <- data.frame(id=c(1,5:6), cover=c("red", "green", "blue"))
b <- cover(r, rr)
#Warning message:
#[cover] cannot merge categories of layer 1
levels(b)[[1]]
#[1] ""
`
``

Hi there, i am currently trying to achieve this myself, and i get the same warning message since i have conflicting categories. Is there a way to work around this? Essentially, i have two rasters, with r1 providing two additional levels of a factor that i would like to add to r2. However, other levels in r1 conflict with those in r2. is there some way i can use the cover function but retain the category labels?

@ailich
Copy link
Contributor Author

ailich commented Jul 18, 2023

@Jackson2601 you should be able to adapt the code below. If there are conflicting ID's between levels you can reclassify the values and then add the factor labels back in.

library(terra)
#> Warning: package 'terra' was built under R version 4.3.1
#> terra 1.7.39
set.seed(0)
r <- rast(nrows=10, ncols=10)
values(r) <- sample(3, ncell(r), replace=TRUE)
levels(r) <- data.frame(id=1:3, cover=c("forest", "water", "urban"))
r[1:5, ] <- NA

rr <- rast(r)
values(rr) <- sample(1:2, ncell(rr), replace=TRUE)
levels(rr) <- data.frame(id=1:2, cover=c("green", "blue"))

plot(c(r,rr), main=c("r", "rr"))

#Loss of factor levels since 1 corresponds to forest and green amd 2 corresponds to both water and blue
a <- cover(r, rr)
#> Warning: [cover] cannot merge categories of layer 1
levels(a)
#> [[1]]
#> [1] ""
rm(a)

# We need to redefine factor ID's
rcl<- matrix(data = c(1,2,4,5), ncol = 2)
rcl #reclassification matrix says to change 1 to 4 and 2 to 5
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    5
rr<- classify(rr, rcl) # Change 1 to 4 and 2 to 5
levels(rr) <- data.frame(id=4:5, cover=c("green", "blue")) #remap green and blue to new values
# Green is now 4 and blue is 5
levels(rr)
#> [[1]]
#>   id cover
#> 1  4 green
#> 2  5  blue

a <- cover(r, rr)
levels(a)
#> [[1]]
#>   id  cover
#> 1  1 forest
#> 2  2  water
#> 3  3  urban
#> 4  4  green
#> 5  5   blue

plot(a)
text(a)

Created on 2023-07-18 with reprex v2.0.2

@Jackson2601
Copy link

@Jackson2601 you should be able to adapt the code below. If there are conflicting ID's between levels you can reclassify the values and then add the factor labels back in.

library(terra)
#> Warning: package 'terra' was built under R version 4.3.1
#> terra 1.7.39
set.seed(0)
r <- rast(nrows=10, ncols=10)
values(r) <- sample(3, ncell(r), replace=TRUE)
levels(r) <- data.frame(id=1:3, cover=c("forest", "water", "urban"))
r[1:5, ] <- NA

rr <- rast(r)
values(rr) <- sample(1:2, ncell(rr), replace=TRUE)
levels(rr) <- data.frame(id=1:2, cover=c("green", "blue"))

plot(c(r,rr), main=c("r", "rr"))

#Loss of factor levels since 1 corresponds to forest and green amd 2 corresponds to both water and blue
a <- cover(r, rr)
#> Warning: [cover] cannot merge categories of layer 1
levels(a)
#> [[1]]
#> [1] ""
rm(a)

# We need to redefine factor ID's
rcl<- matrix(data = c(1,2,4,5), ncol = 2)
rcl #reclassification matrix says to change 1 to 4 and 2 to 5
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    5
rr<- classify(rr, rcl) # Change 1 to 4 and 2 to 5
levels(rr) <- data.frame(id=4:5, cover=c("green", "blue")) #remap green and blue to new values
# Green is now 4 and blue is 5
levels(rr)
#> [[1]]
#>   id cover
#> 1  4 green
#> 2  5  blue

a <- cover(r, rr)
levels(a)
#> [[1]]
#>   id  cover
#> 1  1 forest
#> 2  2  water
#> 3  3  urban
#> 4  4  green
#> 5  5   blue

plot(a)
text(a)

Created on 2023-07-18 with reprex v2.0.2

You've no idea how much i appreciate you taking the time to write this. Thanks so much, you're a life saver!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants