gshift() of one column returns a vector, not list(vector) #5950

Merged
merged 2 commits into master from gshift-1
Feb 25, 2024

MichaelChirico
Member

Closes #5939

No NEWS since the NEWS item belongs in the patch branch.
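
As a hedged illustration of the behaviour this PR targets, the expression from the linked issue's title (`x[, .(shift(b)), keyby = a]`) can be checked as below. The table contents are made up for the example, and the assertion assumes a data.table build that includes this change.

```r
library(data.table)

# Hypothetical data; only the shape of the call comes from the issue title.
x <- data.table(a = c(1L, 1L, 2L, 2L), b = 1:4)

res <- x[, .(shift(b)), keyby = a]

# With the change, the shifted column is expected to be an integer vector,
# not a list column.
stopifnot(is.integer(res$V1), !is.list(res$V1))
```

On a build without the change, the same call is reported to produce a list column, which is what the issue describes.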

@@ -1268,6 +1268,7 @@ SEXP gshift(SEXP x, SEXP nArg, SEXP fillArg, SEXP typeArg) {
   copyMostAttrib(x, tmp); // needed for integer64 because without not the correct class of int64 is assigned
   }
   UNPROTECT(nprotect);
-  return(ans);
+  // consistency with plain shift(): "strip" the list in the 1-input case, for convenience
+  return isVectorAtomic(x) && length(ans) == 1 ? VECTOR_ELT(ans, 0) : ans;
Member Author

@ben-schwen do we actually need the isVectorAtomic() check in the gshift case? Is it possible to get a list() here in gshift?

Member

@ben-schwen commented Feb 22, 2024

Right now no; the only supported types are the six atomic vector types (otherwise we would error earlier, in the macro switch). AFAIR I deactivated the support for list columns because we use coerceAs for the fill argument, and coerceAs only supports atomic vectors.

Member Author

see also #4586

Member Author

I'll leave the isVectorAtomic() check; it's simpler to keep consistency with "plain" shift().
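
For readers unfamiliar with the convention being mirrored here, the following R sketch illustrates how "plain" shift() treats its return value. It assumes data.table is installed; the variable names are illustrative and not taken from the PR.

```r
library(data.table)

v <- 1:5

# Atomic input with a single shift amount: the result is a plain vector, not a list.
lagged <- shift(v, n = 1L)
stopifnot(is.atomic(lagged), !is.list(lagged))

# A list input (here with two elements) keeps the list wrapper.
both <- shift(list(v, rev(v)), n = 1L)
stopifnot(is.list(both), length(both) == 2L)

# An atomic input with several shift amounts also yields a list, one element per n.
multi <- shift(v, n = 1:2)
stopifnot(is.list(multi), length(multi) == 2L)
```

The diff applies the same rule on the grouped path: the list is unwrapped only when the input is atomic and the result has exactly one element.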

codecov bot commented Feb 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.48%. Comparing base (82f559f) to head (9e3b0be).
Report is 1 commits behind head on master.

❗ Current head 9e3b0be differs from pull request most recent head a4889f0. Consider uploading reports for the commit a4889f0 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5950   +/-   ##
=======================================
  Coverage   97.48%   97.48%           
=======================================
  Files          80       80           
  Lines       14859    14859           
=======================================
  Hits        14486    14486           
  Misses        373      373           


@renkun-ken
Member

remotes::install_github("Rdatatable/data.table#5950")
library(data.table)

dt <- expand.grid(
  date = 1:10,
  time = 1:10,
  id = 1:5
)

setDT(dt, key = c("date", "time", "id"))
dt[, x := runif(.N)]
dt[time == 1, x1 := shift(x, type = "lead"), by = id]
Key: <date, time, id>
Index: <time>
      date  time    id         x                                                              x1
     <int> <int> <int>     <num>                                                          <list>
  1:     1     1     1 0.3991771 0.1244421,0.6010204,0.7708546,0.4181234,0.6126064,0.6068231,...
  2:     1     1     2 0.1191543 0.1244421,0.6010204,0.7708546,0.4181234,0.6126064,0.6068231,...
  3:     1     1     3 0.8993754 0.1244421,0.6010204,0.7708546,0.4181234,0.6126064,0.6068231,...
  4:     1     1     4 0.1927022 0.1244421,0.6010204,0.7708546,0.4181234,0.6126064,0.6068231,...
  5:     1     1     5 0.9353783 0.1244421,0.6010204,0.7708546,0.4181234,0.6126064,0.6068231,...
 ---                                                                                            
496:    10    10     1 0.8060775                                                          [NULL]
497:    10    10     2 0.8797189                                                          [NULL]
498:    10    10     3 0.8200970                                                          [NULL]
499:    10    10     4 0.3559045                                                          [NULL]
500:    10    10     5 0.1516604                                                          [NULL]

This is still not the desired result.

@ben-schwen
Member

Apparently, gshift cannot work with subsetting, since it always creates vectors of length(x) rather than of the length of irows.
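
A smaller, deterministic variant of the reproduction above could serve as a regression check for the subsetting case. This is only a sketch: the grid size and the use of `.I` in place of `runif()` are choices made for the example, and the assertions describe the behaviour expected from a build in which the subsetting problem is fixed.

```r
library(data.table)

# Small keyed grid; CJ() sorts and keys by date, time, id.
dt <- CJ(date = 1:3, time = 1:2, id = 1:2)
dt[, x := as.numeric(.I)]  # deterministic values instead of runif()

# Same shape of call as in the report: subset in i, shift() in j, grouped by id.
dt[time == 1, x1 := shift(x, type = "lead"), by = id]

# Expected on a fixed build: x1 is a plain numeric column rather than a list,
stopifnot(is.numeric(dt$x1), !is.list(dt$x1))
# and rows outside the subset are left as NA.
stopifnot(all(is.na(dt[time != 1, x1])))
```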

@ben-schwen
Member

Sorry for hijacking the PR, but it should go into patch release and was also #5939

@MichaelChirico
Member Author

thanks; since this PR targets master let's leave the NEWS out. I'll add the NEWS to the copycat PR targeting the patch release. Or maybe we should target the patch instead 🤔

@jangorecki
Member

Whichever one we target, we just cherry pick the other one.

@renkun-ken
Member

The latest commit works for me now. Thanks!

@MichaelChirico
Member Author

> Sorry for hijacking the PR, but it should go into patch release and was also #5939

I think we actually have two distinct bugs on our hands, so I will cherry-pick your fix into a separate PR.

@MichaelChirico
Member Author

@ben-schwen / @jangorecki any final review here? running up against our CRAN deadline

@ben-schwen
Member

LGTM

@MichaelChirico merged commit e24a66a into master Feb 25, 2024
2 of 3 checks passed
@MichaelChirico deleted the gshift-1 branch February 25, 2024 22:40
Successfully merging this pull request may close these issues.

x[, .(shift(b)), keyby = a] returns list type (should be int)
4 participants