Description
I have been trying to split my dataset by month with split.xts(). The splits are correct in terms of endpoints, but the names of the elements in the returned list are not. In some cases, one month appears twice in a row (though with the data for the correct month). I updated xts, as well as R itself, but nothing changed. The issue does not occur with every dataset, however.
Expected behavior
In the code below, I use a full-scale (testdat_large.csv) and a reduced (testdat.csv) version of my original data. baz, from the reduced data, has 3 elements: Sep 2021, Oct 2021 and Nov 2021. So far, so good. But qux has Sep 2021, Sep 2021 and Nov 2021, even though the splits themselves are correct.
Thinking it might be a problem with my data, I used sample_matrix. The resulting foo, however, has Jan 2007, Jan 2007, Feb 2007, Mar 2007, Apr 2007, May 2007 instead of the expected Jan 2007 through Jun 2007. But again, the endpoints are correct (e.g. the second instance of Jan 2007 has 28 days, so it is indeed February).
I thought it might be due to the fact that these datasets all have multiple columns, so I tried subsetting a single column in dat.sing. But the resulting bar presents the exact same problem. So the issue seems to be that, for some reason, some list elements receive the wrong name, and subsequent elements are offset.
Minimal, reproducible example
library("xts")

# Built-in example data shipped with xts
data("sample_matrix")
dat <- as.xts(sample_matrix)
dat.sing <- dat$Open  # single-column subset

# Reduced version of my own data
dat2 <- read.delim.zoo("~/testdat.csv",
                       format = "%d.%m.%Y %H:%M",
                       tz = "CET",
                       sep = ";",
                       header = TRUE) |> as.xts()

# Full-scale version of my own data
dat3 <- read.delim.zoo("~/testdat_large.csv",
                       format = "%d.%m.%Y %H:%M",
                       tz = "CET",
                       sep = ";",
                       header = TRUE) |> as.xts()

# Split each object by month; the list names are what goes wrong
foo <- split(dat, f = "months")
bar <- split(dat.sing, f = "months")
baz <- split(dat2, f = "months")
qux <- split(dat3, f = "months")
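To make the mismatch easier to see, here is a small sketch that puts the name of each list element next to the month its data actually starts in. It only uses sample_matrix, so it should run without my CSV files. The object name check and its column names are my own choices for illustration, and the names(foo) fallback is there because I am assuming older xts versions may return an unnamed list.

```r
library("xts")

data("sample_matrix")
dat <- as.xts(sample_matrix)
foo <- split(dat, f = "months")

# Names assigned by split(); fall back to NA if the list is unnamed
nm <- names(foo)
if (is.null(nm)) nm <- rep(NA_character_, length(foo))

# Month each element really covers, taken from its first timestamp
actual <- vapply(foo, function(x) format(index(x)[1], "%Y-%m"),
                 character(1), USE.NAMES = FALSE)

# Number of observations per element, to confirm the endpoints
n_obs <- vapply(foo, nrow, integer(1), USE.NAMES = FALSE)

check <- data.frame(list_name = nm,
                    actual_month = actual,
                    n_obs = n_obs,
                    repeated_name = !is.na(nm) & duplicated(nm),
                    stringsAsFactors = FALSE)
print(check)
```

Rows where repeated_name is TRUE are the ones where a month label shows up a second time. As a stopgap, the list can be relabelled from the data itself with names(foo) <- check$actual_month, which does not depend on the names produced by split().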
Session Info
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=French_Switzerland.utf8 LC_CTYPE=French_Switzerland.utf8 LC_MONETARY=French_Switzerland.utf8
[4] LC_NUMERIC=C LC_TIME=French_Switzerland.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0 dplyr_1.1.0 purrr_1.0.1 readr_2.1.4 tidyr_1.3.0
[8] tibble_3.1.8 ggplot2_3.4.1 tidyverse_2.0.0 xts_0.13.0 zoo_1.8-11
loaded via a namespace (and not attached):
[1] rstudioapi_0.14 magrittr_2.0.3 hms_1.1.2 tidyselect_1.2.0 munsell_0.5.0 timechange_0.2.0 colorspace_2.1-0
[8] lattice_0.20-45 R6_2.5.1 rlang_1.0.6 fansi_1.0.4 tools_4.2.2 grid_4.2.2 gtable_0.3.1
[15] utf8_1.2.3 cli_3.4.1 withr_2.5.0 ellipsis_0.3.2 lifecycle_1.0.3 tzdb_0.3.0 vctrs_0.5.2
[22] glue_1.6.2 stringi_1.7.12 compiler_4.2.2 pillar_1.8.1 generics_0.1.3 scales_1.2.1 pkgconfig_2.0.3