Duplicated dates with different prices #16

Closed
ronaldocpontes opened this issue Apr 20, 2020 · 3 comments

@ronaldocpontes ronaldocpontes commented Apr 20, 2020

Prices at 2020-04-17 are being duplicated with inconsistent values.

How to reproduce:

library(BatchGetSymbols)
library(dplyr)  # provides %>% and select()

# Use half of the available cores for the parallel download
future::plan(future::multisession, workers = floor(parallel::detectCores()/2))

TICKERS = BatchGetSymbols(
  tickers=c('^GSPC','^DJI', '^IXIC'),
  first.date='2010-01-01',
  last.date=Sys.Date(),
  freq.data = 'daily',
  type.return='log',
  do.complete.data = FALSE,
  do.fill.missing.prices = FALSE,
  do.cache=TRUE,
  do.parallel=TRUE,
  cache.folder='data')

# Show every row whose (ticker, ref.date) key appears more than once
key = TICKERS$df.tickers %>% select(c(ticker, ref.date))
TICKERS$df.tickers[duplicated(key) | duplicated(key, fromLast=TRUE),]

Output:

Running BatchGetSymbols for:
   tickers =^GSPC, ^DJI, ^IXIC
   Downloading data for benchmark ticker
^GSPC | yahoo (1|1) | Found cache file
Running parallel BatchGetSymbols with 6 cores (12 available)

 Progress: ──────────────────────────────────────────────────────────────── 100%


^GSPC | yahoo (1|3) | Found cache file - Got 100% of valid prices | Youre doing good!
^DJI | yahoo (2|3) | Found cache file - Got 100% of valid prices | Got it!
^IXIC | yahoo (3|3) | Found cache file - Got 100% of valid prices | Good stuff!

> key = TICKERS$df.tickers %>% select(c(ticker, ref.date))
> TICKERS$df.tickers[duplicated(key) | duplicated(key, fromLast=TRUE),]
     price.open price.high price.low price.close     volume price.adjusted
2590    2842.43   2879.220  2830.880    2874.560 5792140000       2874.560
2591    2842.43   2879.220  2830.880    2874.560 3554592893       2874.560
5181   23817.15  24264.211 23817.150   24242.490  525950000      24242.490
5182   23817.15  24264.211 23817.150   24242.490  530277705      24242.490
7772    8667.48   8670.300  8531.690    8650.140 4335020000       8650.140
7773    8667.48   8670.304  8531.688    8650.141 3915690406       8650.141
       ref.date ticker ret.adjusted.prices ret.closing.prices
2590 2020-04-17  ^GSPC        2.644093e-02       2.644093e-02
2591 2020-04-17  ^GSPC        0.000000e+00       0.000000e+00
5181 2020-04-17   ^DJI        2.950436e-02       2.950436e-02
5182 2020-04-17   ^DJI        0.000000e+00       0.000000e+00
7772 2020-04-17  ^IXIC        1.370943e-02       1.370943e-02
7773 2020-04-17  ^IXIC        1.129461e-07       1.129461e-07
> 
Owner

@msperlin msperlin commented Apr 20, 2020

Thanks. I'll have a look and fix it.

Author

@ronaldocpontes ronaldocpontes commented Apr 20, 2020

It is probably happening in the caching layer as it started working after I removed the files.
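
For anyone hitting the same thing, the workaround described here (removing the cached files so the next run downloads fresh data) might look roughly like the sketch below. `clear_cache` is a hypothetical helper, not part of BatchGetSymbols, and it assumes the cache is a flat folder of files, which is the folder passed as `cache.folder` in the report.

```r
# Hypothetical helper: delete every file directly inside a cache folder.
clear_cache <- function(cache_dir) {
  cached_files <- list.files(cache_dir, full.names = TRUE)
  # file.remove() is meant for files, so skip any subfolders
  cached_files <- cached_files[!dir.exists(cached_files)]
  invisible(file.remove(cached_files))
}

# 'data' is the cache.folder used in the reproduction above
clear_cache("data")
```

This only works around the symptom; it does not explain why the cached and freshly downloaded rows for the same date ended up side by side.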

Owner

@msperlin msperlin commented Apr 21, 2020

I couldn't replicate the problem, but I now use unique() to make sure all rows are different. This should fix it.
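
One caveat worth illustrating: in the output reported above, the duplicated rows share `ticker` and `ref.date` but differ in `volume`, so a whole-row `unique()` would presumably keep both. A minimal sketch of de-duplicating on the key columns instead, using the ^GSPC rows from the report as toy data. Keeping the first occurrence is an assumption here, not something the thread settles, and this is not necessarily how the package fix was implemented.

```r
# Toy data copied from the ^GSPC rows in the report (subset of columns)
df <- data.frame(
  ticker      = c("^GSPC", "^GSPC"),
  ref.date    = as.Date(c("2020-04-17", "2020-04-17")),
  price.close = c(2874.56, 2874.56),
  volume      = c(5792140000, 3554592893),
  stringsAsFactors = FALSE
)

# Whole-row comparison: the two rows differ in volume
n_unique_rows <- nrow(unique(df))

# Key-based comparison: keep the first row per (ticker, ref.date)
# Assumption: the first occurrence is the one worth keeping
dedup <- df[!duplicated(df[, c("ticker", "ref.date")]), ]
n_dedup_rows <- nrow(dedup)
```

On real output the same idea applies to `TICKERS$df.tickers`, with the returns recomputed afterwards if the dropped row affected them.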

@msperlin msperlin closed this Apr 21, 2020