adjustOHLC bug with Yahoo! Data after URL change #160

Closed
LTFreed opened this Issue May 21, 2017 · 8 comments

Comments

Projects
None yet
5 participants
@LTFreed

LTFreed commented May 21, 2017

This seems to happen since the patch for the change in Yahoo URL. It may be a specific issue with this security, XLF, or it may be a systemic issue with processing Yahoo's dividend and split data since the URL change.

XLF had a special dividend on September 19 2016, in which it split into two ETFs, XLRE and a new XLF. Its an open question whether its economically right to link the old XLF to the new XLF time series, but, even so, this looks like an error:

getSymbols("XLF", from="2016-09-13")
head(XLF)
           XLF.Open XLF.High XLF.Low XLF.Close XLF.Volume XLF.Adjusted
2016-09-13    24.03    24.09   23.79     23.88   83230900     19.39886
2016-09-14    23.88    24.04   23.75     23.82   82876800     19.35012
2016-09-15    23.77    24.03   23.72     23.96   68626500     19.46385
2016-09-16    23.75    23.76   23.55     23.62   75235800     19.18765
2016-09-19    19.18    19.45   19.18     19.31   50396100     19.31000
2016-09-20    19.45    19.48   19.29     19.32   39903200     19.32000

You see the raw data above has a break between 09-16 and 09-19, whereas the Yahoo! calculated Adjusted Close is a smooth time series. Now look at the adjustOLHC:

head(adjustOHLC(XLF))
           XLF.Open XLF.High  XLF.Low XLF.Close XLF.Volume XLF.Adjusted
2016-09-13 15.64582 15.68488 15.48955  15.54815   83230900     19.39886
2016-09-14 15.54815 15.65233 15.46351  15.50909   82876800     19.35012
2016-09-15 15.47653 15.64582 15.44398  15.60024   68626500     19.46385
2016-09-16 15.53766 15.54420 15.40682  15.45261   75235800     19.18765
2016-09-19 19.02570 19.29352 19.02570  19.15465   50396100     19.31000
2016-09-20 19.29352 19.32328 19.13481  19.16457   39903200     19.32000

look again at the break between 09-16 and 09-19. However, this looks ok:

head(adjustOHLC(XLF, use.Adjusted = TRUE))
           XLF.Open XLF.High  XLF.Low XLF.Close XLF.Volume XLF.Adjusted
2016-09-13 19.52072 19.56946 19.32575  19.39886   83230900     19.39886
2016-09-14 19.39886 19.52884 19.29326  19.35012   82876800     19.35012
2016-09-15 19.30950 19.52072 19.26889  19.46385   68626500     19.46385
2016-09-16 19.29326 19.30138 19.13079  19.18765   75235800     19.18765
2016-09-19 19.18000 19.45000 19.18000  19.31000   50396100     19.31000
2016-09-20 19.45000 19.48000 19.29000  19.32000   39903200     19.32000

I can't tell what data the function is reading from Yahoo!, but my guess is that somehow the dividend is applied twice.

Session Info

R version 3.4.0 (2017-04-21)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] quantmod_0.4-8 TTR_0.23-1     xts_0.9-7      zoo_1.8-0     

loaded via a namespace (and not attached):
 [1] compiler_3.4.0                tools_3.4.0                   blotter_0.9.1741              curl_2.6                     
 [5] codetools_0.2-15              grid_3.4.0                    iterators_1.0.8               foreach_1.4.3                
 [9] quantstrat_0.9.1739           FinancialInstrument_1.2.0     PerformanceAnalytics_1.4.3541 lattice_0.20-35              
@javcasalc

This comment has been minimized.

Show comment
Hide comment
@javcasalc

javcasalc May 22, 2017

yes, there's something broken: check SPXL for example 2017-05-01:

> spxl[index(spxl) >= "2017-04-28" & index(spxl) <= "2017-05-01"]
           SPXL.Open SPXL.High SPXL.Low SPXL.Close SPXL.Volume SPXL.Adjusted
2017-04-28    524.48    524.48   517.84     519.04      510300        129.76
2017-05-01     32.63     32.86    32.46      32.63     1042100         32.63
> adjustOHLC(spxl, use.Adjusted = TRUE)[index(spxl) >= "2017-04-28" & index(spxl) <= "2017-05-01"]
           SPXL.Open SPXL.High SPXL.Low SPXL.Close SPXL.Volume SPXL.Adjusted
2017-04-28    131.12    131.12   129.46     129.76      510300        129.76
2017-05-01     32.63     32.86    32.46      32.63     1042100         32.63
> adjustOHLC(spxl, use.Adjusted = FALSE)[index(spxl) >= "2017-04-28" & index(spxl) <= "2017-05-01"]
           SPXL.Open SPXL.High SPXL.Low SPXL.Close SPXL.Volume SPXL.Adjusted
2017-04-28    131.12    131.12   129.46     129.76      510300        129.76
2017-05-01     32.63     32.86    32.46      32.63     1042100         32.63

javcasalc commented May 22, 2017

yes, there's something broken: check SPXL for example 2017-05-01:

> spxl[index(spxl) >= "2017-04-28" & index(spxl) <= "2017-05-01"]
           SPXL.Open SPXL.High SPXL.Low SPXL.Close SPXL.Volume SPXL.Adjusted
2017-04-28    524.48    524.48   517.84     519.04      510300        129.76
2017-05-01     32.63     32.86    32.46      32.63     1042100         32.63
> adjustOHLC(spxl, use.Adjusted = TRUE)[index(spxl) >= "2017-04-28" & index(spxl) <= "2017-05-01"]
           SPXL.Open SPXL.High SPXL.Low SPXL.Close SPXL.Volume SPXL.Adjusted
2017-04-28    131.12    131.12   129.46     129.76      510300        129.76
2017-05-01     32.63     32.86    32.46      32.63     1042100         32.63
> adjustOHLC(spxl, use.Adjusted = FALSE)[index(spxl) >= "2017-04-28" & index(spxl) <= "2017-05-01"]
           SPXL.Open SPXL.High SPXL.Low SPXL.Close SPXL.Volume SPXL.Adjusted
2017-04-28    131.12    131.12   129.46     129.76      510300        129.76
2017-05-01     32.63     32.86    32.46      32.63     1042100         32.63
@joshuaulrich

This comment has been minimized.

Show comment
Hide comment
@joshuaulrich

joshuaulrich May 22, 2017

Owner

spxl[index(spxl) >= "2017-04-28" & index(spxl) <= "2017-05-01"]

@javcasalc You should consider using xts' ISO-8601 subsetting. It's much easier to type and read. ;)

spxl["2017-04-28/2017-05-01"]
Owner

joshuaulrich commented May 22, 2017

spxl[index(spxl) >= "2017-04-28" & index(spxl) <= "2017-05-01"]

@javcasalc You should consider using xts' ISO-8601 subsetting. It's much easier to type and read. ;)

spxl["2017-04-28/2017-05-01"]
@javcasalc

This comment has been minimized.

Show comment
Hide comment
@javcasalc

javcasalc May 26, 2017

@joshuaulrich any clue on how to deal with this yahoo issue? Thanks!

@joshuaulrich any clue on how to deal with this yahoo issue? Thanks!

@joshuaulrich

This comment has been minimized.

Show comment
Hide comment
@joshuaulrich

joshuaulrich May 26, 2017

Owner

I don't think I can fix these. They look like errors in the raw data. In the case of XLF, Yahoo says it has a split and a dividend on 2016-09-16, even on their webpage. As far as I can tell, there was only a special dividend. The split and dividend data are roughly the same ratio, and adjustOHLC applies them both, so it looks like one is being applied twice.

The problem with SPXL is that the adjusted close price isn't adjusted for the 4/1 split on 2017-05-01, which is also reflected on it's webpage.

There's not much I can do when the raw data are randomly incorrect...

Owner

joshuaulrich commented May 26, 2017

I don't think I can fix these. They look like errors in the raw data. In the case of XLF, Yahoo says it has a split and a dividend on 2016-09-16, even on their webpage. As far as I can tell, there was only a special dividend. The split and dividend data are roughly the same ratio, and adjustOHLC applies them both, so it looks like one is being applied twice.

The problem with SPXL is that the adjusted close price isn't adjusted for the 4/1 split on 2017-05-01, which is also reflected on it's webpage.

There's not much I can do when the raw data are randomly incorrect...

@anthonylei

This comment has been minimized.

Show comment
Hide comment
@anthonylei

anthonylei May 29, 2017

@joshuaulrich I agree the problem occurs in raw data, but I am not sure if it happened in only random symbols. I found that the "adjusted closed" in Yahoo Finance handle split correctly, but dividend is incorrect. Just look at AAPL on May 11, 2017, and MSFT on May 16, 2017. Dividends are not reflected on the adjusted closed.

anthonylei commented May 29, 2017

@joshuaulrich I agree the problem occurs in raw data, but I am not sure if it happened in only random symbols. I found that the "adjusted closed" in Yahoo Finance handle split correctly, but dividend is incorrect. Just look at AAPL on May 11, 2017, and MSFT on May 16, 2017. Dividends are not reflected on the adjusted closed.

@nitingupta2

This comment has been minimized.

Show comment
Hide comment
@nitingupta2

nitingupta2 Jun 3, 2017

Even though adjustOHLC doesn't throw any errors, I'm not sure it's properly adjusting the data. Try downloading data for SPY, adjustOHLC, create a daily return series and compare annual returns to S&P 500 Total Returns Index. There's a big difference now. It appears dividends are no longer adjusted, or maybe the raw data is bad altogether. Yahoo has really messed up everything!

Even though adjustOHLC doesn't throw any errors, I'm not sure it's properly adjusting the data. Try downloading data for SPY, adjustOHLC, create a daily return series and compare annual returns to S&P 500 Total Returns Index. There's a big difference now. It appears dividends are no longer adjusted, or maybe the raw data is bad altogether. Yahoo has really messed up everything!

@anthonylei

This comment has been minimized.

Show comment
Hide comment
@anthonylei

anthonylei Jun 5, 2017

Is there any other dividend adjustment dataset can be used? Honestly, Yahoo Finance doesn't seem reliable for now.

Is there any other dividend adjustment dataset can be used? Honestly, Yahoo Finance doesn't seem reliable for now.

@joshuaulrich

This comment has been minimized.

Show comment
Hide comment
@joshuaulrich

joshuaulrich Mar 19, 2018

Owner

Closing, since this is a data source issue that can't be fixed with code changes.

Owner

joshuaulrich commented Mar 19, 2018

Closing, since this is a data source issue that can't be fixed with code changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment