Fetch data from the Tiingo API with a getSymbols.tiingo() method. #220

Merged
merged 3 commits into from
Apr 9, 2018

SteveBronder
Contributor

@SteveBronder SteveBronder commented Mar 10, 2018

Commit Message:

This enables package users to download historical OHLC, Adjusted OHLC,
Dividend, and Split information from the Tiingo API. Tiingo's free
offering allows access to 60K global securities for 30+ years, though with some rate limits. Tiingo
provides daily, weekly, monthly, and annual frequencies.

Details of the service can be found at the following links
https://www.tiingo.com/pricing
https://api.tiingo.com/docs/tiingo/daily

Tiingo also provides fundamentals, mutual fund information, aggregated news feeds, crypto, and a 5 minute delay quote system, though this implementation includes no provision for downloading them.

Me:

I've been searching for good free data online and Tiingo appears to be not bad! Possibly, good! A big thing for me was that they clearly lay out the methodology for adjustments and give you the data to verify the adjustment. See the bottom of this page
https://api.tiingo.com/docs/tiingo/daily

The rate limits are reasonable for good free data. If I need more, I'm fine with paying $10 a month.

There is also a websocket API which would be cool to use, though I didn't feel like learning how to use that.
https://api.tiingo.com/docs/general/connecting

I was making this for myself and thought I'd submit a PR in case you wanted it. It's essentially a copy/paste of Paul's Alpha Vantage code.

@SteveBronder
Contributor Author

So after pulling about 6K stocks from Tiingo, I can say the data quality seems pretty good! In some basic tests, only about 50 OTC penny stocks had oddities. Two stocks had negative adjusted values, though the unadjusted values looked fine. I still need to do a deeper dive, but as of now I think the quality is nice and would be good for users of quantmod.

@tiingo

tiingo commented Mar 14, 2018

Hi All,

This is Rishi - founder of Tiingo. The way our system works is that we combine 3-5 data providers for each ticker. We've done this because we've found every vendor misses splits, dividends, or gets the closing values incorrect. We also do this because some vendors have outages. We have a system that runs suites of tests on every data point to help track errors. The additional benefit is that we can keep track of which vendors are good/not-so-good and swap them out.

In the case of Steve's checks: yes, we found one vendor was giving us negative values and removed them a while ago for other reasons. Our systems were able to squash most of the negative values, but a few got through, so 19 of our 60,000 tickers had these values. We've since applied a check and removed those values, so going forward it will not be an issue.

Sometimes we feel like historians - tracking down a merger that happened 10-15 years ago and correcting for issues. And when we do manually override points, we keep an audit log with all the notes from our calculation.

The point of Tiingo is to build the data quality mechanism we always wanted when quant trading. It's taking time but because of the quant community we're rolling through :) Thanks all and thanks @SteveBronder for the PR!

@joshuaulrich
Owner

Thanks for the contribution; this is a great start! I need to get an update to CRAN within a couple weeks, and I plan to include this. I'm going to make a few changes, including making @SteveBronder the author instead of Paul.

I'm also going to add an adjust (TRUE / FALSE) argument, and return only adjusted or unadjusted OHLCV values. Returning both will cause trouble for the extractor functions (e.g., Cl(), OHLC()).

@tiingo, is there a way to request only the unadjusted values, or only the adjusted values? Adding that functionality would be useful for me, and would lower user bandwidth. It would also allow users to calculate the adjusted prices themselves, which should give them greater accuracy with lower bandwidth. Also, is there a documented example of an "over-the-limit" response? I couldn't find one on your website, and I would like to provide users an informative error if they hit a limit. Thanks!

@tiingo

tiingo commented Apr 2, 2018

Thanks all for the hard work. @joshuaulrich - would you mind if we implemented this via allowing users to pass the columns they would like? That way we could make this use case more general, and to get just unadjusted prices we could pass the following:

Unadjusted:
&columns='open','high','low','close','volume','divCash','splitFactor'

Adjusted:
&columns='adjOpen','adjHigh','adjLow','adjClose','adjVolume','divCash','splitFactor'

I'm partial to this method as it can allow a more general use case.

Let me know if this would work!

I made an API Token that you can use which will return a usage error:
d116c846835e633aacedb1a31959dd2724cd67b8

JSON: https://api.tiingo.com/tiingo/daily/spy/prices?startDate=2017-1-1&token=d116c846835e633aacedb1a31959dd2724cd67b8
CSV: https://api.tiingo.com/tiingo/daily/spy/prices?startDate=2017-1-1&format=csv&token=d116c846835e633aacedb1a31959dd2724cd67b8

Thanks Joshua

Rishi

@joshuaulrich
Owner

Thanks for the prompt reply, and the API token to generate a rate limit response! I also prefer more general functionality, so allowing the user to specify all the columns to return would be great!

@tiingo

tiingo commented Apr 2, 2018

Of course - the community building this helps us tremendously. Please keep letting us know what we can do.

The ?columns parameter has been added.

Unadjusted values:
https://api.tiingo.com/tiingo/daily/spy/prices?startDate=2017-1-1&columns=open,high,low,close,volume,divCash,splitFactor

Adjusted values
https://api.tiingo.com/tiingo/daily/spy/prices?startDate=2017-1-1&columns=adjOpen,adjHigh,adjLow,adjClose,adjVolume,divCash,splitFactor

To get the data as CSV, append &format=csv, e.g.
https://api.tiingo.com/tiingo/daily/spy/prices?startDate=2017-1-1&columns=open,high,low,close,volume,divCash,splitFactor&format=csv

Hope this helps!
Rishi
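
For reference, the request URLs in the comment above could be assembled in R roughly as follows. This is a minimal sketch, not quantmod's implementation: the helper name tiingo_prices_url and its arguments are hypothetical, and only the URL layout and column names come from the examples in this thread.

```r
# Hypothetical helper (not part of quantmod or the Tiingo docs):
# builds a daily-prices URL using the 'columns' parameter shown above.
tiingo_prices_url <- function(symbol, start_date, adjust = FALSE,
                              format = c("json", "csv"), token = NULL) {
  format <- match.arg(format)
  price_cols <- c("open", "high", "low", "close", "volume")
  if (isTRUE(adjust)) {
    # open -> adjOpen, high -> adjHigh, and so on
    price_cols <- paste0("adj", toupper(substring(price_cols, 1, 1)),
                         substring(price_cols, 2))
  }
  cols <- c(price_cols, "divCash", "splitFactor")
  url <- paste0("https://api.tiingo.com/tiingo/daily/", tolower(symbol),
                "/prices?startDate=", start_date,
                "&columns=", paste(cols, collapse = ","))
  # JSON is the default response format; CSV has to be requested explicitly
  if (format == "csv") url <- paste0(url, "&format=csv")
  # The token is appended as a query parameter, as in the earlier examples
  if (!is.null(token)) url <- paste0(url, "&token=", token)
  url
}

# Usage: unadjusted JSON and adjusted CSV requests for SPY
tiingo_prices_url("SPY", "2017-1-1")
tiingo_prices_url("SPY", "2017-1-1", adjust = TRUE, format = "csv")
```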

@SteveBronder
Contributor Author

Thanks @tiingo and @joshuaulrich !

Josh if you want to make the changes then by all means, else I can do this later in the week

R/getSymbols.R Outdated
"ts", "matrix", "timeSeries", "quantmod.OHLC"))
assertArgs(quote(periodicity), c("daily", "weekly", "monthly", "annually"))
assertArgs(quote(adjusted), c(TRUE, FALSE, "both"))
assertArgs(quote(data.type), c("json", "csv"))
Contributor Author

Josh, we can throw this out if you don't like it. I wanted some form of argument assertion, though I usually use the checkmate package. Each call above, such as the check on adjusted, raises an error if the argument value is not in the allowed set. The error looks like

adjusted must be a value of TRUE, FALSE, both

Owner

What's the benefit of this over match.arg()?

Contributor Author

I reinvented the wheel here. Good catch, will fix.

Owner

I'm actually in the middle of rebasing and cleaning up. I can take care of it.

Contributor Author

Ah, there is one edge case:

match.arg(TRUE, c(TRUE, FALSE, "both"))
# Error in match.arg(TRUE, c(TRUE, FALSE, "both")) : 
#  'arg' must be NULL or a character vector

but I think I would rather have

adjusted = match.arg(as.character(adjusted), c("TRUE", "FALSE", "both"))
# match.arg() returns a character string, so convert "TRUE"/"FALSE" back to logical
if (adjusted %in% c("TRUE", "FALSE")) {
    adjusted = as.logical(adjusted)
}

or

if (!(adjusted %in% c(TRUE, FALSE, "both"))) {
    stop("adjusted must be a value of TRUE, FALSE, or 'both'")
}

than my function. I'm going to do (1) unless you have a preference or alternative

Owner

My preference is that adjust is logical only. No support for "both" because that's not provided by any other getSymbols() functions and I don't believe it's a common use case.

I would also prefer to take care of it myself. I am amending some of your additions because they use different style/convention than the rest of the code (e.g. = for assignment, underscores in names, etc). Some of your commits include changes from prior commits in master that you didn't have because your branch is behind upstream/master. You also have merge commits, and I have a strong preference for rebasing versus including multiple merges of upstream/master into a feature branch.
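
A logical-only check along the lines described above might look like the sketch below. The function name validate_adjust is hypothetical, and this is not necessarily what was merged into quantmod; it only illustrates rejecting anything other than a single TRUE or FALSE.

```r
# Hypothetical validator (illustration only): accept a single TRUE or FALSE
# for 'adjust' and reject everything else, including "both" and NA.
validate_adjust <- function(adjust) {
  if (!is.logical(adjust) || length(adjust) != 1L || is.na(adjust)) {
    stop("'adjust' must be TRUE or FALSE", call. = FALSE)
  }
  adjust
}

validate_adjust(TRUE)                               # passes the value through
result <- try(validate_adjust("both"), silent = TRUE)
inherits(result, "try-error")                       # the character value is rejected
```

Character arguments with a fixed set of choices, such as periodicity, can still go through match.arg(), which is why the edge case discussed earlier only arises for the logical argument.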

Contributor Author

> My preference is that adjust is logical only. No support for "both" because that's not provided by any other getSymbols() functions and I don't believe it's a common use case.

Yeah, I'm cool with that.

> I would also prefer to take care of it myself.

Totes! If you need anything from me, lmk.

> I am amending some of your additions because they use different style/convention than the rest of the code (e.g. = for assignment, underscores in names, etc).

I'd rather not waste your time and would be happy to fix these things up myself.

> Some of your commits include changes from prior commits in master that you didn't have because your branch is behind upstream/master. You also have merge commits, and I have a strong preference for rebasing versus including multiple merges of upstream/master into a feature branch.

Yes, I saw that in the contributor guide. I rebased at first, but then was waiting for you to look things over before rebasing again ahead of the merge to master. I just rebased and squashed everything.

@SteveBronder
Contributor Author

SteveBronder commented Apr 6, 2018

Looking at the URL from the test's call, it looks like Google Finance is down:

http://finance.google.com/finance/historical?q=SPY&startdate=Mar+27,+2018&enddate=Apr+06,+2018&output=csv

Edit: fixed

@SteveBronder SteveBronder force-pushed the getsymbols_tiingo branch 3 times, most recently from f1865ce to e71bacb Compare April 7, 2018 21:53
@joshuaulrich
Owner

@tiingo, I wrote some tests for the rate-limit error. I receive an HTTP 429 status with the JSON API, but 200 with the CSV API. Both payloads contain the error message. Shouldn't the CSV response status be 429 also?

@tiingo

tiingo commented Apr 9, 2018

@joshuaulrich Yes, absolutely it should be a non-200 error code; however, we found that for some platforms, the data may not render in CSV format if an error response is received, so the user could be left with an error with no descriptive message. It was a difficult decision we had to make, and ultimately we decided to keep the message as a 200 response, even with error, so the user could receive some feedback on corrective action. I'm hesitant to change the behavior any time soon as I don't want to break anybody's code without sufficient notice or a LTS migration plan.

I understand the solution is not technically ideal. As a general policy, JSON is our preferred method to deliver data; many people wanted the data in CSV format, so we obliged. For some of our APIs, like the crypto one, CSV loses data because we cannot nest or relate data together. As we expand data sources, JSON will remain our preferred method because of its flexibility.

Thanks to you and Steve for working on this.

@joshuaulrich
Owner

@tiingo That's completely understandable, and I assumed something like that was happening. I wanted to verify, just in case. I also share your reservations about potentially breaking user code, so I'll figure out a work-around.

I would like to talk with you about a few ideas I have. If you're interested, you can reach me via email. My address is in the DESCRIPTION file of this repo.

@tiingo

tiingo commented Apr 9, 2018 via email

SteveBronder and others added 3 commits April 9, 2018 13:13
This enables package users to download historical OHLC, Adjusted OHLC,
Dividend, and Split information from the Tiingo API. Tiingo's free
offering allows access to 60K global securities for 30+ years. Tiingo
provides daily, weekly, monthly, and annual frequencies.

Details of the service can be found at the following links
https://www.tiingo.com/pricing
https://api.tiingo.com/docs/tiingo/daily

Tiingo also provides fundamentals and a 5 minute delay quote system,
though this implementation includes no provision for downloading them.

Tiingo added a new 'columns' argument to the API that allows the user
to request only specific columns. This allows us to pull only raw or
adjusted data.

Add 'adjust' argument to allow user to specify either raw or adjusted
data.

The Tiingo JSON API responds with a HTTP 429 status when a user goes
over the rate limit and download.file() throws an error. The CSV API
responds with a HTTP 200 (OK) status even if a user goes over the rate
limit. In both cases, the temporary file contains the error message.

Use curl_fetch_disk() to avoid any error while processing the server
response. Check that the temporary file contains a header with the
columns requested, and throw an error if it doesn't.
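
The header check described in the last commit message can be sketched in base R as shown below. This is an illustration under stated assumptions, not the merged code: check_tiingo_csv is a hypothetical name, the sketch covers only the CSV case, and the example files are written locally (the error text is a placeholder, not Tiingo's actual message) instead of being fetched with curl_fetch_disk().

```r
# Hypothetical helper (illustration only): verify that a downloaded CSV
# response starts with a header containing every requested column.
# Because the CSV API can return HTTP 200 with an error body, the file
# contents are inspected instead of relying on the status code.
check_tiingo_csv <- function(path, expected_cols) {
  header <- readLines(path, n = 1L, warn = FALSE)
  found <- if (length(header) > 0L) {
    strsplit(header, ",", fixed = TRUE)[[1]]
  } else {
    character(0)
  }
  if (!all(expected_cols %in% found)) {
    # Surface whatever the server sent so the user sees the reason
    body <- paste(readLines(path, warn = FALSE), collapse = " ")
    stop("Response does not contain the requested columns: ", body,
         call. = FALSE)
  }
  invisible(TRUE)
}

# Usage with local stand-ins for a good and a bad response
good <- tempfile(fileext = ".csv")
writeLines(c("date,open,high,low,close,volume",
             "2017-01-03,1,2,0.5,1.5,100"), good)
check_tiingo_csv(good, c("open", "high", "low", "close", "volume"))

bad <- tempfile(fileext = ".csv")
writeLines("placeholder error text, not a real Tiingo message", bad)
res <- try(check_tiingo_csv(bad, c("open", "close")), silent = TRUE)
inherits(res, "try-error")
```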
@joshuaulrich joshuaulrich merged commit 283d01b into joshuaulrich:master Apr 9, 2018
@Aks1623

Aks1623 commented Apr 28, 2018

Joshua, the adjusted close price differs depending on how I fetch it. In getSymbols(), the adjusted close is actually the adjusted open.

riingo <- riingo_prices(c("IBM"), start_date = "2018-04-21", end_date = Sys.Date(), resample_frequency = "daily")
getSymbols("IBM", src = "tiingo", from = "2018-04-21", to = Sys.Date(), adjust = TRUE)

thanks

@SteveBronder
Contributor Author

SteveBronder commented Apr 28, 2018

EDIT: Apologies I was using an older branch

Yes seems there is a bad match. I'll make an issue and PR for this. Thanks for the catch!

@PCWin12

PCWin12 commented Aug 28, 2018

Hello,

I just downloaded quantmod and registered at Tiingo. How do I get an api.key?
The default value for getFin is set to Google, and it reports an error.
Thank you

@joshuaulrich joshuaulrich added this to the Release 0.4-13 milestone Nov 24, 2018