Significant time delays in get_historical_data() #80

cianryan09 · 2023-11-23T14:06:27Z

Hi,

I am having significant speed issues when running the get_historical_data() function.
The Toolkit works fine, but running the above even for three stocks takes ~135 seconds. Running for 100 takes 200 seconds. If I run it for a large number of stocks (say the 9500 or so large, mid and small cap stocks in the financedatabase module), the code always crashes and I get a long list of exception errors in multiple threads.

I ran cProfile, and no specific line of code seems to cause the backlog (tottime all < 0.001).
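A note on reading a profile like this: `tottime` only counts time spent in a function's own body, so a function that mostly waits (on a network response or a sleep) can show a tiny `tottime` while its `cumtime` is large. As far as I know, cProfile also only records the thread it is enabled in, so work done in worker threads may not appear at all. A minimal sketch of the first point, where `wait_like_network` is a hypothetical stand-in and not part of the Toolkit:

```python
import cProfile
import pstats
import time


def wait_like_network():
    """Hypothetical stand-in for a call that mostly waits (e.g. on an API)."""
    time.sleep(0.05)


def profile_times(func):
    """Profile func once and return (tottime, cumtime) for func itself."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stats = pstats.Stats(profiler).stats
    # Keys are (filename, line number, function name);
    # values are (primitive calls, total calls, tottime, cumtime, callers).
    for (_, _, name), (_, _, tottime, cumtime, _) in stats.items():
        if name == func.__name__:
            return tottime, cumtime
    raise LookupError(f"{func.__name__} not found in profile")


tottime, cumtime = profile_times(wait_like_network)
print(f"tottime={tottime:.4f}s cumtime={cumtime:.4f}s")
```

Sorting the report by cumulative time (for example `pstats.Stats(profiler).sort_stats("cumulative")`) is usually the quicker way to spot where the waiting happens.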

Does anyone have any idea what could be causing this, and how to stop the errors from occurring when running for large numbers of stocks? Code snippet below:

`from financetoolkit import Toolkit

companies = Toolkit(
    tickers=ticker_list,
    start_date=date,
    api_key="xxxxxx",
)

hist_data = companies.get_historical_data()`

JerBouma · 2023-11-23T14:56:56Z

What FinancialModelingPrep plan are you on? I've built in a wait timer for when you reach your rate limit per minute. E.g. the Starter plan has a limit of 250 per minute. Most likely you've hit that, and then when you try to run for a few companies straight after, it waits again.

The downside of collecting data from FMP for historical data is that I do two API calls per company given that I obtain both the market data and dividends. I can build in an argument for that as well to exclude dividends if desired.

To overcome this, you'd have to upgrade your plan sadly. If this doesn't seem to be the issue please let me know and provide all the errors you get!

By the way, I am aware Yahoo Finance also has the historical data but you will get rate limited just as quickly with them which requires a much longer wait time once you do.

PS: I can make this more obvious by providing a print statement. Does that make sense to you?

EDIT: For thousands of companies, you are stretching the limit of my package. I suggest dividing them up into groups, but do provide me with the errors you get!
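A minimal sketch of the "divide it up into groups" idea, using plain Python only. The helper name `chunked`, the example tickers and the batch size are all assumptions for illustration, not something the package provides:

```python
def chunked(items, size):
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical ticker list; in practice this would come from your own source.
ticker_list = ["AAPL", "MSFT", "TSLA", "GOOGL", "AMZN"]

batches = list(chunked(ticker_list, 2))
print(batches)
```

Each batch could then be passed as `tickers=` to its own `Toolkit(...)` and the results combined afterwards. A sensible batch size depends on your plan's rate limit.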

JerBouma · 2023-12-04T16:53:44Z

Hi @cianryan09, this issue should now be resolved with the release of v1.6.3, which will automatically disable the wait timer that you are having issues with in this case. See: https://github.com/JerBouma/FinanceToolkit/releases/tag/v1.6.3

Please let me know if this doesn't solve the issue.

cianryan09 · 2023-12-04T19:43:11Z

Thank you Jeroen for the reply and apologies for the delay in getting back to your original response. I am currently on the free plan just to see what it's like. Does 1 stock = 1 call to the API or do all stocks in the 'tickers' variable count as one call? E.g. if I have 250 tickers in the ticker list and run Toolkit, does that exhaust my calls for the day?

I also updated to 1.6.3 and it did not seem to result in any speed improvements but that may just be purely due to the call limits described above.

JerBouma · 2023-12-04T19:49:34Z

Let's say you input TSLA, AAPL and MSFT into the Toolkit. Every time you call a function that collects data, it will cost about 3 API calls. That applies to balance sheets, income statements, cash flow statements, historical data and more. So for example, if you call Balance + Income + Cash Flow, that's 9 API calls.
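The arithmetic above can be sketched as a rough estimator. This is only an approximation based on what is said in this thread: roughly one call per ticker per data function, and two calls per ticker for historical data (market data plus dividends, as mentioned earlier). The function name is hypothetical and the factors may not hold for every version:

```python
def estimate_api_calls(n_tickers, n_statement_functions=0, historical=False):
    """Rough estimate of API calls, using the per-ticker costs from this thread."""
    # Assumption: one call per ticker for each statement-style function.
    calls = n_tickers * n_statement_functions
    if historical:
        # Assumption: historical data costs two calls per ticker
        # (market data + dividends).
        calls += 2 * n_tickers
    return calls


# Balance + Income + Cash Flow for TSLA, AAPL and MSFT.
print(estimate_api_calls(3, n_statement_functions=3))  # 9

# Historical data for 250 tickers.
print(estimate_api_calls(250, historical=True))  # 500
```

Comparing the estimate against your plan's limit before running gives a quick idea of whether a large ticker list will fit.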

I don't fully understand how it takes so long for you to collect data, as it should be almost instant. For example, I am using a Free key here and it takes less than 3 seconds. Could you elaborate further?

cianryan09 · 2023-12-04T20:19:08Z

Good to know, thanks. This is the code I have run and I timed it:

(The date is just today's date five years ago.)
So the same three stocks are taking me nearly 9 seconds.

Is it possibly to do with how my environment is set up?
I set up a new conda environment with Python 3.10 and pip installed financetoolkit. I am running the above as a script from a file saved locally in VS Code. Not really a coding expert, so excuse me if the question is basic or off-topic.

EDIT: it also seems the API limit on the free plan is 250 / DAY and not per minute, which probably explains the crashes that happen when the ticker list gets into the hundreds.

JerBouma · 2023-12-04T22:17:19Z

Does it change anything if you define the start_time right before the hist_data? I can't imagine it being that long. The environment you are using is fine, it shouldn't be an issue there.
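To time only the data call and nothing else, something along these lines could work. It is a sketch: `timed` and `fake_fetch` are hypothetical names, and `fake_fetch` merely stands in for `companies.get_historical_data()` so the example runs without an API key:

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_fetch():
    """Hypothetical stand-in for companies.get_historical_data()."""
    time.sleep(0.02)
    return "data"


result, elapsed = timed(fake_fetch)
print(f"fetch took {elapsed:.3f}s")
```

With a real Toolkit instance, the call would be `timed(companies.get_historical_data)`, which keeps the import and setup time out of the measurement.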

The API limit should not be an issue as I've made sure that once you hit the limit it will just tell you no data could be collected instead of giving you errors or letting you wait.

cianryan09 · 2023-12-04T22:57:26Z

I tried the above but it made almost no difference. I also upgraded to one of the paid plans and still saw no speed improvement. And it's nothing else in my script causing the slowdown: if I comment out the .get_historical_data() line, it drops to 2 seconds.
It's hardly a hardware issue? Maybe all the threading uses a lot of memory/CPU that my older equipment can't handle?

JerBouma · 2023-12-05T10:07:16Z

Hi! I am expecting this to be a hardware or networking issue. What you can try is using Google Colab, which already cuts it down to 5 seconds. For other components this can be as little as 2-3 seconds in Google Colab. If it doesn't for you, then it's 100% a network issue.

cianryan09 · 2023-12-18T23:43:30Z

Hi - using Colab does cut the time down by a lot. 1000 tickers takes around 2 minutes. Thank you for the suggestion!

JerBouma closed this as completed Dec 4, 2023
