Intermittent issue with Yahoo datasource #60

Open
RobWalker opened this issue Jul 29, 2021 · 11 comments

@RobWalker

When I try to update prices from Yahoo, I get intermittent errors:

Traceback (most recent call last):
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\shelve.py", line 111, in __getitem__
    value = self.cache[key]
KeyError: '236efef3aaeade6596ac1f98e86215e1'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\price.py", line 508, in fetch_cached_price
    timestamp_created, result_naive = _CACHE[key]
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\shelve.py", line 113, in __getitem__
    f = BytesIO(self.dict[key.encode(self.keyencoding)])
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\dbm\dumb.py", line 147, in __getitem__
    pos, siz = self._index[key]     # may raise KeyError
KeyError: b'236efef3aaeade6596ac1f98e86215e1'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\robwa\AppData\Local\Programs\Python\Python39\Scripts\bean-price.exe\__main__.py", line 7, in <module>
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\price.py", line 854, in main
    price_entries = sorted(price_entries, key=lambda e: e.currency)
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\concurrent\futures\_base.py", line 608, in result_iterator
    yield fs.pop().result()
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\concurrent\futures\_base.py", line 438, in result
    return self.__get_result()
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\concurrent\futures\_base.py", line 390, in __get_result
    raise self._exception
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\concurrent\futures\thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\price.py", line 592, in fetch_price
    srcprice = fetch_cached_price(source, psource.symbol, dprice.date)
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\price.py", line 526, in fetch_cached_price
    source.get_historical_price(symbol, time))
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\sources\yahoo.py", line 146, in get_historical_price
    series, currency = get_price_series(ticker, time - timedelta(days=5), time)
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\sources\yahoo.py", line 102, in get_price_series
    series = [(datetime.fromtimestamp(timestamp, tz=tzone), Decimal(price))
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\sources\yahoo.py", line 102, in <listcomp>
    series = [(datetime.fromtimestamp(timestamp, tz=tzone), Decimal(price))
TypeError: conversion from NoneType to Decimal is not supported

Worse, this appears to corrupt the cache, since subsequent attempts to update fail with:

Traceback (most recent call last):
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\shelve.py", line 111, in __getitem__
    value = self.cache[key]
KeyError: 'e9bbf1545efc11c7ce26136f938dff22'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\robwa\AppData\Local\Programs\Python\Python39\Scripts\bean-price.exe\__main__.py", line 7, in <module>
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\price.py", line 854, in main
    price_entries = sorted(price_entries, key=lambda e: e.currency)
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\concurrent\futures\_base.py", line 608, in result_iterator
    yield fs.pop().result()
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\concurrent\futures\_base.py", line 438, in result
    return self.__get_result()
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\concurrent\futures\_base.py", line 390, in __get_result
    raise self._exception
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\concurrent\futures\thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\price.py", line 592, in fetch_price
    srcprice = fetch_cached_price(source, psource.symbol, dprice.date)
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\site-packages\beanprice\price.py", line 508, in fetch_cached_price
    timestamp_created, result_naive = _CACHE[key]
  File "c:\users\robwa\appdata\local\programs\python\python39\lib\shelve.py", line 114, in __getitem__
    value = Unpickler(f).load()
EOFError: Ran out of input

The only way forward is to delete the cache.
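One way to avoid deleting the whole cache might be to treat an unreadable entry as a cache miss. The sketch below is not beanprice code: `read_cache_entry` is a hypothetical helper, and it assumes the cache is a `shelve` shelf, as the tracebacks above suggest. It does not explain why the entry was truncated in the first place; it only keeps one bad entry from blocking every later run.

```python
import pickle
import shelve


def read_cache_entry(cache, key):
    """Return cache[key], or None if the entry is missing or unreadable.

    Hypothetical helper, not part of beanprice. `cache` is assumed to be a
    shelve.Shelf (or anything with the same mapping interface).
    """
    try:
        return cache[key]
    except KeyError:
        # Ordinary cache miss.
        return None
    except (EOFError, pickle.UnpicklingError):
        # The stored bytes are truncated or corrupt ("Ran out of input").
        # Drop the entry so the next run can refetch and overwrite it.
        try:
            del cache[key]
        except KeyError:
            pass
        return None
```

A caller would refetch from the source whenever this returns `None`, exactly as for a normal miss.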

Is it possible to treat this as a soft error for that data point and simply skip it?
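To make the request concrete, here is a rough sketch of what a "soft error" could look like. Everything in it is hypothetical: `fetch_or_skip` and its `fetch` argument are illustrative names, not beanprice functions, and the set of exceptions to tolerate is an assumption based on the traceback above.

```python
import logging


def fetch_or_skip(fetch, symbol, when, soft_errors=(TypeError, KeyError)):
    """Call fetch(symbol, when); log and return None on a tolerated error.

    Hypothetical wrapper for illustration only. `fetch` stands in for
    whatever callable retrieves a single price.
    """
    try:
        return fetch(symbol, when)
    except soft_errors as exc:
        # Report the failure but let the remaining data points proceed.
        logging.error("Skipping %s at %s: %s", symbol, when, exc)
        return None
```

The caller would then drop the `None` results before sorting and printing the price entries.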

@grostim

grostim commented Jan 3, 2022

Hi!

I think I have run into the same error:

INFO    : Fetching: 0P00000G9B.F (time: 2021-07-16 14:00:00+00:00)
DEBUG   : Starting new HTTPS connection (1): query1.finance.yahoo.com:443
DEBUG   : https://query1.finance.yahoo.com:443 "GET /v8/finance/chart/0P00000G9B.F?period1=1626012000&period2=1626444000&interval=1d&lang=en-US&corsDomain=finance.yahoo.com&.tsrc=finance HTTP/1.1" 200 515
INFO    : Fetching: 17393 (time: 2021-07-23 14:00:00+00:00)
Traceback (most recent call last):
ERROR   : Error fetching 17393: Import de l'historique pas encore implémenté
  File "/usr/lib/python3.7/shelve.py", line 111, in __getitem__
    value = self.cache[key]
KeyError: '5a6672d171bf18f1a0637f26c01760a5'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/lib/python3.7/site-packages/beanprice/price.py", line 508, in fetch_cached_price
    timestamp_created, result_naive = _CACHE[key]
  File "/usr/lib/python3.7/shelve.py", line 113, in __getitem__
    f = BytesIO(self.dict[key.encode(self.keyencoding)])
KeyError: b'5a6672d171bf18f1a0637f26c01760a5'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/bin/bean-price", line 33, in <module>
    sys.exit(load_entry_point('beanprice==1.2.0', 'console_scripts', 'bean-price')())
  File "/app/lib/python3.7/site-packages/beanprice/price.py", line 854, in main
    price_entries = sorted(price_entries, key=lambda e: e.currency)
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/app/lib/python3.7/site-packages/beanprice/price.py", line 592, in fetch_price
    srcprice = fetch_cached_price(source, psource.symbol, dprice.date)
  File "/app/lib/python3.7/site-packages/beanprice/price.py", line 526, in fetch_cached_price
    source.get_historical_price(symbol, time))
  File "/app/lib/python3.7/site-packages/beanprice/sources/yahoo.py", line 146, in get_historical_price
    series, currency = get_price_series(ticker, time - timedelta(days=5), time)
  File "/app/lib/python3.7/site-packages/beanprice/sources/yahoo.py", line 103, in get_price_series
    for timestamp, price in zip(timestamp_array, close_array)]
  File "/app/lib/python3.7/site-packages/beanprice/sources/yahoo.py", line 103, in <listcomp>
    for timestamp, price in zip(timestamp_array, close_array)]
TypeError: conversion from NoneType to Decimal is not supported

@mbafford
Contributor

mbafford commented Mar 1, 2022

This issue is fixed (very simple change) with #66 and #68 (same as #66, but with test case) - can we get one of those two merged into the main branch? @blais

At least one cause seems to be mutual funds queried while the market is open: the response contains a timestamp for the last date in the query range, but the values for that date are all null. There is an example JSON response in #68.
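As an illustration of the null-value case, the sketch below builds a price series while skipping entries whose close is `None`. It is modelled on the list comprehension visible in the tracebacks above, but `build_series` is a hypothetical name and this is not necessarily how #66 or #68 implement the fix.

```python
from datetime import datetime, timezone
from decimal import Decimal


def build_series(timestamps, closes, tz=timezone.utc):
    """Pair timestamps with closing prices, dropping null prices.

    Hypothetical helper: `timestamps` are Unix seconds, `closes` are the
    matching prices, where None marks a day with no value yet.
    """
    return [
        (datetime.fromtimestamp(timestamp, tz=tz), Decimal(price))
        for timestamp, price in zip(timestamps, closes)
        if price is not None  # a null close would make Decimal() raise TypeError
    ]
```

With this filter, a trailing timestamp whose values are all null simply drops out, and the latest real price in the range is used instead.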

@Nexus2k

Nexus2k commented Mar 1, 2022

Facing the same issue/crash even with the PR #66 applied:

$ bean-price --date=2022-02-28 -i prices.beancount -v >> prices.beancount
INFO    : Using price cache at "/tmp/bean-price.cache" (with indefinite expiration)
INFO    : Processing at date: 2022-02-28
INFO    : Loading "prices.beancount"
INFO    : Fetching: VT (time: 2022-02-28 15:00:00+00:00)
INFO    : Fetching: BTC-USD (time: 2022-02-28 15:00:00+00:00)
Traceback (most recent call last):
  File "/usr/lib/python3.6/shelve.py", line 111, in __getitem__
    value = self.cache[key]
KeyError: '4a6b0583d2cb1c0ba8b143f7ff4d1327'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/nexus2k/.local/lib/python3.6/site-packages/beancount/prices/price.py", line 359, in fetch_cached_price
    timestamp_created, result_naive = _CACHE[key]
  File "/usr/lib/python3.6/shelve.py", line 113, in __getitem__
    f = BytesIO(self.dict[key.encode(self.keyencoding)])
KeyError: b'4a6b0583d2cb1c0ba8b143f7ff4d1327'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/nexus2k/.local/bin/bean-price", line 11, in <module>
    sys.exit(main())
  File "/home/nexus2k/.local/lib/python3.6/site-packages/beancount/prices/price.py", line 650, in main
    price_entries = sorted(price_entries, key=lambda e: e.currency)
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/nexus2k/.local/lib/python3.6/site-packages/beancount/prices/price.py", line 439, in fetch_price
    srcprice = fetch_cached_price(source, psource.symbol, dprice.date)
  File "/home/nexus2k/.local/lib/python3.6/site-packages/beancount/prices/price.py", line 377, in fetch_cached_price
    source.get_historical_price(symbol, time))
  File "/home/nexus2k/.local/lib/python3.6/site-packages/beancount/prices/sources/yahoo.py", line 124, in get_historical_price
    timestamp_array = result['timestamp']
KeyError: 'timestamp'

Whoa, I guess Yahoo is not that reliable for cryptos: https://finance.yahoo.com/quote/DOT-USD/history?p=DOT-USD

@mbafford
Contributor

mbafford commented Mar 1, 2022

@Nexus2k Can you reproduce the error, and post the JSON the API is returning for you when it exhibits the issue? It seems to work for me right now:

INFO    : Processing at date: 2022-02-28
DEBUG   : Starting new HTTPS connection (1): query1.finance.yahoo.com:443
DEBUG   : https://query1.finance.yahoo.com:443 "GET /v8/finance/chart/BTC-USD?period1=1645650000&period2=1646082000&interval=1d&lang=en-US&corsDomain=finance.yahoo.com&.tsrc=finance HTTP/1.1" 200 1366
2022-02-27 price BTC-USD                      43193.234375 USD

The JSON has no null values in the prices for me, and has the "timestamp" field your error is complaining about.

{"chart":{"result":[{"meta":{"currency":"USD","symbol":"BTC-USD","exchangeName":"CCC","instrumentType":"CRYPTOCURRENCY","firstTradeDate":1410912000,"regularMarketTime":1646167440,"gmtoffset":0,"timezone":"UTC","exchangeTimezoneName":"UTC","regularMarketPrice":44000.844,"chartPreviousClose":37296.57,"priceHint":2,"currentTradingPeriod":{"pre":{"timezone":"UTC","end":1646092800,"start":1646092800,"gmtoffset":0},"regular":{"timezone":"UTC","end":1646179140,"start":1646092800,"gmtoffset":0},"post":{"timezone":"UTC","end":1646179140,"start":1646179140,"gmtoffset":0}},"dataGranularity":"1d","range":"","validRanges":["1d","5d","1mo","3mo","6mo","1y","2y","5y","10y","ytd","max"]},"timestamp":[1645574400,1645660800,1645747200,1645833600,1645920000,1646006400],"indicators":{"quote":[{"close":[37296.5703125,38332.609375,39214.21875,39105.1484375,37709.78515625,43193.234375],"high":[39122.39453125,38968.83984375,39630.32421875,40005.34765625,39778.94140625,43760.45703125],"open":[38285.28125,37278.56640625,38333.74609375,39213.08203125,39098.69921875,37706.0],"volume":[21849073843,46383802093,26545599159,17467554129,23450127612,35690014104],"low":[37201.81640625,34459.21875,38111.34375,38702.53515625,37268.9765625,37518.21484375]}],"adjclose":[{"adjclose":[37296.5703125,38332.609375,39214.21875,39105.1484375,37709.78515625,43193.234375]}]}}],"error":null}}
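For anyone trying to reproduce either failure, a small check like the following can tell the two cases apart from a saved response. It relies only on the field layout visible in the JSON above (`chart.result[0].timestamp` and `indicators.quote[0].close`); `inspect_chart_payload` is a hypothetical helper, not part of beanprice.

```python
import json


def inspect_chart_payload(payload):
    """Return (has_timestamps, null_close_count) for a parsed chart response.

    Hypothetical diagnostic helper. `payload` is the dict obtained from
    json.loads() on the response body.
    """
    results = (payload.get("chart") or {}).get("result") or []
    if not results:
        return False, 0
    result = results[0]
    has_timestamps = "timestamp" in result
    quotes = (result.get("indicators") or {}).get("quote") or [{}]
    closes = quotes[0].get("close") or []
    null_closes = sum(1 for price in closes if price is None)
    return has_timestamps, null_closes


if __name__ == "__main__":
    # Tiny hand-written payload with one null close, for demonstration.
    sample = json.loads(
        '{"chart": {"result": [{"timestamp": [1, 2],'
        ' "indicators": {"quote": [{"close": [10.5, null]}]}}]}}'
    )
    print(inspect_chart_payload(sample))
```

A missing `timestamp` field corresponds to the `KeyError: 'timestamp'` crash, while a non-zero null count corresponds to the `NoneType to Decimal` crash.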

@Nexus2k

Nexus2k commented Mar 1, 2022

Check https://finance.yahoo.com/quote/DOT-USD/history?p=DOT-USD, i.e. "USD:yahoo/DOT-USD": for some reason it has incomplete historical pricing. I've switched to Coinbase for that pricing info now.

@mbafford
Contributor

mbafford commented Mar 2, 2022

@Nexus2k
OK, yours is actually a different problem from the one reported in this ticket: the API didn't return any timestamp field at all for your query. Your traceback also shows that you're using an older version of beanprice; the line numbers don't match the current code, and the error you hit is already handled in the latest code:

    if 'timestamp' not in result:
        raise YahooError(
            "Yahoo returned no data for ticker {} for time range {} - {}".format(
                ticker, time_begin, time_end))

Your issue was addressed in 6fe4c5e


That said, #66 and #68 (same as #66, but with a test case) address the problem reported in this issue; it would be nice to have them merged into the main branch.

@michaelMinar

I'm not sure whether this is related or a separate issue, but I was running into the same TypeError: conversion from NoneType to Decimal is not supported. Because of the fix here, I upgraded bean-price with pip and ran it again. Now I get an error much earlier in the process, involving Celery and importing dbm.

Traceback (most recent call last):
  File "/usr/local/bin/bean-price", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/mminar/git/beanprice/beanprice/price.py", line 839, in main
    args, jobs, entries, dcontext = process_args()
                                    ^^^^^^^^^^^^^^
  File "/Users/mminar/git/beanprice/beanprice/price.py", line 771, in process_args
    setup_cache(args.cache_filename, args.clear_cache)
  File "/Users/mminar/git/beanprice/beanprice/price.py", line 564, in setup_cache
    _CACHE = shelve.open(cache_filename, flag=flag)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/shelve.py", line 243, in open
    return DbfilenameShelf(filename, flag, protocol, writeback)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/shelve.py", line 227, in __init__
    Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
                         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/dbm/__init__.py", line 91, in open
    raise error[0]("db type is {0}, but the module is not "
dbm.error: db type is dbm.gnu, but the module is not available

@blais
Member

blais commented Mar 9, 2023 via email

@michaelMinar

Thank you, Blais. Apologies, but I'm not quite sure how to do that on macOS. Do you have a few more breadcrumbs? I used brew to install gdbm, but brew says there is no formula for gdbm-dev. I also used brew to upgrade Python, so I assume you just mean doing an upgrade there as well? Or is rebuilding a more complicated step?

@dnicolodi

Please note that Celery has nothing to do with this. There is a Cellar component in the Python module paths, but that is due to how Homebrew installs things. Your issue is due to the fact that Homebrew disabled support for dbm.gnu in Python 3.11 in favor of the dbm.ndbm backend. If you need to work with existing dbm.gnu databases, you can install the python-gdbm@3.11 Homebrew package.
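To check which backend an existing cache file needs, and whether the current Python can load it, something along these lines may help. `dbm.whichdb` is part of the standard library; `cache_backend_status` is a hypothetical helper name, and the cache path in the usage note is the one shown in an earlier log in this thread, so adjust it to your setup.

```python
import dbm
import importlib


def cache_backend_status(cache_filename):
    """Return (backend_name, is_available) for an existing dbm file.

    Hypothetical helper. backend_name is whatever dbm.whichdb() reports:
    a module name such as "dbm.gnu", "" if the format is not recognized,
    or None if the file does not exist or cannot be read.
    """
    backend = dbm.whichdb(cache_filename)
    if not backend:
        return backend, False
    try:
        # The backend is usable only if its module can be imported here.
        importlib.import_module(backend)
    except ImportError:
        return backend, False
    return backend, True
```

For example, `cache_backend_status("/tmp/bean-price.cache")` returning `("dbm.gnu", False)` would match the error above: the file was written by dbm.gnu, but this interpreter cannot import that module.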

@blais
Member

blais commented Mar 11, 2023

Michael: this is an issue with your setup, not a Beancount issue.
Thanks, Daniele, for helping out.
