pandas.io.data.Options is broken due to Yahoo options page change #8612

Closed
jasonstrimpel opened this issue Oct 23, 2014 · 19 comments · Fixed by #8631
Comments

@jasonstrimpel

Yahoo reconfigured their options page (http://finance.yahoo.com/q/op?s=AAPL+Options) which broke the entire Options class.

Are there plans for a fix in a future release?

@jreback
Contributor

jreback commented Oct 23, 2014

cc @dstephens99

Well, someone needs to have a look. @strimp099, if you could take a look, that would be great!

@jreback jreback added this to the 0.15.1 milestone Oct 23, 2014
@jasonstrimpel
Author

Guess that's a call to action :) will take a look.

JAS


@PollyP

PollyP commented Oct 23, 2014

I'm seeing this too:

>>> pandas.__version__
'0.15.0'
>>> sp500options = Options('spy', 'yahoo')
>>> data = sp500options.get_all_data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/foo/anaconda/lib/python2.7/site-packages/pandas/io/data.py", line 1108, in get_all_data
    months = self._get_expiry_months()
  File "/Users/foo/anaconda/lib/python2.7/site-packages/pandas/io/data.py", line 1146, in _get_expiry_months
    raise RemoteDataError('Expiry months not available')
pandas.io.data.RemoteDataError: Expiry months not available

@jasonstrimpel
Author

This is the error you'll see. The XML paths have changed.

JAS


@davidastephens
Contributor

I'm working on it. It's changed quite a bit.

@jasonstrimpel
Author

I forked the repo and made a few updates if you want to take a look.

@MichaelWS
Contributor

To be honest, the new page is a lot easier to handle.

@MichaelWS
Contributor

Can I help with it? I parsed the new site two weeks ago.

@PollyP

PollyP commented Oct 24, 2014

Hi, thanks for taking a look at this. I'm afraid I'm still seeing the problem, though. Is there a way I could check to see that I'm running with the correct bits, besides checking the version?


>>> pandas.__version__
'0.15.0.dev'
>>> import pandas.io.data as web
>>> sp500options = web.Options('spy', 'yahoo')
>>> data = sp500options.get_all_data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/foo/anaconda/lib/python2.7/site-packages/pandas/io/data.py", line 1108, in get_all_data
    months = self._get_expiry_months()
  File "/Users/foo/anaconda/lib/python2.7/site-packages/pandas/io/data.py", line 1146, in _get_expiry_months
    raise RemoteDataError('Expiry months not available')
pandas.io.data.RemoteDataError: Expiry months not available
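
On the question of checking which bits are actually running: one option is to ask Python where the module was loaded from and compare that path with the one in the traceback. The following is a minimal sketch, not part of pandas. `describe_module` is a hypothetical helper name, and the stdlib `json` module is used only as a stand-in so the snippet runs anywhere.

```python
import importlib
import inspect


def describe_module(name):
    """Return (source path, version or None) for an importable module."""
    module = importlib.import_module(name)
    # Fall back to __file__ when the source file cannot be determined.
    path = inspect.getsourcefile(module) or getattr(module, "__file__", None)
    # Not every module defines __version__, so default to None.
    version = getattr(module, "__version__", None)
    return path, version


# `json` is only a stand-in; on an install that ships it, pass the
# module you are debugging instead (for example "pandas.io.data").
path, version = describe_module("json")
print(path, version)
```

If the printed path points at an old site-packages copy rather than the checkout containing the fix, the interpreter is still importing the unpatched file.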

@BMeridian

Maybe this could help. I think the problem is in going from the old method to a Unix timestamp.

http://stackoverflow.com/questions/5493514/webscraping-with-beautifulsoup-or-lxml-html/26553199#26553199
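
As a rough illustration of the Unix-timestamp point, here is a sketch of converting an expiry date to seconds since the epoch and back. It assumes the timestamp represents midnight UTC on the expiry date and that the page accepts it through a `date` query parameter. Both are assumptions for illustration, not something confirmed in this thread, and the helper names are hypothetical.

```python
import calendar
import datetime as dt


def expiry_to_unix(expiry):
    """Seconds since the epoch for midnight UTC on the given date."""
    # timegm interprets the time tuple as UTC, unlike time.mktime.
    return calendar.timegm(expiry.timetuple())


def unix_to_expiry(timestamp):
    """Inverse of expiry_to_unix: recover the calendar date (UTC)."""
    return (dt.datetime(1970, 1, 1) + dt.timedelta(seconds=timestamp)).date()


expiry = dt.date(2014, 10, 31)
stamp = expiry_to_unix(expiry)

# ASSUMPTION: the `date` parameter name is illustrative only.
url = "http://finance.yahoo.com/q/op?s={sym}&date={ts}".format(sym="AAPL", ts=stamp)
print(stamp, unix_to_expiry(stamp), url)
```

Using `calendar.timegm` rather than `time.mktime` keeps the result independent of the local timezone of the machine running the scraper.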

@PollyP

PollyP commented Oct 26, 2014

I'm new to pandas but took a run at this:


def _get_expiry_months(self):
    """
    Gets available expiry months.

    Returns
    -------
    months : List of datetime objects
    """

    url = 'http://finance.yahoo.com/q/op?s={sym}'.format(sym=self.symbol)
    root = self._parse_url(url)

    try:
        # NOTE: '//option' is an absolute path, so it matches every <option>
        # in the document, not only those inside the third form.
        options = root.xpath('.//form')[2].xpath('//option')
    except IndexError:
        raise RemoteDataError('Expiry months not available')

    # put the datestrings from the options into a list
    dates = [element.text for element in options]

    # create a list of datetimes, unique by month and year, while maintaining the original order
    seen = set()
    months = []
    for datestring in dates:
        month, day, year = datestring.split(' ')
        if (year, month) not in seen:
            try:
                adt = dt.datetime.strptime(datestring, "%B %d, %Y")
            except ValueError:
                raise RemoteDataError('Cannot parse expiry months')
            months.append(adt)
            seen.add((year, month))

    self.months = months

    return months

But it's not clear to me if it really should be returning expirations unique for each month and year as I'm doing here (and which the method name implies), or if I should be returning expirations, period. If it's the latter then I'll change the code. Can anyone enlighten me?

Alas, this is not the only thing the changed Yahoo page breaks: _get_option_data() is broken too. I haven't looked at that yet.

Any other suggestions? I know I'll need to clean up the xpath string. It's just begging to have another form added to the page and break things all over again.

UPDATE: As I'm going through the Options class, I see that it's just using the expiry month and year throughout the code. If you're not using dates, how would this work for months with multiple expiration dates, e.g., weeklies and quarterlies? Are the non-standard options getting skipped over? Or am I missing something here?
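
To make the concern concrete, here is a small sketch of what deduplicating by month and year does to a list of expiries. The date strings are made-up sample data in the same format the code above parses, and `parse_expiries` and `first_per_month` are hypothetical helper names, not part of pandas.

```python
import datetime as dt


def parse_expiries(datestrings):
    """Parse strings such as 'October 31, 2014' into date objects."""
    return [dt.datetime.strptime(s, "%B %d, %Y").date() for s in datestrings]


def first_per_month(expiries):
    """Keep only the first expiry seen for each (year, month), in order."""
    seen = set()
    kept = []
    for d in expiries:
        key = (d.year, d.month)
        if key not in seen:
            seen.add(key)
            kept.append(d)
    return kept


# Made-up sample: one October expiry and two November expiries.
sample = ["October 31, 2014", "November 7, 2014", "November 22, 2014"]
expiries = parse_expiries(sample)
monthly = first_per_month(expiries)
dropped = [d for d in expiries if d not in monthly]
print(monthly, dropped)
```

With this sample, the second November date ends up in `dropped`, which is the skipped-weeklies behaviour the question is asking about. Keying on the full date instead of the month would keep every expiry.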

@davidastephens
Contributor

Yahoo used to have all the expiries for a month on the same page. My PR (#8631) changes the code so that it uses a specific date now.

@jorisvandenbossche
Member

The example in the docs (http://pandas-docs.github.io/pandas-docs-travis/remote_data.html#yahoo-finance-options) is broken, so this certainly needs to be addressed for 0.15.1 (fixed or temporarily removed from the docs)

@davidastephens
Contributor

#8631 will fix the docs.

@BMeridian

Look at the accepted answer, which is now fixed. It does not use pandas, but the Unix date handling and the URL are fixed, and it can be run in a loop.

http://stackoverflow.com/questions/5493514/webscraping-with-beautifulsoup-or-lxml-html/26589843#26589843

@jreback jreback modified the milestones: 0.15.1, 0.15.2 Oct 30, 2014
@chiragmatkar

Am I missing anything here? It's still not working, I guess.

>>> sp500options = Options('spy', 'yahoo')
>>> data = sp500options.get_all_data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Chirag\Anaconda\lib\site-packages\pandas\io\data.py", line 1111, in get_all_data
    m2 = month.month
AttributeError: 'str' object has no attribute 'month'

@jorisvandenbossche
Member

@chiragmatkar what version are you using? Working for me with 0.16.2

@chiragmatkar

@jorisvandenbossche 0.14.1

@chiragmatkar

@jorisvandenbossche Thanks. Upgraded and it's working fine now.

gfyoung added a commit to forking-repos/pandas that referenced this issue Aug 27, 2016
Partially addresses pandas-devgh-8254.
Closes pandas-devgh-8612 because pd.to_datetime has a format arg.

[ci skip]
jorisvandenbossche pushed a commit that referenced this issue Aug 27, 2016
Partially addresses gh-8254.
Closes gh-8612 because pd.to_datetime has a format arg.