HTTP 500 error in fetch_mldata mauna-loa-atmospheric-co2 #11108

Closed
rth opened this issue May 18, 2018 · 5 comments

Comments

@rth
Member

rth commented May 18, 2018

There have been several PRs (#11100 (review), #11106) where CircleCI arbitrarily fails due to HTTP 500 errors when calling fetch_mldata('mauna-loa-atmospheric-co2').

Partial traceback below:

Traceback (most recent call last):
  File "/home/circleci/project/examples/gaussian_process/plot_gpr_co2.py", line 75, in <module>
    data = fetch_mldata('mauna-loa-atmospheric-co2').data
  File "/home/circleci/project/sklearn/datasets/mldata.py", line 154, in fetch_mldata
    mldata_url = urlopen(urlname)
  File "/home/circleci/miniconda/envs/testenv/lib/python3.6/urllib/request.py", line 223, in urlopen
[...]

  File "/home/circleci/miniconda/envs/testenv/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error

If this keeps repeating, a possible workaround could be,

@rth rth changed the title from "HTTP 500 error in fetch_mldata('mauna-loa-atmospheric-co2')" to "HTTP 500 error in fetch_mldata mauna-loa-atmospheric-co2" May 18, 2018
@lesteve
Member

lesteve commented May 18, 2018

My understanding is that the long-term solution is the openml fetcher #9543 (not 100% sure what the status is).

mldata.org has historically not been extremely reliable, but if these are just temporary glitches I would say we should ignore them as we have done so far. The feeling I got when investigating #8588 is that mldata.org maintenance is not very active (no disrespect intended, just saying that there is not a staff of 10 full-time people behind it). Edit: more details about who maintains mldata.org: #8588 (comment).

If it becomes too annoying to ignore, we could probably implement a retry mechanism, but someone should double-check that it actually fixes the problem. For example, when a glitch happens it may actually last for a few minutes, in which case a retry mechanism may not be a great fit.
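For illustration only, a retry mechanism along these lines could be sketched as below. This is a minimal sketch, not code from scikit-learn: the helper name `call_with_retries` and its parameters (`n_retries`, `delay`, `backoff`, `sleep`) are hypothetical. It retries only on HTTP 5xx errors and waits longer after each failure, which helps with brief glitches but, as noted above, would still fail if the server stays down for several minutes.

```python
import time
from urllib.error import HTTPError


def call_with_retries(func, n_retries=3, delay=1.0, backoff=2.0,
                      sleep=time.sleep):
    """Call func(), retrying on HTTP 5xx errors with exponential backoff.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    for attempt in range(n_retries + 1):
        try:
            return func()
        except HTTPError as exc:
            # Only server-side errors (5xx) are worth retrying. Re-raise
            # client errors, and give up once the retries are exhausted.
            if exc.code < 500 or attempt == n_retries:
                raise
            sleep(delay)
            # Wait longer before the next attempt.
            delay *= backoff
```

In an example script, the call from the traceback could then be wrapped as `call_with_retries(lambda: fetch_mldata('mauna-loa-atmospheric-co2')).data`. The `sleep` argument exists only so the waiting can be replaced in tests.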

@qinhanmin2014
Member

Apart from fetch_mldata('mauna-loa-atmospheric-co2'), fetch_mldata('MNIST original') also fails in #11106.
I'm +1 for ignoring the issue since I don't think it occurs very often (at least over the past year).
I also suspect that a retry mechanism would introduce unnecessary complexity for the examples.

@rth
Member Author

rth commented May 18, 2018

Agreed. Let's close this for now and re-open later if needed.

@rth rth closed this as completed May 18, 2018
@amueller
Member

@lesteve status is it's on my todo and I'm back from the dead (aka teaching)

@jnothman
Member

jnothman commented May 21, 2018 via email

@jnothman jnothman mentioned this issue Jul 10, 2018

5 participants