[docs] fix broken links #6161

Merged
merged 9 commits into from Nov 6, 2023

Conversation

jameslamb
Collaborator

@jameslamb jameslamb commented Oct 30, 2023

The check-links CI job is failing as a result of broken links (recent build).

This PR fixes those errors:

Kubeflow Fairing is no longer supported (click me)
URL        `https://www.kubeflow.org/docs/components/fairing/fairing-overview'
Name       `Kubeflow Fairing'
Parent URL file:///home/runner/work/LightGBM/LightGBM/docs/_build/html/Parallel-Learning-Guide.html, line 452, col 4
Real URL   https://www.kubeflow.org/docs/external-add-ons/fairing/fairing-overview
Check time 0.452 seconds
Warning    [http-redirected] Redirected to
           `https://www.kubeflow.org/docs/external-add-ons/fairing/fairing-overview'
           status: 301 Moved Permanently.
Result     Error: 404 Not Found

See kubeflow/fairing#573.

And at https://github.com/kubeflow/fairing/tree/master

[screenshot]

And https://www.kubeflow.org/docs/external-add-ons/fairing/fairing-overview

[screenshot]

Paris Kaggle Meetup slides have been removed or made private (click me)
URL        `https://drive.google.com/file/d/0B6qJBmoIxFe0ZHNCOXdoRWMxUm8/view'
Name       `Kaggle Paris Meetup #12 Slides'
Parent URL file:///home/runner/work/LightGBM/LightGBM/docs/_build/html/gcc-Tips.html, line 124, col 8
Real URL   https://drive.google.com/file/d/0B6qJBmoIxFe0ZHNCOXdoRWMxUm8/view
Check time 0.549 seconds
Result     Error: 401 Unauthorized

https://drive.google.com/file/d/0B6qJBmoIxFe0ZHNCOXdoRWMxUm8/view

[screenshot]

MSLR dataset moved (click me)
URL        `http://research.microsoft.com/en-us/projects/mslr/'
Name       `link'
Parent URL file:///home/runner/work/LightGBM/LightGBM/docs/_build/html/Experiments.html, line 159, col 8
Real URL   https://www.microsoft.com/en-us/research/redirect/?ref=https://research.microsoft.com/en-us/projects/mslr/
Check time 62.618 seconds
Size       0B
Warning    [http-redirected] Redirected to
           `https://www.microsoft.com/en-us/research/redirect/?ref=https://research.microsoft.com/en-us/projects/mslr/'
           status: 301 Moved Permanently.
Result     Error: ReadTimeout: HTTPSConnectionPool(host='www.microsoft.com', port=443): Read timed out. (read timeout=60)
SWIG docs site requests sometimes fail (click me)
URL        `http://www.swig.org/download.html'
Name       `SWIG'
Parent URL file:///home/runner/work/LightGBM/LightGBM/docs/_build/html/Installation-Guide.html, line 791, col 16
Real URL   http://www.swig.org/download.html
Check time 10.217 seconds
Result     Error: ConnectionError: HTTPConnectionPool(host='www.swig.org', port=80): Max retries exceeded with url: /download.html (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7fc81fd499a0>: Failed to resolve 'www.swig.org' ...

I suspect that might be a result of using http:// instead of https://. Maybe that site redirects http:// traffic in a way that results in failures.

While looking through this, I also saw 2 warnings about redirects from http:// to https:// pages. This PR fixes those as well.
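
The linkchecker records quoted in this description share a simple "field, then value" layout, with indented continuation lines under `Warning`. As a rough illustration of how such output could be triaged, here is a minimal sketch that splits it into one dict per checked URL. The function name `parse_records` and the `FIELDS` tuple are hypothetical helpers written for this example, based only on the fields visible in the excerpts here. This is not part of linkchecker or of this PR.

```python
# Field names as they appear in the log excerpts quoted in this PR.
FIELDS = ("URL", "Name", "Parent URL", "Real URL",
          "Check time", "Size", "Warning", "Result")


def parse_records(log):
    """Split linkchecker-style text output into one dict per checked URL."""
    records, current, last = [], None, None
    for raw in log.splitlines():
        if not raw.strip():
            continue
        # A field line starts at column 0 with a known field name.
        field = next((f for f in FIELDS if raw.startswith(f + " ")), None)
        if field == "URL":
            # "URL" opens a new record.
            current = {}
            records.append(current)
        if current is None:
            continue  # text before the first record
        if field is None:
            # Indented lines continue the previous field (e.g. "Warning").
            if raw[0].isspace():
                current[last] += " " + raw.strip()
            continue
        # Drop the `...' quoting that wraps some values.
        current[field] = raw[len(field):].strip().strip("`'")
        last = field
    return records
```

Filtering the result on `"Result" in record` would then separate hard errors (404, 401, timeouts) from records that only carry a `Warning`.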

redirects (click me)
URL        `http://stackoverflow.com/questions/18085571/pip-install-error-setup-script-specifies-an-absolute-path'
Name       `this thread on stackoverflow'
Parent URL file:///home/runner/work/LightGBM/LightGBM/docs/_build/html/FAQ.html, line 383, col 10
Real URL   https://stackoverflow.com/questions/18085571/pip-install-error-setup-script-specifies-an-absolute-path
Check time 0.318 seconds
Warning    [http-redirected] Redirected to
           `https://stackoverflow.com/questions/18085571/pip-install-error-setup-script-specifies-an-absolute-path'
           status: 301 Moved Permanently.
URL        `http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html'
Name       `link2'
Parent URL file:///home/runner/work/LightGBM/LightGBM/docs/_build/html/GPU-Performance.html, line 162, col 8
Real URL   https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html
Check time 3.449 seconds
Size       9KB
Warning    [http-redirected] Redirected to
           `https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html'
           status: 301 Moved Permanently.
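
Both redirect warnings above come from links written with `http://` whose hosts now serve `https://`. As a sketch of the kind of fix involved, the snippet below rewrites the scheme only for hosts known to redirect. The helper `upgrade_links` and the `HTTPS_HOSTS` set are hypothetical names for this example. The two hosts are taken from the warnings above, and the actual changes in this PR were made by editing the docs directly.

```python
import re

# Hosts taken from the two redirect warnings above; all others are left alone.
HTTPS_HOSTS = {"stackoverflow.com", "www.csie.ntu.edu.tw"}

# Capture the host part of an http:// URL (stops at "/", ":", or whitespace).
_HTTP_URL = re.compile(r"http://([A-Za-z0-9.-]+)")


def upgrade_links(text, hosts=HTTPS_HOSTS):
    """Rewrite http:// to https:// for links whose host is in `hosts`."""
    def _swap(match):
        host = match.group(1)
        if host.lower() in hosts:
            return "https://" + host
        return match.group(0)  # unknown host: keep the link unchanged
    return _HTTP_URL.sub(_swap, text)
```

Restricting the rewrite to an explicit host list avoids silently changing links to sites that may not serve HTTPS at all.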

Notes for Reviewers

✅ triggered the job on this branch and saw it succeed: https://github.com/microsoft/LightGBM/actions/runs/6766631790

@jameslamb jameslamb changed the title [docs] fix broken links WIP: [docs] fix broken links Oct 30, 2023
@jameslamb jameslamb marked this pull request as draft October 30, 2023 04:38
@@ -11,7 +11,7 @@ ignore=
 http.*amd.com/.*
 https.*dl.acm.org/doi/.*
 https.*tandfonline.com/.*
-ignorewarnings=http-robots-denied,https-certificate-error
+ignorewarnings=http-redirected,http-robots-denied,https-certificate-error

It looks like as of the most recent release of linkchecker (v10.3.0), HTTP redirects now throw a warning.

From this PR: linkchecker/linkchecker#750

That's leading to this job failing even with 0 errors. (example build)

That's it. 1087 links in 1115 URLs checked. 28 warnings found. 0 errors found.
Error: Process completed with exit code 255.

In this PR, I'm proposing that the presence of such warnings not cause the job to fail. It'll take some time to go through the 28 warnings and fix them, and I'd prefer to get this job working again as soon as possible to catch truly broken links.

If reviewers agree, I'll put up a new issue documenting the desire to remove this filter and enforce no-redirects in this job again.


I triggered check-links on this branch before adding this change to the config: https://github.com/microsoft/LightGBM/actions/runs/6766475624/job/18387635055

And despite there being 0 errors in the result ...

That's it. 1088 links in 1116 URLs checked. 28 warnings found. 0 errors found.
Stopped checking at 2023-11-06 05:04:50+000 (6 minutes, 20 seconds)

... it's still failing

Error: Process completed with exit code 255.

I suspect that linkchecker returns a non-zero exit code if any warnings are found. The last successful run in this project was https://github.com/microsoft/LightGBM/actions/runs/6219774651/job/16878494852, and that ended with:

That's it. 1089 links in 1125 URLs checked. 0 warnings found. 0 errors found.
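
To make the pass/fail policy discussed in these comments concrete, here is a small sketch that parses the summary lines quoted above and applies an "errors fail, warnings optionally fail" rule. It only illustrates the policy. It is not how linkchecker computes its exit status, and this PR changes the linkchecker config rather than parsing output. `parse_summary` and `should_fail` are hypothetical names.

```python
import re

# Matches lines like:
# "That's it. 1087 links in 1115 URLs checked. 28 warnings found. 0 errors found."
_SUMMARY = re.compile(
    r"(\d+) links? in (\d+) URLs? checked\. "
    r"(\d+) warnings? found\. (\d+) errors? found\."
)


def parse_summary(line):
    """Extract the four counts from a summary line."""
    match = _SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    links, urls, warnings, errors = map(int, match.groups())
    return {"links": links, "urls": urls, "warnings": warnings, "errors": errors}


def should_fail(summary, fail_on_warnings=False):
    """Errors always fail the job; warnings fail it only if requested."""
    if summary["errors"] > 0:
        return True
    return fail_on_warnings and summary["warnings"] > 0
```

With the first summary quoted above (28 warnings, 0 errors), `should_fail` returns `False` by default and `True` with `fail_on_warnings=True`, which mirrors the difference between the proposed behavior and the failing runs.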

@jameslamb jameslamb changed the title WIP: [docs] fix broken links [docs] fix broken links Nov 6, 2023
@jameslamb jameslamb marked this pull request as ready for review November 6, 2023 05:28
@jameslamb jameslamb merged commit 1600422 into master Nov 6, 2023
41 checks passed
@jameslamb jameslamb deleted the docs/fix-links branch November 6, 2023 17:59
david-cortes pushed a commit to david-cortes/LightGBM that referenced this pull request Nov 8, 2023