UTF8 in links #50

Closed
matkoniecz opened this issue Aug 5, 2021 · 2 comments

Comments

@matkoniecz

matkoniecz commented Aug 5, 2021

Sorry if this is another misunderstanding on my part, but as I understand it, UTF-8 works de facto in links.

UTF-8 characters may be represented differently internally, but browsers seem 100% fine with links that include letters like the ones described at https://en.wikipedia.org/wiki/Ogonek

Sanity check: https://stackoverflow.com/questions/22357509/can-urls-have-utf-8-characters

Even DNS supports UTF-8 characters (with some workarounds and restrictions): https://en.wikipedia.org/wiki/Internationalized_domain_name
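
To illustrate what "internally different" likely means here: a non-ASCII path is normally sent over the wire percent-encoded as UTF-8 bytes, and decoded again for display. The sketch below uses Python's standard urllib.parse only to show that mapping; the helper name encode_path is made up for illustration, and this is not how blcl or broken-link-checker handles URLs.

```python
from urllib.parse import quote, unquote


def encode_path(path: str) -> str:
    """Percent-encode a URL path; non-ASCII characters become UTF-8 bytes.

    Hypothetical helper for illustration only, not part of blcl.
    """
    # Keep "/" unescaped so path separators survive.
    return quote(path, safe="/")


# The file name from the test repository, as a browser would send it.
encoded = encode_path("/test_zażółć.html")
print(encoded)

# Decoding gives back the readable form shown in the address bar.
print(unquote(encoded))

# A space is encoded the same way, matching the %20 link in the log.
print(encode_path("/test space.html"))
```

If a link checker compares or requests the raw form without this encoding step, a link that works in a browser could plausibly be reported as broken, though that is only a guess about the cause.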

Output from running blcl -ro . --filter-level 3 on https://github.com/matkoniecz/broken-link-checker-local-utf8:

git clone https://github.com/matkoniecz/broken-link-checker-local-utf8
cd broken-link-checker-local-utf8
blcl -ro . --filter-level 3
Starting server for path: /home/mateusz/Desktop/test/broken-link-checker-local-utf8
Getting links from: http://localhost:43451/
├─BROKEN─ http://localhost:43451/test.html (HTTP_404)
├───OK─── http://localhost:43451/test%20space.html
└─BROKEN─ http://localhost:43451/test_zażółć.html (BLC_UNKNOWN)
Finished! 3 links found. 2 broken.

Getting links from: http://localhost:43451/test%20space.html
├─BROKEN─ http://localhost:43451/test.html (HTTP_404)
└─BROKEN─ http://localhost:43451/test_zażółć.html (HTTP_undefined)
Finished! 3 links found. 1 excluded. 2 broken.

Finished! 6 links found. 1 excluded. 4 broken.
Elapsed time: 0 seconds 

Sorry if I have misunderstood something again and this works only by accident, but both the local files and the deployed website work fine with the links that are reported as broken here.

@LukasHechenberger
Owner

Can you report this to broken-link-checker?
The crawling itself is done over there...

@matkoniecz
Author

matkoniecz commented Aug 5, 2021

stevenvachon/broken-link-checker#234

Sorry for not checking upstream first.
