Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

appstreamcli validate should enforce valid urls #145

Closed
aleixpol opened this issue Oct 19, 2017 · 13 comments
Closed

appstreamcli validate should enforce valid urls #145

aleixpol opened this issue Oct 19, 2017 · 13 comments

Comments

@aleixpol
Copy link
Collaborator

Making sure that <screenshot> and <url> are in an existing server and don't return 404 would be nice.

@ximion
Copy link
Owner

ximion commented Oct 19, 2017

I agree, but that would mean linking libappstream against libcurl or libsoup, which I would like to avoid. That's so far the only reason why this feature doesn't exist yet.

@ximion
Copy link
Owner

ximion commented Oct 21, 2017

I could call the curl or wget binaries though, I'll think about implementing something like that (not ideal, but at least it gives us URL validation).

@ximion ximion closed this as completed in 3e61164 Oct 21, 2017
@ximion
Copy link
Owner

ximion commented Oct 21, 2017

This is implemented now, making the validator just spawn curl for this purpose.
However, URLs like https://bugs.kde.org/enter_bug.cgi?format=guided&product=kfind now get flagged as broken, because apparently the server returns an empty response here initially (does it redirect? not sure what's happening here).

So, I am not sure if I like this feature the way it is right now.

@aleixpol
Copy link
Collaborator Author

It doesn't return an empty response here... :/

$ curl "https://bugs.kde.org/enter_bug.cgi?format=guided&product=kfind" | wc 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  8216    0  8216    0     0   8216      0 --:--:-- --:--:-- --:--:-- 30429
    264     624    8216

What am I missing?

@ximion
Copy link
Owner

ximion commented Oct 23, 2017

To make this check quick and easy to embed into ascli, the following command is used:

curl --output /dev/null --silent --head --fail https://example.com

And that returns a non-zero exist status for the bugs.kde.org URLs, maybe because the server doesn't allow header-only requests, or because there is some weird rerouting going on.
Maybe there is a better curl command for this though.

@aleixpol
Copy link
Collaborator Author

aleixpol commented Nov 2, 2017

Then don't pass --head?

$ curl --output /dev/null --silent --head --fail https://bugs.kde.org
$ echo $?
52
$ curl --output /dev/null --silent --fail https://bugs.kde.org
$ echo $?
0

@ximion
Copy link
Owner

ximion commented Nov 2, 2017

In that case it fetches the whole document, which makes validation take longer (if we need to download the full URL every time instead of just checking the HTTP headers).

@aleixpol
Copy link
Collaborator Author

aleixpol commented Nov 2, 2017

Is it unbearably longer? xD

Note that these checks can also be triggered in parallel...

@ximion
Copy link
Owner

ximion commented Nov 2, 2017

Well, that depends on how fast your internet connection is :P
These checks can't easily be triggered in parallel given the sequential way the validator works (of course you can add parallelism to this particular section, but it would be quite a bit more effort).

@aleixpol aleixpol reopened this Nov 3, 2017
@aleixpol
Copy link
Collaborator Author

aleixpol commented Nov 3, 2017

Well now it just fails for every KDE component, despite the URL being valid.

@ximion
Copy link
Owner

ximion commented Nov 4, 2017

Fix your server to return the right header/allow HEAD requests ;-)

(I have one other idea that might work with curl)

@ximion ximion closed this as completed in 3f4ef50 Nov 4, 2017
@ximion
Copy link
Owner

ximion commented Nov 4, 2017

Try again, this worked in my tests.
Sending a GET instead of HEAD request but downloading only the first byte is a good compromise here, IMO.
Still, I consider servers that don't answer HEAD to not play nice (until recently I didn't even know that not answering that request was allowed).

@aleixpol
Copy link
Collaborator Author

aleixpol commented Nov 6, 2017

Confirmed this fixes the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants