http: add support to read and store the referrer header #6591
Any opinions on whether using "URL" in this context is correct, or should we just use 'referer'? [YES] After all, the referrer field can contain any string, not just a URL. Next question, if
Force-pushed from 1855392 to 12328d1.
I vote for just
Hm. That's a good point. I think we should make sure any credentials get stripped off already in the internal logic.
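To make the "strip credentials" idea concrete, here is a minimal sketch in plain C. It is not the code from this PR and not how curl does it internally; `strip_userinfo` is a hypothetical helper name, and the string scanning is a simplification (it assumes an absolute URL of the form `scheme://[userinfo@]host...` and does no validation or percent-decoding).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (an illustration, not curl's implementation):
 * copy `url` into `out` (capacity `outlen`) with any "user:password@"
 * userinfo removed from the authority part.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
static int strip_userinfo(const char *url, char *out, size_t outlen)
{
  const char *sep = strstr(url, "://");
  const char *auth;      /* start of the authority part */
  const char *auth_end;  /* first '/', '?' or '#' after it, or end of string */
  const char *at = NULL; /* last '@' inside the authority, if any */
  const char *p;
  size_t prefix_len;
  size_t rest_len;

  if(!sep) {
    /* No "scheme://" found: the referrer may be any string, copy verbatim. */
    size_t len = strlen(url);
    if(len >= outlen)
      return -1;
    memcpy(out, url, len + 1);
    return 0;
  }

  auth = sep + 3;
  auth_end = auth + strcspn(auth, "/?#");
  for(p = auth; p < auth_end; p++) {
    if(*p == '@')
      at = p;
  }

  /* Keep "scheme://", then continue right after the '@' (if there was one). */
  prefix_len = (size_t)(auth - url);
  p = at ? at + 1 : auth;
  rest_len = strlen(p);
  if(prefix_len + rest_len >= outlen)
    return -1;
  memcpy(out, url, prefix_len);
  memcpy(out + prefix_len, p, rest_len + 1);
  return 0;
}
```

With this sketch, `https://user:secret@example.com/path` becomes `https://example.com/path`, while an `@` that appears only in the path is left untouched. A real implementation would more likely go through a proper URL parser than scan the string by hand.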
Force-pushed from 1dd7300 to 93141d1.
Still some failures here and there, but I couldn't tell conclusively whether they have something to do with the changes in this PR:
Force-pushed from 2e79051 to 65b3808.
I've noticed seemingly random failures in the torture tests recently. I'm not sure what that's about; I'll take a look.
I've seen test 2100 fail in torture tests recently, but when I've tried to reproduce it locally I have not managed to ... 😞
I found one problem in ASAN: exit on leak disrupts closing the memdebug file, so it reports wrong results. Reported as google/sanitizers#1374. It could be solved in curl with an atexit to fclose the file, or with MEMDEBUG_LOG_SYNC, but IMO we should close it explicitly. I'm not sure if valgrind does the same type of abort when a leak is detected.

The second problem (an actual leak) has something to do with the proxy; I noticed that all the arbitrary failed torture tests are proxy tests: 264, 335, 552. Try -t on any of those. If I had to guess, it's 46620b9, since that is a close proxy one, and I also randomly picked the CI of some earlier commits and don't see anything similar. 264 I can reproduce consistently, and so far I've traced it back to this call returning OOM, which causes something to be missed at cleanup: Lines 2411 to 2412 in 94719e7

edit: ignore the line numbers, I sprinkled in a lot of printfs.
@jay: Thank you for investigating! Any thoughts on whether we should keep this PR pending until these get resolved, or do you think this looks fine enough to merge now?
Reported-by: Jay Satiro
Bug: #6591 (comment)
The 264 problem should hopefully be fixed with #6614...
Reported-by: Jay Satiro
Reviewed-by: Jay Satiro
Reviewed-by: Emil Engler
Closes #6614
Bug: #6591 (comment)
- Use atexit to register a dbg cleanup function that closes the logfile.

ASAN/LSAN calls _exit() instead of exit() on error so the logfile must be closed explicitly on exit or data could be lost.

Prior to this change the logfile was not explicitly closed, so it was possible that if LSAN detected a leak and called _exit (which does not flush or close files like exit) then the logfile could be missing data. That could then cause curl's memanalyze to report false leaks (e.g. a malloc was recorded to the logfile but the corresponding free was discarded from the buffer instead of written to the logfile, then memanalyze reports that as a leak).

Ref: google/sanitizers#1374
Bug: curl#6591 (comment)
Closes #xxxx
Force-pushed from 65b3808 to e54d5b2.
One permanent-looking CI issue is that the Cirrus task for FreeBSD 12.1 fails at pkg update -f:
Updating FreeBSD repository catalogue...
Fetching meta.txz: . done
Fetching packagesite.txz: .......... done
Processing entries:
Newer FreeBSD version for package py37-mlt:
To ignore this error set IGNORE_OSVERSION=yes
- package: 1202000
- running kernel: 1201000
Ignore the mismatch and continue? [Y/n]: pkg: repository FreeBSD contains packages for wrong OS version: FreeBSD:12:amd64
Processing entries... done
Unable to update repository FreeBSD
Error updating repositories!

Opened a PR that deletes the CI builds with this image, following the suggestion of a FreeBSD forum thread.
docs/cmdline-opts/write-out.d
@@ -93,6 +93,9 @@ When an HTTP request was made without --location to follow redirects (or when
--max-redir is met), this variable will show the actual URL a redirect
\fIwould\fP have gone to. (Added in 7.18.2)
.TP
.B referer
The referrer header, if there was any. (Added in 7.76.0)
I'd stick to the referer spelling (one r) throughout so it's less confusing.
My rule of thumb was to use the rr flavour in human sentences and the other one in code (this also kind of matches existing use). In this particular case we can work around it by going with: "The Referer: header, if there was any. (Added in 7.76.0)". This style can be found in the CURLOPT_AUTOREFERER and CURLOPT_REFERER-related docs.

Does that sound OK?
Both spellings seem to be used; see --referer using "referrer", for example. Personally I find it confusing, but there seems to be some precedent for it.
Yes, it isn't used fully consistently as it is. I've pushed an update for this specific line, which I'm hoping makes it look better.
Force-pushed from e54d5b2 to 2bc92fc.
Much better CI results so far. There is one test consistently failing for c-ares builds:
Yeah, my fix for 1188 was not good enough. I have landed #6626 since then, which should fix that too.
- add CURLINFO_REFERER libcurl option
- add --write-out '%{referer}' command-line option
- extend --xattr command-line option to fill user.xdg.referrer.url extended attribute with the referrer (if there was any)

Closes #xxxx
Force-pushed from 2bc92fc to 7afd2bf.
Remaining failures:
It seems unlikely that these have anything to do with this patch; they have been happening quite randomly over the past few days for all CI passes.
- Use atexit to register a dbg cleanup function that closes the logfile.

LeakSanitizer (LSAN) calls _exit() instead of exit() when a leak is detected on exit so the logfile must be closed explicitly or data could be lost. Though _exit() does not call atexit handlers such as this, LSAN's call to _exit() comes after the atexit handlers are called.

Prior to this change the logfile was not explicitly closed, so it was possible that if LSAN detected a leak and called _exit (which does not flush or close files like exit) then the logfile could be missing data. That could then cause curl's memanalyze to report false leaks (e.g. a malloc was recorded to the logfile but the corresponding free was discarded from the buffer instead of written to the logfile, then memanalyze reports that as a leak).

Ref: google/sanitizers#1374
Bug: curl#6591 (comment)
Closes #xxxx
Random CI weirdness. I restarted them and they passed.
- Use atexit to register a dbg cleanup function that closes the logfile.

LeakSanitizer (LSAN) calls _exit() instead of exit() when a leak is detected on exit so the logfile must be closed explicitly or data could be lost. Though _exit() does not call atexit handlers such as this, LSAN's call to _exit() comes after the atexit handlers are called.

Prior to this change the logfile was not explicitly closed, so it was possible that if LSAN detected a leak and called _exit (which does not flush or close files like exit) then the logfile could be missing data. That could then cause curl's memanalyze to report false leaks (e.g. a malloc was recorded to the logfile but the corresponding free was discarded from the buffer instead of written to the logfile, then memanalyze reports that as a leak).

Ref: google/sanitizers#1374
Bug: #6591 (comment)
Closes #6620
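The pattern described in that commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not curl's memdebug code: the names `logfile`, `log_init`, `log_record` and `log_cleanup` are hypothetical stand-ins. The point it shows is that a handler registered with atexit() closes (and thereby flushes) the stdio stream, so buffered records are already written if something later terminates the process with _exit().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a debug log; not curl's actual memdebug names. */
static FILE *logfile;
static int cleanup_registered;

/* Close the log explicitly so that buffered records are written out.
 * Safe to call more than once. */
static void log_cleanup(void)
{
  if(logfile) {
    fclose(logfile); /* fclose() flushes pending output before closing */
    logfile = NULL;
  }
}

/* Open the log and register the cleanup handler once.
 * Returns 0 on success, -1 if the file cannot be opened. */
static int log_init(const char *path)
{
  log_cleanup(); /* drop any previously opened log */
  logfile = fopen(path, "w");
  if(!logfile)
    return -1;
  if(!cleanup_registered) {
    /* atexit() handlers run on exit() or on return from main(),
     * but not when _exit() is called directly. */
    if(atexit(log_cleanup) == 0)
      cleanup_registered = 1;
  }
  return 0;
}

/* Append one record; stdio may keep it in its buffer until flush or close. */
static void log_record(const char *line)
{
  if(logfile)
    fprintf(logfile, "%s\n", line);
}
```

A caller would invoke `log_init("memdump")` early, call `log_record(...)` for each allocation event, and rely on the registered handler at exit. The alternative mentioned in the thread, syncing the log after every record, trades this one-time close for a flush on each write.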
This patch implements:
- CURLINFO_REFERER libcurl option
- --write-out '%{referer}' command-line option
- --xattr to fill user.xdg.referrer.url extended attribute with the referrer. This extended attribute is supported by wget and the referrer value is also stored by at least Safari (albeit in a platform-specific format).

This can also be useful to retain a meaningful download URL where the initial URL is being redirected to a temporary one, e.g. with GitHub releases:
Implemented:
Any comments/suggestions are welcome.