Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP: Remove short read check in nxt_http_static_buf_completion() #1067

Merged
merged 1 commit into from
Jan 20, 2024

Conversation

ac000
Copy link
Member

@ac000 ac000 commented Jan 18, 2024

On GH, @tonychuuy reported an issue when using Units 'share' action they would get the following error in the unit log

2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to accomplish and the error message is not entirely accurate either.

The check in question goes like

n = pread(file->fd, buf, size, offset);
return n;
...
if (n != size) {
    if (n >= 0) {
        /* log file changed error and finish */
    }

   	/* log general error and finish */
}

If the number of bytes read is not what we asked for and is > -1 (i.e not an error) then it says the file has changed, but really it only checks if the file has shrunk (we can't get back more bytes than we asked for) since it was stat'd.

This is what happens

recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to get the file size.

We would then go into a pread/writev loop reading the file data and sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file of 149922 bytes)

pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above check triggered and the file transfer was aborted and the above error message logged.

Normally for a regular file you will only get less bytes than asked for if the read call is interrupted by a signal or you're near the end of file.

There is however at least another situation where this may happen, if the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served over the 9P filesystem protocol. Unit was running in a docker container in an Ubuntu VM under Windows/WSL2 and the files where being passed through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in real world scenarios.

If it was really desired to check if the had changed since it was opened/stat'd then it would require a different methodology and be a patch for another day. But as it stands this current check does more harm than good, so lets just remove it.

With it removed we now get for the above test file

recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
close(23) = 0
writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two pread(2)s and another writev(2) and all the file data has been read and sent to the client.

@hongzhidao
Copy link
Contributor

hongzhidao commented Jan 18, 2024

Hi @ac000,
Good investigation.

I say bogus because it's not clear what the check is trying to accomplish and the error message is not entirely accurate either.

If it was really desired to check if the had changed since it was opened/stat'd then it would require a different methodology and be a patch for another day.

Agree.

It turns out that was indeed the case here, the files where being served over the 9P filesystem protocol. Unit was running in a docker container in an Ubuntu VM under Windows/WSL2 and the files where being passed through to the VM from Windows over 9P.

Could you check if it's a specific error code in the use case? Try with something like this.

diff -r 5bc9788b05ec src/nxt_http_static.c
--- a/src/nxt_http_static.c     Mon Dec 11 10:46:58 2023 +0800
+++ b/src/nxt_http_static.c     Thu Jan 18 09:08:04 2024 +0800
@@ -882,7 +882,8 @@
     if (n != size) {
         if (n >= 0) {
             nxt_log(task, NXT_LOG_ERR, "file \"%FN\" has changed "
-                    "while sending response to a client", fb->file->name);
+                    "while sending response to a client (%E)",
+                    fb->file->name, nxt_errno);
         }

I mean if the patch is to fix some error code, we can enhance it and keep the logic n != size if it's helpful. It depends on if the errno is specific.

@ac000
Copy link
Member Author

ac000 commented Jan 18, 2024

It's not an error, that would be signalled by a return of -1 and that case is handled by the outer if ()...

pread(2) has been asked to read X amount of bytes, it returned > -1 < X amount of bytes. Let me quote from the man page

RETURN VALUE top

   On success, pread() returns the number of bytes read (a return of
   zero indicates end of file) and pwrite() returns the number of
   bytes written.

   Note that it is not an error for a successful call to transfer
   fewer bytes than requested (see [read(2)](https://man7.org/linux/man-pages/man2/read.2.html) and [write(2)](https://man7.org/linux/man-pages/man2/write.2.html)).

   On error, -1 is returned and [errno](https://man7.org/linux/man-pages/man3/errno.3.html) is set to indicate the error.

Note the middle paragraph.

As can be seen, with this did we read less bytes than asked for but it's not an error check in place, we terminate the transfer if that check is true.

If we remove that check we will pread(2) as many times as necessary (possibly reading less bytes than asked for each time) until all the file is read and sent.

Assuming a read on a regular file will return all the bytes asked for, is often true but there are circumstances where it is not, this, where the files being served are coming from a network filesystem is one of them.

@ac000 ac000 force-pushed the short-reads branch 2 times, most recently from b7ca3fa to 93c8d6d Compare January 18, 2024 06:04
@ac000
Copy link
Member Author

ac000 commented Jan 18, 2024

Highlighted in the code snippet where the problem lies.

Added Zhidao's Reviewed-by tag.

$ git range-diff f4a8c080...b7ca3fae
1:  f4a8c080 ! 1:  b7ca3fae HTTP: Remove short read check in nxt_http_static_buf_completion()
    @@ Commit message
             if (n != size) {
                 if (n >= 0) {
                     /* log file changed error and finish */
    +
    +               /* >> Problem is here << */
                 }
     
                 /* log general error and finish */
    @@ Commit message
         Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
         Fixes: 08a8d1510 ("Basic support for serving static files.")
         Closes: https://github.com/nginx/unit/issues/1064
    +    Reviewed-by: Zhidao Hong <z.hong@f5.com>
         Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
     
      ## src/nxt_http_static.c ##

@ac000
Copy link
Member Author

ac000 commented Jan 18, 2024

Fixed alignment in commit message

$ git range-diff b7ca3fae...93c8d6d0
1:  b7ca3fae ! 1:  93c8d6d0 HTTP: Remove short read check in nxt_http_static_buf_completion()
    @@ Commit message
                 if (n >= 0) {
                     /* log file changed error and finish */
     
    -               /* >> Problem is here << */
    +                /* >> Problem is here << */
                 }
     
                 /* log general error and finish */

@ac000
Copy link
Member Author

ac000 commented Jan 18, 2024

Wrap the (n == -1) check in a nxt_slow_path()

$ git range-diff 93c8d6d0...061ebc0e
1:  93c8d6d0 ! 1:  061ebc0e HTTP: Remove short read check in nxt_http_static_buf_completion()
    @@ src/nxt_http_static.c: nxt_http_static_buf_completion(nxt_task_t *task, void *ob
     -                    "while sending response to a client", fb->file->name);
     -        }
     -
    -+    if (n == -1) {
    ++    if (nxt_slow_path(n == -1)) {
              nxt_http_request_error_handler(task, r, r->proto.any);
              goto clean;
          }

@tippexs tippexs added this to the 1.32 milestone Jan 18, 2024
@ac000
Copy link
Member Author

ac000 commented Jan 19, 2024

Check for NXT_ERROR from nxt_file_read() rather than its numeric value.

$ git range-diff 061ebc0e...b6aaa60e
1:  061ebc0e ! 1:  b6aaa60e HTTP: Remove short read check in nxt_http_static_buf_completion()
    @@ src/nxt_http_static.c: nxt_http_static_buf_completion(nxt_task_t *task, void *ob
     -                    "while sending response to a client", fb->file->name);
     -        }
     -
    -+    if (nxt_slow_path(n == -1)) {
    ++    if (nxt_slow_path(n == NXT_ERROR)) {
              nxt_http_request_error_handler(task, r, r->proto.any);
              goto clean;
          }

@tippexs tippexs added this to 💪 In Development in NGINX Unit roadmap-tracker Jan 19, 2024
@ac000 ac000 linked an issue Jan 19, 2024 that may be closed by this pull request
On GH, @tonychuuy reported an issue when using Units 'share' action they
would get the following error in the unit log

  2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the
requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to
accomplish and the error message is not entirely accurate either.

The check in question goes like

    n = pread(file->fd, buf, size, offset);
    return n;
    ...
    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */

            /* >> Problem is here << */
        }

       	/* log general error and finish */
    }

If the number of bytes read is not what we asked for and is > -1 (i.e
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.

This is what happens

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to
get the file size.

We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file
of 149922 bytes)

  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.

Normally for a regular file you will only get less bytes than asked for
if the read call is interrupted by a signal or you're near the end of
file.

There is however at least another situation where this may happen, if
the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served
over the 9P filesystem protocol. Unit was running in a docker container
in an Ubuntu VM under Windows/WSL2 and the files where being passed
through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in
real world scenarios.

If it was really desired to check if the had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so lets just remove it.

With it removed we now get for the above test file

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
  mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
  writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
  close(23)                   = 0
  writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two
pread(2)s and another writev(2) and all the file data has been read and
sent to the client.

Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d15 ("Basic support for serving static files.")
Closes: nginx#1064
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
@ac000
Copy link
Member Author

ac000 commented Jan 20, 2024

Add Andrei's Reviewed-by

$ git range-diff b6aaa60e...02d1984c
1:  b6aaa60e ! 1:  02d1984c HTTP: Remove short read check in nxt_http_static_buf_completion()
    @@ Commit message
         Fixes: 08a8d1510 ("Basic support for serving static files.")
         Closes: https://github.com/nginx/unit/issues/1064
         Reviewed-by: Zhidao Hong <z.hong@f5.com>
    +    Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com>
         Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
     
      ## src/nxt_http_static.c ##

@ac000 ac000 merged commit 02d1984 into nginx:master Jan 20, 2024
NGINX Unit roadmap-tracker automation moved this from 💪 In Development to 🚀 Released Jan 20, 2024
@ac000 ac000 deleted the short-reads branch January 20, 2024 03:47
@alejandro-colomar
Copy link
Contributor

@ac000

I remember having seen this error while developing the gzip feature (not in the final version, though). I don't remember what was the trigger, but it was probably something I did wrong while trying things. That part of the code was the one I felt less comfortable with (it is, as you say, very weird), so when you review those patches, check twice that part of the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: 🚀 Released
NGINX Unit roadmap-tracker
  
🚀 Released
Development

Successfully merging this pull request may close these issues.

Cannot serve css/js static files big size
5 participants