Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot serve css/js static files big size #1064

Closed
tonychuuy opened this issue Jan 15, 2024 · 44 comments · Fixed by #1067
Closed

Cannot serve css/js static files big size #1064

tonychuuy opened this issue Jan 15, 2024 · 44 comments · Fixed by #1067

Comments

@tonychuuy
Copy link

I always get the following error:

2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client
 2024/01/15 17:53:41 [error] 49#52 *104 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a 
telescope/app.css" has changed while sending response to a client

This file size is 147kb, but if I try to load a css in the same directory with size of 1kb, it works fine.
My config file:

{
    "listeners":{
        "*:80":{
            "pass":"routes"
        }
    },

    "routes": [
        {
            "match": {
                "uri": "!/index.php"
            },
            "action": {
                "share": "/var/www/html/public$uri",
                "fallback": {
                    "pass": "applications/laravel"
                }
            }
        }
    ],

    "applications": {
        "laravel": {
            "type": "php",
            "root": "/var/www/html/public/",
            "script": "index.php"
        }
    }
}
@danielledeleo
Copy link
Contributor

danielledeleo commented Jan 15, 2024

I'm unable to replicate this as being a problem with large static files. The error mentions that the file has changed during the request, which does appear to be the issue. I'll look into it.

"/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

Besides the error in the logs, does the large CSS file load properly in the browser?

@tonychuuy
Copy link
Author

Trying to access directly to the file in the browser:
NS_ERROR_NET_RESET
The connection was reset

@tonychuuy
Copy link
Author

I'm unable to replicate this as being a problem with large static files.

I even delete some content to the css file it loads properly when it is about 64kb

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

Hi @tonychuuy

Seeing as we are unable to reproduce this behaviour, here is how you may be able to debug this a little further... I'll demonstrate with an example... (this assumes your running on Linux...)

Simple Unit config

{
    "listeners": {
        "[::1]:8080": {
            "pass": "routes"
        }
    },

    "routes": [
        {
            "match": {
                "uri": "*"
            },

            "action": {
                  "share": "/tmp/unit-share$uri"
            }
        }
    ]
}

Create a 1MiB file in the share directory

$ dd if=/dev/zero of=1m bs=1 count=$((1024*1024))
1048576+0 records in
1048576+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 2.49936 s, 420 kB/s

Configure unit with the above config (in the following I'm assuming unit is just running as yourself...)

Find the PID of the 'unit: router' process, something like

$ ps -ef | grep 'unit: router'
andrew   338444 338441  0 00:45 pts/7    00:00:00 unit: router
andrew   338902   7964  0 01:02 pts/2    00:00:00 grep --color=auto unit: route

Then start a strace(1) on it

$ strace -f -p 338444
...

Where the PID (-p option) is the one you got above.

Now hit Unit with a request for that file, something like

$ curl -v localhost:8080/1m -o /dev/null
* processing: localhost:8080/1m
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> GET /1m HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.2.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< Last-Modified: Tue, 16 Jan 2024 00:44:03 GMT
< ETag: "65a5d153-100000"
< Server: Unit/1.32.0
< Date: Tue, 16 Jan 2024 01:06:49 GMT
< Content-Length: 1048576
< 
{ [33107 bytes data]
100 1024k  100 1024k    0     0   237M      0 --:--:-- --:--:-- --:--:--  250M
* Connection #0 to host localhost left intact

So you see it fully retrieved the 1MiB file.

While the strace(1) output shows

[{events=EPOLLIN, data={u32=1073746752, u64=140717087265600}}], 32, -1) = 1
[pid 338447] accept4(11, {sa_family=AF_INET6, sin6_port=htons(39486), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [28], SOCK_NONBLOCK) = 32
[pid 338447] accept4(11, 0x7ffb40001808, [28], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
[pid 338447] epoll_ctl(16, EPOLL_CTL_ADD, 32, {events=EPOLLIN|EPOLLRDHUP|EPOLLET, data={u32=1073752128, u64=140717087270976}}) = 0

Unit accepted a connection from the client and will now wait for the request...

[pid 338447] epoll_wait(16, [{events=EPOLLIN, data={u32=1073752128, u64=140717087270976}}], 32, 180000) = 1
[pid 338447] recvfrom(32, "GET /1m HTTP/1.1\r\nHost: localhos"..., 2048, 0, NULL, NULL) = 79
[pid 338447] openat(AT_FDCWD, "/tmp/unit-share/1m", O_RDONLY|O_NONBLOCK) = 33
[pid 338447] newfstatat(33, "", {st_mode=S_IFREG|0644, st_size=1048576, ...}, AT_EMPTY_PATH) = 0

Unit received the request from the client. opened the requested file and we can see from the fstatat((2) (newfstatat) that the file size is indeed 1MiB (st_size=1048576)

[pid 338447] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 131072
[pid 338447] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 131072) = 131072
[pid 338447] writev(32, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=173}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}], 3) = 262317

Unit reads 256KiB worth of data from the file and sends it to the client...

[pid 338447] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 262144) = 131072
[pid 338447] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 393216) = 131072
[pid 338447] writev(32, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}], 2) = 262144

and again...

[pid 338447] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 524288) = 131072
[pid 338447] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 655360) = 131072
[pid 338447] writev(32, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}], 2) = 262144

and again...

[pid 338447] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 786432) = 131072
[pid 338447] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 917504) = 131072
[pid 338447] close(33)                  = 0
[pid 338447] writev(32, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}], 2) = 262144
[pid 338447] setsockopt(32, SOL_TCP, TCP_NODELAY, [1], 4) = 0

and sends the final 256KiB of data back to the client...

[pid 338447] epoll_wait(16, [{events=EPOLLIN|EPOLLRDHUP, data={u32=1073752128, u64=140717087270976}}], 32, 179999) = 1
[pid 338447] recvfrom(32, "", 2048, 0, NULL, NULL) = 0
[pid 338447] epoll_ctl(16, EPOLL_CTL_DEL, 32, 0xebe9e4) = 0
[pid 338447] epoll_wait(16, [], 32, 0)  = 0
[pid 338447] close(32)                  = 0
[pid 338447] epoll_wait(16,

Then some final clean up...

The return values of the pread(2) system calls should add up to our file size here of 1048576, as likewise should the return values of the writev(2) system calls.

So perhaps if you are able to use the above method when trying to retrieve a problematic file and show the strace(1) output?.

@tonychuuy
Copy link
Author

tonychuuy commented Jan 16, 2024

I'm using the docker image unit:1.31.1-php8.2, I'm trying to apply this test inside the running container, by using docker compose exec <container> bash, but this does not have a ps command or strace.

@tonychuuy
Copy link
Author

I installed procps and strace in the docker image, but now i'm getting the following error:

root@d7156127adc3:/var/www/html# ps -ef | grep 'unit: router'
unit          48       1  0 10:44 ?        00:00:00 unit: router
root          90      53  0 10:48 pts/0    00:00:00 grep unit: router

root@d7156127adc3:/var/www/html# strace -f -p 48
strace: attach: ptrace(PTRACE_SEIZE, 48): Operation not permitted

@tonychuuy
Copy link
Author

Using the top option from docker and applying strace outside docker works for the test:

root@d7156127adc3:/var/www/html# curl -v localhost:8080/1m -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /1m HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Last-Modified: Tue, 16 Jan 2024 10:31:07 GMT
< ETag: "65a65aeb-100000"
< Server: Unit/1.31.1
< Date: Tue, 16 Jan 2024 11:00:02 GMT
< Content-Length: 1048576
<
{ [65363 bytes data]
100 1024k  100 1024k    0     0  62.5M      0 --:--:-- --:--:-- --:--:-- 62.5M
* Connection #0 to host localhost left intact
[pid 84809] <... epoll_wait resumed>[{events=EPOLLIN, data={u32=3221230560, u64=140169478935520}}], 32, -1) = 1
[pid 84809] accept4(11, {sa_family=AF_INET, sin_port=htons(59566), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_NONBLOCK) = 32
[pid 84809] accept4(11, 0x7f7bc0001c08, [16], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
[pid 84809] epoll_ctl(14, EPOLL_CTL_ADD, 32, {events=EPOLLIN|EPOLLRDHUP|EPOLLET, data={u32=3221235936, u64=140169478940896}}) = 0
[pid 84809] epoll_wait(14, [{events=EPOLLIN, data={u32=3221235936, u64=140169478940896}}], 32, 180000) = 1
[pid 84809] recvfrom(32, "GET /1m HTTP/1.1\r\nHost: localhos"..., 2048, 0, NULL, NULL) = 80
[pid 84809] openat(AT_FDCWD, "/tmp/unit-share/1m", O_RDONLY|O_NONBLOCK) = 33
[pid 84809] fstat(33, {st_mode=S_IFREG|0777, st_size=1048576, ...}) = 0
[pid 84809] mprotect(0x7f7bc0021000, 28672, PROT_READ|PROT_WRITE) = 0
[pid 84809] mprotect(0x7f7bc0028000, 135168, PROT_READ|PROT_WRITE) = 0
[pid 84809] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 131072
[pid 84809] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 131072) = 131072
[pid 84809] writev(32, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=173}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}], 3) = 262317
[pid 84809] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 262144) = 131072
[pid 84809] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 393216) = 131072
[pid 84809] writev(32, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}], 2) = 262144
[pid 84809] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 524288) = 131072
[pid 84809] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 655360) = 131072
[pid 84809] writev(32, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}], 2) = 262144
[pid 84809] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 786432) = 131072
[pid 84809] pread64(33, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 917504) = 131072
[pid 84809] close(33)                   = 0
[pid 84809] writev(32, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=131072}], 2) = 262144
[pid 84809] setsockopt(32, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 84809] epoll_wait(14, [{events=EPOLLIN|EPOLLRDHUP, data={u32=3221235936, u64=140169478940896}}], 32, 179980) = 1
[pid 84809] recvfrom(32, "", 2048, 0, NULL, NULL) = 0
[pid 84809] epoll_ctl(14, EPOLL_CTL_DEL, 32, 0x5641a3208ca4) = 0
[pid 84809] epoll_wait(14, [], 32, 0)   = 0
[pid 84809] close(32)                   = 0
[pid 84809] epoll_wait(14,

@tonychuuy
Copy link
Author

The same test with my config and the css file:

root@0af3bfe1dd1a:/var/www/html# curl -v localhost/vendor/telescope/app.css -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET /vendor/telescope/app.css HTTP/1.1
> Host: localhost
> User-Agent: curl/7.74.0
> Accept: */*
>
* Empty reply from server
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server
[pid 87080] <... epoll_wait resumed>[{events=EPOLLIN, data={u32=1140860976, u64=139965535103024}}], 32, -1) = 1
[pid 87080] accept4(14, {sa_family=AF_INET, sin_port=htons(32912), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_NONBLOCK) = 38
[pid 87080] accept4(14, 0x7f4c44001c88, [16], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
[pid 87080] epoll_ctl(20, EPOLL_CTL_ADD, 38, {events=EPOLLIN|EPOLLRDHUP|EPOLLET, data={u32=1140861408, u64=139965535103456}}) = 0
[pid 87080] epoll_wait(20, [{events=EPOLLIN, data={u32=1140861408, u64=139965535103456}}], 32, 180000) = 1
[pid 87080] recvfrom(38, "GET /vendor/telescope/app.css HT"..., 2048, 0, NULL, NULL) = 97
[pid 87080] openat(AT_FDCWD, "/var/www/html/public/vendor/telescope/app.css", O_RDONLY|O_NONBLOCK) = 39
[pid 87080] fstat(39, {st_mode=S_IFREG|0777, st_size=149922, ...}) = 0
[pid 87080] mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4c51cb0000
[pid 87080] pread64(39, "@charset \"UTF-8\";\n/*!\n * Bootstr"..., 131072, 0) = 65512
[pid 87080] write(2, "2024/01/16 11:11:49 [error] 48#5"..., 138) = 138
[pid 87080] munmap(0x7f4c51cb0000, 135168) = 0
[pid 87080] close(39)                   = 0
[pid 87080] shutdown(38, SHUT_RDWR)     = 0
[pid 87080] epoll_ctl(20, EPOLL_CTL_DEL, 38, 0x557e2aa4cd64) = 0
[pid 87080] epoll_wait(20, [], 32, 0)   = 0
[pid 87080] close(38)                   = 0

@javorszky
Copy link
Contributor

@tonychuuy out of curiosity, is this a Laravel app running the Telescope package?

@tonychuuy
Copy link
Author

Yes it is.

@javorszky
Copy link
Contributor

Thanks. I'll try to figure out whether there's anything weird about that file in particular 🤔 . I managed to reproduce the error by actually continually changing the size of a file being shared, though it doesn't look like that's happening in your case.

This happens 100% of the time you're trying to access that file?

@tonychuuy
Copy link
Author

Yes 100%, I can access telescope but it renders horrible because JS/CSS is not loaded.

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

@tonychuuy

Thanks for managing to do that test!

Yes, I can seem something has gone very wrong. Let me see if I can replicate it with a file of the same size as yours.

What kind of filesystem are these files, or I guess the docker image stored on? Any networking involved? e.g nfs/cifs etc...

@tonychuuy
Copy link
Author

tonychuuy commented Jan 16, 2024

This is my config for docker:

docker-compose.yml

version: '3.8'
services:
  php:
    build:
      target: development
    ports:
      - "80:80"
    volumes:
      - .:/var/www/html
      - ./unit.config.json:/docker-entrypoint.d/config.json
    networks:
      - development

I have more services like mysql and redis.

Dockerfile

FROM unit:1.31.1-php8.2 as base

ADD https://github.com/mlocati/docker-php-extension-installer/releases/latest/download/install-php-extensions /usr/local/bin/

RUN chmod +x /usr/local/bin/install-php-extensions && \
    install-php-extensions \
    bcmath \
    imagick \
    intl \
    opcache \
    pcntl \
    pdo_mysql \
    redis \
    soap \
    xsl

FROM base as development

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

Thanks, not really familiar with docker myself...

[pid 87080] openat(AT_FDCWD, "/var/www/html/public/vendor/telescope/app.css", O_RDONLY|O_NONBLOCK) = 39
[pid 87080] fstat(39, {st_mode=S_IFREG|0777, st_size=149922, ...}) = 0

OK...

[pid 87080] mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4c51cb0000

I wonder where this mmap(2) comes from?

[pid 87080] pread64(39, "@charset \"UTF-8\";\n/*!\n * Bootstr"..., 131072, 0) = 65512

So we wanted to read 128KiB, but only managed a little under 64KiB.

Just to double check some things, because this is one of these things that just doesn't make any sense, this is a regular file right?, not a pipe/socket or some such?

This also happens with any file over a certain size?

Another test to do here would be if you could, when unit is all up and running, copy that file to some other location and check its copied size.

Actually could you also check if this happens if you don't have the PHP application running, I.e remove the application part from the unit config to rule it out from doing something nefarious...

@tonychuuy
Copy link
Author

Just to double check some things, because this is one of these things that just doesn't make any sense, this is a regular file right?, not a pipe/socket or some such?

I think is a regular file, and as I said in another comment, I deleted some content of the same file to reach 64kb and it loads correctly.

This also happens with any file over a certain size?

Any file bigger than 64kb, js or css files are the ones I have tested.

Actually could you also check if this happens if you don't have the PHP application running, I.e remove the application part from the unit config to rule it out from doing something nefarious...

I removed the php application and accessing the file directly in the browser same error is presented, other small files are loaded correctly.

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

I think is a regular file, and as I said in another comment, I deleted some content of the same file to reach 64kb and it loads correctly.

Yes it is... st_mode=S_IFREG

When you did the 1MiB file test, that was effectively the same environment as the one your having the issue with?

Are you able to put that 1MiB in the same directory as the css file and try pulling both with curl?

@tonychuuy
Copy link
Author

It was the same environment, I put the file in the /tmp directory.

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

Is /tmp tmpfs?

$ mount | grep /tmp
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,nr_inodes=1048576,inode64)

Could you try putting it with the other files?

@tonychuuy
Copy link
Author

I just did that test:

root@533620ccf085:/var/www/html# curl -v localhost/vendor/telescope/1m -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET /vendor/telescope/1m HTTP/1.1
> Host: localhost
> User-Agent: curl/7.74.0
> Accept: */*
>
* Empty reply from server
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server
[pid  3749] <... epoll_wait resumed>[{events=EPOLLIN, data={u32=872425520, u64=140072640849968}}], 32, -1) = 1
[pid  3749] accept4(14, {sa_family=AF_INET, sin_port=htons(59906), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_NONBLOCK) = 38
[pid  3749] accept4(14, 0x7f6534001c88, [16], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
[pid  3749] epoll_ctl(20, EPOLL_CTL_ADD, 38, {events=EPOLLIN|EPOLLRDHUP|EPOLLET, data={u32=872425952, u64=140072640850400}}) = 0
[pid  3749] epoll_wait(20, [{events=EPOLLIN, data={u32=872425952, u64=140072640850400}}], 32, 180000) = 1
[pid  3749] recvfrom(38, "GET /vendor/telescope/1m HTTP/1."..., 2048, 0, NULL, NULL) = 92
[pid  3749] openat(AT_FDCWD, "/var/www/html/public/vendor/telescope/1m", O_RDONLY|O_NONBLOCK) = 39
[pid  3749] fstat(39, {st_mode=S_IFREG|0777, st_size=1048576, ...}) = 0
[pid  3749] mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f65403cf000
[pid  3749] mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f65403ae000
[pid  3749] pread64(39, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 65512
[pid  3749] write(2, "2024/01/16 16:23:04 [error] 48#5"..., 132) = 132
[pid  3749] munmap(0x7f65403cf000, 135168) = 0
[pid  3749] munmap(0x7f65403ae000, 135168) = 0
[pid  3749] close(39)                   = 0
[pid  3749] shutdown(38, SHUT_RDWR)     = 0
[pid  3749] epoll_ctl(20, EPOLL_CTL_DEL, 38, 0x561d4052ed64) = 0
[pid  3749] epoll_wait(20, [], 32, 0)   = 0
[pid  3749] close(38)                   = 0

@tonychuuy
Copy link
Author

Thinking this is a windows issue, I'm on windows using WSL with docker but the files are actually in the windows system. Let me try in a mac.

@tonychuuy
Copy link
Author

Trying the same config in a mac with docker desktop, it works perfectly, all the files from Telescope are loaded properly. Unfortunately I need to work on windows because this is an old mac and it is super slow.

@danielledeleo
Copy link
Contributor

Thinking this is a windows issue, I'm on windows using WSL with docker but the files are actually in the windows system. Let me try in a mac.

Someone on Reddit appears to have had a the same issue. While they don't mention they're running Unit, the error message is identical so I suspect they are: https://www.reddit.com/r/docker/comments/tjomqm/file_has_changed_while_sending_response_to_a/

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

Good find.

I know nothing about Windows or WSL. But are we talking about WSL version 1 or 2 here? And what version of Windows?

@tonychuuy
Copy link
Author

Good find.

I know nothing about Windows or WSL. But are we talking about WSL version 1 or 2 here? And what version of Windows?

WSL version: 2.0.14.0
Windows version: 10.0.19045.3930

@tippexs
Copy link
Contributor

tippexs commented Jan 16, 2024

I will give it a try on my WSL2 Ubuntu system. So you are saying the files are on C / D drive not in the WSL Linux file system? How do you start the docker compose service? From the WSL2 cli or from docker desktop?

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

From your

curl -v localhost/vendor/telescope/1m -o /dev/null
[pid  3749] pread64(39, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 65512

Interesting that it again only read 65512 bytes, same as when you tried the css file.

Your initial 1MiB file test worked I assume because it was being served out of tmpfs which is effectively RAM.

OK, so you're using WSL 2, I think I've read that it uses the 9p filesystem protocol, which is a network protocol, which could perhaps explain why you're seeing short reads...

Need to try and devise some way of testing that theory...

@tonychuuy
Copy link
Author

I will give it a try on my WSL2 Ubuntu system. So you are saying the files are on C / D drive not in the WSL Linux file system? How do you start the docker compose service? From the WSL2 cli or from docker desktop?

cli inside WSL ubuntu

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

Hmm, interesting

msize=n
The msize is the max 9P packet payload size. The default is 65512 (65536-24)

@tippexs
Copy link
Contributor

tippexs commented Jan 16, 2024

@ac000 correct! That's why I am ALWAYS using the Linux file system for all my project data as the 9P Client / Server can be very slow and causes issues BUT I have worked with projects and Unit and can't remember this issue. I will just try reproduce and share the findiges from my WSL2 Ubuntu and Docker Desktop setup in a couple of minutes / hour.

@tippexs
Copy link
Contributor

tippexs commented Jan 16, 2024

@ac000 let me know if you need anything. I have all the things set up 👍🏼

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

@tonychuuy

If using 9p is optional, can you try taking that out of the equation?

Alternatively you could try this patch

diff --git a/src/nxt_http_static.c b/src/nxt_http_static.c
index e51ba6b0..14420669 100644
--- a/src/nxt_http_static.c
+++ b/src/nxt_http_static.c
@@ -879,12 +879,7 @@ nxt_http_static_buf_completion(nxt_task_t *task, void *obj, void *data)
 
     n = nxt_file_read(fb->file, b->mem.start, size, fb->file_pos);
 
-    if (n != size) {
-        if (n >= 0) {
-            nxt_log(task, NXT_LOG_ERR, "file \"%FN\" has changed "
-                    "while sending response to a client", fb->file->name);
-        }
-
+    if (n == -1) {
         nxt_http_request_error_handler(task, r, r->proto.any);
         goto clean;
     }

@tonychuuy
Copy link
Author

Where do I change this?

@tippexs
Copy link
Contributor

tippexs commented Jan 16, 2024

@ac000 @tonychuuy it works for me having the files inside the Linux filesystem outside the mounted windows shares. So taking 9P out will solve it. Testing more

@tonychuuy
Copy link
Author

@tippexs But then I need to work completely on ubuntu right, how do you edit your files with an IDE installed on windows?

@ac000
Copy link
Member

ac000 commented Jan 16, 2024

@tonychuuy

Taking 9p out of the equation may be easier for you.

@tippexs
Copy link
Contributor

tippexs commented Jan 16, 2024

@tippexs But then I need to work completely on ubuntu right, how do you edit your files with an IDE installed on windows?

I do that using the Ubuntu mount on my windows box. Let me drop you a couple of screenshots.

grafik

This is how I do it.

I am also watching this thread for while microsoft/WSL#4197

So from my point of view I had nothing but issues with the 9P Network Filesystem. I am always working on the Linux file system and using the wsl target in windows.

grafik

@tippexs
Copy link
Contributor

tippexs commented Jan 16, 2024

This is an example project on my WSL on Unit. The css files are sometimes larger than 64K
grafik

This one just works super fine. I will copy it over to the /mnt/d/ and see what that will do.

@tonychuuy
Copy link
Author

@tippexs Amazing, everything is loaded correctly, but phpstorm IDE is so slow to open/indexing the files from ubuntu.

@tippexs
Copy link
Contributor

tippexs commented Jan 16, 2024

Good find.
I know nothing about Windows or WSL. But are we talking about WSL version 1 or 2 here? And what version of Windows?

WSL version: 2.0.14.0 Windows version: 10.0.19045.3930

I have just tested it on my end. My Environment

$>wsl --version
WSL version: 2.0.9.0
Kernel version: 5.15.133.1-1
WSLg version: 1.0.59
MSRDC version: 1.2.4677
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22621.3007

I have copied the project files to /mnt/d/projects/app. I am able to see the files in the Windows Explorer as well.

I have started the Unit Container from inside WSL and the curl just works but we have just checked the output of mnt.

drvfs on /mnt/d type 9p (rw,noatime,dirsync,aname=drvfs;path=D:\;uid=1000;gid=1000;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio)

See the msize? This is the reason why it is working on my end. So I guess the patch from @ac000 should help. We are working on a process to let you build a new Docker Image / Dockerfile with the patch.

@tippexs
Copy link
Contributor

tippexs commented Jan 16, 2024

@tippexs Amazing, everything is loaded correctly, but phpstorm IDE is so slow to open/indexing the files from Ubuntu.

Take a look at this https://www.jetbrains.com/help/phpstorm/how-to-use-wsl-development-environment-in-product.html

Yes the initial indexing might be slower but after that PHPStorm should work just super smooth. I am more than happy to chat more about a dev environment using WSL2. Happy to move to a Discussion OR drop me message on Discord: h3x0ne

@tonychuuy
Copy link
Author

@tippexs Thanks, I will wait for the indexing process, if the patch is applied I maybe return to the windows system since this is a development environment.

@ac000
Copy link
Member

ac000 commented Jan 17, 2024

OK, so I was able to reproduce this locally by exporting a directory into a QEMU/KVM virtual machine running Fedora using the 9P protocol via

$ qemu-kvm ... -virtfs local,mount_tag=unit_share,path=/srv/unit-share,security_model=passthrough,readonly=on

In the guest I mounted it thus

# mount -t 9p -o trans=virtio,version=9p2000.L,msize=65512,ro unit_share /mnt/9p

Note I set the msize to 65512

After pointing Unit to this directory and executing (in the guest)

$ curl -v localhost:8080/tfile -o /dev/null

Where tfile is a file of 149922 bytes in length, I got the expected failure

* Empty reply from server
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Closing connection
curl: (52) Empty reply from server

And looking at the strace output

[pid  1851] recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
[pid  1851] openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
[pid  1851] newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We got the request from the client, opened the requested file and stat'd it...

[pid  1851] mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbcfd428000
[pid  1851] pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
[pid  1851] write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We asked to read 131072 bytes but only got 61440

[pid  1851] munmap(0x7fbcfd428000, 135168) = 0
[pid  1851] close(23)                   = 0
[pid  1851] shutdown(22, SHUT_RDWR)     = 0

At which point the check in unit that if we read less then we asked for kicks in and aborts the transfer...

Now trying it with the above patch which removes that check

$ curl -v localhost:8080/tfile -o /dev/null
...
< HTTP/1.1 200 OK
< Last-Modified: Tue, 16 Jan 2024 14:25:19 GMT
< ETag: "65a691cf-249a2"
< Server: Unit/1.32.0
< Date: Wed, 17 Jan 2024 15:32:35 GMT
< Content-Length: 149922
< 
{ [33109 bytes data]
100  146k  100  146k    0     0  8469k      0 --:--:-- --:--:-- --:--:-- 8612k

All good, lets check the strace output

[pid  1967] recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
[pid  1967] openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
[pid  1967] newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

Again, we got the request from the client, opened the requested file and stat'd it (oh and just to be clear, we stat(2) the file to get its size).

[pid  1967] mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
[pid  1967] pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
[pid  1967] pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
[pid  1967] writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461

So, now we do two reads totalling 80290 bytes and send that back to the client along with some headers.

[pid  1967] pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
[pid  1967] pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
[pid  1967] close(23)                   = 0
[pid  1967] writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

We do another two reads (giving a total of 149922 bytes read) and send that to the client and that's us done.

This was certainly a fun one to debug ;)

I'll turn the patch into a PR...

ac000 added a commit to ac000/unit that referenced this issue Jan 17, 2024
On GH, @tonychuuy reported an issue when using Units 'share' action they
would get the following error in the unit log

  2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the
requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to
accomplish and the error message is not entirely accurate either.

The check in question goes like

    n = pread(file->fd, buf, size, offset);
    return n;
    ...
    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */
        }

       	/* log general error and finish */
    }

If the number of bytes read is not what we asked for and is > -1 (i.e
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.

This is what happens

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to
get the file size.

We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file
of 149922 bytes)

  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.

Normally for a regular file you will only get less bytes than asked for
if the read call is interrupted by a signal or you're near the end of
file.

There is however at least another situation where this may happen, if
the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served
over the 9P filesystem protocol. Unit was running in a docker container
in an Ubuntu VM under Windows/WSL2 and the files where being passed
through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in
real world scenarios.

If it was really desired to check if the had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so lets just remove it.

With it removed we now get for the above test file

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
  mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
  writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
  close(23)                   = 0
  writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two
pread(2)s and another writev(2) and all the file data has been read and
sent to the client.

Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d15 ("Basic support for serving static files.")
Closes: nginx#1064
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
ac000 added a commit to ac000/unit that referenced this issue Jan 18, 2024
On GH, @tonychuuy reported an issue when using Units 'share' action they
would get the following error in the unit log

  2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the
requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to
accomplish and the error message is not entirely accurate either.

The check in question goes like

    n = pread(file->fd, buf, size, offset);
    return n;
    ...
    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */

	   /* >> Problem is here << */
        }

       	/* log general error and finish */
    }

If the number of bytes read is not what we asked for and is > -1 (i.e
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.

This is what happens

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to
get the file size.

We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file
of 149922 bytes)

  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.

Normally for a regular file you will only get less bytes than asked for
if the read call is interrupted by a signal or you're near the end of
file.

There is however at least another situation where this may happen, if
the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served
over the 9P filesystem protocol. Unit was running in a docker container
in an Ubuntu VM under Windows/WSL2 and the files where being passed
through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in
real world scenarios.

If it was really desired to check if the had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so lets just remove it.

With it removed we now get for the above test file

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
  mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
  writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
  close(23)                   = 0
  writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two
pread(2)s and another writev(2) and all the file data has been read and
sent to the client.

Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d15 ("Basic support for serving static files.")
Closes: nginx#1064
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
ac000 added a commit to ac000/unit that referenced this issue Jan 18, 2024
On GH, @tonychuuy reported an issue when using Units 'share' action they
would get the following error in the unit log

  2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the
requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to
accomplish and the error message is not entirely accurate either.

The check in question goes like

    n = pread(file->fd, buf, size, offset);
    return n;
    ...
    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */

            /* >> Problem is here << */
        }

       	/* log general error and finish */
    }

If the number of bytes read is not what we asked for and is > -1 (i.e
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.

This is what happens

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to
get the file size.

We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file
of 149922 bytes)

  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.

Normally for a regular file you will only get less bytes than asked for
if the read call is interrupted by a signal or you're near the end of
file.

There is however at least another situation where this may happen, if
the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served
over the 9P filesystem protocol. Unit was running in a docker container
in an Ubuntu VM under Windows/WSL2 and the files where being passed
through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in
real world scenarios.

If it was really desired to check if the had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so lets just remove it.

With it removed we now get for the above test file

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
  mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
  writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
  close(23)                   = 0
  writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two
pread(2)s and another writev(2) and all the file data has been read and
sent to the client.

Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d15 ("Basic support for serving static files.")
Closes: nginx#1064
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
ac000 added a commit to ac000/unit that referenced this issue Jan 18, 2024
On GH, @tonychuuy reported an issue when using Units 'share' action they
would get the following error in the unit log

  2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the
requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to
accomplish and the error message is not entirely accurate either.

The check in question goes like

    n = pread(file->fd, buf, size, offset);
    return n;
    ...
    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */

            /* >> Problem is here << */
        }

       	/* log general error and finish */
    }

If the number of bytes read is not what we asked for and is > -1 (i.e
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.

This is what happens

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to
get the file size.

We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file
of 149922 bytes)

  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.

Normally for a regular file you will only get less bytes than asked for
if the read call is interrupted by a signal or you're near the end of
file.

There is however at least another situation where this may happen, if
the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served
over the 9P filesystem protocol. Unit was running in a docker container
in an Ubuntu VM under Windows/WSL2 and the files where being passed
through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in
real world scenarios.

If it was really desired to check if the had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so lets just remove it.

With it removed we now get for the above test file

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
  mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
  writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
  close(23)                   = 0
  writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two
pread(2)s and another writev(2) and all the file data has been read and
sent to the client.

Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d15 ("Basic support for serving static files.")
Closes: nginx#1064
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
ac000 added a commit to ac000/unit that referenced this issue Jan 19, 2024
On GH, @tonychuuy reported an issue when using Units 'share' action they
would get the following error in the unit log

  2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the
requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to
accomplish and the error message is not entirely accurate either.

The check in question goes like

    n = pread(file->fd, buf, size, offset);
    return n;
    ...
    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */

            /* >> Problem is here << */
        }

       	/* log general error and finish */
    }

If the number of bytes read is not what we asked for and is > -1 (i.e
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.

This is what happens

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to
get the file size.

We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file
of 149922 bytes)

  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.

Normally for a regular file you will only get less bytes than asked for
if the read call is interrupted by a signal or you're near the end of
file.

There is however at least another situation where this may happen, if
the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served
over the 9P filesystem protocol. Unit was running in a docker container
in an Ubuntu VM under Windows/WSL2 and the files where being passed
through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in
real world scenarios.

If it was really desired to check if the had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so lets just remove it.

With it removed we now get for the above test file

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
  mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
  writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
  close(23)                   = 0
  writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two
pread(2)s and another writev(2) and all the file data has been read and
sent to the client.

Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d15 ("Basic support for serving static files.")
Closes: nginx#1064
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
@ac000 ac000 closed this as completed in 02d1984 Jan 20, 2024
@ac000
Copy link
Member

ac000 commented Jan 20, 2024

@tonychuuy

Thanks for your help in debugging this issue. This fix will appear in the next release of Unit, 1.32

andrey-zelenkov pushed a commit that referenced this issue Feb 27, 2024
On GH, @tonychuuy reported an issue when using Units 'share' action they
would get the following error in the unit log

  2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the
requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to
accomplish and the error message is not entirely accurate either.

The check in question goes like

    n = pread(file->fd, buf, size, offset);
    return n;
    ...
    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */

            /* >> Problem is here << */
        }

       	/* log general error and finish */
    }

If the number of bytes read is not what we asked for and is > -1 (i.e
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.

This is what happens

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to
get the file size.

We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file
of 149922 bytes)

  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.

Normally for a regular file you will only get less bytes than asked for
if the read call is interrupted by a signal or you're near the end of
file.

There is however at least another situation where this may happen, if
the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served
over the 9P filesystem protocol. Unit was running in a docker container
in an Ubuntu VM under Windows/WSL2 and the files where being passed
through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in
real world scenarios.

If it was really desired to check if the had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so lets just remove it.

With it removed we now get for the above test file

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
  mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
  writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
  close(23)                   = 0
  writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two
pread(2)s and another writev(2) and all the file data has been read and
sent to the client.

Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d15 ("Basic support for serving static files.")
Closes: #1064
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
pkillarjun pushed a commit to pkillarjun/unit that referenced this issue May 29, 2024
On GH, @tonychuuy reported an issue when using Units 'share' action they
would get the following error in the unit log

  2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the
requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to
accomplish and the error message is not entirely accurate either.

The check in question goes like

    n = pread(file->fd, buf, size, offset);
    return n;
    ...
    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */

            /* >> Problem is here << */
        }

       	/* log general error and finish */
    }

If the number of bytes read is not what we asked for and is > -1 (i.e
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.

This is what happens

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to
get the file size.

We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file
of 149922 bytes)

  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.

Normally for a regular file you will only get less bytes than asked for
if the read call is interrupted by a signal or you're near the end of
file.

There is however at least another situation where this may happen, if
the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served
over the 9P filesystem protocol. Unit was running in a docker container
in an Ubuntu VM under Windows/WSL2 and the files where being passed
through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in
real world scenarios.

If it was really desired to check if the had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so lets just remove it.

With it removed we now get for the above test file

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
  mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
  writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
  close(23)                   = 0
  writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two
pread(2)s and another writev(2) and all the file data has been read and
sent to the client.

Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d15 ("Basic support for serving static files.")
Closes: nginx#1064
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants