build.koreader.rocks / ota.koreader.rocks down (Azure bandwidth issues) #10615
I've had some time recently to work on it.
How do you mean precisely? Like the list of apt-installed packages and the Docker images? Wrt the OS we shouldn't need much of anything other than Bash and Docker; all the important things are done in Docker. It might be quicker to talk on Gitter or something, btw, if you're on there?
Oh, that's great. Previously I thought we copied the binaries.
There are a couple of things on there, like […]. Pinging @houqp for the Cloudflare stuff.
I know progress sync and OTA are on the machine, and progress sync has persistent data. Anything else?
The configs and scripts, the signing key; I meant the ops thing quite literally (perhaps including the actual archives for OTA, though those would regenerate in time). Just everything that's in there. What's in the cronfile is potentially also important, though it's probably just to run the cleanup script once or twice a day.
OK, let me first upgrade the machine type; it's currently running an A1_Basic, which will be deprecated in August 2024.

VM | Size | Type | vCPUs | RAM (GiB) | Data disks | Max IOPS | Temp storage (GiB) | Premium disk | Cost/month

Likely the B1ms VM type is the best approach. The only thing that matters is the temporary storage, 40 GB vs 4 GB; I'm not sure if that's sufficient. Or, if 1 GB of RAM is enough, we can try B1s and save more of the budget for bandwidth; Azure charges bandwidth separately, which is what ran my subscription out of budget.

It seems I've lost access to the VM. Previously it used a username & password, but now it accepts only an SSH key pair. Azure is very bad at handling keys; replacing the key would break access for you guys, I believe.

```
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDfnbW0jcCTFJknPks6Lir9ZZfiX8By62414r0bvN4ciIQWleU147Ma4ZBrR5E7GV8IyX4zbLmldI00uKbFb9q2IpHN7ebNmKfIOnDnTFOuLMPjsAUHCl13yIr0yLlEWILu1Tni7w3oeNXGy7WK2oDJ5DgwASvoCrQ5GKg3SNnmBJBk/1EMWVQHZ+SralMx5Udz80ij1YutWV0S8kJH3YgHXE1G4SVmTq9oC7riMI5l1QWJgaKynrY2D171VRhIbafqLkR7SmcN1Vw23fnAEbIga94SBcXyl9tVG2r5rYUSGyHkvO+rjc+XHf701AudeG/+LQB2Uf5t90y8e1oV5IREeVz2BIYomQBNQjTyS+EoOB+ai/ponXwaeoVUxTdYRWQTzVtZ0ewPoLTheCK8qEDwm04al4xjWgfCAXzd+5ZKpS2NcgLi+RDL3bN5uExMPevOtkVhhr1BrdJFBz7LeB1X7cx34I1GxhuLfGMoh0oVYhmv8dC2RIoNCngRzFVTKq0= hzj-jie@hzj-jie-x1
```
By the way, changing the machine type ("size" in Azure's terminology) would preserve the data but restart the machine. Should we set up the services in systemd first?
I'm thinking of going by Docker container name, something like this?
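(For anyone reading along, a minimal sketch of what such a unit could look like; the container name `koreader-build` is a made-up placeholder, and the actual files ended up in koreader/koreader-misc#45.)

```ini
# /etc/systemd/system/docker-koreader-build.service (illustrative sketch)
# Wraps an already-created container so systemd can start/stop it by name.
[Unit]
Description=KOReader build container
Requires=docker.service
After=docker.service

[Service]
# `docker start -a` attaches to the container so systemd tracks its lifetime.
ExecStart=/usr/bin/docker start -a koreader-build
ExecStop=/usr/bin/docker stop koreader-build
Restart=on-failure

[Install]
WantedBy=multi-user.target
```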
What does temporary storage mean exactly? Over in /mnt? That's not really used atm; actually, I'm not sure I even knew it existed. I figure we can easily make do with 4 GB for some temp extraction/file manipulation, provided the actually important permanent storage is at least some 30 GB, as it is now.
I don't know, but I can take a look once I can log in to the machine.
I've just added your key to your own authorized_keys (hoping you remember the spelling of your username :)
I cleaned up some of my bad usernames. But if I unfortunately broke your authentication, let me know.
So the machine has two disks. The root is sda1, persistent AFAICT, and it's ~30 GB; sdb is mounted at /mnt, temporary, and it's ~40 GB. So I think that's what "Temp storage" means, but we never use it.
sda1, in contrast, is 98% full:
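(The actual output was attached in the comment; these are just the stock commands to confirm that layout.)

```sh
lsblk -o NAME,SIZE,MOUNTPOINT   # should show sda1 (~30G, /) and sdb1 (~40G, /mnt)
df -h / /mnt                    # usage per filesystem; / was at 98% here
```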
Pretty much everything is in /ops, or more specifically /ops/prod/build/download/stable. Are we serving only the latest build from there via OTA? If so, we can delete the old stable versions. For the server upgrade, it's lucky that do-release-upgrade exists on the VM, so we can use it.
Not as a matter of course; people depend on those, and we purposefully keep a limited number of old ones around. But in order to free up 10+ GB for a release upgrade, go right ahead. :-)
Same story for the older nightlies.
PS: I'm sure you're aware, but don't forget to run the upgrade in tmux in case of connectivity issues. :-)
I will try :)
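(A minimal sketch of that workflow, nothing specific to this server, assuming the usual Ubuntu tooling:)

```sh
tmux new -s upgrade        # the session survives SSH disconnects
sudo do-release-upgrade    # interactive Ubuntu release upgrade
# if the connection drops, log back in and reattach:
tmux attach -t upgrade
```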
Added and enabled services: […]
Anyone want to confirm whether I've done it right? If everything goes well, we may want to add these files to the repo.
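(As a sketch, reusing the hypothetical unit name from above, the usual sequence to register and verify such units would be:)

```sh
sudo systemctl daemon-reload                              # pick up new unit files
sudo systemctl enable --now docker-koreader-build.service # start now and at boot
systemctl status docker-koreader-build.service            # confirm it's active
systemctl list-unit-files | grep koreader                 # confirm enabled at boot
```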
Luckily you don't seem to have interfered with my release proceedings. 😅 I'll check in a few minutes.
Oh, I will definitely let you guys know before proceeding with updates or restarts.
I adjusted the nightswatcher startup script, which had […]
I created koreader/koreader-misc#45 to add the definition files to koreader-misc. Meanwhile, when would be a good time for me to try a restart and ensure the services come back up afterwards?
It doesn't really matter if the nightlies are missing for a day, but the ideal time would be sometime after 7 UTC, so the artifacts for the day are already there. Unless you mean wrt the current Android toolchain update (e.g., #10679), in which case it may be better to wait until that's sorted out.
Oh, maybe put another way: would you please share the commands you use to start the services? I do see five Docker containers now. My concern is that if the services don't come up as expected (I don't think that will happen, though), I can manually start the containers to avoid breaking things.
By the way, memory-wise, 1 GB would be a little restrictive, if I read /proc/meminfo correctly: […]
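(The actual numbers were in the comment; the standard ways to check are:)

```sh
free -h                                        # totals plus available memory
grep -E 'MemTotal|MemAvailable' /proc/meminfo  # raw figures in kB
```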
They're scripts in ops (under build, nginx and sync).
I resized the machine to 1 vCPU + 1 GB RAM, and it seems to be working. It likely buys us 3-5 more days per month. Also, the public address is publicly accessible; I'm not sure if anyone crawls the site directly through the public IP.
I do see requests like […] and […]. Both should be disallowed.
I can see why you'd say that about some bot (though it should either get it from the CF cache, or be the first and hence put it in the cache), but the first is simply Opera, so I'm not sure what you're getting at? For reference, mine looks like this: […]
Oh, I got them from the nginx log: […] rather than from regular browsers.
A request like that will happen all the time, for every caching edge node. There's no such thing as regular browsers just accessing us directly; CF is always in between.
Oh, do you mean requests like […] are indeed passing through CF? I made a mistake here: the hosts are not resolved to the public IP address of the VM. But as long as it returned 200, the request was served through the Azure network and got charged. For example, the following APK was accessed 8 times, with some requests hours apart, and CF should have cached it: […]
Or the tar.gz files in dict/: […]
I haven't seen cache-related headers in the nginx configuration, […] so are you sure CF would cache the files in that case?
Yes. It's impossible for it to be any other way.
"Hours apart" combined with different edge nodes means there's no particular reason it would be cached. They might drop it sooner than whatever you set or might expect if it hasn't been accessed in a while. Something like CF helps the most when there are a ton of requests coming in a short span to the same edge node. But that's why we have a 50-60% cache hit rate rather than over 90% (on average; this morning at 7 it was actually 96.54%).
Yes, you can easily verify for yourself with curl -I or -v. Most importantly, you can see CF is aware of […]
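(As an illustration of that check, against an arbitrary URL on the site: Cloudflare adds a cf-cache-status header whose value, HIT, MISS, EXPIRED, or DYNAMIC, tells you whether that edge served it from cache.)

```sh
# HEAD request; the CF-specific headers show cache behavior per edge node
curl -sI https://ota.koreader.rocks/ | grep -iE 'cf-cache-status|cf-ray|age'
```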
Indeed, it might help a few percentage points to explicitly add some caching headers for at least a week (except for some files that should only be cached a few hours, or maybe a day tops). But ultimately the bandwidth looks like this: […] Those are quite comfortable numbers. If I find the time later I'll set up my VPS that I'm not really using, because Azure is just… weirdly unattractive, but first it's probably better to identify where the data leak is coming from. In any case it can't be from build.koreader.rocks or ota.koreader.rocks, since as stated those pass completely through CF and are tracked by CF. :-/
OK, my previous company used some edge networking service a while ago which transferred data across their edge servers, so static requests would never reach the origin again after some 5 minutes, as I recall. But anyway, glad to know CF does not need any headers to function. (I remember we discussed this before :)
The long and short of it is that it's all proxied, and I can't imagine them somehow missing tens of gigabytes of data traffic. 47 (total) - 26 (cached) = 21 GB of traffic from build. and ota. over the past week. Which, it seems to me, really only leaves the thing I blanked out here, or something internal to Azure. Although technically this is possible: […]
With some file: […]
Of course that's not quite as it should be, so we'll have to double-check the traffic numbers from the nginx logs just in case. https://developers.cloudflare.com/fundamentals/setup/allow-cloudflare-ip-addresses/
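(Following that link, a minimal nginx sketch of letting only Cloudflare's published ranges reach the origin; the two ranges shown are real but the list is abbreviated here, and the same effect could also be achieved with an Azure network security group.)

```nginx
# Goes inside the server block; full, current lists: https://www.cloudflare.com/ips/
allow 173.245.48.0/20;     # Cloudflare IPv4 range (first of ~15)
allow 103.21.244.0/22;     # another Cloudflare IPv4 range
# ... remaining IPv4 and IPv6 ranges from the published list ...
deny all;                  # everything else, including direct hits on the VM IP
```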
I've added these caching headers: […]
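(The actual directives were shown in the comment; purely as a hypothetical illustration, per-type expiry in nginx usually looks something like this, following the week/day split described earlier.)

```nginx
# Sketch for the server block: long cache for immutable release artifacts,
# short for pages/indexes that change daily.
location ~* \.(zip|apk|tar\.gz)$ {
    expires 7d;
    add_header Cache-Control "public";
}
location / {
    expires 1h;
}
```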
Of course, as stated, this will only potentially affect the 22 GB of uncached traffic (mainly depending on whether CF merely checks Last-Modified or simply redownloads the entire thing, because why not), not whatever traffic is actually being problematic.
That's really a lot. Can CF handle 206 correctly? By the way, the name asdfsdfa is cool 👍
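(One way to probe that through the proxy, with a placeholder for an actual artifact path: a 206 response with a Content-Range header means ranged requests are honored end to end.)

```sh
# ask for the first KiB only; expect a 206 status plus a Content-Range header
curl -sI -H 'Range: bytes=0-1023' "https://ota.koreader.rocks/$SOME_FILE" | head -n 6
```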
I think I see what you're saying. […]
2, 3 ⇒ CF has downloaded some 50–70 GB more from the server than it actually served to end users, explaining the missing data, and using more data than it ever saved in the process. This is probably fairly testable, though I don't have the time right this moment.
To whom it may concern, the OTA server is down again. The latest APK on a mirror site is from 01/19/24.
It was brought back up yesterday.
My sync randomly stopped working a few days ago across all devices. Is this a global thing?
ota.koreader.rocks has been down since at least yesterday ~08:00 UTC. I can contribute financially or with my time; I have Azure and Linux server experience if you guys need help.
Thanks for the offer :) The server will be back online on the 26th. AFAIK it gets killed when the network bandwidth reaches some limit. I think @Frenzie and @Hzj-jie are the people involved with the server management. Not sure what can be done without increasing quotas (I'm happily unaware :p)
I find the VPS very bad compared to all others I have access to. It's slow and SSH connections are unreliable; from e.g. AWS EC2 I know that's definitely not an issue of being in North America. These issues might be a worthy trade-off for a lot of bandwidth, yet the bandwidth is apparently also more than an order of magnitude less than anything else. I am nonetheless grateful for its existence. But as you know, these things largely work on annoyance thresholds, and it never quite managed to make enough impact, although it's come close. I definitely wouldn't want to lower the annoyance threshold by rewarding Azure for their terrible product.
btw, doesn't the GitHub pipeline match the build requirements? Just curious.
No idea, but let's focus on solving the bandwidth issue here. Feel free to open another ticket if you want to build artifacts from GitHub. I think GitLab is better. Not all the eggs in the same basket :)
Currently it's a 1st-gen VM on Azure; the VM type has been deprecated and will be removed around 2024.
Meanwhile it's running Ubuntu 16, which should be upgraded.