Upload speed much slower than download speed in glftpd #35

Open
Vunacle opened this issue Jan 2, 2020 · 15 comments

@Vunacle Vunacle commented Jan 2, 2020

Tested with v2.09 and v2.10.

glftpd appears to have a limit on upload speed (when glftpd is receiving) of around 4-6 Gbit/s per thread; where exactly the limit sits depends on CPU clock speed.
On a Ryzen 7 1700 @ 3.7 GHz, it is around 5 Gbit/s.
Tested on other CPUs as well, such as Skylake and Broadwell, and they all land around those numbers, regardless of Intel or AMD.

Download speed does not seem to be limited.

These tests were run without SSL enabled.

Upload 1,024.0 Mbytes/1.75(s)/613,917.57Kbps

Download 1,024.0 Mbytes/0.69(s)/1,551,650.03Kbps

Running multiple upload transfers at once gives num_of_xfers x limit (around 5 Gbit/s each).
So 4 transfers at once would give a total of 4 x 5 Gbit/s = 20 Gbit/s.

Any reason upload speed is limited this way?
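
For readers comparing the client figures with the Gbit numbers quoted here, the conversion can be sketched as below. It assumes the client's "Kbps" label actually means kilobytes per second with 1 kB = 1000 bytes; that reading is an assumption, but it is consistent with 1,024 Mbytes taking about 1.75 s. The function name is hypothetical.

```python
def client_rate_to_gbit(kilobytes_per_second: float) -> float:
    """Convert a client-reported rate in kB/s (assumed 1 kB = 1000 bytes) to Gbit/s."""
    return kilobytes_per_second * 1000 * 8 / 1e9


# Figures copied from the upload and download results in this post.
upload = client_rate_to_gbit(613_917.57)
download = client_rate_to_gbit(1_551_650.03)

print(f"upload   ~{upload:.2f} Gbit/s")
print(f"download ~{download:.2f} Gbit/s")
# Aggregate estimate if four uploads each reach the single-transfer rate.
print(f"4 parallel uploads ~{4 * upload:.1f} Gbit/s")
```

Under that assumption the single upload lands just under 5 Gbit/s, which matches the per-thread limit described above.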

Owner

@glftpd glftpd commented Jan 3, 2020

Could be many reasons. Possibly because of the internal 8 KB read buffer, which isn't configurable. I don't have such hardware to profile on, so I can't say.

Feel free to play around with turning off real-time CRC calculation or using ul_buffered to write larger chunks to disk. Maybe that will have some effect.

Also see if other FTP servers give any different results...
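
To illustrate why a small read buffer could matter, here is a minimal sketch (not glftpd's code) that drains a local socket with different read sizes and counts the `recv()` calls. The function names and the 8 MiB payload are assumptions chosen for the demo.

```python
import socket
import threading


def receive_all(sock, buffer_size):
    """Drain sock with reads of at most buffer_size bytes; return (bytes, calls)."""
    total = calls = 0
    while True:
        chunk = sock.recv(buffer_size)
        if not chunk:  # peer closed the connection
            return total, calls
        total += len(chunk)
        calls += 1


def run_transfer(payload_size, buffer_size):
    """Send payload_size zero bytes over a local socket pair and receive them."""
    sender, receiver = socket.socketpair()

    def send():
        sender.sendall(b"\0" * payload_size)
        sender.close()

    thread = threading.Thread(target=send)
    thread.start()
    result = receive_all(receiver, buffer_size)
    thread.join()
    receiver.close()
    return result


if __name__ == "__main__":
    payload = 8 * 1024 * 1024
    for size in (8 * 1024, 256 * 1024):
        received, calls = run_transfer(payload, size)
        print(f"{size // 1024:>3} KB buffer: {received} bytes in {calls} recv() calls")
```

With an 8 KB buffer every call returns at most 8192 bytes, so the receiver needs at least payload / 8192 calls. A larger buffer allows fewer, bigger reads, although the kernel decides how much each call actually returns.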

Author

@Vunacle Vunacle commented Jan 3, 2020

Thank you for the reply. I have tested the options suggested.

All tests are done with either ramdisks or NVMe drives, to remove the storage bottleneck.

Disabling calc_crc gives roughly another 3 Gbit/s, moving from 4-6 Gbit/s to 7-9 Gbit/s.

Without calc_crc, I get the results below.
Upload 1,024.0 Mbytes/1.21(s)/888,123.92Kbps
Upload 10.0 Gbytes/14.91(s)/720,197.08Kbps
Upload 1,024.0 Mbytes/1.41(s)/764,229.06Kbps

The ul_buffered_force option does not appear to have any impact on upload speed.
ul_buffered_force 100
Upload 1,024.0 Mbytes/1.90(s)/566,020.99Kbps
ul_buffered_force 10
Upload 1,024.0 Mbytes/1.76(s)/609,042.44Kbps
ul_buffered_force 50
Upload 1,024.0 Mbytes/1.96(s)/547,827.46Kbps

Without the ul_buffered_force option
Upload 1,024.0 Mbytes/1.73(s)/619,585.59Kbps

Tested with another FTP server as suggested (ProFTPD)
Sending from glFTPd to ProFTPD -> 1,024.0 Mbytes/0.57(s)/1,867,377.09Kbps
Sending from ProFTPD to glFTPd -> 1,024.0 M byte(s) i 1.94 (552,905.16 KBps) (calc_crc on)
Sending from ProFTPD to glFTPd -> 1,024.0 M byte(s) i 1.20 (898,528.72 KBps) (calc_crc off)

It looks like it's a glFTPd-specific upload issue.

@sirmarksalot sirmarksalot commented Jan 3, 2020

are you running a "post_check" script?

Author

@Vunacle Vunacle commented Jan 4, 2020

are you running a "post_check" script?

Thanks for the reply.

Tested both with and without post_check (the zipscript-c); it does not change anything.
Still around the 5 Gbit/s mark with calc_crc on and 8 Gbit/s with calc_crc off, regardless of whether the post_check setting is there or not.

@PatriotM PatriotM commented Jan 4, 2020

As described by Vunacle, I can confirm all tests are reproducible exactly as reported.
It feels like glftpd is very slow in some places. Compare it to an HTTP download with wget (on both servers), where I also reach the full 10 Gbit/s with only 1 thread, or to ProFTPD, where the speed on the same server is a great deal better. Crc calc off, dirlog off, dupefile off, post_check, glftpd logs linked to /dev/null: none of it brings 10 Gbit/s during upload with NVMe or a RAM drive. Even a local transfer from glftpd to glftpd, RAM drive to RAM drive, does not reach the full 10 Gbit/s.
Is there a way to debug glftpd to see what it might be held up by?

@sirmarksalot sirmarksalot commented Jan 4, 2020

I did a few tests for comparison.

The following tests were performed using an FTP client on the same box as glftpd,
connecting to 127.0.0.1 with pasv_addr 127.0.0.1.

The test file is a 4 GB sparse file created with fallocate.
The file is sent to a tmpfs mount via glftpd.

Intel(R) Xeon(R) CPU E5-2620 v2
 calc-crc OFF 600 MB/sec
 calc-crc  ON 360 MB/sec

Intel(R) Core(TM) i7-8700
 calc-crc OFF 1.53 GB/s
 calc-crc  ON 919 MB/s
 
AMD Ryzen 7 3700X
 calc-crc OFF 1.7 GB/s
 calc-crc  ON 710 MB/s

@PatriotM PatriotM commented Jan 4, 2020

Thanks for the feedback with the tests.
How exactly did you create the file? With dd, from zero or random?

zero: dd if=/dev/zero of=speedfile_zero bs=1G count=4
random: dd if=/dev/random of=speedfile_random bs=1G count=4

Could you do a test with the two created files?
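
For anyone without dd at hand, the two test files could also be generated with a short script. This is only a sketch: the helper name and the chunk size are assumptions, `os.urandom` stands in for reading /dev/random, and the demo size is kept small (the thread uses 4 GB).

```python
import os

CHUNK = 1024 * 1024  # write 1 MiB at a time to keep memory use small


def write_test_file(path, size_bytes, random_data):
    """Write size_bytes of zeros, or of os.urandom() output, to path."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(os.urandom(n) if random_data else bytes(n))
            remaining -= n


if __name__ == "__main__":
    size = 4 * CHUNK  # small demo size; scale up for a real speed test
    write_test_file("speedfile_zero", size, random_data=False)
    write_test_file("speedfile_random", size, random_data=True)
```

Comparing the two files shows whether anything along the path treats all-zero data differently from random data.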

Maybe it really is the internal 8 KB buffer that is too small.
ProFTPD uses this:
Use buffer sizes of 32KB for both reading and writing
SocketOptions rcvbuf 32768 sndbuf 32768
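
For context, `rcvbuf` and `sndbuf` correspond to the standard socket-level options `SO_RCVBUF` and `SO_SNDBUF`. The sketch below shows how an application requests them; it assumes ProFTPD's directive maps onto these options, and it is not code from either server. Note that glftpd's 8 KB buffer is described in this thread as an internal read buffer, which may be a different thing from the kernel socket buffer.

```python
import socket


def make_socket(rcvbuf, sndbuf):
    """Create a TCP socket and request the given kernel buffer sizes in bytes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    return sock


if __name__ == "__main__":
    sock = make_socket(32768, 32768)
    # The kernel may adjust the requested value (Linux typically doubles it).
    print("SO_RCVBUF:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    print("SO_SNDBUF:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    sock.close()
```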

Owner

@glftpd glftpd commented Jan 5, 2020

See what happens if you change the 32 KB buffer to 8 KB in ProFTPD.
It might be possible to either just increase this for the next version or even make it configurable if it's confirmed as helpful.

Author

@Vunacle Vunacle commented Jan 5, 2020

Thank you for the reply.

Did some tests changing the buffer size in ProFTPD from auto down to 8 KB.

Used this option in ProFTPD -> SocketOptions sndbuf 8192 rcvbuf 8192

Varied the buffer size from 8 KB to 512 KB to see the changes.

These are the speeds I got when transferring from glFTPd to ProFTPD.

Upload ProFTPD (8kb buffer) 1,024.0 Mbytes/1.10(s)/974,357.37Kbps
Upload ProFTPD (16kb buffer) 1,024.0 Mbytes/0.93(s)/1,160,801.97Kbps
Upload ProFTPD (32kb buffer) 1,024.0 Mbytes/0.80(s)/1,348,921.89Kbps
Upload ProFTPD (64kb buffer) 1,024.0 Mbytes/0.66(s)/1,624,420.31Kbps
Upload ProFTPD (128kb buffer) 1,024.0 Mbytes/0.61(s)/1,748,765.19Kbps
Upload ProFTPD (256kb buffer) 1,024.0 Mbytes/0.54(s)/1,992,099.86Kbps
Upload ProFTPD (512kb buffer) 1,024.0 Mbytes/0.63(s)/1,715,242.53Kbps
Upload ProFTPD (auto buffer, no config line) 1,024.0 Mbytes/0.56(s)/1,900,428.01Kbps

Looks to be capped at around 950 MB/s to 1 GB/s with the 8 KB option in place.

Owner

@glftpd glftpd commented Jan 5, 2020

If you want to try a gl version with another buffer size, poke Hujer on #efnet.

Author

@Vunacle Vunacle commented Jan 5, 2020

Did some testing with a glFTPd version that has larger upload buffers (fresh install).
Tested on a Ryzen 7 1700 @ 3.7 GHz.
All tests are done without SSL enabled on data transfer.

Buffer scaling on upload

calc_crc off

UPLOAD glFTPd (8kb buffer) 1,024.0 Mbytes/1.00(s)/1,072,669.15Kbps
UPLOAD glFTPd (16kb buffer) 1,024.0 Mbytes/0.78(s)/1,369,568.65Kbps
UPLOAD glFTPd (32kb buffer) 1,024.0 Mbytes/0.69(s)/1,547,178.42Kbps
UPLOAD glFTPd (64kb buffer) 1,024.0 Mbytes/0.65(s)/1,644,321.32Kbps
UPLOAD glFTPd (128kb buffer) 1,024.0 Mbytes/0.58(s)/1,857,684.82Kbps
UPLOAD glFTPd (256kb buffer) 1,024.0 Mbytes/0.51(s)/2,093,063.98Kbps

Calc_crc on

UPLOAD glFTPd (256kb buffer) 1,024.0 Mbytes/1.14(s)/939,406.67Kbps
UPLOAD glFTPd (128kb buffer) 1,024.0 Mbytes/1.18(s)/907,643.13Kbps
UPLOAD glFTPd (64kb buffer) 1,024.0 Mbytes/1.17(s)/919,299.51Kbps
UPLOAD glFTPd (32kb buffer) 1,024.0 Mbytes/1.22(s)/882,285.80Kbps
UPLOAD glFTPd (16kb buffer) 1,024.0 Mbytes/1.35(s)/793,014.64Kbps
UPLOAD glFTPd (8kb buffer) 1,024.0 Mbytes/1.47(s)/731,930.35Kbps

With these adjustments, upload performance in glFTPd now looks to be the same as with ProFTPD, with calc_crc off.

I suspect the lower performance with calc_crc on is a CPU limitation, as 1 core sits at 100% when calc_crc is on during a single-thread transfer. I guess it's internal CPU/memory latency and IPC plus clock speed that decide how fast calc_crc can run.
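
As a rough illustration of what a real-time checksum during upload involves, the sketch below updates a running CRC-32 for each received chunk. It uses Python's `zlib.crc32` and assumes calc_crc computes a standard CRC-32; it is not glftpd's implementation, and the helper names are hypothetical.

```python
import zlib


def crc32_of_stream(chunks):
    """Fold each chunk into a running CRC-32, as a receiver could do per read."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def split(data, size):
    """Yield data in pieces of at most size bytes, mimicking buffer-sized reads."""
    return (data[i:i + size] for i in range(0, len(data), size))


if __name__ == "__main__":
    data = bytes(range(256)) * 4096  # 1 MiB of sample data
    for size in (8 * 1024, 256 * 1024):
        print(f"{size // 1024:>3} KB chunks: {crc32_of_stream(split(data, size)):08x}")
```

The result does not depend on the chunk size, but every byte still passes through the checksum on the same core that handles the transfer, which fits the single core at 100% observed above.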

Is there any way to speed up calc_crc? Having calc_crc on speeds up zipscript a lot.

This looks to be what the issue was: the buffers were just too small for high-speed network links (10G+).

Author

@Vunacle Vunacle commented Jan 6, 2020

This is great work, really some massive improvements.
Tested with a Ryzen 7 1700 @ 3.7 GHz.
All tests are done without SSL enabled on data transfer.

Posting some tests done with the improved CRC algorithm.

Calc_crc on

UPLOAD glFTPd (256kb buffer) 1,024.0 Mbytes/0.84(s)/1,281,314.83Kbps
UPLOAD glFTPd (128kb buffer) 1,024.0 Mbytes/1.01(s)/1,061,009.71Kbps
UPLOAD glFTPd (64kb buffer) 1,024.0 Mbytes/0.98(s)/1,096,774.08Kbps
UPLOAD glFTPd (32kb buffer) 1,024.0 Mbytes/1.09(s)/980,586.14Kbps
UPLOAD glFTPd (16kb buffer) 1,024.0 Mbytes/1.18(s)/912,270.03Kbps
UPLOAD glFTPd (8kb buffer) 1,024.0 Mbytes/1.29(s)/830,426.78Kbps

So now, with calc_crc on, it is able to hit 10 Gbit/s.

Author

@Vunacle Vunacle commented Jan 8, 2020

Here are some tests in case someone wants to know the upload numbers with the SSL data channel enabled.
CPU: Ryzen 7 1700 @ 3.7 GHz
Tested on a fully routed network, using a sparse file and ramdisk.
SSL used: TLS v1.2

SSL/Calc_crc on

Upload glFTPd (256kb buffer) 1,024.0 Mbytes/1.72(s)/626,088.53Kbps
Upload glFTPd (128kb buffer) 1,024.0 Mbytes/1.71(s)/629,760.60Kbps
Upload glFTPd (64kb buffer) 1,024.0 Mbytes/1.69(s)/634,974.47Kbps
Upload glFTPd (32kb buffer) 1,024.0 Mbytes/1.63(s)/657,124.74Kbps
Upload glFTPd (16kb buffer) 1,024.0 Mbytes/1.69(s)/634,974.47Kbps
Upload glFTPd (8kb buffer) 1,024.0 Mbytes/1.76(s)/608,697.18Kbps

SSL/Calc_crc off

Upload glFTPd (8kb buffer) 1,024.0 Mbytes/1.36(s)/790,097.00Kbps
Upload glFTPd (16kb buffer) 1,024.0 Mbytes/1.39(s)/772,476.13Kbps
Upload glFTPd (32kb buffer) 1,024.0 Mbytes/1.55(s)/694,080.04Kbps
Upload glFTPd (64kb buffer) 1,024.0 Mbytes/1.36(s)/787,200.75Kbps
Upload glFTPd (128kb buffer) 1,024.0 Mbytes/1.52(s)/707,806.08Kbps
Upload glFTPd (256kb buffer) 1,024.0 Mbytes/1.33(s)/808,540.53Kbps

It all looks CPU limited, even though the CPU should be capable of around 1 GB/s of AES in hardware per core.

@donmagic1 donmagic1 commented Jan 8, 2020

So how did you fix it?

Author

@Vunacle Vunacle commented Jan 9, 2020

So how did you fix it?

The fix for the slower upload issue is what was suggested earlier in this thread: we need a glFTPd version with a larger internal buffer size. The standard is 8 KB, and it needs to scale up to make use of high-bandwidth network links beyond 5-6 Gbit/s.
The tests above show that this gives the increased performance.

I expect it will be included in the next version, as it has now been proven to improve performance. But someone from the actual glFTPd team will have to confirm this.

Another improvement shown in the tests is a faster CRC algorithm, which gave a nice boost in calc_crc speed too. I expect that will be kept and be in the next version as well.
