Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[perf]: Performance tuning log #1383

Open
popcnt1 opened this issue Feb 19, 2024 · 15 comments
Open

[perf]: Performance tuning log #1383

popcnt1 opened this issue Feb 19, 2024 · 15 comments
Assignees
Labels

Comments

@popcnt1
Copy link
Collaborator

popcnt1 commented Feb 19, 2024

platform1:

MacBook Apple M2 24GB 512GB

disk:

fio --name=random_write_latency_test --filename=rand_write_test --filesize=10G --time_based=1 --ramp_time=30s --runtime=1m --ioengine=posixaio --direct=1 --verify=0 --randrepeat=0 --bs=4k --iodepth=1 --rw=randwrite -norandommap=1 --lat_percentiles=1 --thread=1 --numjobs=1--group_reporting
random_write_latency_test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.36
Starting 1 thread
random_write_latency_test: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [w(1)][100.0%][w=26.5MiB/s][w=6796 IOPS][eta 00m:00s]
random_write_latency_test: (groupid=0, jobs=1): err= 0: pid=3075: Mon Feb 26 10:15:55 2024
  write: IOPS=6985, BW=27.3MiB/s (28.6MB/s)(1637MiB/60001msec); 0 zone resets
    slat (nsec): min=0, max=67000, avg=700.14, stdev=872.37
    clat (usec): min=7, max=7074, avg=142.25, stdev=224.22
     lat (usec): min=7, max=7074, avg=142.95, stdev=224.36
    clat percentiles (usec):
     |  1.00th=[   87],  5.00th=[   91], 10.00th=[   93], 20.00th=[   95],
     | 30.00th=[   97], 40.00th=[   99], 50.00th=[  103], 60.00th=[  109],
     | 70.00th=[  122], 80.00th=[  127], 90.00th=[  135], 95.00th=[  145],
     | 99.00th=[ 1647], 99.50th=[ 1680], 99.90th=[ 1745], 99.95th=[ 1778],
     | 99.99th=[ 1876]
    lat percentiles (usec):
     |  1.00th=[   88],  5.00th=[   92], 10.00th=[   94], 20.00th=[   96],
     | 30.00th=[   98], 40.00th=[  100], 50.00th=[  103], 60.00th=[  110],
     | 70.00th=[  123], 80.00th=[  128], 90.00th=[  135], 95.00th=[  147],
     | 99.00th=[ 1647], 99.50th=[ 1696], 99.90th=[ 1745], 99.95th=[ 1778],
     | 99.99th=[ 1876]
   bw (  KiB/s): min=25203, max=29458, per=100.00%, avg=27959.66, stdev=524.94, samples=119
   iops        : min= 6300, max= 7364, avg=6989.58, stdev=131.24, samples=119
  lat (usec)   : 10=0.01%, 20=0.01%, 50=0.01%, 100=40.50%, 250=57.03%

mem filesystem

fio --name=random_write_latency_test --filename=rand_write_test --filesize=200M --time_based=1 --ramp_time=30s --runtime=1m --ioengine=posixaio --direct=1 --verify=0 --randrepeat=0 --bs=4k --iodepth=1 --rw=randwrite -norandommap=1 --lat_percentiles=1 --thread=1 --numjobs=1--group_reporting
random_write_latency_test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.36
Starting 1 thread
random_write_latency_test: Laying out IO file (1 file / 200MiB)
Jobs: 1 (f=1): [w(1)][100.0%][w=176MiB/s][w=45.2k IOPS][eta 00m:00s]
random_write_latency_test: (groupid=0, jobs=1): err= 0: pid=2819: Tue Feb 27 00:18:06 2024
  write: IOPS=45.2k, BW=177MiB/s (185MB/s)(10.3GiB/60001msec); 0 zone resets
    slat (nsec): min=0, max=911000, avg=2641.40, stdev=2013.63
    clat (nsec): min=0, max=1019.0k, avg=19213.98, stdev=4345.28
     lat (usec): min=3, max=1022, avg=21.86, stdev= 4.82
    clat percentiles (nsec):
     |  1.00th=[ 5024],  5.00th=[11968], 10.00th=[15040], 20.00th=[16064],
     | 30.00th=[17024], 40.00th=[18048], 50.00th=[19072], 60.00th=[20096],
     | 70.00th=[21120], 80.00th=[22912], 90.00th=[23936], 95.00th=[24960],
     | 99.00th=[28032], 99.50th=[30080], 99.90th=[34048], 99.95th=[38144],
     | 99.99th=[48896]
    lat percentiles (nsec):
     |  1.00th=[ 7008],  5.00th=[16064], 10.00th=[17024], 20.00th=[18048],
     | 30.00th=[19072], 40.00th=[20096], 50.00th=[21888], 60.00th=[22912],
     | 70.00th=[24960], 80.00th=[27008], 90.00th=[28032], 95.00th=[28032],
     | 99.00th=[32128], 99.50th=[33024], 99.90th=[38144], 99.95th=[41216],
     | 99.99th=[51968]
   bw (  KiB/s): min=179027, max=184368, per=100.00%, avg=180911.83, stdev=992.36, samples=119
   iops        : min=44756, max=46092, avg=45227.61, stdev=248.16, samples=119
  lat (nsec)   : 2=0.06%
  lat (usec)   : 2=0.20%, 4=0.06%, 10=2.70%, 20=49.35%, 50=47.62%
  lat (usec)   : 100=0.01%, 250=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%
  cpu          : usr=6.38%, sys=13.64%, ctx=4108648, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2712161,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=177MiB/s (185MB/s), 177MiB/s-177MiB/s (185MB/s-185MB/s), io=10.3GiB (11.1GB), run=60001-60001msec

original result (updated: #1392)

BUILDING simple_blog
execute_tx              time:   [5.5210 ms 6.0733 ms 6.3923 ms]
                        change: [-12.752% -2.5564% +8.6448%] (p = 0.66 > 0.05)
                        No change in performance detected.

platform 2

Fedora39

12700k with optane905:

fio --name=random_write_latency_test --filename=rand_write_test --filesize=10G --time_based=1 --ramp_time=30s --runtime=1m --ioengine=libaio --direct=1 --verify=0 --randrepeat=0 --bs=4k --iodepth=1 --rw=randwrite -norandommap=1 --lat_percentiles=1 --thread=1 --numjobs=1--group_reporting
random_write_latency_test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.35
Starting 1 thread
random_write_latency_test: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [w(1)][100.0%][w=368MiB/s][w=94.3k IOPS][eta 00m:00s]
random_write_latency_test: (groupid=0, jobs=1): err= 0: pid=24159: Sun Apr 21 02:02:27 2024
  write: IOPS=88.7k, BW=346MiB/s (363MB/s)(20.3GiB/60001msec); 0 zone resets
    slat (nsec): min=809, max=228619, avg=1015.68, stdev=278.03
    clat (nsec): min=235, max=4003.5k, avg=10082.54, stdev=14267.28
     lat (usec): min=9, max=4005, avg=11.10, stdev=14.28
    clat percentiles (usec):
     |  1.00th=[    9],  5.00th=[    9], 10.00th=[    9], 20.00th=[   10],
     | 30.00th=[   10], 40.00th=[   10], 50.00th=[   10], 60.00th=[   10],
     | 70.00th=[   10], 80.00th=[   10], 90.00th=[   12], 95.00th=[   13],
     | 99.00th=[   15], 99.50th=[   35], 99.90th=[  129], 99.95th=[  310],
     | 99.99th=[  343]
    lat percentiles (usec):
     |  1.00th=[   10],  5.00th=[   10], 10.00th=[   10], 20.00th=[   10],
     | 30.00th=[   11], 40.00th=[   11], 50.00th=[   11], 60.00th=[   11],
     | 70.00th=[   11], 80.00th=[   11], 90.00th=[   13], 95.00th=[   14],
     | 99.00th=[   16], 99.50th=[   36], 99.90th=[  133], 99.95th=[  314],
     | 99.99th=[  343]
   bw (  KiB/s): min=302248, max=377848, per=100.00%, avg=354862.81, stdev=19468.80, samples=120
   iops        : min=75564, max=94462, avg=88715.71, stdev=4867.16, samples=120
  lat (nsec)   : 250=0.01%
  lat (usec)   : 2=0.01%, 4=0.01%, 10=80.69%, 20=18.63%, 50=0.56%
  lat (usec)   : 100=0.02%, 250=0.01%, 500=0.09%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%
  cpu          : usr=3.91%, sys=13.64%, ctx=5322086, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,5322032,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=346MiB/s (363MB/s), 346MiB/s-346MiB/s (363MB/s-363MB/s), io=20.3GiB (21.8GB), run=60001-60001msec

Disk stats (read/write):
  nvme0n1: ios=79/7737330, merge=0/438, ticks=2/87801, in_queue=87802, util=99.76%

Platform 3

Google Cloud with 2TB persistent SSD

sudo fio --name=random_write_latency_test --filename=write_test --filesize=2G --time_based=1 --ramp_time=3s --runtime=20s --ioengine=posixaio --direct=1 --verify=0 --randrepeat=0 --bs=4k --iodepth=1 --rw=randwrite -norandommap=1 --lat_percentiles=1 --thread=1 --numjobs=1--group_reporting
random_write_latency_test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.33
Starting 1 thread
Jobs: 1 (f=1): [w(1)][100.0%][w=5868KiB/s][w=1467 IOPS][eta 00m:00s]
random_write_latency_test: (groupid=0, jobs=1): err= 0: pid=26437: Thu Apr 11 12:25:19 2024
  write: IOPS=1430, BW=5722KiB/s (5859kB/s)(112MiB/20001msec); 0 zone resets
    slat (nsec): min=1021, max=37776, avg=4154.48, stdev=1152.85
    clat (usec): min=219, max=113935, avg=693.93, stdev=1277.30
     lat (usec): min=223, max=113939, avg=698.08, stdev=1277.28
    clat percentiles (usec):
     |  1.00th=[  251],  5.00th=[  277], 10.00th=[  297], 20.00th=[  338],
     | 30.00th=[  494], 40.00th=[  586], 50.00th=[  635], 60.00th=[  693],
     | 70.00th=[  791], 80.00th=[  930], 90.00th=[ 1090], 95.00th=[ 1287],
     | 99.00th=[ 1582], 99.50th=[ 1696], 99.90th=[ 2147], 99.95th=[ 6259],
     | 99.99th=[91751]
    lat percentiles (usec):
     |  1.00th=[  255],  5.00th=[  281], 10.00th=[  302], 20.00th=[  343],
     | 30.00th=[  498], 40.00th=[  594], 50.00th=[  644], 60.00th=[  701],
     | 70.00th=[  799], 80.00th=[  938], 90.00th=[ 1090], 95.00th=[ 1287],
     | 99.00th=[ 1582], 99.50th=[ 1696], 99.90th=[ 2147], 99.95th=[ 6259],
     | 99.99th=[91751]
   bw (  KiB/s): min= 4168, max= 7576, per=100.00%, avg=5723.72, stdev=469.48, samples=40
   iops        : min= 1042, max= 1894, avg=1430.80, stdev=117.37, samples=40
  lat (usec)   : 250=0.85%, 500=29.52%, 750=35.81%, 1000=18.83%
  lat (msec)   : 2=14.89%, 4=0.04%, 10=0.01%, 20=0.02%, 50=0.02%
  lat (msec)   : 100=0.01%, 250=0.01%
  cpu          : usr=0.43%, sys=1.01%, ctx=28699, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,28609,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=5722KiB/s (5859kB/s), 5722KiB/s-5722KiB/s (5859kB/s-5859kB/s), io=112MiB (117MB), run=20001-20001msec

Disk stats (read/write):
  sdb: ios=58074/34227, merge=0/1781, ticks=38676/71788, in_queue=110465, util=100.00%

Platform 4

alicloud

sudo fio --name=random_write_latency_test --filename=write_test --filesize=2G --time_based=1 --ramp_time=3s --runtime=20s --ioengine=posixaio --direct=1 --verify=0 --randrepeat=0 --bs=4k --iodepth=1 --rw=randwrite -norandommap=1 --lat_percentiles=1 --thread=1 --numjobs=1--group_reporting
random_write_latency_test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 thread
random_write_latency_test: Laying out IO file (1 file / 2048MiB)
Jobs: 1 (f=1): [w(1)][100.0%][w=11.0MiB/s][w=2811 IOPS][eta 00m:00s]
random_write_latency_test: (groupid=0, jobs=1): err= 0: pid=8418: Fri Apr 12 17:08:53 2024
  write: IOPS=2814, BW=11.0MiB/s (11.5MB/s)(220MiB/20001msec); 0 zone resets
    slat (nsec): min=450, max=13775, avg=924.47, stdev=321.14
    clat (usec): min=268, max=2899, avg=353.81, stdev=41.99
     lat (usec): min=269, max=2902, avg=354.74, stdev=42.00
    clat percentiles (usec):
     |  1.00th=[  334],  5.00th=[  338], 10.00th=[  338], 20.00th=[  338],
     | 30.00th=[  343], 40.00th=[  343], 50.00th=[  343], 60.00th=[  347],
     | 70.00th=[  347], 80.00th=[  351], 90.00th=[  371], 95.00th=[  429],
     | 99.00th=[  494], 99.50th=[  562], 99.90th=[  676], 99.95th=[  807],
     | 99.99th=[ 1860]
    lat percentiles (usec):
     |  1.00th=[  334],  5.00th=[  338], 10.00th=[  338], 20.00th=[  343],
     | 30.00th=[  343], 40.00th=[  343], 50.00th=[  347], 60.00th=[  347],
     | 70.00th=[  347], 80.00th=[  351], 90.00th=[  371], 95.00th=[  429],
     | 99.00th=[  494], 99.50th=[  562], 99.90th=[  676], 99.95th=[  807],
     | 99.99th=[ 1860]
   bw (  KiB/s): min=11040, max=11352, per=100.00%, avg=11259.35, stdev=60.98, samples=40
   iops        : min= 2760, max= 2838, avg=2814.77, stdev=15.23, samples=40
  lat (usec)   : 500=99.22%, 750=0.71%, 1000=0.05%
  lat (msec)   : 2=0.02%, 4=0.01%
  cpu          : usr=0.54%, sys=0.43%, ctx=56291, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,56284,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=11.0MiB/s (11.5MB/s), 11.0MiB/s-11.0MiB/s (11.5MB/s-11.5MB/s), io=220MiB (231MB), run=20001-20001msec

Disk stats (read/write):
  vdb: ios=0/65290, merge=0/2898, ticks=0/22339, in_queue=22346, util=99.61%

Platform 5

Fedora39

Intel12700K + Solidigm D5-P5430

fio --name=random_write_latency_test --filename=rand_write_test --filesize=10G --time_based=1 --ramp_time=30s --runtime=1m --ioengine=libaio --direct=1 --verify=0 --randrepeat=0 --bs=4k --iodepth=1 --rw=randwrite -norandommap=1 --lat_percentiles=1 --thread=1 --numjobs=1--group_reporting
random_write_latency_test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.35
Starting 1 thread
Jobs: 1 (f=1): [w(1)][100.0%][w=399MiB/s][w=102k IOPS][eta 00m:00s]
random_write_latency_test: (groupid=0, jobs=1): err= 0: pid=21512: Sun Apr 21 01:56:32 2024
  write: IOPS=101k, BW=395MiB/s (414MB/s)(23.2GiB/60001msec); 0 zone resets
    slat (nsec): min=779, max=72545, avg=928.41, stdev=134.39
    clat (nsec): min=1923, max=4005.2k, avg=8782.28, stdev=4782.13
     lat (usec): min=8, max=4007, avg= 9.71, stdev= 4.79
    clat percentiles (nsec):
     |  1.00th=[ 7904],  5.00th=[ 7968], 10.00th=[ 7968], 20.00th=[ 8032],
     | 30.00th=[ 8096], 40.00th=[ 8096], 50.00th=[ 8096], 60.00th=[ 8160],
     | 70.00th=[ 8160], 80.00th=[ 8160], 90.00th=[ 8256], 95.00th=[ 9792],
     | 99.00th=[31104], 99.50th=[32384], 99.90th=[40192], 99.95th=[44288],
     | 99.99th=[66048]
    lat percentiles (nsec):
     |  1.00th=[ 8768],  5.00th=[ 8896], 10.00th=[ 8896], 20.00th=[ 8896],
     | 30.00th=[ 9024], 40.00th=[ 9024], 50.00th=[ 9024], 60.00th=[ 9024],
     | 70.00th=[ 9152], 80.00th=[ 9152], 90.00th=[ 9280], 95.00th=[10688],
     | 99.00th=[32128], 99.50th=[33536], 99.90th=[40704], 99.95th=[45312],
     | 99.99th=[68096]
   bw (  KiB/s): min=398028, max=409696, per=100.00%, avg=404733.77, stdev=2282.72, samples=120
   iops        : min=99507, max=102424, avg=101183.42, stdev=570.68, samples=120
  lat (usec)   : 2=0.01%, 4=0.01%, 10=95.26%, 20=2.45%, 50=2.27%
  lat (usec)   : 100=0.02%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%
  cpu          : usr=4.11%, sys=14.94%, ctx=6070378, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,6070293,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=395MiB/s (414MB/s), 395MiB/s-395MiB/s (414MB/s-414MB/s), io=23.2GiB (24.9GB), run=60001-60001msec

Disk stats (read/write):
  nvme2n1: ios=0/9052037, merge=0/0, ticks=0/65611, in_queue=65611, util=99.87%
@popcnt1 popcnt1 self-assigned this Feb 19, 2024
@popcnt1

This comment was marked as resolved.

popcnt1 added a commit that referenced this issue Feb 22, 2024
## Summary

1. move build tx out of execution bench, result:

#1383 (comment)

huge improvement(It's actually closer to the real server-side execution overhead) , the major cost is from tokio wake in present (need to figure out)

2. split write/query bench

---------

Co-authored-by: popcnt-subodhi <subodhi@west>
@popcnt1

This comment was marked as resolved.

@popcnt1

This comment was marked as resolved.

@popcnt1

This comment was marked as resolved.

@popcnt1
Copy link
Collaborator Author

popcnt1 commented Feb 26, 2024

with mem fs:

RUST_LOG=error PUREMEM=/Volumes/RAMDisk/rtmp cargo bench --bench bench_tx_write
   Compiling rooch-benchmarks v0.1.0 (/Users/templex/rooch/bodhi/rooch/crates/rooch-benchmarks)
warning: unused import: `std::sync::Arc`
  --> crates/rooch-benchmarks/src/tx.rs:44:5
   |
44 | use std::sync::Arc;
   |     ^^^^^^^^^^^^^^
   |
   = note: `#[warn(unused_imports)]` on by default

warning: unused import: `tempfile::TempDir`
  --> crates/rooch-benchmarks/src/tx.rs:46:5
   |
46 | use tempfile::TempDir;
   |     ^^^^^^^^^^^^^^^^^

warning: `rooch-benchmarks` (lib) generated 2 warnings (run `cargo fix --lib -p rooch-benchmarks` to apply 2 suggestions)
    Finished bench [optimized + debuginfo] target(s) in 13.26s
     Running benches/bench_tx_write.rs (/Users/templex/rooch/bodhi/rooch/target/release/deps/bench_tx_write-9809035715c4c70d)
INCLUDING DEPENDENCY MoveStdlib
INCLUDING DEPENDENCY MoveosStdlib
INCLUDING DEPENDENCY RoochFramework
BUILDING simple_blog
execute_tx              time:   [5.1027 ms 5.6767 ms 5.9972 ms]
                        change: [-13.073% -0.8050% +12.977%] (p = 0.91 > 0.05)
                        No change in performance detected.

@popcnt1
Copy link
Collaborator Author

popcnt1 commented Feb 28, 2024

run bench with tx_data_size = 0, no significant improvement

RUST_LOG=error TX_SIZE=0 PUREMEM=/Volumes/RAMDisk/rtmp cargo bench --bench bench_tx_write
    Finished bench [optimized + debuginfo] target(s) in 1m 37s
     Running benches/bench_tx_write.rs (/Users/templex/rooch/bodhi/rooch/target/release/deps/bench_tx_write-9809035715c4c70d)
INCLUDING DEPENDENCY MoveStdlib
INCLUDING DEPENDENCY MoveosStdlib
INCLUDING DEPENDENCY RoochFramework
BUILDING simple_blog
execute_tx              time:   [5.0727 ms 5.6405 ms 5.9602 ms]
                        change: [-12.465% -0.5425% +11.901%] (p = 0.93 > 0.05)
                        No change in performance detected.

@popcnt1

This comment was marked as outdated.

@popcnt1

This comment was marked as outdated.

popcnt1 pushed a commit that referenced this issue Mar 3, 2024
@popcnt1

This comment was marked as outdated.

@popcnt1
Copy link
Collaborator Author

popcnt1 commented Mar 10, 2024

for no_sync rocksdb (put & write_batch), I log rocksdb operation's cost and using script to output this (187us/tx):

./get_time.sh
Total put operations: 8016
Total time elapsed in put operations (in µs): 44234.4
Average time per put operation (in µs): 5.51827
Total bytes in put operations: 1431756
Average bytes per put operation: 178.612
Average number of put operations per execute_tx: 5.00062
Average time of put operations per execute_tx (in µs): 27.5948
---------
Total get operations: 128334
Total time elapsed in get operations (in µs): 103229
Average time per get operation (in µs): 0.80438
Total bytes in get operations: 43787172
Average bytes per get operation: 341.197
Average number of get operations per execute_tx: 80.0586
Average time of get operations per execute_tx (in µs): 64.3976
---------
Total write_batch operations: 4834
Total time elapsed in write_batch operations (in µs): 39028
Average time per write_batch operation (in µs): 8.07365
Total bytes in write_batch operations: 865968
Average bytes per write_batch operation: 179.141
Average number of write_batch operations per execute_tx: 3.0156
Average time of write_batch operations per execute_tx (in µs): 24.3469
---------
Total put_sync operations: 3209
Total time elapsed in put_sync operations (in µs): 116106
Average time per put_sync operation (in µs): 36.1815
Total bytes in put_sync operations: 167014
Average bytes per put_sync operation: 52.0455
Average number of put_sync operations per execute_tx: 2.00187
Average time of put_sync operations per execute_tx (in µs): 72.4307
---------

@popcnt1

This comment was marked as outdated.

@popcnt1

This comment was marked as outdated.

@popcnt1

This comment was marked as outdated.

@popcnt1
Copy link
Collaborator Author

popcnt1 commented Apr 8, 2024

execute l1 block tx for btc block after 800,000(height):

data import mode avg cost
none 4.4906s
ord 5.7325s
utxo 4.4913s

for btc block after 200,000(height) is 132.46ms:

building profile effect:

  1. overflow-checks = false codegen-units = 1
tx_exec_blk             time:   [4.0600 s 4.4255 s 4.8032 s]
                        change: [-13.149% -1.5229% +10.922%] (p = 0.81 > 0.05)
                        No change in performance detected.
  1. overflow-checks = false codegen-units = 1 lto="thin"
tx_exec_blk             time:   [3.5628 s 3.9052 s 4.2501 s]
                        change: [-22.557% -11.757% -0.5718%] (p = 0.07 > 0.05)
                        No change in performance detected.
  1. overflow-checks = false codegen-units = 256 lto="thin"
tx_exec_blk             time:   [3.7183 s 4.0718 s 4.4269 s]
                        change: [-7.5759% +4.2659% +18.680%] (p = 0.54 > 0.05)
                        No change in performance detected.

@popcnt1
Copy link
Collaborator Author

popcnt1 commented Apr 15, 2024

bench avg cost
exec empty 1.0313ms
exec transfer 1.5611ms
exec btc block (>= 800,000 height) 2.384s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: No status
Development

No branches or pull requests

1 participant