os/bluestore: the exhausted check in BitMapZone can be lock-less. #13653

Merged
merged 1 commit into from Feb 27, 2017

@rzarzynski
Contributor

rzarzynski commented Feb 26, 2017

No description provided.

bluestore: the exhausted check in BitMapZone can be lock-less.
Before the patch, BitMapZone::is_exhausted() required its
callers to acquire the appropriate lock. However, holding the
lock is not necessary to use the method correctly, and
acquiring it can significantly hurt performance.

The change allows BitMapAreaLeaf::child_check_n_lock() to skip
acquiring the lock while checking whether zones are exhausted.

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
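
The idea in the commit message can be illustrated with a minimal sketch. It is not the Ceph code: `ZoneSketch`, `try_alloc`, `free_blocks` and `find_zone_and_alloc` are hypothetical names, and the sketch assumes the zone tracks its used-block count in an atomic counter. Under that assumption, the exhausted check is a single atomic load that can serve as a hint without the lock, while the allocation itself still re-checks capacity under the lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a bitmap zone; NOT the Ceph BitMapZone class.
class ZoneSketch {
public:
  explicit ZoneSketch(int64_t total_blocks) : total_(total_blocks), used_(0) {}

  // Lock-free check: one atomic load. The result may be stale by the time
  // the caller acts on it, so it is only a hint for skipping full zones.
  bool is_exhausted() const {
    return used_.load(std::memory_order_relaxed) >= total_;
  }

  // Allocation takes the lock and re-checks capacity under it, so a stale
  // answer from is_exhausted() cannot lead to over-allocation.
  bool try_alloc(int64_t n) {
    std::lock_guard<std::mutex> guard(lock_);
    if (used_.load(std::memory_order_relaxed) + n > total_) {
      return false;
    }
    used_.fetch_add(n, std::memory_order_relaxed);
    return true;
  }

  void free_blocks(int64_t n) {
    std::lock_guard<std::mutex> guard(lock_);
    used_.fetch_sub(n, std::memory_order_relaxed);
  }

private:
  const int64_t total_;
  std::atomic<int64_t> used_;
  std::mutex lock_;
};

// Hypothetical scan helper: exhausted zones are skipped without touching
// their locks; only candidate zones pay the locking cost.
// Returns the index of the zone that served the request, or -1.
int find_zone_and_alloc(const std::vector<ZoneSketch*>& zones, int64_t n) {
  for (std::size_t i = 0; i < zones.size(); ++i) {
    if (zones[i]->is_exhausted()) {
      continue;  // cheap skip, no lock taken
    }
    if (zones[i]->try_alloc(n)) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

A stale "not exhausted" answer only costs one failed `try_alloc`, and a stale "exhausted" answer only causes one zone to be skipped during that scan, which is why the check can tolerate running without the lock in this sketch.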

@tchaikov tchaikov merged commit 2db2f05 into ceph:master Feb 27, 2017

3 checks passed

Signed-off-by: all commits in this PR are signed
Unmodifed Submodules: submodules for project are unmodified
default: Build finished.

Contributor

rzarzynski commented Feb 27, 2017

Quick & rough performance tests on ramdisk (BlueStore: bluestore_debug_omit_block_device_write = true, FIO: nr_files=64, size=256k, bs=4k, numjobs=1):

  • Before the change:
bluestore: (groupid=0, jobs=1): err= 0: pid=9402: Mon Feb 27 11:24:50 2017
  write: IOPS=50.5k, BW=197MiB/s (207MB/s)(5913MiB/30001msec)
    clat (usec): min=172, max=9664, avg=2694.82, stdev=1371.80
     lat (usec): min=191, max=9751, avg=2713.91, stdev=1371.91
    clat percentiles (usec):
     |  1.00th=[  370],  5.00th=[  580], 10.00th=[  812], 20.00th=[ 1272],
     | 30.00th=[ 1752], 40.00th=[ 2224], 50.00th=[ 2704], 60.00th=[ 3152],
     | 70.00th=[ 3632], 80.00th=[ 4128], 90.00th=[ 4576], 95.00th=[ 4832],
     | 99.00th=[ 5152], 99.50th=[ 5280], 99.90th=[ 6176], 99.95th=[ 7776],
     | 99.99th=[ 9024]
    lat (usec) : 250=0.04%, 500=3.33%, 750=5.32%, 1000=5.34%
    lat (msec) : 2=21.30%, 4=42.60%, 10=22.07%
  cpu          : usr=99.67%, sys=0.28%, ctx=1921, majf=0, minf=12358
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=5.4%, 32=13.4%, >=64=81.0%
     submit    : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=100.0%
     issued rwt: total=0,1513797,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
  WRITE: bw=197MiB/s (207MB/s), 197MiB/s-197MiB/s (207MB/s-207MB/s), io=5913MiB (6201MB), run=30001-30001msec

 Performance counter stats for '/home/radek/fio/fio ceph-bluestore.fio':

      54436.569340      task-clock (msec)         #    1.709 CPUs utilized          
           1641154      context-switches          #    0.030 M/sec                  
            139521      cpu-migrations            #    0.003 M/sec                  
             74923      page-faults               #    0.001 M/sec                  
      182820203151      cycles                    #    3.358 GHz                      (30.84%)
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
      222185995667      instructions              #    1.22  insns per cycle          (38.45%)
       52602784734      branches                  #  966.313 M/sec                    (38.33%)
         204107207      branch-misses             #    0.39% of all branches          (38.23%)
       75610327672      L1-dcache-loads           # 1388.962 M/sec                    (32.25%)
        4067963147      L1-dcache-load-misses     #    5.38% of all L1-dcache hits    (17.18%)
        1370444835      LLC-loads                 #   25.175 M/sec                    (16.85%)
           2428397      LLC-load-misses           #    0.35% of all LL-cache hits     (23.24%)
   <not supported>      L1-icache-loads          
        3040571110      L1-icache-load-misses     #   55.855 M/sec                    (30.74%)
       75480620134      dTLB-loads                # 1386.579 M/sec                    (27.23%)
          32593876      dTLB-load-misses          #    0.04% of all dTLB cache hits   (23.32%)
         382610037      iTLB-loads                #    7.029 M/sec                    (15.71%)
          54190158      iTLB-load-misses          #   14.16% of all iTLB cache hits   (23.13%)
   <not supported>      L1-dcache-prefetches     
   <not supported>      L1-dcache-prefetch-misses

      31.857960539 seconds time elapsed

  • After the change:
bluestore: (groupid=0, jobs=1): err= 0: pid=7995: Mon Feb 27 11:03:31 2017
  write: IOPS=76.8k, BW=300MiB/s (314MB/s)(8997MiB/30001msec)
    clat (usec): min=242, max=6609, avg=1932.08, stdev=810.41
     lat (usec): min=255, max=6634, avg=1944.48, stdev=810.44
    clat percentiles (usec):
     |  1.00th=[  506],  5.00th=[  676], 10.00th=[  828], 20.00th=[ 1096],
     | 30.00th=[ 1384], 40.00th=[ 1656], 50.00th=[ 1928], 60.00th=[ 2192],
     | 70.00th=[ 2480], 80.00th=[ 2768], 90.00th=[ 3056], 95.00th=[ 3184],
     | 99.00th=[ 3312], 99.50th=[ 3344], 99.90th=[ 3792], 99.95th=[ 5024],
     | 99.99th=[ 6112]
    lat (usec) : 250=0.01%, 500=0.93%, 750=6.36%, 1000=8.98%
    lat (msec) : 2=36.28%, 4=47.35%, 10=0.09%
  cpu          : usr=99.27%, sys=0.67%, ctx=3030, majf=0, minf=12695
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=9.3%, >=64=90.5%
     submit    : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     issued rwt: total=0,2303151,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
  WRITE: bw=300MiB/s (314MB/s), 300MiB/s-300MiB/s (314MB/s-314MB/s), io=8997MiB (9434MB), run=30001-30001msec

 Performance counter stats for '/home/radek/fio/fio ceph-bluestore.fio':

      56781.079612      task-clock (msec)         #    1.781 CPUs utilized          
           1646042      context-switches          #    0.029 M/sec                  
            119903      cpu-migrations            #    0.002 M/sec                  
             76732      page-faults               #    0.001 M/sec                  
      191463222348      cycles                    #    3.372 GHz                      (30.83%)
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
      227384545914      instructions              #    1.19  insns per cycle          (38.41%)
       49268136367      branches                  #  867.686 M/sec                    (38.46%)
         240139347      branch-misses             #    0.49% of all branches          (38.34%)
       80845407351      L1-dcache-loads           # 1423.809 M/sec                    (30.67%)
        6079323668      L1-dcache-load-misses     #    7.52% of all L1-dcache hits    (27.70%)
        1933380547      LLC-loads                 #   34.050 M/sec                    (23.54%)
           3948194      LLC-load-misses           #    0.41% of all LL-cache hits     (27.26%)
   <not supported>      L1-icache-loads          
        3828618039      L1-icache-load-misses     #   67.428 M/sec                    (31.14%)
       79634417315      dTLB-loads                # 1402.482 M/sec                    (25.72%)
          31532881      dTLB-load-misses          #    0.04% of all dTLB cache hits   (24.65%)
         532241472      iTLB-loads                #    9.374 M/sec                    (15.59%)
          28123423      iTLB-load-misses          #    5.28% of all iTLB cache hits   (22.90%)
   <not supported>      L1-dcache-prefetches     
   <not supported>      L1-dcache-prefetch-misses

      31.873763486 seconds time elapsed

CC: @markhpc.
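
To put the two fio runs side by side, the relative change can be computed directly from the figures reported above. The helper below is only an illustrative sketch (`percent_change` is a hypothetical name, not part of fio or Ceph); the inputs are copied from the fio output in this comment.

```cpp
#include <cassert>
#include <cmath>

// Relative change from `before` to `after`, in percent.
// Positive means an increase, negative a decrease.
double percent_change(double before, double after) {
  return (after - before) / before * 100.0;
}

// Figures taken from the two fio runs quoted above.
double iops_change()        { return percent_change(50.5, 76.8); }       // kIOPS
double avg_latency_change() { return percent_change(2713.91, 1944.48); } // usec
double p99_clat_change()    { return percent_change(5152.0, 3312.0); }   // usec
```

With these inputs the write IOPS rise by roughly 52%, the average total latency drops by roughly 28%, and the 99th-percentile completion latency drops by roughly 36%. As the comment notes, these are quick and rough single-run numbers on a ramdisk, so the percentages are indicative only.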

Member

liewegas commented Feb 27, 2017

awesome!

Member

markhpc commented Feb 27, 2017

Very nice!
