Nova-DB-sync too slow #210

Closed · mpiscaer opened this issue Dec 6, 2022 · 2 comments

mpiscaer (Contributor) commented Dec 6, 2022

When installing OpenStack with Atmosphere on our current OpenStack environment, nova-db-sync takes 9 minutes to run.

At that moment I see Ceph doing around 3 MB/s and 190 IOPS of writes.

Running a write benchmark:

root@ctl1:/tmp# rados bench -p scbench 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_ctl1_557910
  sec  Cur ops  started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0        0        0         0         0         0            -          0
    1       16       36        20    79.997        80     0.996231   0.587963
    2       16       64        48   95.9925       112     0.497236   0.573865
    3       16       99        83   110.658       140     0.226508   0.506366
    4       16      136       120   119.863       148     0.734474   0.498481
    5       16      176       160    127.88       160     0.547849   0.475841
    6       16      224       208   138.424       192     0.406813   0.453905
    7       16      265       249   142.004       164     0.389167   0.435077
    8       16      308       292   145.745       172     0.300354   0.429261
    9       16      341       325   144.218       132     0.560436     0.4256
   10       16      385       369   147.389       176     0.289641   0.425416
Total time run: 10.5751
Total writes made: 385
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 145.626
Stddev Bandwidth: 33.2238
Max bandwidth (MB/sec): 192
Min bandwidth (MB/sec): 80
Average IOPS: 36
Stddev IOPS: 8.30596
Max IOPS: 48
Min IOPS: 20
Average Latency(s): 0.426081
Stddev Latency(s): 0.19687
Max latency(s): 1.44318
Min latency(s): 0.0925033
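
A side note on methodology: the 4 MB-object bench above measures streaming throughput, while a DB sync is dominated by small synchronous writes, so a small-block run is probably a closer match for what nova-db-sync actually does. A rough sketch, reusing the scbench pool from above and the standard -b (block size) and -t (concurrency) options of rados bench:

# Sketch: 4 KiB writes, 16 in flight, 10 seconds, against the same scbench test pool.
# For a DB-style workload the latency columns matter more than MB/s.
rados bench -p scbench 10 write -b 4096 -t 16 --no-cleanup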

mpiscaer changed the title from Nova-DB-sync to Nova-DB-sync too slow on Dec 6, 2022

mnaser (Member) commented Mar 27, 2023

I think the root cause of this is that since we run Ceph for the control plane, we have multiplied replication: a write ends up being written 9 times (3 replicas on the Ceph "drives" × 3 replicas on the underlying storage backend).

I think the way to work around this might be to run the cluster with size=1 when using Molecule and minimize the amount of writes, unless there are other ideas around this.
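
A minimal sketch of what that could look like, assuming a throwaway Molecule test cluster where the pools already exist and data durability does not matter (recent Ceph releases also require mon_allow_pool_size_one and the --yes-i-really-mean-it flag before they accept single-replica pools):

# Allow single-replica pools cluster-wide (test environments only).
ceph config set global mon_allow_pool_size_one true

# Drop every pool to one replica to cut the write amplification.
for pool in $(ceph osd pool ls); do
    ceph osd pool set "$pool" size 1 --yes-i-really-mean-it
    ceph osd pool set "$pool" min_size 1
done

Depending on how Atmosphere deploys Ceph, the same effect may also be achievable by setting the replicated size to 1 in the pool definitions of the deployment tooling instead of patching the pools after the fact.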

mpiscaer (Contributor, Author) commented:

@mnaser I think we can close this?

mnaser closed this as not planned on Oct 29, 2023.