Commit 9012849

Fix issues with formula markdown incompatibility
1 parent 427a171 commit 9012849

File tree

6 files changed: +7 −13 lines changed


Chapter4_4.md

Lines changed: 3 additions & 3 deletions
@@ -21,9 +21,9 @@ Migrating a virtual CPU within the same scheduling domain incurs less cost compa
Understanding CPU scheduling details is crucial for diagnosing MySQL performance problems. A key question is whether Linux's scheduling mechanisms can effectively manage thousands of concurrent threads in MySQL. Since MySQL operates on a thread-based model, it's important to assess how the Linux scheduler handles such a high volume of threads. Does it simply allocate CPU time evenly among them?

Consider a scenario where there are *N* user threads and *C* CPU cores, with each core supporting dual hyper-threading. Ideally, without considering context switch overhead, each user thread should receive the following CPU execution time per second.
-$$
-\frac{2C}{N}
-$$
+
+![image-20240902000000002](media/image-20240902000000002.png)
+
As *N* increases, the average CPU allocation per thread decreases. For example, if *N=100000* and *C=3*, each thread would only receive about 60 microseconds of CPU time per second. Given that context switches typically incur costs in the tens of microseconds range, a significant portion of CPU time would be lost to context switching, thereby reducing overall CPU efficiency.
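The 2C/N arithmetic above can be checked with a short script. This is an illustrative sketch using the numbers from the text (the function name is ours, not the book's):

```python
# Illustrative check of the 2C/N figure: N = 100000 threads on C = 3
# dual-hyper-threaded cores leaves each thread ~60 us of CPU per second.
def cpu_share_us_per_s(cores: int, threads: int) -> float:
    """Ideal per-thread CPU time in microseconds per second: 2C/N,
    ignoring context-switch overhead."""
    return 2 * cores / threads * 1_000_000

share = cpu_share_us_per_s(cores=3, threads=100_000)
print(f"{share:.0f} us/s per thread")  # 60 us/s, as stated in the text

# With context switches costing on the order of 10 us each, a large
# fraction of such a short slice is lost to switching overhead.
switch_cost_us = 10
print(f"overhead fraction: {switch_cost_us / share:.0%}")
```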

As the number of user threads increases, the Linux scheduler struggles to manage CPU time effectively, resulting in inefficiencies and performance degradation due to frequent context switches. To address this, the system enforces a minimum execution granularity, ensuring that each process runs for at least 100 microseconds before being preempted. This approach minimizes the inefficiencies of short scheduling intervals. The Completely Fair Scheduler (CFS) uses this minimum granularity to prevent excessive switching costs as the number of runnable processes grows.
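The granularity floor described above can be sketched as a simple rule: each task gets an even share of the scheduling period, but never less than the minimum granularity. This is a simplified model, not a kernel simulation; the 100 us floor follows the text, while real kernels expose this tunable as `kernel.sched_min_granularity_ns` with version-dependent defaults, and the period value below is arbitrary:

```python
# Sketch of how CFS caps slice shrinkage with a minimum granularity.
# The 100 us minimum follows the text; actual kernel defaults differ.
def effective_slice_us(period_us: float, nr_running: int,
                       min_granularity_us: float = 100.0) -> float:
    """Each runnable task's slice: an even share of the period,
    but never below the minimum granularity."""
    return max(period_us / nr_running, min_granularity_us)

# With few tasks the fair share wins; with many, the floor applies.
print(effective_slice_us(24_000, 4))      # 6000.0 us
print(effective_slice_us(24_000, 1_000))  # 100.0 us (floor applies)
```

The floor trades fairness granularity for efficiency: past a certain thread count, adding more runnable threads lengthens the effective scheduling period rather than shrinking each slice further.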

Chapter4_6.md

Lines changed: 1 addition & 7 deletions
@@ -20,13 +20,7 @@ In a WAN testing scenario, the throughput remains nearly constant across differe

The throughput calculation formula in such scenarios simplifies to:

-$$
-\text{tpmTOTAL} \approx 60 \times \frac{1}{\text{Network Latency}} = 60 \times \frac{1}{0.01} = 6000
-$$
-
-$$
-\text{tpmC} \approx \text{tpmTOTAL} \times 0.45 = 6000 \times 0.45 = 2700
-$$
+![image-20240902000000001](media/image-20240902000000001.png)

This closely matches the test results above, where 0.45 is an empirical factor derived from extensive testing that represents the ratio of tpmC to tpmTOTAL. The tests indicate that, under a 10ms network latency with no additional bottlenecks, throughput remains consistent across different concurrency levels. This consistency is due to the serial nature of Paxos communication, as batching and pipelining are not employed. Confirmation of these findings is supported by packet capture analysis.

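The latency-bound throughput model in the removed formulas can be reproduced in a few lines. This is an illustrative sketch (function names are ours); 0.45 is the empirical tpmC/tpmTOTAL ratio quoted in the text:

```python
# Serial-Paxos throughput model: each transaction waits one network
# round trip, so per-minute throughput is 60 s divided by the latency.
def tpm_total(network_latency_s: float) -> float:
    """Transactions per minute with one serial network round per transaction."""
    return 60 / network_latency_s

def tpmc(network_latency_s: float, ratio: float = 0.45) -> float:
    """tpmC estimated from tpmTOTAL via the empirical 0.45 ratio."""
    return tpm_total(network_latency_s) * ratio

print(tpm_total(0.01))  # 6000.0 at 10 ms latency
print(tpmc(0.01))       # 2700.0
```

Note that concurrency does not appear in the model at all, which is why measured throughput stays flat across concurrency levels when the serial network round trip is the bottleneck.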
Chapter4_8.md

Lines changed: 3 additions & 3 deletions
@@ -54,9 +54,9 @@ For the semisynchronous test mentioned above, both throughput and response time
In computer architecture, Amdahl's Law provides a formula to predict the theoretical speedup in latency for a task with a fixed workload when system resources are improved [45].

Although Amdahl's Law theoretically holds, it often struggles to explain certain phenomena in the practical performance improvement process of MySQL. For instance, the same program shows a 10% improvement in SMP environments but a 50% improvement in NUMA environments. Measurements were conducted in SMP environments where the optimized portion, accounting for 20% of execution time, was improved by a factor of 2 through algorithm improvements. According to Amdahl's Law, the theoretical improvement should be calculated as follows:
-$$
-\frac{1}{0.8 + \frac{0.2}{2}} = \frac{1}{0.8 + 0.1} = \frac{1}{0.9} \approx 1.11 \text{ times}
-$$
+
+![image-20240902000000003](media/image-20240902000000003.png)
+
In practice, the 10% improvement in SMP environments aligns with theoretical expectations. However, the 50% improvement in NUMA environments significantly exceeds these predictions. This discrepancy is not due to a flaw in the theory or an error but rather because performance improvements in NUMA environments cannot be directly compared with those in SMP environments. Amdahl's Law is applicable strictly within the same environment.

Accurate measurement data is also challenging to obtain [11]. Developers typically use tools like *perf* to identify bottlenecks. The larger the bottleneck displayed by *perf*, the greater the potential for improvement. However, some bottlenecks are distributed or spread out, making it difficult to pinpoint them using *perf* and, consequently, challenging to identify optimization opportunities. For example, Profile-Guided Optimization (PGO) may not highlight specific bottlenecks causing poor performance in *perf*, yet PGO can still significantly improve performance.
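The Amdahl's Law arithmetic for the SMP example can be verified with a one-line function (an illustrative sketch; the function name is ours):

```python
# Amdahl's Law: speedup = 1 / ((1 - p) + p / s), where p is the fraction
# of execution time optimized and s is its speedup factor.
def amdahl_speedup(optimized_fraction: float, factor: float) -> float:
    return 1 / ((1 - optimized_fraction) + optimized_fraction / factor)

# 20% of execution time sped up 2x yields only ~1.11x overall,
# matching the ~10% improvement measured in the SMP environment.
print(round(amdahl_speedup(0.2, 2), 2))  # 1.11
```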

media/image-20240902000000001.png (25.9 KB)

media/image-20240902000000002.png (3.06 KB)

media/image-20240902000000003.png (10.6 KB)
