**Appendix.md**

So, what distinguishes these two concepts? Consider this metaphor [19]:

- A **latch** secures a door, gate, or window in place but does not offer protection against unauthorized access.

- A **lock**, however, restricts entry to those without the key, ensuring security and control.

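In MySQL terms, a latch is a short-lived synchronization primitive (a mutex or rw-lock) that protects shared in-memory structures, while a lock protects logical database contents and is owned by a transaction until commit or rollback. The C++ sketch below is purely illustrative: the `BufferPool` and `Transaction` types are hypothetical, not MySQL source code, but they capture the difference in lifetime and ownership.

```cpp
#include <mutex>
#include <set>
#include <string>

// Conceptual sketch only -- not MySQL internals. A latch guards an in-memory
// structure for one short critical section; a lock belongs to a transaction
// and is released only when that transaction ends.

struct BufferPool {
    std::mutex latch;                 // latch: protects the structure itself
    std::set<int> cached_pages;

    void add_page(int page_no) {
        std::lock_guard<std::mutex> guard(latch);  // held only inside this call
        cached_pages.insert(page_no);
    }
};

struct Transaction {
    std::set<std::string> row_locks;  // locks: owned by the transaction

    void lock_row(const std::string& row_id) {
        row_locks.insert(row_id);      // acquired when the row is first touched
    }
    void commit() {
        row_locks.clear();             // released only at commit or rollback
    }
};

int main() {
    BufferPool pool;
    pool.add_page(42);                 // latch held for microseconds

    Transaction trx;
    trx.lock_row("orders:1001");       // lock held across the whole transaction
    trx.commit();
    return 0;
}
```
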
In MySQL, a global latch is employed to serialize specific processing procedures. For instance, the following is MySQL's description of the role of a global latch.

In MySQL, locks are integral to the transaction model.

Understanding locks is crucial for:

- Implementing large-scale, busy, or highly reliable database applications

- Tuning MySQL performance

Familiarity with InnoDB locking and the InnoDB transaction model is essential for these tasks.

**15 Maintaining Transaction Order with replica_preserve_commit_order**

In MySQL, the *replica_preserve_commit_order* configuration ensures that transactions on secondary databases are committed in the same order as they appear in the relay log. This setting lays the foundation for maintaining the causal relationship between transactions: if transaction A commits before transaction B on the primary, transaction A will also commit before transaction B on the secondary. This prevents inconsistencies where transactions could be read in the reverse order on the secondary.
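
The following C++ sketch illustrates one way such an ordering constraint can be enforced; it is a conceptual model only, not MySQL's actual implementation. Each replay worker receives a ticket matching its transaction's position in the relay log and must wait for its turn before committing.

```cpp
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical "preserve commit order" gate: workers may apply changes in
// parallel, but commits are serialized in relay-log (ticket) order.
class CommitOrderGate {
public:
    void wait_for_turn(int ticket) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return next_ == ticket; });
    }
    void commit_done() {
        std::lock_guard<std::mutex> lk(m_);
        ++next_;
        cv_.notify_all();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int next_ = 0;  // ticket of the next transaction allowed to commit
};

int main() {
    CommitOrderGate gate;
    std::vector<std::thread> workers;
    for (int ticket = 0; ticket < 4; ++ticket) {
        workers.emplace_back([&gate, ticket] {
            // ... apply the transaction's changes in parallel here ...
            gate.wait_for_turn(ticket);   // block until all predecessors commit
            std::printf("commit transaction %d\n", ticket);
            gate.commit_done();
        });
    }
    for (auto& t : workers) t.join();
    return 0;
}
```
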

In computer programming, a thread pool is a design pattern used to achieve concurrency.

Throughput measures the number of requests a system processes within a unit of time. Common statistical indicators include:

1. **Transactions Per Second (TPS):** The number of database transactions performed per second.

2. **Queries Per Second (QPS):** The number of database queries performed per second.

3. **tpmC for TPC-C:** The rate of New-Order transactions executed per minute in TPC-C benchmarks.

4. **tpmTOTAL for TPC-C:** The rate of total transactions executed per minute in TPC-C benchmarks.

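As a quick worked example of how these indicators relate, the following sketch (with made-up counter values) derives each metric from raw counts and the length of the measurement window.

```cpp
#include <cstdio>

// Hypothetical numbers purely for illustration of the arithmetic behind the
// common throughput indicators.
int main() {
    const double elapsed_seconds   = 60.0;     // measurement window
    const long long transactions   = 120000;   // committed transactions
    const long long queries        = 480000;   // executed queries
    const long long new_order_txns = 54000;    // TPC-C New-Order transactions

    const double tps      = transactions / elapsed_seconds;             // per second
    const double qps      = queries / elapsed_seconds;                  // per second
    const double tpmC     = new_order_txns / (elapsed_seconds / 60.0);  // per minute
    const double tpmTOTAL = transactions / (elapsed_seconds / 60.0);    // per minute

    std::printf("TPS=%.0f QPS=%.0f tpmC=%.0f tpmTOTAL=%.0f\n",
                tps, qps, tpmC, tpmTOTAL);
    return 0;
}
```
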

**31 Thundering Herd**

The TPC-C benchmark, defined by the Transaction Processing Council, is an OLTP test.

This schema is used by five different transactions, each creating varied access patterns:

4. **Order and Order-Line:** Inserts with time-delayed updates, causing rows to become stale and infrequently read.

5. **History:** Insert-only.

The diverse access patterns of this small schema with a limited number of transactions contribute to TPC-C's ongoing significance as a major database benchmark. In this book, BenchmarkSQL is primarily employed to evaluate TPC-C performance in MySQL.

The preprocessor performs preliminary tasks such as verifying the existence of tables and columns.

The query optimizer determines the execution plan for the SQL query. This phase includes:

- **Logical Query Rewrites:** Transforming queries into logically equivalent forms.

- **Cost-Based Join Optimization:** Evaluating different join methods to minimize execution cost.

- **Rule-Based Access Path Selection:** Choosing the best data access paths based on predefined rules.

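The cost-based part of this phase can be pictured as choosing the cheapest of several candidate plans. The toy C++ sketch below (a hypothetical `AccessPath` type with invented cost numbers, not MySQL's optimizer code) shows that selection step in isolation.

```cpp
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

// Toy illustration of cost-based selection: each candidate access path has an
// estimated cost, and the cheapest one wins.
struct AccessPath {
    std::string name;
    double estimated_cost;  // e.g., derived from row estimates and I/O statistics
};

int main() {
    std::vector<AccessPath> candidates = {
        {"full table scan",          1200.0},
        {"index range scan on id",     35.0},
        {"covering index scan",        20.0},
    };
    auto best = std::min_element(
        candidates.begin(), candidates.end(),
        [](const AccessPath& a, const AccessPath& b) {
            return a.estimated_cost < b.estimated_cost;
        });
    std::printf("chosen plan: %s (cost %.1f)\n",
                best->name.c_str(), best->estimated_cost);
    return 0;
}
```
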
The query optimizer generates the execution plan, which is then used by the query executor engine.

Since this query condition does not use an index, the optimizer chooses a full table scan.

The execution process for the executor and storage engine is as follows:

1. The Server layer calls the storage engine's full scan interface to start reading records from the table.

2. The executor checks if the age of the retrieved record exceeds 20. Records that meet this condition are dispatched to the network write buffer if there is available space.

3. The executor requests the next record from the storage engine in a loop. Each record is evaluated against the query conditions, and those that meet the criteria are sent to the network write buffer, provided the buffer is not full.

4. Once the storage engine has read all records from the table, it notifies the executor that reading is complete.

5. Upon receiving the completion signal, the executor exits the loop and flushes the query results to the client.

To optimize performance, MySQL minimizes frequent write system calls by checking if the network buffer is full before sending records to the client. Records are sent only when the buffer is full or when the completion signal is received.
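
A rough sketch of this interaction is shown below. The `StorageEngine` and `NetBuffer` types are hypothetical stand-ins, not MySQL internals, but the control flow mirrors the steps above: pull rows in a loop, filter them in the executor, and flush the network buffer only when it is full or the scan has finished.

```cpp
#include <cstddef>
#include <cstdio>
#include <optional>
#include <string>
#include <vector>

// Illustrative sketch only: a full table scan driven by the executor, with
// matching rows buffered and sent only when the buffer fills or the scan ends.
struct Record { std::string name; int age; };

struct StorageEngine {                        // stands in for the full-scan interface
    std::vector<Record> rows{{"alice", 34}, {"bob", 19}, {"carol", 25}};
    std::size_t pos = 0;
    std::optional<Record> read_next() {       // empty result = "reading is complete"
        if (pos >= rows.size()) return std::nullopt;
        return rows[pos++];
    }
};

struct NetBuffer {
    std::vector<std::string> pending;
    static constexpr std::size_t kCapacity = 2;
    void add(const std::string& value) {
        pending.push_back(value);
        if (pending.size() >= kCapacity) flush();  // send only when full...
    }
    void flush() {                                 // ...or at end of query
        for (const auto& v : pending) std::printf("send: %s\n", v.c_str());
        pending.clear();
    }
};

int main() {
    StorageEngine engine;
    NetBuffer net;
    while (auto rec = engine.read_next()) {    // loop until completion signal
        if (rec->age > 20) net.add(rec->name); // evaluate the WHERE condition
    }
    net.flush();                               // flush remaining results to client
    return 0;
}
```
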

The execution process with an index is as follows:

2. The storage engine retrieves and returns the matching index record to the Server layer.

3. The executor checks if the record meets the additional query conditions (e.g., id < 3).

   If conditions are met, the corresponding name is added to the network buffer, unless it is full. If conditions are not met, the executor skips the record and requests the next one from the storage engine.

4. This cycle continues as the executor repeatedly requests and evaluates the next index record that matches the query condition until all relevant index records are processed.

MySQL follows the client-server architecture, which divides the system into two parts: the client and the server.

### 1 Client

1. The client is an application that interacts with the MySQL database server.

2. It can be a standalone application, a web application, or any program requiring a database.

3. The client sends SQL queries to the MySQL server for processing.

### 2 Server

1. The server is the MySQL database management system responsible for storing, managing, and processing data.

2. It receives SQL queries, processes them, and returns the result sets.

3. It manages data storage, security, and concurrent access for multiple clients.

The client communicates with the server over the network using the MySQL protocol, enabling multiple clients to interact concurrently. Applications use MySQL connectors to connect to the database server. MySQL also provides client tools, such as the terminal-based MySQL client, for direct interaction with the server.
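
For illustration, the following minimal client uses the MySQL C API (libmysqlclient) to connect, run a query, and read the result. The host, credentials, and database name are placeholders, and the header path or link flags may differ by platform.

```cpp
#include <cstdio>
#include <mysql/mysql.h>   // MySQL C API (libmysqlclient)

// Minimal client-side sketch: connect over the MySQL protocol, send one
// query, and read the result set. Connection parameters are placeholders.
int main() {
    MYSQL* conn = mysql_init(nullptr);
    if (conn == nullptr) return 1;

    if (mysql_real_connect(conn, "127.0.0.1", "app_user", "app_password",
                           "test_db", 3306, nullptr, 0) == nullptr) {
        std::fprintf(stderr, "connect failed: %s\n", mysql_error(conn));
        mysql_close(conn);
        return 1;
    }

    if (mysql_query(conn, "SELECT VERSION()") == 0) {
        MYSQL_RES* result = mysql_store_result(conn);
        if (result != nullptr) {
            MYSQL_ROW row = mysql_fetch_row(result);
            if (row != nullptr && row[0] != nullptr)
                std::printf("server version: %s\n", row[0]);
            mysql_free_result(result);
        }
    } else {
        std::fprintf(stderr, "query failed: %s\n", mysql_error(conn));
    }

    mysql_close(conn);
    return 0;
}
```
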

The MySQL database server includes several key components:

1. **SQL Interface**: Provides a standardized interface for applications to interact with the database using SQL queries.

2. **Query Parser**: Analyzes SQL queries to understand their structure and syntax, breaking them down into components for further processing.

3. **Query Optimizer**: Evaluates various execution plans for a given query and selects the most efficient one to improve performance.

In MySQL, a storage engine is responsible for storage, retrieval, and management of data. MySQL's pluggable storage engine architecture allows selecting different storage engines, such as InnoDB and MyISAM, to meet specific performance and scalability requirements while maintaining a consistent SQL interface.
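
A rough conceptual sketch of this idea is shown below. The classes are hypothetical (MySQL's real interface is the handler/handlerton API); the point is that the SQL layer codes against one abstract interface while different engines plug in behind it.

```cpp
#include <cstdio>
#include <memory>
#include <string>

// Conceptual sketch of a pluggable storage engine layer: the SQL layer talks
// to an abstract interface, and concrete engines plug in behind it.
class StorageEngine {
public:
    virtual ~StorageEngine() = default;
    virtual std::string name() const = 0;
    virtual void write_row(const std::string& row) = 0;
};

class InnoDBLike : public StorageEngine {
public:
    std::string name() const override { return "InnoDB-like (transactional)"; }
    void write_row(const std::string& row) override {
        std::printf("[%s] logged and buffered row: %s\n", name().c_str(), row.c_str());
    }
};

class MyISAMLike : public StorageEngine {
public:
    std::string name() const override { return "MyISAM-like (non-transactional)"; }
    void write_row(const std::string& row) override {
        std::printf("[%s] appended row: %s\n", name().c_str(), row.c_str());
    }
};

// The "SQL layer": identical code path regardless of the engine chosen.
void insert_row(StorageEngine& engine, const std::string& row) {
    engine.write_row(row);
}

int main() {
    std::unique_ptr<StorageEngine> a = std::make_unique<InnoDBLike>();
    std::unique_ptr<StorageEngine> b = std::make_unique<MyISAMLike>();
    insert_row(*a, "(1, 'alice')");
    insert_row(*b, "(1, 'alice')");
    return 0;
}
```
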

The most common way to create a fault-tolerant system is to use redundant components.

Replication in MySQL copies data from one server (primary) to one or more servers (secondaries), offering several advantages:

1. **Scale-out solutions**: Spreads the load among multiple secondaries to improve performance. All writes and updates occur on the primary server, while reads can occur on secondaries, enhancing read speed.

2. **Analytics**: Permits analysis on secondaries without impacting primary performance.

3. **Long-distance data distribution**: Creates local data copies for remote sites without needing constant access to the primary.

The original synchronization type is one-way asynchronous replication. The advantage of asynchronous replication is that user response time is unaffected by secondaries. However, there is a significant risk of data loss if the primary server fails and secondaries are not fully synchronized.

Regarding the improved Group Replication, since it is similar between MySQL 8.0.32 and MySQL 8.0.40, we have provided a version available for online use at the following address: https://github.com/advancedmysql/mysql-8.0.40.

Accordingly, the configuration parameters for the primary server are as follows:

The parameter *group_replication_single_primary_fast_mode*=1 disables the traditional database certification mode. For the improved Group Replication, the configuration parameters for the secondary server are as follows:

This patch specifically targets optimizations for standalone MySQL instances, including:

- **MVCC ReadView** enhancements

- **Binlog group commit** improvements

- **Query execution plan** optimizations

**Cluster Source Code:**

The source code for MySQL cluster versions is available here: https://github.com/advancedmysql/mysql-8.0.40

For MySQL clusters, the patch introduces further optimizations for **Group Replication** and **MySQL secondary replay**.

MySQL scalability can be further improved in the following areas:

1. Eliminating additional latch bottlenecks, particularly in non-partitioned environments.

2. Improving the stability of long-term performance testing.

3. Improving MySQL's NUMA-awareness in mainstream NUMA environments.

4. Addressing Performance Schema's adverse impact on NUMA environments during MySQL secondary replay processes.

## 12.5 Further Improving SQL Performance Under Low Concurrency
Currently, jemalloc 4.5 is the best memory allocator we have found, but it suffers from high memory consumption and instability on ARM architecture. A key future focus could be developing a more efficient and stable memory allocator.

## 12.10 Integrating a High-Performance File System

MySQL could be enhanced with a better file system, especially to improve the performance of MySQL secondary replay.

## 12.11 Introducing AI into MySQL Systems

Integrating AI with MySQL for automated knob tuning and learning-based database monitoring could be another key focus for the future.

### 12.11.1 Knob Tuning

Integrating AI for parameter optimization can significantly reduce DBA workload. Key parameters suitable for AI-driven optimization include:

1. Buffer pool size

2. Spin delay settings

3. Dynamic transaction throttling limits based on environment

4. Dynamic XCom cache size adjustment

5. MySQL secondary worker max queue size

6. The number of Paxos pipelining instances and the size of batching

7. Automatic parameter adjustments under heavy load to improve processing capability

### 12.11.2 Learning-based Database Monitoring

AI could optimize database monitoring by determining the optimal times and methods for tracking various database metrics.

## 12.12 Summary

Programming demands strong logical reasoning skills, crucial for problem-solving, algorithm design, debugging, code comprehension, performance optimization, and testing. It helps in analyzing problems, creating solutions, correcting errors, and ensuring software reliability. Developing logical reasoning is essential for programmers to think systematically and build efficient, reliable software [56].

**Chapter2.md**

This chapter introduces nine puzzling MySQL problems or phenomena.

## 2.1 SysBench Read-Write Test Demonstrates Super-Linear Throughput Growth

Take the MySQL 8.0.27 release version as an example: in a 4-way NUMA environment on x86 architecture, SysBench is used to remotely test MySQL's read-write capabilities. The MySQL transaction isolation level is set to Read Committed. MySQL instances 1 and 2 are deployed on the same machine, with a testing duration of 60 seconds. The results of separate SysBench tests for MySQL instance 1 and instance 2 are shown in the following figure.