Commit c385f9e

Update the book based on feedback
1 parent 6fadc39 commit c385f9e

6 files changed: +102 -88 lines changed

Appendix.md

Lines changed: 54 additions & 50 deletions
@@ -108,8 +108,8 @@ struct lock_t {
108108
109109
So, what distinguishes these two concepts? Consider this metaphor [19]:
110110
111-
- A **latch** secures a door, gate, or window in place but does not offer protection against unauthorized access.
112-
- A **lock**, however, restricts entry to those without the key, ensuring security and control.
111+
- A **latch** secures a door, gate, or window in place but does not offer protection against unauthorized access.
112+
- A **lock**, however, restricts entry to those without the key, ensuring security and control.
113113
114114
In MySQL, a global latch is employed to serialize specific processing procedures. For instance, the following is MySQL's description of the role of a global latch.
115115
@@ -132,8 +132,8 @@ In MySQL, locks are integral to the transaction model, with common types includi
132132

133133
Understanding locks is crucial for:
134134

135-
- Implementing large-scale, busy, or highly reliable database applications
136-
- Tuning MySQL performance
135+
- Implementing large-scale, busy, or highly reliable database applications
136+
- Tuning MySQL performance
137137

138138
Familiarity with InnoDB locking and the InnoDB transaction model is essential for these tasks.
139139

@@ -163,8 +163,6 @@ static void lock_grant(lock_t *lock) {
163163
...
164164
```
165165
166-
167-
168166
**15 Maintaining Transaction Order with replica_preserve_commit_order**
169167
170168
In MySQL, the *replica_preserve_commit_order* configuration ensures that transactions on secondary databases are committed in the same order as they appear in the relay log. This setting lays the foundation for maintaining the causal relationship between transactions: if transaction A commits before transaction B on the primary, transaction A will also commit before transaction B on the secondary. This prevents inconsistencies where transactions could be read in the reverse order on the secondary.
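The ordering guarantee described above can be illustrated with a small Python sketch. It is not MySQL code: the `CommitOrderQueue` class and the `replay` function are hypothetical names introduced only for this example. Worker threads may finish applying transactions in any order, but each commit is released only after all earlier transactions in the relay log have committed.

```python
import threading


class CommitOrderQueue:
    """Hypothetical helper: releases commits strictly by relay-log sequence number."""

    def __init__(self):
        self._next_seq = 0              # sequence number allowed to commit next
        self._cond = threading.Condition()
        self.committed = []             # commit order actually observed

    def commit(self, seq, name):
        with self._cond:
            # Block until every earlier transaction has committed.
            while seq != self._next_seq:
                self._cond.wait()
            self.committed.append(name)
            self._next_seq += 1
            self._cond.notify_all()     # wake workers waiting for their turn


def replay(completion_order):
    """Each entry is (relay_log_seq, name); workers start in the given order."""
    queue = CommitOrderQueue()
    threads = [
        threading.Thread(target=queue.commit, args=(seq, name))
        for seq, name in completion_order
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return queue.committed
```

For example, `replay([(2, "C"), (0, "A"), (1, "B")])` returns the names in relay-log order even though the worker for transaction C starts first.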
@@ -229,9 +227,10 @@ In computer programming, a thread pool is a design pattern used to achieve concu
229227
230228
Throughput measures the number of requests a system processes within a unit of time. Common statistical indicators include:
231229
232-
1. **Transactions Per Second (TPS):** The number of database transactions performed per second.
233-
2. **Queries Per Second (QPS):** The number of database queries performed per second.
234-
3. **tpmC for TPC-C:** The rate of New-Order transactions executed per minute in TPC-C benchmarks.
230+
1. **Transactions Per Second (TPS):** The number of database transactions performed per second.
231+
2. **Queries Per Second (QPS):** The number of database queries performed per second.
232+
3. **tpmC for TPC-C:** The rate of New-Order transactions executed per minute in TPC-C benchmarks.
233+
4. **tpmTOTAL for TPC-C:** The rate of total transactions executed per minute in TPC-C benchmarks.
235234
236235
**31 Thundering Herd**
237236
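As a minimal sketch of how the throughput indicators listed above relate to raw counters, the following Python function derives them from transaction and query counts. The function name, parameter names, and sample numbers are assumptions made for illustration; they do not come from any benchmark tool.

```python
def throughput(new_order_count, total_txn_count, query_count, elapsed_seconds):
    """Derive common throughput indicators from raw counters (illustrative only)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    minutes = elapsed_seconds / 60.0
    return {
        "tps": total_txn_count / elapsed_seconds,   # transactions per second
        "qps": query_count / elapsed_seconds,       # queries per second
        "tpmC": new_order_count / minutes,          # New-Order transactions per minute
        "tpmTOTAL": total_txn_count / minutes,      # all transactions per minute
    }


# Made-up counters for a 60-second run.
metrics = throughput(new_order_count=4500, total_txn_count=10000,
                     query_count=300000, elapsed_seconds=60)
```

Because tpmC counts only New-Order transactions while tpmTOTAL counts every transaction type, tpmTOTAL is never lower than tpmC for the same run.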
@@ -269,11 +268,11 @@ The TPC-C benchmark, defined by the Transaction Processing Council, is an OLTP t
269268
270269
This schema is used by five different transactions, each creating varied access patterns:
271270
272-
1. **Item:** Read-only.
273-
2. **Warehouse, District, Customer, Stock:** Read/write.
274-
3. **New-Order:** Insert, read, and delete.
275-
4. **Order and Order-Line:** Inserts with time-delayed updates, causing rows to become stale and infrequently read.
276-
5. **History:** Insert-only.
271+
1. **Item:** Read-only.
272+
2. **Warehouse, District, Customer, Stock:** Read/write.
273+
3. **New-Order:** Insert, read, and delete.
274+
4. **Order and Order-Line:** Inserts with time-delayed updates, causing rows to become stale and infrequently read.
275+
5. **History:** Insert-only.
277276
278277
The diverse access patterns of this small schema with a limited number of transactions contribute to TPC-C's ongoing significance as a major database benchmark. In this book, BenchmarkSQL is primarily employed to evaluate TPC-C performance in MySQL.
279278
@@ -317,9 +316,9 @@ The preprocessor performs preliminary tasks such as verifying the existence of t
317316
318317
The query optimizer determines the execution plan for the SQL query. This phase includes:
319318
320-
- **Logical Query Rewrites:** Transforming queries into logically equivalent forms.
321-
- **Cost-Based Join Optimization:** Evaluating different join methods to minimize execution cost.
322-
- **Rule-Based Access Path Selection:** Choosing the best data access paths based on predefined rules.
319+
- **Logical Query Rewrites:** Transforming queries into logically equivalent forms.
320+
- **Cost-Based Join Optimization:** Evaluating different join methods to minimize execution cost.
321+
- **Rule-Based Access Path Selection:** Choosing the best data access paths based on predefined rules.
323322
324323
The query optimizer generates the execution plan, which is then used by the query executor engine.
325324
@@ -341,11 +340,11 @@ Since this query condition does not use an index, the optimizer chooses a full t
341340
342341
The execution process for the executor and storage engine is as follows:
343342
344-
1. The Server layer calls the storage engine's full scan interface to start reading records from the table.
345-
2. The executor checks if the age of the retrieved record exceeds 20. Records that meet this condition are dispatched to the network write buffer if there is available space.
346-
3. The executor requests the next record from the storage engine in a loop. Each record is evaluated against the query conditions, and those that meet the criteria are sent to the network write buffer, provided the buffer is not full.
347-
4. Once the storage engine has read all records from the table, it notifies the executor that reading is complete.
348-
5. Upon receiving the completion signal, the executor exits the loop and flushes the query results to the client.
343+
1. The Server layer calls the storage engine's full scan interface to start reading records from the table.
344+
2. The executor checks if the age of the retrieved record exceeds 20. Records that meet this condition are dispatched to the network write buffer if there is available space.
345+
3. The executor requests the next record from the storage engine in a loop. Each record is evaluated against the query conditions, and those that meet the criteria are sent to the network write buffer, provided the buffer is not full.
346+
4. Once the storage engine has read all records from the table, it notifies the executor that reading is complete.
347+
5. Upon receiving the completion signal, the executor exits the loop and flushes the query results to the client.
349348
350349
To optimize performance, MySQL minimizes frequent write system calls by checking if the network buffer is full before sending records to the client. Records are sent only when the buffer is full or when the completion signal is received.
351350
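The loop described above can be sketched in Python as follows. This is a simplified model rather than MySQL's implementation: `rows` stands in for the storage engine's scan interface, `send` stands in for the write system call, and `buffer_capacity` is counted in records instead of bytes. All of these names are assumptions for the example.

```python
def full_table_scan(rows, predicate, buffer_capacity, send):
    """Model of the executor loop: filter records and batch them before sending."""
    buffer = []
    for row in rows:                          # executor requests the next record
        if not predicate(row):                # e.g. age > 20
            continue
        if len(buffer) >= buffer_capacity:    # buffer full: send before adding more
            send(list(buffer))
            buffer.clear()
        buffer.append(row)
    if buffer:                                # scan complete: flush what remains
        send(list(buffer))


sent = []
people = [
    {"name": "a", "age": 25},
    {"name": "b", "age": 18},
    {"name": "c", "age": 30},
    {"name": "d", "age": 41},
]
full_table_scan(people, lambda r: r["age"] > 20, buffer_capacity=2, send=sent.append)
```

With a capacity of two records, the three matching rows reach the client in two batches instead of three separate writes.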
@@ -374,7 +373,7 @@ The execution process with an index is as follows:
374373
2. The storage engine retrieves and returns the matching index record to the Server layer.
375374
376375
3. The executor checks if the record meets the additional query conditions (e.g., id \< 3).
377-
376+
378377
If conditions are met, the corresponding name is added to the network buffer, unless it is full. If conditions are not met, the executor skips the record and requests the next one from the storage engine.
379378
380379
4. This cycle continues as the executor repeatedly requests and evaluates the next index record that matches the query condition until all relevant index records are processed.
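A rough Python model of the indexed execution path is shown below. It assumes, purely for illustration, a secondary index on `age` stored as a sorted list of `(age, id)` pairs and a table stored as a dictionary keyed by `id`. The function name and data layout are hypothetical and are not taken from MySQL.

```python
import bisect


def index_range_scan(index, table, min_age, extra_predicate):
    """Walk index entries with age > min_age, then apply the remaining condition."""
    results = []
    # Locate the first index entry whose age is strictly greater than min_age.
    start = bisect.bisect_right(index, (min_age, float("inf")))
    for _age, row_id in index[start:]:        # engine returns matching index records
        row = table[row_id]
        if extra_predicate(row):              # executor checks e.g. id < 3
            results.append(row["name"])
        # Otherwise the record is skipped and the next one is requested.
    return results


table = {
    1: {"id": 1, "name": "a", "age": 25},
    2: {"id": 2, "name": "b", "age": 19},
    3: {"id": 3, "name": "c", "age": 30},
    4: {"id": 4, "name": "d", "age": 20},
}
index = sorted((row["age"], row["id"]) for row in table.values())
names = index_range_scan(index, table, min_age=20, extra_predicate=lambda r: r["id"] < 3)
```

Unlike the full table scan, only the index entries inside the range are visited, so rows with an age of 20 or less are never examined.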
@@ -393,23 +392,23 @@ MySQL follows the client-server architecture, which divides the system into two
393392
394393
### 1 Client
395394
396-
1. The client is an application that interacts with the MySQL database server.
397-
2. It can be a standalone application, a web application, or any program requiring a database.
398-
3. The client sends SQL queries to the MySQL server for processing.
395+
1. The client is an application that interacts with the MySQL database server.
396+
2. It can be a standalone application, a web application, or any program requiring a database.
397+
3. The client sends SQL queries to the MySQL server for processing.
399398
400399
### 2 Server
401400
402-
1. The server is the MySQL database management system responsible for storing, managing, and processing data.
403-
2. It receives SQL queries, processes them, and returns the result sets.
404-
3. It manages data storage, security, and concurrent access for multiple clients.
401+
1. The server is the MySQL database management system responsible for storing, managing, and processing data.
402+
2. It receives SQL queries, processes them, and returns the result sets.
403+
3. It manages data storage, security, and concurrent access for multiple clients.
405404
406405
The client communicates with the server over the network using the MySQL protocol, enabling multiple clients to interact concurrently. Applications use MySQL connectors to connect to the database server. MySQL also provides client tools, such as the terminal-based MySQL client, for direct interaction with the server.
407406
408407
The MySQL database server includes several components:
409408
410-
1. **SQL Interface**: Provides a standardized interface for applications to interact with the database using SQL queries.
411-
2. **Query Parser**: Analyzes SQL queries to understand their structure and syntax, breaking them down into components for further processing.
412-
3. **Query Optimizer**: Evaluates various execution plans for a given query and selects the most efficient one to improve performance.
409+
1. **SQL Interface**: Provides a standardized interface for applications to interact with the database using SQL queries.
410+
2. **Query Parser**: Analyzes SQL queries to understand their structure and syntax, breaking them down into components for further processing.
411+
3. **Query Optimizer**: Evaluates various execution plans for a given query and selects the most efficient one to improve performance.
413412
414413
In MySQL, a storage engine is responsible for storage, retrieval, and management of data. MySQL's pluggable storage engine architecture allows selecting different storage engines, such as InnoDB and MyISAM, to meet specific performance and scalability requirements while maintaining a consistent SQL interface.
415414
@@ -423,9 +422,9 @@ The most common way to create a fault-tolerant system is to use redundant compon
423422
424423
Replication in MySQL copies data from one server (primary) to one or more servers (secondaries), offering several advantages:
425424
426-
1. **Scale-out solutions**: Spreads the load among multiple secondaries to improve performance. All writes and updates occur on the primary server, while reads can occur on secondaries, enhancing read speed.
427-
2. **Analytics**: Permits analysis on secondaries without impacting primary performance.
428-
3. **Long-distance data distribution**: Creates local data copies for remote sites without needing constant access to the primary.
425+
1. **Scale-out solutions**: Spreads the load among multiple secondaries to improve performance. All writes and updates occur on the primary server, while reads can occur on secondaries, enhancing read speed.
426+
2. **Analytics**: Permits analysis on secondaries without impacting primary performance.
427+
3. **Long-distance data distribution**: Creates local data copies for remote sites without needing constant access to the primary.
429428
430429
The original synchronization type is one-way asynchronous replication. The advantage of asynchronous replication is that user response time is unaffected by secondaries. However, there is a significant risk of data loss if the primary server fails and secondaries are not fully synchronized.
431430
@@ -535,8 +534,6 @@ The testing command is as follows:
535534
./tpcc_start -h127.0.0.1 -P 3306 -d tpcc200 -u xxx -p "yyy" -w 200 -c 100 -r 0 -l 60 -F 1
536535
```
537536
538-
539-
540537
### 6 Configuration Parameters
541538
542539
Due to numerous tests, only typical configurations are listed here. Special configurations require corresponding parameter modifications.
@@ -587,7 +584,9 @@ slave_parallel_type=LOGICAL_CLOCK
587584
slave_preserve_commit_order=on
588585
```
589586
590-
Regarding the improved Group Replication, the configuration parameters for the primary server are as follows:
587+
Because the improved Group Replication is similar in MySQL 8.0.32 and MySQL 8.0.40, we provide a publicly available version based on MySQL 8.0.40 at the following address: https://github.com/advancedmysql/mysql-8.0.40.
588+
589+
Accordingly, the configuration parameters for the primary server are as follows:
591590
592591
```
593592
# for mgr
@@ -600,16 +599,13 @@ loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-baaaaaaaaaab"
600599
loose-group_replication_local_address=127.0.0.1:63318
601600
loose-group_replication_group_seeds= "127.0.0.1:63318,127.0.0.1:53318,127.0.0.1:43318"
602601
loose-group_replication_member_weight=50
603-
loose-group_replication_applier_batch_size_threshold=10000
604-
loose-group_replication_single_primary_fast_mode=1
605-
loose-group_replication_flow_control_mode=disabled
606-
loose-group_replication_broadcast_gtid_executed_period=1000
602+
607603
slave_parallel_workers=256
608604
slave_parallel_type=LOGICAL_CLOCK
609605
slave_preserve_commit_order=on
610606
```
611607
612-
The parameter *group_replication_single_primary_fast_mode*=1 disables the traditional database certification mode. For the improved Group Replication, the configuration parameters for the secondary server are as follows:
608+
For the improved Group Replication, the configuration parameters for the secondary server are as follows:
613609
614610
```
615611
# for mgr
@@ -622,28 +618,36 @@ loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-baaaaaaaaaab"
622618
loose-group_replication_local_address=127.0.0.1:53318
623619
loose-group_replication_group_seeds= "127.0.0.1:63318,127.0.0.1:53318,127.0.0.1:43318"
624620
loose-group_replication_member_weight=50
625-
loose-group_replication_applier_batch_size_threshold=10000
626-
loose-group_replication_single_primary_fast_mode=1
627-
loose-group_replication_flow_control_mode=disabled
628-
loose-group_replication_broadcast_gtid_executed_period=1000
621+
629622
slave_parallel_workers=256
630623
slave_parallel_type=LOGICAL_CLOCK
631624
slave_preserve_commit_order=on
632625
```
633626
627+
Please note that we no longer provide the source code based on MySQL 8.0.32, but we do provide the source code based on MySQL 8.0.40.
628+
634629
The details related to semisynchronous replication can be found at the following address:
635630
636631
https://github.com/advancedmysql/mysql_8.0.27/blob/main/semisynchronous.txt
637632
638633
### 7 Source Code Repository
639634
640-
The patch address for "Percona Server for MySQL 8.0.27-18" is as follows:
635+
**Patch for "Percona Server for MySQL 8.0.27-18":**
641636
637+
Patch Address:
642638
https://github.com/advancedmysql/mysql_8.0.27/blob/main/book_8.0.27_single.patch
643639
644-
Please note that this patch focuses on optimizing a standalone MySQL instance. The cluster patch will be open-sourced on August 1, 2025.
640+
This patch specifically targets optimizations for standalone MySQL instances, including:
641+
642+
- **MVCC ReadView** enhancements
643+
- **Binlog group commit** improvements
644+
- **Query execution plan** optimizations
645+
646+
**Cluster Source Code:**
647+
648+
The source code for MySQL cluster versions is available here: https://github.com/advancedmysql/mysql-8.0.40
645649
646-
For a MySQL standalone instance, the patch includes optimizations such as MVCC ReadView enhancements, binlog group commit improvements, and query execution plan optimizations. For cluster versions, the patch adds optimizations for Group Replication and MySQL secondary replay.
650+
For MySQL clusters, the patch introduces further optimizations for **Group Replication** and **MySQL secondary replay**.
647651
648652
## About the Author
649653

Chapter12.md

Lines changed: 19 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -18,10 +18,10 @@ Currently, Group Replication faces challenging concurrent view change problems.
1818

1919
MySQL scalability can be further improved in the following areas:
2020

21-
1. Eliminating additional latch bottlenecks, particularly in non-partitioned environments.
22-
2. Improving the stability of long-term performance testing.
23-
3. Improving MySQL's NUMA-awareness in mainstream NUMA environments.
24-
4. Addressing Performance Schema's adverse impact on NUMA environments during MySQL secondary replay processes.
21+
1. Eliminating additional latch bottlenecks, particularly in non-partitioned environments.
22+
2. Improving the stability of long-term performance testing.
23+
3. Improving MySQL's NUMA-awareness in mainstream NUMA environments.
24+
4. Addressing Performance Schema's adverse impact on NUMA environments during MySQL secondary replay processes.
2525

2626
## 12.5 Further Improving SQL Performance Under Low Concurrency
2727

@@ -43,27 +43,31 @@ In mainstream NUMA environments, MySQL's primary server efficiency in handling l
4343

4444
Currently, jemalloc 4.5 is the best-found memory allocation tool, but it has high memory consumption and instability on ARM architecture. A key future focus could be developing a more efficient and stable memory allocation tool.
4545

46-
## 12.10 Introducing AI into MySQL Systems
46+
## 12.10 Integrating a High-Performance File System
47+
48+
Integrating a higher-performance file system could further enhance MySQL, particularly by improving the performance of MySQL secondary replay.
49+
50+
## 12.11 Introducing AI into MySQL Systems
4751

4852
Integrating AI with MySQL for automated knob tuning and learning-based database monitoring could be another key focus for the future.
4953

50-
### 12.10.1 Knob Tuning
54+
### 12.11.1 Knob Tuning
5155

5256
Integrating AI for parameter optimization can significantly reduce DBA workload. Key parameters suitable for AI-driven optimization include:
5357

54-
1. Buffer pool size
55-
2. Spin delay settings
56-
3. Dynamic transaction throttling limits based on environment
57-
4. Dynamic XCom cache size adjustment
58-
5. MySQL secondary worker max queue size
59-
6. The number of Paxos pipelining instances and the size of batching
60-
7. Automatic parameter adjustments under heavy load to improve processing capability
58+
1. Buffer pool size
59+
2. Spin delay settings
60+
3. Dynamic transaction throttling limits based on environment
61+
4. Dynamic XCom cache size adjustment
62+
5. MySQL secondary worker max queue size
63+
6. The number of Paxos pipelining instances and the size of batching
64+
7. Automatic parameter adjustments under heavy load to improve processing capability
6165

62-
### 12.10.2 Learning-based Database Monitoring
66+
### 12.11.2 Learning-based Database Monitoring
6367

6468
AI could optimize database monitoring by determining the optimal times and methods for tracking various database metrics.
6569

66-
## 12.11 Summary
70+
## 12.12 Summary
6771

6872
Programming demands strong logical reasoning skills, crucial for problem-solving, algorithm design, debugging, code comprehension, performance optimization, and testing. It helps in analyzing problems, creating solutions, correcting errors, and ensuring software reliability. Developing logical reasoning is essential for programmers to think systematically and build efficient, reliable software [56].
6973

Chapter2.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ This chapter introduces nine puzzling MySQL problems or phenomena that serve as
66

77
## 2.1 SysBench Read-Write Test Demonstrates Super-Linear Throughput Growth
88

9-
In the MySQL 8.0.27 release version, for example, in a NUMA environment on x86 architecture, using SysBench to remotely test MySQL's read-write capabilities. The MySQL transaction isolation level is set to Read Committed. MySQL instances 1 and 2 are deployed on the same machine, with a testing duration of 60 seconds. The results of separate SysBench tests for MySQL instance 1 and instance 2 are shown in the following figure.
9+
Take the MySQL 8.0.27 release version as an example: in a 4-way NUMA environment on x86 architecture, SysBench is used to remotely test MySQL's read-write capabilities. The MySQL transaction isolation level is set to Read Committed. MySQL instances 1 and 2 are deployed on the same machine, and each test runs for 60 seconds. The results of separate SysBench tests for MySQL instance 1 and instance 2 are shown in the following figure.
1010

1111
<img src="media/image-20240829081346732.png" alt="image-20240829081346732" style="zoom:150%;" />
1212
