- SQL is a standard language for storing, manipulating and retrieving data in databases.
- NoSQL stands for "Not Only SQL" and refers to a category of non-relational databases that offer flexibility and scalability for storing and managing data.
| Aspect | SQL (Relational Databases) | NoSQL (Non-relational Databases) |
|---|---|---|
| Definition | Structured databases using tables with predefined schemas to store data. | Flexible databases that store data in various formats without rigid schemas. |
| Data Model | Tables with rows and columns, based on relational algebra. | Key-value, document, column-family, or graph-based models. |
| Schema | Fixed, predefined schema; changes require altering the table structure. | Dynamic schema; fields can be added or modified on the fly. |
| Query Language | Uses SQL (Structured Query Language) for querying, e.g., SELECT, INSERT, UPDATE. | Varies by type; uses APIs or query languages like MongoDB Query Language, CQL, etc. |
| Scalability | Vertical scaling (adding more power to a single server). | Horizontal scaling (adding more servers, sharding, or clustering). |
| Performance | Optimized for complex queries and structured data; may struggle with massive datasets. | High performance for large-scale, unstructured data and high read/write throughput. |
| Data Structure | Structured data with fixed fields; enforces data integrity via constraints. | Handles unstructured, semi-structured, or structured data with flexibility. |
| Consistency | Strong consistency (ACID transactions: Atomicity, Consistency, Isolation, Durability). | Eventual consistency (BASE model: Basically Available, Soft state, Eventual consistency). |
| Joins | Supports complex joins across multiple tables for relational queries. | Limited or no support for joins; data is often denormalized or aggregated upfront. |
| Use Cases | Financial systems, ERP, CRM, applications requiring complex queries and transactions. | Big data, real-time analytics, IoT, content management, social media platforms. |
| Examples | MySQL, PostgreSQL, Oracle, SQL Server. | MongoDB, Cassandra, DynamoDB, Redis, Neo4j. |
| Data Integrity | Enforces strict integrity through primary keys, foreign keys, and constraints. | Relies on application logic for data integrity; less rigid enforcement. |
| Flexibility | Less flexible due to rigid schema; changes can be complex and time-consuming. | Highly flexible; accommodates evolving data structures easily. |
| Transaction Support | Strong support for multi-row transactions with rollback capabilities. | Limited transaction support; varies by database (e.g., MongoDB supports transactions). |
| Ease of Use | Requires knowledge of SQL and schema design; steeper learning curve for complex systems. | Easier to start with due to schema-less design, but query complexity may vary. |
| Data Volume | Suitable for moderate to large datasets with structured data. | Ideal for massive datasets, especially unstructured or semi-structured data. |
| Community & Maturity | Mature, with decades of development and widespread adoption. | Relatively newer but rapidly growing with strong community support. |
SQL Databases:
- Best for structured data with clear relationships (e.g., customer-order data).
- Use a fixed schema, making them ideal for applications where data integrity and consistency are critical (e.g., banking systems).
- Rely on SQL for querying, which is standardized and widely understood.
- Scaling vertically can be expensive as it requires more powerful hardware.
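A minimal sketch of why such systems lean on ACID transactions (the `accounts` table here is illustrative, not from any schema in this document): either both legs of a transfer commit, or neither does.

```sql
-- Hypothetical accounts table: move 100 from account 1 to account 2 atomically.
START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT; -- on error, ROLLBACK leaves both balances untouched
```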
NoSQL Databases:
- Designed for flexibility, handling diverse data types (e.g., JSON, XML, graphs).
- Schema-less design allows rapid development and adaptation to changing requirements.
- Scales horizontally, making it cost-effective for large-scale, distributed systems.
- Trade-offs include eventual consistency, which may not suit applications needing immediate data accuracy.
When to choose which (a minimal join sketch follows this list):
- Use SQL when you need strong consistency, complex joins, or structured data (e.g., accounting software).
- Use NoSQL for high scalability, flexible data models, or real-time processing of large, unstructured datasets (e.g., social media analytics).
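As a minimal sketch of the "complex joins" point (table and column names are illustrative, not from any schema in this document): a relational database answers a cross-entity question in one declarative query.

```sql
-- Hypothetical customer-order schema: total order value per customer.
SELECT c.name, SUM(o.amount) AS total_spent
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY total_spent DESC;
```

A document store would typically denormalize the same data (orders embedded in each customer document), trading join power for read speed.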
Click to view the complete documentation
This documentation covers the setup and execution of a Dockerized Python project that:
- Reads an existing CSV file (`MOCK_DATA.csv`).
- Adds 10,000 dummy data rows with unique modifications.
- Outputs a combined CSV (`MOCK_DATA_10000_more.csv`) for local use.
The containerized approach ensures reproducibility and isolation from local environment issues.
The working directory /Users/gyanaranjan.mallick/Downloads/docker_local should contain:
- `MOCK_DATA.csv`: the original data CSV file.
- `add_dummy_data.py`: Python script to add dummy rows.
- `Dockerfile`: Docker image build instructions.
docker_local/
├── MOCK_DATA.csv
├── add_dummy_data.py
└── Dockerfile
This script uses pandas to:
- Load the original CSV.
- Sample and generate 10,000 new rows.
- Update key fields (`id`, `email`, `firstname`, `lastname`, `ipaddress`) to keep them unique.
- Save the combined dataset to a CSV named `MOCK_DATA_10000_more.csv`.
Example script content:
import pandas as pd
def add_dummy_data(input_file='MOCK_DATA.csv', output_file='MOCK_DATA_10000_more.csv', num_new_rows=10000):
df = pd.read_csv(input_file)
new_rows = []
max_id = df['id'].max() if 'id' in df.columns else 0
for i in range(num_new_rows):
row = df.sample(n=1).iloc[0].copy()
row['id'] = max_id + i + 1
row['email'] = f'dummy{i}@example.com' if 'email' in df.columns else ''
row['firstname'] = f'FirstName{i}' if 'firstname' in df.columns else ''
row['lastname'] = f'LastName{i}' if 'lastname' in df.columns else ''
row['ipaddress'] = f'192.168.{i // 256}.{i % 256}' if 'ipaddress' in df.columns else ''
new_rows.append(row)
new_df = pd.DataFrame(new_rows)
combined_df = pd.concat([df, new_df], ignore_index=True)
combined_df.to_csv(output_file, index=False)
print(f'Successfully created {output_file} with original + {num_new_rows} dummy rows')
if __name__ == "__main__":
    add_dummy_data()

This defines the Docker image that will run the Python script:
FROM python:3.11-slim
WORKDIR /app
COPY MOCK_DATA.csv /app/
COPY add_dummy_data.py /app/
RUN pip install pandas
CMD ["python", "add_dummy_data.py"]FROM python:3.11-slim: Uses a minimal Python 3.11 base image.WORKDIR /app: Switches working directory inside container to/app.COPY MOCK_DATA.csv /app/andCOPY add_dummy_data.py /app/: Copy files into container.RUN pip install pandas: Installs pandas library.CMD ...: Runs the Python script at container start.
Inside /Users/gyanaranjan.mallick/Downloads/docker_local, run:
docker build -t csv_dummy_data .

- `-t csv_dummy_data` tags the image.
- `.` sends the current folder as the build context to Docker.
Mount the folder so the output file is saved locally:
docker run --rm -v /Users/gyanaranjan.mallick/Downloads/docker_local:/app csv_dummy_data

Details:
- `--rm` cleans up the container after exit.
- `-v /local/path:/app` mounts the folder from the Mac into the container's `/app`.
- The script writes the output `MOCK_DATA_10000_more.csv` into `/app`, which syncs back to the Mac.
Check the local folder /Users/gyanaranjan.mallick/Downloads/docker_local for the file `MOCK_DATA_10000_more.csv` containing the combined data.
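To spot-check the result with standard shell tools (assuming the run succeeded):

```bash
# Expect the original row count + 10,000 (plus 1 header line).
wc -l /Users/gyanaranjan.mallick/Downloads/docker_local/MOCK_DATA_10000_more.csv
head -3 /Users/gyanaranjan.mallick/Downloads/docker_local/MOCK_DATA_10000_more.csv
```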
This setup provides a reproducible way to:
- Run data augmentation inside an isolated container.
- Avoid local environment dependency issues.
- Easily share or automate data preparation.
If the original CSV or script changes, simply rebuild the image and rerun the container, as shown below.
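The rebuild-and-rerun cycle is just the two commands from above, chained:

```bash
docker build -t csv_dummy_data . && \
docker run --rm -v /Users/gyanaranjan.mallick/Downloads/docker_local:/app csv_dummy_data
```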
Click to view the Quick Setup
- Create a database `ola_db`.
- Create lookup tables (`City`, `Vendor`, `Owner`).
- Create the `Ola_cab` table with foreign key constraints (so MySQL enforces referential integrity).
- Seed the lookup tables with valid IDs.
- Load the CSV file `/var/lib/mysql-files/MOCK_DATA_10000_more.csv` into `Ola_cab` (the CSV header is ignored).
Paste everything between the triple backticks into a new file, for example /tmp/ola_create_and_load.sql:
cat > /tmp/ola_create_and_load.sql <<'SQL'
-- Create database and use it
CREATE DATABASE IF NOT EXISTS ola_db CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
USE ola_db;
-- Create lookup tables for foreign keys
CREATE TABLE IF NOT EXISTS City (
id INT PRIMARY KEY,
name VARCHAR(100)
) ENGINE=InnoDB;
CREATE TABLE IF NOT EXISTS Vendor (
id INT PRIMARY KEY,
name VARCHAR(100)
) ENGINE=InnoDB;
CREATE TABLE IF NOT EXISTS Owner (
id INT PRIMARY KEY,
name VARCHAR(200)
) ENGINE=InnoDB;
-- Create main Ola_cab table (types follow the provided schema).
-- Note: ensure column order below matches the CSV header order.
CREATE TABLE IF NOT EXISTS Ola_cab (
id INT PRIMARY KEY,
first_name VARCHAR(100),
last_name VARCHAR(100),
email VARCHAR(255),
gender VARCHAR(10),
ip_address VARCHAR(45),
city_id INT,
cab_type VARCHAR(50),
vendor_id INT,
owner_id INT,
lease_owner_id INT,
cab_color_id INT,
cab_model_id INT,
cab_segment_id INT,
installment DECIMAL(10,2),
purchase_from VARCHAR(255),
color VARCHAR(50),
model VARCHAR(100),
engine_number VARCHAR(100),
chassis_number VARCHAR(100),
total_purchase_cost DECIMAL(15,2),
total_payment DECIMAL(15,2),
policy_number VARCHAR(100),
company_name VARCHAR(255),
amount DECIMAL(15,2),
idv_value DECIMAL(15,2),
nil_depreciation_value DECIMAL(15,2),
cab_number VARCHAR(50),
tally_ledger_name VARCHAR(255),
gps_number VARCHAR(100),
registration_number VARCHAR(100),
is_ac BOOLEAN,
allow_out_station BOOLEAN,
owner_come_as_driver BOOLEAN,
driver_id INT,
booking_id INT,
status VARCHAR(50),
cab_state VARCHAR(50),
rating DECIMAL(3,2),
reduce_rating BOOLEAN,
rating_reduced_at DATETIME,
points INT,
current_points INT,
exit_initiated_at DATETIME,
exit_initiated_by_id INT,
exit_initiated_by_role VARCHAR(50),
exited_at DATETIME,
exited_by_id INT,
device_model VARCHAR(100),
os_version VARCHAR(100),
driver_app_version VARCHAR(50),
driver_app_version_updated_at DATETIME,
fc_end_date DATE,
policy_end_date DATE,
policy_start_date DATE,
purchase_date DATE,
manufacturing_year YEAR,
meter_reading INT,
permit_end_date DATE,
leased_vehicle BOOLEAN,
lease_agreement_end_date DATE,
date_of_commence DATE,
firstname VARCHAR(100),
lastname VARCHAR(100),
ipaddress VARCHAR(45),
CONSTRAINT fk_olacab_city FOREIGN KEY (city_id) REFERENCES City(id),
CONSTRAINT fk_olacab_vendor FOREIGN KEY (vendor_id) REFERENCES Vendor(id),
CONSTRAINT fk_olacab_owner FOREIGN KEY (owner_id) REFERENCES Owner(id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
-- Seed lookup tables using a recursive CTE (generate 1..N rows).
-- Cities 1..100
WITH RECURSIVE seq AS (
SELECT 1 AS n
UNION ALL
SELECT n+1 FROM seq WHERE n < 100
)
INSERT INTO City (id, name)
SELECT n, CONCAT('City_', n)
FROM seq
ON DUPLICATE KEY UPDATE name = VALUES(name);
-- Vendors 1..50
WITH RECURSIVE seq2 AS (
SELECT 1 AS n
UNION ALL
SELECT n+1 FROM seq2 WHERE n < 50
)
INSERT INTO Vendor (id, name)
SELECT n, CONCAT('Vendor_', n)
FROM seq2
ON DUPLICATE KEY UPDATE name = VALUES(name);
-- Owners 1..200
WITH RECURSIVE seq3 AS (
SELECT 1 AS n
UNION ALL
SELECT n+1 FROM seq3 WHERE n < 200
)
INSERT INTO Owner (id, name)
SELECT n, CONCAT('Owner_', n)
FROM seq3
ON DUPLICATE KEY UPDATE name = VALUES(name);
-- Make sure data is committed before load (not strictly necessary in a script, but safe)
FLUSH TABLES;
-- Load CSV into Ola_cab
-- IMPORTANT: adjust the FIELDS/ENCLOSED/TERMINATED rules if the CSV format differs
LOAD DATA INFILE '/var/lib/mysql-files/MOCK_DATA_10000_more.csv'
INTO TABLE Ola_cab
CHARACTER SET utf8mb4
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(
id, first_name, last_name, email, gender, ip_address, city_id, cab_type, vendor_id, owner_id,
lease_owner_id, cab_color_id, cab_model_id, cab_segment_id, installment, purchase_from, color, model,
engine_number, chassis_number, total_purchase_cost, total_payment, policy_number, company_name, amount,
idv_value, nil_depreciation_value, cab_number, tally_ledger_name, gps_number, registration_number, is_ac,
allow_out_station, owner_come_as_driver, driver_id, booking_id, status, cab_state, rating, reduce_rating,
rating_reduced_at, points, current_points, exit_initiated_at, exit_initiated_by_id, exit_initiated_by_role,
exited_at, exited_by_id, device_model, os_version, driver_app_version, driver_app_version_updated_at,
fc_end_date, policy_end_date, policy_start_date, purchase_date, manufacturing_year, meter_reading,
permit_end_date, leased_vehicle, lease_agreement_end_date, date_of_commence, firstname, lastname, ipaddress
);
-- Basic verification queries
SELECT COUNT(*) AS total_rows FROM Ola_cab;
SELECT COUNT(*) AS cities FROM City;
SELECT COUNT(*) AS vendors FROM Vendor;
SELECT COUNT(*) AS owners FROM Owner;
SQL

From the same shell, run:
# Run the SQL script (you will be prompted for the MySQL root password)
mysql -u root -p < /tmp/ola_create_and_load.sql

Enter the root password when prompted. The script will:
- create DB and tables
- seed City/Vendor/Owner
- import the CSV (ignoring the header line)
- print the counts at the end
If you prefer to run queries interactively after the load:
mysql -u root -p
# then inside mysql prompt:
USE ola_db;
SELECT COUNT(*) FROM Ola_cab;
SELECT id, name FROM City LIMIT 5;
SELECT id, name FROM Vendor LIMIT 5;
SELECT id, name FROM Owner LIMIT 5;
SELECT id, first_name, city_id FROM Ola_cab LIMIT 5;

- secure_file_priv
  - Server-side `LOAD DATA INFILE` only reads from directories allowed by MySQL's `secure_file_priv` setting. On many installations `/var/lib/mysql-files` is allowed by default, and the CSV is already there. If you get an error like "The MySQL server is running with the --secure-file-priv option so it cannot execute this statement", run `SHOW VARIABLES LIKE 'secure_file_priv';` and move the CSV into the reported directory.
- CSV format
  - The script uses `FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'` and `LINES TERMINATED BY '\n'`. If the CSV uses `\r\n` line endings, use `LINES TERMINATED BY '\r\n'` instead.
  - `IGNORE 1 LINES` tells MySQL to skip the header row.
- Foreign key failures
  - If a CSV row has a `city_id`, `vendor_id`, or `owner_id` that is not present in the seeded lookup tables, MySQL will reject that row (foreign key constraint violation).
  - The script seeds City 1..100, Vendor 1..50, Owner 1..200. If the CSV contains IDs outside these ranges, widen the seeding ranges or pre-process the CSV to fix the IDs. To accept the rows without enforcement, create the table without foreign keys or temporarily disable foreign key checks (see the sketch after this list).
- If LOAD DATA fails with data conversion issues
  - Date/time or boolean formats in the CSV can cause errors. If you hit one, inspect the error message and adapt the LOAD statement accordingly, or fall back to a Python loader that parses and validates rows before inserting.
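For the foreign-key case, here is a minimal sketch of the no-enforcement variant (an assumption-laden workaround, only if you accept the integrity trade-off):

```sql
-- Disable FK checks for this session, run the same LOAD DATA statement as in
-- ola_create_and_load.sql (including its explicit column list), then re-enable.
-- Rows with unseeded city_id/vendor_id/owner_id are then accepted, so
-- referential integrity is no longer guaranteed for those rows.
SET FOREIGN_KEY_CHECKS = 0;

LOAD DATA INFILE '/var/lib/mysql-files/MOCK_DATA_10000_more.csv'
INTO TABLE Ola_cab
CHARACTER SET utf8mb4
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

SET FOREIGN_KEY_CHECKS = 1;
```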
You have a CSV with many columns. To import it properly, let's create a MySQL table that matches its structure.
Click to view the long setup
docker exec -it local-mysql bash
mysql -u root -p
# Enter the password (rootpass)

CREATE DATABASE IF NOT EXISTS appdb2;
USE appdb2;

Based on the column list, here is a sample CREATE TABLE statement with suitable data types. Adjust the types based on actual data characteristics (length, numeric or text, etc.):
CREATE TABLE Ola_cab (
id INT PRIMARY KEY,
first_name VARCHAR(100),
last_name VARCHAR(100),
email VARCHAR(255),
gender VARCHAR(10),
ip_address VARCHAR(45),
city_id INT,
cab_type VARCHAR(50),
vendor_id INT,
owner_id INT,
lease_owner_id INT,
cab_color_id INT,
cab_model_id INT,
cab_segment_id INT,
installment DECIMAL(10,2),
purchase_from VARCHAR(255),
color VARCHAR(50),
model VARCHAR(100),
engine_number VARCHAR(100),
chassis_number VARCHAR(100),
total_purchase_cost DECIMAL(15,2),
total_payment DECIMAL(15,2),
policy_number VARCHAR(100),
company_name VARCHAR(255),
amount DECIMAL(15,2),
idv_value DECIMAL(15,2),
nil_depreciation_value DECIMAL(15,2),
cab_number VARCHAR(50),
tally_ledger_name VARCHAR(255),
gps_number VARCHAR(100),
registration_number VARCHAR(100),
is_ac BOOLEAN,
allow_out_station BOOLEAN,
owner_come_as_driver BOOLEAN,
driver_id INT,
booking_id INT,
status VARCHAR(50),
cab_state VARCHAR(50),
rating DECIMAL(3,2),
reduce_rating BOOLEAN,
rating_reduced_at DATETIME,
points INT,
current_points INT,
exit_initiated_at DATETIME,
exit_initiated_by_id INT,
exit_initiated_by_role VARCHAR(50),
exited_at DATETIME,
exited_by_id INT,
device_model VARCHAR(100),
os_version VARCHAR(100),
driver_app_version VARCHAR(50),
driver_app_version_updated_at DATETIME,
fc_end_date DATE,
policy_end_date DATE,
policy_start_date DATE,
purchase_date DATE,
manufacturing_year YEAR,
meter_reading INT,
permit_end_date DATE,
leased_vehicle BOOLEAN,
lease_agreement_end_date DATE,
date_of_commence DATE,
firstname VARCHAR(100),
lastname VARCHAR(100),
ipaddress VARCHAR(45)
);

Notes:
- `BOOLEAN` columns in MySQL are stored as tiny integers (0/1).
- Adjust column sizes or types based on the data for best results.
SHOW TABLES;
DESCRIBE Ola_cab;

- Check the secure file privilege path inside the container:

SHOW VARIABLES LIKE 'secure_file_priv';

This shows the directory MySQL accepts secure file operations from, e.g., /var/lib/mysql-files/.
- Copy the CSV into that directory inside the container, e.g.:

docker cp /Users/gyanaranjan.mallick/Downloads/docker_local/MOCK_DATA_10000_more.csv local-mysql:/var/lib/mysql-files/

- Then run the import using the full path inside that directory, e.g.:
LOAD DATA INFILE '/var/lib/mysql-files/MOCK_DATA_10000_more.csv'
INTO TABLE Ola_cab
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

A dump is a file (usually .sql) that contains a database's schema and/or data, typically created using the mysqldump utility or via tools like phpMyAdmin, MySQL Workbench, or an automated backup script. MySQL doesn't automatically create or store dumps anywhere.
It depends on how the dump was created:
| Method | Default / Common Dump Location | Notes |
|---|---|---|
| `mysqldump` command | Whatever path you specify with `>` | Example: `mysqldump -u root -p appdb > /home/gyan/appdb.sql` → dump is in `/home/gyan/appdb.sql` |
| `mysqlpump` command | Whatever path you specify with `--result-file` | Similar idea |
| phpMyAdmin export | Browser download directory | Usually the `Downloads/` folder |
| Automated backup scripts | Check `/var/backups/mysql/`, `/backups/`, `/opt/mysql_backups/`, etc. | Depends on the server setup or cron jobs |
| Managed RDS instance | AWS S3 or RDS snapshots | RDS stores snapshots, not `.sql` dumps, unless exported manually |
If you're on Linux, you can search for .sql dump files like this:

sudo find / -type f -name "*.sql" 2>/dev/null

Or, if you know roughly where backups might be stored:

sudo find /var/backups /home -type f -name "*.sql"

To create a dump yourself:

mysqldump -u root -p appdb > /path/to/save/appdb.sql

You'll be prompted for the password, and then the dump file will be created in the location you specify.
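For the "automated backup scripts" row in the table above, a minimal sketch of what such a script typically does (the destination path and schedule are assumptions, not taken from any setup described here):

```bash
# Nightly compressed, date-stamped dump of appdb (e.g., from cron: 0 2 * * *).
# --single-transaction keeps InnoDB reads consistent without locking.
# Note: an inline password is insecure; prefer a ~/.my.cnf credentials file.
mysqldump -u root -p'secret' --single-transaction appdb \
  | gzip > /var/backups/mysql/appdb_$(date +%F).sql.gz
```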
Click to view the Theoretical Overview
In modern software development and operations, databases are often managed within containerized environments like Docker to ensure portability, scalability, and isolation. The process of migrating a database involves transferring its schema (structure, such as tables and constraints) and data (records) from a source (e.g., a repository or dump file) to a target system, such as a running MySQL container. Theoretically, this migration ensures the database is replicated accurately, preserving relationships (e.g., primary and foreign keys), data integrity, and functionality. Key concepts include:
- Schema Creation: Defining the database structure, including tables, indexes, views, and constraints.
- Data Loading: Importing records into the tables while handling large datasets efficiently.
- Integrity Verification: Checking row counts, checksums (e.g., CRC), and relationships to confirm no data loss or corruption.
- Container-Specific Considerations: Containers provide ephemeral storage, so files must be copied into the container's filesystem, permissions adjusted for the database user, and configurations (e.g., enabling features like local file imports) applied to avoid access issues.
The migration is "idempotent" if scripted properly—meaning it can be re-run without duplicating data—and handles errors like format mismatches or path issues through validation steps.
Imagine you're a DevOps engineer at a mid-sized company transitioning from a traditional server-based HR system to a cloud-native setup. The existing HR database contains employee records, department details, salary histories, and titles for 300,000+ employees, stored in a repository as schema scripts and data dump files. The goal is to migrate this entire database to a running MySQL container in a Docker environment for better scalability during peak hiring seasons.
- Preparation Phase (Source Acquisition and Planning):
- You start by acquiring the database source from a public repository (e.g., a GitHub archive). This includes a main schema script that defines the database structure (e.g., creating tables like "employees" with columns for ID, name, birth date, and hire date, plus constraints like primary keys on employee IDs and foreign keys linking departments to managers).
- Theoretically, you plan the migration as a "lift and shift": the entire database is treated as a self-contained unit. You identify that the data dumps are in SQL insert format (e.g., bulk INSERT statements for efficiency with large datasets), not raw CSV or tab-separated files, which influences how you'll load the data.
- Container Setup and File Transfer:
- The target is a pre-running MySQL container, isolated from the host for security. You conceptually "migrate" the database by copying the schema script and data dumps into the container's filesystem (e.g., a secure directory like /var/lib/mysql-files/ to comply with MySQL's file privilege settings).
- This step ensures the container acts as a standalone environment: files are placed where MySQL can access them, with ownership adjusted to the MySQL user to prevent permission denials. Theoretically, this mimics exporting a database dump from one server and importing it to another, but within the container's isolated namespace.
- Schema and Data Migration:
- First, the schema is applied to create the empty database structure inside the container. This includes defining relationships (e.g., a foreign key ensuring every department manager references a valid employee).
- Next, the data is loaded by executing the dump files sequentially. Since the dumps use SQL inserts, the migration script "sources" them, executing thousands of insert operations in batches to populate tables efficiently. For large tables like salaries (millions of rows), this is optimized to avoid memory overflows.
- If issues arise (e.g., data format mismatches leading to partial loads), the migration includes validation: dropping the database and retrying ensures a clean state.
- Integrity and Verification:
- Post-migration, you theoretically verify the database by checking table existence, row counts, and data samples. A checksum-based integrity test compares expected vs. actual records and hashes to detect corruption.
- In the scenario, if the initial migration loads only partial data (e.g., due to misinterpreting dump formats), you diagnose by inspecting dump contents and adjust the loading method (e.g., switching from raw data import to SQL sourcing).
- Post-Migration Usage:
- The migrated database is now live in the container, ready for queries (e.g., joining employees with salaries for payroll reports). The container can be scaled (e.g., replicated in a cluster) without re-migrating, as the database is self-contained.
- In the company's scenario, this enables HR teams to query employee data seamlessly, with the container handling high loads during audits.
- Portability: Containers encapsulate the database, making it easy to deploy across environments (dev, staging, production).
- Error Handling: Migrations often face format mismatches (e.g., assuming raw data vs. SQL inserts), resolved through file inspection and method adjustment.
- Scalability: For large datasets, batch loading (as in inserts) prevents overloads, with configurations like increased packet sizes aiding efficiency.
- Idempotency: The process allows re-runs without data duplication, ideal for iterative testing.
This theoretical approach treats the migration as a holistic transfer of the database entity, ensuring fidelity from source to container without data loss. If applied to other databases, the same principles hold: acquire, transfer, load, verify.
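To make the "verify" step concrete: MySQL's built-in CHECKSUM TABLE plus row counts give a quick post-migration fidelity check. A minimal sketch (using the employees sample described below; checksums are only comparable when both servers hold identical data and row formats):

```sql
USE employees;
-- Row counts: compare with the source's numbers.
SELECT COUNT(*) AS employee_rows FROM employees;
SELECT COUNT(*) AS salary_rows FROM salaries;
-- Run the same statement on source and target and compare the results.
CHECKSUM TABLE employees, salaries;
```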
The Employees Database is a sample MySQL database for HR/employee management, containing tables like employees, departments, dept_emp, dept_manager, salaries, and titles. It includes constraints (e.g., primary keys, foreign keys), relationships, and a large dataset (~300,024 employee records and ~2.8 million salary entries) for testing queries, optimization, and data analysis. The data is about 167 MB when exported.
This guide assumes you have basic familiarity with command-line tools. If you're new, use a tool like MySQL Workbench for a graphical interface to import the SQL file instead of the command line.
This guide provides a clean and organized process to import the Employees Database into a MySQL container (e.g., mysql:8.0). It addresses the main pitfall in the employees.sql script: its source commands reference the .dump files by relative path, and those files contain SQL INSERT statements rather than raw tab-separated data, so the fix is absolute paths for source, not LOAD DATA LOCAL INFILE (see the Errors Faced section below for how this was diagnosed). The steps account for container-specific nuances like file paths and permissions, ensuring the database (with ~300,024 employee records and ~2.8M salary entries, ~167 MB) is set up with proper constraints (primary keys, foreign keys). The import can take from under a minute to ~30 minutes depending on hardware. The modification of employees.sql is placed immediately after cloning the repository for clarity.
Click to view Step-by-Step Guide to Import the Employees Database into a MySQL Container
- Docker and MySQL Container: A running MySQL container. Start one if needed:
docker run --name mysql-container -e MYSQL_ROOT_PASSWORD=The_password -d -p 3306:3306 mysql:8.0

Replace `The_password` with a secure password. Use the container name (e.g., `mysql-container`).
- Repository Files: The `test_db` repository from https://github.com/datacharmer/test_db, containing `employees.sql` and `.dump` files (e.g., `load_departments.dump`).
- User Privileges: Root user access with full privileges (e.g., CREATE, INSERT, FOREIGN KEY).
- Disk Space: At least 500 MB free for data and indexes.
- Local Infile: Enabled for `LOAD DATA LOCAL INFILE` (verified in the steps below).
- Git: Installed on the host for cloning (or download the ZIP).
- Container Access: Ability to run `docker cp` and `docker exec`.
- Clone the Repository on the Host:
git clone https://github.com/datacharmer/test_db.git

Or download the ZIP from https://github.com/datacharmer/test_db, then extract it.

- Navigate to the Folder:

cd test_db

- Verify Files:

ls

Expect employees.sql, load_departments.dump, etc.
employees.sql: The script usessourcecommands, which expect SQL files, but the.dumpfiles are raw tab-separated data. Editemployees.sqlto useLOAD DATA LOCAL INFILE.-
Backup the file:
cp employees.sql employees.sql.bak -
Edit with a text editor (e.g., nano employees.sql on the host). Replace the source section (near the end) with:

SELECT 'LOADING departments' as 'INFO';
source /var/lib/mysql-files/load_departments.dump;
SELECT 'LOADING employees' as 'INFO';
source /var/lib/mysql-files/load_employees.dump;
SELECT 'LOADING dept_emp' as 'INFO';
source /var/lib/mysql-files/load_dept_emp.dump;
SELECT 'LOADING dept_manager' as 'INFO';
source /var/lib/mysql-files/load_dept_manager.dump;
SELECT 'LOADING titles' as 'INFO';
source /var/lib/mysql-files/load_titles.dump;
SELECT 'LOADING salaries' as 'INFO';
source /var/lib/mysql-files/load_salaries1.dump;
source /var/lib/mysql-files/load_salaries2.dump;
source /var/lib/mysql-files/load_salaries3.dump;
source /tmp/test_db/show_elapsed.sql;
Key Changes: kept the `source` commands (the `.dump` files are SQL scripts, so `LOAD DATA LOCAL INFILE` does not apply) and switched every referenced path to an absolute one under `/var/lib/mysql-files/`, with the final `source` pointing at `/tmp/test_db/show_elapsed.sql`.
Save and exit the editor.
-
- Check if the MySQL container is running:
docker ps

Start it if needed: docker start mysql-container.
docker ps -a
- Copy the
test_dbfolder to the container:docker cp ./test_db mysql-container:/tmp/test_db - Access the container shell:
docker exec -it mysql-container bash - Verify files:
Expect
ls -al /tmp/test_dbemployees.sql,load_departments.dump, etc.
- Log into MySQL:
Enter the root password.
mysql -u root -p - Check
local_infile:IfSHOW GLOBAL VARIABLES LIKE 'local_infile';OFF, enable it:SET GLOBAL local_infile = 1;. - Exit MySQL:
exit;.
- Move
.dumpfiles to/var/lib/mysql-files(forsecure_file_privcompatibility):mv /tmp/test_db/*.dump /var/lib/mysql-files/ chmod -R 644 /var/lib/mysql-files/*.dump chown -R mysql:mysql /var/lib/mysql-files - Verify
secure_file_priv:If set tomysql -u root -p SHOW VARIABLES LIKE 'secure_file_priv';/var/lib/mysql-files/, proceed; if empty,/tmp/test_dbmay work.
- From the container shell:
Or from the host:
mysql -u root -p --local-infile=1 < /tmp/test_db/employees.sqldocker exec -i mysql-container mysql -u root -p --local-infile=1 < ./test_db/employees.sql - Monitor output (e.g., "LOADING departments"). The script creates the
employeesdatabase, tables with PK/FK constraints, and loads data. - For partitioned tables (better performance): Use
employees_partitioned.sqlinstead (edit similarly if needed).
- Log into MySQL:
mysql -u root -p - Switch database:
USE employees; - List tables:
SHOW TABLES;(Expect 8, including views likecurrent_dept_emp). - Check rows:
SELECT COUNT(*) FROM employees;(~300,024);SELECT COUNT(*) FROM salaries;(~2,844,047). - Run integrity test:
Expect "OK" for all tables.
mysql -u root -p -t < /tmp/test_db/test_employees_md5.sql
- Sample queries:
SELECT * FROM employees LIMIT 10; SELECT e.first_name, s.salary FROM employees e JOIN salaries s ON e.emp_no = s.emp_no LIMIT 10; - Documentation: https://dev.mysql.com/doc/employee/en/ for schema.
- Drop/reload:
DROP DATABASE employees;then repeat Step 6.
- Local Infile Errors: Add local_infile=1 to /etc/mysql/my.cnf; restart the container (docker restart mysql-container).
- File Not Found: Verify paths in employees.sql; re-copy the repo: docker cp ./test_db mysql-container:/tmp/test_db
- Permission Denied: Check ls -al /var/lib/mysql-files/*.dump.
- Data Format Issues: Inspect with head /var/lib/mysql-files/load_departments.dump; adjust FIELDS TERMINATED BY (e.g., to ',').
- Memory/Timeout: Set SET GLOBAL max_allowed_packet = 268435456;.
- Logs: tail -n 50 /var/log/mysql/error.log.
- Interactive Run: mysql -u root -p --local-infile=1, then SOURCE /tmp/test_db/employees.sql;.
- GUI Access: Use MySQL Workbench on the host, connect to localhost:3306.
Click to view Queries after the DB Setup
Now that the Employees Database has been successfully migrated into the MySQL container with all tables correctly populated (e.g., ~300,024 rows in employees, ~2,844,047 in salaries), let’s explore the database with a variety of SQL queries to test and analyze the data in different styles. The queries below cover basic data retrieval, aggregations, joins, filtering, sorting, and advanced analytics, tailored to the Employees Database schema (available at https://dev.mysql.com/doc/employee/en/). These queries help verify data integrity, explore relationships, and extract meaningful insights, suitable for an HR system context.
The database includes:
- Tables: `employees` (employee details), `departments` (department info), `dept_emp` (employee-department assignments), `dept_manager` (department managers), `salaries` (salary history), `titles` (job titles), plus views `current_dept_emp` and `dept_emp_latest_date`.
- Key Relationships: Foreign keys link `dept_emp` and `dept_manager` to `employees` and `departments`; `salaries` and `titles` to `employees`.
Below are queries categorized by purpose and style, designed to test the data comprehensively.
These queries fetch raw data to verify table contents and explore individual records.
-
Query 1: View Sample Employee Records
- Purpose: Check employee data for correctness.
- Query:
SELECT emp_no, first_name, last_name, gender, hire_date FROM employees LIMIT 5;
- Expected Output: Displays 5 employee records, e.g.:
+--------+------------+-----------+--------+------------+ | emp_no | first_name | last_name | gender | hire_date | +--------+------------+-----------+--------+------------+ | 10001 | Georgi | Facello | M | 1986-06-26 | | 10002 | Bezalel | Simmel | F | 1985-11-21 | | 10003 | Parto | Bamford | M | 1986-08-28 | | 10004 | Chirstian | Koblick | M | 1986-12-01 | | 10005 | Kyoichi | Maliniak | M | 1989-09-12 | +--------+------------+-----------+--------+------------+
-
Query 2: List All Departments
- Purpose: Verify the `departments` table.
- Query:
SELECT dept_no, dept_name FROM departments ORDER BY dept_no;
- Expected Output: Lists 9 departments, e.g.:
+---------+------------------+ | dept_no | dept_name | +---------+------------------+ | d001 | Marketing | | d002 | Finance | | d003 | Human Resources | ... +---------+------------------+
- Purpose: Verify the
These queries combine tables to test foreign key relationships and data consistency.
-
Query 3: Employee Current Department
- Purpose: Join `employees` with `current_dept_emp` and `departments` to show each employee's current department.
- Query:
SELECT e.emp_no, e.first_name, e.last_name, d.dept_name FROM employees e JOIN current_dept_emp cde ON e.emp_no = cde.emp_no JOIN departments d ON cde.dept_no = d.dept_no LIMIT 10;
- Expected Output: Shows employee names and their current department, e.g.:
+--------+------------+-----------+------------------+ | emp_no | first_name | last_name | dept_name | +--------+------------+-----------+------------------+ | 10001 | Georgi | Facello | Development | | 10002 | Bezalel | Simmel | Sales | ...
- Purpose: Join
-
Query 4: Employee Salary History
- Purpose: Join `employees` with `salaries` to view salary records.
- Query:
SELECT e.emp_no, e.first_name, e.last_name, s.salary, s.from_date, s.to_date FROM employees e JOIN salaries s ON e.emp_no = s.emp_no WHERE e.emp_no = 10001 ORDER BY s.from_date;
- Expected Output: Shows salary history for employee 10001, e.g.:
+--------+------------+-----------+--------+------------+------------+ | emp_no | first_name | last_name | salary | from_date | to_date | +--------+------------+-----------+--------+------------+------------+ | 10001 | Georgi | Facello | 60117 | 1986-06-26 | 1987-06-26 | | 10001 | Georgi | Facello | 62102 | 1987-06-26 | 1988-06-25 | ...
- Purpose: Join
These queries summarize data to test counts, averages, and trends.
-
Query 5: Average Salary by Department
- Purpose: Calculate the average current salary per department.
- Query:
SELECT d.dept_name, ROUND(AVG(s.salary), 2) AS avg_salary FROM departments d JOIN current_dept_emp cde ON d.dept_no = cde.dept_no JOIN salaries s ON cde.emp_no = s.emp_no WHERE s.to_date = '9999-01-01' GROUP BY d.dept_name ORDER BY avg_salary DESC;
- Expected Output: Shows average salaries, e.g.:
+------------------+------------+ | dept_name | avg_salary | +------------------+------------+ | Sales | 80668.12 | | Marketing | 78890.45 | ... +------------------+------------+
-
Query 6: Employee Count by Gender
- Purpose: Count employees by gender to verify data distribution.
- Query:
SELECT gender, COUNT(*) AS employee_count FROM employees GROUP BY gender;
- Expected Output: Shows gender distribution, e.g.:
+--------+---------------+ | gender | employee_count | +--------+---------------+ | M | 180000 | | F | 120024 | +--------+---------------+
These queries test conditional logic and ordering.
-
Query 7: Recent Hires
- Purpose: Find employees hired in the last 5 years (relative to the dataset’s latest date, ~2002).
- Query:
SELECT emp_no, first_name, last_name, hire_date FROM employees WHERE hire_date >= '1997-01-01' ORDER BY hire_date DESC LIMIT 5;
- Expected Output: Lists recent hires, e.g.:
+--------+------------+-----------+------------+ | emp_no | first_name | last_name | hire_date | +--------+------------+-----------+------------+ | 499999 | Sachin | Tsukuda | 2000-01-28 | ...
-
Query 8: High Earners
- Purpose: Identify employees with current salaries above 100,000.
- Query:
SELECT e.emp_no, e.first_name, e.last_name, s.salary FROM employees e JOIN salaries s ON e.emp_no = s.emp_no WHERE s.salary > 100000 AND s.to_date = '9999-01-01' ORDER BY s.salary DESC LIMIT 5;
- Expected Output: Shows top earners, e.g.:
+--------+------------+-----------+--------+ | emp_no | first_name | last_name | salary | +--------+------------+-----------+--------+ | 43624 | Tokuyasu | Pesch | 158220 | ...
These queries use subqueries, window functions, or complex joins for deeper insights.
-
Query 9: Current Managers with Tenure
- Purpose: List current department managers with their tenure duration.
- Query:
SELECT d.dept_name, e.first_name, e.last_name, DATEDIFF(CURDATE(), dm.from_date) AS tenure_days FROM dept_manager dm JOIN employees e ON dm.emp_no = e.emp_no JOIN departments d ON dm.dept_no = d.dept_no WHERE dm.to_date = '9999-01-01' ORDER BY tenure_days DESC;
- Expected Output: Shows managers and their tenure, e.g.:
+------------------+------------+-----------+-------------+ | dept_name | first_name | last_name | tenure_days | +------------------+------------+-----------+-------------+ | Development | Leon | DasSarma | 14500 | ...
-
Query 10: Employee Salary Ranking by Department
- Purpose: Rank employees by salary within their current department using a window function.
- Query:
SELECT e.emp_no, e.first_name, e.last_name, d.dept_name, s.salary, RANK() OVER (PARTITION BY d.dept_no ORDER BY s.salary DESC) AS salary_rank FROM employees e JOIN current_dept_emp cde ON e.emp_no = cde.emp_no JOIN departments d ON cde.dept_no = d.dept_no JOIN salaries s ON e.emp_no = s.emp_no WHERE s.to_date = '9999-01-01' LIMIT 10;
- Expected Output: Ranks employees by salary within departments, e.g.:
+--------+------------+-----------+------------------+--------+-------------+ | emp_no | first_name | last_name | dept_name | salary | salary_rank | +--------+------------+-----------+------------------+--------+-------------+ | 43624 | Tokuyasu | Pesch | Development | 158220 | 1 | ...
These queries validate the database’s consistency.
-
Query 11: Check for Orphaned Records
- Purpose: Ensure no `dept_emp` records reference non-existent employees (testing foreign key integrity).
- Query:
SELECT de.emp_no, de.dept_no FROM dept_emp de LEFT JOIN employees e ON de.emp_no = e.emp_no WHERE e.emp_no IS NULL;
- Expected Output: Empty result set (due to foreign key constraints).
- Purpose: Ensure no
-
Query 12: Count Active Employees
- Purpose: Count employees with ongoing department assignments.
- Query:
SELECT COUNT(*) AS active_employees FROM current_dept_emp WHERE to_date = '9999-01-01';
- Expected Output: ~240,124 active assignments.
- Run Sequentially: Execute each query in MySQL to confirm expected row counts and data patterns.
- Validate Relationships: Queries 3, 4, and 9 test joins and foreign keys.
- Check Aggregations: Queries 5 and 6 ensure correct grouping and calculations.
- Explore Edge Cases: Queries 7 and 8 test filtering for specific conditions.
- Use Analytics: Queries 9 and 10 leverage advanced SQL for insights.
- Verify Integrity: Queries 11 and 12 confirm data consistency.
- Log into MySQL:

docker exec -it mysql-container mysql -u root -p
USE employees;

- Copy and paste each query, observing the output.
- Compare results with expected counts from test_employees_md5.sql:

mysql -u root -p -t < /tmp/test_db/test_employees_md5.sql
- Schema Reference: Use https://dev.mysql.com/doc/employee/en/ for table details.
- Performance: For large joins (e.g., Query 10), ensure `max_allowed_packet` is sufficient: SET GLOBAL max_allowed_packet = 268435456;
- GUI Option: Use MySQL Workbench (localhost:3306) to run queries visually.
- Custom Analysis: Adapt queries for specific HR needs, e.g., salary trends or promotion history.
Click to view the Errors Faced and issue solved
Documentation: Errors Faced and Solutions for Importing the Employees Database into a MySQL Container
This documentation outlines the errors encountered while importing the Employees Database into a MySQL container (using the mysql:8.0 image) and the steps taken to resolve them. The Employees Database, sourced from https://github.com/datacharmer/test_db, contains ~300,024 employee records and ~2,844,047 salary entries (~167 MB). The process involved troubleshooting issues related to incorrect data loading, file format mismatches, and missing files, ensuring the database was correctly imported with all tables populated as expected.
- MySQL Version: 8.0.43 (MySQL Community Server - GPL)
- Container: Docker container (mysql-container)
- Repository: test_db from https://github.com/datacharmer/test_db
- Files: employees.sql, load_*.dump files (e.g., load_employees.dump, load_salaries1.dump), show_elapsed.sql, test_employees_md5.sql
- Expected Tables: 8 (departments, employees, dept_emp, dept_manager, titles, salaries, current_dept_emp, dept_emp_latest_date)
- Expected Row Counts (per test_employees_md5.sql): departments: 9; employees: 300,024; dept_emp: 331,603; dept_manager: 24; titles: 443,308; salaries: 2,844,047
Description:
- After running employees.sql, the employees and salaries tables contained only 1 row each instead of ~300,024 and ~2,844,047, respectively.
- The integrity test (test_employees_md5.sql) showed mismatched record counts and CRCs:

+--------------+------------------+----------------------------------+
| table_name   | found_records    | found_crc                        |
+--------------+------------------+----------------------------------+
| departments  | 1                | 44654a97e80b0a21d8152d7340d8eee4 |
| dept_emp     | 0                |                                  |
| dept_manager | 0                |                                  |
| employees    | 1                | 6b93bc3d003ec1d24aac3b272f0c2920 |
| salaries     | 1                | 77e4ce80a0e26e76bdb99088a057460c |
| titles       | 1                | b76e1781057c58f750cb599ff9ad3664 |
+--------------+------------------+----------------------------------+

- Inspecting employees showed invalid data:

+--------+------------+------------+-----------+--------+------------+
| emp_no | birth_date | first_name | last_name | gender | hire_date  |
+--------+------------+------------+-----------+--------+------------+
|      0 | 0000-00-00 |            |           |        | 0000-00-00 |
+--------+------------+------------+-----------+--------+------------+
Cause:
- The employees.sql script had been modified to use LOAD DATA LOCAL INFILE commands, assuming the .dump files (e.g., load_employees.dump) were tab-separated raw data.
- Inspection revealed the .dump files contained SQL INSERT statements (e.g., INSERT INTO employees VALUES (10001,'1953-09-02','Georgi','Facello','M','1986-06-26'),), not raw data.
- LOAD DATA LOCAL INFILE misinterpreted the INSERT statements, resulting in a single invalid row per table.
Solution:
-
Inspected
.dumpFiles:head -n 5 /var/lib/mysql-files/load_employees.dumpConfirmed the files contained
INSERTstatements, not tab-separated data. -
Reverted
employees.sql:- Backed up the modified script:
cp /tmp/test_db/employees.sql /tmp/test_db/employees.sql.bak3 - Edited
/tmp/test_db/employees.sqlto restoresourcecommands, matching the SQL format of the.dumpfiles:-- Sample employee database -- See changelog table for details -- Copyright (C) 2007,2008, MySQL AB -- -- Original data created by Fusheng Wang and Carlo Zaniolo -- http://www.cs.aau.dk/TimeCenter/software.htm -- http://www.cs.aau.dk/TimeCenter/Data/employeeTemporalDataSet.zip -- -- Current schema by Giuseppe Maxia -- Data conversion from XML to relational by Patrick Crews -- -- This work is licensed under the -- Creative Commons Attribution-Share Alike 3.0 Unported License. -- To view a copy of this license, visit -- http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to -- Creative Commons, 171 Second Street, Suite 300, San Francisco, -- California, 94105, USA. -- -- DISCLAIMER -- To the best of our knowledge, this data is fabricated, and -- it does not correspond to real people. -- Any similarity to existing people is purely coincidental. -- DROP DATABASE IF EXISTS employees; CREATE DATABASE IF NOT EXISTS employees; USE employees; SELECT 'CREATING DATABASE STRUCTURE' as 'INFO'; DROP TABLE IF EXISTS dept_emp, dept_manager, titles, salaries, employees, departments; /*!50503 set default_storage_engine = InnoDB */; /*!50503 select CONCAT('storage engine: ', @@default_storage_engine) as INFO */; CREATE TABLE employees ( emp_no INT NOT NULL, birth_date DATE NOT NULL, first_name VARCHAR(14) NOT NULL, last_name VARCHAR(16) NOT NULL, gender ENUM ('M','F') NOT NULL, hire_date DATE NOT NULL, PRIMARY KEY (emp_no) ); CREATE TABLE departments ( dept_no CHAR(4) NOT NULL, dept_name VARCHAR(40) NOT NULL, PRIMARY KEY (dept_no), UNIQUE KEY (dept_name) ); CREATE TABLE dept_manager ( emp_no INT NOT NULL, dept_no CHAR(4) NOT NULL, from_date DATE NOT NULL, to_date DATE NOT NULL, FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE, FOREIGN KEY (dept_no) REFERENCES departments (dept_no) ON DELETE CASCADE, PRIMARY KEY (emp_no,dept_no) ); CREATE TABLE dept_emp ( emp_no INT NOT NULL, dept_no CHAR(4) NOT NULL, from_date DATE NOT NULL, to_date DATE NOT NULL, FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE, FOREIGN KEY (dept_no) REFERENCES departments (dept_no) ON DELETE CASCADE, PRIMARY KEY (emp_no,dept_no) ); CREATE TABLE titles ( emp_no INT NOT NULL, title VARCHAR(50) NOT NULL, from_date DATE NOT NULL, to_date DATE, FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE, PRIMARY KEY (emp_no,title, from_date) ); CREATE TABLE salaries ( emp_no INT NOT NULL, salary INT NOT NULL, from_date DATE NOT NULL, to_date DATE NOT NULL, FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE, PRIMARY KEY (emp_no, from_date) ); CREATE OR REPLACE VIEW dept_emp_latest_date AS SELECT emp_no, MAX(from_date) AS from_date, MAX(to_date) AS to_date FROM dept_emp GROUP BY emp_no; CREATE OR REPLACE VIEW current_dept_emp AS SELECT l.emp_no, dept_no, l.from_date, l.to_date FROM dept_emp d INNER JOIN dept_emp_latest_date l ON d.emp_no=l.emp_no AND d.from_date=l.from_date AND l.to_date = d.to_date; FLUSH /*!50503 binary */ LOGS; SELECT 'LOADING departments' as 'INFO'; source /var/lib/mysql-files/load_departments.dump; SELECT 'LOADING employees' as 'INFO'; source /var/lib/mysql-files/load_employees.dump; SELECT 'LOADING dept_emp' as 'INFO'; source /var/lib/mysql-files/load_dept_emp.dump; SELECT 'LOADING dept_manager' as 'INFO'; source /var/lib/mysql-files/load_dept_manager.dump; SELECT 'LOADING titles' as 'INFO'; source /var/lib/mysql-files/load_titles.dump; SELECT 'LOADING 
salaries' as 'INFO'; source /var/lib/mysql-files/load_salaries1.dump; source /var/lib/mysql-files/load_salaries2.dump; source /var/lib/mysql-files/load_salaries3.dump; source /tmp/test_db/show_elapsed.sql;
- Backed up the modified script:
-
Moved
.dumpFiles: Ensured all.dumpfiles were in/var/lib/mysql-files/:mv /tmp/test_db/*.dump /var/lib/mysql-files/ chmod -R 644 /var/lib/mysql-files/*.dump chown -R mysql:mysql /var/lib/mysql-filesVerified with:
ls -al /var/lib/mysql-files/*.dump -
Dropped and Re-Imported Database:
mysql -u root -p DROP DATABASE employees; exit; mysql -u root -p < /tmp/test_db/employees.sql- Removed
--local-infile=1sincesourcedoesn’t require it. - Monitored output to confirm successful loading of each table.
- Removed
Verification:
- Checked row counts:

mysql -u root -p
USE employees;
SELECT COUNT(*) FROM employees; -- Returned ~300,024
SELECT COUNT(*) FROM salaries;  -- Returned ~2,844,047

- Ran integrity test:

mysql -u root -p -t < /tmp/test_db/test_employees_md5.sql

Confirmed all tables showed "OK" with matching record counts and CRCs.
Description:
- During import, the error occurred:

ERROR at line 170: Failed to open file 'show_elapsed.sql', error: 2

- This halted the script after data loading but didn't affect table population (as it's the last command).
Cause:
- The employees.sql script referenced source show_elapsed.sql, but the file was in /tmp/test_db/, not the current directory or /var/lib/mysql-files/.
Solution:
- Verified File Location:

ls -al /tmp/test_db/show_elapsed.sql

Confirmed the file existed in /tmp/test_db/.
- Updated employees.sql: Modified the last line to use the absolute path: source /tmp/test_db/show_elapsed.sql;
- Ensured Permissions:

chmod 644 /tmp/test_db/show_elapsed.sql
chown mysql:mysql /tmp/test_db/show_elapsed.sql

- Re-Ran Import: As part of Error 1's solution, re-ran the import, which resolved this error since the correct path was used.
Verification:
- The import completed without the show_elapsed.sql error.
- show_elapsed.sql executed, displaying the elapsed time for the import.
Description:
- Attempting to run the import from the container shell failed:

docker exec -i mysql-container mysql -u root -p --local-infile=1 < ./test_db/employees.sql
bash: ./test_db/employees.sql: No such file or directory

- Additionally, docker ps failed inside the container:

bash: docker: command not found
Cause:
- The command was run inside the container, where ./test_db/ refers to the container's filesystem, but the file was on the host.
- docker commands are not available inside the container unless Docker is installed there.
Solution:
- Corrected Command Execution:
  - Ran the import from the container shell using the correct path:

    mysql -u root -p < /tmp/test_db/employees.sql

  - Alternatively, from the host:

    docker exec -i mysql-container mysql -u root -p < ./test_db/employees.sql

- Ran docker ps on the Host:
  - Executed docker ps from the host to verify the container (ac5354034ecb) was running:

    docker ps

  - Confirmed the container ID matched mysql-container.
Verification:
- The import succeeded when using the correct path.
- docker ps on the host showed the container running.
After applying the solutions:
- Checked Database:

mysql -u root -p
SHOW DATABASES;
USE employees;
SHOW TABLES;

Confirmed 8 tables: current_dept_emp, departments, dept_emp, dept_emp_latest_date, dept_manager, employees, salaries, titles.
- Verified Row Counts:

SELECT COUNT(*) FROM employees; -- ~300,024
SELECT COUNT(*) FROM salaries;  -- ~2,844,047

- Inspected Data:

SELECT * FROM employees LIMIT 5;

Showed valid data, e.g.:

+--------+------------+------------+-----------+--------+------------+
| emp_no | birth_date | first_name | last_name | gender | hire_date  |
+--------+------------+------------+-----------+--------+------------+
|  10001 | 1953-09-02 | Georgi     | Facello   | M      | 1986-06-26 |
|  10002 | 1964-06-02 | Bezalel    | Simmel    | F      | 1985-11-21 |
...

- Ran Integrity Test:

mysql -u root -p -t < /tmp/test_db/test_employees_md5.sql

All tables showed "OK" with matching counts and CRCs.
- Performance: The import took ~5-30 minutes due to the large dataset. Increasing max_allowed_packet helped when memory errors occurred:

mysql -u root -p
SET GLOBAL max_allowed_packet = 268435456;

- GUI Access: MySQL Workbench on the host (localhost:3306) was recommended for visual inspection.
- Other Databases: The process can be adapted for other datasets (e.g., Sakila) from https://dev.mysql.com/doc/index-other.html.
- Troubleshooting Tools:
  - Logs: tail -n 50 /var/log/mysql/error.log
  - Interactive import: mysql -u root -p, then SOURCE /tmp/test_db/employees.sql;
- Verify File Formats: Always inspect .dump files (e.g., head load_employees.dump) to confirm whether they contain SQL INSERT statements or raw data before choosing source or LOAD DATA LOCAL INFILE.
- Use Absolute Paths: Ensure source commands use absolute paths (e.g., /tmp/test_db/show_elapsed.sql) to avoid file-not-found errors.
- Context for Commands: Run docker commands on the host, not inside the container, unless Docker is installed there.
- Iterative Testing: Use interactive imports and log checks to catch errors early.
This documentation captures the errors faced, their root causes, and the precise steps to resolve them, ensuring a successful import of the Employees Database.
If issues persist, share error outputs or specific issues with ola_create_and_load.sql. For other databases (e.g., Sakila, World), adapt this process using dumps from https://dev.mysql.com/doc/index-other.html.
This documentation outlines the process of exporting (dumping) the entire employees database (~167 MB, including all 8 tables: employees, departments, dept_emp, dept_manager, titles, salaries, current_dept_emp, dept_emp_latest_date) from a running MySQL container (mysql-container, MySQL 8.0.43) and restoring it in a new container. It incorporates the steps that proved successful in earlier runs, corrects the errors in the previous draft (e.g., ambiguous table-only commands and missing container context), and retains the structure and examples for clarity. The corrected commands ensure a full database dump, and the process preserves portability, data integrity, and compatibility with MySQL's database context requirements.
Click to view the Steps
To create a complete, portable SQL dump of the employees database, including all tables, schemas, data, views, and foreign key relationships, transfer it to the local machine, and restore it in a new MySQL container for future use or migration.
- MySQL Version: 8.0.43 (MySQL Community Server - GPL)
- Source Container: mysql-container (running on port 3306)
- New Container: new-mysql-container (to be created on port 3307)
- Database: employees (~300,024 rows in employees, ~2,844,047 in salaries)
- Files: SQL dump file (employees_full_dump.sql)
- Tools: mysqldump for export, mysql for import, Docker for container management
In MySQL, tables exist within a database (schema) context, and mysqldump requires specifying the database name to provide this context. The dump includes:
- Schema: CREATE TABLE statements defining columns, constraints, and indexes.
- Data: INSERT statements for table rows.
- Database Context: CREATE DATABASE and USE database; statements for proper restoration.
- Dependencies: Foreign keys (e.g., employees.emp_no referenced by salaries) and views (e.g., current_dept_emp) require the full database context to maintain integrity.
Challenges Without Schema:
- Data-only dumps (--no-create-info) require an identical table structure in the target, risking errors if absent.
- Missing foreign keys, indexes, or views can break relationships or functionality.
- Data type mismatches may occur without schema definitions.
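One hedged way around the first challenge is to ship the structure separately with mysqldump's --no-data flag, so the target tables exist before any data-only load (file names here are illustrative):

```bash
# Schema only: CREATE TABLE statements, no INSERT rows.
mysqldump -u root -p --no-data employees > employees_schema_only.sql

# On the target (assumes the employees database already exists):
mysql -u root -p employees < employees_schema_only.sql
```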
Correct Understanding:
| Command | What Gets Dumped | Schema/Database Included? |
|---|---|---|
| `mysqldump --databases employees` | Entire database (all 8 tables, schema, data, views) | ✅ CREATE DATABASE/USE statements + all tables |
| `mysqldump employees employees` | employees table (schema + data) from the employees database | ⚠️ Table schema only; no CREATE DATABASE/USE (target database must exist) |
| `mysqldump employees departments salaries` | 3 tables (schema + data) from the employees database | ⚠️ Table schemas only; no CREATE DATABASE/USE (target database must exist) |
- Database Context Requirement: Tables cannot be dumped without referencing their database (e.g., mysqldump employees_table is invalid; mysqldump employees employees is correct).
- Foreign Key Dependencies: The employees database has inter-table relationships (e.g., salaries references employees.emp_no), so a full database dump ensures all dependencies are included.
- Portability: Including CREATE DATABASE simplifies restoration in any MySQL environment.
- Integrity: Dumping the entire database preserves views and constraints, critical for the employees database's functionality.
- Access the Source Container:
docker exec -it mysql-container bash

- Purpose: Access the container's filesystem to run mysqldump.
- Create the Full Database Dump:
mysqldump -u root -p --databases employees --single-transaction > /var/lib/mysql-files/employees_full_dump.sql

- Options:
  - -u root: Root user (replace if different).
  - -p: Prompts for password.
  - --databases employees: Includes CREATE DATABASE employees for easy restoration.
  - --single-transaction: Ensures consistency for InnoDB tables without locking.
- Output: /var/lib/mysql-files/employees_full_dump.sql (~167-200 MB).
- Entered root password.
- Purpose: Generated a complete SQL dump with schema, data, views, and constraints for all 8 tables.
- Verify the Dump File:
ls -al /var/lib/mysql-files/employees_full_dump.sql
head -n 20 /var/lib/mysql-files/employees_full_dump.sql

- Expected: File starts with -- MySQL dump, CREATE DATABASE employees, USE employees;, and CREATE TABLE statements for all tables.
- Set Permissions:
chmod 644 /var/lib/mysql-files/employees_full_dump.sql
chown mysql:mysql /var/lib/mysql-files/employees_full_dump.sql

- Purpose: Ensures MySQL can access the file.
- Exit the Container:
exit - Copy to Host:
docker cp mysql-container:/var/lib/mysql-files/employees_full_dump.sql ./employees_full_dump.sql

- Purpose: Transfers the dump to the host for backup or sharing.
- Verify Locally:
ls -al ./employees_full_dump.sql head -n 20 ./employees_full_dump.sql- Size: ~167-200 MB.
- Optional: Compress for Storage:
gzip ./employees_full_dump.sql- Creates
employees_full_dump.sql.gz(decompress withgunzipbefore restoration).
- Creates
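As an alternative sketch (assuming the same container, credentials, and target path as above), `mysqldump` can be piped straight through `gzip`, avoiding the large intermediate file:

```bash
# Compress while dumping; restore later by streaming through gunzip.
mysqldump -u root -p --databases employees --single-transaction \
  | gzip > /var/lib/mysql-files/employees_full_dump.sql.gz

# To restore:
# gunzip < employees_full_dump.sql.gz | mysql -u root -p
```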
- Create a New Container:
  ```bash
  docker run --name new-mysql-container -e MYSQL_ROOT_PASSWORD=The_password -d -p 3307:3306 mysql:8.0
  ```
  - Port 3307 avoids conflict with `mysql-container` (port 3306).
  - Replace `The_password` with a secure password.
- Verify Container:
  ```bash
  docker ps
  ```
  - Confirms `new-mysql-container` is running.
- Copy the Dump File:
  - If compressed, decompress: `gunzip ./employees_full_dump.sql.gz`
  - Copy to the new container:
    ```bash
    docker cp ./employees_full_dump.sql new-mysql-container:/var/lib/mysql-files/employees_full_dump.sql
    ```
- Access the New Container:
  ```bash
  docker exec -it new-mysql-container bash
  ```
- Set Permissions:
  ```bash
  chmod 644 /var/lib/mysql-files/employees_full_dump.sql
  chown mysql:mysql /var/lib/mysql-files/employees_full_dump.sql
  ```
- Import the Dump:
  ```bash
  mysql -u root -p < /var/lib/mysql-files/employees_full_dump.sql
  ```
  - Or from the host (note: because stdin is redirected to the dump file, `-p` cannot prompt here; pass the password another way, e.g. `-pThe_password`):
    ```bash
    docker exec -i new-mysql-container mysql -u root -pThe_password < ./employees_full_dump.sql
    ```
  - Enter the root password when prompted.
  - Import time: 5-30 minutes.
  - The `--databases` option ensures `CREATE DATABASE employees` is included, so no manual database creation is needed.
- Optional: Enable Local Infile (for future operations):
  ```
  mysql -u root -p
  SET GLOBAL local_infile = 1;
  exit;
  ```
- Log into MySQL:
  ```bash
  docker exec -it new-mysql-container mysql -u root -p
  ```
- Check Databases and Tables:
  ```sql
  SHOW DATABASES;
  USE employees;
  SHOW TABLES;
  ```
  - Expected: `employees` database with 8 tables: `departments`, `employees`, `dept_emp`, etc.
- Verify Row Counts:
  ```sql
  SELECT COUNT(*) FROM employees; -- ~300,024
  SELECT COUNT(*) FROM salaries;  -- ~2,844,047
  ```
- Inspect Sample Data:
  ```sql
  SELECT * FROM employees LIMIT 5;
  ```
  - Expected:
    ```
    +--------+------------+------------+-----------+--------+------------+
    | emp_no | birth_date | first_name | last_name | gender | hire_date  |
    +--------+------------+------------+-----------+--------+------------+
    |  10001 | 1953-09-02 | Georgi     | Facello   | M      | 1986-06-26 |
    ...
    ```
- Run Integrity Test:
  - Copy test script:
    ```bash
    docker cp ./test_db/test_employees_md5.sql new-mysql-container:/var/lib/mysql-files/test_employees_md5.sql
    ```
  - Run:
    ```bash
    mysql -u root -p -t < /var/lib/mysql-files/test_employees_md5.sql
    ```
  - Expected: "OK" for all tables, confirming correct row counts and CRCs.
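As a quick supplementary check (a sketch assuming both containers are running), the row counts in the two containers can be compared directly from the host; `-N` suppresses the column header so only the number is printed:

```bash
# Compare the employees row count in source and target containers.
docker exec -i mysql-container mysql -u root -p -N \
  -e "SELECT COUNT(*) FROM employees.employees;"
docker exec -i new-mysql-container mysql -u root -p -N \
  -e "SELECT COUNT(*) FROM employees.employees;"
```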
For reference, here are corrected examples for dumping specific tables (though the main process focuses on the full database):
- Dump Single Table (e.g., `employees`):
  ```bash
  docker exec -it mysql-container bash
  mysqldump -u root -p employees employees > /var/lib/mysql-files/employees_only.sql
  docker cp mysql-container:/var/lib/mysql-files/employees_only.sql ./employees_only.sql
  ```
  - Output: Schema and data for the `employees` table. Note that table-level dumps omit `CREATE DATABASE`/`USE` statements, so specify the database when importing.
- Dump Multiple Tables (e.g., `departments` and `salaries`):
  ```bash
  mysqldump -u root -p employees departments salaries > /var/lib/mysql-files/dept_salary_only.sql
  docker cp mysql-container:/var/lib/mysql-files/dept_salary_only.sql ./dept_salary_only.sql
  ```
- Dump Data Only (No Schema):
  ```bash
  mysqldump -u root -p --no-create-info employees employees > /var/lib/mysql-files/employees_data_only.sql
  docker cp mysql-container:/var/lib/mysql-files/employees_data_only.sql ./employees_data_only.sql
  ```
  - Note: Requires an identical table structure in the target.
- Dump Department-Related Tables:
  ```bash
  mysqldump -u root -p employees departments dept_emp dept_manager > /var/lib/mysql-files/dept_tables.sql
  docker cp mysql-container:/var/lib/mysql-files/dept_tables.sql ./dept_tables.sql
  ```
Restoring a Single Table Example:
- Copy to new container:
  ```bash
  docker cp ./employees_only.sql new-mysql-container:/var/lib/mysql-files/employees_only.sql
  ```
- Import (ensure the `employees` database exists):
  ```bash
  docker exec -i new-mysql-container mysql -u root -p employees < /var/lib/mysql-files/employees_only.sql
  ```
- Handle foreign keys (if other tables reference `employees`):
  ```sql
  SET FOREIGN_KEY_CHECKS = 0;
  SOURCE /var/lib/mysql-files/employees_only.sql;
  SET FOREIGN_KEY_CHECKS = 1;
  ```
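One caveat: `SET FOREIGN_KEY_CHECKS` is session-scoped, so it must be issued in the same `mysql` session that performs the import. A sketch of one way to do that from the host, wrapping the dump between the two statements:

```bash
# Prepend/append the FK toggles so they run in the same session as the import.
# The_password is the MYSQL_ROOT_PASSWORD placeholder used when the container
# was created; with stdin piped, mysql cannot prompt for it interactively.
( echo "SET FOREIGN_KEY_CHECKS = 0;"; \
  cat ./employees_only.sql; \
  echo "SET FOREIGN_KEY_CHECKS = 1;" ) \
  | docker exec -i new-mysql-container mysql -u root -pThe_password employees
```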
- Exit and Stop (if Testing):
  - Exit shell: `exit`
  - Stop new container (optional): `docker stop new-mysql-container`
- Archive Dump:
  - Store in a backup directory or repository:
    ```bash
    mv ./employees_full_dump.sql ~/backups/
    tar -czf ~/backups/employees_backup.tar.gz ~/backups/employees_full_dump.sql
    ```
- Future Use: Repeat Steps 3-5 for restoration in other containers.
The full database dump includes:
```sql
-- MySQL dump 10.13  Distrib 8.0.43, for Linux (x86_64)
--
-- Host: localhost    Database: employees
-- ------------------------------------------------------
-- Server version	8.0.43

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!50503 SET NAMES utf8mb4 */;

--
-- Database: `employees`
--

CREATE DATABASE IF NOT EXISTS `employees`;
USE `employees`;

--
-- Table structure for table `employees`
--

DROP TABLE IF EXISTS `employees`;
CREATE TABLE `employees` (
  `emp_no` int NOT NULL,
  `birth_date` date NOT NULL,
  `first_name` varchar(14) NOT NULL,
  `last_name` varchar(16) NOT NULL,
  `gender` enum('M','F') NOT NULL,
  `hire_date` date NOT NULL,
  PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

--
-- Dumping data for table `employees`
--

LOCK TABLES `employees` WRITE;
/*!40000 ALTER TABLE `employees` DISABLE KEYS */;
INSERT INTO `employees` VALUES (10001,'1953-09-02','Georgi','Facello','M','1986-06-26');
-- ... more INSERT statements
/*!40000 ALTER TABLE `employees` ENABLE KEYS */;
UNLOCK TABLES;

-- ... similar structure for departments, dept_emp, dept_manager, salaries, titles, views
```
Key Elements:
- `CREATE DATABASE` and `USE employees;` for context.
- `CREATE TABLE` for each table's schema.
- `INSERT` statements for data.
- Views like `current_dept_emp` included.
- Tables Belong to Databases: MySQL requires database context for tables, ensuring correct placement during import.
- Foreign Key Dependencies: The `employees` database has inter-table relationships (e.g., `salaries` references `employees.emp_no`), so a full dump preserves integrity.
- Import Compatibility: `CREATE DATABASE` and `USE` statements make the dump portable across environments.
- Complete Backup: Includes views, constraints, and data for disaster recovery.
- Sharing with a Colleague:
  - Share `employees_full_dump.sql`:
    ```bash
    mysqldump -u root -p --databases employees --single-transaction > /var/lib/mysql-files/employees_full_dump.sql
    docker cp mysql-container:/var/lib/mysql-files/employees_full_dump.sql ./employees_full_dump.sql
    ```
  - Colleague imports (the redirect must use the host-side copy, since shell redirection happens on the host):
    ```bash
    docker cp ./employees_full_dump.sql their-container:/var/lib/mysql-files/employees_full_dump.sql
    docker exec -i their-container mysql -u root -p < ./employees_full_dump.sql
    ```
- Updating Existing Data:
  - Dump data only for specific tables (if schema exists):
    ```bash
    mysqldump -u root -p --no-create-info employees employees > /var/lib/mysql-files/employees_data_only.sql
    docker cp mysql-container:/var/lib/mysql-files/employees_data_only.sql ./employees_data_only.sql
    ```
  - Import:
    ```bash
    docker exec -i new-mysql-container mysql -u root -p employees < ./employees_data_only.sql
    ```
- Complex Dependencies:
  - Dump related tables (e.g., `employees`, `salaries`, `titles`):
    ```bash
    mysqldump -u root -p employees employees salaries titles > /var/lib/mysql-files/related_tables.sql
    docker cp mysql-container:/var/lib/mysql-files/related_tables.sql ./related_tables.sql
    ```
- Full Database Dump: Use `--databases` for complete portability.
- Single Table Export: Include schema unless the target has an identical structure.
- Related Tables: Dump all dependent tables to preserve foreign keys.
- Data-Only Updates: Use `--no-create-info` for existing schemas.
- Container Context: Use `docker cp` and `/var/lib/mysql-files/` for file management.
- Verification: Always check row counts and run integrity tests post-import.
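These points can be bundled into a small helper script; the following is a sketch under the assumptions noted in its comments (the container name, database, backup directory, and `MYSQL_ROOT_PASSWORD` environment variable are placeholders to adapt):

```bash
#!/usr/bin/env bash
# backup_employees.sh -- one-shot dump helper (a sketch, not a hardened tool).
# Assumes: docker is available, $MYSQL_ROOT_PASSWORD holds the root password,
# and passing it via MYSQL_PWD (read by the mysql client tools) is acceptable.
set -euo pipefail

CONTAINER=mysql-container   # source container
DB=employees                # database to dump
DEST="$HOME/backups"        # host-side backup directory
STAMP=$(date +%Y%m%d_%H%M%S)

mkdir -p "$DEST"
docker exec -e MYSQL_PWD="$MYSQL_ROOT_PASSWORD" "$CONTAINER" \
  mysqldump -u root --databases "$DB" --single-transaction \
  | gzip > "$DEST/${DB}_${STAMP}.sql.gz"
echo "Backup written to $DEST/${DB}_${STAMP}.sql.gz"
```

Run it as `MYSQL_ROOT_PASSWORD=... ./backup_employees.sh`; the gzipped dump restores by streaming through `gunzip` as shown earlier.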
The provided steps failed due to several issues, which were resolved as follows:
- Error: Ambiguous Table-Only Dump Commands
  - Issue: Commands like `mysqldump employees_table` were invalid, as MySQL requires the database name (e.g., `employees`).
  - Resolution: Used `mysqldump -u root -p --databases employees` for a full database dump, ensuring correct context. For single tables, used `mysqldump -u root -p employees employees`.
- Error: Confusion Over Full vs. Single-Table Dumps
  - Issue: The provided steps mixed single-table examples (e.g., `mysqldump employees employees`) with the goal of a full database dump, causing confusion.
  - Resolution: Focused on a full database dump with `mysqldump --databases employees`, retaining single-table examples for reference.
- Error: Missing Container Context
  - Issue: Commands like `mysqldump -u username -p employees` didn't account for Docker's filesystem or `secure_file_priv`.
  - Resolution: Ran `mysqldump` inside the container, saved to `/var/lib/mysql-files/`, and used `docker cp` for file transfers:
    ```bash
    docker cp mysql-container:/var/lib/mysql-files/employees_full_dump.sql ./employees_full_dump.sql
    ```
- Error: Incomplete Restoration Guidance
  - Issue: The provided steps didn't specify creating the database or handling permissions in the new container.
  - Resolution: Used `--databases` to include `CREATE DATABASE`, set permissions (`chmod 644`, `chown mysql:mysql`), and imported with:
    ```bash
    mysql -u root -p < /var/lib/mysql-files/employees_full_dump.sql
    ```
- Potential Issue: Foreign Key Constraints
  - Issue: Not encountered, but restoring single tables (e.g., `employees`) could fail due to dependencies (e.g., `salaries`).
  - Resolution: Included all tables in the full dump. For single-table restores, disable foreign keys temporarily:
    ```sql
    SET FOREIGN_KEY_CHECKS = 0;
    SOURCE /var/lib/mysql-files/employees_only.sql;
    SET FOREIGN_KEY_CHECKS = 1;
    ```
- Potential Issue: Slow Import or Memory Limits
  - Issue: Large dumps (~167 MB) could hit memory limits.
  - Resolution: Increased `max_allowed_packet`:
    ```sql
    SET GLOBAL max_allowed_packet = 268435456;
    ```
  - Post-import, verified with:
    ```sql
    SELECT COUNT(*) FROM employees; -- ~300,024
    SELECT COUNT(*) FROM salaries;  -- ~2,844,047
    SELECT * FROM employees LIMIT 5;
    ```
    ```bash
    mysql -u root -p -t < /var/lib/mysql-files/test_employees_md5.sql
    ```
  - Confirmed all tables, row counts, and CRCs matched expected values.
This corrected process successfully dumped the entire employees database and restored it in a new container, addressing the provided document’s errors by using proper mysqldump syntax, Docker file management, and full database context. The dump is portable for sharing, backup, or migration, with single-table examples provided for flexibility. If further issues arise, share error logs or outputs for targeted assistance.
Quick steps to execute:
Now that the Employees Database is running successfully in the current MySQL container (`mysql-container`), we'll export (dump) the entire database (including all tables, schema, data, views, and constraints) to a SQL file on the local machine. Then, we'll restore it in a new MySQL container. This process ensures the full database (~167 MB) is portable and can be migrated without data loss. The dump will include everything: tables like `employees` (~300,024 rows), `salaries` (~2,844,047 rows), and relationships via foreign keys.
- The current container (`mysql-container`) is running with the `employees` database populated.
- Docker is installed on the host.
- Root access to MySQL (with password set).
- Disk space: At least 200 MB free on the host for the dump file.
- Access the Container Shell:
  ```bash
  docker exec -it mysql-container bash
  ```
- Create the Full Database Dump:
  Use `mysqldump` to export the entire `employees` database to a SQL file:
  ```bash
  mysqldump -u root -p --databases employees --single-transaction > /var/lib/mysql-files/employees_full_dump.sql
  ```
  - Options Explained:
    - `-u root`: Root user (replace if different).
    - `-p`: Prompts for password.
    - `--databases employees`: Includes `CREATE DATABASE` for easy restore.
    - `--single-transaction`: Ensures consistency without locking tables.
    - `> /var/lib/mysql-files/employees_full_dump.sql`: Saves to a secure directory.
  - Enter the root password when prompted.
  - The dump includes schema (`CREATE TABLE`), data (`INSERT`), and views.
- Verify the Dump File:
  ```bash
  ls -al /var/lib/mysql-files/employees_full_dump.sql
  head -n 20 /var/lib/mysql-files/employees_full_dump.sql
  ```
  - Expected: File size ~167-200 MB; starts with the `-- MySQL dump` header, followed by `CREATE DATABASE employees` and table definitions.
- Set Permissions (if needed):
  ```bash
  chmod 644 /var/lib/mysql-files/employees_full_dump.sql
  chown mysql:mysql /var/lib/mysql-files/employees_full_dump.sql
  ```
- Exit the Container:
  ```bash
  exit
  ```
- Transfer the Dump File:
  Copy from the container to the host:
  ```bash
  docker cp mysql-container:/var/lib/mysql-files/employees_full_dump.sql ./employees_full_dump.sql
  ```
- Verify Locally:
  ```bash
  ls -al ./employees_full_dump.sql
  head -n 20 ./employees_full_dump.sql
  ```
- Optional: Compress for Storage:
  ```bash
  gzip ./employees_full_dump.sql
  ```
  - Creates `employees_full_dump.sql.gz` (decompress with `gunzip` before restore).
- Start a New Container:
  Create a fresh MySQL container:
  ```bash
  docker run --name new-mysql-container -e MYSQL_ROOT_PASSWORD=The_password -d -p 3307:3306 mysql:8.0
  ```
  - Uses port 3307 to avoid conflict with the original container (port 3306).
  - Replace `The_password` with a secure password.
- Verify New Container:
  ```bash
  docker ps
  ```
  - Look for `new-mysql-container` running.
- Copy the Dump File to the New Container:
  ```bash
  docker cp ./employees_full_dump.sql new-mysql-container:/var/lib/mysql-files/employees_full_dump.sql
  ```
- Access the New Container Shell:
  ```bash
  docker exec -it new-mysql-container bash
  ```
- Set Permissions:
  ```bash
  chmod 644 /var/lib/mysql-files/employees_full_dump.sql
  chown mysql:mysql /var/lib/mysql-files/employees_full_dump.sql
  ```
- Import the Dump:
  ```bash
  mysql -u root -p < /var/lib/mysql-files/employees_full_dump.sql
  ```
  - Enter the root password.
  - This recreates the `employees` database and all tables/data.
  - Import time: 5-30 minutes for the full dataset.
- Optional: Enable Local Infile (if needed for future operations):
  ```
  mysql -u root -p
  SET GLOBAL local_infile = 1;
  exit;
  ```
- Log into MySQL in the New Container:
  ```bash
  mysql -u root -p
  ```
- Check Databases and Switch:
  ```sql
  SHOW DATABASES;
  USE employees;
  ```
  - Expect `employees` in the list.
- List Tables:
  ```sql
  SHOW TABLES;
  ```
  - Expect 8 tables: `departments`, `employees`, `dept_emp`, etc.
- Check Row Counts:
  ```sql
  SELECT COUNT(*) FROM employees; -- ~300,024
  SELECT COUNT(*) FROM salaries;  -- ~2,844,047
  ```
- Inspect Sample Data:
  ```sql
  SELECT * FROM employees LIMIT 5;
  ```
  - Expect valid employee records.
- Run Integrity Test (Copy Test Script First):
  - Copy `test_employees_md5.sql` to the new container:
    ```bash
    docker cp ./test_db/test_employees_md5.sql new-mysql-container:/var/lib/mysql-files/test_employees_md5.sql
    ```
  - Run:
    ```bash
    mysql -u root -p -t < /var/lib/mysql-files/test_employees_md5.sql
    ```
  - Expect "OK" for all tables.
- Exit and Stop (if Testing):
  - Exit shell: `exit`
  - Stop the new container (optional): `docker stop new-mysql-container`
- Store the Dump:
  - Keep `employees_full_dump.sql` in a safe location (e.g., Git repository or cloud storage) for future restores.
  - To restore in another environment, repeat Steps 3-5.
- Optional: Compress and Archive:
  ```bash
  tar -czf employees_backup.tar.gz ./employees_full_dump.sql
  ```
- Import Errors (e.g., "Duplicate entry"): Drop the database first:
  ```sql
  DROP DATABASE IF EXISTS employees;
  ```
- Foreign Key Issues: `SET FOREIGN_KEY_CHECKS` is session-scoped, so it must run in the same session as the import (a separate `mysql -e` call has no effect on it). One way:
  ```bash
  ( echo "SET FOREIGN_KEY_CHECKS=0;"; cat /var/lib/mysql-files/employees_full_dump.sql; echo "SET FOREIGN_KEY_CHECKS=1;" ) | mysql -u root -p employees
  ```
- Large Dump Slowdown: Increase `max_allowed_packet` in the new container:
  ```sql
  SET GLOBAL max_allowed_packet = 268435456;
  ```
- File Not Found: Verify paths with `ls -al /var/lib/mysql-files/employees_full_dump.sql`.
- Logs: Check for errors: `tail -n 50 /var/log/mysql/error.log`
This process creates a complete, portable dump of the Employees Database and restores it in a new container, ensuring you can replicate the setup anywhere. If you encounter issues, share the error output for further guidance!
Since we have successfully migrated the entire employees database to a new MySQL container (new-mysql-container) and verified its functionality, you now want to:
- Delete a specific table (e.g., `employees`) from the `employees` database in the original container (`mysql-container`).
- Take a backup (dump) of specific tables (e.g., `employees`, `departments`, and `salaries`) from the original container.
- Restore these tables in the new container (`new-mysql-container`) to ensure they are correctly recreated.
This process builds on the previous successful full database dump and restore, focusing on table-level operations while addressing dependencies (e.g., foreign keys) and container-specific requirements. Below is a detailed guide with corrected commands, incorporating lessons from prior interactions and the provided document, ensuring the process is robust and portable.
Prerequisites and steps:
- Environment:
  - Source Container: `mysql-container` (MySQL 8.0.43, port 3306) with the `employees` database (~300,024 rows in `employees`, ~2,844,047 in `salaries`).
  - New Container: `new-mysql-container` (port 3307) with the `employees` database already restored.
  - Tables: 8 tables (`employees`, `departments`, `dept_emp`, `dept_manager`, `titles`, `salaries`, `current_dept_emp`, `dept_emp_latest_date`).
- Tools: `mysqldump` for backup, `mysql` for import/deletion, Docker for container management.
- Disk Space: ~50-100 MB for specific table dumps.
- Access: Root credentials for both containers.
- Foreign Key Dependencies: The `employees` table is referenced by `salaries`, `titles`, `dept_emp`, and `dept_manager`. Deleting it requires disabling foreign key checks to avoid errors.
- Schema Context: As noted in the provided document, table dumps must include the database context (`employees`) to ensure portability and correct restoration.
- Portability: Dumps include `CREATE TABLE` statements to recreate the schema, avoiding issues with missing structures in the target.
- Access the Source Container:
  ```bash
  docker exec -it mysql-container mysql -u root -p
  ```
  - Enter the root password.
- Disable Foreign Key Checks:
  Since `employees` is referenced by other tables, disable foreign keys to allow deletion:
  ```sql
  SET FOREIGN_KEY_CHECKS = 0;
  ```
- Delete the `employees` Table:
  ```sql
  USE employees;
  DROP TABLE employees;
  ```
- Re-enable Foreign Key Checks:
  ```sql
  SET FOREIGN_KEY_CHECKS = 1;
  ```
- Verify Deletion:
  ```sql
  SHOW TABLES;
  SELECT COUNT(*) FROM employees;
  ```
  - Expect: `employees` missing from `SHOW TABLES` and an error for the `SELECT` query (table doesn't exist).
- Exit MySQL:
  ```sql
  exit;
  ```
- Access the Container Shell:
  ```bash
  docker exec -it mysql-container bash
  ```
  - Note: The `employees` table is deleted in `mysql-container`, so you'll need to use `new-mysql-container` (where the full database was previously restored) to dump the tables.
- Exit and Switch to New Container:
  ```bash
  exit
  docker exec -it new-mysql-container bash
  ```
- Create the Table-Specific Dump:
  Dump the `employees`, `departments`, and `salaries` tables:
  ```bash
  mysqldump -u root -p employees employees departments salaries --single-transaction > /var/lib/mysql-files/specific_tables_dump.sql
  ```
  - Options:
    - `-u root`: Root user.
    - `-p`: Prompts for password.
    - `employees employees departments salaries`: Specifies the database and tables.
    - `--single-transaction`: Ensures consistency without locking.
  - Output: `/var/lib/mysql-files/specific_tables_dump.sql` (~100 MB).
  - Enter the root password.
- Verify the Dump File:
  ```bash
  ls -al /var/lib/mysql-files/specific_tables_dump.sql
  head -n 20 /var/lib/mysql-files/specific_tables_dump.sql
  ```
  - Expected: Starts with `-- MySQL dump` and `CREATE TABLE` statements for `employees`, `departments`, and `salaries` (table-level dumps omit `CREATE DATABASE`/`USE`, so specify the database at import time).
- Set Permissions:
  ```bash
  chmod 644 /var/lib/mysql-files/specific_tables_dump.sql
  chown mysql:mysql /var/lib/mysql-files/specific_tables_dump.sql
  ```
- Exit the Container:
  ```bash
  exit
  ```
- Copy to Host:
  ```bash
  docker cp new-mysql-container:/var/lib/mysql-files/specific_tables_dump.sql ./specific_tables_dump.sql
  ```
- Verify Locally:
  ```bash
  ls -al ./specific_tables_dump.sql
  head -n 20 ./specific_tables_dump.sql
  ```
- Optional: Compress for Storage:
  ```bash
  gzip ./specific_tables_dump.sql
  ```
  - Creates `specific_tables_dump.sql.gz`.
- Copy the Dump File:
  - If compressed, decompress: `gunzip ./specific_tables_dump.sql.gz`
  - Copy to `mysql-container`:
    ```bash
    docker cp ./specific_tables_dump.sql mysql-container:/var/lib/mysql-files/specific_tables_dump.sql
    ```
- Access the Original Container:
  ```bash
  docker exec -it mysql-container bash
  ```
- Set Permissions:
  ```bash
  chmod 644 /var/lib/mysql-files/specific_tables_dump.sql
  chown mysql:mysql /var/lib/mysql-files/specific_tables_dump.sql
  ```
- Import the Dump:
  - Log into MySQL:
    ```bash
    mysql -u root -p
    ```
  - Disable foreign key checks (since `salaries` references `employees`):
    ```sql
    SET FOREIGN_KEY_CHECKS = 0;
    ```
  - Import:
    ```sql
    USE employees;
    SOURCE /var/lib/mysql-files/specific_tables_dump.sql;
    ```
  - Or from the host:
    ```bash
    docker exec -i mysql-container mysql -u root -p employees < ./specific_tables_dump.sql
    ```
  - Re-enable foreign keys:
    ```sql
    SET FOREIGN_KEY_CHECKS = 1;
    ```
  - Import time: ~1-5 minutes.
- Verify Restoration:
  ```sql
  SHOW TABLES;
  SELECT COUNT(*) FROM employees;   -- ~300,024
  SELECT COUNT(*) FROM departments; -- ~9
  SELECT COUNT(*) FROM salaries;    -- ~2,844,047
  SELECT * FROM employees LIMIT 5;
  ```
  - Expected:
    ```
    +--------+------------+------------+-----------+--------+------------+
    | emp_no | birth_date | first_name | last_name | gender | hire_date  |
    +--------+------------+------------+-----------+--------+------------+
    |  10001 | 1953-09-02 | Georgi     | Facello   | M      | 1986-06-26 |
    ...
    ```
- Run Integrity Test (Optional):
  - Copy test script:
    ```bash
    docker cp ./test_db/test_employees_md5.sql mysql-container:/var/lib/mysql-files/test_employees_md5.sql
    ```
  - Run:
    ```bash
    mysql -u root -p -t < /var/lib/mysql-files/test_employees_md5.sql
    ```
  - Expected: "OK" for `employees`, `departments`, `salaries`.
- Exit and Stop (if Testing):
  ```bash
  exit
  docker stop mysql-container  # Optional
  ```
- Archive Dump:
  ```bash
  mv ./specific_tables_dump.sql ~/backups/
  tar -czf ~/backups/specific_tables_backup.tar.gz ~/backups/specific_tables_dump.sql
  ```
- Foreign Key Errors:
  - If import fails due to dependencies:
    ```sql
    SET FOREIGN_KEY_CHECKS = 0;
    SOURCE /var/lib/mysql-files/specific_tables_dump.sql;
    SET FOREIGN_KEY_CHECKS = 1;
    ```
- File Not Found:
  - Verify paths: `ls -al /var/lib/mysql-files/specific_tables_dump.sql`
- Slow Import:
  - Increase the packet limit:
    ```sql
    SET GLOBAL max_allowed_packet = 268435456;
    ```
- Logs: `tail -n 50 /var/log/mysql/error.log`
- Dependencies: The `salaries` table depends on `employees`. Dumping and restoring them together avoids foreign key issues.
- Portability: Table-level dumps omit `CREATE DATABASE`/`USE`, so import them with the target database selected (e.g., `mysql -u root -p employees < dump.sql`).
- Future Use: The dump can be restored in any MySQL container by repeating Step 4.
This process successfully deletes the employees table, backs up specific tables, and restores them, ensuring data integrity and compatibility with the employees database’s schema and relationships. If issues arise, share error outputs for further assistance!
Particular table backup and restore:
The previous response included employees, departments, and salaries in the dump for the following reasons (based on the provided document and database structure):
- Foreign Key Dependencies: The `employees` table is referenced by `salaries`, `titles`, `dept_emp`, and `dept_manager` via foreign keys. Dumping related tables ensures that dependencies are preserved, especially if other tables might be affected or needed in the target environment.
- Context from Provided Document: The document emphasized scenarios for dumping multiple tables (e.g., `mysqldump employees employees departments salaries`), suggesting a broader approach for robustness.
- Practical Scenario: In real-world cases, restoring a single table like `employees` often requires related tables to maintain referential integrity, especially in a database like `employees` with interconnected relationships.
However, since the task specifically involves deleting and restoring only the employees table, we can streamline the process to focus solely on this table. We’ll use the new-mysql-container (where the full database exists) to dump the employees table and restore it in mysql-container where it was deleted. We’ll also handle foreign key constraints carefully, as other tables depend on employees.
Delete the employees table in the original container (mysql-container), back up only the employees table from the new container (new-mysql-container), and restore it in the original container, ensuring data integrity and handling foreign key dependencies.
- Source Container: `mysql-container` (MySQL 8.0.43, port 3306) with the `employees` database, where the `employees` table will be deleted.
- New Container: `new-mysql-container` (port 3307) with the full `employees` database intact.
- Table: `employees` (~300,024 rows).
- Tools: `mysqldump`, `mysql`, Docker.
- Disk Space: ~20 MB for the `employees` table dump.
- Access MySQL in `mysql-container`:
  ```bash
  docker exec -it mysql-container mysql -u root -p
  ```
  - Enter the root password.
- Disable Foreign Key Checks:
  Since `employees` is referenced by `salaries`, `titles`, `dept_emp`, and `dept_manager`, disable foreign keys:
  ```sql
  SET FOREIGN_KEY_CHECKS = 0;
  ```
- Delete the `employees` Table:
  ```sql
  USE employees;
  DROP TABLE employees;
  ```
- Re-enable Foreign Key Checks:
  ```sql
  SET FOREIGN_KEY_CHECKS = 1;
  ```
- Verify Deletion:
  ```sql
  SHOW TABLES;
  SELECT COUNT(*) FROM employees;
  ```
  - Expect: `employees` missing from `SHOW TABLES`; the `SELECT` fails with `Table 'employees.employees' doesn't exist`.
- Exit MySQL:
  ```sql
  exit;
  ```
- Access the New Container Shell:
  ```bash
  docker exec -it new-mysql-container bash
  ```
  - The `new-mysql-container` has the full `employees` database, including the `employees` table.
- Create the Table Dump:
  Dump only the `employees` table:
  ```bash
  mysqldump -u root -p employees employees --single-transaction > /var/lib/mysql-files/employees_only.sql
  ```
  - Options:
    - `-u root`: Root user.
    - `-p`: Prompts for password.
    - `employees employees`: Specifies the database and table.
    - `--single-transaction`: Ensures consistency without locking.
  - Output: `/var/lib/mysql-files/employees_only.sql` (~17-20 MB).
  - Enter the root password.
- Verify the Dump File:
  ```bash
  ls -al /var/lib/mysql-files/employees_only.sql
  head -n 20 /var/lib/mysql-files/employees_only.sql
  ```
  - Expected: Starts with `-- MySQL dump`, followed by `CREATE TABLE employees` and `INSERT INTO employees` statements (no `USE` statement, since this is a table-level dump).
- Set Permissions:
  ```bash
  chmod 644 /var/lib/mysql-files/employees_only.sql
  chown mysql:mysql /var/lib/mysql-files/employees_only.sql
  ```
- Exit the Container:
  ```bash
  exit
  ```
- Copy to Host:
  ```bash
  docker cp new-mysql-container:/var/lib/mysql-files/employees_only.sql ./employees_only.sql
  ```
- Verify Locally:
  ```bash
  ls -al ./employees_only.sql
  head -n 20 ./employees_only.sql
  ```
- Optional: Compress for Storage:
  ```bash
  gzip ./employees_only.sql
  ```
  - Creates `employees_only.sql.gz`.
- Copy the Dump File:
  - If compressed, decompress: `gunzip ./employees_only.sql.gz`
  - Copy to `mysql-container`:
    ```bash
    docker cp ./employees_only.sql mysql-container:/var/lib/mysql-files/employees_only.sql
    ```
- Access the Original Container:
  ```bash
  docker exec -it mysql-container bash
  ```
- Set Permissions:
  ```bash
  chmod 644 /var/lib/mysql-files/employees_only.sql
  chown mysql:mysql /var/lib/mysql-files/employees_only.sql
  ```
- Import the Dump:
  - Log into MySQL:
    ```bash
    mysql -u root -p
    ```
  - Disable foreign key checks (to avoid conflicts with dependent tables like `salaries`):
    ```sql
    SET FOREIGN_KEY_CHECKS = 0;
    ```
  - Import:
    ```sql
    USE employees;
    SOURCE /var/lib/mysql-files/employees_only.sql;
    ```
  - Or from the host:
    ```bash
    docker exec -i mysql-container mysql -u root -p employees < ./employees_only.sql
    ```
  - Re-enable foreign keys:
    ```sql
    SET FOREIGN_KEY_CHECKS = 1;
    ```
  - Import time: ~1-2 minutes.
- Verify Restoration:
  ```sql
  SHOW TABLES;
  SELECT COUNT(*) FROM employees; -- ~300,024
  SELECT * FROM employees LIMIT 5;
  ```
  - Expected:
    ```
    +--------+------------+------------+-----------+--------+------------+
    | emp_no | birth_date | first_name | last_name | gender | hire_date  |
    +--------+------------+------------+-----------+--------+------------+
    |  10001 | 1953-09-02 | Georgi     | Facello   | M      | 1986-06-26 |
    ...
    ```
- Run Integrity Test (Optional):
  - Copy test script:
    ```bash
    docker cp ./test_db/test_employees_md5.sql mysql-container:/var/lib/mysql-files/test_employees_md5.sql
    ```
  - Run:
    ```bash
    mysql -u root -p -t < /var/lib/mysql-files/test_employees_md5.sql
    ```
  - Expected: "OK" for the `employees` table.
- Exit and Stop (if Testing):
  ```bash
  exit
  docker stop mysql-container  # Optional
  ```
- Archive Dump:
  ```bash
  mv ./employees_only.sql ~/backups/
  tar -czf ~/backups/employees_table_backup.tar.gz ~/backups/employees_only.sql
  ```
- Foreign Key Errors:
  - If dependent tables (e.g., `salaries`) cause issues, ensure `SET FOREIGN_KEY_CHECKS = 0` is in effect during import.
- File Not Found:
  - Verify paths: `ls -al /var/lib/mysql-files/employees_only.sql`
- Slow Import:
  - Increase the packet limit:
    ```sql
    SET GLOBAL max_allowed_packet = 268435456;
    ```
- Logs: `tail -n 50 /var/log/mysql/error.log`
- Dependencies: The `employees` table is critical for `salaries`, `titles`, `dept_emp`, and `dept_manager`. Restoring only `employees` is safe as long as foreign key checks are disabled during import.
- Schema Inclusion: The dump includes `CREATE TABLE employees` to ensure the table structure is recreated correctly.
- Future Use: The `employees_only.sql` file can be restored in any MySQL container with an `employees` database.
This revised process focuses on the employees table, addressing the specific task while ensuring compatibility with the database’s dependencies. If you need to back up or restore additional tables or encounter issues, let me know!
To delete specific data (rows) from a table in the MySQL `employees` database (or any table within it, such as `employees`, `departments`, etc.), you can use the `DELETE` statement with a `WHERE` clause to target only the rows you want to remove. Below, I'll explain how to delete specific data, provide examples tailored to your context, and cover best practices to ensure you don't accidentally delete unintended data.
Steps to delete specific data from a table:
- Identify the Table and Rows:
  - Determine the table (e.g., `employees`, `salaries`, etc.) from the `SHOW TABLES` output.
  - Specify the condition in the `WHERE` clause to select the rows you want to delete based on column values.
- Use the `DELETE` Statement:
  - Syntax:
    ```sql
    DELETE FROM table_name WHERE condition;
    ```
  - The `WHERE` clause is critical to avoid deleting all rows in the table.
- Verify Before Deleting:
  - Run a `SELECT` query with the same `WHERE` clause to preview the rows that will be deleted.
  - Example:
    ```sql
    SELECT * FROM table_name WHERE condition;
    ```
- Execute the Delete:
  - Run the `DELETE` statement to remove the targeted rows.
  - Optionally, check the number of affected rows to confirm the operation.
- Handle Constraints:
  - If the table has foreign key relationships (e.g., `employees` linked to `dept_emp` or `salaries`), you may need to address constraints to avoid errors; a query for listing referencing tables is shown below.
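Before deleting, it can help to list every table that references `employees`; this query against `information_schema` is one way to do that (standard MySQL metadata, shown here as a sketch):

```sql
-- List foreign keys that reference employees within the employees schema.
SELECT table_name, column_name, referenced_column_name
FROM information_schema.KEY_COLUMN_USAGE
WHERE table_schema = 'employees'
  AND referenced_table_name = 'employees';
```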
Based on your `SHOW TABLES` output (`employees`, `departments`, `dept_emp`, etc.), let's assume you want to delete specific rows from the `employees` table. Here are some common scenarios:
To delete a single employee with emp_no = 10001:
```sql
-- Preview the data to be deleted
SELECT * FROM employees WHERE emp_no = 10001;

-- Delete the row
DELETE FROM employees WHERE emp_no = 10001;
```
- Check Affected Rows:
  After running the `DELETE`, MySQL will report the number of rows affected:
  ```
  Query OK, 1 row affected (0.01 sec)
  ```
To delete employees hired before January 1, 1990:
```sql
-- Preview the rows
SELECT emp_no, first_name, last_name, hire_date
FROM employees
WHERE hire_date < '1990-01-01';

-- Delete the rows
DELETE FROM employees
WHERE hire_date < '1990-01-01';
```
WHERE hire_date < '1990-01-01';If you want to delete employees from a specific department (e.g., department d001 from the dept_emp table):
```sql
-- Preview employees in department d001
SELECT e.emp_no, e.first_name, e.last_name, de.dept_no
FROM employees e
JOIN dept_emp de ON e.emp_no = de.emp_no
WHERE de.dept_no = 'd001';

-- Snapshot the affected emp_no values first: once the dept_emp rows are
-- deleted, a subquery against dept_emp would return nothing.
CREATE TEMPORARY TABLE d001_emps AS
SELECT emp_no FROM dept_emp WHERE dept_no = 'd001';

-- Delete from dept_emp (remove department assignment)
DELETE FROM dept_emp
WHERE dept_no = 'd001';

-- Optionally, delete the employees entirely if they have no other department assignments
DELETE FROM employees
WHERE emp_no IN (SELECT emp_no FROM d001_emps)
  AND emp_no NOT IN (SELECT emp_no FROM dept_emp);
```
- Note: Be cautious with foreign key constraints. Deleting from `employees` may fail if `emp_no` is referenced in `dept_emp`, `salaries`, or other tables unless foreign key checks are disabled or cascading deletes are configured.
To delete only a specific number of rows (e.g., the first 10 employees with a certain condition):
```sql
-- Preview the rows
SELECT * FROM employees
WHERE gender = 'M'
LIMIT 10;

-- Delete the rows
DELETE FROM employees
WHERE gender = 'M'
LIMIT 10;
```
- Note: The `LIMIT` clause in `DELETE` is supported in MySQL but not in all SQL databases. Without an `ORDER BY`, the rows removed are not guaranteed to be the same ones the preview returned.
The employees database has tables like dept_emp, salaries, and titles that likely reference the employees table via foreign keys (e.g., emp_no). Deleting rows from employees may cause errors if referenced data exists. Here’s how to handle this:
- Check Foreign Key Constraints:
  - View the table's structure to identify foreign keys:
    ```sql
    SHOW CREATE TABLE dept_emp;
    ```
  - Look for `FOREIGN KEY` definitions, e.g., `FOREIGN KEY (emp_no) REFERENCES employees (emp_no)`.
- Option 1: Delete Dependent Rows First:
  - Delete rows from dependent tables (e.g., `dept_emp`, `salaries`) before deleting from `employees`:
    ```sql
    -- Delete related data first
    DELETE FROM dept_emp WHERE emp_no = 10001;
    DELETE FROM salaries WHERE emp_no = 10001;
    DELETE FROM titles WHERE emp_no = 10001;

    -- Then delete from employees
    DELETE FROM employees WHERE emp_no = 10001;
    ```
- Option 2: Temporarily Disable Foreign Key Checks:
  - Disable foreign key checks to allow deletion, then re-enable:
    ```sql
    SET FOREIGN_KEY_CHECKS = 0;
    DELETE FROM employees WHERE emp_no = 10001;
    SET FOREIGN_KEY_CHECKS = 1;
    ```
  - Caution: This can leave orphaned data in dependent tables (e.g., `dept_emp` rows with no corresponding `employees` row), so use carefully.
- Option 3: Use Cascading Deletes:
  - If the foreign keys are defined with `ON DELETE CASCADE`, deleting from `employees` will automatically delete related rows in `dept_emp`, `salaries`, etc. Check the schema:
    ```sql
    SHOW CREATE TABLE dept_emp;
    ```
  - If `ON DELETE CASCADE` is not set, you'd need to modify the schema (not recommended for production databases without careful planning).
- Always Preview with `SELECT`:
  - Run a `SELECT` query with the same `WHERE` clause to confirm which rows will be deleted:
    ```sql
    SELECT * FROM employees WHERE condition;
    ```
- Use Transactions for Safety:
  - Wrap the `DELETE` in a transaction to allow rollback if something goes wrong:
    ```sql
    START TRANSACTION;
    DELETE FROM employees WHERE emp_no = 10001;
    -- Verify the result
    SELECT * FROM employees WHERE emp_no = 10001;
    -- Commit or rollback
    COMMIT; -- or ROLLBACK;
    ```
- Backup Before Deleting:
  - Create a backup of the table or database before deleting data:
    ```bash
    mysqldump -u username -p employees employees > employees_backup.sql
    ```
  - This ensures you can restore data if you accidentally delete the wrong rows.
- Test in a Non-Production Environment:
  - Try the `DELETE` operation on a copy of the database first to avoid irreversible mistakes.
- Use Precise Conditions:
  - Ensure the `WHERE` clause is specific enough to target only the intended rows. Without a `WHERE` clause, `DELETE FROM table_name;` will delete all rows in the table.
- Check Affected Rows:
  - After running `DELETE`, MySQL reports the number of rows affected. Verify this matches your expectations:
    ```
    Query OK, 5 rows affected (0.02 sec)
    ```
- Missing `WHERE` Clause:
  - Running `DELETE FROM employees;` without a `WHERE` clause deletes all rows in the table. Always double-check your query.
- Foreign Key Errors:
  - Deleting from `employees` may fail if `emp_no` is referenced elsewhere. Address dependent tables or disable foreign key checks as needed.
- Incorrect Conditions:
  - A poorly written `WHERE` clause (e.g., `WHERE hire_date > '1990-01-01'` when you meant `<=`) can delete unintended rows. Always preview with `SELECT`.
- Performance with Large Tables:
  - Deleting many rows from a large table (e.g., millions of rows) can be slow due to indexes and foreign key checks. Consider deleting in smaller batches using `LIMIT` (see the sketch below) or disabling indexes temporarily.
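For the batching idea above, a minimal sketch (assuming credentials are supplied via `MYSQL_PWD` or a `~/.my.cnf` file, and using a hypothetical cutoff condition) repeats a `LIMIT`-ed `DELETE` until no rows remain:

```bash
# Batch-delete loop: each pass removes up to 10,000 matching rows.
# ROW_COUNT() reports how many rows the DELETE in the same session affected.
while true; do
  ROWS=$(mysql -u root -N -e \
    "DELETE FROM employees.employees WHERE hire_date < '1990-01-01' LIMIT 10000;
     SELECT ROW_COUNT();")
  [ "$ROWS" -le 0 ] && break
  echo "Deleted $ROWS rows..."
done
```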
If you need to delete data across related tables (e.g., remove an employee and their salary records):
```sql
-- Preview the data
SELECT e.emp_no, e.first_name, e.last_name, s.salary
FROM employees e
JOIN salaries s ON e.emp_no = s.emp_no
WHERE e.emp_no = 10001;

-- Delete from dependent table first
DELETE FROM salaries WHERE emp_no = 10001;

-- Then delete from employees
DELETE FROM employees WHERE emp_no = 10001;
```
To delete specific data from a table in the employees database:
- Use `DELETE FROM table_name WHERE condition;` to target specific rows.
- Preview with `SELECT` to verify the rows to be deleted.
- Handle foreign key constraints by deleting dependent rows first, disabling checks, or ensuring cascading deletes are configured.
- Use transactions and backups for safety.
- Example for the `employees` table:
  ```sql
  START TRANSACTION;
  SELECT * FROM employees WHERE emp_no = 10001;   -- Preview
  DELETE FROM dept_emp WHERE emp_no = 10001;      -- Delete dependent data
  DELETE FROM salaries WHERE emp_no = 10001;      -- Delete dependent data
  DELETE FROM employees WHERE emp_no = 10001;     -- Delete target row
  COMMIT;
  ```
If you have a specific table or condition in mind (e.g., delete employees with certain names or from a specific department), let me know, and I can provide a tailored query!
- Do You Have a Backup?: You need a recent `.sql` dump file (e.g., `employees_table.sql` or a full database dump) that contains the table's data before the deletion occurred.
- Can You Identify Missing Rows?: If you don't know which rows were deleted, you'll need to compare the current table with the backup to find the differences.
- Foreign Key Constraints: The `employees` database has related tables (e.g., `dept_emp`, `salaries`). Restoring rows in one table may require restoring related rows in others to maintain data integrity.
- Alternative Recovery Options: If you don't have a dump file, other methods like binary logs or point-in-time recovery may be available, depending on your MySQL configuration.
Options and steps to restore:
You can compare the current table with a dump file to identify and restore only the missing rows. This is the preferred approach to avoid overwriting existing data. Here’s how to do it:
- Ensure you have a `.sql` dump file (e.g., `employees_table.sql`) created with `mysqldump` before the deletion. This file should contain the `CREATE TABLE` statement (optional) and `INSERT` statements for the table's data.
- Example command that created the dump:
  ```bash
  mysqldump -u username -p employees employees > employees_table.sql
  ```
- Import the dump file into a temporary database to compare its contents with the current table.
- Create a new database (e.g., `temp_restore`):
  ```sql
  CREATE DATABASE temp_restore;
  ```
- Restore the dump file into the temporary database:
  ```bash
  mysql -u username -p temp_restore < employees_table.sql
  ```
- This creates a copy of the `employees` table (and any other dumped tables) in `temp_restore` with the data as it was at the time of the backup.
- Compare the current table (e.g., `employees.employees`) with the restored table (e.g., `temp_restore.employees`) to find rows that exist in the backup but not in the current table.
- Assuming `emp_no` is the primary key in the `employees` table, you can use a query like:
  ```sql
  SELECT t.emp_no, t.first_name, t.last_name, t.birth_date, t.gender, t.hire_date
  FROM temp_restore.employees t
  LEFT JOIN employees.employees e ON t.emp_no = e.emp_no
  WHERE e.emp_no IS NULL;
  ```
  - This query finds rows in `temp_restore.employees` that don't exist in `employees.employees` (i.e., the deleted rows).
  - The `LEFT JOIN` with `WHERE e.emp_no IS NULL` identifies rows present in the backup but missing in the current table.
- Save the Missing Rows: Export the missing rows to a new `.sql` file for restoration:
  ```bash
  mysqldump -u username -p --no-create-info --skip-lock-tables --where="emp_no IN (
    SELECT t.emp_no
    FROM temp_restore.employees t
    LEFT JOIN employees.employees e ON t.emp_no = e.emp_no
    WHERE e.emp_no IS NULL
  )" temp_restore employees > missing_employees.sql
  ```
  - The `--where` option filters the dump to include only the missing rows.
  - The resulting `missing_employees.sql` contains `INSERT` statements for the deleted rows.
- Import the `missing_employees.sql` file into the original database:
  ```bash
  mysql -u username -p employees < missing_employees.sql
  ```
- This inserts only the missing rows back into the `employees` table, preserving any existing data.
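If the backup has already been loaded into `temp_restore`, a simpler server-side alternative (a sketch relying on the primary key) is to let `INSERT IGNORE` skip rows that already exist:

```sql
-- Copies only rows whose emp_no is not already present in the live table;
-- duplicates are silently skipped because emp_no is the primary key.
INSERT IGNORE INTO employees.employees
SELECT * FROM temp_restore.employees;
```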
- If the deleted rows in `employees` had related data in tables like `dept_emp`, `salaries`, or `titles`, repeat the process for those tables:
  - Identify missing rows in `dept_emp`:
    ```sql
    SELECT t.emp_no, t.dept_no, t.from_date, t.to_date
    FROM temp_restore.dept_emp t
    LEFT JOIN employees.dept_emp e ON t.emp_no = e.emp_no AND t.dept_no = e.dept_no
    WHERE e.emp_no IS NULL;
    ```
  - Export and restore missing rows for `dept_emp`, `salaries`, etc., as needed.
- Check the restored data:
  ```sql
  SELECT * FROM employees
  WHERE emp_no IN (SELECT emp_no FROM temp_restore.employees);
  ```
- Compare row counts before and after:
  ```sql
  SELECT COUNT(*) FROM employees.employees;
  SELECT COUNT(*) FROM temp_restore.employees;
  ```
- Drop the temporary database:
  ```sql
  DROP DATABASE temp_restore;
  ```
If identifying and restoring only missing rows is too complex or you’re confident the backup contains all necessary data, you can delete all rows and restore the entire table from the dump. However, this is less desirable because it overwrites existing data.
-
Backup Current Data (for safety):
  ```bash
  mysqldump -u username -p employees employees > employees_current.sql
  ```
- Truncate the Table:
  ```sql
  TRUNCATE TABLE employees;
  ```
  - Or use `DELETE`:
    ```sql
    DELETE FROM employees;
    ```
  - Note: `TRUNCATE` resets auto-increment counters, while `DELETE` does not. Use `DELETE` if you need to preserve the counter.
-
Handle Foreign Keys:
- If
employeesis referenced bydept_emp,salaries, etc., you may need to truncate or delete related tables first or disable foreign key checks:SET FOREIGN_KEY_CHECKS = 0; TRUNCATE TABLE employees; SET FOREIGN_KEY_CHECKS = 1;
- If
-
Restore the Full Table:
mysql -u username -p employees < employees_table.sql -
Verify:
SELECT COUNT(*) FROM employees;
Downsides:
- Overwrites any new or modified rows added since the backup.
- Requires restoring related tables (e.g., `dept_emp`, `salaries`) to maintain consistency, which can be disruptive.
- More destructive than restoring only missing rows.
If you don’t have a dump file or want to recover without a full restore, you can use MySQL’s binary logs (if enabled) to identify and replay transactions before the deletion.
- Check Binary Logging:
  ```sql
  SHOW VARIABLES LIKE 'log_bin';
  ```
  - If `log_bin` is `ON`, binary logs are enabled.
- Identify the Deletion:
  - Use `mysqlbinlog` to inspect binary logs and find the `DELETE` statement(s):
    ```bash
    mysqlbinlog /path/to/binary-log.000001 > binlog.txt
    ```
  - Look for the `DELETE` query or use timestamps to narrow down the event.
- Replay Transactions:
  - Extract and replay `INSERT` statements from before the deletion using `mysqlbinlog` with a time range:
    ```bash
    mysqlbinlog --start-datetime="2025-10-23 00:00:00" --stop-datetime="2025-10-24 09:00:00" /path/to/binary-log.000001 | mysql -u username -p employees
    ```
  - Adjust the time range to before the deletion occurred.
- Limitations:
  - Requires binary logging to be enabled before the deletion.
  - Complex to pinpoint exact transactions without knowing the deletion time.
  - Not practical if you don't have binary logs or if they've been purged.
- Preserves Existing Data: New or modified rows since the backup are retained.
- Less Disruptive: Avoids wiping out the entire table or related tables.
- Targeted Recovery: Only restores what was lost, maintaining database consistency.
- Feasible with Dump: A `.sql` dump file makes it easy to compare and extract missing rows.
- Requires a Backup: You need a `.sql` dump file from before the deletion.
- Primary Key Needed: Comparing tables relies on a unique identifier (e.g., `emp_no` in `employees`).
- Foreign Key Complexity: Restoring rows in `employees` may require restoring related rows in `dept_emp`, `salaries`, etc., to avoid constraint violations.
- Time-Consuming: Comparing large tables can be slow, especially for millions of rows.
- Always Have Backups:
  - Regularly create dump files:
    ```bash
    mysqldump -u username -p employees > employees_backup_$(date +%F).sql
    ```
- Enable Binary Logging:
  - `log_bin` is read-only at runtime (it cannot be enabled with `SET GLOBAL`), so configure it in `my.cnf` and restart the server:
    ```ini
    [mysqld]
    log_bin = /var/log/mysql/mysql-bin.log
    ```
- Use Transactions for Deletes:
  - Wrap `DELETE` operations in transactions to allow rollback:
    ```sql
    START TRANSACTION;
    DELETE FROM employees WHERE condition;
    -- Verify
    ROLLBACK; -- or COMMIT;
    ```
- Test Restores:
  - Test the restore process in a temporary database to ensure it works as expected.
- Check Foreign Keys:
  - Before restoring, verify foreign key relationships:
    ```sql
    SHOW CREATE TABLE dept_emp;
    ```
Assume you accidentally deleted rows from the employees table and have a dump file employees_table.sql.
-
Create Temporary Database:
  ```sql
  CREATE DATABASE temp_restore;
  ```
  ```bash
  mysql -u username -p temp_restore < employees_table.sql
  ```
Find Missing Rows:
SELECT t.emp_no, t.first_name, t.last_name FROM temp_restore.employees t LEFT JOIN employees.employees e ON t.emp_no = e.emp_no WHERE e.emp_no IS NULL;
-
Export Missing Rows:
mysqldump -u username -p --no-create-info --where="emp_no IN ( SELECT t.emp_no FROM temp_restore.employees t LEFT JOIN employees.employees e ON t.emp_no = e.emp_no WHERE e.emp_no IS NULL )" temp_restore employees > missing_employees.sql
-
Restore Missing Rows:
mysql -u username -p employees < missing_employees.sql -
Handle Related Tables:
- Repeat for
dept_emp,salaries, etc., if needed:mysqldump -u username -p --no-create-info --where="emp_no IN ( SELECT t.emp_no FROM temp_restore.dept_emp t LEFT JOIN employees.dept_emp e ON t.emp_no = e.emp_no AND t.dept_no = e.dept_no WHERE e.emp_no IS NULL )" temp_restore dept_emp > missing_dept_emp.sql mysql -u username -p employees < missing_dept_emp.sql
- Repeat for
-
Verify:
  ```sql
  SELECT COUNT(*) FROM employees.employees;
  ```
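As an extra consistency check beyond row counts, `CHECKSUM TABLE` can compare the live table with the restored copy in one statement (a sketch; matching checksums strongly suggest identical contents):

```sql
-- Matching Checksum values indicate the two tables now hold the same rows.
CHECKSUM TABLE employees.employees, temp_restore.employees;
```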
You don’t need to delete all rows and restore the entire table. Instead, you can use a .sql dump file to:
- Compare the current table with the backup in a temporary database.
- Identify missing rows using a `LEFT JOIN` query.
- Export and restore only the missing rows with `mysqldump --where`.
This approach preserves existing data and is more precise. If binary logs are enabled, you can also explore point-in-time recovery, but a dump file is usually easier to work with. Always maintain regular backups and test your restore process to ensure data integrity.
The best approach depends on whether you can identify the affected table and the nature of the missing data (e.g., missing emp_no rows in employees or missing department assignments in dept_emp). Since you’re unsure which department’s data is missing, we’ll assume the employees table or dept_emp (department assignments) is affected, as these are central to department-related data. The strategy will:
- Identify missing rows by comparing the current table with the backup.
- Extract relevant rows from the full dump (`employees_full_dump.sql`).
- Restore only those rows, handling foreign key constraints.
Given the complexity of pinpointing deleted rows without specific details, we’ll use a practical method to restore missing data by leveraging the backup and MySQL tools, minimizing disruption to the existing database. Below is a step-by-step guide, tailored to The containerized setup, with alternatives if needed.
- Leverages Existing Backup: The `employees_full_dump.sql` contains all data (~300,024 rows in `employees`, ~2,844,047 in `salaries`, etc.), making it a reliable source to recover missing rows.
- Preserves Existing Data: Instead of dropping the database, we selectively insert missing rows to avoid overwriting valid data.
- Handles Dependencies: Disables foreign key checks to manage relationships (e.g., `dept_emp` references `employees`).
- Scalable: Works for any table (e.g., `employees` or `dept_emp`) once the affected table is identified.
Prerequisites and steps:
- Source Container: `mysql-container` (MySQL 8.0.43, port 3306) with the `employees` database, where data is missing.
- Backup File: `employees_full_dump.sql` (~167-200 MB) in `~/backups/` on the host.
- Reference Container: `new-mysql-container` (port 3307) with the full database (optional, for comparison).
- Tools: `mysqldump`, `mysql`, Docker, `grep`/`sed` for processing dumps.
- Disk Space: ~20 MB for temporary table dumps.
- Access MySQL in `mysql-container`:
  ```bash
  docker exec -it mysql-container mysql -u root -p
  ```
  - Enter the root password.
- Check Table Row Counts:
  - Compare current counts with expected counts (from the backup or schema documentation):
    ```sql
    USE employees;
    SELECT table_name, table_rows
    FROM information_schema.tables
    WHERE table_schema = 'employees';
    ```
  - Expected counts (approximate):
    - `employees`: ~300,024
    - `dept_emp`: ~331,603
    - `salaries`: ~2,844,047
    - `departments`: ~9
    - `dept_manager`: ~24
    - `titles`: ~443,308
  - If `employees` or `dept_emp` shows significantly fewer rows (e.g., `employees` has 200,000 rows), it's likely affected.
- Check Department Data (since you mentioned departments):
  ```sql
  SELECT dept_no, dept_name,
         (SELECT COUNT(*) FROM dept_emp WHERE dept_no = d.dept_no) AS employee_count
  FROM departments d;
  ```
  - Look for departments with unexpectedly low `employee_count` (e.g., `d001` should have ~20,000+).
- Assume `employees` is Affected:
  - If unsure, focus on `employees` (primary table) or `dept_emp` (department assignments), as these are critical for department data.
- Exit MySQL:
  ```sql
  exit
  ```
Since the full dump (employees_full_dump.sql) contains all tables, we’ll extract the employees table’s data to restore missing rows.
- Copy the Full Dump to Host (if not already there):
  ```bash
  ls -al ~/backups/employees_full_dump.sql
  ```
  - If compressed, decompress: `gunzip ~/backups/employees_full_dump.sql.gz`
- Extract `employees` Table Data:
  - On the host, use `grep` and `sed` to isolate the `employees` table's `INSERT` statements:
    ```bash
    grep -A 1000000 "Table structure for table \`employees\`" ~/backups/employees_full_dump.sql | sed -n '/INSERT INTO `employees`/p' > employees_data_only.sql
    ```
  - This creates `employees_data_only.sql` with only `INSERT INTO employees` statements (no `CREATE TABLE`).
  - Verify:
    ```bash
    head -n 20 employees_data_only.sql
    ```
    - Expect: Lines starting with `INSERT INTO employees VALUES ...`.
- Rewrite the Insert Targets:
  The extracted statements insert into `employees`, which would collide with the live table when sourced. Create a copy that targets the comparison table instead (on the host):
  ```bash
  sed 's/INSERT INTO `employees`/INSERT INTO `temp_employees`/' ./employees_data_only.sql > ./temp_employees_data.sql
  ```
- Copy Data to Container:
  ```bash
  docker cp ./temp_employees_data.sql mysql-container:/var/lib/mysql-files/temp_employees_data.sql
  ```
- Access MySQL:
  ```bash
  docker exec -it mysql-container mysql -u root -p
  ```
- Create a Temporary Table:
  - Load the backup data into a temporary table to compare:
    ```sql
    USE employees;
    CREATE TABLE temp_employees LIKE employees;
    SET FOREIGN_KEY_CHECKS = 0;
    SOURCE /var/lib/mysql-files/temp_employees_data.sql;
    SET FOREIGN_KEY_CHECKS = 1;
    ```
  - This loads all ~300,024 rows into `temp_employees`.
- Find Missing Rows:
  - Identify `emp_no` values missing in `employees` but present in `temp_employees`:
    ```sql
    SELECT t.emp_no
    FROM temp_employees t
    LEFT JOIN employees e ON t.emp_no = e.emp_no
    WHERE e.emp_no IS NULL;
    ```
  - Save missing `emp_no` values (e.g., to a file, or note the count):
    ```sql
    SELECT t.emp_no INTO OUTFILE '/var/lib/mysql-files/missing_employees.txt'
    FROM temp_employees t
    LEFT JOIN employees e ON t.emp_no = e.emp_no
    WHERE e.emp_no IS NULL;
    ```
- Extract Missing Rows from Backup:
  - On the host, filter `employees_data_only.sql` for missing `emp_no` values (requires scripting if many rows):
    - Copy `missing_employees.txt` to host:
      ```bash
      docker cp mysql-container:/var/lib/mysql-files/missing_employees.txt ./missing_employees.txt
      ```
    - Use a script (e.g., Python) to filter `INSERT` statements:
      ```python
      #!/usr/bin/python3
      # filter_missing_rows.py
      # Keeps only the INSERT lines whose emp_no appears in missing_employees.txt.
      # Assumes one row per INSERT line (dump created with --skip-extended-insert);
      # with multi-row INSERT lines this would also re-insert existing rows.
      with open('missing_employees.txt') as f:
          missing_ids = {line.strip() for line in f}

      with open('employees_data_only.sql') as src, open('employees_missing_rows.sql', 'w') as out:
          for line in src:
              # emp_no is numeric and leads each tuple, e.g. (10001,'1953-09-02',...
              if any(f"({emp_no}," in line for emp_no in missing_ids):
                  out.write(line)
      ```
    - Run: `python3 filter_missing_rows.py`
    - Creates `employees_missing_rows.sql` with `INSERT` statements for the missing rows.
- Copy to Container:
  ```bash
  docker cp ./employees_missing_rows.sql mysql-container:/var/lib/mysql-files/employees_missing_rows.sql
  ```
- Import Missing Rows:
  ```bash
  docker exec -it mysql-container mysql -u root -p
  ```
  ```sql
  USE employees;
  SET FOREIGN_KEY_CHECKS = 0;
  SOURCE /var/lib/mysql-files/employees_missing_rows.sql;
  SET FOREIGN_KEY_CHECKS = 1;
  ```
  - Or from host:
    ```bash
    docker exec -i mysql-container mysql -u root -p employees < ./employees_missing_rows.sql
    ```
- Check Row Counts:
  ```sql
  SELECT COUNT(*) FROM employees; -- Should be ~300,024
  ```
- Verify Department Data:
  ```sql
  SELECT d.dept_name, COUNT(de.emp_no) AS employee_count
  FROM departments d
  LEFT JOIN dept_emp de ON d.dept_no = de.dept_no
  GROUP BY d.dept_name;
  ```
  - Compare with expected counts (e.g., `Development` ~85,000).
- Run Integrity Test:
  ```bash
  docker cp ~/test_db/test_employees_md5.sql mysql-container:/var/lib/mysql-files/test_employees_md5.sql
  mysql -u root -p -t < /var/lib/mysql-files/test_employees_md5.sql
  ```
  - Expect: "OK" for `employees`.
- Drop Temporary Table:
  ```sql
  DROP TABLE temp_employees;
  ```
- Exit MySQL:
  ```sql
  exit
  ```
- Archive Files:
  ```bash
  mv ./employees_missing_rows.sql ~/backups/
  tar -czf ~/backups/employees_recovery.tar.gz ~/backups/employees_missing_rows.sql
  ```
- Prevent Future Loss:
  - Enable binary logging:
    - Edit `/etc/mysql/my.cnf` in the container:
      ```bash
      docker cp mysql-container:/etc/mysql/my.cnf ./my.cnf
      # Add: log_bin = /var/log/mysql/mysql-bin.log
      docker cp ./my.cnf mysql-container:/etc/mysql/my.cnf
      docker restart mysql-container
      ```
  - Schedule daily backups (the redirect runs on the host, so point it at a host directory; note that `%` must be escaped in crontab):
    ```
    0 2 * * * docker exec mysql-container mysqldump -u root -pThe_PASSWORD --databases employees > ~/backups/backup_$(date +\%Y\%m\%d).sql
    ```
- Foreign Key Errors:
  - Ensure `SET FOREIGN_KEY_CHECKS = 0` during import.
- Duplicate Rows:
  - Use `INSERT IGNORE` or check for duplicates:
    ```sql
    INSERT IGNORE INTO employees SELECT * FROM temp_employees;
    ```
- File Not Found:
  - Verify: `ls -al /var/lib/mysql-files/employees_missing_rows.sql`
- Slow Import:
  - Increase the packet limit: `SET GLOBAL max_allowed_packet = 268435456;`
- Logs:
  - Check: `tail -n 50 /var/log/mysql/error.log`
If the missing data is in dept_emp (department assignments):
- Extract `dept_emp` from the full dump:
  ```bash
  grep -A 1000000 "Table structure for table \`dept_emp\`" ~/backups/employees_full_dump.sql | sed -n '/INSERT INTO `dept_emp`/p' > dept_emp_data_only.sql
  ```
- Compare and restore as in Steps 3-4, using a `temp_dept_emp` table.
- Targeted Restoration: Restores only missing rows, preserving existing data.
- Uses Existing Backup: Leverages `employees_full_dump.sql` without needing binlogs.
- Handles Dependencies: Disables foreign keys to avoid conflicts.
- Verifiable: Integrity tests ensure correctness.
To monitor the number of connections (open, closed, and active) in a MySQL instance running inside a container, you can use MySQL commands, variables, and tools available within the container. Below are the steps to achieve this:
Steps to achieve the result:
First, you need to access the MySQL instance running inside the container. Assuming you’re using Docker, you can enter the container’s shell:
```bash
docker exec -it <container_name> bash
```

Replace `<container_name>` with the name or ID of your MySQL container. Once inside, you can connect to the MySQL server:

```bash
mysql -u root -p
```

Enter the root password when prompted, or use another user with sufficient privileges.
MySQL provides several ways to monitor connections through its system variables and status commands. Here’s how you can check open, closed, and active connections:
Open connections are the currently active or idle connections to the MySQL server. Use the SHOW STATUS command to view connection-related metrics:
```sql
SHOW STATUS LIKE 'Threads_%';
```

Key variables to focus on:

- `Threads_connected`: Number of currently open connections.
- `Threads_running`: Number of connections actively running queries (active connections).
- `Threads_cached`: Number of cached threads available for quick reuse.
- `Threads_created`: Number of threads created to handle connections.
Example output:
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| Threads_cached | 2 |
| Threads_connected | 10 |
| Threads_created | 15 |
| Threads_running | 3 |
+-------------------+-------+
Here:
- `Threads_connected` (10) indicates open connections.
- `Threads_running` (3) indicates active connections executing queries.
To see the total number of connections (including those that have been closed), use:
```sql
SHOW STATUS LIKE 'Connections';
```

This shows the total number of connection attempts (successful or not) since the server started.
Example output:
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Connections | 150 |
+---------------+-------+
The Connections value represents all connection attempts, including those that are now closed.
To estimate closed connections, you can calculate:
Closed Connections = Connections - Threads_connected
So, if Connections = 150 and Threads_connected = 10, then closed connections ≈ 140.
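The same arithmetic can be scripted so you don't have to read the numbers off by hand. A minimal sketch, assuming root access, the `mysql-container` name used elsewhere in this guide, and the password exported in `MYSQL_PWD`:

```bash
# Estimate closed connections as Connections - Threads_connected.
TOTAL=$(docker exec mysql-container mysql -u root -p"$MYSQL_PWD" -N -B \
  -e "SHOW GLOBAL STATUS LIKE 'Connections';" | awk '{print $2}')
OPEN=$(docker exec mysql-container mysql -u root -p"$MYSQL_PWD" -N -B \
  -e "SHOW GLOBAL STATUS LIKE 'Threads_connected';" | awk '{print $2}')
echo "Total attempts: $TOTAL, currently open: $OPEN, closed (approx.): $((TOTAL - OPEN))"
```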
To see details of all open connections, including user, host, and status:
```sql
SHOW PROCESSLIST;
```

Example output:
+----+------+-----------+------+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------+------+---------+------+-------+------------------+
| 5 | root | localhost | test | Query | 0 | init | SHOW PROCESSLIST |
| 6 | user | 172.17.0.1| app | Sleep | 120 | | NULL |
+----+------+-----------+------+---------+------+-------+------------------+
- `Command = Query`: Indicates an active connection running a query.
- `Command = Sleep`: Indicates an idle connection.
- The number of rows in the output reflects the total open connections (`Threads_connected`).
To ensure you’re not hitting connection limits, check the maximum allowed connections:
```sql
SHOW VARIABLES LIKE 'max_connections';
```

Example output:
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| max_connections | 151 |
+-----------------+-------+
If `Threads_connected` approaches `max_connections`, you may need to increase this limit or optimize your application.
To continuously monitor connections, you can run the above commands in a loop inside the container. For example:
```bash
watch -n 1 'mysql -u root -p<password> -e "SHOW STATUS LIKE \"Threads_%\"; SHOW STATUS LIKE \"Connections\"; SHOW PROCESSLIST;"'
```

This updates the output every second. Replace `<password>` with your MySQL root password.
If you need more detailed monitoring, consider these tools available in the MySQL container:
Use mysqladmin to check status:
```bash
mysqladmin -u root -p status
```

Example output:
Uptime: 3600 Threads: 10 Questions: 500 Slow queries: 0 Opens: 100 Flush tables: 1 Open tables: 50 Queries per second avg: 0.138
- `Threads`: Number of open connections.
- `Questions`: Total queries executed, which can help infer connection activity.
If the Performance Schema is enabled (check with SHOW VARIABLES LIKE 'performance_schema';), you can query detailed connection metrics:
```sql
SELECT * FROM performance_schema.threads WHERE PROCESSLIST_ID IS NOT NULL;
```

This provides detailed thread information, including connection states.
For a more robust solution, you can use external tools to monitor MySQL connections from outside the container:
- Prometheus + MySQL Exporter: Deploy a MySQL exporter in your containerized environment to scrape metrics like `mysql_global_status_threads_connected` and visualize them in Prometheus or Grafana.
- MySQL Workbench: Connect to the MySQL instance running in the container and use its performance dashboard to monitor connections.
- Zabbix or Nagios: Configure these tools to monitor MySQL metrics via the container’s exposed port (default: 3306).
To expose MySQL metrics, ensure the container's MySQL port is mapped to the host (e.g., `-p 3306:3306` in your `docker run` command).
To automate monitoring, you can write a script inside the container. For example, create a script to log connection stats:
```bash
#!/bin/bash
# Append connection stats to a log file every 10 seconds.
while true; do
    mysql -u root -p<password> -e "SHOW STATUS LIKE 'Threads_%'; SHOW STATUS LIKE 'Connections';" >> /var/log/mysql_connections.log
    sleep 10
done
```

Save this as `monitor_connections.sh`, make it executable (`chmod +x monitor_connections.sh`), and run it in the background. Replace `<password>` with your MySQL root password.
- Open Connections: `Threads_connected` from `SHOW STATUS LIKE 'Threads_%';`
- Active Connections: `Threads_running` from `SHOW STATUS LIKE 'Threads_%';`
- Closed Connections: Derived as `Connections - Threads_connected` from `SHOW STATUS LIKE 'Connections';`
- Detailed Connection Info: `SHOW PROCESSLIST;`
Click to view steps and guide
Connect to MySQL:

```bash
mysql -u root -p
```

- General Query Log: `SHOW VARIABLES LIKE 'general_log%';`
  - Expected if disabled: `general_log = OFF`, file: `/var/lib/mysql/88e7e755eccf.log`.
- Slow Query Log: `SHOW VARIABLES LIKE 'slow_query_log%';`
  - Expected if disabled: `slow_query_log = OFF`, file: `/var/lib/mysql/88e7e755eccf-slow.log`.
- Log Output: `SHOW VARIABLES LIKE 'log_output%';`
  - Expected: `log_output = FILE`.
- Error Log: `SHOW VARIABLES LIKE 'log_error%';`
  - Expected: `log_error = stderr`, `log_error_verbosity = 2`.
- Test Activity (no entries if disabled):
  - General: `SELECT 1;` then `tail -f /var/lib/mysql/88e7e755eccf.log`.
  - Slow: `SELECT SLEEP(2);` then `tail -f /var/lib/mysql/88e7e755eccf-slow.log`.
  - Error: `journalctl -u mysql -f`.
Enable the logs dynamically:

```sql
SET GLOBAL general_log = 'ON';
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL log_queries_not_using_indexes = 'ON';
-- Note: log_error is read-only at runtime; redirect it via my.cnf and restart (see the persistent config below).
```

- Permissions:

```bash
sudo chown mysql:mysql /var/lib/mysql/88e7e755eccf*.log /var/lib/mysql/error.log
sudo chmod 660 /var/lib/mysql/88e7e755eccf*.log /var/lib/mysql/error.log
```
Re-run the queries from Step 1:

- Expected now: `general_log = ON`, `slow_query_log = ON`.
- Test: `SELECT 1;` → check `tail -f /var/lib/mysql/88e7e755eccf.log` for the entry.
- Slow: `SELECT SLEEP(2);` → check the slow log.
- Error: `tail -f /var/lib/mysql/error.log`.
Edit `/etc/mysql/my.cnf` under `[mysqld]`:
```ini
[mysqld]
general_log = 1
general_log_file = /var/lib/mysql/88e7e755eccf.log
slow_query_log = 1
slow_query_log_file = /var/lib/mysql/88e7e755eccf-slow.log
long_query_time = 1
log_queries_not_using_indexes = 1
log_output = FILE
log_error = /var/lib/mysql/error.log
log_error_verbosity = 3
max_connections = 500  # For heavy load
```

Restart:
```bash
sudo systemctl restart mysql
```

- Re-run the `SHOW VARIABLES` queries from Step 1 (or use the one-shot check after this list): all should now show `ON`, with the error log pointing to the file.
- Test activity as in Step 1 (entries should appear in the logs).
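To avoid running the Step 1 checks one by one, they can be collapsed into a single statement. A sketch, assuming the variable names used above:

```bash
# One-shot check of all logging-related variables.
mysql -u root -p -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN
  ('general_log','general_log_file','slow_query_log','slow_query_log_file',
   'long_query_time','log_output','log_error','log_error_verbosity');"
```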
Click to view other logs to enable and monitor
To optimize MySQL performance monitoring, I'll focus on reconfiguring the logging setup (moving relevant logs to /var/log/mysql for better organization and accessibility), enabling/enhancing monitoring for dropped connections, memory utilization, active connections, slow queries, and query execution times. I'll explain the rationale, suggest specific variable changes (some can be set dynamically via SQL, others require editing my.cnf and restarting MySQL), and provide queries to monitor these metrics ongoing.
The current logs are scattered in /var/lib/mysql/ with container-like prefixes (e.g., 88e7e755eccf.log). We'll move the general log, slow query log, and error log to /var/log/mysql/ for centralized management. Binlogs and relay logs are typically kept in data directories for replication/recovery, so I'll leave them unless you insist on moving (not recommended as it can break replication).
- Dynamic Changes (via SQL, no restart needed): Run these in the MySQL shell:

  ```sql
  SET GLOBAL general_log_file = '/var/log/mysql/general.log';
  SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';
  ```

  - This relocates the files immediately. Ensure the directory `/var/log/mysql` exists and MySQL has write permissions (e.g., `chown mysql:mysql /var/log/mysql` and `chmod 755 /var/log/mysql` on the host).

- Persistent Changes (edit `my.cnf` or `my.ini` under the `[mysqld]` section, then restart MySQL): Add/update these:

  ```ini
  general_log_file = /var/log/mysql/general.log
  slow_query_log_file = /var/log/mysql/slow.log
  log_error = /var/log/mysql/error.log  # Changes from stderr to a file for easier tailing/monitoring
  ```

  Verify with `SHOW VARIABLES LIKE '%log%';` after the restart.

  - Restart: `sudo systemctl restart mysqld` (or equivalent).
  - Rationale: Centralized logs make it easier to use tools like `tail -f`, logrotate, or external monitors (e.g., ELK stack, Prometheus).
The `general_log` and `slow_query_log` are already ON, which is good. If the general log becomes too verbose (it logs all queries), consider toggling it OFF for production and relying on the slow query log plus the performance schema.
Dropped connections (aborted connects/clients) often indicate network issues, auth failures, or resource exhaustion.
- Enable/Enhance Logging:
  - The current `log_error_verbosity = 2` is moderate; set it to 3 for more detail on errors (including connection issues): `SET GLOBAL log_error_verbosity = 3;` Add `log_error_verbosity = 3` to `my.cnf` for persistence.
  - Ensure `log_warnings` is set (deprecated in newer MySQL, but use it if available) or use the performance schema for connection events.
- Monitoring Queries: Run these periodically (e.g., via cron or a monitoring tool):
  ```sql
  SHOW GLOBAL STATUS LIKE 'Aborted_connects';   -- Failed connection attempts (e.g., bad credentials, timeouts)
  SHOW GLOBAL STATUS LIKE 'Aborted_clients';    -- Connections closed abnormally (e.g., timeouts, max_allowed_packet exceeded)
  SHOW GLOBAL STATUS LIKE 'Connection_errors%'; -- Breakdown of error types (e.g., Connection_errors_max_connections)
  ```

  - To reset counters: `FLUSH STATUS;`
  - Threshold: If `Aborted_connects` exceeds 1% of `Connections`, investigate (e.g., increase `max_connections` if hitting limits). This check is automated in the sketch after the tips below.
- Optimization Tips:
  - Check `max_connections` (not among the log variables; run `SHOW VARIABLES LIKE 'max_connections';`). The default is 151; increase if needed: `SET GLOBAL max_connections = 200;` (and in `my.cnf`).
  - For timeouts: Adjust `wait_timeout` and `interactive_timeout` (default 28800s; lower them if idle connections are an issue).
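The 1% threshold above is easy to automate. A minimal shell sketch, assuming root access with the password exported in `MYSQL_PWD`:

```bash
# Warn when Aborted_connects exceeds 1% of total connection attempts.
ABORTED=$(mysql -u root -p"$MYSQL_PWD" -N -B -e "SHOW GLOBAL STATUS LIKE 'Aborted_connects';" | awk '{print $2}')
TOTAL=$(mysql -u root -p"$MYSQL_PWD" -N -B -e "SHOW GLOBAL STATUS LIKE 'Connections';" | awk '{print $2}')
RATIO=$(awk -v a="$ABORTED" -v t="$TOTAL" 'BEGIN { printf "%.2f", (t > 0) ? 100 * a / t : 0 }')
echo "Aborted_connects: $ABORTED of $TOTAL attempts (${RATIO}%)"
awk -v r="$RATIO" 'BEGIN { exit !(r > 1.0) }' && echo "WARNING: aborted connects above 1% - investigate"
```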
MySQL memory is dominated by the InnoDB buffer pool, the query cache (if enabled; removed in MySQL 8.0), and thread stacks.
- Key Variables to Check/Set:
  - The current `innodb_log_buffer_size = 16M` and `innodb_log_file_size = 48M` are reasonable, but monitor usage.
  - Run `SHOW VARIABLES LIKE 'innodb_buffer_pool_size';` (not in the list above; default ~128MB on small servers). To optimize, set it to 50-70% of total RAM, e.g., `innodb_buffer_pool_size = 1G` in `my.cnf` (restart required).
- Monitoring Queries:

  ```sql
  SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%'; -- Utilization: (total_pages - free_pages) / total_pages * 100%
  SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';     -- Log write activity; high values may need a larger log buffer
  SHOW ENGINE INNODB STATUS;                           -- Detailed InnoDB stats, including memory breakdown
  SELECT * FROM sys.memory_global_total;               -- Total allocated memory (sys schema, MySQL 5.7+)
  ```

  - External: Use `top`, `htop`, or `free -h` on the host for OS-level MySQL process memory. Tools like MySQLTuner or Percona Monitoring can automate this.
  - Optimization: If utilization exceeds 80%, increase the buffer pool (see the sketch after this list). Watch for swapping (bad for performance).
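The utilization formula above can be computed directly from `performance_schema.global_status` (available in MySQL 5.7+). A sketch, assuming root access with the password in `MYSQL_PWD`:

```bash
# Print InnoDB buffer pool utilization as a percentage.
mysql -u root -p"$MYSQL_PWD" -N -B -e \
  "SELECT CONCAT(ROUND(100 * (1 - free.v / total.v), 1), '% of buffer pool pages in use')
     FROM (SELECT VARIABLE_VALUE v FROM performance_schema.global_status
            WHERE VARIABLE_NAME = 'Innodb_buffer_pool_pages_free') AS free,
          (SELECT VARIABLE_VALUE v FROM performance_schema.global_status
            WHERE VARIABLE_NAME = 'Innodb_buffer_pool_pages_total') AS total;"
```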
- Monitoring Queries:

  ```sql
  SHOW GLOBAL STATUS LIKE 'Threads_connected';    -- Currently open connections
  SHOW GLOBAL STATUS LIKE 'Threads_running';      -- Actively running threads (non-sleeping)
  SHOW GLOBAL STATUS LIKE 'Max_used_connections'; -- Peak connections since last restart
  SHOW PROCESSLIST;                               -- Detailed list of connections and what they're doing
  ```

  - If `Threads_connected` approaches `max_connections`, scale up or optimize queries.
- Optimization:
  - Enable `performance_schema` if it isn't already (check with `SHOW VARIABLES LIKE 'performance_schema';`). Set it in `my.cnf`: `performance_schema = ON` (restart required).
  - Query `SELECT * FROM performance_schema.threads;` for thread details.
- Enhance Slow Query Logging:
  - The current `slow_query_log = ON` is good. Set `long_query_time = 1` (seconds) to catch more: `SET GLOBAL long_query_time = 1;` Add `long_query_time = 1` to `my.cnf` for persistence.
  - Enable `log_slow_extra = ON` (MySQL 8.0.14+): logs extra info such as rows examined.
  - The current `log_queries_not_using_indexes = ON` is good for spotting inefficient queries.
- Monitor Query Times:
  - The general log (already ON) records every statement, but disable it in production if overhead is high: `SET GLOBAL general_log = OFF;`
  - Use the slow query log for targeted monitoring. Parse it with `pt-query-digest` (Percona Toolkit) for summaries.
  - For real-time stats:

  ```sql
  SHOW GLOBAL STATUS LIKE 'Slow_queries'; -- Count of slow queries
  SELECT * FROM performance_schema.events_statements_summary_global_by_event_name
  WHERE EVENT_NAME LIKE 'statement/sql%'
  ORDER BY AVG_TIMER_WAIT DESC
  LIMIT 10; -- Top statement types by average latency (requires performance_schema)
  ```
- General Performance Optimization and Monitoring:
  - Enable extra query logging: set `log_slow_admin_statements = ON` and `log_slow_replica_statements = ON` in `my.cnf` if relevant.
  - Use `EXPLAIN` on suspect queries to check execution plans.
  - Key status queries:

  ```sql
  SHOW GLOBAL STATUS LIKE 'Handler_read%'; -- Index usage; high Handler_read_rnd_next means full table scans
  SHOW GLOBAL STATUS LIKE 'Created_tmp%';  -- Temp tables; many on-disk temp tables mean tmp_table_size should grow
  SHOW GLOBAL STATUS LIKE 'Select%';       -- Query types; optimize full joins/scans
  ```

  - Tools: Install MySQL Workbench, phpMyAdmin, or Prometheus + Grafana for dashboards. Run the MySQLTuner script for automated suggestions.
  - Binlog/InnoDB Logs: The settings look fine (e.g., `sync_binlog = 1` for durability). Under high write load, consider `innodb_flush_log_at_trx_commit = 2` for performance (less safe).
Apply changes gradually, monitor with `SHOW GLOBAL STATUS` before and after, and test in a non-production environment; a small before/after snapshot sketch follows.
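One convenient way to do the before/after comparison is to snapshot a few counters around a workload and diff them. A sketch under the assumptions used in this guide (root access with the password in `MYSQL_PWD`; counter names as above):

```bash
# Snapshot selected counters, run your workload, then print the deltas.
snap() { mysql -u root -p"$MYSQL_PWD" -N -B \
  -e "SHOW GLOBAL STATUS WHERE Variable_name IN ('Connections','Aborted_connects','Slow_queries');"; }
snap > /tmp/status_before.txt
# ... run the workload or wait a measurement interval here ...
snap > /tmp/status_after.txt
join /tmp/status_before.txt /tmp/status_after.txt | awk '{ printf "%s: +%d\n", $1, $3 - $2 }'
```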
Click to view Step-by-Step Guide to Setting Up MySQL Proxy (Using ProxySQL) with Your Running MySQL Container
I'll explain this in detail as a beginner-friendly guide, assuming you're using ProxySQL (a popular MySQL proxy for load balancing, query routing, and monitoring) with your existing MySQL 8.0 container (mysql-container). We'll follow your specified approach: pulling the ProxySQL image, creating a configuration file locally (proxysql.cnf), running the ProxySQL container, and then copying the configuration file into the running container using docker cp. After copying, we'll reload the configuration inside ProxySQL without restarting the container.
This setup places ProxySQL as an intermediary between clients (e.g., your sysbench container or applications) and your MySQL container. Clients will connect to ProxySQL on port 6033, and ProxySQL will forward requests to MySQL on port 3306.
Assumptions:
- Your MySQL container (`mysql-container`) is running on the `mysql-sysbench-net` Docker network.
- MySQL root user/password: `root`/`root` (adjust if different).
- You're working on a host with Docker installed (e.g., your local machine).
- We'll use the official `proxysql/proxysql` image.
- Commands are run in your host terminal (e.g., Bash on macOS/Linux).
If any step fails, check `docker logs <container_name>` for errors.
First, download the official ProxySQL image from Docker Hub. This ensures you have the latest version compatible with MySQL 8.0.
- Open your terminal.
- Run the following command:

  ```bash
  docker pull proxysql/proxysql:latest
  ```

  - This pulls the image (~200MB); the output shows progress and confirms the "latest" tag.
- Verify: Run `docker images | grep proxysql` to see the image listed.
ProxySQL needs a configuration file (`proxysql.cnf`) to define how it connects to your MySQL backend, handles users, and listens for connections.
- Create a directory for the config (optional but organized):

  ```bash
  mkdir proxysql-config
  cd proxysql-config
  ```

- Use a text editor (e.g., Vim, Nano, or VS Code) to create `proxysql.cnf`. For example, with Vim: `vim proxysql.cnf`
  - Press `i` for insert mode, paste the following configuration, then `Esc` + `:wq` to save and exit.

Configuration content (copy-paste this):

```ini
# proxysql.cnf - Basic setup for single MySQL backend
datadir="/var/lib/proxysql"

admin_variables=
{
    admin_credentials="admin:admin"   # ProxySQL admin login (change for security)
    mysql_ifaces="0.0.0.0:6032"       # Admin interface port
    refresh_interval=2000
}

mysql_variables=
{
    threads=4                         # Number of worker threads (adjust based on your CPU cores, e.g., run 'nproc' in the container)
    max_connections=2048              # Max client connections (higher than MySQL's 151)
    default_query_delay=0
    default_query_timeout=36000000
    have_compress=true
    poll_timeout=2000
    interfaces="0.0.0.0:6033"         # Proxy listener port (clients connect here)
    default_schema="employees"        # Default database (your employees DB)
    stacksize=1048576
    server_version="8.0.43"           # Match your MySQL version
    connect_timeout_server=10000
    monitor_history=60000
    monitor_connect_interval=200000
    monitor_ping_interval=200000
    ping_interval_server_msec=10000
    commands_stats=true
    sessions_sort=true
}

mysql_servers=
(
    { address="mysql-container", port=3306, hostgroup=10 }  # Your MySQL container as backend
)

mysql_users=
(
    { username="root", password="root", default_hostgroup=10 }  # MySQL user for ProxySQL to connect
)

mysql_query_rules=
(
    # Optional: Route queries to employees DB tables
    { match_pattern="^SELECT .* FROM employees\\..*", destination_hostgroup=10, apply=1 }
)
```

- Key explanations:
  - `admin_credentials`: Login for ProxySQL's admin interface.
  - `interfaces`: ProxySQL listens on port 6033 for MySQL clients.
  - `mysql_servers`: Points to your MySQL container by name (resolved via the Docker network).
  - `mysql_users`: ProxySQL uses this to authenticate and connect to MySQL.
  - Customize: Change passwords, threads, or add rules for specific queries.
- Verify the file exists: `ls -l proxysql.cnf`
Now, start the ProxySQL container without the configuration (we'll copy it in later). We'll connect it to the same network as your MySQL container.
- Run the container:

  ```bash
  # Port 6032 exposes the admin interface to the host (optional, for monitoring).
  docker run -d \
    --name proxysql \
    --network mysql-sysbench-net \
    -p 6033:6033 \
    -p 6032:6032 \
    proxysql/proxysql:latest
  ```

  - `-d`: Run in the background.
  - `--network mysql-sysbench-net`: Allows communication with `mysql-container`.
  - `-p 6033:6033`: Expose the proxy port to the host (clients connect to `localhost:6033` or the container name).
  - `-p 6032:6032`: Expose the admin port (used in Step 5).
- Verify it's running: `docker ps | grep proxysql`
  - Should show status "Up".
- Check logs: `docker logs proxysql` (it may warn about a missing config, but that's okay for now).
Use `docker cp` to copy your local `proxysql.cnf` into the container's `/etc/proxysql/` directory (ProxySQL's default config path).

- Ensure you're in the directory with `proxysql.cnf` (e.g., `./proxysql-config`).
- Run the copy command:

  ```bash
  docker cp proxysql.cnf proxysql:/etc/proxysql/proxysql.cnf
  ```

- Verify the copy inside the container:

  ```bash
  docker exec -it proxysql bash
  ls -l /etc/proxysql/proxysql.cnf   # Should show the file
  cat /etc/proxysql/proxysql.cnf     # Optional: check the content
  exit
  ```
docker exec -it proxysql bash ls -l /etc/proxysql/proxysql.cnf # Should show the file cat /etc/proxysql/proxysql.cnf # Optional: Check content exit
ProxySQL doesn't auto-reload configs; we need to use its admin interface to load the new config at runtime (no restart needed).
-
Connect to ProxySQL's admin interface:
docker exec -it proxysql mysql -u admin -padmin -h127.0.0.1 -P6032 --prompt='Admin> '- User/password:
admin/admin(from config; change in production). - If on host port 6032:
mysql -u admin -padmin -h localhost -P6032.
- User/password:
-
Inside the admin console (prompt:
Admin>), load the config:Admin> LOAD MYSQL SERVERS FROM CONFIG; Admin> LOAD MYSQL USERS FROM CONFIG; Admin> LOAD MYSQL QUERY RULES FROM CONFIG; Admin> LOAD MYSQL VARIABLES FROM CONFIG; Admin> LOAD ADMIN VARIABLES FROM CONFIG; Admin> SAVE MYSQL SERVERS TO DISK; Admin> SAVE MYSQL USERS TO DISK; Admin> SAVE MYSQL QUERY RULES TO DISK; Admin> SAVE MYSQL VARIABLES TO DISK; Admin> SAVE ADMIN VARIABLES TO DISK; Admin> LOAD MYSQL SERVERS TO RUNTIME; Admin> LOAD MYSQL USERS TO RUNTIME; Admin> LOAD MYSQL QUERY RULES TO RUNTIME; Admin> LOAD MYSQL VARIABLES TO RUNTIME; Admin> LOAD ADMIN VARIABLES TO RUNTIME; -
Verify the backend (your MySQL container):
Admin> SELECT * FROM mysql_servers;- Should show
mysql-containeras a hostgroup=10 entry.
- Should show
-
Exit:
exit;
Now, test by connecting through the proxy.
- From your host:

  ```bash
  mysql -h localhost -P6033 -u root -proot -e "SHOW DATABASES;"
  ```

  - Should list databases like `employees` (proxied via ProxySQL).
- From your sysbench container (enter it with `docker exec -it ee89c8bb871e bash`):

  ```bash
  mysql -h proxysql -P6033 -u root -proot -e "SELECT 1 FROM employees.employees LIMIT 1;"
  ```

- Update sysbench commands to use the proxy:

  ```bash
  sysbench oltp_read_write --mysql-host=proxysql --mysql-port=6033 --mysql-user=root --mysql-password=root --mysql-db=employees --threads=128 --time=60 run
  ```
docker logs proxysqlor use the admin console (SELECT * FROM stats_mysql_connection_pool;). - Stop/Remove:
docker stop proxysql; docker rm proxysql. - Security: Change admin/root passwords in
proxysql.cnffor production. - Troubleshooting: If ProxySQL can't connect to MySQL, check MySQL permissions (
GRANT ALL ON *.* TO 'root'@'%';) and network (docker network inspect mysql-sysbench-net).
This completes the setup. Now, all traffic goes through ProxySQL, and you can add advanced features like query caching by editing proxysql.cnf and reloading. If you encounter errors (e.g., during copy or reload), share the output for debugging!
This guide describes how to benchmark a MySQL database using Sysbench inside Docker containers connected through a custom Docker network.
Click to view the guide
Create an isolated Docker network to allow the MySQL and Sysbench containers to communicate directly.
```bash
docker network create mysql-sysbench-net
```

Verify the network:
```bash
docker network ls
```

Assuming you already have a running MySQL container named `mysql-container`:
```bash
docker network connect mysql-sysbench-net mysql-container
```

This ensures the Sysbench container can access MySQL using the container name `mysql-container` as the hostname.
Create a Dockerfile for a lightweight Sysbench image.
```dockerfile
# Use Debian Bullseye as the base image (ARM-compatible)
FROM debian:bullseye

# Install Sysbench and the MySQL client
RUN apt-get update && \
    apt-get install -y sysbench default-mysql-client && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Keep the container alive
CMD ["tail", "-f", "/dev/null"]
```

Build the image:
```bash
docker build -t sysbench:debian .
```

Start a container from the built image and attach it to the same Docker network.
```bash
docker run -it -d --name sysbench-debian --network mysql-sysbench-net sysbench:debian
```

Verify the container is running:
```bash
docker ps
```

Enter the running container:
```bash
docker exec -it sysbench-debian bash
```

Check installed versions:
```bash
sysbench --version
mysql --version
```

Check the CPU thread count (for parallel test threads):
```bash
nproc
```

Create a Lua script for random read operations on the employees table.

```bash
cat > /employees_read.lua << 'EOF'
function thread_init()
  drv = sysbench.sql.driver()
  con = drv:connect()
end

function event()
  con:query("SELECT * FROM employees.employees WHERE emp_no = " .. sysbench.rand.uniform(10001, 500000))
end

function thread_done()
  con:disconnect()
end
EOF
```

Run the read-only benchmark for 60 seconds, using all available CPU threads.
```bash
sysbench /employees_read.lua \
  --mysql-host=mysql-container \
  --mysql-user=root \
  --mysql-password=rootpass \
  --mysql-db=employees \
  --threads=$(nproc) \
  --time=60 \
  --report-interval=10 \
  run
```

| Parameter | Description |
|---|---|
| `--mysql-host` | Hostname of the MySQL container (from the Docker network) |
| `--mysql-user` | MySQL username |
| `--mysql-password` | MySQL password |
| `--mysql-db` | Target database name |
| `--threads` | Number of concurrent threads (based on CPU cores) |
| `--time` | Duration of the benchmark in seconds |
| `--report-interval` | Interval (in seconds) for progress reports |
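Before moving on to the server-side counters below, you can capture the client-side report for later comparison. A sketch (sysbench's exact output format varies by version, so the grep pattern may need adjusting):

```bash
# Save the full sysbench report and pull out the headline numbers.
sysbench /employees_read.lua \
  --mysql-host=mysql-container --mysql-user=root --mysql-password=rootpass \
  --mysql-db=employees --threads=$(nproc) --time=60 run | tee /tmp/sysbench_run.txt
grep -E 'transactions:|queries:|avg:|max:' /tmp/sysbench_run.txt
```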
After the test, connect to your MySQL instance and run:
```sql
SHOW GLOBAL STATUS LIKE 'Aborted_%';
SHOW STATUS LIKE 'Connections';
SHOW VARIABLES LIKE 'max_connections';
```

| Metric | Meaning |
|---|---|
| `Aborted_connects` | Failed connection attempts (authentication errors, resource limits) |
| `Aborted_clients` | Connections terminated unexpectedly by clients |
| `Connections` | Total number of connection attempts (successful + failed) |
| `max_connections` | The configured upper limit for concurrent connections |
| Step | Description |
|---|---|
| 1 | Create Docker network |
| 2 | Connect MySQL to network |
| 3 | Build Sysbench image |
| 4 | Run Sysbench container |
| 5 | Verify setup |
| 6 | Create benchmark Lua script |
| 7 | Run benchmark |
| 8 | Analyze MySQL performance metrics |
Click to view the guide and setup
+---------------------------+
| Sysbench Container |
| (Load Generator Client) |
+-------------+-------------+
|
| TCP 6033 (ProxySQL listener)
v
+---------------------------+
| ProxySQL Layer |
| Port 6033 -> MySQL Proxy |
| Port 6032 -> Admin Panel |
+-------------+-------------+
|
| TCP 3306 (MySQL port)
v
+---------------------------+
| MySQL Container |
| Backend DB: employees |
+---------------------------+
Ensure you already have:
- A Docker network (`mysql-sysbench-net`)
- A running MySQL container named `mysql-container`
- The Sysbench container configured (from the previous guide)
Check network:
```bash
docker network ls
```

Pull the ProxySQL image:

```bash
docker pull proxysql/proxysql:latest
```

Create a configuration file named `proxysql.cnf`:
```bash
vim proxysql.cnf
```

Paste the following configuration:
# proxysql.cnf - Basic setup for single MySQL backend
datadir="/var/lib/proxysql"
admin_variables=
{
admin_credentials="admin:admin" # ProxySQL admin login (change for security)
mysql_ifaces="0.0.0.0:6032" # Admin interface port
refresh_interval=2000
}
mysql_variables=
{
threads=4 # Worker threads (adjust using 'nproc')
max_connections=2048 # Max client connections
default_query_delay=0
default_query_timeout=36000000
have_compress=true
poll_timeout=2000
interfaces="0.0.0.0:6033" # Proxy listener port (Sysbench connects here)
default_schema="employees" # Default DB
stacksize=1048576
server_version="8.0.43" # Match MySQL version
connect_timeout_server=10000
monitor_history=60000
monitor_connect_interval=200000
monitor_ping_interval=200000
ping_interval_server_msec=10000
commands_stats=true
sessions_sort=true
}
mysql_servers=
(
{ address="mysql-container", port=3306, hostgroup=10 } # MySQL backend
)
mysql_users=
(
{ username="root", password="root", default_hostgroup=10 } # ProxySQL connects using this user
)
mysql_query_rules=
(
# Optional routing rule for employees DB
{ match_pattern="^SELECT .* FROM employees\\..*", destination_hostgroup=10, apply=1 }
)

Start ProxySQL and attach it to the same network as MySQL and Sysbench:
docker run -d --name proxysql \
--network mysql-sysbench-net \
-p 6033:6033 -p 6032:6032 \
  proxysql/proxysql:latest

Check the container status:
```bash
docker ps
```

View logs:
```bash
docker logs proxysql
```

Copy the configuration into the container:

```bash
docker cp proxysql.cnf proxysql:/etc/proxysql.cnf
```

(Optional) verify:
docker exec -it proxysql ls -l /etc/proxysql.cnf
docker exec -it proxysql cat /etc/proxysql.cnf

Access ProxySQL's admin interface:
```bash
docker exec -it proxysql mysql -u admin -padmin -h 127.0.0.1 -P6032 --prompt='Admin> '
```

Execute the following commands:
Admin> LOAD MYSQL SERVERS FROM CONFIG;
Admin> LOAD MYSQL USERS FROM CONFIG;
Admin> LOAD MYSQL QUERY RULES FROM CONFIG;
Admin> LOAD MYSQL VARIABLES FROM CONFIG;
Admin> LOAD ADMIN VARIABLES FROM CONFIG;
Admin> SAVE MYSQL SERVERS TO DISK;
Admin> SAVE MYSQL USERS TO DISK;
Admin> SAVE MYSQL QUERY RULES TO DISK;
Admin> SAVE MYSQL VARIABLES TO DISK;
Admin> SAVE ADMIN VARIABLES TO DISK;
Admin> LOAD MYSQL SERVERS TO RUNTIME;
Admin> LOAD MYSQL USERS TO RUNTIME;
Admin> LOAD MYSQL QUERY RULES TO RUNTIME;
Admin> LOAD MYSQL VARIABLES TO RUNTIME;
Admin> LOAD ADMIN VARIABLES TO RUNTIME;

Validate the loaded configuration:
Admin> SELECT * FROM mysql_servers;
Admin> SELECT * FROM runtime_mysql_servers;

Enter your MySQL container:
```bash
docker exec -it mysql-container bash
```

Grant access to the root user from all hosts (or just ProxySQL):
mysql -u root -p -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' WITH GRANT OPTION; FLUSH PRIVILEGES;"Verify connectivity:
```bash
mysql -h localhost -P3306 -u root -p -e "SHOW DATABASES;"
```

In your Sysbench container (`sysbench-debian`):
```bash
docker exec -it sysbench-debian bash
```

Test the connection via ProxySQL:
```bash
mysql -h proxysql -P6033 -u root -p
```

Then run a Sysbench benchmark through ProxySQL:
sysbench /employees_read.lua \
--mysql-host=proxysql \
--mysql-port=6033 \
--mysql-user=root \
--mysql-password=root \
--mysql-db=employees \
--threads=$(nproc) \
--time=60 \
--report-interval=10 \
  run

You can monitor connection and query stats via the admin interface:
Admin> SELECT * FROM stats_mysql_connection_pool;
Admin> SELECT * FROM stats_mysql_query_digest;
Admin> SELECT * FROM stats_mysql_commands_counters;

| Component | Purpose | Port |
|---|---|---|
| MySQL | Backend database | 3306 |
| ProxySQL (Admin) | Management interface | 6032 |
| ProxySQL (Listener) | Client-facing MySQL proxy | 6033 |
| Sysbench | Load-testing client | — |
Workflow:
Sysbench (client)
↓
ProxySQL (port 6033)
↓
MySQL (port 3306)
ProxySQL Setup with MySQL and Sysbench (Docker): Tuning Connection Pool Parameters and Verification Flow
Click to view the Architecture Overview, Steps and Guide
+---------------------------+
| Sysbench Container |
| (Load Generator Client) |
+-------------+-------------+
|
| TCP 6033 (ProxySQL listener)
v
+---------------------------+
| ProxySQL Layer |
| Port 6033 -> MySQL Proxy |
| Port 6032 -> Admin Panel |
+-------------+-------------+
|
| TCP 3306 (MySQL port)
v
+---------------------------+
| MySQL Container |
| Backend DB: employees |
+---------------------------+
Docker Network: mysql-sysbench-net
All three containers communicate internally through this bridge network.
```bash
docker network create mysql-sysbench-net
docker network connect mysql-sysbench-net mysql-container
```

Create `proxysql.cnf`:
datadir="/var/lib/proxysql"
admin_variables=
{
admin_credentials="admin:admin"
mysql_ifaces="0.0.0.0:6032"
refresh_interval=2000
}
mysql_variables=
{
threads=4
max_connections=2048
interfaces="0.0.0.0:6033"
default_schema="employees"
server_version="8.0.43"
commands_stats=true
sessions_sort=true
}
mysql_servers=
(
{ address="mysql-container", port=3306, hostgroup=10 }
)
mysql_users=
(
{ username="root", password="root", default_hostgroup=10 }
)
mysql_query_rules=
(
{ match_pattern="^SELECT .* FROM employees\\..*", destination_hostgroup=10, apply=1 }
)

docker run -d --name proxysql \
--network mysql-sysbench-net \
-p 6033:6033 -p 6032:6032 \
  proxysql/proxysql:latest

Copy the config into the container:
```bash
docker cp proxysql.cnf proxysql:/etc/proxysql.cnf
```

Access the admin console:
```bash
docker exec -it proxysql mysql -u admin -padmin -h 127.0.0.1 -P6032 --prompt='Admin> '
```

Run:
Admin> LOAD MYSQL SERVERS FROM CONFIG;
Admin> LOAD MYSQL USERS FROM CONFIG;
Admin> LOAD MYSQL QUERY RULES FROM CONFIG;
Admin> LOAD MYSQL VARIABLES FROM CONFIG;
Admin> LOAD ADMIN VARIABLES FROM CONFIG;
Admin> SAVE MYSQL SERVERS TO DISK;
Admin> SAVE MYSQL USERS TO DISK;
Admin> SAVE MYSQL QUERY RULES TO DISK;
Admin> SAVE MYSQL VARIABLES TO DISK;
Admin> SAVE ADMIN VARIABLES TO DISK;
Admin> LOAD MYSQL SERVERS TO RUNTIME;
Admin> LOAD MYSQL USERS TO RUNTIME;
Admin> LOAD MYSQL QUERY RULES TO RUNTIME;
Admin> LOAD MYSQL VARIABLES TO RUNTIME;
Admin> LOAD ADMIN VARIABLES TO RUNTIME;
Admin> SELECT * FROM runtime_mysql_servers;

Grant the root user remote access from ProxySQL:
docker exec -it mysql-container bash
mysql -u root -p -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' WITH GRANT OPTION; FLUSH PRIVILEGES;"Access Sysbench container:
```bash
docker exec -it sysbench-debian bash
```

Run a benchmark (reads via ProxySQL):
sysbench /employees_read.lua \
--mysql-host=proxysql \
--mysql-port=6033 \
--mysql-user=root \
--mysql-password=root \
--mysql-db=employees \
--threads=$(nproc) \
--time=60 \
--report-interval=10 \
  run

Inside the ProxySQL admin interface (port 6032):
Admin> SELECT * FROM stats_mysql_connection_pool;
Admin> SELECT * FROM stats_mysql_query_digest;
Admin> SELECT * FROM stats_mysql_commands_counters;
Admin> SELECT * FROM mysql_server_connect_log ORDER BY time_start_us DESC LIMIT 5;

ProxySQL can maintain persistent connections to MySQL and reuse them for new clients, improving throughput and latency.
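You can watch this pooling behavior live while a benchmark runs. A sketch using the admin credentials from this guide:

```bash
# Refresh connection-pool statistics every 2 seconds during a sysbench run.
watch -n 2 'docker exec proxysql mysql -u admin -padmin -h127.0.0.1 -P6032 -N -B \
  -e "SELECT hostgroup, srv_host, ConnUsed, ConnFree, ConnOK FROM stats_mysql_connection_pool;"'
```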
Below are recommended tuning variables for heavy load testing:
| Parameter | Description | Recommended |
|---|---|---|
| `mysql-threads` | Number of worker threads | Set to `nproc` (e.g., 4–8) |
| `mysql-max_connections` | Max client connections | 2–4× MySQL's `max_connections` |
| `mysql-default_query_timeout` | Query timeout in ms | 36000000 (10 h) |
| `mysql-ping_interval_server_msec` | Health check frequency | 10000 (10 s) |
| `mysql-monitor_connect_interval` | Backend connectivity test interval | 200000 (200 s) |
| `mysql-monitor_ping_interval` | Backend ping interval | 200000 (200 s) |
| `mysql-monitor_history` | Monitor history retention | 60000 |
| `mysql-connection_max_age_ms` | Max age of pooled connections before refresh | 1800000 (30 m) |
| `mysql-max_transactions_per_connection` | Reuse limit per connection | 10000 |
| `mysql-session_idle_timeout` | Idle session close timeout | 300000 (5 m) |
To modify at runtime:
Admin> UPDATE global_variables SET variable_value='1800000' WHERE variable_name='mysql-connection_max_age_ms';
Admin> LOAD MYSQL VARIABLES TO RUNTIME;
Admin> SAVE MYSQL VARIABLES TO DISK;

| Action | Command |
|---|---|
| Check backend status | Admin> SELECT * FROM mysql_servers; |
| Check connection pool stats | Admin> SELECT * FROM stats_mysql_connection_pool; |
| Check live traffic | Admin> SELECT * FROM stats_mysql_query_digest ORDER BY sum_time DESC LIMIT 10; |
| Validate MySQL privileges | mysql -h mysql-container -u root -p |
- ProxySQL helps handle burst connections by pooling MySQL sessions.
- Sysbench results may show higher TPS and lower connection latency compared to direct MySQL connection.
- It is ideal for production-like testing where connection reuse and load balancing behavior must be simulated.
After running your Sysbench benchmark through ProxySQL, you will get reports of TPS, latency, and connection metrics. Understanding these is crucial to identify bottlenecks and tune the system.
Click to view Steps and Guide
Threads started: 8
Time limit: 60s
...
transactions: 12000 (200.00 per sec)
queries: 24000 (400.00 per sec)
ignored errors: 0
reconnects: 0
| Metric | Meaning | Interpretation |
|---|---|---|
| `transactions/sec (TPS)` | Completed transactions per second | Higher is better; indicates throughput |
| `queries/sec` | Queries per second | Sum of all SQL statements executed; the proxy may reduce latency |
| `avg latency` | Average time per query (ms) | Lower latency indicates efficient pooling and connection reuse |
| `min/max latency` | Min/max response time | Spikes in max latency may indicate contention or slow queries |
| `reconnects` | Number of new connections due to pool exhaustion | Should ideally be 0 with ProxySQL pooling |
| `ignored errors` | Failed queries | Non-zero indicates issues in SQL, permissions, or load |
SHOW GLOBAL STATUS LIKE 'Aborted_%';
SHOW STATUS LIKE 'Connections';
SHOW VARIABLES LIKE 'max_connections';

| Variable | Meaning | What to Watch |
|---|---|---|
| `Aborted_clients` | Clients disconnected improperly | High values indicate client timeouts or crashes |
| `Aborted_connects` | Failed connection attempts | Should be minimal; ProxySQL helps reduce these |
| `Connections` | Total connection attempts | Compare with `max_connections` to see if limits are reached |
| `max_connections` | MySQL max concurrent connections | Ensure ProxySQL's `max_connections` does not exceed this by too much |
Inside ProxySQL admin interface:
Admin> SELECT * FROM stats_mysql_connection_pool;
Admin> SELECT * FROM stats_mysql_query_digest ORDER BY sum_time DESC LIMIT 10;
Admin> SELECT * FROM stats_mysql_commands_counters;

| Table | Key Metrics | Interpretation |
|---|---|---|
| `stats_mysql_connection_pool` | `srv_hostgroup`, `hostgroup`, `status`, `used_connections` | High used_connections vs. max_connections → pool may need tuning |
| `stats_mysql_query_digest` | Query `sum_time`, `count_star`, `avg_time` | Shows which queries are slow; identify bottlenecks |
| `stats_mysql_commands_counters` | Counters for SELECT, INSERT, UPDATE | Helps analyze read/write load distribution |
| Bottleneck | Symptom | Recommended Fix |
|---|---|---|
| Connection limits | High reconnects or Aborted_connects |
Increase ProxySQL pool, check MySQL max_connections |
| Slow queries | High avg_time or max_time |
Optimize SQL, add indexes, adjust ProxySQL rules |
| CPU saturation | Low TPS despite low latency | Reduce threads or increase CPU resources |
| ProxySQL saturation | High used_connections |
Increase threads or tune connection pool parameters |
- ProxySQL Thread Count (note: in most ProxySQL versions, `mysql-threads` takes effect only after a ProxySQL restart):

  ```sql
  Admin> SET mysql-threads = 8;
  Admin> LOAD MYSQL VARIABLES TO RUNTIME;
  ```

- Connection Pool Size

  ```sql
  Admin> UPDATE global_variables SET variable_value=4096 WHERE variable_name='mysql-max_connections';
  Admin> LOAD MYSQL VARIABLES TO RUNTIME;
  ```

- Connection Lifetime and Idle Timeout

  ```sql
  Admin> UPDATE global_variables SET variable_value=1800000 WHERE variable_name='mysql-connection_max_age_ms';
  Admin> UPDATE global_variables SET variable_value=300000 WHERE variable_name='mysql-session_idle_timeout';
  Admin> LOAD MYSQL VARIABLES TO RUNTIME;
  ```

- Query Routing / Rules: Optimize query rules for heavy-read tables (e.g., route read-only queries to a dedicated hostgroup).
If a 60-second read benchmark shows:
TPS: 2000
Avg Latency: 25 ms
Reconnections: 0
Aborted_connects: 5
Interpretation:
- TPS is decent → ProxySQL pooling is effective
- Latency is acceptable → queries served quickly
- 0 reconnects → pool sizing is correct
- 5 `Aborted_connects` → minor transient failures; monitor and check the logs
This workflow allows you to benchmark, monitor, and tune your MySQL + ProxySQL setup in a repeatable and controlled Docker environment.
Click to view the Cheat Sheet
+---------------------------+
| Sysbench Container |
| (Load Generator Client) |
| Benchmarking Lua scripts |
+-------------+-------------+
|
| TCP 6033 (ProxySQL listener)
v
+---------------------------+
| ProxySQL Layer |
| Client-facing MySQL proxy|
| Admin Interface: 6032 |
| Listener Port: 6033 |
| Connection Pooling |
| Query Routing Rules |
+-------------+-------------+
|
| TCP 3306 (MySQL port)
v
+---------------------------+
| MySQL Container |
| Backend Database: employees |
| Max Connections: 151 |
| Aborted Connects/Clients |
+---------------------------+
| Component | Port | Purpose |
|---|---|---|
| Sysbench | — | Runs benchmark Lua scripts |
| ProxySQL Admin | 6032 | Configuration & monitoring |
| ProxySQL Listener | 6033 | Client connections (Sysbench) |
| MySQL | 3306 | Backend database |
- TPS (Transactions/sec)
- Queries/sec
- Avg / Min / Max latency
- Reconnects
- Ignored errors
```sql
SHOW STATUS LIKE 'Connections';
SHOW GLOBAL STATUS LIKE 'Aborted_%';
SHOW VARIABLES LIKE 'max_connections';
```
- `stats_mysql_connection_pool` → used connections, hostgroup
- `stats_mysql_query_digest` → slowest queries
- `stats_mysql_commands_counters` → query type counts
| Parameter | Recommended Value | Purpose |
|---|---|---|
| mysql-threads | nproc |
Worker threads |
| mysql-max_connections | 2–4× MySQL max_connections | Max pooled client connections |
| mysql-connection_max_age_ms | 1800000 (30m) | Recycle pooled connections |
| mysql-session_idle_timeout | 300000 (5m) | Close idle connections |
| mysql-max_transactions_per_connection | 10000 | Max reuse per connection |
- Sysbench container generates load (Lua script queries).
- ProxySQL intercepts queries:
  - Pools connections
  - Applies query routing rules
  - Manages connection limits
- MySQL executes queries and returns results through ProxySQL.
- Monitor TPS, latency, and connection stats in Sysbench, MySQL, and ProxySQL.
Tip: Always validate:
- MySQL user has privileges for ProxySQL connections
- ProxySQL config is loaded into runtime (`LOAD ... TO RUNTIME`)
- Connection pool is sized correctly for the expected load