# 📊 Automating the Collection of EXPLAIN Plans and Runtime Metrics in Db2 LUW
This notebook automates the process of running a SQL query in Db2 and capturing its execution details using EXPLAIN tables and activity event monitors. The workflow includes:

1. **Setup of Activity Event Monitor:** Configures an event monitor in Db2 to collect runtime metrics such as execution time and resource usage during query execution.
2. **Creation of EXPLAIN Tables:** Sets up EXPLAIN tables in Db2 to store detailed execution plans for SQL queries.
3. **Query Execution:** Runs a SQL query to gather execution metrics and analyze its performance.
4. **EXPLAIN Plan Generation:** Captures the execution plan of the query from the recorded activity to show how Db2 processed it.
5. **Exporting Data:** Exports the collected EXPLAIN and activity monitor data as CSV files for external analysis or reporting.

For detailed setup instructions and guidance on running this notebook, refer to the [README.md](./README.md) file in the same directory.

This notebook provides a fully automated approach for analyzing query performance and gathering runtime metrics, helping database administrators and developers optimize their queries effectively.


In [1]:
import os
from dotenv import dotenv_values

Loading Db2 Magic Commands Notebook Extension

In [2]:
# Enable Db2 Magic Commands Extensions for Jupyter Notebook
if not os.path.isfile('db2.ipynb'):
    os.system('wget https://raw.githubusercontent.com/IBM/db2-jupyter/master/db2.ipynb')
%run db2.ipynb



Db2 Extensions Loaded. Version: 2024-09-16


Connect to Db2

In [3]:
db2creds = dotenv_values('.env')
%sql CONNECT CREDENTIALS db2creds

Connection successful. tpcds @ localhost 


In [4]:
%sql CALL ADMIN_CMD('UPDATE DATABASE CONFIGURATION USING SECTION_ACTUALS BASE')

a. Deactivate event monitor, `ACTEVMON`

In [5]:
%sql SET EVENT MONITOR ACTEVMON STATE 0

Command completed.


b. Drop Existing Tables for event monitor `ACTEVMON`

In [6]:
%%sql -q
DROP TABLE ACTIVITYMETRICS_ACTEVMON;
DROP TABLE ACTIVITYSTMT_ACTEVMON;
DROP TABLE ACTIVITYVALS_ACTEVMON;
DROP TABLE ACTIVITY_ACTEVMON;
DROP TABLE CONTROL_ACTEVMON;

c. Install Explain Tables

In [7]:
%%sql
CALL SYSINSTALLOBJECTS('EXPLAIN', 'D', NULL, 'DB2INST1');
CALL SYSINSTALLOBJECTS('EXPLAIN', 'C', NULL, 'DB2INST1');

Command completed.


d. Alter Workload to Collect Activity Data


In [8]:
%sql ALTER WORKLOAD SYSDEFAULTUSERWORKLOAD COLLECT ACTIVITY DATA ON ALL WITH DETAILS, SECTION

Command completed.


e. Drop and Re-create the Event Monitor, `ACTEVMON`

In [9]:
%%sql 
DROP EVENT MONITOR ACTEVMON;
CREATE EVENT MONITOR ACTEVMON FOR ACTIVITIES WRITE TO TABLE;

Command completed.


f. Activate event monitor

In [10]:
%sql SET EVENT MONITOR ACTEVMON STATE 1

Command completed.


In [11]:
%sql CALL WLM_SET_CLIENT_INFO(NULL,NULL,NULL,'queryid1',NULL)

[None, None, None, 'queryid1', None]
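The fixed tag `'queryid1'` is later used to find the activity through `TPMON_ACC_STR`, so rows left over from an earlier run with the same tag could also match. One way to avoid that is to generate a fresh tag per run. The sketch below is an illustration only: `make_query_tag` and `query_tag` are hypothetical names that do not appear in the notebook.

```python
import uuid


def make_query_tag(prefix: str = "queryid") -> str:
    """Return a short tag that is very unlikely to repeat between runs.

    Hypothetical helper: the notebook itself uses the fixed value 'queryid1'.
    """
    # 12 hex characters from a random UUID keep the tag short but distinctive.
    return f"{prefix}-{uuid.uuid4().hex[:12]}"


query_tag = make_query_tag()
print(query_tag)
```

Assuming IPython expands `{query_tag}` in line magics the same way this notebook relies on for `%sql {query}`, the tag could then be passed as `%sql CALL WLM_SET_CLIENT_INFO(NULL,NULL,NULL,'{query_tag}',NULL)` and reused in the later `TPMON_ACC_STR` filter.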

# 🔍 Enter Your Query Below — The Query for Which You Want to Collect EXPLAIN Plan and Runtime Metrics

In [12]:
%%capture
%%sql -q
SELECT STORE_SALES.SS_WHOLESALE_COST, STORE_RETURNS.SR_NET_LOSS
FROM STORE_SALES
INNER JOIN STORE_RETURNS
ON STORE_RETURNS.SR_TICKET_NUMBER = STORE_SALES.SS_TICKET_NUMBER
AND STORE_RETURNS.SR_ITEM_SK = STORE_SALES.SS_ITEM_SK
WHERE STORE_SALES.SS_STORE_SK = 2
AND STORE_RETURNS.SR_RETURNED_DATE_SK = 2451680

# Deactivate the event monitor to ensure its data is written to the activity tables

In [13]:
%sql SET EVENT MONITOR ACTEVMON STATE 0

Command completed.


In [14]:
result = %sql SELECT a.APPL_ID, a.UOW_ID, a.ACTIVITY_ID \
    FROM ACTIVITY_ACTEVMON a \
    WHERE a.ACTIVITY_TYPE = 'READ_DML' AND a.TPMON_ACC_STR = 'queryid1'
print(result)

                        APPL_ID  UOW_ID  ACTIVITY_ID
0  127.0.0.1.53114.250223220851      78            1


In [15]:
appl_id = result['APPL_ID'].iloc[0]
uow_id = result.at[0, 'UOW_ID'].item()
activity_id = result.at[0, 'ACTIVITY_ID'].item()
event_monitor = 'ACTEVMON'
schema = 'DB2INST1'

print('appl_id: ', appl_id)
print('uow_id: ', uow_id)
print('activity_id: ', activity_id)

appl_id:  127.0.0.1.53114.250223220851
uow_id:  78
activity_id:  1
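The cell above reads row 0 without checking how many rows the lookup returned. A small guard can make an empty or ambiguous result fail with a clear message. This is a sketch under assumptions: `pick_activity` is a hypothetical helper, and it works on plain dictionaries so it can run without a database.

```python
def pick_activity(rows):
    """Return (appl_id, uow_id, activity_id) from exactly one matching row.

    Hypothetical helper; `rows` is a list of dicts keyed by column name.
    """
    if not rows:
        raise LookupError("no activity matched the accounting string")
    if len(rows) > 1:
        raise LookupError(f"expected one activity, found {len(rows)}")
    row = rows[0]
    # Cast the numeric identifiers to plain Python ints.
    return str(row["APPL_ID"]).strip(), int(row["UOW_ID"]), int(row["ACTIVITY_ID"])


# Sample row copied from the output shown above.
sample_rows = [
    {"APPL_ID": "127.0.0.1.53114.250223220851", "UOW_ID": 78, "ACTIVITY_ID": 1}
]
picked = pick_activity(sample_rows)
print(picked)
```

With the pandas result used in this notebook, the rows could be obtained with `result.to_dict("records")`.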


In [16]:
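%%capture explain_output
# Build the CALL statement; the trailing nulls stand in for the output parameters
sql = f'''"CALL EXPLAIN_FROM_ACTIVITY('{appl_id}', '{uow_id}', '{activity_id}', '{event_monitor}', '{schema}', null, null, null, null, null)"'''
# Connect the Db2 command line session to the database before issuing the CALL
_ = ! db2 "connect to TPCDS" 

# Run the CALL through the db2 command and keep its printed output as a list of lines
explain = %system db2 {sql}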
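The CALL statement in the cell above is assembled by pasting values straight into an f-string. A slightly more defensive variant escapes string values and passes the two identifiers as integers. This is a sketch, not the notebook's method: `sql_literal` and `build_explain_call` are hypothetical helpers, and the trailing `NULL` placeholders simply mirror the original statement.

```python
def sql_literal(value: str) -> str:
    """Quote a string as an SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_explain_call(appl_id, uow_id, activity_id, event_monitor, schema):
    """Assemble the EXPLAIN_FROM_ACTIVITY call used in this notebook.

    Hypothetical helper; int() rejects non-numeric identifiers early.
    """
    return (
        "CALL EXPLAIN_FROM_ACTIVITY("
        f"{sql_literal(appl_id)}, {int(uow_id)}, {int(activity_id)}, "
        f"{sql_literal(event_monitor)}, {sql_literal(schema)}, "
        "NULL, NULL, NULL, NULL, NULL)"
    )


call_sql = build_explain_call(
    "127.0.0.1.53114.250223220851", 78, 1, "ACTEVMON", "DB2INST1"
)
print(call_sql)
```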

In [17]:
explain

['',
 '  Value of output parameters',
 '  --------------------------',
 '  Parameter Name  : EXPLAIN_SCHEMA',
 '  Parameter Value : DB2INST1',
 '',
 '  Parameter Name  : EXPLAIN_REQUESTER',
 '  Parameter Value : DB2INST1',
 '',
 '  Parameter Name  : EXPLAIN_TIME',
 '  Parameter Value : 2025-02-23-14.09.05.892652',
 '',
 '  Parameter Name  : SOURCE_NAME',
 '  Parameter Value : SYSSH200',
 '',
 '  Parameter Name  : SOURCE_SCHEMA',
 '  Parameter Value : NULLID  ',
 '',
 '  Parameter Name  : SOURCE_VERSION',
 '  Parameter Value : ',
 '',
 '  Return Status = 0']

In [18]:
# Initialize variables
explain_time = None
source_name = None
source_schema = None

# Each "Parameter Name" line is followed by its "Parameter Value" line,
# so walk the output as (name line, value line) pairs
for name_line, value_line in zip(explain, explain[1:]):
    value = value_line.split(":", 1)[-1].strip()
    if "EXPLAIN_TIME" in name_line:
        explain_time = value
    elif "SOURCE_NAME" in name_line:
        source_name = value
    elif "SOURCE_SCHEMA" in name_line:
        source_schema = value

# Print extracted values
print("EXPLAIN_TIME:", explain_time)
print("SOURCE_NAME:", source_name)
print("SOURCE_SCHEMA:", source_schema)

EXPLAIN_TIME: 2025-02-23-14.09.05.892652
SOURCE_NAME: SYSSH200
SOURCE_SCHEMA: NULLID
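The same name/value layout can also be parsed into a dictionary in one pass, which keeps every output parameter rather than only three. The function below is a minimal sketch; `parse_output_parameters` is a hypothetical name, and the sample lines are copied from the output of cell 17.

```python
def parse_output_parameters(lines):
    """Map each 'Parameter Name' to the 'Parameter Value' that follows it."""
    params = {}
    pending_name = None
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # headers, blank lines and 'Return Status = 0'
        key = key.strip()
        if key == "Parameter Name":
            pending_name = value.strip()
        elif key == "Parameter Value" and pending_name is not None:
            params[pending_name] = value.strip()
            pending_name = None
    return params


sample_output = [
    "",
    "  Value of output parameters",
    "  --------------------------",
    "  Parameter Name  : EXPLAIN_TIME",
    "  Parameter Value : 2025-02-23-14.09.05.892652",
    "",
    "  Parameter Name  : SOURCE_NAME",
    "  Parameter Value : SYSSH200",
    "",
    "  Parameter Name  : SOURCE_SCHEMA",
    "  Parameter Value : NULLID  ",
    "",
    "  Parameter Name  : SOURCE_VERSION",
    "  Parameter Value : ",
    "",
    "  Return Status = 0",
]
parsed = parse_output_parameters(sample_output)
print(parsed)
```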


In [19]:
explain_time

'2025-02-23-14.09.05.892652'

In [20]:
from dotenv import dotenv_values

# Load environment variables from the .env file
db2creds = dotenv_values('.env')

# Extract the database name
database_name = db2creds.get("database")  # Use .get() to avoid KeyError if the key is missing

# Print the extracted database name
print("Database Name:", database_name)

Database Name: tpcds


# Generate Explain and Export the Explain output

In [21]:
import os
import shutil

# Define the output directory path
outputdir = os.path.join(os.getcwd(), "output")  # "output" directory in the current working directory

# If the directory exists, delete its contents; otherwise, create it
if os.path.exists(outputdir):
    # Remove all contents inside the directory
    for filename in os.listdir(outputdir):
        file_path = os.path.join(outputdir, filename)
        if os.path.isfile(file_path) or os.path.islink(file_path):
            os.unlink(file_path)  # Remove files and symlinks
        elif os.path.isdir(file_path):
            shutil.rmtree(file_path)  # Remove subdirectories
else:
    os.makedirs(outputdir)

# Define output file
explain_output = f"{explain_time}.txt"

db2exfmt_cmd = f'''db2exfmt -d "{database_name}" -w "{explain_time}" -n "{source_name}" -s "{source_schema}" -# 0 -o "{outputdir}/{explain_output}"'''
print(db2exfmt_cmd)

# Execute the command (requires db2exfmt on the PATH of the notebook's shell environment)
os.system(db2exfmt_cmd)

db2exfmt -d "tpcds" -w "2025-02-23-14.09.05.892652" -n "SYSSH200" -s "NULLID" -# 0 -o "/home/db2inst1/db2-labs/explain/single-query/output/2025-02-23-14.09.05.892652.txt"


DB2 Universal Database Version 11.5, 5622-044 (c) Copyright IBM Corp. 1991, 2019
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

Connect to Database Successful.


Connecting to the Database.


Output is in /home/db2inst1/db2-labs/explain/single-query/output/2025-02-23-14.09.05.892652.txt.
Executing Connect Reset -- Connect Reset was Successful.


0
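The command above is built as a single shell string. An alternative is to build it as an argument list, which avoids shell quoting problems if a value ever contains spaces or quotes. The sketch below uses only the `db2exfmt` options already shown in this notebook; `build_db2exfmt_args` is a hypothetical helper, and the block only prints the command rather than running it.

```python
import os
import shlex


def build_db2exfmt_args(database, explain_time, source_name, source_schema, output_path):
    """Return the db2exfmt invocation as a list of arguments."""
    return [
        "db2exfmt",
        "-d", database,
        "-w", explain_time,
        "-n", source_name,
        "-s", source_schema,
        "-#", "0",
        "-o", output_path,
    ]


args = build_db2exfmt_args(
    "tpcds",
    "2025-02-23-14.09.05.892652",
    "SYSSH200",
    "NULLID",
    os.path.join("output", "2025-02-23-14.09.05.892652.txt"),
)
# shlex.join renders the list as a copy-pasteable shell command for logging.
print(shlex.join(args))
```

On a machine where `db2exfmt` is available, the list could be passed to `subprocess.run(args, check=True)` so that a non-zero exit status raises an exception instead of being returned silently.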

In [22]:
import json
# Load the dictionary from the JSON file
with open("export_sql.json", "r") as json_file:
    export_sql_statements = json.load(json_file)

print("SQL statements have been loaded from export_sql.json")


SQL statements have been loaded from export_sql.json
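The contents of `export_sql.json` are not shown in this notebook. The export loop further down iterates over it with `.items()`, so it is presumably a JSON object that maps a table name to the query used to export that table. The sketch below illustrates that shape only: the table list, the schema, and the `SELECT *` statements are assumptions, not the real file.

```python
import json

# Assumed values for illustration; the real export_sql.json may differ.
SCHEMA = "DB2INST1"
TABLES = [
    "ACTIVITY_ACTEVMON",
    "ACTIVITYSTMT_ACTEVMON",
    "EXPLAIN_INSTANCE",
    "EXPLAIN_OPERATOR",
]


def build_export_statements(schema, tables):
    """Map each table name to a simple query that selects all of its rows."""
    return {table: f"SELECT * FROM {schema}.{table}" for table in tables}


export_sql_example = build_export_statements(SCHEMA, TABLES)

# Round-trip through JSON text, the format json.load() reads in the cell above.
serialized = json.dumps(export_sql_example, indent=2)
restored = json.loads(serialized)
print(serialized)
```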


# Export EXPLAIN and ACTIVITY tables as CSV

In [23]:
import os

# Define the output directory path
outputdir = os.path.join(os.getcwd(), "explain")  # Creates "explain" directory in the current working directory

# Ensure the output directory exists
os.makedirs(outputdir, exist_ok=True)

# Loop through each table in the dictionary
for table_name, query in export_sql_statements.items():
    print(f"Processing table: {table_name}")

    # Execute the dynamically generated SQL using %sql magic
    df_result = %sql {query}

    # %sql already returns the result as a pandas DataFrame, so no conversion is needed

    # Save to CSV in the "explain" directory, omitting the DataFrame index
    output_file_path = os.path.join(outputdir, f"{table_name}.csv")
    df_result.to_csv(output_file_path, index=False)

    print(f"Saved {table_name} data to {output_file_path}")

Processing table: ACTIVITYSTMT_ACTEVMON
Saved ACTIVITYSTMT_ACTEVMON data to /home/db2inst1/db2-labs/explain/single-query/explain/ACTIVITYSTMT_ACTEVMON.csv
Processing table: ACTIVITY_ACTEVMON
Saved ACTIVITY_ACTEVMON data to /home/db2inst1/db2-labs/explain/single-query/explain/ACTIVITY_ACTEVMON.csv
Processing table: EXPLAIN_ACTUALS
Saved EXPLAIN_ACTUALS data to /home/db2inst1/db2-labs/explain/single-query/explain/EXPLAIN_ACTUALS.csv
Processing table: EXPLAIN_INSTANCE
Saved EXPLAIN_INSTANCE data to /home/db2inst1/db2-labs/explain/single-query/explain/EXPLAIN_INSTANCE.csv
Processing table: EXPLAIN_OBJECT
Saved EXPLAIN_OBJECT data to /home/db2inst1/db2-labs/explain/single-query/explain/EXPLAIN_OBJECT.csv
Processing table: EXPLAIN_OPERATOR
Saved EXPLAIN_OPERATOR data to /home/db2inst1/db2-labs/explain/single-query/explain/EXPLAIN_OPERATOR.csv
Processing table: EXPLAIN_PREDICATE
Saved EXPLAIN_PREDICATE data to /home/db2inst1/db2-labs/explain/single-query/explain/EXPLAIN_PREDICATE.csv
Processi

In [24]:
%sql CONNECT RESET

Connection closed.
