# Using WorkloadTools to Modernize your SQL Server Instances

These demos rely on WorkloadTools, a collection of open source tools to capture, analyze, and compare SQL Server workloads.

You can download WorkloadTools from GitHub: [https://github.com/spaghettidba/workloadtools](https://github.com/spaghettidba/workloadtools), where you will also find extensive documentation.

In [None]:
# Setup: create some variables for SQL Server instances
# and paths

$SourceInstance = "SQL2016"
$TargetInstance = "SQL2019"

$currentFolder = "c:\demo\ModernizeWithWorkloadTools"

# drop the tables left in the analysis database by previous runs
Invoke-DbaQuery -SqlInstance $TargetInstance -Database "SqlWorkload01" -Query "
DROP TABLE IF EXISTS capture.Errors
DROP TABLE IF EXISTS capture.WorkloadDetails
DROP TABLE IF EXISTS capture.Applications
DROP TABLE IF EXISTS capture.Databases
DROP TABLE IF EXISTS capture.Hosts
DROP TABLE IF EXISTS capture.Logins
DROP TABLE IF EXISTS capture.Intervals
DROP TABLE IF EXISTS capture.NormalizedQueries
DROP TABLE IF EXISTS capture.PerformanceCounters
DROP TABLE IF EXISTS capture.WaitStats
DROP TABLE IF EXISTS capture.WorkloadSummary
" 

## Analyzing a Workload

With this demo we will analyze a workload to identify top resource consumers and the correlation between queries (the cause) and performance metrics (the consequence).

First of all, we need a workload. In the real world, this would be the workload from your ERP or any other production application that you want to measure. This is a demo, so there is no production workload here: we will use a synthetic workload created many years ago by Dell to benchmark their hardware. The tool is called [DVD Store](https://github.com/dvdstore/ds3/), or simply DS, and simulates the activity of an online DVD store. It shows its age (today it would be modeled on Netflix or Prime Video), but I digress...

Let's start the workload!

In [None]:
# Start the synthetic workload
start-process "C:\demo\DVDSTORE\driver_lite.cmd"

Now that the workload is running, we can use WorkloadTools to analyze all the queries submitted by the application and see how they perform over time. **SqlWorkload** is the tool that we will use. Queries are captured using the Extended Events streaming API for SQL Server or (for older versions) a server-side SQL Trace. Queries are then normalized and aggregated by hash. Finally, the performance data is written to a SQL Server database, aggregated by intervals (the duration is customizable, the default is 1 minute).
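
To make "normalized and aggregated by hash" more concrete, here is a toy sketch in plain PowerShell. It is not the algorithm WorkloadTools uses: the regular expressions, the `Get-NormalizedText` and `Get-TextHash` helpers, and the sample events are all made up for illustration. It only shows the general idea of stripping literals, hashing the result, and grouping by one-minute interval.

```powershell
# Toy illustration only: NOT the normalization used by WorkloadTools
function Get-NormalizedText {
    param([string]$Sql)
    $text = $Sql  -replace "'[^']*'", '{STR}'   # replace string literals
    $text = $text -replace '\b\d+\b', '{NUM}'   # replace numeric literals
    $text = $text -replace '\s+', ' '           # collapse whitespace
    return $text.Trim().ToUpperInvariant()
}

function Get-TextHash {
    param([string]$Text)
    $sha = [System.Security.Cryptography.SHA256]::Create()
    try {
        $bytes = $sha.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($Text))
        # keep the first 16 hex characters: enough for a demo
        return ([System.BitConverter]::ToString($bytes) -replace '-', '').Substring(0, 16)
    }
    finally { $sha.Dispose() }
}

# Hypothetical captured events
$events = @(
    [pscustomobject]@{ StartTime = [datetime]'2024-01-01 10:00:05'; Sql = "SELECT * FROM Customers WHERE id = 42"; DurationUs = 100 }
    [pscustomobject]@{ StartTime = [datetime]'2024-01-01 10:00:40'; Sql = "select * from Customers where id = 7";  DurationUs = 300 }
    [pscustomobject]@{ StartTime = [datetime]'2024-01-01 10:01:10'; Sql = "SELECT * FROM Customers WHERE id = 99"; DurationUs = 200 }
)

# Tag each event with its one-minute interval and its hash, then aggregate
$summary = $events |
    Select-Object *,
        @{ Name = 'Interval'; Expression = { $_.StartTime.ToString('yyyy-MM-dd HH:mm') } },
        @{ Name = 'SqlHash';  Expression = { Get-TextHash (Get-NormalizedText $_.Sql) } } |
    Group-Object -Property Interval, SqlHash |
    ForEach-Object {
        [pscustomobject]@{
            Interval       = $_.Group[0].Interval
            SqlHash        = $_.Group[0].SqlHash
            ExecutionCount = $_.Count
            AvgDurationUs  = ($_.Group | Measure-Object -Property DurationUs -Average).Average
        }
    }

$summary | Format-Table
```

The three statements differ only in their literal values and casing, so they share one hash and end up in two rows, one per interval.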

In [None]:
# Start WorkloadTools to run the analysis
Start-Process "$($env:PROGRAMFILES)\workloadtools\sqlworkload.exe" `
    -ArgumentList "--File", "$currentFolder\analyze\analyze.json"

In [None]:
# let's have a look at the json file for WorkloadTools
code $currentFolder\analyze\analyze.json

Once the data is in the database, you can visualize it using multiple tools:

- Custom queries
- The built-in WorkloadViewer
- The Power BI dashboard, using the included template
- Your own dashboard (Grafana, for example)

You can query the analysis database while the capture is running. There is no need to wait for the capture to end and there are no files to process: everything happens in real time.
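
One way to see the data arrive is to count the rows in the analysis tables while the capture runs. The sketch below assumes dbatools is installed (the setup cell already uses `Invoke-DbaQuery`). It reads only SQL Server catalog views, so it makes no assumption about the columns of the WorkloadTools tables. `Get-CaptureRowCount` is a hypothetical helper name.

```powershell
# Requires the dbatools module and a reachable analysis database
function Get-CaptureRowCount {
    param(
        [Parameter(Mandatory)] [string]$SqlInstance,
        [string]$Database = "SqlWorkload01",
        [string]$Schema   = "capture"
    )

    $query = "
    SELECT t.name AS table_name, SUM(p.rows) AS row_count
    FROM sys.tables AS t
    JOIN sys.partitions AS p
      ON p.object_id = t.object_id
     AND p.index_id IN (0, 1)   -- heap or clustered index only
    WHERE SCHEMA_NAME(t.schema_id) = @schema
    GROUP BY t.name
    ORDER BY t.name;"

    # @schema is passed as a query parameter, not concatenated into the text
    Invoke-DbaQuery -SqlInstance $SqlInstance -Database $Database `
        -Query $query -SqlParameter @{ schema = $Schema }
}
```

Call it as `Get-CaptureRowCount -SqlInstance $TargetInstance` a few times during the capture: the counts should grow each time an interval is uploaded.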

In [None]:
# Open WorkloadViewer on the analysis we just captured
& "$($env:PROGRAMFILES)\workloadtools\WorkloadViewer.exe" `
    --BaselineServer $TargetInstance `
    --BaselineDatabase SqlWorkload01 `
    --BaselineSchema capture 

In [None]:
# Open WorkloadViewer on the data in the "sample" schema
& "$($env:PROGRAMFILES)\workloadtools\WorkloadViewer.exe" `
    --BaselineServer $TargetInstance `
    --BaselineDatabase SqlWorkload01 `
    --BaselineSchema sample 

## Capturing a Workload

To perform a SQL Server migration, we need more than just one analysis of the workload: we will need two, in order to compare the source SQL Server instance with the destination SQL Server instance.

There are many ways to achieve this: what I will show you is the approach that I think works best.

First, we will need to capture the production workload to a file. This file will contain both the queries and the performance data. We _could_ analyze the source workload at this stage, but this is probably not what we really want. More on that later.

Here is how you capture a workload. Again, you need SqlWorkload.exe and a .json file to configure what it will do. In this case, we simply want SqlWorkload to write the workload to a SQLite database.
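
As a rough idea of what such a configuration might contain, the sketch below builds one as a PowerShell hashtable and prints it as JSON. The key names (`Controller`, `Listener`, `Consumers`, `__type`, `ExtendedEventsWorkloadListener`, `WorkloadFileWriterConsumer`, `OutputFile`) are written from memory of the WorkloadTools documentation and should be treated as assumptions: the `capture.json` shipped with the demo is the reference.

```powershell
# Values mirror the setup cell; key names are assumptions to verify
# against the WorkloadTools documentation
$captureConfig = [ordered]@{
    Controller = [ordered]@{
        # Where the queries come from: the source instance
        Listener = [ordered]@{
            '__type'       = 'ExtendedEventsWorkloadListener'
            ConnectionInfo = [ordered]@{ ServerName = 'SQL2016' }
        }
        # Where the queries go: a SQLite workload file
        Consumers = @(
            [ordered]@{
                '__type'   = 'WorkloadFileWriterConsumer'
                OutputFile = 'c:\demo\ModernizeWithWorkloadTools\capture\sqlworkload.sqlite'
            }
        )
    }
}

# -Depth is needed because the default depth would truncate the nested objects
$json = $captureConfig | ConvertTo-Json -Depth 10
$json
```

The `[ordered]` hashtables keep `__type` as the first property of each object, which is the order used in this sketch on the assumption that the deserializer may care about it.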

In [None]:
# Make sure the synthetic workload is still running
if(-not (get-process | ? {$_.name -eq "ds3sqlserverdriver"})) 
{
    start-process "C:\demo\DVDSTORE\driver_lite.cmd"
}


# Delete any files from previous runs
Remove-Item $currentFolder\capture\sqlworkload.sqlite -ErrorAction SilentlyContinue

# Start WorkloadTools to run the capture
Start-Process "$($env:PROGRAMFILES)\workloadtools\sqlworkload.exe" `
    -ArgumentList "--File", "$currentFolder\capture\capture.json"

In [None]:
# let's inspect the json file for WorkloadTools
code $currentFolder\capture\capture.json

In [None]:
# Display the contents of the sqlite file
Invoke-Item $currentFolder\capture\sqlworkload.sqlite

## Replaying a Workload

In order to replay a workload we need to have a few things ready:

1. The source workload (we can capture that with the method described above)
2. A target environment

### Preparing the Target Environment

If we want the queries to run under exactly the same conditions captured in the source workload, the database needs to be in exactly the same state. To ensure this, you need to back up the database as soon as the capture starts. SqlWorkload tries to perform a marked transaction called "WorkloadTools" on the source database before starting the capture: you can restore the database on the test environment up to that mark.
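
A restore to the mark could look like the sketch below. It assumes a full backup taken before the capture started and a log backup taken after it, and it uses the standard T-SQL `STOPATMARK` option. `Restore-ToWorkloadMark` is a hypothetical helper, and the database name and backup paths are whatever applies to your environment.

```powershell
# Requires the dbatools module. REPLACE overwrites the existing test copy
# of the database, so point this at the test environment only.
function Restore-ToWorkloadMark {
    param(
        [Parameter(Mandatory)] [string]$SqlInstance,
        [Parameter(Mandatory)] [string]$Database,
        [Parameter(Mandatory)] [string]$FullBackupPath,
        [Parameter(Mandatory)] [string]$LogBackupPath
    )

    $query = "
    RESTORE DATABASE [$Database]
        FROM DISK = N'$FullBackupPath'
        WITH NORECOVERY, REPLACE;

    -- Roll forward up to and including the marked transaction
    RESTORE LOG [$Database]
        FROM DISK = N'$LogBackupPath'
        WITH STOPATMARK = 'WorkloadTools', RECOVERY;"

    Invoke-DbaQuery -SqlInstance $SqlInstance -Database master -Query $query
}
```

If the backup files place the database on paths that do not exist on the target server, the first `RESTORE` also needs `MOVE` clauses.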

If the source database uses the simple recovery model, there are no log backups to restore up to the mark, so you will need to perform a backup manually. Also, if the target environment does not support restores from a regular backup (Azure SQL Database or Managed Instance), you will have to set the state of the database by deploying a bacpac.

This is a time-consuming task, so make sure you don't have to do it again: create a database snapshot if you're running on-prem, or use point-in-time restores to revert the database to the desired initial state.
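
For the on-prem case, dbatools offers `New-DbaDbSnapshot` and `Restore-DbaDbSnapshot`. The sketch below wraps them in two hypothetical helpers. The snapshot name is made up, and the name of the replayed database is not stated in this demo, so it is a parameter here.

```powershell
# Requires the dbatools module and an edition of SQL Server that
# supports database snapshots

# Take the snapshot once, right after the database is in its initial state
function New-ReplayBaseline {
    param(
        [Parameter(Mandatory)] [string]$SqlInstance,
        [Parameter(Mandatory)] [string]$Database,
        [string]$SnapshotName = "${Database}_initial"
    )
    New-DbaDbSnapshot -SqlInstance $SqlInstance -Database $Database -Name $SnapshotName
}

# Revert to the snapshot before each new replay
function Reset-ReplayBaseline {
    param(
        [Parameter(Mandatory)] [string]$SqlInstance,
        [Parameter(Mandatory)] [string]$SnapshotName
    )
    # -Force closes open connections to the database before reverting
    Restore-DbaDbSnapshot -SqlInstance $SqlInstance -Snapshot $SnapshotName -Force -Confirm:$false
}
```

Reverting to a snapshot is usually much faster than a full restore, which matters when you replay the same workload several times.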

In [None]:
# Set things up
Invoke-DbaQuery -SqlInstance $TargetInstance -Database "SqlWorkload03" -File "$currentFolder\replay\drop_tables.sql"

In [None]:
# Make sure the synthetic workload is still running
if(-not (get-process | ? {$_.name -eq "ds3sqlserverdriver"})) 
{
    start-process "C:\demo\DVDSTORE\driver_lite.cmd"
}

# Start WorkloadTools to analyze the workload on the target environment
Start-Process "$($env:PROGRAMFILES)\workloadtools\sqlworkload.exe" `
    -ArgumentList "--File", "$currentFolder\replay\analyze.json"

In [None]:
# At the same time, start WorkloadTools to run the replay:
# the other instance of WorkloadTools will analyze it
Start-Process "$($env:PROGRAMFILES)\workloadtools\sqlworkload.exe" `
    -ArgumentList "--File", "$currentFolder\replay\replay.json"

In [None]:
# Let's inspect the json files:
code $currentFolder\replay\replay.json
code $currentFolder\replay\analyze.json

In [None]:
# Display the workload comparison
& $currentFolder\replay\report.cmd

In [None]:
# Open Power BI to visualize the data in a different way
# Unfortunately there is no way to pass the parameters via command line
Invoke-Item "$($env:PROGRAMFILES)\workloadtools\reports\WorkloadTools Report - Template.pbit"

## Real-time Replay

Sometimes the workload is too big to think about capturing it before the replay. This can happen when the business cycle is very long (1 week, 1 month) and you want to make sure that you captured all the queries in the workload.

In these cases, you can set up a real-time replay: the queries are replayed against the target environment as soon as they are captured and, as usual, another instance of SqlWorkload runs to analyze the queries in real time.
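
In configuration terms, a real-time replay pairs a listener on the source instance with a replay consumer on the target, with no workload file in between. The sketch below builds such a configuration as a PowerShell hashtable and prints it as JSON. The key names (`ExtendedEventsWorkloadListener`, `ReplayConsumer`, `ConnectionInfo`, `ServerName`, `DatabaseName`) are recalled from the WorkloadTools documentation and are assumptions to check against the demo's own `replay.json`. The database name `DS3` is a placeholder.

```powershell
# Key names are assumptions to verify against the WorkloadTools
# documentation; 'DS3' is a placeholder database name
$realtimeConfig = [ordered]@{
    Controller = [ordered]@{
        # Capture from the source instance...
        Listener = [ordered]@{
            '__type'       = 'ExtendedEventsWorkloadListener'
            ConnectionInfo = [ordered]@{ ServerName = 'SQL2016' }
        }
        # ...and replay each query on the target as soon as it arrives
        Consumers = @(
            [ordered]@{
                '__type'       = 'ReplayConsumer'
                ConnectionInfo = [ordered]@{
                    ServerName   = 'SQL2019'
                    DatabaseName = 'DS3'
                }
            }
        )
    }
}

$realtimeJson = $realtimeConfig | ConvertTo-Json -Depth 10
$realtimeJson
```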

In [None]:
# Start WorkloadTools to analyze the workload on the target environment
Start-Process "$($env:PROGRAMFILES)\workloadtools\sqlworkload.exe" `
    -ArgumentList "--File", "$currentFolder\realtimereplay\analyze.json"

In [None]:
# At the same time, start WorkloadTools to run the replay:
# the other instance of WorkloadTools will analyze it
Start-Process "$($env:PROGRAMFILES)\workloadtools\sqlworkload.exe" `
    -ArgumentList "--File", "$currentFolder\realtimereplay\replay.json"

In [None]:
# Let's inspect the json files:
code $currentFolder\realtimereplay\replay.json
code $currentFolder\realtimereplay\analyze.json

In [None]:
# Display the workload comparison
& $currentFolder\realtimereplay\report.cmd

In [None]:
# Display the workload comparison
& "$($env:PROGRAMFILES)\workloadtools\WorkloadViewer.exe" `
    --BaselineServer $TargetInstance `
    --BaselineDatabase SqlWorkload_DEMO `
    --BaselineSchema baseline `
    --BenchmarkServer $TargetInstance `
    --BenchmarkDatabase SqlWorkload_DEMO `
    --BenchmarkSchema replay