Skip to content

Scale Up and Scale Out

EdVassie edited this page Sep 7, 2021 · 13 revisions
Previous Development, Test and Production Environments SSAS and PowerBI Optimisation Next

Choosing when and how to Scale Up and to Scale Out are important decisions in optimising the financial and computing resources devoted to the SQL Server estate. This section gives advice on how to do this, using the following scenarios that are based on the number of CPU cores that need a SQL License:

Scenario Description
Single Server Lowest possible cost for a small (4-core) SQL License
Clustered SQL with dAG Entry-level low RTO for a 8-core SQL License
Non-Clustered SQL with dAG Initial scale-out for a 16-core SQL License
Windows Stretch Cluster Expandable scale-out for a 32-core SQL License
Large Scale-Out Scenarios Asymmetric scale-out for an over 64-core SQL License

The architectural options shown are intended to provide the best possible service to the business for the minimum cost, and show how SQL FineBuild can deliver the SQL Server elements needed.

  • All scenarios are designed to minimise data loss and therfore have a low RPO
  • All options apart from Single Server are designed to allow the shortest possible RTO by using Always On capability
  • The best option to choose depends greatly on the number of cores the organisation needs to license
  • The options shown assume the workload has already been divided into physical Roles (as described in Design Site SQL Server Estate)

The cost of minimising RPO and RTO has declined greatly since 2015. Prior to that the cost of a 5-second RPO could easily be 10 times that for a 5-minute RPO, and a 0.5 second RPO easily cost over 100 times that for a 5-minute RPO. The capabilities of SQL Server Availability Groups from SQL 2016 onwards has changed this cost basis. The latest SQL Server version can provide a RPO of under 0.5 seconds at a cost only perhaps twice that of a 5-minute RPO.

The RTO can be similarly reduced, but the automation needed for even a 5-second RTO is often found to get triggered by false positives. Many organisations re-phrase the RTO to cover the time between making a management decision to invoke BCO and the time it is delivered.

A balanced server will normally be provisioned with between 8GB and 16GB per CPU core, although some situations safely allow a higher memory allocation.

  • Too little memory will normally result in a low CPU utilisation, giving a poor return for the costs of SQL Server Licensing
  • Too much memory will normally result in SQL Server not making use of all the memory that has been provided

Each organisation will need to decide which option is best for them, and it is likely that different service tiers will use a different option. If necessary an organisation should develop a solution specific to their requirements.

Top


Single Server

This option provides BCO capability at the lowest cost, but it also has the most limited capabilities in that only a single server is active at any given time. The characteristics of this option are:

  • Provides low RPO but RTO could take an hour
  • Requires SQL Express Edition or above
  • Requires 2 data centres
  • Best suited for a small 1-core to 4-core SQL License
  • Disadvantage: Limited resilience and availability, difficult to use data centres as Peers
  • Disadvantage: Requires Hyper-V replication to provide BCO capability
  • Advantage: Uses 100% of SQL Server capacity

Single Server Configuration Details

The following diagram shows details of this configuration:

Single Server Architecture

The key design points that allow this configuration to work are:

  • The limitations of a non-Enterprise SQL Server License greatly restrict the reliability and availability options that are possible
  • All Licensing is allocated to a single SQL Server instance
  • The number of cores needed to be licensed must cover all write and all of of the combined workload for SQL read and SSAS, SSIS, PowerBI, etc
  • Design patterns for optimising SSAS and PowerBI workload are given in SSAS and PowerBI Optimisation

Where an organisation needs only 4 cores to run a given tier of its database stack and does not need the capabilities that Always-On offers, this option is possibly the best technical solution available. However the reliability and availability is limited, and Hyper-V replication is required to provide BCO capability. Although it is possible to cluster Standard Edition to improve reliability, this is not possible with lower editions of SQL Server. Where 8 or more cores are required to support the workload, or where low RTO and BCO resilience is needed, this is best served using Clustered SQL with dAG.

It is recommended to provision all servers as Hyper-V guests (other hypervisors are available). This allows the following capabilities:

  • Live Migration can be used to move a given Windows image to alternative hardware within the Hyper-V Host Cluster
  • Hyper-V replication can maintain a backup copy of the server image and all data even when the host runs Windows 2019 Standard Edition

Single Server FineBuild Parameters

The following SQL FineBuild parameters will are needed to provide Single Server capability:

Parameter Value Description
/SetupSQLDB: Yes Install SQL Server
/SetupSQLAS: Yes Install SSAS
/SetupSSIS: Yes Install SSIS
/SetupAlwaysOn: No Do not configure Always On for the server

Top


Clustered SQL with dAG

This option provides the entry-level build for a low RTO. It makes use of separate Windows and SQL Server Clusters at each data centre, both running Availability Groups and joined by a Distributed Availability Group (dAG). It provides a high level of resilience and allows the two data centres to act as Peers. The characteristics of this option are:

  • Provides low RPO and RTO
  • Requires SQL Enterprise Edition
  • Requires SQL 2017 or above
  • Requires 2 data centres
  • Requires separate Windows cluster at each data centre
  • Requires Clustered SQL Server at each data centre
  • Separate Write and Read connections lay the foundation to use ApplicationIntent=ReadOnly for Scale-Out
  • Hyper-V replication can optionally provide an additional layer of protection
  • Best suited to a 8-core to 12-core SQL License, but can be used with a 4-core SQL License
  • Deployed as a totally symmetrical configuration across two data centres, with each Windows Cluster running Active/Passive and the two clusters joined by the Active/Passive dAG
  • The option design facilitates keeping to license restrictions by allowing only a single SQL instance to host a workload at any one time
    • All workload is processed only on the Primary node of the Availability Group on the active Windows cluster
  • Advantage: High level of resilience and availability, data centres can act as Peers
  • Disadvantage: 75% of SQL Server capacity not useable

Clustered SQL with dAG Configuration Details

The following diagram shows details of this configuration. It is important to also review Always On Operation:

Clustered SQL with dAG Architecture

The key design points that allow this configuration to work are:

  • The SQL Server License allows one SQL instance to be used in an active/passive cluster, plus allows replication to a second data centre for BCO purposes
  • The passive data centre cannot run any user workload or database backups but can run Integrity Check jobs and a few other limited items
  • All Licensing is allocated to a single SQL Server instance
  • The number of cores needed to be licensed must cover all write and all of of the combined workload for SQL read and SSAS, SSIS, PowerBI, etc
  • Design patterns for optimising SSAS and PowerBI workload are given in SSAS and PowerBI Optimisation

Where an organisation needs only 8 to 12 cores to run a given tier of its database stack and wants the capabilities that Always-On offers, this option is possibly the best technical solution available. It provides excellent reliability and availability including for BCO, but only one of the four cluster nodes can be processing workload at any one time. A small organisation may fit all of its service tiers on to a single SQL instance. For an 8 to 12 core server the overhead of using only 25% of available resources is something that can be tolerated, but higher core counts are best served using Non-Clustered SQL with dAG.

It is recommended to provision all cluster nodes as Hyper-V guests (other hypervisors are available). This allows the following capabilities:

  • Live Migration can be used to move a given Windows image to alternative hardware within the Hyper-V Host Cluster
  • Hyper-V replication can maintain a backup copy of the server image where the host runs Windows 2019 Standard Edition
  • Hyper-V replication can maintain a backup copy of the server image plus all data stored on shared disks where the host runs Windows 2019 Datacentre Edition and shared disks are defined as VHD Sets

Clustered SQL with dAG FineBuild Parameters

The following SQL FineBuild parameters will are needed to provide Clustered SQL with dAG capability:

Parameter Value Description
/SetupSQLDBCluster: Yes Install SQL Server as a Cluster
/SetupSQLASCluster: Yes Install SSAS as a Cluster
/SetupSSISCluster: Yes Install SSIS as a Cluster
/SetupAlwaysOn: Yes Configure Always On for the server
/SetupAlwaysOnProcs: Yes Install SQL FineBuild Always On Maintenance Procedures
/AGdAGName: Site-Defined Name of Distributed Availability Group
/SetupPowerBI: Yes Install PowerBI Server in place of SSRS
/SetupRSAlias: Yes Set up a DNS Alias for PowerBI
/RSAlias: Site-Defined Name of RS Alias

Top


Non-Clustered SQL with dAG

This option provides the first-level of scale out as server cores increase. It makes use of separate Windows Clusters at each data centre, with each cluster node runing a Non-Clustered SQL Server instance. The SQL instances on each cluster are joined by an Availability Group, and joined to the other data centre by a Distributed Availability Group. It provides a high level of resilience and allows the two data centres to act as Peers, and also improves cost-effectiveness by allowing read-only workload to be distributed to all nodes on the active cluster. The characteristics of this option are:

  • Provides low RPO and RTO
  • Requires SQL Enterprise Edition
  • Requires SQL 2017 or above
  • Requires Windows cluster at each data centre
  • Does not use Clustered SQL Server
  • Requires separate Write and Read connections to use ApplicationIntent=ReadOnly for Scale-Out
  • Hyper-V replication can optionally provide an additional layer of protection
  • Best suited to a 12-core to 24-core SQL License deployed
  • Deployed as a totally symmetrical configuration across two data centres, with each Windows cluster running Active/Active and the two clusters joined by the Active/Passive dAG
  • The option design facilitates keeping to license restrictions by allowing only two SQL instances to host a workload at any one time
    • SQL DB write workload is processed only on the Primary node of the Availability Group on the active Windows cluster
    • SQL DB read workload can be processed on all nodes of the active Windows cluster
    • All workload for SSAS is processed only on the active node of the Windows cluster
  • Advantage: High level of resilience and availability, data centres can act as Peers
  • Disadvantage: 50% of SQL Server capacity not useable

Non-Clustered SQL with dAG Configuration Details

The following diagram shows details of this configuration. It is important to also review Always On Operation:

Non-Clustered SQL with dAG Architecture

The key design points that allow this configuration to work are:

  • The SQL Server License allows one SQL instance to be used in an active/passive cluster, plus allows replication to a second data centre for BCO purposes
  • The passive data centre cannot run any user workload or database backups but can run Integrity Check jobs and a few other limited items
  • The Lisensing is divided equally between the two nodes of the Windows cluster running the Active end of the dAG
  • The number of cores needed to be licensed for each node must cover all write and 50% of the combined workload for SQL read and SSAS, SSIS, PowerBI, etc
  • Design patterns for optimising SSAS and PowerBI workload are given in SSAS and PowerBI Optimisation

Where an organisation needs 12 to 24 cores to run a given tier of its database stack, this option provides possibly the best technical solution. As well as providing excellent reliability and availability including for BCO, it allows two out of the four cluster nodes to be processing workload at any one time. Because servers using this number of cores cost proportionally more than an 8-core server, the increase in utilisation gives a more cost-effective solution than Clustered SQL with dAG. However, higher core counts are best served using Windows Stretch Cluster.

It is recommended to provision all cluster nodes as Hyper-V guests (other hypervisors are available). This allows the following capabilities:

  • Live Migration can be used to move a given Windows image to alternative hardware within the Hyper-V Host Cluster
  • Hyper-V replication can maintain a backup copy of the server image where the host runs Windows 2019 Standard Edition
  • Hyper-V replication can maintain a backup copy of the server image plus all data stored on SSAS shared disks where the host runs Windows 2019 Datacentre Edition and shared disks are defined as VHD Sets

Non-Clustered SQL with dAG FineBuild Parameters

The following SQL FineBuild parameters will are needed to provide Non-Clustered SQL with dAG capability:

Parameter Value Description
/SetupSQLDB: Yes Install SQL Server
/SetupSQLDBCluster: No Do not install SQL Server as a Cluster
/SetupSQLASCluster: Yes Install SSAS as a Cluster
/SetupSSISCluster: No Do not install SSIS as a Cluster
/SetupAlwaysOn: Yes Configure Always On for the server
/SetupAlwaysOnProcs: Yes Install SQL FineBuild Always On Maintenance Procedures
/AGdAGName: Site-Defined Name of Distributed Availability Group
/SetupPowerBI: Yes Install PowerBI Server in place of SSRS
/SetupRSAlias: Yes Set up a DNS Alias for PowerBI
/RSAlias: Site-Defined Name of RS Alias

Top


Windows Stretch Cluster

This option makes use of a single Windows Clusters with nodes defined at all data centres, with each cluster node runing a Non-Clustered SQL Server instance. The SQL instances on the cluster are joined by an Availability Group that spans all the nodes. It provides a high level of resilience and allows the multiple data centres to act as Peers, and improves cost-effectiveness by allowing read-only workload to be distributed to all nodes at all data centres. The characteristics of this option are:

  • Provides low RPO and RTO
  • Requires SQL Enterprise Edition
  • Requires SQL 2017 or above
  • Requires Windows stretch cluster covering both data centres
  • Does not use Clustered SQL Server
  • Requires separate Write and Read connections to use ApplicationIntent=ReadOnly for Scale-Out
  • Hyper-V replication can optionally provide an additional layer of protection
  • Best suited to a 24-core to 48-core SQL License
  • Deployed as a totally symmetrical configuration deployed across two or three data centres, with all Windows cluster nodes running Active/Active and available to run workload
  • The option design facilitates keeping to license restrictions by allowing all SQL instances to host a workload at all times
    • SQL DB write workload is processed only on the Primary node of the Availability Group on the active Windows cluster
    • SQL DB read workload can be processed on all nodes of the all Windows clusters
    • All workload for SSAS is processed only on the active node of the Windows cluster
  • Advantage: Highest level of resilience and availability, data centres can act as Peers
  • Advantage: Uses 100% of SQL Server capacity

Windows Stretch Cluster Configuration Details

The following diagram shows details of this configuration. It is important to also review Always On Operation:

Stretch Cluster Architecture

The key design points that allow this configuration to work are:

  • The SQL Server License allows one SQL instance to be used in an active/passive cluster, plus allows replication to a second data centre for BCO purposes
  • The passive data centre cannot run any user workload or database backups but can run Integrity Check jobs and a few other limited items
  • The Lisensing is divided equally between all nodes of the Windows cluster that runs across all data centres
  • The number of cores needed to be licensed for each node must cover all write and the relevant portion of the combined workload for SQL read and SSAS, SSIS, PowerBI, etc
  • Design patterns for optimising SSAS and PowerBI workload are given in SSAS and PowerBI Optimisation

This allows workload to be processed as follows:

  • SQL DB write workload is processed only on the Primary node of the Availability Group on the active Windows cluster
  • SQL DB read workload for can be processed on all nodes of the active Windows cluster
  • All workload for SSAS is processed only on the active node of the Windows cluster Where an organisation needs 24 to 48 cores to run a given tier of its database stack, this option provides possibly the best technical solution. The cost of provisioning servers of this capability is proportionally higher than servers of a lower core count, which demands a solution that allows all servers to be processing workload. Additionally, there are now enough SQL cores licensed to provide a solution that allows all servers to be processing workload. However, higher core counts are best served using the asymmetrical setup given in Large Scale-Out Scenarios.

It is recommended to provision all cluster nodes as Hyper-V guests (other hypervisors are available). This allows the following capabilities:

  • Live Migration can be used to move a given Windows image to alternative hardware within the Hyper-V Host Cluster
  • Hyper-V replication can maintain a backup copy of the server image where the host runs Windows 2019 Standard Edition
  • Hyper-V replication can maintain a backup copy of the server image plus all data stored on SSAS shared disks where the host runs Windows 2019 Datacentre Edition and shared disks are defined as VHD Sets

Windows Stretch Cluster FineBuild Parameters

The following SQL FineBuild parameters will are needed to provide Windows Stretch Cluster capability:

Parameter Value Description
/SetupSQLDB: Yes Install SQL Server
/SetupSQLDBCluster: No Do not install SQL Server as a Cluster
/SetupSQLASCluster: Yes Install SSAS as a Cluster
/SetupSSISCluster: No Do not install SSIS as a Cluster
/SetupAlwaysOn: Yes Configure Always On for the server
/SetupAlwaysOnProcs: Yes Install SQL FineBuild Always On Maintenance Procedures
/SetupPowerBI: Yes Install PowerBI Server in place of SSRS
/SetupRSAlias: Yes Set up a DNS Alias for PowerBI
/RSAlias: Site-Defined Name of RS Alias

Top


Large Scale-Out Scenarios

This option makes use of a single Windows Clusters with nodes defined at all data centres, with each cluster node runing a Non-Clustered SQL Server instance. The SQL instances on the cluster are joined by an Availability Group that spans all the nodes. It provides a high level of resilience and allows the multiple data centres to act as Peers, and provides best cost-effectiveness by allowing read-only workload to be distributed to all nodes at all data centres using an asymmetrical configuration. The characteristics of this option are:

  • Provides low RPO and RTO
  • Requires SQL Enterprise Edition
  • Requires SQL 2017 or above
  • Requires Windows stretch cluster covering both data centres
  • Does not use Clustered SQL Server
  • Requires separate Write and Read connections to use ApplicationIntent=ReadOnly for Scale-Out
  • Hyper-V replication can optionally provide an additional layer of protection
  • Best suited to a more than 48-core SQL License
  • Deployed as an asymmetrical configuration deployed across three or more data centres, with all Windows cluster nodes running Active/Active and available to run workload
  • The option design facilitates keeping to license restrictions by allowing all SQL instances to host a workload at all times
    • SQL DB write workload is processed only on the Primary node of the Availability Group on the active Windows cluster
    • Sharding the most active databases allows SQL DB write workload to be distributed to the Primary node of each shard
    • SQL DB read workload can be processed on all nodes of the all Windows clusters
    • All workload for SSAS is processed only on the active node of the Windows cluster
  • Advantage: Highest level of resilience and availability, data centres can act as Peers
  • Advantage: Uses 100% of SQL Server capacity

Large Scale-Out Scenarios Configuration Details

The following diagram shows details of this configuration. It is important to also review Always On Operation:

Large Scale-Out Architecture

The key design points that allow this configuration to work are:

  • The SQL Server License allows one SQL instance to be used in an active/passive cluster, plus allows replication to a second data centre for BCO purposes
  • The passive data centre cannot run any user workload or database backups but can run Integrity Check jobs and a few other limited items
  • The Lisensing is divided equally between all nodes of the Windows cluster that runs across all data centres
  • The number of cores needed to be licensed for each node must cover all write and the relevant portion of the combined workload for SQL read and SSAS, SSIS, PowerBI, etc, plus some servers also need to have additional licences to cover the write workload
  • Design patterns for optimising SSAS and PowerBI workload are given in SSAS and PowerBI Optimisation

Where an organisation needs more than 48 cores to run a given tier of its database stack, this option provides possibly the best technical solution. The key factor governing server core count is the number of cores needed to satisfy the write workload, and this gives an opportunity for an asymmetrical design for scaleout. Where two datacentres are used to host the stretch cluster, then two SQL Server instances at each data centre must be sized to support the write workload, in order to provide resilience if one data center becomes unavailable. However when three or more datacentres are used to host the stretch cluster, then only one SQL Server instance at each data centre needs to be sized to support the write workload, as a second write-capable server will be available at another data centre if one of the data centres becomes unavailable. Aditional SQL Server instances at each data center need to be sized only large enough to support their portion of the read workload.

It may be that the write workload becomes sufficiently large that the cost of multiple write-capable servers that are used only for read workload when they are not the primary write server becomes an issue. In this situation database sharding should be used to allow the write workload to be distributed across multiple write-capable servers, which in turn decreases the size needed for each write-capable server.

It is recommended to provision all cluster nodes as Hyper-V guests (other hypervisors are available). This allows the following capabilities:

  • Live Migration can be used to move a given Windows image to alternative hardware within the Hyper-V Host Cluster
  • Hyper-V replication can maintain a backup copy of the server image where the host runs Windows 2019 Standard Edition
  • Hyper-V replication can maintain a backup copy of the server image plus all data stored on SSAS shared disks where the host runs Windows 2019 Datacentre Edition and shared disks are defined as VHD Sets

Large Scale-Out Scenarios FineBuild Parameters

The following SQL FineBuild parameters will are needed to provide Large Scale-Out Scenarios capability:

Parameter Value Description
/SetupSQLDB: Yes Install SQL Server
/SetupSQLDBCluster: No Do not install SQL Server as a Cluster
/SetupSQLASCluster: Yes Install SSAS as a Cluster
/SetupSSISCluster: No Do not install SSIS as a Cluster
/SetupAlwaysOn: Yes Configure Always On for the server
/SetupAlwaysOnProcs: Yes Install SQL FineBuild Always On Maintenance Procedures
/SetupPowerBI: Yes Install PowerBI Server in place of SSRS
/SetupRSAlias: Yes Set up a DNS Alias for PowerBI
/RSAlias: Site-Defined Name of RS Alias

Copyright FineBuild Team © 2021. License and Acknowledgements

Previous Development, Test and Production Environments Top SSAS and PowerBI Optimisation Next

Key SQL FineBuild Links:

SQL FineBuild supports:

  • All SQL Server versions from SQL 2019 through to SQL 2005
  • Clustered, Non-Clustered and Core implementations of server operating systems
  • Availability and Distributed Availability Groups
  • 64-bit and (where relevant) 32-bit versions of Windows

The following Windows versions are supported:

  • Windows 2022
  • Windows 11
  • Windows 2019
  • Windows 2016
  • Windows 10
  • Windows 2012 R2
  • Windows 8.1
  • Windows 2012
  • Windows 8
  • Windows 2008 R2
  • Windows 7
  • Windows 2008
  • Windows Vista
  • Windows 2003
  • Windows XP
Clone this wiki locally