SQL Server : High Availability & Disaster Recovery

[HA] High Availability -> minimum downtime [DR] Disaster Recovery -> Quick time to recover

SQL Server Transaction Log:

Purpose:

Critical to the integrity of DB
Critical for the mission of HA/DR
Critical for the operation of HA/DR

Transaction: An individual unit of work (ACID - Atomic, Consistent, Isolated, Durable)

DB Recovery:

Recovery is a process by which SQL Server ensures a DB is brought online in a consistent state after:
- Crash Recovery
- Restore from Backup

Recovery Types: [Recovery Model] in SQL Server

Simple Recovery
- Log is mainly used for crash recovery
- Transaction logs are marked inactive after checkpoint
- Supports Full Backup & Differential Backup
- Full Backup :: All data pages copied + Tail of log backup (transaction log data - at the time of backup)
- Differential Backup :: Only Changed data pages are copied (after full data backup)
- Easy to Manage
- Loss of all transactions that have occurred after last Full/Diff backup
Full Recovery
- Log supports crash recovery and log backups
- After Full Backup, transaction logs are backed up regularly (at a certain time). The transaction logs are marked active until it is backed up
- Allows us to restore DB to a specific point in time OR to a Specific LSN (Log Sequence Number)
- As backups are regular, smaller data loss windows
- Best: Full backups weekly, Differential backups nightly, Log backups every 15 minutes (backup files are moved off-site)
- Point in Time Restores
  - BACKUP DATABASE [db] TO DISK = ['full recovery path']; GO
  - BACKUP LOG [db] TO DISK = ['log backup path']; GO
  - RESTORE DATABASE [db] FROM DISK = ['full recovery path'] WITH NORECOVERY; GO
  - RESTORE LOG [db] FROM DISK = ['full recovery path'] WITH NORECOVERY [, STOPAT='08/02/2017 23:25:01']; GO
Bulk-Logged Recovery
- Offers better performance and a smaller log footprint during bulk load processes
- Recovery to the point of each log backup is supported offering small recovery windows
- Point in time recovery is not supported
- Can be switched between Full & Bulk Recovery model
File Based Backups
- Supports partial database restores
- Helpful in scenarios where only a single file or filegroup was damaged or lost as it avoids the time to do a complete db restore
- Managing file based backups and restore is a complex topic

Maintenance Plans

Helps to configure the Backups (scheduling)
DB Server > Management > Maintenance Plans

Managing Backups

SQL Agent / Custom TSQL Scripts
Ola Hallengren's Scripts *
3rd Party Management Tools Ex: RedGate, Idera, Dell and others
Store backups off site
Test your backups regularly. Assume that any backup that hasn't been tested is faulty
- Option :: Backup with Checksum > restore to DB > And run CheckDB?

Disaster Recovery from Backups:

Simple Recovery : can take a very long time to restore large databases
Full Recovery : can take even longer..unless you have a DR Server
- Full Backup > Apply to DR Server
- Log backup > Apply to DR Server
- Log backup > Apply to DR Server... (goes on at regular intervals)
- In case of Disaster, let the DR Server serve the application
- Can be automated easily: Log Shipping

Failover Clustering - High Availability

Windows Failover Clustering
SQL Server Always ON = SQL Server on Windows Fail Over Clustering
Failover Clustering = OS Redundancy
- Protection against:
  - OS System Crashes
  - Motherboard failures
  - Single Server Power Failure
  - Memory Errors
  - Blue Screens
  - Outages due to : Server Maintenance, Service Packs, Patching, Server Reboots
Failover will handle only the Server switch, Data storage should be handled by other means

Pre-Requisites

Same Architecture (32bit / 64bit)
Same OS Version
Same configuration, patch level, etc (RAM, Storage, etc..) (close to same . But Same is preferred)
2+ Servers
SQL Server
- 2 nodes / shared network (standard)
- 2+ nodes / multi-subnet (enterprise)
Storage - both nodes can talk to
- iSCSI SAN, Fiber SAN, Serial Attached SCSI
- Redundant protection - No single drive failure should bring down DB instance
Redundant Network - no single netcard / switch failure should bring down SQL Server Instance
- 2 NICs per node on separate networks (separate switches)
- 2 NICs on logical network w/ NIC teaming
Virtual vs Physical
- Virtual is supported but requires additional expertise : So far -> SysAdmin, ServerAdmin, SQLServerDBA

Notes

Database Log shipping - Database level - DB corrupt - Disaster Recovery
Failover Clustering - Server Level - Server failure - High Availability

*suggested

Uh oh!

SQL Server : High Availability & Disaster Recovery

SQL Server Transaction Log:

DB Recovery:

Recovery Types: [Recovery Model] in SQL Server

Maintenance Plans

Managing Backups

Disaster Recovery from Backups:

Failover Clustering - High Availability

Pre-Requisites

Notes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally