<img src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/solutions-microsoft-logo-small.png?raw=true" alt="Microsoft">
<br>

# SQL Server 2019 big data cluster Tutorial
## 01 - SQL Server Master Instance Queries

In this tutorial you will learn how to run standard SQL Server queries against the Master Instance (MI) in a SQL Server big data cluster.

We'll start with a simple set of queries to explore the instance:



In [1]:
/* Instance Version */
SELECT @@VERSION; 
GO

/* General Configuration */
USE master;  
GO  
EXEC sp_configure;
GO

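/* Databases on this Instance */
/* sys.master_files.size is stored in 8-KB pages. The divisor 1024000 approximates
   1024 * 1024 (1048576), so the reported GB values are slightly overstated. */
SELECT db.name AS 'Database Name'
, mf.physical_name AS 'Location on Disk'
, CAST(CAST(ROUND(CAST(mf.size AS decimal) * 8.0/1024000.0,2) AS decimal(18,2)) AS nvarchar) AS 'Size (GB)'
FROM sys.master_files AS mf
INNER JOIN 
    sys.databases AS db ON db.database_id = mf.database_id
WHERE mf.type_desc = 'ROWS';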
GO

/* All data and log files on this Instance */
SELECT * FROM sys.master_files;
GO


(No column name)
Microsoft SQL Server 2019 (CTP2.3) - 15.0.1300.359 (X64) Feb 15 2019 23:50:43 Copyright (C) 2019 Microsoft Corporation 	Developer Edition (64-bit) on Linux (Ubuntu 16.04.6 LTS) <X64>


name,minimum,maximum,config_value,run_value
allow polybase export,0,1,0,0
allow updates,0,1,0,0
backup checksum default,0,1,0,0
backup compression default,0,1,0,0
clr enabled,0,1,1,1
column encryption enclave type,0,1,0,0
contained database authentication,0,1,0,0
cross db ownership chaining,0,1,0,0
default language,0,9999,0,0
external scripts enabled,0,1,0,0
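
By default, `sp_configure` omits advanced options unless `show advanced options` is enabled. As a read-only alternative, the sketch below queries the `sys.configurations` catalog view, which lists every option together with a flag marking the advanced ones. The column aliases are illustrative choices, not part of the original tutorial.

```sql
USE master;
GO

/* List all configuration options, including advanced ones, without changing any setting */
SELECT name AS [Option]
, value AS [Configured Value]
, value_in_use AS [Running Value]
, is_advanced AS [Is Advanced]
FROM sys.configurations
ORDER BY name;
GO
```

Because this is a plain `SELECT`, it does not require running `RECONFIGURE` afterwards.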


Database Name,Location on Disk,Size (GB)
master,/var/opt/mssql/data/master.mdf,0.0
tempdb,/var/opt/mssql/data/tempdb.mdf,0.01
model,/var/opt/mssql/data/model.mdf,0.01
msdb,/var/opt/mssql/data/MSDBData.mdf,0.01
DWDiagnostics,/var/opt/mssql/data/DWDiagnostics.mdf,1.0
DWConfiguration,/var/opt/mssql/data/DWConfiguration.mdf,0.01
DWQueue,/var/opt/mssql/data/DWQueue.mdf,0.01
WideWorldImporters,/var/opt/mssql/data/WideWorldImporters.mdf,1.02
WideWorldImporters,/var/opt/mssql/data/WideWorldImporters_UserData.ndf,2.05
WideWorldImportersDW,/var/opt/mssql/data/WideWorldImportersDW.mdf,2.05
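
The listing above covers data files only. As a minimal sketch, the query below totals the size of each database per file type, data and log alike, and reports it in MB. It relies on `size` in `sys.master_files` being a count of 8-KB pages, so pages * 8 / 1024 gives MB. The column aliases are illustrative.

```sql
USE master;
GO

/* Total size per database and file type; size is stored in 8-KB pages */
SELECT db.name AS [Database Name]
, mf.type_desc AS [File Type]
, CAST(SUM(CAST(mf.size AS bigint)) * 8 / 1024.0 AS decimal(18,2)) AS [Size (MB)]
FROM sys.master_files AS mf
INNER JOIN sys.databases AS db
    ON db.database_id = mf.database_id
GROUP BY db.name, mf.type_desc
ORDER BY db.name, mf.type_desc;
GO
```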


database_id,file_id,file_guid,type,type_desc,data_space_id,name,physical_name,state,state_desc,size,max_size,growth,is_media_read_only,is_read_only,is_sparse,is_percent_growth,is_name_reserved,is_persistent_log_buffer,create_lsn,drop_lsn,read_only_lsn,read_write_lsn,differential_base_lsn,differential_base_guid,differential_base_time,redo_start_lsn,redo_start_fork_guid,redo_target_lsn,redo_target_fork_guid,backup_lsn,credential_id
1,1,,0,ROWS,1,master,/var/opt/mssql/data/master.mdf,0,ONLINE,512,-1,10,0,0,0,1,0,0,,,,,,,,,,,,,
1,2,,1,LOG,0,mastlog,/var/opt/mssql/data/mastlog.ldf,0,ONLINE,288,-1,10,0,0,0,1,0,0,,,,,,,,,,,,,
2,1,,0,ROWS,1,tempdev,/var/opt/mssql/data/tempdb.mdf,0,ONLINE,1024,-1,8192,0,0,0,0,0,0,,,,,,,,,,,,,
2,2,,1,LOG,0,templog,/var/opt/mssql/data/templog.ldf,0,ONLINE,1024,-1,8192,0,0,0,0,0,0,,,,,,,,,,,,,
3,1,,0,ROWS,1,modeldev,/var/opt/mssql/data/model.mdf,0,ONLINE,1024,-1,8192,0,0,0,0,0,0,,,,,,,,,,,,,
3,2,,1,LOG,0,modellog,/var/opt/mssql/data/modellog.ldf,0,ONLINE,1024,-1,8192,0,0,0,0,0,0,,,,,,,,,,,,,
4,1,df5bb3b0-ca5f-4d57-9ed3-adfe7acec37c,0,ROWS,1,MSDBData,/var/opt/mssql/data/MSDBData.mdf,0,ONLINE,1888,-1,10,0,0,0,1,0,0,,,,,,,,,,,,,
4,2,f293a495-c46d-46e7-b982-3b9de226937c,1,LOG,0,MSDBLog,/var/opt/mssql/data/MSDBLog.ldf,0,ONLINE,96,268435456,10,0,0,0,1,0,0,,,,,,,,,,,,,
5,1,516ddf00-8097-45a1-a5d7-803d6b98d716,0,ROWS,1,DWDiagnostics,/var/opt/mssql/data/DWDiagnostics.mdf,0,ONLINE,128000,1280000,8192,0,0,0,0,0,0,,,,,,,,,,,,,
5,2,cd748def-5a95-419d-b468-1b8ce095f361,1,LOG,0,DWDiagnostics_log,/var/opt/mssql/data/DWDiagnostics_log.ldf,0,ONLINE,9216,268435456,8192,0,0,0,0,0,0,,,,,,,,,,,,,


## Ingest data into the SQL Server Databases

Before we start working with data, we need to bring it into the system. We have several options for that, from the `bcp` utility to SQL Server Integration Services, Azure Data Factory, and more.

For the structured data, we'll use the SQL Server `RESTORE` command to bring in two databases from the location we specified earlier with the `kubectl` command.

The code below restores both databases:

In [1]:
USE [master];
RESTORE DATABASE [WideWorldImporters] 
FROM  DISK = N'/var/opt/mssql/data/WWI.bak' 
WITH  FILE = 1
,  REPLACE
,  MOVE N'WWI_Primary' TO N'/var/opt/mssql/data/WideWorldImporters.mdf'
,  MOVE N'WWI_UserData' TO N'/var/opt/mssql/data/WideWorldImporters_UserData.ndf'
,  MOVE N'WWI_Log' TO N'/var/opt/mssql/data/WideWorldImporters.ldf'
,  MOVE N'WWI_InMemory_Data_1' TO N'/var/opt/mssql/data/WideWorldImporters_InMemory_Data_1'
,  NOUNLOAD,  STATS = 5;
GO

USE [master];
RESTORE DATABASE [WideWorldImportersDW] 
FROM  DISK = N'/var/opt/mssql/data/WWIDW.bak' 
WITH  FILE = 1
,  REPLACE
,  MOVE N'WWI_Primary' TO N'/var/opt/mssql/data/WideWorldImportersDW.mdf'
,  MOVE N'WWI_UserData' TO N'/var/opt/mssql/data/WideWorldImportersDW_UserData.ndf'
,  MOVE N'WWI_Log' TO N'/var/opt/mssql/data/WideWorldImportersDW.ldf'
,  MOVE N'WWIDW_InMemory_Data_1' TO N'/var/opt/mssql/data/WideWorldImportersDW_InMemory_Data_1'
,  NOUNLOAD,  STATS = 5;
GO

: Msg 3201, Level 16, State 2, Line 2
Cannot open backup device '/var/opt/mssql/data/WWI.bak'. Operating system error 2(The system cannot find the file specified.).

: Msg 3013, Level 16, State 1, Line 2
RESTORE DATABASE is terminating abnormally.

: Msg 3201, Level 16, State 2, Line 13
Cannot open backup device '/var/opt/mssql/data/WWIDW.bak'. Operating system error 2(The system cannot find the file specified.).

: Msg 3013, Level 16, State 1, Line 13
RESTORE DATABASE is terminating abnormally.
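
The Msg 3201 errors above mean that the backup files were not found at the given path inside the Master Instance. One way to check a backup before restoring it is `RESTORE FILELISTONLY`, which reads the backup and returns the logical file names that the `MOVE` clauses must reference. The sketch below assumes the same paths as the `RESTORE` commands above; if the files are still missing, it fails with the same Msg 3201.

```sql
USE [master];
GO

/* List the logical and physical file names stored in each backup.
   Fails with Msg 3201 if the file is not present at this path. */
RESTORE FILELISTONLY
FROM DISK = N'/var/opt/mssql/data/WWI.bak';
GO

RESTORE FILELISTONLY
FROM DISK = N'/var/opt/mssql/data/WWIDW.bak';
GO
```

Once both statements return a file list, the `RESTORE DATABASE` commands can be run again.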

## Query Data

The SQL Server 2019 big data cluster Master Instance is a SQL Server instance, and as such has most of the query facilities and capabilities of Microsoft SQL Server running on Linux.

**TODO:** Run some standard queries. Investigate simple ML.

In [2]:
USE WideWorldImporters;
GO

/* Show the Populations. 
Where do we have the most people?
Note: Population is the latest recorded population of the state or province, not of the city.
 */
SELECT city.CityName AS 'City Name'
, sp.StateProvinceName AS 'State or Province'
, sp.LatestRecordedPopulation AS 'Population'
, ctry.CountryName
FROM Application.Cities AS city
JOIN Application.StateProvinces AS sp ON
    city.StateProvinceID = sp.StateProvinceID
JOIN Application.Countries AS ctry ON 
    sp.CountryID = ctry.CountryID;
GO


City Name,State or Province,Population,CountryName
Aaronsburg,Pennsylvania,13284753,United States
Abanda,Alabama,5437278,United States
Abbeville,South Carolina,4774839,United States
Abbeville,Georgia,9992167,United States
Abbeville,Alabama,5437278,United States
Abbeville,Louisiana,4810488,United States
Abbeville,Mississippi,2991207,United States
Abbotsford,Wisconsin,6211317,United States
Abbott,Texas,27506120,United States
Abbott,Arkansas,3077747,United States
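
The result above comes back without any particular ordering, and its population figures are per state or province. To answer "where do we have the most people?" at the city level, a sketch like the following could be used. It assumes that `Application.Cities` carries its own `LatestRecordedPopulation` column, as it does in the standard WideWorldImporters sample; the `TOP (10)` limit and the column aliases are illustrative.

```sql
USE WideWorldImporters;
GO

/* Ten most populous cities, using the city-level population column */
SELECT TOP (10) city.CityName AS [City Name]
, sp.StateProvinceName AS [State or Province]
, city.LatestRecordedPopulation AS [City Population]
FROM Application.Cities AS city
JOIN Application.StateProvinces AS sp
    ON city.StateProvinceID = sp.StateProvinceID
WHERE city.LatestRecordedPopulation IS NOT NULL
ORDER BY city.LatestRecordedPopulation DESC;
GO
```

The `IS NOT NULL` filter skips cities without a recorded population, so they do not clutter the result.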


In [6]:
/* Show Customer Sales
Where do we have the most customers?
*/
USE WideWorldImporters;
GO

SELECT s.CustomerID
, s.CustomerName
, sc.CustomerCategoryName
,  pp.FullName AS PrimaryContact
,  ap.FullName AS AlternateContact
,  s.PhoneNumber
,  s.FaxNumber
,  bg.BuyingGroupName
,  s.WebsiteURL
,  dm.DeliveryMethodName AS DeliveryMethod
,  c.CityName AS CityName
,  s.DeliveryLocation AS DeliveryLocation
,  s.DeliveryRun
,  s.RunPosition
FROM Sales.Customers AS s
    LEFT OUTER JOIN Sales.CustomerCategories AS sc
    ON s.CustomerCategoryID = sc.CustomerCategoryID
    LEFT OUTER JOIN [Application].People AS pp
    ON s.PrimaryContactPersonID = pp.PersonID
    LEFT OUTER JOIN [Application].People AS ap
    ON s.AlternateContactPersonID = ap.PersonID
    LEFT OUTER JOIN Sales.BuyingGroups AS bg
    ON s.BuyingGroupID = bg.BuyingGroupID
    LEFT OUTER JOIN [Application].DeliveryMethods AS dm
    ON s.DeliveryMethodID = dm.DeliveryMethodID
    LEFT OUTER JOIN [Application].Cities AS c
    ON s.DeliveryCityID = c.CityID;
GO
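
The query above lists one row per customer. To see where the most customers are, the rows can be grouped by delivery city. The following is a minimal sketch that uses the same tables and join columns as the query above plus `Application.StateProvinces`; the `TOP (10)` limit and the column aliases are illustrative.

```sql
USE WideWorldImporters;
GO

/* Customer count per delivery city, largest first */
SELECT TOP (10) c.CityName AS [City Name]
, sp.StateProvinceName AS [State or Province]
, COUNT(*) AS [Customer Count]
FROM Sales.Customers AS s
    INNER JOIN [Application].Cities AS c
    ON s.DeliveryCityID = c.CityID
    INNER JOIN [Application].StateProvinces AS sp
    ON c.StateProvinceID = sp.StateProvinceID
GROUP BY c.CityName, sp.StateProvinceName
ORDER BY [Customer Count] DESC, c.CityName;
GO
```

Grouping by both city and state or province keeps cities that share a name, such as the several Abbevilles, apart.
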

## Next Step: Data Virtualization

**TODO:** Add Link