# Data Virtualization

Retrieving data through data virtualization with PolyBase

This demo accesses the following two kinds of external data sources and uses them in combination:
- Azure SQL Database
- CosmosDB (Mongo API)

## 1. Accessing SQL Database  

In [2]:
USE [DataVirtualization];

-- Clean up objects left over from a previous run
IF EXISTS (SELECT * FROM sys.external_tables WHERE name = 'AzureSQLDB')
BEGIN
	DROP EXTERNAL TABLE AzureSQLDB
END;
IF EXISTS (SELECT * FROM sys.external_data_sources WHERE name = 'SQLDB')
BEGIN
	DROP EXTERNAL DATA SOURCE SQLDB
END;

IF EXISTS (SELECT * FROM sys.database_scoped_credentials WHERE name = 'SQLDB')
BEGIN
	DROP DATABASE SCOPED CREDENTIAL SQLDB 
END
GO
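
To confirm which of these objects exist before or after the cleanup, the catalog views used in the `IF EXISTS` checks can also be queried directly. The following is a minimal sketch; the column aliases are illustrative and not part of the original demo.

```sql
USE [DataVirtualization];

-- List external tables together with the data source each one points to.
SELECT
    t.name     AS external_table,
    t.location AS remote_object,
    s.name     AS data_source,
    s.location AS data_source_location
FROM sys.external_tables AS t
JOIN sys.external_data_sources AS s
    ON t.data_source_id = s.data_source_id;

-- List database scoped credentials (the secret itself is never exposed).
SELECT name, credential_identity
FROM sys.database_scoped_credentials;
```

Right after the cleanup cell, neither result set should contain the `AzureSQLDB` table, the `SQLDB` data source, or the `SQLDB` credential.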

**1. Create the credential**

The credential created here is used to connect to SQL Database.

In [6]:
USE [master];
DECLARE @ID varchar(50), @SECRET varchar(100)
SELECT @ID = ID, @SECRET = SECRET FROM T_ID WHERE TYPE ='SQLDB'

USE [DataVirtualization];

DECLARE @sql varchar(8000)
EXEC xp_sprintf @sql OUTPUT, 
    'CREATE DATABASE SCOPED CREDENTIAL [SQLDB] 
		WITH IDENTITY = ''%s'', 
		SECRET=''%s''', 
	@ID, @SECRET
EXEC (@sql)
GO
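
`CREATE DATABASE SCOPED CREDENTIAL` requires a database master key in the target database. The demo does not show that step, so the key presumably already exists in `DataVirtualization`. Under that assumption, the sketch below creates a key only when one is missing, and it would need to run before the credential cell. The password is a placeholder, not a value from the demo.

```sql
USE [DataVirtualization];

-- A database master key must exist before a database scoped credential can be created.
IF NOT EXISTS (SELECT * FROM sys.symmetric_keys WHERE name = '##MS_DatabaseMasterKey##')
BEGIN
    -- Placeholder password: replace with a strong value of your own.
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Strong Password>';
END;
GO
```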

**2. Create the external data source**

To connect to a SQL Server based environment, specify "sqlserver://" in the location.

In [7]:
USE [DataVirtualization];

-- Unlike earlier versions of PolyBase, sources other than BLOB storage / HDFS can be registered as external data sources
CREATE EXTERNAL DATA SOURCE [SQLDB]
WITH (
		LOCATION= 'sqlserver://<Server Name>.database.windows.net', 
		CREDENTIAL = [SQLDB]
)
GO

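**3. Create the external table**

Create an external table using the external data source created above.  
Data virtualization in BDC (Big Data Clusters) accesses data outside SQL Server through external tables.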

In [8]:
USE [DataVirtualization];

-- Add the SQL DB table as an external table (the tpch NATION table has already been created in SQL DB)
CREATE EXTERNAL TABLE [AzureSQLDB]
( 
	[N_NATIONKEY] [int] NOT NULL,
	[N_NAME] [char](25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
	[N_REGIONKEY] [int] NOT NULL,
	[N_COMMENT] [varchar](152) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) 
WITH
( 
    LOCATION = 'tpch.dbo.NATION', 
    DATA_SOURCE = [SQLDB]
)
GO

![SQLDB External Table](https://github.com/MasayukiOzawa/decode-2019-demo/raw/master/Images/02.Data%20Virtualization/SQLDB%20External%20Table.png)

**4. Query remote data through the external table**

T-SQL is used to access data in the remote SQL Server (SQL DB).  
The query references what looks like a local table, but the actual data is stored remotely.


In [9]:
USE [DataVirtualization];

-- Use the Big Data Cluster as a data hub to query the data in SQL DB
SELECT * FROM [AzureSQLDB]

N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT
0,ALGERIA,0,haggle. carefully final deposits detect slyly agai
1,ARGENTINA,1,al foxes promise slyly according to the regular accounts. bold requests alon
2,BRAZIL,1,y alongside of the pending deposits. carefully special packages are about the ironic forges. slyly special
3,CANADA,1,"eas hang ironic, silent packages. slyly regular packages are furiously over the tithes. fluffily bold"
4,EGYPT,4,y above the carefully unusual theodolites. final dugouts are quickly across the furiously regular d
5,ETHIOPIA,0,ven packages wake quickly. regu
6,FRANCE,3,"refully final requests. regular, ironi"
7,GERMANY,3,"l platelets. regular accounts x-ray: unusual, regular acco"
8,INDIA,2,ss excuses cajole slyly across the packages. deposits print aroun
9,INDONESIA,2,slyly express asymptotes. regular deposits haggle slyly. carefully ironic hockey players sleep blithely. carefull


![Remote Query Plan](https://github.com/MasayukiOzawa/decode-2019-demo/raw/master/Images/02.Data%20Virtualization/SQLDB%20Remote%20Query.png)
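
Because the external table is addressed like a local table, ordinary projections and predicates can be applied to it. The sketch below uses a filter value chosen only for illustration. Whether the filter is evaluated on the remote side depends on the query optimizer, so check the actual execution plan rather than assuming it.

```sql
USE [DataVirtualization];

-- Project only the needed columns and filter on the region key.
SELECT N_NATIONKEY, N_NAME
FROM [AzureSQLDB]
WHERE N_REGIONKEY = 1
ORDER BY N_NATIONKEY;
```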

## 2. Accessing MongoDB  

In [2]:
USE [DataVirtualization];

-- Clean up objects left over from a previous run
IF EXISTS (SELECT * FROM sys.external_tables WHERE name = 'MongoDB_REGION')
BEGIN
	DROP EXTERNAL TABLE MongoDB_REGION
END;
IF EXISTS (SELECT * FROM sys.external_data_sources WHERE name = 'MongoDBInstance')
BEGIN
	DROP EXTERNAL DATA SOURCE MongoDBInstance
END;

IF EXISTS (SELECT * FROM sys.database_scoped_credentials WHERE name = 'MongoDBCredentials')
BEGIN
	DROP DATABASE SCOPED CREDENTIAL MongoDBCredentials 
END
GO

**1. Create the credential**

The credential created here is used to connect to MongoDB (in this demo, the CosmosDB Mongo API).

In [3]:
USE [master];
DECLARE @ID varchar(50), @SECRET varchar(100)
SELECT @ID = ID, @SECRET = SECRET FROM T_ID WHERE TYPE ='MongoDB'

USE [DataVirtualization];

DECLARE @sql varchar(8000)
EXEC xp_sprintf @sql OUTPUT, 
    'CREATE DATABASE SCOPED CREDENTIAL [MongoDBCredentials] 
		WITH IDENTITY = ''%s'', 
		SECRET=''%s''', 
	@ID, @SECRET
EXEC (@sql)
GO

**2. Create the external data source**

To connect to a MongoDB environment, specify "mongodb://" in the location.

In [4]:
USE [DataVirtualization];

CREATE EXTERNAL DATA SOURCE MongoDBInstance
WITH ( 
LOCATION = 'mongodb://<Server Name>.documents.azure.com:10255',
CREDENTIAL = MongoDBCredentials
);
GO

**3. Create the external table**

Create an external table using the external data source created above.

In [5]:
USE [DataVirtualization];

CREATE EXTERNAL TABLE MongoDB_REGION
( 
	[_id] NVARCHAR(24) COLLATE Japanese_CI_AS NOT NULL,
	[R_REGIONKEY] INT, 
	[R_NAME] NVARCHAR(4000) COLLATE Japanese_CI_AS, 
	[R_COMMENT] NVARCHAR(4000) COLLATE Japanese_CI_AS
) 
WITH
( 
    LOCATION = 'tpch.REGION', 
    DATA_SOURCE = MongoDBInstance
);
GO


**4. Query remote data through the external table**

T-SQL is used to access the MongoDB data.  
(The FORCE SCALEOUTEXECUTION / DISABLE SCALEOUTEXECUTION query hints force or disable the use of compute for scale-out execution.)

In [1]:
USE [DataVirtualization];
SELECT * FROM MongoDB_REGION
OPTION(FORCE SCALEOUTEXECUTION);

_id,R_REGIONKEY,R_NAME,R_COMMENT
5cce8456d59b29463c827f4d,0,AFRICA,lar deposits. blithely final packages cajole. regular waters are final requests. regular accounts are according to
5cce84868f678e0f504c6faa,2,ASIA,ges. thinly even pinto beans ca
5cce84a68f678e0f504c6fac,4,MIDDLE EAST,uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl
5cce84758f678e0f504c6fa9,1,AMERICA,"hs use ironic, even requests. s"
5cce84958f678e0f504c6fab,3,EUROPE,ly final courts cajole furiously final excuse


![MongoDB Data](https://github.com/MasayukiOzawa/decode-2019-demo/raw/master/Images/02.Data%20Virtualization/MongoDB%20Data.png)
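
For comparison, the same query can be run with the opposite hint. This is a minimal sketch of the `DISABLE SCALEOUTEXECUTION` variant mentioned above. It should return the same rows, though the row order and the execution plan may differ.

```sql
USE [DataVirtualization];

-- Same query with scale-out execution disabled.
SELECT * FROM MongoDB_REGION
OPTION (DISABLE SCALEOUTEXECUTION);
```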

## 3. Combining multiple data sources  
With data virtualization, each data source can be used on its own and can also be combined with others.  
The following query combines data from SQL DB and MongoDB.

In [2]:
USE [DataVirtualization];
SELECT 
	M.R_REGIONKEY,
	M.R_NAME,
	M.R_COMMENT,
	A.N_NATIONKEY,
	A.N_NAME,
	A.N_COMMENT
FROM
	MongoDB_REGION AS M
	LEFT JOIN
	AzureSQLDB AS A
	ON 
	M.R_REGIONKEY = A.N_REGIONKEY
WHERE
	M.R_REGIONKEY = 4
ORDER BY
	M.R_REGIONKEY ASC

R_REGIONKEY,R_NAME,R_COMMENT,N_NATIONKEY,N_NAME,N_COMMENT
4,MIDDLE EAST,uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl,4,EGYPT,y above the carefully unusual theodolites. final dugouts are quickly across the furiously regular d
4,MIDDLE EAST,uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl,10,IRAN,efully alongside of the slyly final dependencies.
4,MIDDLE EAST,uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl,11,IRAQ,nic deposits boost atop the quickly final requests? quickly regula
4,MIDDLE EAST,uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl,13,JORDAN,ic deposits are blithely about the carefully regular pa
4,MIDDLE EAST,uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl,20,SAUDI ARABIA,ts. silent requests haggle. closely express packages sleep across the blithely


![Multi Data Access](https://github.com/MasayukiOzawa/decode-2019-demo/raw/master/Images/02.Data%20Virtualization/Multi%20Data%20Source%20Access.png)
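
The joined external tables can also be aggregated like local tables. The following sketch, which is not part of the original demo, counts the nations stored in SQL DB for each region stored in MongoDB. The `NATION_COUNT` alias is illustrative.

```sql
USE [DataVirtualization];

-- Count the nations (from SQL DB) that belong to each region (from MongoDB).
SELECT
    M.R_REGIONKEY,
    M.R_NAME,
    COUNT(A.N_NATIONKEY) AS NATION_COUNT
FROM MongoDB_REGION AS M
LEFT JOIN AzureSQLDB AS A
    ON M.R_REGIONKEY = A.N_REGIONKEY
GROUP BY M.R_REGIONKEY, M.R_NAME
ORDER BY M.R_REGIONKEY ASC;
```

Because of the `LEFT JOIN`, a region with no matching nation still appears, with a count of 0.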