<img src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/solutions-microsoft-logo-small.png?raw=true" alt="Microsoft">
<br>

# Workshop: Microsoft SQL Server Machine Learning Services

#### <i>A Microsoft Course from the SQL Server team</i>

## SQL Server Machine Learning Services Architecture

<p style="border-bottom: 1px solid lightgrey;"></p>


<h2><img style="float: left; margin: 0px 15px 15px 0px;" src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/pin.jpg?raw=true">Installing SQL Server Machine Learning Services</h2>

Although you completed this in the prerequisites for this course, here's a quick review of installing the Machine Learning Services features for SQL Server. This course focuses on SQL Server 2019.

You can install SQL Server Machine Learning Services (as of SQL Server 2019) on the following Editions:

 - Enterprise (*basic and enhanced functions*)
 - Standard (*basic and enhanced functions*)
 - Web (*basic functions*)
 - Express with Advanced Services (*basic functions*)

In addition, the R language extension is available on the Microsoft Azure SQL Database platform for single databases and elastic pools using the *vCore*-based purchasing model in the *general purpose* and *business critical* service tiers.

You can follow the [full installation process here](https://docs.microsoft.com/en-us/sql/advanced-analytics/install/sql-machine-learning-services-windows-install?view=sql-server-ver15). Note that the installation process is different for Windows and Linux. 

Note that the SQL Server Installer can also install a stand-alone instance of Microsoft Machine Learning Server. Do not select that stand-alone option when you want Machine Learning Services inside SQL Server.

<br>
<img style="float: left; margin: 0px 15px 15px 0px;" src="https://docs.microsoft.com/en-us/sql/advanced-analytics/install/media/2017setup-features-page-mls-rpy.png?view=sql-server-2017">


<h2><img style="float: left; margin: 0px 15px 15px 0px;" src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/textbubble.png?raw=true">Understanding the SQL Server ML Services Architecture</h2>

The SQL Server Extensibility Framework is an architecture for executing external code: Java (starting in SQL Server 2019), Python (starting in SQL Server 2017), and R (starting in SQL Server 2016). Code execution is isolated from the core engine processes, but fully integrated with SQL Server query execution. This means that you can push data from any SQL Server query to the external runtime, and consume or persist results back in SQL Server.

SQL Server 2016 introduced the R language as a companion server alongside the SQL Server Instance - called a *satellite*. SQL Server 2017 introduced Python language support, in the same fashion and usage as R. In SQL Server 2019, Java was added as an additional satellite process.


<h2><img style="float: left; margin: 0px 15px 15px 0px;" src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/pin.jpg?raw=true">The SQL Server Extensibility Framework</h2>


Here are the basics of these components: 

**Component:** Description

**SQL Server Process** (*sqlservr.exe*): SQL Server Engine. Calls the Launchpad service.

**Launchpad** (*launchpad.exe*): Service/Daemon that executes and manages the external script process. Calls a Launcher DLL specific to the Language.

**Launcher DLL** (*RLauncher.dll* for R, *PythonLauncher.dll* for Python): Extension for each language. Calls the language executable environment.

**R, Python, Java**: The environments that run the languages for Machine Learning. The SQL Server Installer installs the specific versions, editions, releases and bit-levels for you, even if you already have them installed. Calls a *BxlServer*.

**BxlServer** (*bxlserver.exe*): Manages communication between SQL Server and external languages using *Windows Job Objects*. Receives and makes calls from and to the *SQL Satellite*.

**SQL Satellite** (*sqlsatellite.exe*): Handles input and output variables and data exchange, including basic data type resolution and error handling (**Note: you should still explicitly control data type transformations in code**). Receives calls from and makes calls to the *SQL Server* process.

If selected, the Installer program for SQL Server installs the Microsoft ML Server-supported language runtime environments *alongside* the SQL Server Instance, and then sets up a Service (the *SQL Launchpad*) allowing the sandboxed processes to communicate over a secure channel. It also sets up several other components to allow the *scoring*, and in some cases the *training*, of Machine Learning models using Python or R. 

<img src="https://docs.microsoft.com/en-us/sql/advanced-analytics/media/generic-architecture.png?view=sql-server-2017">
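The call chain described in the component list above can be written out as a small model. This is only an illustrative sketch: the component names come from the list, while the `CALL_CHAIN` structure and the `describe_call_chain` function are hypothetical names introduced here, not part of SQL Server.

```python
# Illustrative model only: names are taken from the component list above;
# the data structure and function are hypothetical.
CALL_CHAIN = [
    "SQL Server process",
    "Launchpad",
    "Launcher DLL",
    "Language runtime (R, Python, or Java)",
    "BxlServer",
    "SQL Satellite",
]


def describe_call_chain(chain):
    """Return one 'caller -> callee' line per hop in the chain."""
    hops = [f"{caller} -> {callee}" for caller, callee in zip(chain, chain[1:])]
    # The SQL Satellite exchanges data with the SQL Server process,
    # which closes the loop back to the engine.
    hops.append(f"{chain[-1]} -> {chain[0]}")
    return hops


for hop in describe_call_chain(CALL_CHAIN):
    print(hop)
```

Reading the hops in order gives the same path as the diagram: the engine hands work to the Launchpad, and results return to the engine through the SQL Satellite.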


<p><img style="float: left; margin: 0px 15px 15px 0px;" src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/checkbox.png?raw=true"><b>Activity: Enable external script execution in SQL Server</b></p>

- In Azure Data Studio, create a connection to a SQL Server 2019 or higher Instance that you have administrative rights on. <a href="https://docs.microsoft.com/en-us/sql/azure-data-studio/quickstart-sql-server?view=sql-server-ver15" target="_blank">(<i>You can read about how to do that here</i>)</a>.
- Select your Instance's Connection in the <b>Attach To:</b> box at the top of this notebook.
- Click the <b>Not Trusted</b> box next to that to make this Notebook <b>Trusted</b>. (<i>Do this with all Notebooks in this course - it means you allow OS commands and other operations on this system.</i>)
- Now run the following code cell:


In [1]:
/* Enable ML Services
NOTE: 
SQL Server ML Services must be installed,
the SQL Server Launchpad Service must be running, 
and you may need to restart the SQL Server Service if the scripts below do not work. 
More detailed information here: https://docs.microsoft.com/en-us/sql/advanced-analytics/install/sql-machine-learning-services-windows-install?view=sql-server-ver15 
*/

EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE WITH OVERRIDE;
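If you run `EXEC sp_configure 'external scripts enabled'` without a value, the result row includes `config_value` and `run_value` columns. The sketch below shows one way to read those two numbers once you have them in Python. The `external_scripts_status` function and its messages are hypothetical, and the restart advice simply mirrors the note in the cell above.

```python
def external_scripts_status(config_value, run_value):
    """Interpret the config_value and run_value reported by sp_configure.

    Hypothetical helper for illustration; it does not query SQL Server.
    """
    if run_value == 1:
        return "enabled"
    if config_value == 1:
        # Configured but not yet in effect: the note above suggests a restart.
        return "configured but not running; try restarting the SQL Server service"
    return "disabled"


print(external_scripts_status(config_value=1, run_value=0))
```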

When R or Python code is called through a special Stored Procedure, `sp_execute_external_script` (which you must enable), SQL Server transfers data to the R or Python process, which runs the code and returns the result to the Stored Procedure in SQL Server.

<p>
<img src="https://github.com/amthomas46/SQL/blob/master/sql-cs-icc/code/sql-notebooks/images/java-r-python.png?raw=true" width="500">

Here's a breakdown of the code:

<p>
<img src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/TSQLAndR.png?raw=true" width="500">
<p>

You can run Python code as well as R code. You choose the language by setting a parameter in the Stored Procedure.

This allows SQL Server professionals to work with hybrid data in the way they are familiar with, and allows Data Scientists to develop their R or Python code anywhere and then deploy that code to SQL Server by embedding it in a Stored Procedure.

Run a few statements that implement this process:

In [2]:
/* Test R */
EXEC sp_execute_external_script  @language =N'R',
@script=N'
OutputDataSet <- InputDataSet;
',
@input_data_1 =N'SELECT 1 AS [Is R Working]'
WITH RESULT SETS (([Is R Working] int not null));
GO

Is R Working
1


In [3]:
/* Test Python */
EXEC sp_execute_external_script  @language =N'Python',
@script=N'
OutputDataSet = InputDataSet;
',
@input_data_1 =N'SELECT 1 AS [Is Python Working]'
WITH RESULT SETS (([Is Python Working] int not null));
GO

Is Python Working
1
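Both test cells above embed the external script inside a T-SQL `N'...'` string literal, so any single quote in the script or the input query has to be doubled. The sketch below shows that quoting step when generating such a statement from Python on a workstation. The `build_external_script_call` function is a hypothetical helper, and it only builds text; it does not connect to SQL Server.

```python
def build_external_script_call(language, script, input_query):
    """Build an EXEC sp_execute_external_script statement as a string.

    Single quotes are doubled so the values survive being embedded
    in T-SQL N'...' string literals.
    """
    def quote(text):
        return "N'" + text.replace("'", "''") + "'"

    return (
        "EXEC sp_execute_external_script\n"
        f"    @language = {quote(language)},\n"
        f"    @script = {quote(script)},\n"
        f"    @input_data_1 = {quote(input_query)};"
    )


statement = build_external_script_call(
    "Python",
    "OutputDataSet = InputDataSet",
    "SELECT 'ok' AS [Status]",
)
print(statement)
```

The input query contains a quoted literal, so the generated statement carries it as `''ok''` inside the outer string.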


In [4]:
/* Get R Info */
EXECUTE sp_execute_external_script @language = N'R'
, @script = N'
OutputDataSet <- data.frame(installed.packages()[,c("Package", "Version", "Depends", "License", "LibPath")]);'
WITH RESULT SETS(
    (Package NVARCHAR(255)
    , Version NVARCHAR(100)
    , Depends NVARCHAR(4000)
    , License NVARCHAR(1000)
    , LibPath NVARCHAR(2000))
    );
GO

Package,Version,Depends,License,LibPath
CompatibilityAPI,1.1.0,R (>= 3.2.2),file LICENSE,/opt/mssql/mlservices/libraries/RServer
MicrosoftML,9.4.6,"R (>= 3.3.2), methods, RevoScaleR (>= 9.2.1)",file LICENSE,/opt/mssql/mlservices/libraries/RServer
RevoPemaR,10.0.0,"R (>= 3.1.1), methods",Apache License 2.0,/opt/mssql/mlservices/libraries/RServer
RevoScaleR,9.4.6,R (>= 3.2.2),file LICENSE,/opt/mssql/mlservices/libraries/RServer
RevoTreeView,10.0.0,,file LICENSE,/opt/mssql/mlservices/libraries/RServer
doRSR,10.0.0,"R (>= 2.5.0), foreach(>= 1.2.0), iterators(>= 1.0.0), RevoScaleR(>= 2.0-0), utils, RevoUtils",file LICENSE,/opt/mssql/mlservices/libraries/RServer
mrsdeploy,1.1.3,R (>= 3.3.0),file LICENSE,/opt/mssql/mlservices/libraries/RServer
sqlrutils,1.0.0,R (>= 3.2.2),file LICENSE,/opt/mssql/mlservices/libraries/RServer
BH,1.66.0-1,,BSL-1.0,/opt/mssql/mlservices/runtime/R/library
DBI,1.0.0,"R (>= 3.0.0), methods",LGPL (>= 2),/opt/mssql/mlservices/runtime/R/library


In [5]:
/* Get Python Info */
EXECUTE sp_execute_external_script
@language =N'Python',
@script=N'import sys
print(sys.version)';
GO

EXECUTE sp_execute_external_script 
  @language = N'Python', 
  @script = N'import pkg_resources
import pandas as pd
# pkg_resources is used because pip.get_installed_distributions() was removed in pip 10
installed_packages_list = sorted(["%s==%s" % (i.key, i.version)
   for i in pkg_resources.working_set])
df = pd.DataFrame(installed_packages_list)
OutputDataSet = df'
WITH RESULT SETS (( InstalledPackageAndVersion nvarchar (150) ));

InstalledPackageAndVersion
adal==1.2.0
alabaster==0.7.9
anaconda-clean==1.0
anaconda-client==1.5.1
anaconda-navigator==1.3.1
argcomplete==1.0.0
astroid==1.4.7
astropy==1.2.1
autovizwidget==0.12.1
azureml-model-management-sdk==1.0.1b10
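To compare the server's package list with a development workstation, you can produce the same `name==version` format locally. This is a minimal sketch that assumes a workstation running Python 3.8 or later, where the standard library includes `importlib.metadata`. It is not meant to run inside the SQL Server 2019 Python runtime, and `installed_packages` is a hypothetical helper name.

```python
from importlib import metadata


def installed_packages():
    """Return sorted 'name==version' strings for the current environment."""
    entries = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            entries.add(f"{name.lower()}=={version}")
    return sorted(entries)


for entry in installed_packages()[:10]:
    print(entry)
```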


<h2><img style="float: left; margin: 0px 15px 15px 0px;" src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/pin.jpg?raw=true">Programming SQL Server Machine Learning Services</h2>

You have two methods of working with Machine Learning Services in SQL Server:

1. You can create your R or Python code on a workstation with the Microsoft R or Python Libraries installed, which run certain operations on the SQL Server Instance remotely.
2. You can "wrap" the R or Python code in a Stored Procedure server-side, and run standard Transact-SQL statements to call for the scoring.

You can use four methods of running the Models you create in Machine Learning Services (in SQL Server 2019):

1. Use the Extensibility Framework to create trained Machine Learning Models and store them as binary objects in a SQL Server table. You can then "score" (do the predictions or classifications) in SQL Server by loading the binary model and using it in Python or R code wrapped in a Stored Procedure.
2. Use the Native Scoring feature of the `PREDICT` Transact-SQL statement against a trained Machine Learning Model. The model must have been created using one of the supported algorithms from the RevoScaleR package.
3. Use the `sp_rxPredict` stored procedure, which is provided as a wrapper for the `rxPredict` R function in RevoScaleR and MicrosoftML, and the `rx_predict` Python function in revoscalepy and microsoftml. It is written in C++ and is optimized specifically for scoring operations.
4. Use the *big data clusters* (BDC) feature, which provides not only the ML Server processes but also a Spark environment for *SparkR, PySpark, SparkML* and other libraries for Machine Learning over HDFS and database data. This course focuses on the previous three methods, but a [full course on using BDC is here](https://github.com/Microsoft/sqlworkshops/tree/master/sqlserver2019bigdataclusters).
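
The first method above, storing a trained model as a binary object and loading it again to score, can be sketched in plain Python. This is a minimal illustration under stated assumptions: the "model" is a hypothetical dictionary of least-squares coefficients rather than a RevoScaleR or revoscalepy model, `train` and `score` are hypothetical helper names, and the bytes are kept in a variable where a real solution would write them to a binary column in a SQL Server table.

```python
import pickle


def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def score(model_bytes, xs):
    """Load the serialized model and predict a value for each input."""
    model = pickle.loads(model_bytes)
    return [model["slope"] * x + model["intercept"] for x in xs]


# Serialize the trained model to bytes (what you would store in the table).
model_bytes = pickle.dumps(train([1, 2, 3], [2, 4, 6]))

# Later, load the bytes and score new rows.
predictions = score(model_bytes, [4, 5])
print(predictions)  # [8.0, 10.0]
```

Training and scoring are deliberately separate functions here, since the stored bytes are the only thing the scoring step needs.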
 

<p><img style="float: left; margin: 0px 15px 15px 0px;" src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/thinking.jpg?raw=true"><b>For Further Study</b></p>

<br>
<br>

- Primary Documentation: [https://docs.microsoft.com/en-us/sql/advanced-analytics/r/sql-server-r-services?view=sql-server-2017](https://docs.microsoft.com/en-us/sql/advanced-analytics/r/sql-server-r-services?view=sql-server-2017)

- https://microsoft.github.io/sql-ml-tutorials/R/customerclustering/

<p><img style="float: left; margin: 0px 15px 15px 0px;" src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/education1.png?raw=true"><b>Next</b>: Project Methodology and Data Science</p>

Next, you'll learn more about setting up your project structure and working with Data Science in *02 Project Methodology and Data Science*. Open that Notebook to continue.
