Merged
36 commits
24aa710
Merge remote-tracking branch 'refs/remotes/Microsoft/master'
segovoni Jan 7, 2018
65fa725
Added some media for incoming demos on SQL Graph database
segovoni Jan 7, 2018
88d21bb
Added the file README.md and the first demo of the session SQL Server…
segovoni Jan 8, 2018
6b9b8b4
Fixed a bug in SQL text visualization
segovoni Jan 8, 2018
3bd6763
Deleted some Unicode characters
segovoni Jan 8, 2018
943d122
Merge remote-tracking branch 'refs/remotes/Microsoft/master'
segovoni Jan 9, 2018
02b6c7e
Update file README.md
segovoni Jan 9, 2018
58e72cd
Merge remote-tracking branch 'refs/remotes/Microsoft/master'
segovoni Jan 9, 2018
fa89740
Update README.md
segovoni Jan 9, 2018
d2c576f
Update README.md
segovoni Jan 9, 2018
ee1925d
Update README.md
segovoni Jan 9, 2018
689959f
Update README.md
segovoni Jan 9, 2018
caa68b7
Update README.md
segovoni Jan 9, 2018
2ed4272
Update README.md
segovoni Jan 9, 2018
5f1d65c
Update README.md
segovoni Jan 9, 2018
709fde1
Update README.md
segovoni Jan 9, 2018
a07bc30
Update README.md
segovoni Jan 9, 2018
d92f90a
Update README.md
segovoni Jan 9, 2018
a69af7b
Update README.md
segovoni Jan 9, 2018
97ee82f
Merge remote-tracking branch 'refs/remotes/Microsoft/master'
segovoni Jan 25, 2018
fbd3c9a
Update README.md
segovoni Jan 25, 2018
9e0482f
Update README.md added the demo "Build a sample recommendation system…
segovoni Jan 25, 2018
0515126
Added the file sqlsat675 40 Recommendation system.sql that contains s…
segovoni Jan 25, 2018
f511703
Update README.md
segovoni Jan 25, 2018
c5204c9
Added the file sqlsat675 30 Many-to-many relationships.sql
segovoni Jan 25, 2018
907df0e
Delete folder sql-saturday-675-parma-italy
segovoni Jan 26, 2018
925856b
Add folder recommendation-system
segovoni Jan 26, 2018
fb29a59
Rename files
segovoni Jan 26, 2018
33cca00
Update README.md
segovoni Jan 26, 2018
1caba95
Update README.md
segovoni Jan 26, 2018
14fe4a3
Update media file related to SQL Graph demos
segovoni Jan 26, 2018
acb32bc
Used function JSON_VALUE instead of OPENJSON to extract the first lan…
segovoni Jan 26, 2018
c9172ef
Update README.md
segovoni Jan 26, 2018
a4de013
Update README.md
segovoni Jan 26, 2018
521a9eb
Update README.md
segovoni Jan 26, 2018
831f8aa
Update README.md
segovoni Jan 28, 2018
Binary file added media/demos/sql-graph/Create a Node Table.png
Binary file added media/demos/sql-graph/Create an Edge Table.png
166 changes: 166 additions & 0 deletions samples/demos/sql-graph/recommendation-system/README.md
@@ -0,0 +1,166 @@
# SQL Server 2017 Graph Database

SQL Server has always provided tools to manage hierarchies and relationships, which makes it easier to query hierarchical data, but some relationships are complex. Consider many-to-many relationships: relational databases have no native construct for them, so the common approach is to introduce an additional table that holds the associations.

With its Graph Database features, SQL Server 2017 can express certain kinds of queries more easily than a purely relational model by representing complex relationships as graphs.

These demos, based on the [WideWorldImporters](https://github.com/Microsoft/sql-server-samples/tree/master/samples/databases/wide-world-importers) sample database, accompany the session that [Sergio Govoni](https://mvp.microsoft.com/it-it/PublicProfile/4029181?fullName=Sergio%20Govoni) presented at PASS SQL Saturday 675 in Parma (Italy).

For those who don't already know the [SQL Saturday](http://www.sqlsaturday.com) events: since 2007, the PASS SQL Saturday program has given users around the world the opportunity to organize free training sessions on SQL Server and related technologies. SQL Saturday is sponsored by PASS and offers excellent opportunities for training, professional exchange and networking. You can find all the details on this page: [About PASS SQL Saturday](http://www.sqlsaturday.com/about.aspx).


### Contents

[About this sample](#about-this-sample)<br/>
[Before you begin](#before-you-begin)<br/>
[Run this sample](#run-this-sample)<br/>
[Disclaimers](#disclaimers)<br/>
[Related links](#related-links)<br/>

<a name=about-this-sample></a>

## About this sample

1. **Applies to:**
- Azure SQL Database v12 (or higher)
- SQL Server 2017 (or higher)
2. **Demos:**
- Building and populating node and edge tables
- The new MATCH clause
- Build a recommendation system for sales offers
3. **Workload:** Queries executed on [WideWorldImporters](https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0)
4. **Programming Language:** T-SQL
5. **Author:** [Sergio Govoni](https://mvp.microsoft.com/it-it/PublicProfile/4029181?fullName=Sergio%20Govoni)

<a name=before-you-begin></a>

## Before you begin

To run these demos, you need the following prerequisites.

**Account and Software prerequisites:**

1. Either
- Azure SQL Database v12 (or higher)
- SQL Server 2017 (or higher)
2. SQL Server Management Studio 17.x (or higher)

**Azure prerequisites:**

1. An Azure subscription. If you don't already have an Azure subscription, you can get one for free here: [get Azure free trial](https://azure.microsoft.com/en-us/free/)

2. When your Azure subscription is ready to use, create an Azure SQL Database. To do that, complete the first three steps explained in [Design your first Azure SQL database](https://docs.microsoft.com/en-us/azure/sql-database/sql-database-design-first-database)

<a name=run-this-sample></a>

## Run this sample

### Setup

#### Azure SQL Database Setup

1. Download the **WideWorldImporters-Standard.bacpac** from the WideWorldImporters database [page](https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0)

2. Import the **WideWorldImporters-Standard.bacpac** file into your Azure subscription. This [article](https://www.sqlshack.com/import-sample-bacpac-file-azure-sql-database/) on SQL Shack explains how to import the WideWorldImporters database into an Azure SQL Database; the instructions are valid for any bacpac file

3. Launch SQL Server Management Studio and connect to the newly created WideWorldImporters-Standard database

#### SQL Server Setup

1. Download **WideWorldImporters-Full.bak** from the WideWorldImporters database [page](https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0)

2. Launch SQL Server Management Studio, connect to your SQL Server instance (2017) and restore **WideWorldImporters-Full.bak**. For further information about how to restore a database backup using SQL Server Management Studio, refer to this article: [Restore a Database Backup Using SSMS](https://docs.microsoft.com/en-us/sql/relational-databases/backup-restore/restore-a-database-backup-using-ssms). Once you have restored the WideWorldImporters database, you can switch to it with the **USE** statement, like this:

```SQL
USE [WideWorldImporters]
```

The file [before-you-begin.sql](./before-you-begin.sql) switches to the WideWorldImporters database and creates two new schemas: **Edges** and **Nodes**.


### Create and populate graph objects

The first demo creates graph objects such as Nodes and Edges; this is the purpose of the file [demo1-create-and-populate-nodes-and-edges.sql](./demo1-create-and-populate-nodes-and-edges.sql). Let's start with the Node table named **Nodes.Person**. A node table represents an entity in a Graph DB. Every time a node table is created, in addition to the user-defined columns, the SQL Server Engine creates an implicit column named **$node_id** that uniquely identifies a given node in the database. It contains a combination of the **object_id** of the node table and an internally generated bigint value stored in a hidden column named **graph_id**.

The following picture shows the CREATE statement with the new DDL extension **AS NODE**, which tells the engine that we want to create a Node table.

![Picture 1](../../../../media/demos/sql-graph/Create%20a%20Node%20Table.png)
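
For readers who cannot view the picture, the statement below is the node table definition as it appears in demo1-create-and-populate-nodes-and-edges.sql. The trailing SELECT is an illustrative addition (not taken from the demo file) showing that **$node_id** can be queried by name like a regular column.

```SQL
-- Node table definition, taken from the demo script
CREATE TABLE Nodes.Person
(
  PersonID INTEGER NOT NULL PRIMARY KEY
  ,FullName NVARCHAR(50) NOT NULL
  ,[Language] NVARCHAR(50) NOT NULL
) AS NODE;
GO

-- Illustrative query: $node_id is the implicit column added by the engine
SELECT $node_id, PersonID, FullName
FROM Nodes.Person;
GO
```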

Now it's time to create the Edge table named **Edges.Friends**. Every Edge represents a relationship in a graph, and it may or may not have user-defined attributes. Edges are always directed and always connect two nodes. In the first release of SQL Graph, constraints are not available on Edge tables, so an Edge table can connect any two nodes in the graph. Every time an Edge table is created, in addition to the user-defined columns, the Engine creates three implicit columns:

1. **$edge_id** is a combination of the **object_id** of the Edge table and an internally generated bigint value stored in a hidden column named **graph_id**

2. **$from_id** stores the **$node_id** of the node where the Edge starts

3. **$to_id** stores the **$node_id** of the node where the Edge ends

The following picture shows the CREATE statement with the new DDL extension **AS EDGE**, which tells the engine that we want to create an Edge table.

![Picture 2](../../../../media/demos/sql-graph/Create%20an%20Edge%20Table.png)
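
For readers who cannot view the picture, the statement below is the edge table definition as it appears in demo1-create-and-populate-nodes-and-edges.sql. The trailing SELECT is an illustrative addition (not taken from the demo file) that lists the three implicit columns next to the user-defined one.

```SQL
-- Edge table definition, taken from the demo script
CREATE TABLE Edges.Friends
(
  StartDate DATETIME NOT NULL
) AS EDGE;
GO

-- Illustrative query: the implicit columns can be selected by name
SELECT $edge_id, $from_id, $to_id, StartDate
FROM Edges.Friends;
GO
```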

The node **Nodes.Person** and the edge **Edges.Friends** are populated from the table **Application.People** of the WideWorldImporters DB.

### A first look at the MATCH clause

The second demo gives you a first look at the MATCH clause, which is used to query the Nodes and Edges created in the first demo.

The new T-SQL MATCH clause lets you specify the search pattern for a graph schema. It can be used only with graph Node and Edge tables, in SELECT statements, as part of the WHERE clause. Based on the node **Nodes.Person** and the edge **Edges.Friends**, the file [demo2-using-the-match-clause.sql](./demo2-using-the-match-clause.sql) contains the following sample queries:

1. List of all people who speak Finnish, together with their friends (Pattern: Node > Relationship > Node)

2. List of the top 5 people who have friends that speak Greek in the first and second connections

3. People who have common friends that speak Croatian

The search pattern provided in the MATCH clause goes from one node to another through an edge, in the direction given by the arrow. Edge names or aliases are provided inside parentheses. Node names or aliases appear at the two ends of the arrow.
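
As a minimal sketch of this pattern syntax (written for this README, not copied from demo2-using-the-match-clause.sql), the two queries below traverse **Edges.Friends** from one **Nodes.Person** row to another. The aliases P1, P2, P3, F, F1 and F2 are arbitrary names chosen for the example.

```SQL
-- One hop: people who speak Finnish and the friends their edges point to
-- Pattern: Node > Relationship > Node
SELECT
  P1.FullName AS PersonName
  ,P2.FullName AS FriendName
FROM
  Nodes.Person AS P1
  ,Edges.Friends AS F
  ,Nodes.Person AS P2
WHERE
  MATCH(P1-(F)->P2)
  AND P1.[Language] = N'Finnish';
GO

-- Two hops: friends of friends, each edge needs its own alias
SELECT
  P1.FullName AS PersonName
  ,P3.FullName AS FriendOfFriendName
FROM
  Nodes.Person AS P1
  ,Edges.Friends AS F1
  ,Nodes.Person AS P2
  ,Edges.Friends AS F2
  ,Nodes.Person AS P3
WHERE
  MATCH(P1-(F1)->P2-(F2)->P3)
  AND P1.[Language] = N'Finnish';
GO
```

Note that the node and edge tables are listed in the FROM clause separated by commas; the MATCH pattern, not a JOIN condition, describes how they are connected.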


### Build a sample recommendation system using SQL Graph

Suppose a customer (from the table Sales.Customers) is connected to our e-commerce site and is looking at the product (from the table Warehouse.StockItems) "USB food flash drive - pizza slice", or has just bought it. Our goal is to find products similar to the one the customer is looking at, based on the behavior of other customers. In short, we have to find the products that are recommended for a given product.

The following picture shows a possible scenario for our sales recommendation system.

![Picture 3](../../../../media/demos/sql-graph/Sales%20Recommendation%20Scenario.png)

This is the algorithm:

1. Identify the customer and the product he/she is purchasing

2. Identify the other customers who have purchased the same item he/she is looking for

3. Find the other products that the customers from step two have purchased

4. Recommend to the current customer the top items from the previous step, ordered by the number of times they were purchased

The following picture shows the graphical representation of the algorithm.

![Picture 4](../../../../media/demos/sql-graph/Sales%20Recommendation%20System.png)

The file [demo3-create-and-populate-nodes-and-edges.sql](./demo3-create-and-populate-nodes-and-edges.sql) contains the statements to create and populate the nodes **Nodes.Customers** and **Nodes.StockItems** and the edge **Edges.Bought** from the tables of the WideWorldImporters DB.

How can Graph Database help us implement this algorithm?

The MATCH clause can express certain kinds of queries more easily than relational JOINs. We will use purchase counts to prioritize the recommendations, which is the simplest possible algorithm for a recommendation service. In reality, more complex filters are applied on top, for example text analysis of the product reviews to arrive at similarity measures.

The file [demo3-recommendation-system-for-sales.sql](./demo3-recommendation-system-for-sales.sql) contains the query that extracts the top 5 products recommended for "USB food flash drive - pizza slice" using the MATCH clause.

The last query of the file [demo3-recommendation-system-for-sales.sql](./demo3-recommendation-system-for-sales.sql) shows the implementation of the same algorithm in the relational database using JOINs, so you can see how many lines of code you would have written before SQL Graph Database.
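
The query below is a hedged sketch of how such a MATCH-based recommendation could look; it is not the content of the demo file. It assumes that **Nodes.StockItems** carries a `StockItemName` column mirroring Warehouse.StockItems; the aliases and the `PurchaseCount` column name are chosen for the example.

```SQL
-- Sketch: top 5 items bought by customers who also bought the given item
-- Assumption: Nodes.StockItems exposes a StockItemName column
SELECT TOP (5)
  OtherItem.StockItemName
  ,COUNT(*) AS PurchaseCount
FROM
  Nodes.StockItems AS ThisItem
  ,Edges.Bought AS ThisPurchase
  ,Nodes.Customers AS OtherCustomer
  ,Edges.Bought AS OtherPurchase
  ,Nodes.StockItems AS OtherItem
WHERE
  -- Steps 2 and 3 of the algorithm in a single pattern
  MATCH(ThisItem<-(ThisPurchase)-OtherCustomer-(OtherPurchase)->OtherItem)
  AND ThisItem.StockItemName = N'USB food flash drive - pizza slice'
  -- Do not recommend the item the customer is already looking at
  AND OtherItem.StockItemName <> ThisItem.StockItemName
GROUP BY
  OtherItem.StockItemName
ORDER BY
  PurchaseCount DESC;
GO
```

The count here is per purchase edge, so an item bought many times by the same customer weighs more; whether that is desirable depends on how **Edges.Bought** was populated.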
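
For comparison, a relational formulation of the same algorithm might look like the sketch below. It is an illustration written against the standard WideWorldImporters tables Sales.Orders, Sales.OrderLines and Warehouse.StockItems, not a copy of the query in the demo file; the aliases are chosen for the example.

```SQL
-- Sketch: the same recommendation expressed with JOINs only
SELECT TOP (5)
  OtherItem.StockItemName
  ,COUNT(*) AS PurchaseCount
FROM
  Warehouse.StockItems AS ThisItem
INNER JOIN
  Sales.OrderLines AS ThisLine ON ThisLine.StockItemID = ThisItem.StockItemID
INNER JOIN
  Sales.Orders AS ThisOrder ON ThisOrder.OrderID = ThisLine.OrderID
-- Other orders placed by the customers who bought the given item
INNER JOIN
  Sales.Orders AS OtherOrder ON OtherOrder.CustomerID = ThisOrder.CustomerID
INNER JOIN
  Sales.OrderLines AS OtherLine ON OtherLine.OrderID = OtherOrder.OrderID
INNER JOIN
  Warehouse.StockItems AS OtherItem ON OtherItem.StockItemID = OtherLine.StockItemID
WHERE
  ThisItem.StockItemName = N'USB food flash drive - pizza slice'
  AND OtherItem.StockItemID <> ThisItem.StockItemID
GROUP BY
  OtherItem.StockItemName
ORDER BY
  PurchaseCount DESC;
GO
```

Each hop of the graph pattern becomes one or two JOINs here, which is why the relational version grows quickly as the traversal gets longer.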

<a name=disclaimers></a>

## Disclaimers

The code included in this sample is not intended to be a set of best practices on how to build scalable enterprise grade applications. This is beyond the scope of this quick start sample.

<a name=related-links></a>

## Related Links

For more information about Graph DB in SQL Server 2017, see these articles:

1. [Graph processing with SQL Server and Azure SQL Database](https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-overview)

2. [SQL Graph Architecture](https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-architecture)

3. [Arvind Shyamsundar's Blog](https://blogs.msdn.microsoft.com/arvindsh/)
25 changes: 25 additions & 0 deletions samples/demos/sql-graph/recommendation-system/before-you-begin.sql
@@ -0,0 +1,25 @@
------------------------------------------------------------------------
-- Event: SQL Saturday #675 Parma, November 18 2017 -
-- http://www.sqlsaturday.com/675/EventHome.aspx -
-- Session: SQL Server 2017 Graph Database -
-- Demo: Setup -
-- Author: Sergio Govoni -
-- Notes: -- -
------------------------------------------------------------------------

-- Full backup of WideWorldImporters sample database is available on GitHub
-- https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0

-- Documentation about WideWorldImporters sample database for SQL Server
-- and Azure SQL Database
-- https://github.com/Microsoft/sql-server-samples/tree/master/samples/databases/wide-world-importers


USE [WideWorldImporters];
GO

CREATE SCHEMA [Nodes];
GO

CREATE SCHEMA [Edges];
GO
@@ -0,0 +1,173 @@
------------------------------------------------------------------------
-- Event: SQL Saturday #675 Parma, November 18 2017 -
-- http://www.sqlsaturday.com/675/EventHome.aspx -
-- Session: SQL Server 2017 Graph Database -
-- Demo: Demo1: Create and populate Nodes and Edges -
-- Author: Sergio Govoni -
-- Notes: -- -
------------------------------------------------------------------------

USE [WideWorldImporters];
GO

------------------------------------------------------------------------
-- Nodes -
------------------------------------------------------------------------

DROP TABLE IF EXISTS Nodes.Person;
GO

-- Create a node table
CREATE TABLE Nodes.Person
(
PersonID INTEGER NOT NULL PRIMARY KEY
,FullName NVARCHAR(50) NOT NULL
,[Language] NVARCHAR(50) NOT NULL
) AS NODE;
GO


SELECT * FROM sys.tables WHERE is_node = 1;
GO

SELECT
*
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
Table_Schema = 'Nodes'
AND Table_Name = 'Person';
GO


/*
SELECT FullName, CustomFields, *
FROM [Application].[People];
GO
*/

INSERT INTO Nodes.Person
(
PersonID
,FullName
,[Language]
)
SELECT
PersonID
,FullName
-- First entry of the OtherLanguages JSON array
,[Language] = JSON_VALUE(CustomFields, '$.OtherLanguages[0]')
FROM
[Application].[People]
WHERE
(CustomFields IS NOT NULL)
-- Skips rows where the language is missing (NULL) or empty
AND (JSON_VALUE(CustomFields, '$.OtherLanguages[0]') <> '');
GO


SELECT * FROM Nodes.Person;
GO


------------------------------------------------------------------------
-- Edges -
------------------------------------------------------------------------

DROP TABLE IF EXISTS Edges.Friends;
GO

-- Create an edge table
CREATE TABLE Edges.Friends
(
StartDate DATETIME NOT NULL
)
AS EDGE;
GO


SELECT * FROM sys.tables WHERE is_edge = 1;
GO

SELECT
C.*
FROM
sys.tables AS T
JOIN
INFORMATION_SCHEMA.COLUMNS AS C ON C.Table_Name = T.[name] AND C.Table_Schema = SCHEMA_NAME(T.[schema_id])
WHERE
(T.is_edge = 1)
ORDER BY
C.Table_Schema, C.Table_Name;
GO



-- Insert friends who speak the same language
-- (one direction)
-- If a person speaks Finnish, I take for granted that
-- their friends speak Finnish
WITH Friends_Same_Language AS
(
SELECT
P1.$node_id AS From_Node_Id
,P2.$node_id AS To_Node_Id
,GETDATE() AS StartDate
,Direction = ROW_NUMBER() OVER (PARTITION BY P1.[Language], P2.[Language] ORDER BY (SELECT NULL))
,From_FullName = P1.FullName
,From_Language = P1.[Language]
,To_FullName = P2.FullName
,To_Language = P2.[Language]
FROM
Nodes.Person AS P1
INNER JOIN
Nodes.Person AS P2 ON P1.[Language] = P2.[Language]
WHERE
-- The person itself isn't included
(P1.$node_id <> P2.$node_id)
)
INSERT INTO Edges.Friends
(
$from_id
,$to_id
,StartDate
)
SELECT
From_Node_Id
,To_Node_Id
,StartDate
FROM
Friends_Same_Language
WHERE
(Direction = 1);
GO


SELECT * FROM Nodes.Person;
GO


-- Insert some random connections
INSERT INTO Edges.Friends
(
$from_id
,$to_id
,StartDate
)
SELECT
From_Node_Id
,To_Node_Id
,StartDate
FROM
(
SELECT DISTINCT TOP 40
New_ID = NEWID()
,P1.$node_id AS From_Node_Id
,P2.$node_id AS To_Node_Id
,GETDATE() AS StartDate
FROM
Nodes.Person AS P1
INNER JOIN
Nodes.Person AS P2 ON P1.[Language] < P2.[Language]
WHERE
(P1.$node_id <> P2.$node_id)
ORDER BY
New_ID
) AS T;
GO