# Demo

Start with the Power BI (PBI) dashboard to get an overview of the state of your database.

In this demo we fix an over-partitioned table whose clustered columnstore index (CCI) is in poor health.

[Partitioning tables in dedicated SQL pool - Azure Synapse Analytics | Microsoft Docs](https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition)

## Current State

The following views show the health of the CCI:

- [vCCIHealth](https://github.com/microsoft/Azure_Synapse_Toolbox/blob/master/TSQL_Queries/Indexes/CCIHealthByTable.sql)
- [vCCI\_Stats\_Detail](https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-memory-optimizations-for-columnstore-compression)

### **vCCIHealth**

This view can be created and used on your system to compute the average number of rows per rowgroup and to identify any suboptimal clustered columnstore indexes.

### **vCCI\_Stats\_Detail**

This view contains useful information such as the number of rows in each rowgroup and, if a rowgroup was trimmed, the reason for the trimming.

In [1]:
SELECT * FROM dbo.vCCIHEalth WHERE table_name = 'Trip_Partitioned'

Schema_Name,Table_Name,Distribution_type,Total_Rows,Column_Count,OPEN_Row_Groups,OPEN_rows,MIN OPEN Row Group Rows,MAX OPEN_Row Group Rows,AVG OPEN_Row Group Rows,COMPRESSED_Row_Groups,COMPRESSED_Rows,Deleted_COMPRESSED_Rows,MIN COMPRESSED Row Group Rows,MAX COMPRESSED Row Group Rows,AVG_COMPRESSED_Rows,CLOSED_Row_Groups,CLOSED_Rows,MIN CLOSED Row Group Rows,MAX CLOSED Row Group Rows,AVG CLOSED Row Group Rows
dbo,Trip_Partitioned,HASH,170261325,23,647,20160109,422,98921,31159,720,150101216,0,160767,209717,208473,0,0,,,


In [8]:
SELECT * FROM dbo.vCCI_Stats_Detail WHERE logical_table_name = 'Trip_Partitioned' 
--and state_desc <> 'COMPRESSED' 
ORDER BY trim_reason_desc DESC

logical_table_name,row_group_id,state,state_desc,total_rows,trim_reason_desc,physical_name
Trip_Partitioned,2,3,COMPRESSED,53845,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,2,3,COMPRESSED,61026,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_8
Trip_Partitioned,2,3,COMPRESSED,61959,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_9
Trip_Partitioned,2,3,COMPRESSED,59189,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_37
Trip_Partitioned,2,3,COMPRESSED,65396,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_38
Trip_Partitioned,2,3,COMPRESSED,53845,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_39
Trip_Partitioned,2,3,COMPRESSED,50490,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_21
Trip_Partitioned,2,3,COMPRESSED,58958,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_42
Trip_Partitioned,2,3,COMPRESSED,50987,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_43
Trip_Partitioned,2,3,COMPRESSED,54392,REORG,Table_6cfc016a51394c4d8139e93d765b2d77_11
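With hundreds of rowgroups, the detail rows are easier to read when aggregated. The following is a minimal sketch that summarizes rowgroups by state and trim reason. It assumes `dbo.vCCI_Stats_Detail` exposes exactly the columns shown in the output above; the aliases (`row_groups`, `sum_rows`, and so on) are illustrative names only.

```sql
-- Summarize rowgroups by state and trim reason instead of reading them one by one.
-- Assumption: the view has the columns state_desc, trim_reason_desc and total_rows,
-- as shown in the output above.
SELECT
      state_desc
    , trim_reason_desc
    , COUNT(*)                        AS row_groups
    , SUM(CAST(total_rows AS bigint)) AS sum_rows
    , MIN(total_rows)                 AS min_rows
    , MAX(total_rows)                 AS max_rows
    , AVG(CAST(total_rows AS bigint)) AS avg_rows
FROM dbo.vCCI_Stats_Detail
WHERE logical_table_name = 'Trip_Partitioned'
GROUP BY state_desc, trim_reason_desc
ORDER BY row_groups DESC;
```

A large number of OPEN rowgroups, or compressed rowgroups whose average size is far below 1,048,576 rows, would point to the same problem that vCCIHealth reports.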


Check how many partitions a specific table has:

In [24]:
SELECT  
	  QUOTENAME(s.[name])+'.'+QUOTENAME(t.[name]) as Table_name
	, i.[name] as Index_name
	, COUNT(*) AS Partition_total
FROM    sys.partitions p
JOIN    sys.tables     t    ON    p.[object_id]   = t.[object_id]
JOIN    sys.schemas    s    ON    t.[schema_id]   = s.[schema_id]
JOIN    sys.indexes    i    ON    p.[object_id]   = i.[object_Id]
                            AND   p.[index_Id]    = i.[index_Id]
WHERE t.[name] = 'Trip_Partitioned' 
GROUP BY 	  
	  QUOTENAME(s.[name])+'.'+QUOTENAME(t.[name]) 
	, i.[name] 

Table_name,Index_name,Partition_total
[dbo].[Trip_Partitioned],ClusteredIndex_b18d11a02b7e48dda241dd83c29aaec9,13


## Issue and Fix

Creating a table with too many partitions can hurt performance: partitioning adds processing overhead and spreads the rows too thinly, which leads to inefficient CCI rowgroups.

When creating partitions on clustered columnstore tables, it is important to consider how many rows belong to each partition. For optimal compression and performance of clustered columnstore tables, a minimum of 1 million rows per distribution and partition is needed. Before partitions are created, a dedicated SQL pool already divides each table into 60 distributions.

### **_Formula_**

1,000,000 rows × 60 distributions × number of partitions

1,000,000 \* 60 \* 13 = 780,000,000 rows minimum, evenly distributed

Trip\_Partitioned holds only about 170 million rows, far below that minimum.
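The same arithmetic can be run as a query to compare the guideline with the actual table. This is a sketch only: the variable names are hypothetical, and the values are copied by hand from the outputs above (13 partitions, `Total_Rows` of 170261325).

```sql
-- Hypothetical check: recommended minimum rows versus what the table actually holds.
DECLARE @rows_per_unit bigint = 1000000;   -- guideline: rows per distribution and partition
DECLARE @distributions bigint = 60;        -- fixed in a dedicated SQL pool
DECLARE @partitions    bigint = 13;        -- Partition_total from the partition count query
DECLARE @actual_rows   bigint = 170261325; -- Total_Rows reported by vCCIHealth

SELECT
      @rows_per_unit * @distributions * @partitions AS min_rows_needed
    , @actual_rows / (@distributions * @partitions) AS avg_rows_per_distribution_partition;
```

Assuming an even spread, the second column comes out at roughly 218,000 rows per distribution and partition, well short of the 1 million guideline.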

### **Fix**

In this case, removing the partitioning altogether is beneficial: the table performs better without it.

In [1]:
-- Use LoadData user
CREATE TABLE Trip_Partitioned_new
WITH
(
	DISTRIBUTION = HASH ( [MedallionID] ),
	CLUSTERED COLUMNSTORE INDEX 
)
AS SELECT * FROM Trip_Partitioned
OPTION (LABEL = 'Loading Data')
GO

_**Let's compare the tables**_

_**Pay attention to the number of open row groups.**_

In [11]:
SELECT * FROM dbo.vCCIHEalth WHERE Table_Name = 'Trip_Partitioned';


Schema_Name,Table_Name,Distribution_type,Total_Rows,Column_Count,OPEN_Row_Groups,OPEN_rows,MIN OPEN Row Group Rows,MAX OPEN_Row Group Rows,AVG OPEN_Row Group Rows,COMPRESSED_Row_Groups,COMPRESSED_Rows,Deleted_COMPRESSED_Rows,MIN COMPRESSED Row Group Rows,MAX COMPRESSED Row Group Rows,AVG_COMPRESSED_Rows,CLOSED_Row_Groups,CLOSED_Rows,MIN CLOSED Row Group Rows,MAX CLOSED Row Group Rows,AVG CLOSED Row Group Rows
dbo,Trip_Partitioned,HASH,171164393,23,623,18814848,422,98921,30200,744,151446477,0,50490,209717,203557,0,0,,,


In [2]:
SELECT * FROM dbo.vCCIHEalth WHERE Table_Name = 'Trip_Partitioned_new';

Schema_Name,Table_Name,Distribution_type,Total_Rows,Column_Count,OPEN_Row_Groups,OPEN_rows,MIN OPEN Row Group Rows,MAX OPEN_Row Group Rows,AVG OPEN_Row Group Rows,COMPRESSED_Row_Groups,COMPRESSED_Rows,Deleted_COMPRESSED_Rows,MIN COMPRESSED Row Group Rows,MAX COMPRESSED Row Group Rows,AVG_COMPRESSED_Rows,CLOSED_Row_Groups,CLOSED_Rows,MIN CLOSED Row Group Rows,MAX CLOSED Row Group Rows,AVG CLOSED Row Group Rows
dbo,Trip_Partitioned_new,HASH,170261325,23,2,127888,57505,70383,63944,238,170133437,0,104585,1048576,714846,0,0,,,


In [13]:
SELECT * FROM dbo.vCCI_Stats_Detail WHERE logical_table_name = 'Trip_Partitioned' ORDER BY state_desc DESC;


logical_table_name,row_group_id,state,state_desc,total_rows,trim_reason_desc,physical_name
Trip_Partitioned,1,1,OPEN,21722,,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,1,1,OPEN,34692,,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,1,1,OPEN,46008,,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,1,1,OPEN,34213,,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,1,1,OPEN,26060,,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,1,1,OPEN,25720,,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,1,1,OPEN,4782,,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,1,1,OPEN,32728,,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,1,1,OPEN,40839,,Table_6cfc016a51394c4d8139e93d765b2d77_1
Trip_Partitioned,1,1,OPEN,45479,,Table_6cfc016a51394c4d8139e93d765b2d77_1


In [3]:
SELECT * FROM dbo.vCCI_Stats_Detail WHERE logical_table_name = 'Trip_Partitioned_new' ORDER BY state_desc DESC, trim_reason_desc DESC;

logical_table_name,row_group_id,state,state_desc,total_rows,trim_reason_desc,physical_name
Trip_Partitioned_new,3,1,OPEN,70383,,Table_f51155a33c3b444aba95aee00c2cf831_35
Trip_Partitioned_new,2,1,OPEN,57505,,Table_f51155a33c3b444aba95aee00c2cf831_35
Trip_Partitioned_new,1,3,COMPRESSED,1048576,NO_TRIM,Table_f51155a33c3b444aba95aee00c2cf831_35
Trip_Partitioned_new,0,3,COMPRESSED,1048576,NO_TRIM,Table_f51155a33c3b444aba95aee00c2cf831_35
Trip_Partitioned_new,1,3,COMPRESSED,1048576,NO_TRIM,Table_f51155a33c3b444aba95aee00c2cf831_60
Trip_Partitioned_new,0,3,COMPRESSED,1048576,NO_TRIM,Table_f51155a33c3b444aba95aee00c2cf831_60
Trip_Partitioned_new,1,3,COMPRESSED,1048576,NO_TRIM,Table_f51155a33c3b444aba95aee00c2cf831_1
Trip_Partitioned_new,0,3,COMPRESSED,1048576,NO_TRIM,Table_f51155a33c3b444aba95aee00c2cf831_1
Trip_Partitioned_new,1,3,COMPRESSED,1048576,NO_TRIM,Table_f51155a33c3b444aba95aee00c2cf831_2
Trip_Partitioned_new,0,3,COMPRESSED,1048576,NO_TRIM,Table_f51155a33c3b444aba95aee00c2cf831_2


In [4]:
RENAME OBJECT Trip_Partitioned TO Trip_Partitioned_orig;
RENAME OBJECT Trip_Partitioned_new TO Trip_Partitioned;
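Before dropping the original table, it may be worth confirming that the rebuilt table holds the same rows. The vCCIHealth outputs above report different `Total_Rows` values for the two tables (171164393 versus 170261325), so a quick count is a cheap safeguard. The sketch below assumes both tables live in the `dbo` schema and that the renames above have already run.

```sql
-- Compare row counts of the original and the rebuilt table before dropping anything.
-- Assumption: both tables are in the dbo schema.
SELECT 'Trip_Partitioned_orig' AS table_name, COUNT_BIG(*) AS row_count
FROM dbo.Trip_Partitioned_orig
UNION ALL
SELECT 'Trip_Partitioned'      AS table_name, COUNT_BIG(*) AS row_count
FROM dbo.Trip_Partitioned;
```

If the counts differ, rows were probably loaded into the original table after the CTAS ran, and they would need to be copied over before the drop.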

In [15]:
DROP TABLE dbo.Trip_Partitioned_orig

Go back to the PBI dashboard.

Refresh the dataset.