# Demo

Start by looking at the Power BI (PBI) dashboard to get insight into the current state of your database.

In this demo we are going to fix an over-partitioned table whose clustered columnstore index (CCI) is of poor quality.

[Partitioning tables in dedicated SQL pool - Azure Synapse Analytics | Microsoft Docs](https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition)

## Current State

Using these views, we will assess the health of the CCI:

- [vCCIHEalth](https://github.com/microsoft/Azure_Synapse_Toolbox/blob/master/TSQL_Queries/Indexes/CCIHealthByTable.sql) 
- [vCCI\_Stats\_Detail](https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-memory-optimizations-for-columnstore-compression)

### **vCCIHealth**

This view can be created and used on your system to compute the average rows per rowgroup and identify any suboptimal clustered columnstore indexes.

### **vCCI\_Stats\_Detail**

This view contains useful information such as the number of rows in each rowgroup and, if a rowgroup was trimmed, the reason for the trimming.

Focus on the **OPEN_Row_Groups, MAX COMPRESSED Row Group Rows and AVG_COMPRESSED_Rows** columns.

In [None]:
SELECT * FROM dbo.vCCIHEalth WHERE table_name = 'Trip_Partitioned'

In [None]:
SELECT * FROM dbo.vCCI_Stats_Detail WHERE logical_table_name = 'Trip_Partitioned' 
--and state_desc <> 'COMPRESSED' 
ORDER BY trim_reason_desc DESC

Check how many partitions a specific table has:

In [None]:
SELECT  
	  QUOTENAME(s.[name])+'.'+QUOTENAME(t.[name]) as Table_name
	, i.[name] as Index_name
	, COUNT(*) AS Partition_total
FROM    sys.partitions p
JOIN    sys.tables     t    ON    p.[object_id]   = t.[object_id]
JOIN    sys.schemas    s    ON    t.[schema_id]   = s.[schema_id]
JOIN    sys.indexes    i    ON    p.[object_id]   = i.[object_Id]
                            AND   p.[index_Id]    = i.[index_Id]
WHERE t.[name] = 'Trip_Partitioned' 
GROUP BY 	  
	  QUOTENAME(s.[name])+'.'+QUOTENAME(t.[name]) 
	, i.[name] 

## Issue and Fix

Creating a table with too many partitions can hurt performance: the rows are spread so thinly across partitions that the CCI ends up with small, inefficient rowgroups.

When creating partitions on clustered columnstore tables, it is important to consider how many rows belong to each partition. For optimal compression and performance of clustered columnstore tables, a minimum of 1 million rows per distribution and partition is needed. Before partitions are created, dedicated SQL pool already divides each table into 60 distributions.

### **_Formula_**

1,000,000 rows × 60 distributions × number of partitions

1,000,000 \* 60 \* 13 = 780,000,000 minimum rows, evenly distributed
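
The formula can also be checked directly against the table. The following is a minimal sketch, assuming the table lives in the `dbo` schema and has 13 partitions (the figure used above); the variable names are illustrative only, and the partition count should be replaced with the `Partition_total` value returned by the earlier query.

```sql
-- Sketch only: variable names are hypothetical, values come from the formula above.
DECLARE @min_rows_per_rowgroup bigint = 1000000; -- target rows per distribution and partition
DECLARE @distributions         bigint = 60;      -- fixed for a dedicated SQL pool
DECLARE @partitions            bigint = 13;      -- assumption: replace with Partition_total

-- Minimum number of evenly distributed rows the table would need (780,000,000 for 13 partitions)
SELECT @min_rows_per_rowgroup * @distributions * @partitions AS min_rows_needed;

-- Actual row count and the average rows per distribution and partition
SELECT
      COUNT_BIG(*)                                  AS actual_rows
    , COUNT_BIG(*) / (@distributions * @partitions) AS avg_rows_per_distribution_partition
FROM dbo.Trip_Partitioned;
```

If `avg_rows_per_distribution_partition` is well below 1,000,000, the table is likely over-partitioned for its size.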

### **Fix**

In this case, removing all partitions is beneficial: the table will perform better without them.

In [None]:
-- Use LoadData user
CREATE TABLE Trip_Partitioned_new
WITH
(
	DISTRIBUTION = HASH ( [MedallionID] ),
	CLUSTERED COLUMNSTORE INDEX 
)
AS SELECT * FROM Trip_Partitioned
OPTION (LABEL = 'Loading Data')
GO

_**Let's compare the tables**_

_**Pay attention to the number of open rowgroups.**_

In [None]:
SELECT * FROM dbo.vCCIHEalth WHERE Table_Name = 'Trip_Partitioned';


In [None]:
SELECT * FROM dbo.vCCIHEalth WHERE Table_Name = 'Trip_Partitioned_new';

In [None]:
SELECT * FROM dbo.vCCI_Stats_Detail WHERE logical_table_name = 'Trip_Partitioned' ORDER BY state_desc DESC;

In [None]:
SELECT * FROM dbo.vCCI_Stats_Detail WHERE logical_table_name = 'Trip_Partitioned_new' ORDER BY state_desc DESC, trim_reason_desc DESC;
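
Before swapping the tables, it may be worth confirming that the copy is complete. This is a minimal sketch of such a check, assuming both tables are in the `dbo` schema; it is not part of the original demo steps.

```sql
-- Sketch only: compare row counts of the original table and the CTAS copy.
-- Assumes both tables are in the dbo schema.
SELECT 'Trip_Partitioned' AS table_name, COUNT_BIG(*) AS row_count
FROM dbo.Trip_Partitioned
UNION ALL
SELECT 'Trip_Partitioned_new' AS table_name, COUNT_BIG(*) AS row_count
FROM dbo.Trip_Partitioned_new;
```

The two `row_count` values should match before the original table is renamed and dropped.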

In [None]:
-- Swap the tables: keep the original under a new name, then promote the copy
RENAME OBJECT Trip_Partitioned TO Trip_Partitioned_orig;
RENAME OBJECT Trip_Partitioned_new TO Trip_Partitioned;

In [None]:
DROP TABLE dbo.Trip_Partitioned_orig

Go back to the PBI dashboard and refresh the dataset.