# T-SQL Tutorials

## SQL Server Bikestores Database

    GROUP BY – group the query result based on the values in a specified list of column expressions.
    HAVING – specify a search condition for a group or an aggregate.
    GROUPING SETS – generate multiple grouping sets.
    CUBE – generate grouping sets with all combinations of the dimension columns.
    ROLLUP – generate grouping sets with an assumption of the hierarchy between input columns.

In [1]:
import pyodbc
import os
import pandas as pd

# Check that a SQL Server ODBC driver is installed
[x for x in pyodbc.drivers() if "SQL Server" in x]

# Define the connection string
conn_str = (
    r'DRIVER={ODBC Driver 17 for SQL Server};'
    r'SERVER=localhost;'
    r'DATABASE=BikeStores;'
    r'Trusted_Connection=yes;'
)

# Establish the connection
conn = pyodbc.connect(conn_str)

# Create a cursor
cursor = conn.cursor()

### SQL Server GROUP BY clause and aggregate functions

In practice, the GROUP BY clause is often used with aggregate functions for generating summary reports.

An aggregate function performs a calculation on a group and returns a single value per group. For example, COUNT() returns the number of rows in each group. Other commonly used aggregate functions are SUM(), AVG() (average), MIN() (minimum), and MAX() (maximum).

The GROUP BY clause arranges rows into groups, and an aggregate function returns the summary (count, min, max, average, sum, etc.) for each group.
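
To make the mechanics concrete, here is a minimal pure-Python sketch of what grouping followed by aggregation does conceptually. The `orders` rows and the `group_and_aggregate` helper are hypothetical illustrations; they are not BikeStores data and do not describe how SQL Server implements GROUP BY internally.

```python
from collections import defaultdict

# Hypothetical rows: (customer_id, order_year, amount); not BikeStores data.
orders = [
    (1, 2016, 100.0),
    (1, 2018, 250.0),
    (1, 2018, 50.0),
    (2, 2017, 80.0),
]


def group_and_aggregate(rows):
    """Group rows by (customer_id, order_year) and summarize each group."""
    groups = defaultdict(list)
    for customer_id, order_year, amount in rows:
        # GROUP BY step: rows that share a key land in the same group.
        groups[(customer_id, order_year)].append(amount)

    summary = {}
    for key, amounts in groups.items():
        # Aggregate step: each function collapses a group into one value.
        summary[key] = {
            "count": len(amounts),
            "sum": sum(amounts),
            "min": min(amounts),
            "max": max(amounts),
            "avg": sum(amounts) / len(amounts),
        }
    return summary


summary = group_and_aggregate(orders)
for key in sorted(summary):
    print(key, summary[key])
```

Four input rows collapse into three groups here, because customer 1 has two rows for 2018.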

For example, the following query returns the number of orders that customers 1 and 2 placed in each year:

In [2]:
# execute a query
cursor.execute('''
SELECT
    customer_id,
    YEAR (order_date) order_year,
    COUNT (order_id) order_placed
FROM
    sales.orders
WHERE
    customer_id IN (1, 2)
GROUP BY
    customer_id,
    YEAR (order_date)
ORDER BY
    customer_id; 
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,customer_id,order_year,order_placed
0,1,2016,1
1,1,2018,2
2,2,2017,2
3,2,2018,1
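
The fetch-and-convert steps in the cell above recur after every query in this notebook. One way to fold them into a single call is sketched below. The `fetch_dicts` name is hypothetical, and the sketch runs against an in-memory SQLite database as a stand-in (an assumption made only so that it works without a SQL Server instance); it relies on the standard `cursor.description` and `fetchall()` interface that the cells above already use.

```python
import sqlite3


def fetch_dicts(cursor, sql):
    """Run a query and return its rows as a list of {column: value} dicts."""
    cursor.execute(sql)
    # cursor.description has one entry per result column; item 0 is the name.
    columns = [column[0] for column in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in database with made-up rows (not the BikeStores data).
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
demo_cursor.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])

records = fetch_dicts(
    demo_cursor,
    "SELECT customer_id, COUNT(order_id) AS order_placed "
    "FROM orders GROUP BY customer_id ORDER BY customer_id",
)
print(records)
demo_conn.close()
```

With the pyodbc cursor from the first cell, something like `pd.DataFrame(fetch_dicts(cursor, query))` should be able to replace the four repeated steps, though that has not been verified against this notebook's environment.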


In [2]:

# execute a query
cursor.execute('''
SELECT
    city,
    COUNT (customer_id) customer_count
FROM
    sales.customers
GROUP BY
    city
ORDER BY
    city;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,city,customer_count
0,Albany,3
1,Amarillo,5
2,Amityville,9
3,Amsterdam,5
4,Anaheim,11


In [3]:

# execute a query
cursor.execute('''
SELECT
    city,
    state,
    COUNT (customer_id) customer_count
FROM
    sales.customers
GROUP BY
    state,
    city
ORDER BY
    city,
    state;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,city,state,customer_count
0,Albany,NY,3
1,Amarillo,TX,5
2,Amityville,NY,9
3,Amsterdam,NY,5
4,Anaheim,CA,11


2) Using GROUP BY clause with the MIN and MAX functions example

In [15]:

# execute a query
cursor.execute('''
SELECT
    brand_name,
    MIN (list_price) min_price,
    MAX (list_price) max_price
FROM
    production.products p
INNER JOIN production.brands b 
    ON b.brand_id = p.brand_id
WHERE
    model_year = 2018
GROUP BY
    brand_name
ORDER BY
    brand_name;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,brand_name,min_price,max_price
0,Electra,269.99,2999.99
1,Heller,2599.0,2599.0
2,Strider,89.99,289.99
3,Surly,469.99,2499.99
4,Trek,159.99,11999.99


3) Using GROUP BY clause with the AVG() function example

In [16]:

# execute a query
cursor.execute('''
SELECT
    brand_name,
    AVG (list_price) avg_price
FROM
    production.products p
INNER JOIN production.brands b
    ON b.brand_id = p.brand_id
WHERE
    model_year = 2018
GROUP BY
    brand_name
ORDER BY
    brand_name;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,brand_name,avg_price
0,Electra,848.100111
1,Heller,2599.0
2,Strider,209.99
3,Surly,1502.457692
4,Trek,2464.99


4) Using GROUP BY clause with the SUM function example

In [17]:

# execute a query
cursor.execute('''
SELECT
    order_id,
    SUM (
        quantity * list_price * (1 - discount)
    ) net_value
FROM
    sales.order_items
GROUP BY
    order_id;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,order_id,net_value
0,1,10231.0464
1,2,1697.9717
2,3,1519.981
3,4,1349.982
4,5,3900.0607


### HAVING Clause

The HAVING clause is often used with the GROUP BY clause to filter groups based on a specified list of conditions.

When a query uses both, the GROUP BY clause summarizes the rows into groups and the HAVING clause applies one or more conditions to these groups. Only groups for which the conditions evaluate to TRUE are included in the result. In other words, groups for which the condition evaluates to FALSE or UNKNOWN are filtered out.
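
The TRUE, FALSE, and UNKNOWN outcomes can be seen side by side in a small sketch. It uses an in-memory SQLite database with made-up rows as a stand-in for SQL Server (an assumption, chosen so the example runs anywhere); the `order_items` table here is hypothetical and has a single `amount` column rather than the BikeStores schema.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute("CREATE TABLE order_items (order_id INTEGER, amount REAL)")
demo_cur.executemany(
    "INSERT INTO order_items VALUES (?, ?)",
    [
        (1, 15000.0), (1, 9000.0),  # total 24000 -> condition is TRUE
        (2, 500.0), (2, 700.0),     # total 1200  -> condition is FALSE
        (3, None),                  # SUM is NULL -> condition is UNKNOWN
    ],
)

demo_cur.execute(
    """
    SELECT order_id, SUM(amount) AS net_value
    FROM order_items
    GROUP BY order_id
    HAVING SUM(amount) > 20000
    ORDER BY order_id
    """
)
kept = demo_cur.fetchall()
print(kept)
demo_conn.close()
```

Only order 1 survives: order 2 fails the comparison, and order 3 has a NULL sum, so its comparison is UNKNOWN rather than TRUE.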

In [20]:

# execute a query
cursor.execute('''
SELECT
    customer_id,
    YEAR (order_date),
    COUNT (order_id) order_count
FROM
    sales.orders
GROUP BY
    customer_id,
    YEAR (order_date)
HAVING
    COUNT (order_id) >= 2
ORDER BY
    customer_id;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,customer_id,Unnamed: 2,order_count
0,1,2018,2
1,2,2017,2
2,3,2018,3
3,4,2017,2
4,5,2016,2


In this example:

    First, the GROUP BY clause groups the sales orders by customer and order year. The COUNT() function returns the number of orders each customer placed in each year.
    Second, the HAVING clause filters out all the groups (customer and year) that have fewer than two orders.

The following statement finds the sales orders whose net values are greater than 20,000:

In [21]:

# execute a query
cursor.execute('''
SELECT
    order_id,
    SUM (
        quantity * list_price * (1 - discount)
    ) net_value
FROM
    sales.order_items
GROUP BY
    order_id
HAVING
    SUM (
        quantity * list_price * (1 - discount)
    ) > 20000
ORDER BY
    net_value;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,order_id,net_value
0,973,20177.7457
1,1334,20509.4254
2,1348,20648.9537
3,930,24607.0261
4,1364,24890.6244


In this example:

    First, the SUM() function returns the net values of sales orders.
    Second, the HAVING clause filters out the sales orders whose net values are less than or equal to 20,000.

The following statement first finds the maximum and minimum list prices in each product category. Then, it keeps only the categories that have a maximum list price greater than 4,000 or a minimum list price less than 500:

In [25]:

# execute a query
cursor.execute('''
SELECT
    category_id,
    MAX (list_price) max_list_price,
    MIN (list_price) min_list_price
FROM
    production.products
GROUP BY
    category_id
HAVING
    MAX (list_price) > 4000 OR MIN (list_price) < 500;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,category_id,max_list_price,min_list_price
0,1,489.99,89.99
1,2,2599.99,416.99
2,3,2999.99,250.99
3,5,4999.99,1559.99
4,6,5299.99,379.99


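The following statement finds the product categories whose average list prices are between 500 and 1,000 (inclusive):

In [26]: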

# execute a query
cursor.execute('''
SELECT
    category_id,
    AVG (list_price) avg_list_price
FROM
    production.products
GROUP BY
    category_id
HAVING
    AVG (list_price) BETWEEN 500 AND 1000;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,category_id,avg_list_price
0,2,682.123333
1,3,730.412307


Summary
Use the SQL Server HAVING clause to filter groups based on specified conditions.

### GROUPING SETS

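The examples in this section query sales.sales_summary, which holds sales amounts by brand, category, and model year:

In [27]: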
# execute a query
cursor.execute('''
SELECT
	*
FROM
	sales.sales_summary
ORDER BY
	brand,
	category,
	model_year;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head()

Unnamed: 0,brand,category,model_year,sales
0,Electra,Children Bicycles,2016,109819.0
1,Electra,Children Bicycles,2017,79664.0
2,Electra,Children Bicycles,2018,18123.0
3,Electra,Comfort Bicycles,2016,206615.0
4,Electra,Comfort Bicycles,2017,17502.0


By definition, a grouping set is a group of columns by which you group. Typically, a single query with an aggregate defines a single grouping set.

For example, the following query defines a grouping set that includes brand and category which is denoted as (brand, category). The query returns the sales amount grouped by brand and category:

In [29]:
# execute a query
cursor.execute('''
SELECT
    brand,
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    brand,
    category
ORDER BY
    brand,
    category;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,brand,category,sales
0,Electra,Children Bicycles,207606.0
1,Electra,Comfort Bicycles,271542.0
2,Electra,Cruisers Bicycles,694909.0
3,Electra,Electric Bikes,31264.0
4,Haro,Children Bicycles,29240.0
5,Haro,Mountain Bikes,156145.0
6,Heller,Mountain Bikes,171459.0
7,Pure Cycles,Cruisers Bicycles,149476.0
8,Ritchey,Mountain Bikes,78899.0
9,Strider,Children Bicycles,4320.0


The following query returns the sales amount by brand. It defines a grouping set (brand):

In [30]:

# execute a query
cursor.execute('''
SELECT
    brand,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    brand
ORDER BY
    brand;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,brand,sales
0,Electra,1205321.0
1,Haro,185385.0
2,Heller,171459.0
3,Pure Cycles,149476.0
4,Ritchey,78899.0
5,Strider,4320.0
6,Sun Bicycles,341994.0
7,Surly,949505.0
8,Trek,4602754.0


The following query returns the sales amount by category. It defines a grouping set (category):

In [31]:

# execute a query
cursor.execute('''
SELECT
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    category
ORDER BY
    category;

''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,category,sales
0,Children Bicycles,292189.0
1,Comfort Bicycles,394020.0
2,Cruisers Bicycles,995032.0
3,Cyclocross Bicycles,711011.0
4,Electric Bikes,916685.0
5,Mountain Bikes,2715078.0
6,Road Bikes,1665098.0


The following query defines an empty grouping set (). It returns the sales amount for all brands and categories.

In [32]:

# execute a query
cursor.execute('''
SELECT
    SUM (sales) sales
FROM
    sales.sales_summary;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,sales
0,7689113.0


The four queries above return four result sets with four grouping sets:

    (brand, category)
    (brand)
    (category)
    ()

To get a unified result set with the aggregated data for all grouping sets, you can use the UNION ALL operator.

Because the UNION ALL operator requires all result sets to have the same number of columns, you need to add NULL to the select lists of the queries like this:

In [33]:

# execute a query
cursor.execute('''
SELECT
    brand,
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    brand,
    category
UNION ALL
SELECT
    brand,
    NULL,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    brand
UNION ALL
SELECT
    NULL,
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    category
UNION ALL
SELECT
    NULL,
    NULL,
    SUM (sales)
FROM
    sales.sales_summary
ORDER BY brand, category;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,brand,category,sales
0,,,7689113.0
1,,Children Bicycles,292189.0
2,,Comfort Bicycles,394020.0
3,,Cruisers Bicycles,995032.0
4,,Cyclocross Bicycles,711011.0
5,,Electric Bikes,916685.0
6,,Mountain Bikes,2715078.0
7,,Road Bikes,1665098.0
8,Electra,,1205321.0
9,Electra,Children Bicycles,207606.0


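The UNION ALL query returns the expected result, but it has two problems: the statement is lengthy, and SQL Server has to evaluate four separate queries against the same table and combine their results. To fix these problems, SQL Server provides a subclause of the GROUP BY clause called GROUPING SETS.

The GROUPING SETS defines multiple grouping sets in the same query. The following query uses GROUPING SETS to return the same result as the UNION ALL query above: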

In [36]:

# execute a query
cursor.execute('''
SELECT
	brand,
	category,
	SUM (sales) sales
FROM
	sales.sales_summary
GROUP BY
	GROUPING SETS (
		(brand, category),
		(brand),
		(category),
		()
	)
ORDER BY
	brand,
	category;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,brand,category,sales
0,,,7689113.0
1,,Children Bicycles,292189.0
2,,Comfort Bicycles,394020.0
3,,Cruisers Bicycles,995032.0
4,,Cyclocross Bicycles,711011.0
5,,Electric Bikes,916685.0
6,,Mountain Bikes,2715078.0
7,,Road Bikes,1665098.0
8,Electra,,1205321.0
9,Electra,Children Bicycles,207606.0


### GROUPING function

The GROUPING function indicates whether a specified column in a GROUP BY clause is aggregated or not. It returns 1 for aggregated or 0 for not aggregated in the result set.

In [37]:

# execute a query
cursor.execute('''
SELECT
    GROUPING(brand) grouping_brand,
    GROUPING(category) grouping_category,
    brand,
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    GROUPING SETS (
        (brand, category),
        (brand),
        (category),
        ()
    )
ORDER BY
    brand,
    category;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,grouping_brand,grouping_category,brand,category,sales
0,1,1,,,7689113.0
1,1,0,,Children Bicycles,292189.0
2,1,0,,Comfort Bicycles,394020.0
3,1,0,,Cruisers Bicycles,995032.0
4,1,0,,Cyclocross Bicycles,711011.0
5,1,0,,Electric Bikes,916685.0
6,1,0,,Mountain Bikes,2715078.0
7,1,0,,Road Bikes,1665098.0
8,0,1,Electra,,1205321.0
9,0,0,Electra,Children Bicycles,207606.0


The value in the grouping_brand column indicates whether the brand column is aggregated in that row:

    1 means that brand is aggregated: the row is a subtotal across all brands, and the empty brand value is a placeholder.
    0 means that brand is not aggregated: the sales amount belongs to the brand shown in the row.
    The same logic applies to the grouping_category column.
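
A common use of these flags is to replace the empty placeholders with readable labels on the client side. The sketch below does this for four rows copied from the output above; the `label` helper and the "All brands" / "All categories" strings are hypothetical choices, not part of the original tutorial.

```python
# Rows copied from the result above:
# (grouping_brand, grouping_category, brand, category, sales)
sample_rows = [
    (1, 1, None, None, 7689113.0),
    (1, 0, None, "Children Bicycles", 292189.0),
    (0, 1, "Electra", None, 1205321.0),
    (0, 0, "Electra", "Children Bicycles", 207606.0),
]


def label(flag, value, all_label):
    """Return a subtotal label when the column was aggregated (flag == 1)."""
    return all_label if flag == 1 else value


labeled = [
    (
        label(g_brand, brand, "All brands"),
        label(g_category, category, "All categories"),
        sales,
    )
    for g_brand, g_category, brand, category, sales in sample_rows
]

for row in labeled:
    print(row)
```

Checking the flag rather than testing the value for NULL matters when the underlying column can itself contain NULL, because only the flag tells a subtotal row apart from a genuine missing value.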

### CUBE

Grouping sets specify groupings of data in a single query. For example, the following query defines a single grouping set represented as (brand):

In [38]:
# execute a query
cursor.execute('''
SELECT 
    brand, 
    SUM(sales)
FROM 
    sales.sales_summary
GROUP BY 
    brand;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,brand,Unnamed: 2
0,Electra,1205321.0
1,Haro,185385.0
2,Heller,171459.0
3,Pure Cycles,149476.0
4,Ritchey,78899.0
5,Strider,4320.0
6,Sun Bicycles,341994.0
7,Surly,949505.0
8,Trek,4602754.0


The CUBE is a subclause of the GROUP BY clause that allows you to generate multiple grouping sets. It generates all possible grouping sets based on the dimension columns that you specify. For example, for three dimension columns d1, d2, and d3, CUBE (d1, d2, d3) generates eight grouping sets: (d1, d2, d3), (d1, d2), (d1, d3), (d2, d3), (d1), (d2), (d3), and ().

The following query uses CUBE to generate the grouping sets for brand and category. It is equivalent to a query that uses GROUPING SETS with (brand, category), (brand), (category), and ():

In [42]:

# execute a query
cursor.execute('''
SELECT
    brand,
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    CUBE(brand, category);
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,brand,category,sales
0,Electra,Children Bicycles,207606.0
1,Haro,Children Bicycles,29240.0
2,Strider,Children Bicycles,4320.0
3,Sun Bicycles,Children Bicycles,2328.0
4,Trek,Children Bicycles,48695.0
5,,Children Bicycles,292189.0
6,Electra,Comfort Bicycles,271542.0
7,Sun Bicycles,Comfort Bicycles,122478.0
8,,Comfort Bicycles,394020.0
9,Electra,Cruisers Bicycles,694909.0


In this example, the CUBE clause specifies two dimension columns, so the query generates a total of four (2^2) grouping sets.
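
The count of grouping sets can be checked by enumerating them. The following is a small sketch with a hypothetical `cube_grouping_sets` helper; it only lists the column combinations that CUBE stands for and does not run any SQL.

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """List every grouping set that CUBE(columns) stands for."""
    grouping_sets = []
    # Walk from the full column list down to the empty grouping set ().
    for size in range(len(columns), -1, -1):
        grouping_sets.extend(combinations(columns, size))
    return grouping_sets


two_columns = cube_grouping_sets(["brand", "category"])
three_columns = cube_grouping_sets(["d1", "d2", "d3"])

print(two_columns)
print(len(two_columns), len(three_columns))
```

Each extra dimension column doubles the number of grouping sets, which is why CUBE over many columns can produce large results.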

The following example illustrates how to perform a partial CUBE to reduce the number of grouping sets generated by the query:

In [43]:

# execute a query
cursor.execute('''
SELECT
    brand,
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    brand,
    CUBE(category);
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,brand,category,sales
0,Electra,Children Bicycles,207606.0
1,Electra,Comfort Bicycles,271542.0
2,Electra,Cruisers Bicycles,694909.0
3,Electra,Electric Bikes,31264.0
4,Electra,,1205321.0
5,Haro,Children Bicycles,29240.0
6,Haro,Mountain Bikes,156145.0
7,Haro,,185385.0
8,Heller,Mountain Bikes,171459.0
9,Heller,,171459.0
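
The output above contains only per-brand detail rows and per-brand subtotals, which matches the grouping sets (brand, category) and (brand). The sketch below shows how that follows from combining the plain grouping column with the sets produced by CUBE(category). The helper names `cube` and `concatenate` are hypothetical, and the code only manipulates column names; it does not run any SQL.

```python
from itertools import combinations, product


def cube(columns):
    """Grouping sets generated by CUBE(columns): every subset of columns."""
    return [
        subset
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]


def concatenate(*groupings):
    """Pair every grouping set of each item with every set of the others."""
    return [sum(parts, ()) for parts in product(*groupings)]


# GROUP BY brand, CUBE(category): brand is a single fixed grouping set.
partial_cube = concatenate([("brand",)], cube(["category"]))
full_cube = cube(["brand", "category"])

print(partial_cube)
print(full_cube)
```

The partial form yields two grouping sets instead of four: the per-category totals and the grand total are not generated because brand is present in every set.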


### SQL Server ROLLUP

The SQL Server ROLLUP is a subclause of the GROUP BY clause which provides a shorthand for defining multiple grouping sets.

Unlike the CUBE subclause, ROLLUP does not create all possible grouping sets based on the dimension columns; ROLLUP makes a subset of those.

When generating the grouping sets, ROLLUP assumes a hierarchy among the dimension columns and only generates grouping sets based on this hierarchy.

The ROLLUP is often used to generate subtotals and totals for reporting purposes.

The ROLLUP is commonly used to calculate the aggregates of hierarchical data such as sales by year > quarter > month.
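
The hierarchy assumption means that ROLLUP removes one trailing column at a time. The sketch below lists the resulting grouping sets for the year > quarter > month hierarchy mentioned above; `rollup_grouping_sets` is a hypothetical helper that works on column names only and does not run any SQL.

```python
def rollup_grouping_sets(columns):
    """List the grouping sets that ROLLUP(columns) stands for."""
    # Drop one trailing column at a time: the prefixes of the column list.
    return [tuple(columns[:size]) for size in range(len(columns), -1, -1)]


hierarchy_sets = rollup_grouping_sets(["year", "quarter", "month"])
for grouping_set in hierarchy_sets:
    print(grouping_set)
```

For n columns this gives n + 1 grouping sets (here four: month-level detail, quarter subtotals, year subtotals, and the grand total), and reordering the columns changes which subtotals are produced.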

In [44]:

# execute a query
cursor.execute('''
SELECT
    brand,
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    brand,
    ROLLUP(category);
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,brand,category,sales
0,Electra,Children Bicycles,207606.0
1,Electra,Comfort Bicycles,271542.0
2,Electra,Cruisers Bicycles,694909.0
3,Electra,Electric Bikes,31264.0
4,Electra,,1205321.0
5,Haro,Children Bicycles,29240.0
6,Haro,Mountain Bikes,156145.0
7,Haro,,185385.0
8,Heller,Mountain Bikes,171459.0
9,Heller,,171459.0


In this example, the query assumes a hierarchy between brand and category: brand > category. Because brand is listed outside ROLLUP, the result contains a subtotal for each brand but no grand total.

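Note that the order of the columns inside ROLLUP matters, because ROLLUP (category, brand) assumes the hierarchy category > brand. The following query keeps brand as a regular grouping column and lists category before brand inside ROLLUP: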

In [45]:

# execute a query
cursor.execute('''
SELECT
    brand,
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    brand,
    ROLLUP(category, brand);
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(20)

Unnamed: 0,brand,category,sales
0,Electra,Children Bicycles,207606.0
1,Electra,Comfort Bicycles,271542.0
2,Electra,Cruisers Bicycles,694909.0
3,Electra,Electric Bikes,31264.0
4,Electra,,1205321.0
5,Haro,Children Bicycles,29240.0
6,Haro,Mountain Bikes,156145.0
7,Haro,,185385.0
8,Heller,Mountain Bikes,171459.0
9,Heller,,171459.0
