### Modifying data / Data manipulation language
In this section, you will learn how to modify data in the database using Data Manipulation Language (DML), which includes SQL statements such as INSERT, UPDATE, and DELETE.

- INSERT – insert a row into a table.
- INSERT multiple rows – insert multiple rows into a table using a single INSERT statement.
- INSERT INTO SELECT – insert data that comes from the result set of a query into a table.
- UPDATE – change the existing values in a table.
- UPDATE JOIN – update values in a table based on values from another table using JOIN clauses.
- DELETE – delete one or more rows of a table.
- MERGE – perform a mixture of insertions, updates, and deletions using a single statement.
- Transaction – start a transaction explicitly using the BEGIN TRANSACTION, COMMIT, and ROLLBACK statements.

In [34]:
import pyodbc
import os
import pandas as pd

# Check which SQL Server ODBC drivers are installed
# [x for x in pyodbc.drivers() if 'SQL Server' in x]

# Define the connection string
conn_str = (
    r'DRIVER={ODBC Driver 17 for SQL Server};'
    r'SERVER=localhost;'
    r'DATABASE=BikeStores;'
    r'Trusted_Connection=yes;'
)

# Establish the connection
conn = pyodbc.connect(conn_str)

# Create a cursor
cursor = conn.cursor()

In [35]:
# execute a query
cursor.execute('''
SELECT 
    name
FROM 
    master.sys.databases
ORDER BY 
    name;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,name
0,AdventureWorksDW2019
1,AdventureWorksLT2022
2,BikeStores
3,hr
4,master
5,model
6,msdb
7,tempdb
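The fetch-and-convert steps above repeat in nearly every cell that follows. A minimal sketch of a helper that wraps them is shown here; `rows_to_dicts` is a hypothetical name, not part of pyodbc, and it assumes only a DB-API style cursor that exposes `description` and `fetchall()`.

```python
def rows_to_dicts(cursor):
    """Return the pending result set of `cursor` as a list of dicts.

    Hypothetical helper (not part of pyodbc): it relies only on the
    DB-API attributes `description` and `fetchall()`.
    """
    # description is None when the last statement produced no result set
    if cursor.description is None:
        return []
    columns = [column[0] for column in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Usage with the cursor created earlier (requires a live connection):
# cursor.execute("SELECT name FROM master.sys.databases ORDER BY name;")
# df = pd.DataFrame(rows_to_dicts(cursor))
```

With such a helper, each display cell would reduce to one `execute` call plus one `pd.DataFrame(...)` call.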


### INSERT

Let's create a sample table to demonstrate INSERT.

In [36]:
# execute a query
cursor.execute('''
CREATE TABLE sales.promotions (
    promotion_id INT PRIMARY KEY IDENTITY (1, 1),
    promotion_name VARCHAR (255) NOT NULL,
    discount NUMERIC (3, 2) DEFAULT 0,
    start_date DATE NOT NULL,
    expired_date DATE NOT NULL
); 
''')

# The table name is qualified as schema_name.table_name; each column is defined as
# column_name data_type (size) followed by optional constraints (PRIMARY KEY, NOT NULL, DEFAULT)

ProgrammingError: ('42S01', "[42S01] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]There is already an object named 'promotions' in the database. (2714) (SQLExecDirectW)")
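The error above appears because the cell was run when the table already existed. One way to make the cell safe to re-run is to guard the CREATE TABLE with an existence check. The sketch below uses the T-SQL `OBJECT_ID` function for that; the names `CREATE_PROMOTIONS_SQL` and `create_promotions_table` are assumptions made for illustration.

```python
# T-SQL batch: create the table only when no user table with that name exists
CREATE_PROMOTIONS_SQL = """
IF OBJECT_ID(N'sales.promotions', N'U') IS NULL
BEGIN
    CREATE TABLE sales.promotions (
        promotion_id INT PRIMARY KEY IDENTITY (1, 1),
        promotion_name VARCHAR (255) NOT NULL,
        discount NUMERIC (3, 2) DEFAULT 0,
        start_date DATE NOT NULL,
        expired_date DATE NOT NULL
    );
END
"""


def create_promotions_table(cursor):
    """Create sales.promotions if it is missing (hypothetical helper)."""
    cursor.execute(CREATE_PROMOTIONS_SQL)


# Usage (requires the SQL Server connection from above):
# create_promotions_table(cursor)
# conn.commit()
```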

In [37]:
cursor.execute('''
INSERT INTO sales.promotions (
    promotion_name,
    discount,
    start_date,
    expired_date
)
VALUES
    (
        '2018 Summer Promotion',
        0.15,
        '20180601',
        '20180901'
    );
''')

# We can list only some of the columns, but every NOT NULL column without a default must be included.
# promotion_id is an IDENTITY column, so SQL Server generates its value and it is left out of the list.

<pyodbc.Cursor at 0x20575a2f8b0>
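The INSERT above embeds its values directly in the SQL text. When the values come from Python variables, passing them as query parameters is generally safer. The sketch below assumes pyodbc's `?` placeholder style; `insert_promotion` is a hypothetical helper name. Note also that pyodbc connections start with autocommit turned off unless configured otherwise, so a `conn.commit()` is needed for the row to persist after the connection closes.

```python
INSERT_PROMOTION_SQL = """
INSERT INTO sales.promotions (
    promotion_name,
    discount,
    start_date,
    expired_date
)
VALUES (?, ?, ?, ?);
"""


def insert_promotion(cursor, name, discount, start_date, expired_date):
    """Insert one promotion, passing the values as parameters (hypothetical helper)."""
    # promotion_id is omitted: the IDENTITY column is filled in by SQL Server
    cursor.execute(INSERT_PROMOTION_SQL, (name, discount, start_date, expired_date))


# Usage (requires the SQL Server connection from above):
# insert_promotion(cursor, '2018 Summer Promotion', 0.15, '20180601', '20180901')
# conn.commit()  # persist the insert
```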

Now display what has been inserted:

In [38]:
cursor.execute('''
SELECT
    *
FROM
    sales.promotions;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,promotion_id,promotion_name,discount,start_date,expired_date
0,1,2018 Summer Promotion,0.15,2018-06-01,2018-09-01
1,2,2018 Fall Promotion,0.15,2018-10-01,2018-11-01
2,12,2018 Summer Promotion,0.15,2018-06-01,2018-09-01


To insert a row and return the inserted values in the same statement, use the OUTPUT clause as shown:

In [39]:
cursor.execute('''
INSERT INTO sales.promotions (
    promotion_name,
    discount,
    start_date,
    expired_date
) OUTPUT inserted.promotion_id
VALUES
    (
        '2018 Fall Promotion',
        0.15,
        '20181001',
        '20181101'
    );

''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,promotion_id
0,13


To capture inserted values from multiple columns, list them as inserted.column_name after OUTPUT:

In [40]:
cursor.execute('''
INSERT INTO sales.promotions (
    promotion_name,
    discount,
    start_date,
    expired_date
) OUTPUT inserted.promotion_id,
 inserted.promotion_name,
 inserted.discount,
 inserted.start_date,
 inserted.expired_date
VALUES
    (
        '2018 Winter Promotion',
        0.2,
        '20181201',
        '20190101'
    );
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,promotion_id,promotion_name,discount,start_date,expired_date
0,14,2018 Winter Promotion,0.2,2018-12-01,2019-01-01


### INSERT multiple rows

In [41]:
cursor.execute('''
INSERT INTO sales.promotions (
    promotion_name,
    discount,
    start_date,
    expired_date
)
VALUES
    (
        '2019 Summer Promotion',
        0.15,
        '20190601',
        '20190901'
    ),
    (
        '2019 Fall Promotion',
        0.20,
        '20191001',
        '20191101'
    ),
    (
        '2019 Winter Promotion',
        0.25,
        '20191201',
        '20200101'
    );

''')


<pyodbc.Cursor at 0x20575a2f8b0>

In [42]:
cursor.execute('''
SELECT
    *
FROM
    sales.promotions;
''')


# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,promotion_id,promotion_name,discount,start_date,expired_date
0,1,2018 Summer Promotion,0.15,2018-06-01,2018-09-01
1,2,2018 Fall Promotion,0.15,2018-10-01,2018-11-01
2,12,2018 Summer Promotion,0.15,2018-06-01,2018-09-01
3,13,2018 Fall Promotion,0.15,2018-10-01,2018-11-01
4,14,2018 Winter Promotion,0.2,2018-12-01,2019-01-01
5,15,2019 Summer Promotion,0.15,2019-06-01,2019-09-01
6,16,2019 Fall Promotion,0.2,2019-10-01,2019-11-01
7,17,2019 Winter Promotion,0.25,2019-12-01,2020-01-01


Example: inserting multiple rows and returning the list of inserted IDs

In [43]:
cursor.execute('''
INSERT INTO 
	sales.promotions ( 
		promotion_name, discount, start_date, expired_date
	)
OUTPUT inserted.promotion_id
VALUES
	('2020 Summer Promotion',0.25,'20200601','20200901'),
	('2020 Fall Promotion',0.10,'20201001','20201101'),
	('2020 Winter Promotion', 0.25,'20201201','20210101');
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,promotion_id
0,18
1,19
2,20
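When the rows live in a Python list rather than in the SQL text, `cursor.executemany` can run one parameterized INSERT for each row. This is a sketch under that assumption, not the approach used in the cells above; `insert_promotions` is a hypothetical name, and the sketch does not collect the generated IDs.

```python
def insert_promotions(cursor, promotions):
    """Insert several promotions with one parameterized statement (hypothetical helper).

    `promotions` is an iterable of
    (promotion_name, discount, start_date, expired_date) tuples.
    """
    sql = (
        "INSERT INTO sales.promotions "
        "(promotion_name, discount, start_date, expired_date) "
        "VALUES (?, ?, ?, ?);"
    )
    # executemany runs the statement once per parameter tuple
    cursor.executemany(sql, list(promotions))


# Usage (requires the SQL Server connection from above):
# insert_promotions(cursor, [
#     ('2020 Summer Promotion', 0.25, '20200601', '20200901'),
#     ('2020 Fall Promotion', 0.10, '20201001', '20201101'),
# ])
# conn.commit()
```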


### INSERT INTO SELECT

Let's create a table to demonstrate INSERT INTO SELECT.

In [44]:
cursor.execute('''
-- creating a table

CREATE TABLE sales.addresses (
    address_id INT IDENTITY PRIMARY KEY,
    street VARCHAR (255) NOT NULL,
    city VARCHAR (50),
    state VARCHAR (25),
    zip_code VARCHAR (5)
);   
''')


<pyodbc.Cursor at 0x20575a2f8b0>

In [45]:

cursor.execute('''
--inserting values into table

INSERT INTO sales.addresses (street, city, state, zip_code) 
SELECT
    street,
    city,
    state,
    zip_code
FROM
    sales.customers
ORDER BY
    first_name,
    last_name; 
''')


<pyodbc.Cursor at 0x20575a2f8b0>

In [46]:

cursor.execute('''
-- displaying inserted values

SELECT
    *
FROM
    sales.addresses;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,address_id,street,city,state,zip_code
0,1,807 Grandrose Ave.,Yonkers,NY,10701
1,2,807 Grandrose Ave.,Yonkers,NY,10701
2,3,26 Market Drive,Forest Hills,NY,11375
3,4,26 Market Drive,Forest Hills,NY,11375
4,5,60 Myers Dr.,Amityville,NY,11701
5,6,60 Myers Dr.,Amityville,NY,11701
6,7,9782 Indian Spring Lane,Harlingen,TX,78552
7,8,9782 Indian Spring Lane,Harlingen,TX,78552
8,9,167 James St.,Los Banos,CA,93635
9,10,167 James St.,Los Banos,CA,93635


#### Insert some rows from another table example

In [47]:

cursor.execute('''
INSERT INTO 
    sales.addresses (street, city, state, zip_code) 
SELECT
    street,
    city,
    state,
    zip_code
FROM
    sales.stores
WHERE
    city IN ('Santa Cruz', 'Baldwin')
''')

cursor.execute('''
-- displaying inserted values

SELECT
    *
FROM
    sales.addresses;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,address_id,street,city,state,zip_code
0,1,807 Grandrose Ave.,Yonkers,NY,10701
1,2,807 Grandrose Ave.,Yonkers,NY,10701
2,3,26 Market Drive,Forest Hills,NY,11375
3,4,26 Market Drive,Forest Hills,NY,11375
4,5,60 Myers Dr.,Amityville,NY,11701
5,6,60 Myers Dr.,Amityville,NY,11701
6,7,9782 Indian Spring Lane,Harlingen,TX,78552
7,8,9782 Indian Spring Lane,Harlingen,TX,78552
8,9,167 James St.,Los Banos,CA,93635
9,10,167 James St.,Los Banos,CA,93635


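#### Insert the top N rows

Note that with INSERT TOP, the ORDER BY in the SELECT does not decide which rows are inserted; SQL Server picks the rows in no guaranteed order. To insert specific rows, put TOP and ORDER BY inside the SELECT itself.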

In [48]:

cursor.execute('''
TRUNCATE TABLE sales.addresses;
''')

cursor.execute('''
INSERT TOP (10) 
INTO sales.addresses (street, city, state, zip_code) 
SELECT
    street,
    city,
    state,
    zip_code
FROM
    sales.customers
ORDER BY
    first_name,
    last_name;
''')

cursor.execute('''
SELECT * FROM sales.addresses
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,address_id,street,city,state,zip_code
0,1,807 Grandrose Ave.,Yonkers,NY,10701
1,2,807 Grandrose Ave.,Yonkers,NY,10701
2,3,26 Market Drive,Forest Hills,NY,11375
3,4,26 Market Drive,Forest Hills,NY,11375
4,5,60 Myers Dr.,Amityville,NY,11701
5,6,60 Myers Dr.,Amityville,NY,11701
6,7,9782 Indian Spring Lane,Harlingen,TX,78552
7,8,9782 Indian Spring Lane,Harlingen,TX,78552
8,9,167 James St.,Los Banos,CA,93635
9,10,167 James St.,Los Banos,CA,93635


#### Insert the top percent of rows

In [49]:

cursor.execute('''
TRUNCATE TABLE sales.addresses;
''')

cursor.execute('''
INSERT TOP (10) PERCENT  
INTO sales.addresses (street, city, state, zip_code) 
SELECT
    street,
    city,
    state,
    zip_code
FROM
    sales.customers
ORDER BY
    first_name,
    last_name;
''')

cursor.execute('''
SELECT * FROM sales.addresses
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,address_id,street,city,state,zip_code
0,1,60 Myers Dr.,Amityville,NY,11701
1,2,8165 Baker Ave.,Franklin Square,NY,11010
2,3,683 West Kirkland Dr.,East Northport,NY,11731
3,4,34 Green Lake Street,Spring Valley,NY,10977
4,5,684 Howard St.,Sugar Land,TX,77478
5,6,8 San Juan Drive,East Elmhurst,NY,11369
6,7,14 Henry Smith St.,Rockville Centre,NY,11570
7,8,9544 Mulberry Drive,Rego Park,NY,11374
8,9,9593 North Sherman Dr.,Apple Valley,CA,92307
9,10,38 Old Fairground St.,East Northport,NY,11731


In [50]:
conn.close()
conn.closed

True

### UPDATE

To modify existing data in a table, you use the UPDATE statement.

Syntax: `UPDATE table_name SET c1 = v1, c2 = v2, ..., cn = vn [WHERE condition];`

In this syntax:

First, specify the name of the table you want to update after the UPDATE keyword.

Second, specify a list of columns c1, c2, …, cn and new values v1, v2, …, vn in the SET clause.

Third, filter the rows to update by specifying a condition in the WHERE clause. The WHERE clause is optional. If you skip the WHERE clause, the statement will update all rows in the table.
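The three steps above can be sketched as a small parameterized UPDATE. The example targets the `sales.promotions` table created earlier; the helper name `set_promotion_discount` and the choice of column are assumptions made for illustration.

```python
def set_promotion_discount(cursor, promotion_id, discount):
    """Update the discount of one promotion; return the number of rows changed."""
    cursor.execute(
        "UPDATE sales.promotions "   # 1) the table to update
        "SET discount = ? "          # 2) column = new value
        "WHERE promotion_id = ?;",   # 3) restrict which rows are updated
        (discount, promotion_id),
    )
    # rowcount reports how many rows the UPDATE touched
    return cursor.rowcount


# Usage (requires an open SQL Server connection and cursor):
# changed = set_promotion_discount(cursor, 1, 0.20)
# conn.commit()
```

Checking the returned row count is a cheap way to notice a WHERE clause that matched nothing, or one that matched far more rows than intended.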

In [51]:
import pyodbc
import os
import pandas as pd

# Check which SQL Server ODBC drivers are installed
# [x for x in pyodbc.drivers() if 'SQL Server' in x]

# Define the connection string
conn_str = (
    r'DRIVER={ODBC Driver 17 for SQL Server};'
    r'SERVER=localhost;'
    r'DATABASE=BikeStores;'
    r'Trusted_Connection=yes;'
)

# Establish the connection
conn = pyodbc.connect(conn_str)

# Create a cursor
cursor = conn.cursor()

In [52]:

cursor.execute('''
CREATE TABLE sales.taxes (
	tax_id INT PRIMARY KEY IDENTITY (1, 1),
	state VARCHAR (50) NOT NULL UNIQUE,
	state_tax_rate DEC (3, 2),
	avg_local_tax_rate DEC (3, 2),
	combined_rate AS state_tax_rate + avg_local_tax_rate,
	max_local_tax_rate DEC (3, 2),
	updated_at datetime
);
''')


<pyodbc.Cursor at 0x20575dce030>

With the table created, insert at least one row so that there is something to update.
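The cells that follow select from and update `sales.taxes`, but no rows are inserted into it in this section, so those statements have nothing to act on. A sketch of inserting one row is given here; the helper name `insert_tax` and the sample values in the usage comment are made-up placeholders, not real tax rates. `tax_id` is an IDENTITY column and `combined_rate` is a computed column, so neither appears in the column list.

```python
def insert_tax(cursor, state, state_tax_rate, avg_local_tax_rate, max_local_tax_rate):
    """Insert one row into sales.taxes (hypothetical helper)."""
    cursor.execute(
        "INSERT INTO sales.taxes "
        "(state, state_tax_rate, avg_local_tax_rate, max_local_tax_rate) "
        "VALUES (?, ?, ?, ?);",
        (state, state_tax_rate, avg_local_tax_rate, max_local_tax_rate),
    )


# Usage (requires the SQL Server connection from above).
# The values are placeholders; max_local_tax_rate is 0.01 so that the
# later "WHERE max_local_tax_rate = 0.01" update matches this row.
# insert_tax(cursor, 'Example State', 0.04, 0.00, 0.01)
# conn.commit()
```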

In [53]:
cursor.execute(
    '''
SELECT * FROM sales.taxes
    '''
)

<pyodbc.Cursor at 0x20575dce030>

In [54]:
cursor.execute('''
UPDATE sales.taxes
SET updated_at = GETDATE();
''')

<pyodbc.Cursor at 0x20575dce030>

In [55]:
cursor.execute('''
SELECT * FROM sales.taxes
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

#### Update multiple columns

In [56]:
cursor.execute('''
UPDATE sales.taxes
SET max_local_tax_rate += 0.02,
    avg_local_tax_rate += 0.01
WHERE
    max_local_tax_rate = 0.01;
''')

cursor.execute('''
SELECT * FROM sales.taxes
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

### DELETE

#### 1) Delete a number of random rows example

In [57]:
cursor.execute('''
DELETE TOP (21)
FROM sales.order_items;
''')

cursor.execute('''
SELECT * FROM sales.order_items
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,order_id,item_id,product_id,quantity,list_price,discount
0,8,1,22,1,269.99,0.05
1,8,2,20,2,599.99,0.07
2,9,1,7,2,3999.99,0.1
3,10,1,14,1,269.99,0.1
4,11,1,8,1,1799.99,0.05
5,11,2,22,2,269.99,0.1
6,11,3,16,2,599.99,0.2
7,12,1,4,2,2899.99,0.1
8,12,2,11,1,1680.99,0.05
9,13,1,13,1,269.99,0.1
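DELETE TOP removes whichever rows the engine reaches first, which is rarely what is wanted outside a demo. A more targeted sketch filters with WHERE and wraps the statement in a transaction, so that a failure leaves the table untouched. It assumes the connection has autocommit off (pyodbc's default); `delete_order_items` is a hypothetical helper name.

```python
def delete_order_items(conn, order_id):
    """Delete all items of one order; commit on success, roll back on error.

    Returns the number of rows deleted.
    """
    cursor = conn.cursor()
    try:
        cursor.execute(
            "DELETE FROM sales.order_items WHERE order_id = ?;",
            (order_id,),
        )
        deleted = cursor.rowcount
        conn.commit()      # make the deletion permanent
        return deleted
    except Exception:
        conn.rollback()    # undo the partial work, then re-raise
        raise


# Usage (requires the SQL Server connection from above):
# removed = delete_order_items(conn, 8)
```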


#### 2) Delete a percentage of random rows example

In [58]:
cursor.execute('''
DELETE TOP (5) PERCENT
FROM sales.order_items;

''')

cursor.execute('''
SELECT * FROM sales.order_items
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

Unnamed: 0,order_id,item_id,product_id,quantity,list_price,discount
0,88,3,7,1,3999.99,0.1
1,89,1,5,1,1320.99,0.05
2,89,2,6,2,469.99,0.1
3,90,1,3,1,999.99,0.1
4,90,2,6,1,469.99,0.07
5,91,1,11,1,1680.99,0.1
6,91,2,25,1,499.99,0.05
7,91,3,15,2,529.99,0.07
8,91,4,13,2,269.99,0.07
9,92,1,8,1,1799.99,0.1


#### 3) Delete all rows from a table example

In [32]:
cursor.execute('''
DELETE
FROM
    sales.order_items
''')

cursor.execute('''
SELECT * FROM sales.order_items
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

In [59]:
conn.close()
conn.closed

True