### Modifying data / Data manipulation language
In this section, you will learn how to modify data in the database using Data Manipulation Language (DML), which includes SQL commands such as INSERT, DELETE, and UPDATE.
    
- INSERT – insert a row into a table.
- INSERT multiple rows – insert multiple rows into a table using a single INSERT statement.
- INSERT INTO SELECT – insert the rows returned by a query into a table.
- UPDATE – change existing values in a table.
- UPDATE JOIN – update values in a table based on values from another table using JOIN clauses.
- DELETE – delete one or more rows from a table.
- MERGE – perform a mixture of insertions, updates, and deletions using a single statement.
- Transaction – start a transaction explicitly with BEGIN TRANSACTION and end it with COMMIT or ROLLBACK.

In [1]:
import pyodbc
import os
import pandas as pd

# Check that a SQL Server ODBC driver is installed
# [x for x in pyodbc.drivers() if "SQL Server" in x]

# Define the connection string
conn_str = (
    r'DRIVER={ODBC Driver 17 for SQL Server};'
    r'SERVER=localhost;'
    r'DATABASE=BikeStores;'
    r'Trusted_Connection=yes;'
)

# Establish the connection
conn = pyodbc.connect(conn_str)

# Create a cursor
cursor = conn.cursor()

In [2]:
# execute a query
cursor.execute('''
SELECT 
    name
FROM 
    master.sys.databases
ORDER BY 
    name;
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

|   | name |
|---|------|
| 0 | AdventureWorksDW2019 |
| 1 | AdventureWorksLT2022 |
| 2 | BikeStores |
| 3 | hr |
| 4 | master |
| 5 | model |
| 6 | msdb |
| 7 | tempdb |


#### MERGE

Suppose you have two tables, a source table and a target table, and you need to update the target table based on the values matched from the source table. There are three cases:

1. The source table has some rows that do not exist in the target table. In this case, you need to insert rows that are in the source table into the target table.
2. The target table has some rows that do not exist in the source table. In this case, you need to delete rows from the target table.
3. The source table has some rows with the same keys as the rows in the target table. However, these rows have different values in the non-key columns. In this case, you need to update the rows in the target table with the values coming from the source table.
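
The three cases above can be sketched in plain Python before any SQL is involved. The two dictionaries are hypothetical stand-ins for the source and target tables (key mapped to the non-key value), and `classify` is an illustrative helper, not part of any library.

```python
# Hypothetical key -> value mappings standing in for the two tables.
source = {1: "a", 2: "b-new", 4: "d"}
target = {1: "a", 2: "b-old", 3: "c"}


def classify(source, target):
    """Split the keys into the three MERGE cases."""
    # Case 1: keys only in the source must be inserted into the target.
    to_insert = sorted(source.keys() - target.keys())
    # Case 2: keys only in the target must be deleted from the target.
    to_delete = sorted(target.keys() - source.keys())
    # Case 3: shared keys whose values differ must be updated.
    to_update = sorted(
        k for k in source.keys() & target.keys() if source[k] != target[k]
    )
    return to_insert, to_delete, to_update


to_insert, to_delete, to_update = classify(source, target)
print(to_insert, to_delete, to_update)  # [4] [3] [2]
```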

![MERGE](SQL-Server-MERGE.png)

If you use the INSERT, UPDATE, and DELETE statements individually, you have to construct three separate statements to bring the target table in line with the matching rows from the source table.

However, SQL Server provides the MERGE statement, which performs all three actions in a single statement.
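
The general shape of a MERGE statement can be sketched with a small string-building helper. `build_merge_sql` and the names `target_table`, `source_table`, `id`, `name`, and `amount` are all hypothetical and exist only for illustration. The helper only assembles text and does not talk to a database.

```python
def build_merge_sql(target, source, key, columns):
    """Return a T-SQL MERGE statement that syncs target with source.

    Hypothetical helper: identifiers are interpolated as-is, so pass only
    trusted table and column names.
    """
    set_clause = ",\n        ".join(f"t.{c} = s.{c}" for c in columns)
    all_columns = [key, *columns]
    insert_columns = ", ".join(all_columns)
    insert_values = ", ".join(f"s.{c}" for c in all_columns)
    return (
        f"MERGE {target} t\n"
        f"    USING {source} s\n"
        f"ON (s.{key} = t.{key})\n"
        f"WHEN MATCHED\n"
        f"    THEN UPDATE SET\n"
        f"        {set_clause}\n"
        f"WHEN NOT MATCHED BY TARGET\n"
        f"    THEN INSERT ({insert_columns})\n"
        f"         VALUES ({insert_values})\n"
        f"WHEN NOT MATCHED BY SOURCE\n"
        f"    THEN DELETE;"
    )


merge_sql = build_merge_sql("target_table", "source_table", "id", ["name", "amount"])
print(merge_sql)
```

Each WHEN clause corresponds to one of the three cases: matched keys are updated, keys missing from the target are inserted, and keys missing from the source are deleted.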

In [4]:
# execute a query
cursor.execute('''
CREATE TABLE sales.category (
    category_id INT PRIMARY KEY,
    category_name VARCHAR(255) NOT NULL,
    amount DECIMAL(10 , 2 )
);
''')

cursor.execute('''
INSERT INTO sales.category(category_id, category_name, amount)
VALUES(1,'Children Bicycles',15000),
    (2,'Comfort Bicycles',25000),
    (3,'Cruisers Bicycles',13000),
    (4,'Cyclocross Bicycles',10000);
''')


<pyodbc.Cursor at 0x21353ec2130>

In [5]:

cursor.execute('''
CREATE TABLE sales.category_staging (
    category_id INT PRIMARY KEY,
    category_name VARCHAR(255) NOT NULL,
    amount DECIMAL(10 , 2 )
);
''')

cursor.execute('''
    INSERT INTO sales.category_staging(category_id, category_name, amount)
    VALUES(1,'Children Bicycles',15000),
        (3,'Cruisers Bicycles',13000),
        (4,'Cyclocross Bicycles',20000),
        (5,'Electric Bikes',10000),
        (6,'Mountain Bikes',10000);
''')



<pyodbc.Cursor at 0x21353ec2130>

To update the sales.category table (the target) with the values from the sales.category_staging table (the source), use the following MERGE statement:

In [7]:
cursor.execute('''
MERGE sales.category t 
    USING sales.category_staging s
ON (s.category_id = t.category_id)
WHEN MATCHED
    THEN UPDATE SET 
        t.category_name = s.category_name,
        t.amount = s.amount
WHEN NOT MATCHED BY TARGET 
    THEN INSERT (category_id, category_name, amount)
         VALUES (s.category_id, s.category_name, s.amount)
WHEN NOT MATCHED BY SOURCE 
    THEN DELETE;
''')



<pyodbc.Cursor at 0x21353ec2130>

![MERGE](SQL-Server-MERGE-Example.png)

In this example, we used the values in the category_id columns in both tables as the merge condition.

First, the rows with ids 1, 3, and 4 in the sales.category_staging table match rows in the target table, so the MERGE statement updates the category_name and amount columns of those rows in the sales.category table. Only row 4 actually changes: its amount goes from 10000 to 20000.

Second, the rows with id 5 and 6 from the sales.category_staging table do not exist in the sales.category table, so the MERGE statement inserts these rows into the target table.

Third, the row with id 2 in the sales.category table does not exist in the sales.category_staging table, so the MERGE statement deletes this row.

As a result of the merge, the data in the sales.category table is fully synchronized with the data in the sales.category_staging table.
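
To check this outcome without a SQL Server instance, here is a minimal sketch that reproduces the same synchronization with three separate statements. It swaps in Python's built-in `sqlite3` module as a stand-in, an assumption made only so the example is runnable anywhere. SQLite has no MERGE statement, and the table names drop the `sales.` schema prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
for table in ("category", "category_staging"):
    cur.execute(
        f"CREATE TABLE {table} ("
        "category_id INTEGER PRIMARY KEY, "
        "category_name TEXT NOT NULL, "
        "amount NUMERIC)"
    )
cur.executemany("INSERT INTO category VALUES (?, ?, ?)", [
    (1, "Children Bicycles", 15000), (2, "Comfort Bicycles", 25000),
    (3, "Cruisers Bicycles", 13000), (4, "Cyclocross Bicycles", 10000),
])
cur.executemany("INSERT INTO category_staging VALUES (?, ?, ?)", [
    (1, "Children Bicycles", 15000), (3, "Cruisers Bicycles", 13000),
    (4, "Cyclocross Bicycles", 20000), (5, "Electric Bikes", 10000),
    (6, "Mountain Bikes", 10000),
])

# WHEN MATCHED: copy the staging values onto rows with the same key.
cur.execute("""
UPDATE category
SET category_name = (SELECT s.category_name FROM category_staging s
                     WHERE s.category_id = category.category_id),
    amount = (SELECT s.amount FROM category_staging s
              WHERE s.category_id = category.category_id)
WHERE category_id IN (SELECT category_id FROM category_staging)
""")
# WHEN NOT MATCHED BY TARGET: insert rows that exist only in staging.
cur.execute("""
INSERT INTO category (category_id, category_name, amount)
SELECT s.category_id, s.category_name, s.amount
FROM category_staging s
WHERE s.category_id NOT IN (SELECT category_id FROM category)
""")
# WHEN NOT MATCHED BY SOURCE: delete rows that exist only in the target.
cur.execute("""
DELETE FROM category
WHERE category_id NOT IN (SELECT category_id FROM category_staging)
""")
conn.commit()

merged = cur.execute("SELECT * FROM category ORDER BY category_id").fetchall()
staging = cur.execute("SELECT * FROM category_staging ORDER BY category_id").fetchall()
print(merged == staging)  # True
```

Unlike a single MERGE, the three statements here are only atomic because they run inside one transaction that is committed at the end.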

#### Transaction

A transaction is a single unit of work that typically contains multiple T-SQL statements.

If a transaction is successful, the changes are committed to the database. However, if a transaction has an error, the changes have to be rolled back.

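When executing a single statement such as INSERT, UPDATE, or DELETE, SQL Server runs in autocommit mode by default: each statement is its own transaction. Note that pyodbc opens connections with autocommit turned off, so changes made through the cursor in this notebook stay pending until conn.commit() is called.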

To start a transaction explicitly, you use the BEGIN TRANSACTION or BEGIN TRAN statement first:

BEGIN TRANSACTION;

Then, execute one or more statements including INSERT, UPDATE, and DELETE.

Finally, commit the transaction using the COMMIT statement:

COMMIT;

Or roll back the transaction using the ROLLBACK statement:

ROLLBACK;
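
The effect of COMMIT versus ROLLBACK can be seen in a small runnable sketch. It uses Python's built-in `sqlite3` module as a stand-in for SQL Server, an assumption made so the example needs no server, and a hypothetical `accounts` table. Passing `isolation_level=None` stops the module from opening transactions on its own, so the explicit statements below are in full control.

```python
import sqlite3

# isolation_level=None: no implicit transactions from the sqlite3 module.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
cur.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")

# Transaction 1: both updates are kept because the transaction commits.
cur.execute("BEGIN TRANSACTION")
cur.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
cur.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 2")
cur.execute("COMMIT")
after_commit = cur.execute("SELECT balance FROM accounts ORDER BY id").fetchall()

# Transaction 2: the update is discarded because the transaction rolls back.
cur.execute("BEGIN TRANSACTION")
cur.execute("UPDATE accounts SET balance = 0")
cur.execute("ROLLBACK")
after_rollback = cur.execute("SELECT balance FROM accounts ORDER BY id").fetchall()

print(after_commit)    # [(70,), (30,)]
print(after_rollback)  # [(70,), (30,)]
```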

In [8]:
cursor.execute('''
CREATE TABLE invoices (
  id int IDENTITY PRIMARY KEY,
  customer_id int NOT NULL,
  total decimal(10, 2) NOT NULL DEFAULT 0 CHECK (total >= 0)
);
''')


cursor.execute('''
CREATE TABLE invoice_items (
  id int,
  invoice_id int NOT NULL,
  item_name varchar(100) NOT NULL,
  amount decimal(10, 2) NOT NULL CHECK (amount >= 0),
  tax decimal(4, 2) NOT NULL CHECK (tax >= 0),
  PRIMARY KEY (id, invoice_id),
  FOREIGN KEY (invoice_id) REFERENCES invoices (id)
    ON UPDATE CASCADE
    ON DELETE CASCADE
);
''')

<pyodbc.Cursor at 0x21353ec2130>

The invoices table stores the header of the invoice while the invoice_items table stores the line items. The total field in the invoices table is calculated from the line items.

In [9]:
cursor.execute('''
BEGIN TRANSACTION;

INSERT INTO invoices (customer_id, total)
VALUES (100, 0);

INSERT INTO invoice_items (id, invoice_id, item_name, amount, tax)
VALUES (10, 1, 'Keyboard', 70, 0.08),
       (20, 1, 'Mouse', 50, 0.08);

UPDATE invoices
SET total = (SELECT
  SUM(amount * (1 + tax))
FROM invoice_items
WHERE invoice_id = 1)
WHERE id = 1;

COMMIT;
''')

<pyodbc.Cursor at 0x21353ec2130>
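
The batch above always commits. A minimal sketch of the complementary case, where an error rolls the whole invoice back, might look like the following. It uses the built-in `sqlite3` module as a stand-in for SQL Server, so the column types are simplified and the foreign key is omitted. `add_invoice` is a hypothetical helper, and the second invoice is invented purely to trigger the CHECK constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("""
CREATE TABLE invoices (
  id INTEGER PRIMARY KEY,
  customer_id INTEGER NOT NULL,
  total NUMERIC NOT NULL DEFAULT 0 CHECK (total >= 0))
""")
conn.execute("""
CREATE TABLE invoice_items (
  id INTEGER,
  invoice_id INTEGER NOT NULL,
  item_name TEXT NOT NULL,
  amount NUMERIC NOT NULL CHECK (amount >= 0),
  tax NUMERIC NOT NULL CHECK (tax >= 0),
  PRIMARY KEY (id, invoice_id))
""")


def add_invoice(conn, customer_id, items):
    """Insert an invoice with its line items atomically; return True on success."""
    try:
        conn.execute("BEGIN TRANSACTION")
        invoice_id = conn.execute(
            "INSERT INTO invoices (customer_id, total) VALUES (?, 0)",
            (customer_id,),
        ).lastrowid
        conn.executemany(
            "INSERT INTO invoice_items (id, invoice_id, item_name, amount, tax) "
            "VALUES (?, ?, ?, ?, ?)",
            [(item_id, invoice_id, name, amount, tax)
             for item_id, name, amount, tax in items],
        )
        conn.execute(
            "UPDATE invoices SET total = (SELECT SUM(amount * (1 + tax)) "
            "FROM invoice_items WHERE invoice_id = ?) WHERE id = ?",
            (invoice_id, invoice_id),
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # A failed constraint undoes the header row and every line item.
        conn.execute("ROLLBACK")
        return False


ok = add_invoice(conn, 100, [(10, "Keyboard", 70, 0.08), (20, "Mouse", 50, 0.08)])
# Hypothetical bad invoice: the negative amount violates CHECK (amount >= 0).
failed = add_invoice(conn, 200, [(10, "Cable", -5, 0.08)])

invoice_count = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
item_count = conn.execute("SELECT COUNT(*) FROM invoice_items").fetchone()[0]
total = conn.execute("SELECT total FROM invoices WHERE id = 1").fetchone()[0]
print(ok, failed, invoice_count, item_count, round(total, 2))
```

After the failed call, neither the header row for customer 200 nor its line item remains, which is the all-or-nothing behavior a transaction is meant to provide.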

In [12]:
cursor.execute('''
SELECT * FROM invoices
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

|   | id | customer_id | total |
|---|----|-------------|-------|
| 0 | 1 | 100 | 129.6 |
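
The total of 129.6 can be checked by hand with the same formula the UPDATE uses, `SUM(amount * (1 + tax))`. This sketch uses Python's `decimal` module to mirror the DECIMAL columns. The `items` list simply restates the two line items inserted above.

```python
from decimal import Decimal

# (item_name, amount, tax) for the two line items of invoice 1.
items = [
    ("Keyboard", Decimal("70"), Decimal("0.08")),
    ("Mouse", Decimal("50"), Decimal("0.08")),
]

# Keyboard: 70 * 1.08 = 75.60, Mouse: 50 * 1.08 = 54.00
total = sum(amount * (1 + tax) for _, amount, tax in items)
print(total)  # 129.60
```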


In [13]:
cursor.execute('''
SELECT * FROM invoice_items
''')

# Fetch all rows from the executed query
rows = cursor.fetchall()

# Get the column names
columns = [column[0] for column in cursor.description]

# Convert the rows into a list of dictionaries
data = [dict(zip(columns, row)) for row in rows]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)
df.head(10)

|   | id | invoice_id | item_name | amount | tax |
|---|----|------------|-----------|--------|-----|
| 0 | 10 | 1 | Keyboard | 70.0 | 0.08 |
| 1 | 20 | 1 | Mouse | 50.0 | 0.08 |


In [14]:
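# pyodbc opens connections with autocommit off, and close() rolls back
# anything uncommitted; call conn.commit() first to keep the changes above.
conn.close()
conn.closed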

True