# SQL with Python Reference Guide 8
# Data Modification Statements
## (Justin M. Olds)
Based on Stanford SQL course: https://lagunita.stanford.edu/courses/DB/SQL/SelfPaced/info

---
**Data Modification Statements Overview** 

* **INSERT** -- Two methods: **(1)** specify the values of the tuple to insert, or **(2)** insert the result of a separate SELECT statement whose columns match the table's attributes.
* **DELETE** -- Remove the tuples that satisfy a WHERE condition. The condition can include subqueries, so it can become quite elaborate.
* **UPDATE** -- Like DELETE, it selects tuples with a WHERE clause, but instead of removing them it assigns new expressions to their attributes. (Multiple attributes can be updated in one statement.)
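As a quick orientation, the three statements can be sketched on a throwaway table. This is a minimal illustration, not part of the course material: the in-memory database and the `Demo` table with its `id` and `label` columns are hypothetical names chosen for this example only.

```python
import sqlite3

# Throwaway in-memory database; the Demo table is hypothetical.
demo = sqlite3.connect(":memory:")
cur = demo.cursor()
cur.execute("CREATE TABLE Demo(id INT, label TEXT)")

# INSERT, method 1: literal values for one tuple.
cur.execute("INSERT INTO Demo VALUES (1, 'a')")
# INSERT, method 2: rows produced by a SELECT with matching columns.
cur.execute("INSERT INTO Demo SELECT id + 1, 'b' FROM Demo")

# UPDATE: assign new expressions to attributes of rows matching WHERE.
cur.execute("UPDATE Demo SET label = 'z', id = id * 10 WHERE label = 'b'")

# DELETE: remove rows matching WHERE.
cur.execute("DELETE FROM Demo WHERE id = 1")

rows = cur.execute("SELECT id, label FROM Demo").fetchall()
print(rows)
demo.close()
```

Only the row created by the second INSERT survives, with both of its attributes changed by the single UPDATE.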

In [2]:
import sqlite3
import pandas as pd

conn = sqlite3.connect("class.db")
c = conn.cursor()

---
### Tables and Insert code below (same as before--college admissions data)

In [58]:
c.execute('DROP TABLE IF EXISTS College')
c.execute('DROP TABLE IF EXISTS Student') 
c.execute('DROP TABLE IF EXISTS Apply') 

c.execute('CREATE TABLE College(cName TEXT, state TEXT, enrollment INT)')
c.execute('CREATE TABLE Student(sID INT, sName TEXT, GPA REAL, sizeHS INT)')
c.execute('CREATE TABLE Apply(sID INT, cName TEXT, major TEXT, decision TEXT)')
conn.commit()

In [59]:
c.execute('DELETE FROM Student')
c.execute('DELETE FROM College')
c.execute('DELETE FROM Apply')

c.execute("INSERT INTO Student VALUES (123, 'Amy', 3.9, 1000)")
c.execute("INSERT INTO Student values (234, 'Bob', 3.6, 1500)")
c.execute("INSERT INTO Student values (345, 'Craig', 3.5, 500)")
c.execute("INSERT INTO Student values (456, 'Doris', 3.9, 1000)")
c.execute("INSERT INTO Student values (567, 'Edward', 2.9, 2000)")
c.execute("INSERT INTO Student values (678, 'Fay', 3.8, 200)")
c.execute("INSERT INTO Student values (789, 'Gary', 3.4, 800)")
c.execute("INSERT INTO Student values (987, 'Helen', 3.7, 800)")
c.execute("INSERT INTO Student values (876, 'Irene', 3.9, 400)")
c.execute("INSERT INTO Student values (765, 'Jay', 2.9, 1500)")
c.execute("INSERT INTO Student values (654, 'Amy', 3.9, 1000)")
c.execute("INSERT INTO Student values (543, 'Craig', 3.4, 2000)")

c.execute("INSERT INTO College values ('Stanford', 'CA', 15000)")
c.execute("INSERT INTO College values ('Berkeley', 'CA', 36000)")
c.execute("INSERT INTO College values ('MIT', 'MA', 10000)")
c.execute("INSERT INTO College values ('Cornell', 'NY', 21000)")

c.execute("INSERT INTO Apply values (123, 'Stanford', 'CS', 'Y')")
c.execute("INSERT INTO Apply values (123, 'Stanford', 'EE', 'N')")
c.execute("INSERT INTO Apply values (123, 'Berkeley', 'CS', 'Y')")
c.execute("INSERT INTO Apply values (123, 'Cornell', 'EE', 'Y')")
c.execute("INSERT INTO Apply values (234, 'Berkeley', 'biology', 'N')")
c.execute("INSERT INTO Apply values (345, 'MIT', 'bioengineering', 'Y')")
c.execute("INSERT INTO Apply values (345, 'Cornell', 'bioengineering', 'N')")
c.execute("INSERT INTO Apply values (345, 'Cornell', 'CS', 'Y')")
c.execute("INSERT INTO Apply values (345, 'Cornell', 'EE', 'N')")
c.execute("INSERT INTO Apply values (678, 'Stanford', 'history', 'Y')")
c.execute("INSERT INTO Apply values (987, 'Stanford', 'CS', 'Y')")
c.execute("INSERT INTO Apply values (987, 'Berkeley', 'CS', 'Y')")
c.execute("INSERT INTO Apply values (876, 'Stanford', 'CS', 'N')")
c.execute("INSERT INTO Apply values (876, 'MIT', 'biology', 'Y')")
c.execute("INSERT INTO Apply values (876, 'MIT', 'marine biology', 'N')")
c.execute("INSERT INTO Apply values (765, 'Stanford', 'history', 'Y')")
c.execute("INSERT INTO Apply values (765, 'Cornell', 'history', 'N')")
c.execute("INSERT INTO Apply values (765, 'Cornell', 'psychology', 'Y')")
c.execute("INSERT INTO Apply values (543, 'MIT', 'CS', 'N')")
conn.commit()


---
### INSERT command example (Method 1: specify attribute values for a tuple)

Add a new college to the College table.


In [60]:

c.execute("INSERT INTO College VALUES ('Carnegie Mellon', 'PA', 11500)")


<sqlite3.Cursor at 0x22b950fe500>
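When the values come from Python variables rather than literals, `sqlite3` lets you bind them with `?` placeholders instead of building the SQL string by hand. The sketch below assumes a separate in-memory database (not `class.db`) and reuses a few college rows from this guide; the variable names are illustrative.

```python
import sqlite3

mem = sqlite3.connect(":memory:")  # throwaway database, not class.db
cur = mem.cursor()
cur.execute("CREATE TABLE College(cName TEXT, state TEXT, enrollment INT)")

# One tuple: values are bound to the ? placeholders, not pasted into the SQL.
new_college = ("Carnegie Mellon", "PA", 11500)
cur.execute("INSERT INTO College VALUES (?, ?, ?)", new_college)

# Several tuples at once with executemany.
more_colleges = [("MIT", "MA", 10000), ("Cornell", "NY", 21000)]
cur.executemany("INSERT INTO College VALUES (?, ?, ?)", more_colleges)
mem.commit()

college_rows = cur.execute("SELECT * FROM College").fetchall()
print(college_rows)
mem.close()
```

Placeholders also avoid quoting problems when a value itself contains a quote character.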

### INSERT command example (Method 2: obtain some or all attribute values from a SELECT statement)

Have all students who didn't apply anywhere apply to CS at Carnegie Mellon.

In [61]:
c.execute("""
    INSERT INTO Apply
        SELECT sID, 'Carnegie Mellon', 'CS', NULL
        FROM Student
        WHERE sID NOT IN (SELECT sID FROM Apply)
""")


<sqlite3.Cursor at 0x22b950fe500>

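The inserted students are appended to the end of the Apply table. (The output below shows only the first ten rows, so they are not visible here.)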

In [9]:
df = pd.read_sql_query("""
    SELECT *
    FROM Apply 
    """, conn);df

Unnamed: 0,sID,cName,major,decision
0,123,Stanford,CS,Y
1,123,Stanford,EE,N
2,123,Berkeley,CS,Y
3,123,Cornell,EE,Y
4,234,Berkeley,biology,N
5,345,MIT,bioengineering,Y
6,345,Cornell,bioengineering,N
7,345,Cornell,CS,Y
8,345,Cornell,EE,N
9,678,Stanford,history,Y
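Because the full Apply table is long, it can be easier to confirm an INSERT by checking the cursor's `rowcount` and then querying only the new rows. The sketch below is one way to do that, assuming an in-memory database and a three-student subset of the data above.

```python
import sqlite3

mem = sqlite3.connect(":memory:")
cur = mem.cursor()
cur.execute("CREATE TABLE Student(sID INT, sName TEXT, GPA REAL, sizeHS INT)")
cur.execute("CREATE TABLE Apply(sID INT, cName TEXT, major TEXT, decision TEXT)")
# A three-student subset of the guide's data, enough to show the idea.
cur.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)",
                [(123, "Amy", 3.9, 1000), (456, "Doris", 3.9, 1000),
                 (567, "Edward", 2.9, 2000)])
cur.execute("INSERT INTO Apply VALUES (123, 'Stanford', 'CS', 'Y')")

cur.execute("""
    INSERT INTO Apply
        SELECT sID, 'Carnegie Mellon', 'CS', NULL
        FROM Student
        WHERE sID NOT IN (SELECT sID FROM Apply)
""")
inserted = cur.rowcount  # number of rows the INSERT added

# Look only at the new rows instead of scanning the whole table.
new_rows = cur.execute("""
    SELECT sID, major, decision
    FROM Apply
    WHERE cName = 'Carnegie Mellon'
    ORDER BY sID
""").fetchall()
print(inserted, new_rows)
mem.close()
```

The NULL decision comes back to Python as `None`.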


### Another example: 

Admit to Carnegie Mellon EE all students who were turned down for EE elsewhere.

First, check SELECT statement: 

In [10]:
df = pd.read_sql_query("""
    SELECT *
    FROM Student
    WHERE sID IN 
        (SELECT sID FROM Apply WHERE major = 'EE' AND decision='N')
    """, conn);df

Unnamed: 0,sID,sName,GPA,sizeHS
0,123,Amy,3.9,1000
1,345,Craig,3.5,500


In [62]:
c.execute("""
    INSERT INTO Apply
        SELECT sID, 'Carnegie Mellon', 'EE', 'Y'
        FROM Student
        WHERE sID IN 
            (SELECT sID FROM Apply WHERE major = 'EE' AND decision = 'N')
""")

<sqlite3.Cursor at 0x22b950fe500>

Check that the two students were inserted; they are appended to the end of the table. (The output below shows only the first ten rows.)

In [57]:
df = pd.read_sql_query("""
    SELECT *
    FROM Apply 
    """, conn);df

Unnamed: 0,sID,cName,major,decision
0,123,Stanford,CS,Y
1,123,Stanford,EE,N
2,123,Berkeley,CS,Y
3,123,Cornell,EE,Y
4,234,Berkeley,biology,N
5,345,MIT,bioengineering,Y
6,345,Cornell,bioengineering,N
7,345,Cornell,EE,N
8,678,Stanford,history,Y
9,987,Stanford,CS,Y


---
### DELETE
Delete all students who applied to more than two different majors. First, check which students the condition selects:


In [13]:
df = pd.read_sql_query("""
    SELECT sID, COUNT(DISTINCT major)
    FROM Apply
    GROUP BY sID
    HAVING COUNT(DISTINCT major) > 2
""", conn);df

Unnamed: 0,sID,COUNT(DISTINCT major)
0,345,3
1,876,3


In [63]:
c.execute("""
    DELETE FROM Student
    WHERE sID IN
        (SELECT sID
        FROM Apply
        GROUP BY sID
        HAVING COUNT(DISTINCT major) >2)
""")

<sqlite3.Cursor at 0x22b950fe500>
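Note that this DELETE touches only the Student table. With the schema used here (no foreign keys), the deleted students' rows in Apply stay behind. The sketch below illustrates this on a reduced two-student data set in an in-memory database; the final cleanup step is an assumption about what one might want, not a step from the course.

```python
import sqlite3

mem = sqlite3.connect(":memory:")
cur = mem.cursor()
cur.execute("CREATE TABLE Student(sID INT, sName TEXT)")
cur.execute("CREATE TABLE Apply(sID INT, cName TEXT, major TEXT)")
cur.executemany("INSERT INTO Student VALUES (?, ?)",
                [(123, "Amy"), (345, "Craig")])
cur.executemany("INSERT INTO Apply VALUES (?, ?, ?)",
                [(123, "Stanford", "CS"), (123, "Stanford", "EE"),
                 (345, "MIT", "bioengineering"), (345, "Cornell", "CS"),
                 (345, "Cornell", "EE")])

cur.execute("""
    DELETE FROM Student
    WHERE sID IN
        (SELECT sID FROM Apply
         GROUP BY sID
         HAVING COUNT(DISTINCT major) > 2)
""")
deleted = cur.rowcount  # students removed

# The deleted student's applications are untouched by the DELETE above.
orphans = cur.execute("""
    SELECT COUNT(*) FROM Apply
    WHERE sID NOT IN (SELECT sID FROM Student)
""").fetchone()[0]

# Optional cleanup (an assumption, not a step from the guide).
cur.execute("DELETE FROM Apply WHERE sID NOT IN (SELECT sID FROM Student)")
remaining = cur.execute("SELECT COUNT(*) FROM Apply").fetchone()[0]
print(deleted, orphans, remaining)
mem.close()
```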

### Another DELETE example

Delete colleges with no CS applicants. At this point every college has at least one CS applicant, so we first delete Cornell's CS applications from the Apply table; Cornell then qualifies for removal from the College table.

In [65]:
c.execute("""
    DELETE FROM Apply
    WHERE cName = 'Cornell' AND major = 'CS'
""")

<sqlite3.Cursor at 0x22b950fe500>

In [66]:
df = pd.read_sql_query("""
    SELECT * 
    FROM College
    WHERE cName NOT IN 
        (SELECT cName
        FROM Apply
        WHERE major = 'CS')
""", conn);df

Unnamed: 0,cName,state,enrollment
0,Cornell,NY,21000


In [67]:
c.execute("""
    DELETE FROM College
    WHERE cName NOT IN
        (SELECT cName
        FROM Apply
        WHERE major = 'CS')
""")

<sqlite3.Cursor at 0x22b950fe500>

Cornell has been removed from the College table (see below):

In [68]:
df = pd.read_sql_query("""
    SELECT * 
    FROM College
    """, conn);df

Unnamed: 0,cName,state,enrollment
0,Stanford,CA,15000
1,Berkeley,CA,36000
2,MIT,MA,10000
3,Carnegie Mellon,PA,11500
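One caveat with the `NOT IN` pattern used above: if the subquery returns a NULL, `NOT IN` never evaluates to true, so no rows match. The Apply data in this guide has no NULL `cName`, so the queries above are unaffected. The sketch below adds a hypothetical application with a missing college name to show the difference from an equivalent `NOT EXISTS` query.

```python
import sqlite3

mem = sqlite3.connect(":memory:")
cur = mem.cursor()
cur.execute("CREATE TABLE College(cName TEXT)")
cur.execute("CREATE TABLE Apply(sID INT, cName TEXT, major TEXT)")
cur.executemany("INSERT INTO College VALUES (?)", [("Stanford",), ("Cornell",)])
cur.executemany("INSERT INTO Apply VALUES (?, ?, ?)",
                [(123, "Stanford", "CS"),
                 (999, None, "CS")])  # hypothetical row with a missing cName

# NOT IN: the NULL in the subquery result makes the test unknown for Cornell.
not_in = cur.execute("""
    SELECT cName FROM College
    WHERE cName NOT IN (SELECT cName FROM Apply WHERE major = 'CS')
""").fetchall()

# NOT EXISTS: a correlated subquery that is not affected by the NULL.
not_exists = cur.execute("""
    SELECT cName FROM College
    WHERE NOT EXISTS (SELECT 1 FROM Apply
                      WHERE Apply.cName = College.cName AND major = 'CS')
""").fetchall()
print(not_in, not_exists)
mem.close()
```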


### UPDATE commands
Accept all Carnegie Mellon applicants with GPA < 3.6 (decision = 'Y'), but change their major to economics. First, check which rows in the Apply table are affected:

In [70]:
df = pd.read_sql_query("""
    SELECT *
    FROM Apply
    WHERE cName = 'Carnegie Mellon'
        AND sID IN (SELECT sID FROM Student WHERE GPA < 3.6)
""", conn); df

Unnamed: 0,sID,cName,major,decision
0,567,Carnegie Mellon,CS,
1,789,Carnegie Mellon,CS,


In [71]:
c.execute("""
    UPDATE Apply
    SET decision = 'Y', major = 'economics'
    WHERE cName = 'Carnegie Mellon'
        AND sID IN (SELECT sID FROM Student WHERE GPA < 3.6)
""")

<sqlite3.Cursor at 0x22b950fe500>

In [72]:
df = pd.read_sql_query("""
    SELECT *
    FROM Apply
    WHERE cName = 'Carnegie Mellon'
        AND sID IN (SELECT sID FROM Student WHERE GPA < 3.6)
""", conn); df

Unnamed: 0,sID,cName,major,decision
0,567,Carnegie Mellon,economics,Y
1,789,Carnegie Mellon,economics,Y
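Modifications like this are only permanent once they are committed. One option, sketched below under the assumption of a fresh in-memory database with a single Apply row, is to use the connection as a context manager: `sqlite3` commits when the `with` block finishes and rolls back if it raises. The `RuntimeError` is a simulated failure for illustration.

```python
import sqlite3

mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE Apply(sID INT, cName TEXT, major TEXT, decision TEXT)")
mem.execute("INSERT INTO Apply VALUES (567, 'Carnegie Mellon', 'CS', NULL)")
mem.commit()

try:
    with mem:  # commits on success, rolls back if the block raises
        mem.execute("""
            UPDATE Apply
            SET decision = 'Y', major = 'economics'
            WHERE cName = 'Carnegie Mellon'
        """)
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass

# The UPDATE was rolled back, so the row is unchanged.
after_rollback = mem.execute("SELECT major, decision FROM Apply").fetchall()

with mem:  # this time the block completes, so the change is committed
    mem.execute("UPDATE Apply SET decision = 'Y', major = 'economics' "
                "WHERE cName = 'Carnegie Mellon'")
after_commit = mem.execute("SELECT major, decision FROM Apply").fetchall()
print(after_rollback, after_commit)
mem.close()
```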


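### Another UPDATE example
The following example shows how SELECT statements can be embedded in the SET expressions. Because there is no WHERE clause, every student receives the highest GPA and the smallest high-school size in the table.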

In [74]:
c.execute("""
    UPDATE Student
    SET GPA = (SELECT MAX(GPA) FROM Student),
        sizeHS = (SELECT MIN(sizeHS) FROM Student)
""")

<sqlite3.Cursor at 0x22b950fe500>

In [75]:
df = pd.read_sql_query("""
    SELECT *
    FROM Student
""", conn); df

Unnamed: 0,sID,sName,GPA,sizeHS
0,123,Amy,3.9,200
1,234,Bob,3.9,200
2,345,Craig,3.9,200
3,456,Doris,3.9,200
4,567,Edward,3.9,200
5,678,Fay,3.9,200
6,789,Gary,3.9,200
7,987,Helen,3.9,200
8,876,Irene,3.9,200
9,765,Jay,3.9,200
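The same SET expressions can be combined with a WHERE clause so that only some rows change. The sketch below is an illustration on three students from this guide in an in-memory database; the `GPA < 3.7` cutoff is a hypothetical condition, not one from the course.

```python
import sqlite3

mem = sqlite3.connect(":memory:")
cur = mem.cursor()
cur.execute("CREATE TABLE Student(sID INT, sName TEXT, GPA REAL, sizeHS INT)")
cur.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)",
                [(123, "Amy", 3.9, 1000), (234, "Bob", 3.6, 1500),
                 (678, "Fay", 3.8, 200)])

# Same SET expressions as above, but limited to students below 3.7
# (hypothetical cutoff chosen for this example).
cur.execute("""
    UPDATE Student
    SET GPA = (SELECT MAX(GPA) FROM Student),
        sizeHS = (SELECT MIN(sizeHS) FROM Student)
    WHERE GPA < 3.7
""")
updated = cur.execute(
    "SELECT sID, GPA, sizeHS FROM Student ORDER BY sID").fetchall()
print(updated)
mem.close()
```

Only Bob's row matches the condition, so Amy and Fay keep their original values.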
