# Data Manipulation Language (DML)

**Dr. Pengfei Zhao**

Finance Mathematics Program, 

BNU-HKBU United International College

 <img src="../Figures/DB/dml.png" width = "580" height = "350" alt="DML figure" align=center />

## 1. INSERT INTO

* The `INSERT INTO` statement is used to insert new records into a table.

**Syntax**

> INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);

**Example**

In [1]:
import pandas as pd
import numpy as np

In [2]:
df_company_info = pd.read_csv('../data/stocks_info.csv', sep=',').set_index('code').drop('index', axis=1)
mask = (df_company_info.index>0) & (df_company_info.index<100)
df_company_info = df_company_info[mask]
df_company_info.head()

code,name,industry,area,pe,outstanding,totals,totalAssets,liquidAssets,fixedAssets,reserved,...,bvps,pb,timeToMarket,undp,perundp,rev,profit,gpr,npr,holders
4,国农科技,生物制药,深圳,0.0,0.83,0.84,24121.07,15411.35,3121.76,66.48,...,1.37,18.87,19910114,1968.0,0.23,-67.37,-136.97,50.38,-7.65,12388.0
50,深天马Ａ,元器件,深圳,34.87,14.01,14.01,2723411.25,953202.69,652960.94,1133764.63,...,10.3,2.43,19950315,160825.38,1.15,30.28,99.48,21.23,7.46,74135.0
39,中集集团,轻工机械,深圳,35.71,12.64,29.81,13555758.0,6347079.5,2144378.25,345642.41,...,10.27,2.04,19940408,1856273.75,6.23,54.26,790.49,18.38,2.43,73120.0
32,深桑达Ａ,元器件,深圳,83.63,2.79,4.22,203884.5,148940.69,10375.28,26385.09,...,3.49,3.41,19931028,53116.03,1.26,-24.19,-21.12,18.36,3.97,26883.0
70,特发信息,通信设备,深圳,31.28,5.42,6.27,579230.81,431329.66,56197.21,73884.7,...,3.01,3.8,20000511,47786.65,0.76,15.22,38.05,16.86,4.58,39923.0


In [3]:
df_sub = df_company_info[['name', 'industry', 'area', 'pe', 'totalAssets', 'esp']]

In [4]:
df_sub.head()

code,name,industry,area,pe,totalAssets,esp
4,国农科技,生物制药,深圳,0.0,24121.07,-0.065
50,深天马Ａ,元器件,深圳,34.87,2723411.25,0.538
39,中集集团,轻工机械,深圳,35.71,13555758.0,0.42
32,深桑达Ａ,元器件,深圳,83.63,203884.5,0.107
70,特发信息,通信设备,深圳,31.28,579230.81,0.274


In [6]:
%load_ext sql
try:
    import pymysql
    pymysql.install_as_MySQLdb()
except ImportError:
    pass

In [7]:
%sql mysql://few:123456@localhost/HelloDB

'Connected: few@HelloDB'

In [8]:
%sql INSERT INTO company_info (code, name, industry, area, pe, totalAssets, eps) \
            VALUES ('000039', '中集集团', '轻工机械', '深圳', 35.71, 13555758.00, 0.420);

1 rows affected.


[]

In [9]:
%sql select * from company_info

1 rows affected.


code,name,industry,area,pe,totalAssets,eps
39,中集集团,轻工机械,深圳,35.71,13555758.0,0.42


In [10]:
df_daily_bar = pd.read_csv('../data/hundred_stocks_twoyears_daily_bar.csv', sep=',')
df_39 = df_daily_bar[df_daily_bar['code']==39]
df_39.head()

Unnamed: 0,code,date,open,high,close,low,volume
15782,39,2015-08-03,22.5,23.7,23.67,22.31,405551.91
15783,39,2015-08-04,23.73,25.15,25.07,23.26,386567.91
15784,39,2015-08-05,25.07,25.3,24.16,24.01,291495.03
15785,39,2015-08-06,23.8,24.67,23.85,23.5,191049.45
15786,39,2015-08-07,24.18,24.9,24.6,24.18,221424.77


In [11]:
%sql INSERT INTO daily_bar (code, date, open, high, close, low, volume) \
            VALUES ('000039', '2015-08-03', 22.50, 23.70, 23.67, 22.31, 405551.91);
%sql INSERT INTO daily_bar (code, date, open, high, close, low, volume) \
            VALUES ('000039', '2015-08-04', 23.73, 25.15,25.07, 23.26, 386567.91);
%sql INSERT INTO daily_bar (code, date, open, high, close, low, volume) \
            VALUES ('000039', '2015-08-05', 25.07, 25.30, 24.16, 24.01, 291495.03);

1 rows affected.
1 rows affected.
1 rows affected.


[]
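The three near-identical `INSERT INTO` statements above can also be issued from plain Python as one parameterized statement. The sketch below is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module with an in-memory database as a stand-in for the MySQL server, and the `daily_bar` column types are guessed, since the notebook does not show the table definition. With `pymysql` the placeholder would be `%s` rather than `?`.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")

# Assumption: column types are illustrative; the real table definition is not shown here.
conn.execute(
    "CREATE TABLE daily_bar ("
    "code TEXT, date TEXT, open REAL, high REAL, close REAL, low REAL, volume REAL)"
)

# The three rows inserted above, as (code, date, open, high, close, low, volume).
rows = [
    ("000039", "2015-08-03", 22.50, 23.70, 23.67, 22.31, 405551.91),
    ("000039", "2015-08-04", 23.73, 25.15, 25.07, 23.26, 386567.91),
    ("000039", "2015-08-05", 25.07, 25.30, 24.16, 24.01, 291495.03),
]

# One parameterized statement, executed once per row; the driver handles quoting.
insert_sql = (
    "INSERT INTO daily_bar (code, date, open, high, close, low, volume) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)"
)
with conn:  # commits on success, rolls back if any insert raises
    conn.executemany(insert_sql, rows)

inserted = conn.execute("SELECT COUNT(*) FROM daily_bar").fetchone()[0]
print(inserted)
```

Passing values as parameters instead of pasting them into the SQL string avoids quoting mistakes and keeps the statement text identical for every row.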

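In [3]:
# Reset cell, apparently run out of order (note the execution count) to clear rows left by an earlier run
%sql delete from daily_bar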

2 rows affected.


[]

In [5]:
%sql select * from daily_bar;

3 rows affected.


code,date,open,high,close,low,volume
39,2015-08-03 00:00:00,22.5,23.7,23.67,22.31,405551.91
39,2015-08-04 00:00:00,23.73,25.15,25.07,23.26,386567.91
39,2015-08-05 00:00:00,25.07,25.3,24.16,24.01,291495.03


**Test about FOREIGN KEY**

* The "daily_bar" table has a foreign key referencing "company_info". From what we learned about `integrity constraint`, we know that the following two operations will be aborted: (1) adding a record to "daily_bar" whose "code" is not in the parent table "company_info"; (2) deleting a record from "company_info" whose "code" exists in "daily_bar".

**Adding a record to the "daily_bar" table whose "code" is not in the parent table "company_info"**

In [12]:
%sql INSERT INTO daily_bar (code, date, open, high, close, low, volume) \
            VALUES ('000045', '2015-08-03', 22.50, 23.70, 23.67, 22.31, 405551.91);

IntegrityError: (pymysql.err.IntegrityError) (1452, 'Cannot add or update a child row: a foreign key constraint fails (`HelloDB`.`daily_bar`, CONSTRAINT `FK_bar` FOREIGN KEY (`code`) REFERENCES `company_info` (`code`))') [SQL: "INSERT INTO daily_bar (code, date, open, high, close, low, volume)             VALUES ('000045', '2015-08-03', 22.50, 23.70, 23.67, 22.31, 405551.91);"] (Background on this error at: http://sqlalche.me/e/gkpj)

The key part of the error message above is "Cannot add or update a child row: a foreign key constraint fails": the insert is rejected because code '000045' has no matching row in "company_info".
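The same child-row check can be reproduced without a MySQL server. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in and simplified two- and three-column tables; `try_insert_bar` is a hypothetical helper, not part of the notebook. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the tables are simplified versions of the real ones.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE company_info (code TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE daily_bar ("
    "code TEXT, date TEXT, close REAL, "
    "FOREIGN KEY (code) REFERENCES company_info (code))"
)
conn.execute("INSERT INTO company_info VALUES ('000039', '中集集团')")


def try_insert_bar(code):
    """Hypothetical helper: True if the child row is accepted, False if the foreign key rejects it."""
    try:
        conn.execute("INSERT INTO daily_bar VALUES (?, '2015-08-03', 23.67)", (code,))
        return True
    except sqlite3.IntegrityError as exc:
        print("rejected:", exc)
        return False


ok_known = try_insert_bar("000039")    # parent row exists, so the insert is accepted
ok_unknown = try_insert_bar("000045")  # no parent row, so the insert is rejected
```

Catching the driver's `IntegrityError` lets a loading script skip or log bad rows instead of stopping at the first violation.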

## 2. DELETE

* The DELETE statement is used to delete existing records from a table.

**Syntax**

>```SQL
DELETE FROM table_name WHERE condition;
```

**Note:** Be careful when deleting records from a table! Notice the WHERE clause in the DELETE statement: it specifies which record(s) should be deleted. If you omit the WHERE clause, **all records** in the table will be deleted!

**Example**

In [13]:
%sql select * from daily_bar

3 rows affected.


code,date,open,high,close,low,volume
39,2015-08-03 00:00:00,22.5,23.7,23.67,22.31,405551.91
39,2015-08-04 00:00:00,23.73,25.15,25.07,23.26,386567.91
39,2015-08-05 00:00:00,25.07,25.3,24.16,24.01,291495.03


In [14]:
%sql delete from daily_bar where date='2015-08-05'
%sql select * from daily_bar

1 rows affected.
2 rows affected.


code,date,open,high,close,low,volume
39,2015-08-03 00:00:00,22.5,23.7,23.67,22.31,405551.91
39,2015-08-04 00:00:00,23.73,25.15,25.07,23.26,386567.91
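The difference between a `DELETE` with and without a `WHERE` clause can be checked through the number of affected rows. This is a small sketch under assumptions: an in-memory SQLite database replaces the MySQL server, and the table is reduced to three illustrative columns.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only three of the daily_bar columns are kept.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_bar (code TEXT, date TEXT, close REAL)")
conn.executemany(
    "INSERT INTO daily_bar VALUES (?, ?, ?)",
    [
        ("000039", "2015-08-03", 23.67),
        ("000039", "2015-08-04", 25.07),
        ("000039", "2015-08-05", 24.16),
    ],
)

# With WHERE: only the matching row is removed.
deleted_one = conn.execute(
    "DELETE FROM daily_bar WHERE date = ?", ("2015-08-05",)
).rowcount

# Without WHERE: every remaining row is removed.
deleted_rest = conn.execute("DELETE FROM daily_bar").rowcount

remaining = conn.execute("SELECT COUNT(*) FROM daily_bar").fetchone()[0]
print(deleted_one, deleted_rest, remaining)
```

Checking `rowcount` right after a `DELETE` is a cheap way to notice a missing or overly broad condition before committing.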


**Example: Test about FOREIGN KEY**

**Deleting a record from "company_info" whose "code" exists in the "daily_bar" table**

In [15]:
%sql delete from company_info where code='000039'

IntegrityError: (pymysql.err.IntegrityError) (1451, 'Cannot delete or update a parent row: a foreign key constraint fails (`HelloDB`.`daily_bar`, CONSTRAINT `FK_bar` FOREIGN KEY (`code`) REFERENCES `company_info` (`code`))') [SQL: "delete from company_info where code='000039'"] (Background on this error at: http://sqlalche.me/e/gkpj)

The key part of the error message above is "Cannot delete or update a parent row: a foreign key constraint fails": the delete is rejected because rows in "daily_bar" still reference code '000039'.
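One way around this error is to remove the child rows first and the parent row afterwards. The sketch below illustrates that order under assumptions: SQLite in memory replaces MySQL, the tables are simplified, and `delete_company` is a hypothetical helper introduced only for this example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the tables are simplified versions of the real ones.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE company_info (code TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE daily_bar (code TEXT, date TEXT, close REAL, "
    "FOREIGN KEY (code) REFERENCES company_info (code))"
)
conn.execute("INSERT INTO company_info VALUES ('000039', '中集集团')")
conn.execute("INSERT INTO daily_bar VALUES ('000039', '2015-08-03', 23.67)")


def delete_company(code):
    """Hypothetical helper: True if the parent row was deleted, False if a foreign key blocked it."""
    try:
        conn.execute("DELETE FROM company_info WHERE code = ?", (code,))
        return True
    except sqlite3.IntegrityError:
        return False


blocked = not delete_company("000039")  # a child row still references the parent

# Remove the child rows first, then retry the parent delete.
conn.execute("DELETE FROM daily_bar WHERE code = ?", ("000039",))
deleted_after_cleanup = delete_company("000039")
```

Whether child rows should really be deleted is a data-retention decision; the sketch only shows the order in which the two deletes have to run.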

## 3. UPDATE 

* The UPDATE statement is used to modify existing records in a table.

**Syntax**

>```SQL
UPDATE table_name SET column1 = value1, column2 = value2, ... WHERE condition;
```

**Note:** Be careful when updating records in a table! Notice the WHERE clause in the UPDATE statement: it specifies which record(s) should be updated. If you omit the WHERE clause, **all records** in the table will be updated!

**Example**

In [16]:
%sql select * from company_info

1 rows affected.


code,name,industry,area,pe,totalAssets,eps
39,中集集团,轻工机械,深圳,35.71,13555758.0,0.42


In [17]:
%sql UPDATE company_info SET pe=36.71 WHERE code='000039'

1 rows affected.


[]

In [18]:
%sql select * from company_info 

1 rows affected.


code,name,industry,area,pe,totalAssets,eps
39,中集集团,轻工机械,深圳,36.71,13555758.0,0.42
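The effect of omitting the `WHERE` clause in an `UPDATE` can be seen by comparing affected-row counts. This is a minimal sketch, assuming an in-memory SQLite database in place of the MySQL server and a reduced `company_info` table with two illustrative rows taken from the data shown earlier.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only three company_info columns are kept.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (code TEXT PRIMARY KEY, name TEXT, pe REAL)")
conn.executemany(
    "INSERT INTO company_info VALUES (?, ?, ?)",
    [("000039", "中集集团", 35.71), ("000050", "深天马Ａ", 34.87)],
)

# With WHERE: only the row for code '000039' changes.
updated_one = conn.execute(
    "UPDATE company_info SET pe = ? WHERE code = ?", (36.71, "000039")
).rowcount

# Without WHERE: every row in the table is overwritten.
updated_all = conn.execute("UPDATE company_info SET pe = 0").rowcount

pe_values = [row[0] for row in conn.execute("SELECT pe FROM company_info ORDER BY code")]
print(updated_one, updated_all, pe_values)
```

After the second statement both `pe` values are zero, which is why an `UPDATE` without a condition is rarely what is intended.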
