# SQL Selection I

**Dr. Pengfei Zhao**

Finance Mathematics Program, BNU-HKBU United International College

### 1. Basic `SELECT`

* The SELECT statement is used to select data from a database. The data returned is stored in a result table, called the result-set.

**Syntax**

> ```SQL
SELECT column1, column2, ... FROM table_name;
```

* Here, column1, column2, ... are the field names of the table you want to select data from. If you want to select all the fields available in the table, use the following syntax:

> ``` SQL
SELECT * FROM table_name;
```
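
The two forms above can be tried without a MySQL server. The sketch below is only an illustration: it assumes Python's built-in `sqlite3` module with an in-memory database as a stand-in for `HelloDB`, and it loads three `sh50` rows copied from the outputs shown later in these notes.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sh50 (code INTEGER, name TEXT, weight REAL)")
conn.executemany(
    "INSERT INTO sh50 VALUES (?, ?, ?)",
    [(603993, "洛阳钼业", 0.4), (601989, "中国重工", 1.08), (601988, "中国银行", 1.79)],
)

# An explicit column list returns only those fields.
cur = conn.execute("SELECT code, name FROM sh50")
some_columns = [d[0] for d in cur.description]
some_rows = cur.fetchall()

# SELECT * returns every field of the table.
cur = conn.execute("SELECT * FROM sh50")
all_columns = [d[0] for d in cur.description]

print(some_columns)  # ['code', 'name']
print(all_columns)   # ['code', 'name', 'weight']
conn.close()
```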

#### Fill the Table

In [7]:
%load_ext sql
import pymysql
pymysql.install_as_MySQLdb()

%sql mysql://few:123456@localhost/HelloDB?charset=utf8

'Connected: few@HelloDB'

In [8]:
%sql show tables;

4 rows affected.


Tables_in_HelloDB
company_info
daily_bar
hundred_stocks_twoyears_daily_bar
sh50


In [9]:
%sql drop table daily_bar;

0 rows affected.


[]

In [15]:
%sql drop table company_info;
%sql drop table sh50;

0 rows affected.
0 rows affected.


[]

In [1]:
import pandas as pd
stocks_info_df = pd.read_csv('../data/stocks_info.csv')

In [2]:
stocks_info_df.tail()

Unnamed: 0,index,code,name,industry,area,pe,outstanding,totals,totalAssets,liquidAssets,...,bvps,pb,timeToMarket,undp,perundp,rev,profit,gpr,npr,holders
3440,3440,300727,润禾材料,化工原料,浙江,0.0,0.0,0.0,43196.92,25291.61,...,0.0,0.0,0,8362.07,0.0,0.0,0.0,29.56,10.75,0.0
3441,3441,300723,一品红,化学制药,广东,0.0,0.0,0.0,76309.68,41579.12,...,0.0,0.0,0,22526.03,0.0,0.0,0.0,56.95,10.57,0.0
3442,3442,300721,怡达股份,化工原料,江苏,0.0,0.0,0.0,88405.46,52139.9,...,0.0,0.0,0,16709.92,0.0,0.0,0.0,16.56,5.44,0.0
3443,3443,2912,中新赛克,软件服务,深圳,0.0,0.0,0.0,95634.96,77540.45,...,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,31.45,0.0
3444,3444,2911,佛燃股份,供气供热,广东,0.0,0.0,0.0,441762.84,90943.81,...,0.0,0.0,0,89013.92,0.0,0.0,0.0,22.0,9.61,0.0


In [5]:
from sqlalchemy import create_engine
conn_helloDB = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}".format(user="few", pw="123456", db="HelloDB"))

In [18]:
stocks_info_df.to_sql(con=conn_helloDB, name='company_info', if_exists='replace', index=False)

In [14]:
sh50_df = pd.read_csv('../data/sh50.csv', sep='\t')
sh50_df.head()

Unnamed: 0,code,name,weight
0,603993,洛阳钼业,0.4
1,601989,中国重工,1.08
2,601988,中国银行,1.79
3,601985,中国核电,0.64
4,601881,中国银河,0.14


In [16]:
sh50_df.to_sql(con=conn_helloDB, name='sh50', if_exists='replace', index=False)

In [23]:
sh50_1year_df = pd.read_csv('../data/sh50_oneyear_dailybar.csv', sep=',')
sh50_1year_df.head()

Unnamed: 0.1,Unnamed: 0,code,date,open,high,close,low,volume
0,0,600000,2014-11-12,10.85,11.07,11.05,10.74,2710385.75
1,1,600000,2014-11-13,11.1,11.38,10.94,10.86,3712639.75
2,2,600000,2014-11-14,10.88,10.98,10.9,10.75,1984143.5
3,3,600000,2014-11-17,10.94,11.08,10.75,10.73,2145198.25
4,4,600000,2014-11-18,10.75,10.8,10.45,10.43,2549009.0


In [24]:
sh50_1year_df = sh50_1year_df.iloc[:,1:]

In [25]:
sh50_1year_df.head()

Unnamed: 0,code,date,open,high,close,low,volume
0,600000,2014-11-12,10.85,11.07,11.05,10.74,2710385.75
1,600000,2014-11-13,11.1,11.38,10.94,10.86,3712639.75
2,600000,2014-11-14,10.88,10.98,10.9,10.75,1984143.5
3,600000,2014-11-17,10.94,11.08,10.75,10.73,2145198.25
4,600000,2014-11-18,10.75,10.8,10.45,10.43,2549009.0


In [26]:
sh50_1year_df.to_sql(con=conn_helloDB, name='sh50_oneyear_dailybar', if_exists='replace', index=False)

In [27]:
%sql show tables;

4 rows affected.


Tables_in_HelloDB
company_info
hundred_stocks_twoyears_daily_bar
sh50
sh50_oneyear_dailybar


**Example**

In [20]:
%sql SHOW TABLES;

3 rows affected.


Tables_in_HelloDB
company_info
hundred_stocks_twoyears_daily_bar
sh50


In [33]:
import pandas as pd
from sqlalchemy import create_engine

df_sh50 = pd.read_sql('select * from sh50', con=conn_helloDB)
df_company = pd.read_sql('select * from company_info', con=conn_helloDB)
df_sh50_1year = pd.read_sql('select * from sh50_oneyear_dailybar', con=conn_helloDB)

In [28]:
# select entire table
%sql SELECT * FROM sh50

50 rows affected.


code,name,weight
603993,洛阳钼业,0.4
601989,中国重工,1.08
601988,中国银行,1.79
601985,中国核电,0.64
601881,中国银河,0.14
601878,浙商证券,0.12
601857,中国石油,1.08
601818,光大银行,1.39
601800,中国交建,0.4
601766,中国中车,1.66


In [29]:
# select some columns from the table
%sql SELECT code, name FROM sh50

50 rows affected.


code,name
603993,洛阳钼业
601989,中国重工
601988,中国银行
601985,中国核电
601881,中国银河
601878,浙商证券
601857,中国石油
601818,光大银行
601800,中国交建
601766,中国中车


In [19]:
df_sh50[['code','name']]

Unnamed: 0,code,name
0,603993,洛阳钼业
1,601989,中国重工
2,601988,中国银行
3,601985,中国核电
4,601881,中国银河
5,601878,浙商证券
6,601857,中国石油
7,601818,光大银行
8,601800,中国交建
9,601766,中国中车


### 2. `DISTINCT` keyword

* A column in a table often contains many duplicate values, and sometimes you only want to list the distinct values.

* The `SELECT DISTINCT` statement is used to return only distinct (different) values.

**Syntax**

> ```SQL
SELECT DISTINCT column1, column2, ... FROM table_name;
```
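
As a small, self-contained illustration of the syntax above, the sketch below assumes an in-memory SQLite database (Python's built-in `sqlite3`) in place of MySQL and a cut-down `company_info` table with four rows taken from these notes. It also shows that with several columns, `DISTINCT` applies to the combination of values.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (name TEXT, industry TEXT, area TEXT)")
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?)", [
    ("浦发银行", "银行", "上海"),
    ("招商银行", "银行", "深圳"),
    ("广发证券", "证券", "广东"),
    ("中国银河", "证券", "北京"),
])

all_rows = conn.execute("SELECT industry FROM company_info").fetchall()
distinct_rows = conn.execute("SELECT DISTINCT industry FROM company_info").fetchall()

# With two columns, a row is dropped only if the whole (industry, area) pair repeats.
distinct_pairs = conn.execute(
    "SELECT DISTINCT industry, area FROM company_info"
).fetchall()

print(len(all_rows))        # 4
print(len(distinct_rows))   # 2
print(len(distinct_pairs))  # 4
conn.close()
```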

**Example**

In [30]:
%sql DESCRIBE company_info

24 rows affected.


Field,Type,Null,Key,Default,Extra
index,bigint(20),YES,,,
code,bigint(20),YES,,,
name,text,YES,,,
industry,text,YES,,,
area,text,YES,,,
pe,double,YES,,,
outstanding,double,YES,,,
totals,double,YES,,,
totalAssets,double,YES,,,
liquidAssets,double,YES,,,


In [31]:
%sql select industry from company_info

3445 rows affected.


industry
专用机械
化工原料
化工原料
通信设备
供气供热
半导体
证券
航空
元器件
元器件


In [7]:
%sql SELECT DISTINCT industry FROM company_info

110 rows affected.


industry
专用机械
化工原料
通信设备
供气供热
半导体
证券
航空
元器件
生物制药
塑料


In [34]:
df_company['industry'].unique()

array(['专用机械', '化工原料', '通信设备', '供气供热', '半导体', '证券', '航空', '元器件', '生物制药',
       '塑料', '白酒', '化学制药', '建筑施工', '饲料', '机械基件', '电气设备', '运输设备', '钢加工',
       '其他建材', '广告包装', '电器仪表', '乳制品', '铝', '服饰', '医疗保健', '软件服务', '保险',
       '综合类', '汽车配件', '空运', '轻工机械', '新型电力', '矿物制品', '染料涂料', '中成药', '农药化肥',
       '玻璃', '食品', '工程机械', '多元金融', '旅游服务', '互联网', '酒店餐饮', '小金属', '纺织机械',
       '超市连锁', '环境保护', '造纸', '家用电器', '橡胶', '全国地产', '百货', '影视音像', '水力发电',
       '区域地产', '商贸代理', '农业综合', '仓储物流', '纺织', '园区开发', '医药商业', '汽车整车',
       '种植业', '电脑设备', '林业', '旅游景点', '装修装饰', '日用化工', '公路', '渔业', '化纤',
       '水运', '特种钢', '水泥', '家居用品', '机场', '红黄药酒', '文教休闲', '啤酒', '电信运营',
       '房产服务', '出版业', '路桥', '摩托车', '铜', '化工机械', '汽车服务', '港口', '黄金', '批发业',
       '机床制造', '普钢', '其他商业', '软饮料', '农用机械', '水务', '铅锌', '煤炭开采', '银行',
       '陶瓷', '火力发电', '石油加工', '石油贸易', '石油开采', '焦炭加工', '船舶', '商品城', '公共交通',
       '铁路', '电器连锁'], dtype=object)

### 3. `WHERE` clause

* The `WHERE` clause filters records: only the records that fulfill the specified condition are returned.

**Syntax**

>```SQL
SELECT column1, column2, ... FROM table_name WHERE condition;
```

**Note:** 
1. The `WHERE` clause is not only used in `SELECT` statements; it is also used in `UPDATE`, `DELETE`, and other statements.
2. SQL requires single quotes around text values (some database systems, such as MySQL in its default mode, also accept double quotes). Numeric values should not be enclosed in quotes.
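
The quoting rule can be seen in a minimal sketch. It assumes an in-memory SQLite database (Python's built-in `sqlite3`) rather than MySQL, and a reduced `company_info` table holding three rows that appear in these notes.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (code INTEGER, name TEXT, industry TEXT, pe REAL)")
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?, ?)", [
    (600000, "浦发银行", "银行", 6.64),
    (776, "广发证券", "证券", 16.52),
    (601881, "中国银河", "证券", 30.29),
])

# Text value: wrapped in single quotes inside the SQL statement.
securities = sorted(
    conn.execute("SELECT code FROM company_info WHERE industry = '证券'")
)

# Numeric value: written without quotes.
low_pe = sorted(conn.execute("SELECT code FROM company_info WHERE pe < 10"))

print(securities)  # [(776,), (601881,)]
print(low_pe)      # [(600000,)]
conn.close()
```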

**Example**

In [12]:
%sql select * from company_info where industry='证券'

34 rows affected.


code,name,industry,area,pe,pb,totalAssets
600621,华鑫股份,证券,上海,89.22,2.29,1624948.5
600155,宝硕股份,证券,河北,71.35,1.43,4179352.5
601878,浙商证券,证券,浙江,56.6,4.66,5144893.0
2500,山西证券,证券,山西,58.18,2.36,5267785.5
2797,第一创业,证券,深圳,100.55,4.55,3060716.75
601099,太平洋,证券,云南,236.86,2.35,4339441.5
601375,中原证券,证券,河南,68.24,3.12,4051356.0
776,广发证券,证券,广东,16.52,1.68,36241140.0
601881,中国银河,证券,北京,30.29,2.13,24398524.0
601108,财通证券,证券,浙江,54.27,4.85,4966410.5


In [21]:
mask = df_company['industry']=='证券'
df_company[mask]

Unnamed: 0,code,name,industry,area,pe,pb,totalAssets
6,600621,华鑫股份,证券,上海,89.22,2.29,1624948.5
207,600155,宝硕股份,证券,河北,71.35,1.43,4179352.5
605,601878,浙商证券,证券,浙江,56.6,4.66,5144893.0
709,2500,山西证券,证券,山西,58.18,2.36,5267785.5
834,2797,第一创业,证券,深圳,100.55,4.55,3060716.75
907,601099,太平洋,证券,云南,236.86,2.35,4339441.5
913,601375,中原证券,证券,河南,68.24,3.12,4051356.0
923,776,广发证券,证券,广东,16.52,1.68,36241140.0
945,601881,中国银河,证券,北京,30.29,2.13,24398524.0
951,601108,财通证券,证券,浙江,54.27,4.85,4966410.5


**Operators in The WHERE Clause**

|  Operator | Description  |
|---|---|
| =  | Equal  |
| <>  | Not equal. Note: many SQL dialects also accept `!=` for this operator  |
| >  | Greater than  |
| <  | Less than  |
| >=  | Greater than or equal  |
| <=  | Less than or equal  |
| BETWEEN  | Between an inclusive range  |
| LIKE  | Search for a pattern  |
| IN  | To specify multiple possible values for a column  |
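
The comparison operators in the table can be checked with a short sketch. It assumes an in-memory SQLite database (Python's built-in `sqlite3`) in place of MySQL; the helper `codes` is hypothetical and exists only to keep the example short. `BETWEEN`, `LIKE`, and `IN` have their own sections.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (code INTEGER, name TEXT, industry TEXT, pe REAL)")
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?, ?)", [
    (600000, "浦发银行", "银行", 6.64),
    (600036, "招商银行", "银行", 8.71),
    (601318, "中国平安", "保险", 14.45),
    (776, "广发证券", "证券", 16.52),
])

def codes(where):
    """Hypothetical helper: sorted stock codes of rows matching a WHERE condition."""
    sql = "SELECT code FROM company_info WHERE " + where
    return sorted(row[0] for row in conn.execute(sql))

not_bank = codes("industry <> '银行'")      # not equal
not_bank_alt = codes("industry != '银行'")  # alternative spelling of <>
at_least = codes("pe >= 8.71")             # includes the row with pe = 8.71
above = codes("pe > 8.71")                 # excludes the row with pe = 8.71

print(not_bank)  # [776, 601318]
print(at_least)  # [776, 600036, 601318]
print(above)     # [776, 601318]
```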

In [13]:
%sql select * from company_info where pe<10

352 rows affected.


code,name,industry,area,pe,pb,totalAssets
4,国农科技,生物制药,深圳,0.0,18.87,24121.07
300142,沃森生物,生物制药,云南,0.0,10.16,572023.31
600291,西水股份,保险,内蒙,9.09,2.36,23830208.0
300312,邦讯技术,通信设备,北京,0.0,5.54,131523.59
2306,*ST云网,酒店餐饮,北京,0.0,-269.49,9350.9
2194,武汉凡谷,通信设备,湖北,0.0,4.72,225832.41
611,天首发展,纺织,内蒙,0.0,4.72,75417.53
600641,万业企业,区域地产,上海,4.74,1.75,819652.31
600983,惠而浦,家用电器,安徽,0.0,1.62,840521.81
600151,航天机电,半导体,上海,0.0,1.92,1384813.38


In [16]:
%sql select * from company_info where name LIKE '%科技'

244 rows affected.


code,name,industry,area,pe,pb,totalAssets
2908,德生科技,元器件,广东,99.4,9.66,52750.28
4,国农科技,生物制药,深圳,0.0,18.87,24121.07
300716,国立科技,塑料,广东,40.32,3.16,72146.3
300620,光库科技,元器件,广东,75.0,9.16,52511.7
2618,丹邦科技,元器件,深圳,336.83,4.47,249018.52
2409,雅克科技,化工原料,江苏,211.42,9.02,179658.88
603690,至纯科技,专用机械,上海,116.82,14.12,95762.72
2617,露笑科技,电气设备,浙江,40.81,5.12,583202.25
300548,博创科技,元器件,浙江,65.76,8.58,65914.06
2202,金风科技,电气设备,新疆,18.43,2.6,6782537.0


### 4. `AND`, `OR` and `NOT`

* The `WHERE` clause can be combined with the `AND`, `OR`, and `NOT` operators.
* The `AND` and `OR` operators filter records based on more than one condition:
    * The `AND` operator displays a record if all the conditions separated by `AND` are TRUE.
    * The `OR` operator displays a record if any of the conditions separated by `OR` is TRUE.
* The `NOT` operator displays a record if the condition is NOT TRUE.

**Syntax**

>```SQL
SELECT column1, ... FROM table_name WHERE condition1 AND condition2 AND ...;

SELECT column1, ... FROM table_name WHERE condition1 OR condition2 OR ...;

SELECT column1, ... FROM table_name WHERE NOT condition;
```
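
The three operators can be compared side by side in a minimal sketch. It assumes an in-memory SQLite database (Python's built-in `sqlite3`) rather than MySQL; the helper `codes` is hypothetical. The last two queries show that `AND` is evaluated before `OR`, so parentheses can change the result.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (code INTEGER, name TEXT, industry TEXT, pe REAL)")
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?, ?)", [
    (600000, "浦发银行", "银行", 6.64),
    (600036, "招商银行", "银行", 8.71),
    (601318, "中国平安", "保险", 14.45),
    (776, "广发证券", "证券", 16.52),
])

def codes(where):
    """Hypothetical helper: sorted stock codes of rows matching a WHERE condition."""
    sql = "SELECT code FROM company_info WHERE " + where
    return sorted(row[0] for row in conn.execute(sql))

both = codes("industry = '银行' AND pe < 8")             # every condition holds
either = codes("industry = '保险' OR industry = '证券'")  # at least one holds
negated = codes("NOT industry = '银行'")                 # the condition is false

# AND binds more tightly than OR, so parentheses change which rows match.
no_parens = codes("industry = '保险' OR industry = '银行' AND pe < 8")
with_parens = codes("(industry = '保险' OR industry = '银行') AND pe < 8")

print(both)         # [600000]
print(either)       # [776, 601318]
print(negated)      # [776, 601318]
print(no_parens)    # [600000, 601318]
print(with_parens)  # [600000]
```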



**Example**

In [24]:
%sql select * from company_info where industry = '银行' and pe<10

20 rows affected.


code,name,industry,area,pe,pb,totalAssets
600000,浦发银行,银行,上海,6.64,0.96,606383680.0
600016,民生银行,银行,北京,5.72,0.85,571252480.0
600036,招商银行,银行,深圳,8.71,1.56,616923904.0
1,平安银行,银行,深圳,8.27,1.07,313748096.0
601169,北京银行,银行,北京,6.42,0.87,227522592.0
601818,光大银行,银行,北京,5.51,0.77,403041408.0
600919,江苏银行,银行,江苏,7.49,1.03,173756544.0
601009,南京银行,银行,江苏,6.65,1.17,114467088.0
601166,兴业银行,银行,福建,5.6,0.92,640699328.0
601229,上海银行,银行,上海,8.63,1.08,175999888.0


In [31]:
mask = (df_company['industry']=='银行') & (df_company['pe']<10)
df_company[mask]

Unnamed: 0,code,name,industry,area,pe,pb,totalAssets
961,600000,浦发银行,银行,上海,6.64,0.96,606383700.0
1063,600016,民生银行,银行,北京,5.72,0.85,571252500.0
1497,600036,招商银行,银行,深圳,8.71,1.56,616923900.0
1857,1,平安银行,银行,深圳,8.27,1.07,313748100.0
2014,601169,北京银行,银行,北京,6.42,0.87,227522600.0
2090,601818,光大银行,银行,北京,5.51,0.77,403041400.0
2093,600919,江苏银行,银行,江苏,7.49,1.03,173756500.0
2105,601009,南京银行,银行,江苏,6.65,1.17,114467100.0
2183,601166,兴业银行,银行,福建,5.6,0.92,640699300.0
2296,601229,上海银行,银行,上海,8.63,1.08,175999900.0


### 5. `ORDER BY` Keyword

* The `ORDER BY` keyword is used to sort the result-set in ascending or descending order.

* The `ORDER BY` keyword sorts the records in **ascending order by default**. To sort the records in descending order, use the `DESC` keyword.

**Syntax**

>```SQL
SELECT column1, ... FROM table_name ORDER BY column1, ... ASC|DESC;
```
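
The default and descending orders can be compared in a small sketch. It assumes an in-memory SQLite database (Python's built-in `sqlite3`) in place of MySQL, with four `company_info` rows copied from these notes.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (code INTEGER, name TEXT, pb REAL)")
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?)", [
    (600000, "浦发银行", 0.96),
    (600036, "招商银行", 1.56),
    (601318, "中国平安", 2.85),
    (776, "广发证券", 1.68),
])

def ordered_codes(order_by):
    """Hypothetical helper: stock codes in the order produced by ORDER BY."""
    sql = "SELECT code FROM company_info ORDER BY " + order_by
    return [row[0] for row in conn.execute(sql)]

default_order = ordered_codes("pb")      # ascending when no keyword is given
ascending = ordered_codes("pb ASC")      # the same order, stated explicitly
descending = ordered_codes("pb DESC")    # largest pb first

print(default_order)  # [600000, 600036, 776, 601318]
print(descending)     # [601318, 776, 600036, 600000]
```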

**Example**

In [32]:
%sql select * from company_info where industry = '银行' and pe<10 order by pb DESC

20 rows affected.


code,name,industry,area,pe,pb,totalAssets
2142,宁波银行,银行,浙江,8.72,1.74,95306440.0
600036,招商银行,银行,深圳,8.71,1.56,616923904.0
601997,贵阳银行,银行,贵州,7.88,1.42,43293212.0
601009,南京银行,银行,江苏,6.65,1.17,114467088.0
600926,杭州银行,银行,浙江,9.7,1.15,79777160.0
601229,上海银行,银行,上海,8.63,1.08,175999888.0
1,平安银行,银行,深圳,8.27,1.07,313748096.0
601398,工商银行,银行,北京,6.82,1.04,2576479744.0
600919,江苏银行,银行,江苏,7.49,1.03,173756544.0
601939,建设银行,银行,北京,6.34,1.01,2205394176.0


### 6. `LIMIT` keyword

* The `LIMIT` keyword is used to specify the number of records to return.

* The `LIMIT` keyword is useful on large tables with thousands of records, where returning every record can hurt performance.

**Syntax**

>```SQL
SELECT column_name(s) FROM table_name WHERE condition LIMIT number;
```

**Note:** Not all database systems support the `LIMIT` clause. MySQL supports `LIMIT` to select a limited number of records, while Oracle uses `ROWNUM`.
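
A common use of `LIMIT` is a "top N" query, where it is combined with `ORDER BY`. The sketch below assumes an in-memory SQLite database (Python's built-in `sqlite3`), which happens to support `LIMIT` like MySQL does, and four `company_info` rows copied from these notes.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (code INTEGER, name TEXT, totalAssets REAL)")
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?)", [
    (600000, "浦发银行", 606383680.0),
    (600036, "招商银行", 616923904.0),
    (601318, "中国平安", 616851584.0),
    (776, "广发证券", 36241140.0),
])

# Sort by total assets, largest first, and keep only the first two rows.
top2 = [
    row[0]
    for row in conn.execute(
        "SELECT code FROM company_info ORDER BY totalAssets DESC LIMIT 2"
    )
]

print(top2)  # [600036, 601318]
conn.close()
```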

**Example**

In [60]:
%sql select * from company_info ORDER BY totalAssets DESC limit 10;

10 rows affected.


code,name,industry,area,pe,pb,totalAssets
601398,工商银行,银行,北京,6.82,1.04,2576479744.0
601939,建设银行,银行,北京,6.34,1.01,2205394176.0
601288,农业银行,银行,北京,5.53,0.89,2092311808.0
601988,中国银行,银行,北京,5.87,0.83,1942243712.0
601328,交通银行,银行,上海,6.25,0.76,893579072.0
601166,兴业银行,银行,福建,5.6,0.92,640699328.0
600036,招商银行,银行,深圳,8.71,1.56,616923904.0
601318,中国平安,保险,深圳,14.45,2.85,616851584.0
600000,浦发银行,银行,上海,6.64,0.96,606383680.0
600016,民生银行,银行,北京,5.72,0.85,571252480.0


In [61]:
%sql SELECT * from company_info WHERE area='广东' ORDER BY totalAssets DESC limit 10;

10 rows affected.


code,name,industry,area,pe,pb,totalAssets
600048,保利地产,全国地产,广东,11.23,1.31,63262392.0
776,广发证券,证券,广东,16.52,1.68,36241140.0
333,美的集团,家用电器,广东,17.41,4.89,24218328.0
651,格力电器,家用电器,广东,13.33,4.69,22072378.0
600029,南方航空,空运,广东,9.8,1.82,21133400.0
100,TCL 集团,家用电器,广东,23.93,2.47,16090642.0
600325,华发股份,区域地产,广东,13.17,1.23,13346109.0
601238,广汽集团,汽车整车,广东,15.1,3.48,9184121.0
987,越秀金控,多元金融,广东,40.11,2.01,7776926.5
539,粤电力Ａ,火力发电,广东,25.88,1.14,7098163.0


### 7. `MIN` and `MAX` function

* The `MIN()` function returns the smallest value of the selected column.

* The `MAX()` function returns the largest value of the selected column.

**Syntax**

>```SQL
SELECT MIN/MAX(column_name) FROM table_name WHERE condition;
```
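
As an illustration, the sketch below assumes an in-memory SQLite database (Python's built-in `sqlite3`) in place of MySQL and four `company_info` rows copied from these notes. The last query uses a scalar subquery to retrieve the row that holds the maximum, which is one possible SQL counterpart of the pandas lookup in the example.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (code INTEGER, name TEXT, industry TEXT, pe REAL)")
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?, ?)", [
    (600000, "浦发银行", "银行", 6.64),
    (600036, "招商银行", "银行", 8.71),
    (601318, "中国平安", "保险", 14.45),
    (776, "广发证券", "证券", 16.52),
])

# Smallest and largest pe over the whole table.
lowest, highest = conn.execute("SELECT MIN(pe), MAX(pe) FROM company_info").fetchone()

# A WHERE clause restricts the rows before the function is applied.
(bank_highest,) = conn.execute(
    "SELECT MAX(pe) FROM company_info WHERE industry = '银行'"
).fetchone()

# Scalar subquery: fetch the row whose pe equals the maximum.
(top_name,) = conn.execute(
    "SELECT name FROM company_info WHERE pe = (SELECT MAX(pe) FROM company_info)"
).fetchone()

print(lowest, highest)  # 6.64 16.52
print(bank_highest)     # 8.71
print(top_name)         # 广发证券
conn.close()
```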

**Example**

In [36]:
%sql SELECT MAX(pe) from company_info;

1 rows affected.


MAX(pe)
20163.08


In [42]:
import numpy as np

df_company[df_company.pe==np.max(df_company.pe)]

Unnamed: 0,code,name,industry,area,pe,pb,totalAssets
1445,600733,S*ST前锋,区域地产,四川,20163.08,53.53,38522.1


### 8. `COUNT()`, `AVG()` and `SUM()` Functions

* The `COUNT()` function returns the number of rows that match a specified criterion.

* The `AVG()` function returns the average value of a numeric column.

* The `SUM()` function returns the total sum of a numeric column.

**Syntax**

>```SQL
SELECT COUNT/AVG/SUM(column_name) FROM table_name WHERE condition;
```
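
The three functions can be combined in one statement. The sketch below assumes an in-memory SQLite database (Python's built-in `sqlite3`) rather than MySQL, with four `company_info` rows copied from these notes; the `WHERE` clause limits the aggregation to the two bank rows.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE company_info (code INTEGER, name TEXT, industry TEXT, totalAssets REAL)"
)
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?, ?)", [
    (600000, "浦发银行", "银行", 606383680.0),
    (600036, "招商银行", "银行", 616923904.0),
    (601318, "中国平安", "保险", 616851584.0),
    (776, "广发证券", "证券", 36241140.0),
])

# Number of rows in the whole table.
(total_rows,) = conn.execute("SELECT COUNT(*) FROM company_info").fetchone()

# Count, average, and sum over the rows that satisfy the condition.
bank_count, bank_avg, bank_sum = conn.execute(
    "SELECT COUNT(*), AVG(totalAssets), SUM(totalAssets) "
    "FROM company_info WHERE industry = '银行'"
).fetchone()

print(total_rows)                      # 4
print(bank_count, bank_avg, bank_sum)  # 2 611653792.0 1223307584.0
conn.close()
```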

**Example**

In [43]:
%sql SELECT COUNT(*) FROM company_info;

1 rows affected.


COUNT(*)
3445


In [44]:
%sql SELECT AVG(totalAssets) FROM company_info;

1 rows affected.


AVG(totalAssets)
6270242.639500718


### 9. `LIKE` Operator

* The `LIKE` operator is used in a `WHERE` clause to search for a specified pattern in a column.

* There are two wildcards used in conjunction with the LIKE operator:

    * `%` - The percent sign represents zero, one, or multiple characters
    * `_` - The underscore represents a single character

**Syntax**

>```SQL
SELECT column1, column2, ... FROM table_name WHERE columnN LIKE pattern;
```
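
The difference between the two wildcards can be seen in a minimal sketch. It assumes an in-memory SQLite database (Python's built-in `sqlite3`) in place of MySQL and four `company_info` rows copied from these notes; the helper `codes` is hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (code INTEGER, name TEXT, industry TEXT)")
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?)", [
    (2908, "德生科技", "元器件"),
    (4, "国农科技", "生物制药"),
    (2202, "金风科技", "电气设备"),
    (600570, "恒生电子", "软件服务"),
])

def codes(where):
    """Hypothetical helper: sorted stock codes of rows matching a WHERE condition."""
    sql = "SELECT code FROM company_info WHERE " + where
    return sorted(row[0] for row in conn.execute(sql))

ends_with = codes("name LIKE '%科技'")      # anything, then 科技 at the end
one_char = codes("name LIKE '金_科技'")     # exactly one character between 金 and 科技
contains = codes("industry LIKE '%软件%'")  # 软件 anywhere in the value

print(ends_with)  # [4, 2202, 2908]
print(one_char)   # [2202]
print(contains)   # [600570]
```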

**Example**

In [48]:
%sql SELECT * FROM company_info 

3445 rows affected.


code,name,industry,area,pe,pb,totalAssets
300722,N新余,专用机械,江西,28.1,3.51,43229.96
603916,N苏博特,化工原料,江苏,25.74,2.4,249746.05
300725,N药石,化工原料,江苏,17.03,3.01,36124.62
603083,N剑桥,通信设备,上海,28.87,2.19,184472.72
600903,贵州燃气,供气供热,贵州,25.26,1.28,751383.75
600460,士兰微,半导体,浙江,76.67,5.33,594216.69
600621,华鑫股份,证券,上海,89.22,2.29,1624948.5
300719,安达维尔,航空,北京,317.04,5.98,58974.3
603920,世运电路,元器件,广东,43.0,3.91,294390.53
2908,德生科技,元器件,广东,99.4,9.66,52750.28


In [50]:
%sql SELECT * FROM company_info WHERE industry LIKE '%软件%'

160 rows affected.


code,name,industry,area,pe,pb,totalAssets
600406,国电南瑞,软件服务,江苏,43.39,5.54,1754030.75
600570,恒生电子,软件服务,浙江,90.55,12.64,496096.06
300645,正元智慧,软件服务,浙江,1368.84,6.9,79530.81
300608,思特奇,软件服务,北京,348.03,5.46,81293.43
603138,海量数据,软件服务,北京,76.2,12.22,62705.81
300366,创意信息,软件服务,四川,67.42,2.44,374654.53
300508,维宏股份,软件服务,上海,57.8,10.67,53064.4
2544,杰赛科技,软件服务,广东,259.26,7.58,393695.28
2280,联络互动,软件服务,浙江,472.3,3.44,1398823.0
300525,博思软件,软件服务,福建,212.06,8.14,59848.9


### 10. `IN` Operator

* The `IN` operator allows you to specify multiple values in a WHERE clause.

* The `IN` operator is a shorthand for multiple OR conditions.

**Syntax**

>```SQL
SELECT column_name(s) FROM table_name WHERE column_name IN (value1, ...);
```

>```SQL
SELECT column_name(s) FROM table_name WHERE column_name IN (SELECT STATEMENT);
```
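
Both forms, and the equivalence with a chain of `OR` conditions, can be checked in a short sketch. It assumes an in-memory SQLite database (Python's built-in `sqlite3`) rather than MySQL, with reduced `company_info` and `sh50` tables whose rows are copied from these notes; the helper `codes` is hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite replaces the MySQL database used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_info (code INTEGER, name TEXT, area TEXT)")
conn.execute("CREATE TABLE sh50 (code INTEGER, name TEXT)")
conn.executemany("INSERT INTO company_info VALUES (?, ?, ?)", [
    (600167, "联美控股", "辽宁"),
    (661, "长春高新", "吉林"),
    (601881, "中国银河", "北京"),
    (776, "广发证券", "广东"),
])
conn.executemany("INSERT INTO sh50 VALUES (?, ?)", [
    (601881, "中国银河"),
    (601988, "中国银行"),
])

def codes(where):
    """Hypothetical helper: sorted stock codes of rows matching a WHERE condition."""
    sql = "SELECT code FROM company_info WHERE " + where
    return sorted(row[0] for row in conn.execute(sql))

in_list = codes("area IN ('黑龙江', '吉林', '辽宁')")
or_chain = codes("area = '黑龙江' OR area = '吉林' OR area = '辽宁'")

# The value list can also come from another SELECT statement.
in_subquery = codes("name IN (SELECT name FROM sh50)")

print(in_list)      # [661, 600167]
print(in_subquery)  # [601881]
```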

**Example**

In [58]:
%sql SELECT * FROM company_info WHERE area IN ('黑龙江', '吉林', '辽宁');

152 rows affected.


code,name,industry,area,pe,pb,totalAssets
600167,联美控股,供气供热,辽宁,37.71,3.46,997789.81
603396,金辰股份,专用机械,辽宁,61.12,7.01,95960.03
603559,中通国脉,通信设备,吉林,326.32,12.42,78824.17
300597,吉大通信,通信设备,吉林,114.85,7.23,93014.45
600747,*ST大控,多元金融,辽宁,70.63,2.25,237582.34
603315,福鞍股份,机械基件,辽宁,114.02,5.0,114385.97
661,长春高新,生物制药,吉林,48.46,7.4,741841.38
631,顺发恒业,区域地产,吉林,12.71,1.8,1394656.25
600360,华微电子,半导体,吉林,82.42,2.95,375349.19
2667,鞍重股份,工程机械,辽宁,77.35,3.32,87334.43


In [64]:
%sql SELECT * FROM company_info WHERE name IN (SELECT name FROM sh50) ORDER BY totalAssets DESC

50 rows affected.


code,name,industry,area,pe,pb,totalAssets
601398,工商银行,银行,北京,6.82,1.04,2576479744.0
601288,农业银行,银行,北京,5.53,0.89,2092311808.0
601988,中国银行,银行,北京,5.87,0.83,1942243712.0
601328,交通银行,银行,上海,6.25,0.76,893579072.0
601166,兴业银行,银行,福建,5.6,0.92,640699328.0
600036,招商银行,银行,深圳,8.71,1.56,616923904.0
601318,中国平安,保险,深圳,14.45,2.85,616851584.0
600000,浦发银行,银行,上海,6.64,0.96,606383680.0
600016,民生银行,银行,北京,5.72,0.85,571252480.0
601818,光大银行,银行,北京,5.51,0.77,403041408.0


In [72]:
mask = df_company['name'].isin(df_sh50['name'])
df_company[mask].sort_values(by='totalAssets', ascending=False)

Unnamed: 0,code,name,industry,area,pe,pb,totalAssets
3185,601398,工商银行,银行,北京,6.82,1.04,2576480000.0
2906,601288,农业银行,银行,北京,5.53,0.89,2092312000.0
2845,601988,中国银行,银行,北京,5.87,0.83,1942244000.0
2585,601328,交通银行,银行,上海,6.25,0.76,893579100.0
2183,601166,兴业银行,银行,福建,5.6,0.92,640699300.0
1497,600036,招商银行,银行,深圳,8.71,1.56,616923900.0
105,601318,中国平安,保险,深圳,14.45,2.85,616851600.0
961,600000,浦发银行,银行,上海,6.64,0.96,606383700.0
1063,600016,民生银行,银行,北京,5.72,0.85,571252500.0
2090,601818,光大银行,银行,北京,5.51,0.77,403041400.0


### 11. `BETWEEN` Operator

* The BETWEEN operator selects values within a given range. The values can be numbers, text, or dates.

* The BETWEEN operator is inclusive: begin and end values are included. 

**Syntax**

>```SQL
SELECT column_name(s) FROM table_name WHERE column_name BETWEEN value1 AND value2;
```
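
The inclusive behavior can be verified in a small sketch. It assumes an in-memory SQLite database (Python's built-in `sqlite3`) in place of MySQL, and it stores dates as ISO `YYYY-MM-DD` text, which compares correctly as strings; the four daily-bar rows are copied from these notes, and the helper `dates` is hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite replaces MySQL; dates are stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sh50_oneyear_dailybar (code INTEGER, date TEXT, close REAL)")
conn.executemany("INSERT INTO sh50_oneyear_dailybar VALUES (?, ?, ?)", [
    (600000, "2014-11-18", 10.45),
    (600000, "2014-12-01", 12.16),
    (600000, "2014-12-05", 13.53),
    (600000, "2014-12-12", 12.98),
])

def dates(where):
    """Hypothetical helper: sorted dates of rows matching a WHERE condition."""
    sql = "SELECT date FROM sh50_oneyear_dailybar WHERE " + where
    return sorted(row[0] for row in conn.execute(sql))

# Both end points of the range are included in the result.
by_date = dates("date BETWEEN '2014-12-01' AND '2014-12-12'")
by_close = dates("close BETWEEN 12.16 AND 12.98")

# BETWEEN is shorthand for a pair of >= and <= comparisons.
explicit = dates("date >= '2014-12-01' AND date <= '2014-12-12'")

print(by_date)   # ['2014-12-01', '2014-12-05', '2014-12-12']
print(by_close)  # ['2014-12-01', '2014-12-12']
```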

**Example**

In [76]:
%sql SELECT * from company_info WHERE pe BETWEEN 18 AND 20;

62 rows affected.


code,name,industry,area,pe,pb,totalAssets
601021,春秋航空,空运,上海,19.46,3.68,2052272.38
2202,金风科技,电气设备,新疆,18.43,2.6,6782537.0
600176,中国巨石,玻璃,浙江,18.95,3.31,2453806.5
887,中鼎股份,橡胶,安徽,19.08,3.35,1484711.13
786,北新建材,其他建材,北京,19.35,3.47,1640711.75
603369,今世缘,白酒,江苏,19.74,3.97,656467.25
600426,华鲁恒升,农药化肥,山东,19.77,2.51,1532190.88
600600,青岛啤酒,啤酒,山东,19.27,2.71,3255315.5
600618,氯碱化工,化工原料,上海,18.84,5.96,481418.06
601607,上海医药,医药商业,上海,18.61,2.0,9199127.0


In [81]:
%sql SELECT * from sh50_oneyear_dailybar WHERE date BETWEEN '2014-12-01' AND '2014-12-15' and code='600000';

11 rows affected.


code,date,open,high,close,low,volume
600000,2014-12-01 00:00:00,12.45,12.98,12.16,12.12,6072703.5
600000,2014-12-02 00:00:00,12.03,13.08,12.87,12.03,5893710.5
600000,2014-12-03 00:00:00,12.84,13.29,12.58,12.32,7305614.5
600000,2014-12-04 00:00:00,12.58,13.3,13.27,12.37,7200416.5
600000,2014-12-05 00:00:00,13.42,14.0,13.53,12.9,8596476.0
600000,2014-12-08 00:00:00,13.39,14.04,13.81,13.2,7039225.0
600000,2014-12-09 00:00:00,13.56,14.16,12.65,12.46,8691938.0
600000,2014-12-10 00:00:00,12.7,13.25,13.16,12.21,6202724.0
600000,2014-12-11 00:00:00,12.96,13.55,13.05,12.85,4425712.0
600000,2014-12-12 00:00:00,13.05,13.38,12.98,12.78,3105587.5


In [86]:
mask = (df_sh50_1year.code=='600000') & (df_sh50_1year.date>='2014-12-01') & (df_sh50_1year.date<='2014-12-15')
df_sh50_1year[mask]

Unnamed: 0,code,date,open,high,close,low,volume
13,600000,2014-12-01,12.45,12.98,12.16,12.12,6072703.5
14,600000,2014-12-02,12.03,13.08,12.87,12.03,5893710.5
15,600000,2014-12-03,12.84,13.29,12.58,12.32,7305614.5
16,600000,2014-12-04,12.58,13.3,13.27,12.37,7200416.5
17,600000,2014-12-05,13.42,14.0,13.53,12.9,8596476.0
18,600000,2014-12-08,13.39,14.04,13.81,13.2,7039225.0
19,600000,2014-12-09,13.56,14.16,12.65,12.46,8691938.0
20,600000,2014-12-10,12.7,13.25,13.16,12.21,6202724.0
21,600000,2014-12-11,12.96,13.55,13.05,12.85,4425712.0
22,600000,2014-12-12,13.05,13.38,12.98,12.78,3105587.5

