# **Data Loading, Storage, and File Formats**
---

### **Reading and Writing Data in Text Format**
> **read_csv and read_table**

In [2]:
import numpy as np
import pandas as pd
df = pd.read_csv('examples/ex1.csv')
df


,a,b,c,d,message
0,1,2,3,4,hello
1,5,6,7,8,world
2,9,10,11,12,foo


In [3]:
import numpy as np
import pandas as pd
df = pd.read_table('examples/ex1.csv',sep=',')
df

,a,b,c,d,message
0,1,2,3,4,hello
1,5,6,7,8,world
2,9,10,11,12,foo
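
As a quick check of the two calls above, the following sketch parses the same comma-separated text with both functions. The inline `csv_text` string is only a stand-in for `examples/ex1.csv` (an assumption, so the snippet runs without the file on disk).

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of examples/ex1.csv.
csv_text = (
    "a,b,c,d,message\n"
    "1,2,3,4,hello\n"
    "5,6,7,8,world\n"
    "9,10,11,12,foo\n"
)

# read_csv uses a comma as its default delimiter.
df_csv = pd.read_csv(io.StringIO(csv_text))

# read_table defaults to a tab delimiter, so the comma is passed explicitly.
df_table = pd.read_table(io.StringIO(csv_text), sep=",")

# Both calls should produce the same DataFrame.
same = df_csv.equals(df_table)
print(same)
```

The only difference between the two functions here is the default value of `sep`.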


> **names: supply your own column names for a file that has no header row**

In [4]:
pd.read_csv('examples/ex2.csv',names=['a','b','c','d','message'])

,a,b,c,d,message
0,1,2,3,4,hello
1,5,6,7,8,world
2,9,10,11,12,foo
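
To make the effect of `names` concrete, the sketch below reads the same header-less text three ways. The `raw` string is a hypothetical stand-in for `examples/ex2.csv`.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file without a header row.
raw = "1,2,3,4,hello\n5,6,7,8,world\n9,10,11,12,foo\n"

# Default behaviour: the first data row is consumed as the header.
wrong = pd.read_csv(io.StringIO(raw))

# header=None keeps every row as data and assigns integer column labels.
auto = pd.read_csv(io.StringIO(raw), header=None)

# names=... keeps every row as data and applies the given labels.
named = pd.read_csv(io.StringIO(raw), names=["a", "b", "c", "d", "message"])

print(len(wrong), len(auto), len(named))
```

Without `header=None` or `names`, one row of data is silently lost to the header, which is why one of the two should be passed for header-less files.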


> **index_col: use a column from the file as the index**

In [5]:
names = ['a', 'b', 'c', 'd', 'message']
pd.read_csv('examples/ex2.csv',names=names,index_col='message')

message,a,b,c,d
hello,1,2,3,4
world,5,6,7,8
foo,9,10,11,12


> **index_col=['key1', 'key2']: build a hierarchical index from several columns**

In [6]:
parsed = pd.read_csv('examples/csv_mindex.csv',
                    index_col=['key1', 'key2'])
parsed

key1,key2,value1,value2
one,a,1,2
one,b,3,4
one,c,5,6
one,d,7,8
two,a,9,10
two,b,11,12
two,c,13,14
two,d,15,16
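
Once the hierarchical index is built, rows can be selected by outer level or by a full key tuple. The following is a minimal sketch; the inline `text` is a shortened, hypothetical stand-in for `examples/csv_mindex.csv`.

```python
import io

import pandas as pd

# Shortened, hypothetical stand-in for examples/csv_mindex.csv.
text = (
    "key1,key2,value1,value2\n"
    "one,a,1,2\n"
    "one,b,3,4\n"
    "two,a,9,10\n"
    "two,b,11,12\n"
)

parsed = pd.read_csv(io.StringIO(text), index_col=["key1", "key2"])

# Selecting on the outer level returns the rows under 'one',
# indexed by the remaining inner level (key2).
one_rows = parsed.loc["one"]

# A full (key1, key2) tuple plus a column label addresses a single cell.
cell = parsed.loc[("two", "b"), "value2"]

print(one_rows)
print(cell)
```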


> **skiprows=[0, 2, 3]: skip the listed lines of the file (0-based line numbers)**

In [13]:
pd.read_csv('examples/ex4.csv',skiprows=[0,2,3])

,a,b,c,d,message
0,1,2,3,4,hello
1,5,6,7,8,world
2,9,10,11,12,foo
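
The sketch below shows why lines 0, 2 and 3 are the ones skipped. It assumes the file interleaves comment lines with the data, as the inline `text` does; that layout is a guess at what `examples/ex4.csv` looks like, not a copy of it. When every unwanted line starts with the same marker, the `comment` parameter is an alternative to counting line numbers.

```python
import io

import pandas as pd

# Hypothetical contents: comment lines sit at positions 0, 2 and 3 (0-based).
text = (
    "# hey!\n"
    "a,b,c,d,message\n"
    "# an explanatory comment line\n"
    "# another comment line\n"
    "1,2,3,4,hello\n"
    "5,6,7,8,world\n"
    "9,10,11,12,foo\n"
)

# Skip the unwanted lines by their 0-based line numbers.
skipped = pd.read_csv(io.StringIO(text), skiprows=[0, 2, 3])

# Alternative: treat everything after '#' as a comment and ignore it.
commented = pd.read_csv(io.StringIO(text), comment="#")

print(skipped.equals(commented))
```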


In [16]:
result = pd.read_csv('examples/ex5.csv')
result

,something,a,b,c,d,message
0,one,1,2,3.0,4,
1,two,5,6,,8,world
2,three,9,10,11.0,12,foo
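
The empty cells in the output above are missing values. As a sketch of how they can be inspected and extended, the example below uses an inline `text` that is assumed to resemble `examples/ex5.csv` (one `NA` marker and one empty field). The per-column sentinels passed to `na_values` are purely illustrative.

```python
import io

import pandas as pd

# Hypothetical stand-in for examples/ex5.csv: one 'NA' string, one empty field.
text = (
    "something,a,b,c,d,message\n"
    "one,1,2,3,4,NA\n"
    "two,5,6,,8,world\n"
    "three,9,10,11,12,foo\n"
)

result = pd.read_csv(io.StringIO(text))

# Boolean mask marking the cells that pandas parsed as missing.
missing_mask = result.isna()

# na_values adds extra sentinels; a dict applies them per column.
# The default markers (such as 'NA' and empty fields) are still recognized.
sentinels = {"message": ["foo"], "something": ["two"]}
custom = pd.read_csv(io.StringIO(text), na_values=sentinels)

print(missing_mask.sum())
print(custom)
```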


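> **Common read_csv/read_table parameters**
 ---
> **path (`filepath_or_buffer`): location of the file, a URL, or a file-like object**  
> **sep: delimiter used to split each row; a character sequence or a regular expression**  
> **header: row number to use as the column names (inferred from the first line by default); pass None if the file has no header row**  
> **index_col: column(s) to use as the row index**  
> **names: list of column names; for header-less files, combine with header=None**  
> **skiprows: number of lines to skip at the start of the file, or a list of 0-based line numbers to skip**  
> **na_values: additional values to treat as NA (missing)**  
> **nrows: number of rows to read from the start of the file**  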
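
Several of these parameters are often combined in one call. The following sketch, using made-up whitespace-separated text, passes a regular expression as `sep`, declares that there is no header row, supplies column names, and limits the number of rows read.

```python
import io

import pandas as pd

# Made-up data: fields separated by a variable amount of whitespace, no header.
text = "1  2   3\n4 5    6\n7   8 9\n"

df = pd.read_csv(
    io.StringIO(text),
    sep=r"\s+",             # regular expression: one or more whitespace characters
    header=None,            # the file has no header row
    names=["x", "y", "z"],  # illustrative column names
    nrows=2,                # read only the first two data rows
)

print(df)
```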

### **Reading Text Files in Pieces**

> **nrows: read only the given number of rows**

In [23]:
pd.options.display.max_rows = 10
result = pd.read_csv('examples/ex6.csv',index_col='key',nrows=5)
result

key,one,two,three,four
L,0.467976,-0.038649,-0.295344,-1.824726
B,-0.358893,1.404453,0.704965,-0.200638
G,-0.50184,0.659254,-0.421691,-0.057688
R,0.204886,1.074134,1.388361,-0.982404
Q,0.354628,-0.133116,0.283763,-0.837063


> **chunksize: read the file in chunks of the given number of rows**

In [None]:
chunker = pd.read_csv('examples/ex6.csv', chunksize=1000)
chunker
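
With `chunksize` set, `read_csv` returns an iterator rather than a DataFrame, and each iteration yields the next block of rows. As a minimal sketch of how such an iterator might be consumed, the example below counts the values of a `key` column across chunks. The synthetic `text` and the tiny chunk size are assumptions made so the snippet runs without `ex6.csv`.

```python
import io

import pandas as pd

# Synthetic stand-in data: 10 rows with a 'key' column.
rows = ["key,value"] + [f"{k},{i}" for i, k in enumerate("ABABCABCAA")]
text = "\n".join(rows) + "\n"

# chunksize=4 yields DataFrames of at most 4 rows each.
chunker = pd.read_csv(io.StringIO(text), chunksize=4)

chunk_sizes = []
counts = []
for piece in chunker:
    chunk_sizes.append(len(piece))
    # Count the keys within this chunk only.
    counts.append(piece["key"].value_counts())

# Combine the per-chunk counts into totals per key.
totals = pd.concat(counts).groupby(level=0).sum().sort_values(ascending=False)

print(chunk_sizes)
print(totals)
```

Only one chunk is held in memory at a time, so the same pattern applies to files too large to load whole.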

### **Writing Data to Text Format**

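> **to_csv: write the data to a delimited text file**  
> **na_rep='null': write missing values as the string 'null' instead of an empty field**  
> **index=False, header=False: omit the row index and the column labels from the output**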

In [26]:
data = pd.read_csv('examples/ex5.csv')
data.to_csv('examples/out.csv')


In [27]:
import sys
data.to_csv(sys.stdout,sep='|')

|something|a|b|c|d|message
0|one|1|2|3.0|4|
1|two|5|6||8|world
2|three|9|10|11.0|12|foo
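
The `na_rep`, `index` and `header` options listed at the top of this section can be tried without touching the disk by writing to an in-memory buffer. The small `data` frame below is made up for illustration.

```python
import io

import numpy as np
import pandas as pd

# Made-up frame with one missing number and one missing string.
data = pd.DataFrame(
    {
        "something": ["one", "two"],
        "c": [3.0, np.nan],
        "message": [None, "world"],
    }
)

# na_rep: missing values are written as the given string.
buf = io.StringIO()
data.to_csv(buf, na_rep="null")
with_null = buf.getvalue().splitlines()

# index=False, header=False: only the data values are written.
buf = io.StringIO()
data.to_csv(buf, index=False, header=False)
bare = buf.getvalue().splitlines()

print(with_null)
print(bare)
```

By default missing values become empty fields, which is what the second call shows.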


### **Binary Data Formats**
---

### **Interacting with Web APIs**
---

### **Interacting with Databases**
---