# Introduction to Julia DataFrames.jl (1): Getting Started

![](https://juliadata.github.io/DataFrames.jl/stable/assets/logo.png)

DataFrames.jl official documentation: [https://juliadata.github.io/DataFrames.jl/stable/](https://juliadata.github.io/DataFrames.jl/stable/)

DataFrames.jl GitHub: [https://github.com/JuliaData/DataFrames.jl/blob/master/docs/src/index.md](https://github.com/JuliaData/DataFrames.jl/blob/master/docs/src/index.md)

## 0. Installation

If DataFrames.jl is not installed yet, run `Pkg.add()` to install it. The cell below pins version 0.20.2.

In [1]:
using Pkg
Pkg.add(PackageSpec(name="DataFrames", version="0.20.2"))

   Updating registry at `C:\Users\HSI\.julia\registries\General`
   Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Resolving package versions...
   Updating `C:\Users\HSI\.julia\environments\v1.4\Project.toml`
 [no changes]
   Updating `C:\Users\HSI\.julia\environments\v1.4\Manifest.toml`
 [no changes]


## 1. Creating a DataFrame

In [2]:
using DataFrames

In [None]:
list1 = [:F1,:F2,:F3]
dfx = DataFrame();
for i in list1
    dfx[:,i] = 1:3
end

dfx

### 1.1 Creating a DataFrame from vectors

In [9]:
methods(DataFrame)

In [7]:
df = DataFrame(col1 = 1:5, col2 = ["M", "F", "F", missing, "M"])
# df = DataFrame(1:5, ["M", "F", "F", missing, "M"]) # This fails.

Unnamed: 0_level_0,col1,col2
Unnamed: 0_level_1,Int64,String⍰
1,1,M
2,2,F
3,3,F
4,4,missing
5,5,M


### 1.2 Creating a DataFrame column by column

In [191]:
# Create an empty DataFrame with the constructor
df = DataFrame()

In [192]:
# Assign each column and its values to add it to the DataFrame
df.col1 = 1:5
df.col2 = ["M", "F", "F", missing, "M"]

# Display the DataFrame with show()
show(df)

5×2 DataFrame
│ Row │ col1  │ col2    │
│     │ Int64 │ String⍰ │
├─────┼───────┼─────────┤
│ 1   │ 1     │ M       │
│ 2   │ 2     │ F       │
│ 3   │ 3     │ F       │
│ 4   │ 4     │ missing │
│ 5   │ 5     │ M       │

### 1.3 Adding rows to a DataFrame

To append a row to a DataFrame, pass the values as a tuple, a vector, or a dictionary.

In [32]:
# Using a tuple
push!(df, (1, "M"))

Unnamed: 0_level_0,col1,col2
Unnamed: 0_level_1,Int64,String⍰
1,1,M
2,2,F
3,3,F
4,4,missing
5,5,M
6,1,M


In [33]:
# Using a vector
push!(df, [2, "f"])

Unnamed: 0_level_0,col1,col2
Unnamed: 0_level_1,Int64,String⍰
1,1,M
2,2,F
3,3,F
4,4,missing
5,5,M
6,1,M
7,2,f


In [209]:
describe(df[:,[:col2]])

Unnamed: 0_level_0,variable,mean,min,median,max,nunique,nmissing,eltype
Unnamed: 0_level_1,Symbol,Nothing,String,Nothing,String,Int64,Int64,Union
1,col2,,F,,M,2,1,"Union{Missing, String}"


In [211]:
# Using a Dict
push!(df, Dict(:col2 => "F", :col1 => 2))
# The ⍰ after a column's element type means the column allows missing values

Unnamed: 0_level_0,col1,col2
Unnamed: 0_level_1,Int64,String⍰
1,1,M
2,2,F
3,3,F
4,4,missing
5,5,M
6,2,F
7,2,F


In [35]:
push!(df, Dict("col2" => "F", "col1" => 2))

┌ Error: Error adding value to column :col1.
└ @ DataFrames C:\Users\HSI\.julia\packages\DataFrames\S3ZFo\src\dataframe\dataframe.jl:1316


KeyError: KeyError: key :col1 not found
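
The `KeyError` above shows that, in the DataFrames.jl version used here, `push!` looks up the columns by `Symbol` (`:col1`), so a dictionary with `String` keys does not match. A minimal sketch of one workaround, using only Base Julia, is to rebuild the dictionary with `Symbol` keys first. The names `row` and `symrow` are illustrative and not part of the notebook.

```julia
# Hypothetical row data whose keys are Strings rather than Symbols.
row = Dict("col2" => "F", "col1" => 2)

# Rebuild the dictionary, converting every key to a Symbol.
symrow = Dict(Symbol(k) => v for (k, v) in row)

println(keytype(row))     # String
println(keytype(symrow))  # Symbol
println(symrow[:col1])    # 2
```

With keys of this form, `push!(df, symrow)` should behave like the `Dict(:col2 => "F", :col1 => 2)` call shown earlier.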

In [187]:
dict1 = Dict(:key1 => 1,:key2 => 2)

Dict{Symbol,Int64} with 2 entries:
  :key2 => 2
  :key1 => 1

In [184]:
dict1 = Dict("key1" => 1,:key2 => 2)

Dict{Any,Int64} with 2 entries:
  :key2  => 2
  "key1" => 1

### 1.4 Deleting rows

Call `deleterows!()` to delete the specified rows from a DataFrame.

In [24]:
deleterows!(df, 7:8)

Unnamed: 0_level_0,col1,col2
Unnamed: 0_level_1,Int64,String⍰
1,1,M
2,2,F
3,3,F
4,4,missing
5,5,M
6,1,M
7,2,f
8,1,M


### 1.5 Loading a dataset

Continuing the earlier example, we use CSV.jl to load the Auto MPG Data Set from the UCI Machine Learning Repository. The loaded object is a DataFrame.

If CSV.jl is not installed yet, install it first.

In [10]:
Pkg.add("CSV")

 Resolving package versions...
  Updating `~/.julia/environments/v1.2/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.2/Manifest.toml`
 [no changes]


In [25]:
using CSV

CSV.jl's `read()` function turns CSV data into a DataFrame: the value returned by `CSV.read()` is of type DataFrame.

The warning messages are expected. The dataset contains missing values, so loading the CSV emits warnings.

In [53]:
df = CSV.read("auto-mpg_D17.data", delim=',')



Unnamed: 0_level_0,mpg,cylinders,displacement,horsepower,weight,acceleration,model year
Unnamed: 0_level_1,Float64,Int64⍰,Float64,String,Float64⍰,Float64⍰,Float64
1,18.0,8,307.0,130.0,3504.0,12.0,70.0
2,15.0,8,350.0,165.0,3693.0,11.5,70.0
3,18.0,8,318.0,150.0,3436.0,11.0,70.0
4,16.0,8,304.0,150.0,3433.0,12.0,70.0
5,17.0,8,302.0,140.0,3449.0,10.5,70.0
6,15.0,8,429.0,198.0,4341.0,10.0,70.0
7,14.0,8,454.0,220.0,4354.0,9.0,70.0
8,14.0,8,440.0,215.0,4312.0,8.5,70.0
9,14.0,8,455.0,225.0,4425.0,10.0,70.0
10,15.0,8,390.0,190.0,3850.0,8.5,70.0


In [27]:
# To display every column or row, use `show()`
# The call below displays all columns of rows 1-5
show(df[1:5, :], allcols=true)

5×9 DataFrame
│ Row │ mpg     │ cylinders │ displacement │ horsepower │ weight   │
│     │ Float64 │ Int64⍰    │ Float64      │ String     │ Float64⍰ │
├─────┼─────────┼───────────┼──────────────┼────────────┼──────────┤
│ 1   │ 18.0    │ 8         │ 307.0        │ 130.0      │ 3504.0   │
│ 2   │ 15.0    │ 8         │ 350.0        │ 165.0      │ 3693.0   │
│ 3   │ 18.0    │ 8         │ 318.0        │ 150.0      │ 3436.0   │
│ 4   │ 16.0    │ 8         │ 304.0        │ 150.0      │ 3433.0   │
│ 5   │ 17.0    │ 8         │ 302.0        │ 140.0      │ 3449.0   │

│ Row │ acceleration │ model year │ origin  │ car name                  │
│     │ Float64⍰     │ Float64    │ Float64 │ String                    │
├─────┼──────────────┼────────────┼─────────┼───────────────────────────┤
│ 1   │ 12.0         │ 70.0       │ 1.0     │ chevrolet chevelle malibu │
│ 2   │ 11.5         │ 70.0       │ 1.0     │ b

### 1.6 Copying a DataFrame

Call `copy()` to duplicate a DataFrame into a new, independent one.

In [79]:
df2 = copy(df)
# df2 has the same contents as df
describe(df2) # summarize each column with descriptive statistics

Unnamed: 0_level_0,variable,mean,min,median,max,nunique,nmissing,eltype
Unnamed: 0_level_1,Symbol,Union…,Any,Union…,Any,Union…,Union…,Type
1,col1,3.0,1,3.0,5,,,Int64
2,col2,,F,,M,2.0,1.0,"Union{Missing, String}"


Use `df2 = copy(df)` or `df2 = df[:,:]` to duplicate an entire DataFrame. Plain assignment (`df2 = df`) only binds a second name to the same object, so modifications made through one name also appear through the other.

In [248]:
df = DataFrame();
df.col1 = 1:6
df.col2 = ["row 1", "row 2", "row 3", "row 4",missing, "row 6"];

In [249]:
df2 = df # df2 is just another name for df; no duplication
df3 = copy(df) # this duplicates
df4 = df[:,:] # this duplicates
df6 = first(df,5) # this duplicates
df5 = view(df,2,:) # a view into row 2 of df; no duplication
df7 = df[!,:] # a new DataFrame that shares df's column vectors; no duplication

Unnamed: 0_level_0,col1,col2
Unnamed: 0_level_1,Int64,String⍰
1,1,row 1
2,2,row 2
3,3,row 3
4,4,row 4
5,5,missing
6,6,row 6


In [254]:
deleterows!(df,[2,3]) # delete row 2, 3
deleterows!(df2,4) # delete row 4 (previous row 6)
deleterows!(df3,5)
deleterows!(df4,1:4)
show(df); show(df2); show(df3); show(df4); show(df6);

BoundsError: BoundsError: attempt to access 1-element Array{Int64,1} at index [4]

In [251]:
df

Unnamed: 0_level_0,col1,col2
Unnamed: 0_level_1,Int64,String⍰
1,1,row 1
2,4,row 4
3,5,missing


`df5 = view(df,2,:)` views the 2nd row of `df`. Because rows 2, 3, and 6 of `df` were deleted earlier, the 2nd row is now the former 4th row.

In [252]:
df5

Unnamed: 0_level_0,col1,col2
Unnamed: 0_level_1,Int64,String⍰
2,4,row 4


In [253]:
df7

Unnamed: 0_level_0,col1,col2
Unnamed: 0_level_1,Int64,String⍰
1,1,row 1
2,4,row 4
3,5,missing


In [218]:
println(objectid(df))
println(objectid(df2))
println(objectid(df3))
println(objectid(df4))
println(objectid(df5))
println(objectid(df6))
println(objectid(df7)) # objectid identifies an object; df and df2 print the same value because they are the same object

18105907993406196748
18105907993406196748
62453577532054375
12534539854090992422
17500274004664766282
10794258277842078017
5241647470823125230
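
The matching `objectid` values for `df` and `df2` can also be checked directly with `===`. The following is a minimal sketch, assuming DataFrames.jl is installed; the small table and the names `alias` and `dup` are made up for illustration.

```julia
using DataFrames

df = DataFrame(col1 = [1, 2, 3], col2 = ["row 1", "row 2", "row 3"])

alias = df        # a second name for the same object
dup   = copy(df)  # an independent copy

println(alias === df)  # true
println(dup === df)    # false

# Appending a row through `df` is visible through `alias` but not `dup`.
push!(df, (4, "row 4"))
println(size(alias, 1))  # 4
println(size(dup, 1))    # 3
```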


## 2. Saving a DataFrame to a CSV file

In [134]:
df = CSV.read("auto-mpg_D17.data", delim=',')
CSV.write("a_D17.csv", df)



"a_D17.csv"

The directory listing confirms that the CSV file was written.

In [135]:
readdir()

50-element Array{String,1}:
 ".ipynb_checkpoints"
 "04-02-2020.csv"
 "04-02-2020_D18.csv"
 "04-02-2020_D19.csv"
 "AbstractArray.png"
 "Julia_Number (2).png"
 "Julia_Number.png"
 "Questions to ask.ipynb"
 "a.csv"
 "a_D17.csv"
 "auto-mpg.data"
 "auto-mpg_D17.data"
 "b.json"
 ⋮
 "julia_014_database_IO_csv_json_SQL.ipynb"
 "julia_015_Logging.ipynb"
 "julia_016_Multiple Dispatch.ipynb"
 "julia_017_ DataFrames 入門.ipynb"
 "julia_018_example.ipynb"
 "julia_019_example-Copy1.ipynb"
 "julia_019_example.ipynb"
 "julia_020_example.ipynb"
 "log.txt"
 "pointy.png"
 "scalar_index.png"
 "train.csv"

In [136]:
# Use Julia's standard-library DelimitedFiles module
using DelimitedFiles

In [137]:
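# Check the file's header and first 5 rows
# (readdlm splits on whitespace by default, so each entry below stops at the first space)
readdlm("a_D17.csv")[1:6]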

6-element Array{Any,1}:
 "mpg,cylinders,displacement,horsepower,weight,acceleration,model"
 "18.0,8,307.0,130.0,3504.0,12.0,70.0,1.0,chevrolet"
 "15.0,8,350.0,165.0,3693.0,11.5,70.0,1.0,buick"
 "18.0,8,318.0,150.0,3436.0,11.0,70.0,1.0,plymouth"
 "16.0,8,304.0,150.0,3433.0,12.0,70.0,1.0,amc"
 "17.0,8,302.0,140.0,3449.0,10.5,70.0,1.0,ford"

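The entries above are cut at the first space (for example `model` instead of `model year`) because `readdlm` splits on whitespace unless a delimiter is given. A small sketch of the difference follows; the two-line file is a made-up stand-in for `a_D17.csv`, written to a temporary path.

```julia
using DelimitedFiles

# Hypothetical two-line CSV written to a temporary file for illustration.
path = tempname()
write(path, "mpg,car name\n18.0,chevrolet chevelle malibu\n")

by_space = readdlm(path)       # default: split on whitespace
by_comma = readdlm(path, ',')  # split on commas instead

println(by_space[1, 1])  # mpg,car
println(size(by_comma))  # (2, 2)
println(by_comma[2, 2])  # chevrolet chevelle malibu
```

Passing `','` keeps each comma-separated field whole, including names that contain spaces.
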
## 3. Working with DataFrames

### 3.1 Inspecting a DataFrame

In [138]:
# Check the DataFrame's dimensions
size(df)

(398, 9)

In [139]:
# Summarize the DataFrame
describe(df)

Unnamed: 0_level_0,variable,mean,min,median,max,nunique,nmissing,eltype
Unnamed: 0_level_1,Symbol,Union…,Any,Union…,Any,Union…,Union…,Type
1,mpg,23.5146,9.0,23.0,46.6,,,Float64
2,cylinders,5.44836,3.0,4.0,8,,1.0,"Union{Missing, Int64}"
3,displacement,192.682,8.0,146.0,455.0,,,Float64
4,horsepower,,304.0,,?,94.0,,String
5,weight,2966.01,193.0,2797.5,5140.0,,6.0,"Union{Missing, Float64}"
6,acceleration,27.5656,8.0,15.5,4732.0,,6.0,"Union{Missing, Float64}"
7,model year,112.433,18.5,76.0,3035.0,,,Float64
8,origin,1.98719,1.0,1.0,70.0,,,Float64
9,car name,,71.0,,vw rabbit custom,306.0,,String


Each of the three lines below lists all rows and columns.

In [60]:
df
df[!, :]
df[:, :]

Unnamed: 0_level_0,mpg,cylinders,displacement,horsepower,weight,acceleration,model year
Unnamed: 0_level_1,Float64,Int64⍰,Float64,String,Float64⍰,Float64⍰,Float64
1,18.0,8,307.0,130.0,3504.0,12.0,70.0
2,15.0,8,350.0,165.0,3693.0,11.5,70.0
3,18.0,8,318.0,150.0,3436.0,11.0,70.0
4,16.0,8,304.0,150.0,3433.0,12.0,70.0
5,17.0,8,302.0,140.0,3449.0,10.5,70.0
6,15.0,8,429.0,198.0,4341.0,10.0,70.0
7,14.0,8,454.0,220.0,4354.0,9.0,70.0
8,14.0,8,440.0,215.0,4312.0,8.5,70.0
9,14.0,8,455.0,225.0,4425.0,10.0,70.0
10,15.0,8,390.0,190.0,3850.0,8.5,70.0


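In a Jupyter Notebook, only as much data as fits on screen is displayed by default, so some columns and rows may be hidden. The `show()` function gives finer control over the display. The example below sets `allcols=true` and `allrows=true` to show every column and row.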

In [22]:
show(df, allcols=true, allrows=true)

398×9 DataFrame
│ Row │ mpg     │ cylinders │ displacement │ horsepower │ weight   │
│     │ Float64 │ Int64⍰    │ Float64      │ String     │ Float64⍰ │
├─────┼─────────┼───────────┼──────────────┼────────────┼──────────┤
│ 1   │ 18.0    │ 8         │ 307.0        │ 130.0      │ 3504.0   │
│ 2   │ 15.0    │ 8         │ 350.0        │ 165.0      │ 3693.0   │
│ 3   │ 18.0    │ 8         │ 318.0        │ 150.0      │ 3436.0   │
│ 4   │ 16.0    │ 8         │ 304.0        │ 150.0      │ 3433.0   │
│ 5   │ 17.0    │ 8         │ 302.0        │ 140.0      │ 3449.0   │
│ 6   │ 15.0    │ 8         │ 429.0        │ 198.0      │ 4341.0   │
│ 7   │ 14.0    │ 8         │ 454.0        │ 220.0      │ 4354.0   │
│ 8   │ 14.0    │ 8         │ 440.0        │ 215.0      │ 4312.0   │
│ 9   │ 14.0    │ 8         │ 455.0        │ 225.0      │ 4425.0   │
│ 10  │ 15.0    │ 8         │ 390.0        │ 190.0      │ 3850.0   │
│ 11  │ 15.0    │ 8         │ 383.0  

`first()` and `last()` return the first n or last n rows of a DataFrame.

In [140]:
first(df, 5)

Unnamed: 0_level_0,mpg,cylinders,displacement,horsepower,weight,acceleration,model year
Unnamed: 0_level_1,Float64,Int64⍰,Float64,String,Float64⍰,Float64⍰,Float64
1,18.0,8,307.0,130.0,3504.0,12.0,70.0
2,15.0,8,350.0,165.0,3693.0,11.5,70.0
3,18.0,8,318.0,150.0,3436.0,11.0,70.0
4,16.0,8,304.0,150.0,3433.0,12.0,70.0
5,17.0,8,302.0,140.0,3449.0,10.5,70.0
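
As a self-contained sketch of both calls, assuming DataFrames.jl is installed: the small `cars` table below copies the `mpg` and `cylinders` values of the five rows shown above, and the variable names are illustrative.

```julia
using DataFrames

# Five rows copied from the output above (mpg and cylinders only).
cars = DataFrame(mpg = [18.0, 15.0, 18.0, 16.0, 17.0],
                 cylinders = [8, 8, 8, 8, 8])

head = first(cars, 2)  # rows 1-2
tail = last(cars, 2)   # rows 4-5

println(head.mpg)  # [18.0, 15.0]
println(tail.mpg)  # [16.0, 17.0]
```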


### 3.2 Subsetting a DataFrame

To view a subset of a DataFrame, use `df[<row index>, <column index>]`.

In [141]:
df[1:5, :]

Unnamed: 0_level_0,mpg,cylinders,displacement,horsepower,weight,acceleration,model year
Unnamed: 0_level_1,Float64,Int64⍰,Float64,String,Float64⍰,Float64⍰,Float64
1,18.0,8,307.0,130.0,3504.0,12.0,70.0
2,15.0,8,350.0,165.0,3693.0,11.5,70.0
3,18.0,8,318.0,150.0,3436.0,11.0,70.0
4,16.0,8,304.0,150.0,3433.0,12.0,70.0
5,17.0,8,302.0,140.0,3449.0,10.5,70.0


In [142]:
# As before, display all columns
show(df[1:5, :], allcols=true)

5×9 DataFrame
│ Row │ mpg     │ cylinders │ displacement │ horsepower │ weight   │
│     │ Float64 │ Int64⍰    │ Float64      │ String     │ Float64⍰ │
├─────┼─────────┼───────────┼──────────────┼────────────┼──────────┤
│ 1   │ 18.0    │ 8         │ 307.0        │ 130.0      │ 3504.0   │
│ 2   │ 15.0    │ 8         │ 350.0        │ 165.0      │ 3693.0   │
│ 3   │ 18.0    │ 8         │ 318.0        │ 150.0      │ 3436.0   │
│ 4   │ 16.0    │ 8         │ 304.0        │ 150.0      │ 3433.0   │
│ 5   │ 17.0    │ 8         │ 302.0        │ 140.0      │ 3449.0   │

│ Row │ acceleration │ model year │ origin  │ car name                  │
│     │ Float64⍰     │ Float64    │ Float64 │ String                    │
├─────┼──────────────┼────────────┼─────────┼───────────────────────────┤
│ 1   │ 12.0         │ 70.0       │ 1.0     │ chevrolet chevelle malibu │
│ 2   │ 11.5         │ 70.0       │ 1.0     │ b

In [143]:
# Select specific rows and columns to view
df[[1, 3, 5], [1, 2, 9]]

Unnamed: 0_level_0,mpg,cylinders,car name
Unnamed: 0_level_1,Float64,Int64⍰,String
1,18.0,8,chevrolet chevelle malibu
2,18.0,8,plymouth satellite
3,17.0,8,ford torino


Columns can be specified by index or by name. A name is written as `:` followed by the column name, which makes it a `Symbol`. For example:

In [144]:
df[1:5, [:mpg, :displacement, :horsepower]]

Unnamed: 0_level_0,mpg,displacement,horsepower
Unnamed: 0_level_1,Float64,Float64,String
1,18.0,307.0,130.0
2,15.0,350.0,165.0
3,18.0,318.0,150.0
4,16.0,304.0,150.0
5,17.0,302.0,140.0
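
To confirm that index-based and name-based selection pick the same data, here is a short sketch on a made-up three-row table (values copied from the rows above), assuming DataFrames.jl is installed.

```julia
using DataFrames

cars = DataFrame(mpg = [18.0, 15.0, 18.0],
                 displacement = [307.0, 350.0, 318.0],
                 horsepower = ["130.0", "165.0", "150.0"])

by_index = cars[1:2, [1, 3]]                # columns 1 and 3 by position
by_name  = cars[1:2, [:mpg, :horsepower]]   # the same columns by name

println(by_index == by_name)  # true
println(size(by_name))        # (2, 2)

# A single column name (not wrapped in a vector) returns a plain vector.
col = cars[:, :mpg]
println(col)  # [18.0, 15.0, 18.0]
```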


### 3.3 `select()` and `select!()`

To pick columns from a DataFrame, use `select()` or `select!()`. The difference is that `select()` leaves the original DataFrame unchanged and returns a new one, whereas `select!()` modifies the original in place.

In [29]:
select(df2, 1:3)

Unnamed: 0_level_0,mpg,cylinders,displacement
Unnamed: 0_level_1,Float64,Int64⍰,Float64
1,18.0,8,307.0
2,15.0,8,350.0
3,18.0,8,318.0
4,16.0,8,304.0
5,17.0,8,302.0
6,15.0,8,429.0
7,14.0,8,454.0
8,14.0,8,440.0
9,14.0,8,455.0
10,15.0,8,390.0


In [30]:
# df2 is unchanged
size(df2)

(398, 9)

In [31]:
# After select!(), df2 is modified and keeps only the 3 selected columns
select!(df2, 1:3)
size(df2)

(398, 3)
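
The same contrast on a tiny, made-up table (assuming DataFrames.jl is installed), so the sizes are easy to follow:

```julia
using DataFrames

df = DataFrame(a = [1, 2], b = [3, 4], c = [5, 6])

picked = select(df, 1:2)  # new DataFrame with the first two columns
println(size(picked))     # (2, 2)
println(size(df))         # (2, 3), the original is untouched

select!(df, 1:2)          # drops column c from df itself
println(size(df))         # (2, 2)
```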

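### 3.4 Column operations

#### Aggregate

The `aggregate` function applies one or more functions to each column. For example, to compute the mean and median of fuel economy (mpg) and engine displacement, follow the demonstration below. The `mean` and `median` functions come from the Statistics module.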

In [165]:
df3 = CSV.read("a_D17.csv")
df3 = df[1:6,:]

Unnamed: 0_level_0,mpg,cylinders,displacement,horsepower,weight,acceleration,model year
Unnamed: 0_level_1,Float64,Int64⍰,Float64,String,Float64⍰,Float64⍰,Float64
1,18.0,8,307.0,130.0,3504.0,12.0,70.0
2,15.0,8,350.0,165.0,3693.0,11.5,70.0
3,18.0,8,318.0,150.0,3436.0,11.0,70.0
4,16.0,8,304.0,150.0,3433.0,12.0,70.0
5,17.0,8,302.0,140.0,3449.0,10.5,70.0
6,15.0,8,429.0,198.0,4341.0,10.0,70.0


In [166]:
df3 = df[1:5, [:mpg, :displacement, :weight]]

Unnamed: 0_level_0,mpg,displacement,weight
Unnamed: 0_level_1,Float64,Float64,Float64⍰
1,18.0,307.0,3504.0
2,15.0,350.0,3693.0
3,18.0,318.0,3436.0
4,16.0,304.0,3433.0
5,17.0,302.0,3449.0


In [148]:
using Statistics

In [167]:
aggregate(df3, [mean, median])

Unnamed: 0_level_0,mpg_mean,displacement_mean,weight_mean,mpg_median,displacement_median,weight_median
Unnamed: 0_level_1,Float64,Float64,Float64,Float64,Float64,Float64
1,16.8,316.2,3503.0,17.0,307.0,3449.0
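
To make explicit what the call above computes, the sketch below applies `mean` and `median` to each column by hand with `eachcol`. It assumes DataFrames.jl is installed and rebuilds `df3` from the five rows shown earlier. It is an illustration of the idea, not how `aggregate` is implemented.

```julia
using DataFrames
using Statistics

# The five rows of df3 shown above.
df3 = DataFrame(mpg = [18.0, 15.0, 18.0, 16.0, 17.0],
                displacement = [307.0, 350.0, 318.0, 304.0, 302.0],
                weight = [3504.0, 3693.0, 3436.0, 3433.0, 3449.0])

# Apply each function to every column, one column at a time.
col_means   = [mean(col) for col in eachcol(df3)]
col_medians = [median(col) for col in eachcol(df3)]

println(col_means)
println(col_medians)  # [17.0, 307.0, 3449.0]
```

The values agree with the `*_mean` and `*_median` columns in the `aggregate` output.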


In [170]:
x2 = x -> x*2
# Group by :mpg and :weight, then apply x2 to the remaining column (displacement)
aggregate(df3,[:mpg,:weight], x2)

Unnamed: 0_level_0,mpg,weight,displacement_function
Unnamed: 0_level_1,Float64,Float64⍰,Float64
1,18.0,3504.0,614.0
2,15.0,3693.0,700.0
3,18.0,3436.0,636.0
4,16.0,3433.0,608.0
5,17.0,3449.0,604.0


In [172]:
df = DataFrame(a = repeat([1, 2, 3, 4], outer=[2]),
                      b = repeat([2, 1], outer=[4]),
                      c = 1:8)

Unnamed: 0_level_0,a,b,c
Unnamed: 0_level_1,Int64,Int64,Int64
1,1,2,1
2,2,1,2
3,3,2,3
4,4,1,4
5,1,2,5
6,2,1,6
7,3,2,7
8,4,1,8


In [177]:
aggregate(df, [:a, :b], sum) # group rows by :a and :b, then sum column c within each group

Unnamed: 0_level_0,a,b,c_sum
Unnamed: 0_level_1,Int64,Int64,Int64
1,1,2,6
2,2,1,8
3,3,2,10
4,4,1,12


#### A brief look at sorting

Sorting is covered in more detail in later material.

`sort()` returns a sorted copy and does not change the original DataFrame.

In [34]:
sort(df3)

Unnamed: 0_level_0,mpg,displacement
Unnamed: 0_level_1,Float64,Float64
1,15.0,350.0
2,16.0,304.0
3,17.0,302.0
4,18.0,307.0
5,18.0,318.0


In [35]:
df3

Unnamed: 0_level_0,mpg,displacement
Unnamed: 0_level_1,Float64,Float64
1,18.0,307.0
2,15.0,350.0
3,18.0,318.0
4,16.0,304.0
5,17.0,302.0


`sort!()` sorts in place and changes the original DataFrame.

The example below sorts by displacement in descending order.

In [178]:
sort!(df3, :displacement, rev=true)

Unnamed: 0_level_0,mpg,displacement,weight
Unnamed: 0_level_1,Float64,Float64,Float64⍰
1,15.0,350.0,3693.0
2,18.0,318.0,3436.0
3,18.0,307.0,3504.0
4,16.0,304.0,3433.0
5,17.0,302.0,3449.0


In [37]:
df3

Unnamed: 0_level_0,mpg,displacement
Unnamed: 0_level_1,Float64,Float64
1,15.0,350.0
2,18.0,318.0
3,18.0,307.0
4,16.0,304.0
5,17.0,302.0


In [180]:
sort(df3, :mpg)

Unnamed: 0_level_0,mpg,displacement,weight
Unnamed: 0_level_1,Float64,Float64,Float64⍰
1,15.0,350.0,3693.0
2,16.0,304.0,3433.0
3,17.0,302.0,3449.0
4,18.0,318.0,3436.0
5,18.0,307.0,3504.0
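
As a compact recap of `sort()` versus `sort!()`, here is a sketch on a made-up three-row table, assuming DataFrames.jl is installed:

```julia
using DataFrames

df = DataFrame(mpg = [18.0, 15.0, 16.0],
               displacement = [307.0, 350.0, 304.0])

sorted = sort(df, :mpg)  # returns a sorted copy
println(sorted.mpg)      # [15.0, 16.0, 18.0]
println(df.mpg)          # [18.0, 15.0, 16.0], original order kept

sort!(df, :displacement, rev=true)  # sorts df itself, descending
println(df.displacement)            # [350.0, 307.0, 304.0]
```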
