# Ghehlien

**Ghehlien** (系聯) is a clustering method used in historical Chinese phonology.
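As a rough illustration of the clustering idea, linking can be read as finding connected components: two characters land in the same group when a chain of (character, speller) pairs connects them. The sketch below assumes Julia 1.x and uses a small union-find; `link_groups` is a hypothetical name and is not the `ghehlien` function that the notebook loads from `fuzzynum.jl`, whose implementation is not shown here. The sample pairs are copied from the 上平01東 rows of the `grp_1_4` output in section 1.4.

```julia
# Hypothetical helper, not the notebook's `ghehlien`: groups characters that are
# linked, directly or through a chain, by (character, speller) pairs.
function link_groups(pairs)
    parent = Dict{String,String}()

    # Find the representative of x's group, compressing the path on the way.
    function find(x)
        root = x
        while parent[root] != root
            root = parent[root]
        end
        while parent[x] != root
            nxt = parent[x]
            parent[x] = root
            x = nxt
        end
        return root
    end

    for (a, b) in pairs
        get!(parent, a, a)              # register unseen characters as their own group
        get!(parent, b, b)
        ra, rb = find(a), find(b)
        ra != rb && (parent[ra] = rb)   # merge the two groups
    end

    # Collect the members of each group under its representative.
    groups = Dict{String,Set{String}}()
    for ch in collect(keys(parent))
        push!(get!(groups, find(ch), Set{String}()), ch)
    end
    return collect(values(groups))
end

# (lower character, lower character of its own pyanxchet) pairs from 上平01東
edges = [("紅", "公"), ("弓", "戎"), ("戎", "融"), ("中", "弓"),
         ("宮", "戎"), ("空", "紅"), ("終", "戎"), ("東", "紅")]
groups = link_groups(edges)
for g in groups
    println(join(sort(collect(g))))
end
```

With these pairs the sketch yields two groups, one containing 公空紅東 and one containing 融戎中弓終宮, which matches the two lines printed for 上平01東 in section 1.4. The order of groups and of characters within a group is not meaningful here.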

## 1. Analyses of Kuangxyonh

In this section, the ghehlien of **pyanxchet upper characters** (反切上字) and **pyanxchet lower characters** (反切下字) in **Kuangxyonh** (廣韻) will be analysed.

### 1.1 Reading data from file

In [1]:
using CSV

In [2]:
df = CSV.read("data.csv", types = Dict(7 => String))

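| Unnamed: 0 | 廣韻韻部順序&廣韻韻部原貌(調整前) | 小韻序 | 上字 | 下字 | 中古拼音(polyhedron 版) | 廣韻字頭(覈校後) | 小韻內字序 |
|---|---|---|---|---|---|---|---|
| 1 | 上平01東 | 1 | 德 | 紅 | tung | 東 | 1 |
| 2 | 上平01東 | 1 | 德 | 紅 | tung | 菄 | 2 |
| 3 | 上平01東 | 1 | 德 | 紅 | tung | 鶇 | 3 |
| 4 | 上平01東 | 1 | 德 | 紅 | tung | 䍶 | 4 |
| 5 | 上平01東 | 1 | 德 | 紅 | tung | 𠍀 | 5 |
| 6 | 上平01東 | 1 | 德 | 紅 | tung | 倲 | 6 |
| 7 | 上平01東 | 1 | 德 | 紅 | tung | 𩜍 | 7 |
| 8 | 上平01東 | 1 | 德 | 紅 | tung | 𢘐 | 8 |
| 9 | 上平01東 | 1 | 德 | 紅 | tung | 涷 | 9 |
| 10 | 上平01東 | 1 | 德 | 紅 | tung | 蝀 | 10 |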


### 1.2 Pyanxchet Upper Characters

In [3]:
include("fuzzynum.jl")
using fuzzynum

**1.2.1. Create a new set $S$ and put all upper characters into it:**

In [4]:
# Remove missing data: some small rhymes (小韻) have no pyanxchet

s = Set{String}(filter!(x -> typeof(x) == String, Array(df[:上字])))

Set(String["當", "跪", "女", "握", "羽", "危", "尼", "羊", "同", "醋"  …  "匹", "連", "征", "并", "下", "辝", "色", "卑", "視", "縛"])
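On Julia 1.x, where `missing` is part of the language, the same filtering step can be sketched with `skipmissing` instead of a `typeof` check. The column below is toy data standing in for `df[:上字]`, not the contents of `data.csv`.

```julia
# Toy column: three spellers (one repeated) and one small rhyme without a pyanxchet.
upper = Union{Missing,String}["德", missing, "德", "都"]

# skipmissing drops the missing entries lazily; Set removes the duplicates.
s_toy = Set{String}(skipmissing(upper))
println(s_toy)
```

The result holds each distinct speller once, which is the input that the later linking step needs.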

**1.2.2. Pair each pyanxchet upper character with the upper character of its own pyanxchet**

In [5]:
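function getUCList()
    dfG = Array(df[Symbol("廣韻字頭(覈校後)")])  # head characters
    dfS = Array(df[:上字])                        # upper character of each entry
    lst = setToList(s)
    n = length(lst)
    ret = []
    for i in 1:n
        ch = lst[i]
        # Look ch up among the head characters and take the upper character of its entry
        push!(ret, (ch, dfS[getIndexInArr(dfG, ch)]))
    end
    ret
end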

getUCList (generic function with 1 method)

In [6]:
uclist = getUCList()

471-element Array{Any,1}:
 ("兹", "疾")
 ("鋤", "士")
 ("爭", "側")
 ("明", "武")
 ("之", "止")
 ("數", "所")
 ("北", "博")
 ("彼", "甫")
 ("衢", "其")
 ("爲", "薳")
 ("匹", "譬")
 ("愛", "烏")
 ("傍", "步")
 ⋮         
 ("平", "房")
 ("區", "豈")
 ("速", "桑")
 ("始", "詩")
 ("呵", "虎")
 ("部", "裴")
 ("諸", "章")
 ("丕", "敷")
 ("榮", "永")
 ("遵", "將")
 ("除", "直")
 ("狂", "巨")
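The pairing step can also be sketched with a dictionary lookup instead of a linear search. This is a minimal sketch for Julia 1.x under one assumption that the notebook does not confirm: that `getIndexInArr` returns the position of the first matching head character. The names `heads`, `uppers`, `first_row`, and `chars` are hypothetical, and the three pairs are taken from the `uclist` output above.

```julia
# Toy stand-ins for the head-character and upper-character columns.
heads  = ["北", "之", "諸", "之"]
uppers = ["博", "止", "章", "止"]

# Row index of the first entry for each head character.
first_row = Dict{String,Int}()
for (i, h) in enumerate(heads)
    get!(first_row, h, i)      # keeps the first occurrence only
end

# Pair each character with the upper character of its own entry.
chars = ["之", "諸", "北"]
pairs_found = [(ch, uppers[first_row[ch]]) for ch in chars if haskey(first_row, ch)]
println(pairs_found)
```

Building the index once makes each lookup constant time, which matters little for 471 characters but keeps the loop simple.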

**1.2.3. Run ghehlien** (each output line is one group of linked characters)

In [7]:
ghehlien(uclist)

兹匠疾自情慈秦
鋤仕崇鶵查豺助雛士崱鉏牀
爭側阻仄莊鄒簪
明靡文美武亡望眉巫彌綿無
之章征氏占止旨煑脂識職正支諸
數山沙色疏疎生砂所史
北博伯布邊補巴百晡
彼父甫必兵并防筆弼婢陂符卑方皮縛扶畀分毗裴馮浮府鄙便封附房平部
爲雨筠于羽薳洧雲云永有韋王榮
衢具強俟求暨渠臼奇其巨狂
匹譬
愛哀安鷖烏
傍白捕薄蒲步
先胥蘇須司息素寫辛雖斯桑思相私悉速
胡侯獲乎下戶懷何黃
弋台隨悅實營辝余似旬夷辭以羊乘食寺詳移翼徐祥予與餘神夕
堂徒唐特同陀度杜
驅傾跪弃起袪曲乞綺丘欽詰羌去卿窺豈墟區
憂謁握挹央依烟於委衣一乙紆伊憶
署嘗承是成視市常蜀殊寔時殖
德多得
力連縷里良呂離林
借𩛠醉資祖將作即子姊漸則臧遵
當都冬
廁創瘡初叉楚測芻
郎魯練
治宅丈持佇植臣遟直墜池場柱馳除
天吐土託他通
虛香羲朽休興況許喜
姑乖各過兼楷公古佳格詭
母模慕莫摸謨
洛勒落賴盧來
如兒儒人耳而仍汝
火虎花馨荒海呼呵
豬追張竹丁卓徵陟珍迍知中褚猪
蒼麁取采麤倉遷醋青七千親
牛俄虞危宜玉遇魚擬疑研愚吾五語
乃奴內諾那㚷
康口謙枯恪苦空牽可客
女拏尼穠
披敷孚拂撫芳峯妃丕
前藏在徂才昨
舉規居九俱紀几吉
式矢施詩釋試傷失湯書賞舒商始
抽楮癡恥敕丑
滂普
充處赤尺叱昌
雌此


### 1.3 Pyanxchet Lower Characters

**1.3.1. Create a new set $S$ and put all lower characters into it:**

In [8]:
# Remove missing data: some small rhymes (小韻) have no pyanxchet

s = Set{String}(filter!(x -> typeof(x) == String, Array(df[:下字])))

Set(String["懈", "甾", "當", "甚", "法", "賄", "越", "俾", "運", "河"  …  "亞", "寸", "教", "戀", "畏", "位", "鄭", "醒", "贈", "圓"])

**1.3.2. Pair each pyanxchet lower character with the lower character of its own pyanxchet**

In [9]:
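function getLCList()
    dfG = Array(df[Symbol("廣韻字頭(覈校後)")])  # head characters
    dfS = Array(df[:下字])                        # lower character of each entry
    lst = setToList(s)
    n = length(lst)
    ret = []
    for i in 1:n
        ch = lst[i]
        ind = getIndexInArr(dfG, ch)
        # Skip lower characters that do not occur as head characters (index -1)
        if ind != -1
            # Skip entries whose own pyanxchet is missing
            if typeof(ch) == String && typeof(dfS[ind]) == String
                push!(ret, (ch, dfS[ind]))
            end
        end
    end
    ret
end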

getLCList (generic function with 1 method)

In [10]:
lclist = getLCList()

1185-element Array{Any,1}:
 ("婁", "朱")
 ("肌", "夷")
 ("焉", "言")
 ("肴", "茅")
 ("制", "例")
 ("懈", "隘")
 ("鍾", "容")
 ("預", "洳")
 ("孟", "更")
 ("綸", "迍")
 ("爲", "支")
 ("灼", "若")
 ("甾", "持")
 ⋮         
 ("杯", "回")
 ("佃", "年")
 ("贈", "亙")
 ("襃", "毛")
 ("拜", "怪")
 ("荏", "甚")
 ("允", "準")
 ("赧", "板")
 ("牒", "協")
 ("斗", "口")
 ("曹", "勞")
 ("圓", "權")
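The guard against characters that never appear as head characters can be sketched with `findfirst`, which returns `nothing` when there is no match. This is a sketch for Julia 1.x, not the notebook's `getIndexInArr`; `lower_pairs` is a hypothetical name. The two head characters and their lower characters come from the `lclist` output above, and 容 is treated as absent only because the toy table is so small.

```julia
# Toy stand-ins for the head-character and lower-character columns.
heads  = ["鍾", "斗"]
lowers = ["容", "口"]

# Pair each character with the lower character of its own entry,
# skipping characters that are not head characters in the table.
function lower_pairs(chars, heads, lowers)
    out = Tuple{String,String}[]
    for ch in chars
        i = findfirst(isequal(ch), heads)   # `nothing` when ch is not found
        i === nothing && continue
        push!(out, (ch, lowers[i]))
    end
    return out
end

result = lower_pairs(["鍾", "斗", "容"], heads, lowers)
println(result)
```

Characters that are skipped this way contribute no pair of their own, so they can only join a group through other characters that are spelled with them.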

**1.3.3. Run ghehlien** (each output line is one group of linked characters)

In [11]:
ghehlien(lclist)

婁于熱朱足別句滅輸列誅俞隅辥逾俱芻
肌夷尼糾資私脂飢黝
焉軒言
肴孝交嘲茅稍皃覺教
制訐罽蔽袂例憩祭弊
懈隘卦賣
鍾封用凶容頌庸恭
預灼甾姐遮若嗟居車魚奢藥諸邪其而爵洳雀賒與兹之勺略余野持
孟行當盲浪宕剛岡郎庚更
綸筠脣贇倫旬遵勻迍
爲隨倚毀吹危帋垂紙是規綺離移知支累詭彼靡此髓爾侈捶氏隋委豸
隱謹
焮靳
帶太大轄貝艾蓋
敢覽埯
合閤荅沓雜
彪烋幽虯
類醉遂萃
賜避益迹昔智寄積義亦恚易辟豉
佞徑定
哀來開
勞刀遭牢曹
冉廉漸淹炎染琰占斂鹽
政正成盛盈貞姓征并情鄭
貢弄鳳送
晏澗鴈按旰諫案旦贊
妙虐笑肖約
幸耿
蛙緺媧
鑒懺
文倦權員彥變囀攣眷云分戀卷圓
綏維遺隹追
記既溉志豙吏置
道抱晧老早浩
甚深淫枕針朕稔荏
法乏
贍豔
乎姑吾孤胡都烏吳
皛晈鳥了皎
賄猥罪
典峴殄繭
激弔嘯叫
妹輩昧佩
酉九有柳婦久
麵見電練甸
夥𠁥蟹買
幻幰偃蹇辨免堰
摘核革責厄戹
越拔伐八發黠
戶補賈魯杜古
俾婢企弭
或國
男陷𧸖含南韽
刮䫄
筆乙密
涬冷靈刑萌頂莖爭鼎宏迥丁挺剄耕打經醒
巷絳
界戒怪壞介拜
逼即側力極直
灰恢回杯
恕署
證蒸乘應庱矜冰升膺𩜁仍孕甑兢陵
運問
勒德得則
昆尊䰟渾奔
計詣戾
緣川泉全宣專
四質叱至寐畢必利自一二日悉栗冀器七吉
唾臥钁縛貨籰
朗黨
犯錽范
河何俄歌
令仙扇然連延
店念
六逐菊竹福匊
㢡掌网妄養放昉丈往兩
疋葅
恆滕崩增登棱朋
賀邏个箇佐
訪亮向況讓㨾
候奏漏豆遘
羽甫矩雨武禹
展演翦煙輦先前淺善
真振珍遴鄰印刃覲晉人賓
𩏩嚴
飽巧絞爪
割曷葛達
還鰥關班頑
月厥物勿弗
銜監鑑
牙加霞巴
敏殞
泛終梵中弓眾融戎仲宮
禾婆和過波戈
但寒乾安干
兼甜
檻𣊟暫唵瞰禫濫蹔黤感
里紀史理擬士己
墨北黑
張羊章莊陽良
輒葉攝接涉
丸潘貫喚官端筭
董摠孔動
華瓜花
話夬邁快
玉欲蜀曲錄
在紿改肯等亥愷乃宰
膎佳
哉才
戰膳
冬宗
任林心尋
郤逆戟劇
尾匪
后垢口厚苟斗
恩痕根
霸㕦嫁訝駕化亞
楷皆駭諧
遇注具戍
爇衛歲芮銳輟劣稅
公蠓空紅東凍
扃螢
䒦凡
緩伴管滿旱纂笴
瀌夭表嬌矯喬囂
滑屑忽結蔑骨
內報秏隊繢對
甲狎
閑閒山
謝夜炙
呪宿祐副溜富又救
板綰鯇赧
協愜頰牒
京驚卿
病命
冢奉踵宂勇隴
役隻石
奇羈宜
末撥
桂惠
广奄儉檢險
翼職
篆兗轉緬
迄訖乞
綜宋統
庾主
潁營傾頃䁝
昭遙招
湩𪁪
準尹允
擊狄歷
建阮怨願袁販煩晚元万遠
秋周尤由鳩求州流
外會最

### 1.4 Pyanxchet Lower Characters (Grouped By Rhymes)

In [12]:
using Query

In [13]:
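function solve_1_4()
    rhyme = Symbol("廣韻韻部順序&廣韻韻部原貌(調整前)")
    # (rhyme, lower character) pairs; rows without a pyanxchet are dropped
    a1 = collect(zip(Array(df[rhyme]), Array(df[:下字])))
    filter!(x -> typeof(x[1]) == String && typeof(x[2]) == String, a1)
    a11 = fastuniq(a1)
    # (rhyme, head character) => lower character of that head character's pyanxchet
    a2 = zip(collect(zip(Array(df[rhyme]), Array(df[Symbol("廣韻字頭(覈校後)")]))), df[:下字])
    d1 = Dict(a2)
    t1 = []
    for a in a11
        # Skip lower characters that are not head characters within the same rhyme
        if haskey(d1, a)
            res = d1[a]
            if typeof(res) != Missings.Missing
                push!(t1, (a[1], a[2], res))
            end
        end
    end
    # One array of (rhyme, lower character, its own lower character) per rhyme
    [ Array(i) for i in @groupby(t1, x -> x[1], x -> x) ]
end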

solve_1_4 (generic function with 1 method)

In [14]:
grp_1_4 = solve_1_4()

206-element Array{Array{Any,1},1}:
 Any[("上平01東", "紅", "公"), ("上平01東", "弓", "戎"), ("上平01東", "戎", "融"), ("上平01東", "中", "弓"), ("上平01東", "弓", "戎"), ("上平01東", "融", "戎"), ("上平01東", "戎", "融"), ("上平01東", "弓", "戎"), ("上平01東", "中", "弓"), ("上平01東", "宮", "戎")  …  ("上平01東", "戎", "融"), ("上平01東", "空", "紅"), ("上平01東", "終", "戎"), ("上平01東", "中", "弓"), ("上平01東", "紅", "公"), ("上平01東", "公", "紅"), ("上平01東", "紅", "公"), ("上平01東", "東", "紅"), ("上平01東", "公", "紅"), ("上平01東", "公", "紅")]
 Any[("上平02冬", "宗", "冬"), ("上平02冬", "冬", "宗"), ("上平02冬", "宗", "冬"), ("上平02冬", "冬", "宗"), ("上平02冬", "宗", "冬"), ("上平02冬", "冬", "宗")]
 Any[("上平03鍾", "容", "封"), ("上平03鍾", "鍾", "容"), ("上平03鍾", "容", "封"), ("上平03鍾", "封", "容"), ("上平03鍾", "容", "封"),
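The grouping step can be sketched in plain Julia without Query.jl. The helper below is a hypothetical `group_by_rhyme` for Julia 1.x, not the `@groupby` implementation; it keeps rhymes in order of first appearance, which is an assumption made for readability. The triples are copied from the `grp_1_4` output above.

```julia
# (rhyme, lower character, lower character of its own pyanxchet)
triples = [
    ("上平01東", "紅", "公"),
    ("上平01東", "弓", "戎"),
    ("上平02冬", "宗", "冬"),
    ("上平02冬", "冬", "宗"),
]

# Group the (character, speller) pairs by rhyme, keeping first-appearance order.
function group_by_rhyme(triples)
    order = String[]
    groups = Dict{String,Vector{Tuple{String,String}}}()
    for (rhyme, ch, speller) in triples
        if !haskey(groups, rhyme)
            push!(order, rhyme)
            groups[rhyme] = Tuple{String,String}[]
        end
        push!(groups[rhyme], (ch, speller))
    end
    return [(r, groups[r]) for r in order]
end

grouped = group_by_rhyme(triples)
for (rhyme, pairs) in grouped
    println(rhyme, " => ", pairs)
end
```

Each rhyme's list of pairs can then be linked on its own, so characters in different rhymes are never merged into one group.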

In [15]:
for i in grp_1_4
    println("")
    println(i[1][1])
    ghehlien([ (String(x[2]), String(x[3])) for x in i ])
end


上平01東
公空紅東
融戎中弓終宮

上平02冬
宗冬

上平03鍾
凶容庸封鍾恭

上平04江
雙江

上平05支
規隋吹知爲離支隨危移垂
奇羈宜

上平06脂
肌夷飢資私尼脂
隹綏遺追維
悲眉

上平07之
兹其甾而持之

上平08微
歸微非韋
希依衣

上平09魚
諸余魚居

上平10虞
芻俱誅隅朱于逾輸俞
無夫

上平11模
吳吾胡都姑孤乎烏

上平12齊
迷兮雞低稽臡奚
圭攜

上平13佳
佳膎
媧蛙緺

上平14皆
諧皆
乖淮懷

上平15灰
恢回杯灰

上平16咍
來哀開
哉才

上平17真
巾銀
人鄰珍賓真
筠贇倫

上平18諄
迍倫勻綸旬脣遵

上平19臻
詵臻

上平20文
云分文

上平21欣
斤欣

上平22元
袁元煩
言軒

上平23魂
渾尊䰟奔昆

上平24痕
恩根痕

上平25寒
寒安干

上平26桓
潘丸官端

上平27刪
顏姦
關班頑還

上平28山
閒山閑
鰥頑

下平01先
先前煙
田年堅顛賢
玄涓

下平02仙
權攣員圓
焉乾
川泉專宣全緣
然連延仙

下平03蕭
彫聊蕭堯幺

下平04宵
招昭遙
消宵霄邀焦
喬瀌嬌囂

下平05肴
嘲肴茅交

下平06豪
毛袍襃
遭勞刀牢曹

下平07歌
何俄河歌

下平08戈
波禾戈和婆
迦伽
𦚢靴𩨷

下平09麻
牙加霞巴
華瓜花
賒車奢遮
邪嗟

下平10陽
張方陽莊章羊良王

下平11唐
郎當岡剛
黃光旁

下平12庚
榮兵明
行橫盲庚
京卿驚

下平13耕
萌莖宏耕

下平14清
貞成并征盈情
傾營

下平15青
丁刑靈經
扃螢

下平16蒸
膺仍冰矜兢乘升蒸陵

下平17登
崩棱恆朋登滕增
肱弘

下平18尤
尤周流求州鳩秋由
浮謀

下平19侯
侯婁鉤

下平20幽
虯烋幽彪

下平21侵
任林尋心
吟簪今金
深針淫

下平22覃
男南含

下平23談
酣甘談三

下平24鹽
炎淹占鹽廉

下平25添
甜兼

下平26咸
咸讒

下平27銜
監銜

下平28嚴
𩏩嚴

下平29凡
䒦凡

上01董
孔動摠蠓董

上02腫
冢奉宂勇隴踵
悚拱
𪁪湩

上03講
項慃講

上04紙
俾弭婢
爾是氏豸帋此侈紙
詭委綺彼毀捶髓靡累倚

上05旨
壘癸誄鄙美水軌洧
几履雉姊
視矢

上06止
理里紀己擬史士
市止

上07尾
豨豈
匪尾
鬼偉

上08語
渚舉巨呂与與許

上09麌
甫