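# Hypothesis

Whole-eye aberrations are what ultimately act on the retina. Can aberration data measured before OK (orthokeratology) lens wear, or after a short period of wear, predict the long-term axial length or refractive state?

## Known limitations

Whole-eye aberrations are affected by several factors:

* pupil size;
* accommodative state;
* time of measurement: after OK lens wear, the corneal shape may change gradually over the day, which would make the aberrations drift with time.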



# Data

The data come from [Predictive factors associated with axial length growth and myopia progression in orthokeratology](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6561598/).

The paper is accompanied by an Excel data file. Its sheets are:

* age, sex, visual acuity: age, sex, and visual acuity. Visual acuity is given in LogMAR and includes both uncorrected and best-corrected values.
* AXL: axial length, measured with the IOLMaster in central gaze, 30° nasal, and 30° temporal.
  >AXL measurement with IOLMaster (Carl Zeiss, Jena, Germany) in central, N30, and T30 gazes
* CR: cycloplegic refraction, measured with the WAM-5500 in central gaze, 30° nasal, and 30° temporal.
  >cycloplegic refraction; autorefraction (WAM-5500; Shigiya Machinery Works Ltd., Hiroshima, Japan) in central, 30° nasal (N30), and 30° temporal (T30) gazes under cycloplegia
* MR: manifest refraction.
  >manifested refraction
* specular microscopy: corneal endothelial cell count (it is not clear to me why this was measured).
  >evaluation of the corneal endothelium via noncontact specular microscopy (SP-8000; Konan Medical, Nishinomiya, Japan).
* aberrometer: wavefront aberrations; the higher-order Zernike coefficients are provided.
  >wavefront assessment for a 6-mm pupil using a WASCA aberrometer (Carl Zeiss, Jena, Germany) following pupil dilation using a mixture of 0.5% phenylephrine and 0.5% tropicamide (Mydrin-P; Santen Pharmaceutical, Osaka, Japan)
* pentacam: corneal topography.
  Unfortunately this is not raw data; it contains only K1 and K2 at Pre and 12mo.
* orbscan II: corneal topography.
  This is not raw topography data either, but besides Kmin and Kmax it also includes Central corneal thickness, 3-mm-zone irregularity, and 5-mm-zone irregularity.

In [48]:
import os
from functools import reduce

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

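## Data cleaning

After loading the data, split it into two parts (taking care to check for data leakage):

* X: the data from which the outcome might be inferred. I expect this to include the baseline (pre-treatment) data plus some of the post-treatment data.
    * patient_info: `['Age', 'Sex (male = 1, female = 2']` (the Sex column name really does lack its closing parenthesis)
    * AXL: `['Pre C AXL', 'Pre N AXL', 'Pre T AXL']`
    * CR:
      ```python
      ['Pre AR C Sph', 'Pre AR C cyl',
       'Pre AR N Sph', 'Pre AR N cyl',
       'Pre AR T Sph', 'Pre AR T cyl']
      ```
    * aberrometer:
        * Pre and 12mo (undecided; I am not sure whether using the 12mo values amounts to data leakage)
    * cornea:
        * Pre and 12mo
* Y:
    * AXL: C, N, and T at 12mo, plus the deltas; delta 12mo C AXL is the most important outcome.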
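The X/Y split above can be sketched as a small helper with a built-in leakage check. This is only a sketch under assumptions: the target column name `delta 12mo C AXL` is taken from the text and should be checked against the AXL sheet, and the helper `split_features_target` is hypothetical. It conservatively rejects every feature whose name contains `12mo`, which is stricter than the list above (where the 12mo aberrometer and cornea values are still under consideration).

```python
import pandas as pd

# Baseline predictors named in the list above.
BASELINE_COLUMNS = [
    "Age", "Sex (male = 1, female = 2",
    "Pre C AXL", "Pre N AXL", "Pre T AXL",
    "Pre AR C Sph", "Pre AR C cyl",
    "Pre AR N Sph", "Pre AR N cyl",
    "Pre AR T Sph", "Pre AR T cyl",
]
# Assumed target column name; verify it against the AXL sheet.
TARGET_COLUMN = "delta 12mo C AXL"


def split_features_target(df, feature_columns=BASELINE_COLUMNS,
                          target_column=TARGET_COLUMN):
    """Return (X, y), refusing any feature that looks like a 12-month value."""
    leaked = [c for c in feature_columns if "12mo" in c]
    if leaked:
        raise ValueError(f"possible data leakage, 12mo columns in X: {leaked}")
    # Keep only the eyes for which the outcome is known.
    rows = df.dropna(subset=[target_column])
    return rows[feature_columns], rows[target_column]
```

Calling `split_features_target(df)` on the merged table would return the baseline features and the 12-month axial elongation for the eyes that have an outcome. Relaxing the check for selected 12mo columns is a modelling decision, not something the helper can settle.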

In [89]:
data_file = os.path.join('data', "pone.0218140.s001.xlsx")

patient_info = pd.read_excel(data_file, sheet_name="age, sex, visual acuity")
AXL = pd.read_excel(data_file, sheet_name="AXL")
CR = pd.read_excel(data_file, sheet_name="CR ")

# The next two sheets have an extra top row (Pre / 12mo).
# Skip it with header=1 so that each row lines up with the rows in the other sheets.
aberrometer = pd.read_excel(data_file, sheet_name="aberrometer", header=1)
cornea = pd.read_excel(data_file, sheet_name="orbscan II", header=1)
data_frames = [patient_info, AXL, CR, aberrometer, cornea]
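Skipping the Pre / 12mo row with `header=1` has a side effect: the two visits then share the same column names, and pandas de-duplicates them by appending `.1` to the second occurrence (hence names such as `Kmax.1`). The sketch below reproduces this on a toy CSV, since `read_csv` treats headers the same way, and restores explicit visit labels. The helper `label_visits` and the toy values are hypothetical, and it assumes that the un-suffixed columns are Pre and the `.1` columns are 12mo.

```python
import io

import pandas as pd

# Toy stand-in for a sheet whose first row groups the columns into Pre / 12mo.
TOY_SHEET = """Pre,,12mo,
Kmax,Kmin,Kmax,Kmin
44.0,43.0,43.5,42.5
42.0,40.0,41.5,39.5
"""


def label_visits(df, key_columns=()):
    """Prefix plain columns with 'Pre ' and '.1' duplicates with '12mo '."""
    renamed = {}
    for col in df.columns:
        if col in key_columns:
            continue  # leave identifier columns untouched
        if col.endswith(".1"):
            renamed[col] = "12mo " + col[:-2]
        else:
            renamed[col] = "Pre " + col
    return df.rename(columns=renamed)


toy = pd.read_csv(io.StringIO(TOY_SHEET), header=1)
print(list(toy.columns))
print(list(label_visits(toy).columns))
```

For the real sheets, the identifier columns (for example `Patient` and `OD1, OS2`) would be passed as `key_columns` so that they keep their names.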

Not every patient had every parameter measured, so the Patient ID and the eye (OD/OS) are combined into a new key, eyeID.

In [91]:
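for d in data_frames:
    # Carry the patient ID down to the rows where it is blank.
    # (fillna(method=..., inplace=True) on a column is deprecated in recent pandas.)
    d["Patient"] = d["Patient"].ffill()
    d["eyeID"] = d["Patient"] + " " + d['OD1, OS2'].map(str)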
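The same step can be written as a function that leaves its input untouched and checks that the resulting key is unique, which matters because the key is used for joining the sheets. This is a minimal sketch on toy data; `add_eye_id` is a hypothetical helper, not part of the original notebook.

```python
import pandas as pd


def add_eye_id(frame, patient_col="Patient", eye_col="OD1, OS2"):
    """Return a copy with the patient ID forward-filled and an eyeID column."""
    out = frame.copy()
    out[patient_col] = out[patient_col].ffill()
    out["eyeID"] = out[patient_col] + " " + out[eye_col].map(str)
    if out["eyeID"].duplicated().any():
        raise ValueError("eyeID is not unique; check the patient and eye columns")
    return out


# Toy frame: the patient ID appears only on the first row of each patient.
toy = pd.DataFrame({
    "Patient": ["#1", None, "#2", None],
    "OD1, OS2": [1.0, 2.0, 1.0, 2.0],
})
print(add_eye_id(toy)["eyeID"].tolist())
```

Because the eye code is stored as a float, the key reads `#1 1.0` rather than `#1 1`; that is harmless as long as every sheet is processed the same way.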


In [95]:
df = reduce(lambda left,right: pd.merge(left,right,on='eyeID'), data_frames)
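Two things are worth checking around this merge. First, an inner join silently drops every eye that is missing from any one sheet. Second, all sheets carry their own `Patient` and `OD1, OS2` columns, so the repeated `_x` / `_y` suffixing can collide once more than three frames are merged; depending on the pandas version this yields duplicate column names or an error. The sketch below, on toy data, keeps the shared ID columns only once and reports which eyes are lost. `merge_on_eye` and `eyes_missing_somewhere` are hypothetical helpers, and the resulting column names differ from the `_x` / `_y` names produced by the cell above.

```python
from functools import reduce

import pandas as pd


def merge_on_eye(frames, key="eyeID", shared=("Patient", "OD1, OS2")):
    """Inner-join frames on `key`, keeping the shared ID columns only once."""
    first, *rest = frames
    trimmed = [first] + [
        f.drop(columns=[c for c in shared if c in f.columns]) for f in rest
    ]
    return reduce(
        lambda left, right: pd.merge(
            left, right, on=key, how="inner", validate="one_to_one"
        ),
        trimmed,
    )


def eyes_missing_somewhere(frames, key="eyeID"):
    """Return the eyeIDs that are absent from at least one frame."""
    all_ids = set().union(*(set(f[key]) for f in frames))
    common = set.intersection(*(set(f[key]) for f in frames))
    return sorted(all_ids - common)


# Toy frames: eye "#2 1" has no axial-length record.
info = pd.DataFrame({
    "Patient": ["#1", "#1", "#2"],
    "OD1, OS2": [1, 2, 1],
    "eyeID": ["#1 1", "#1 2", "#2 1"],
    "Age": [9, 9, 10],
})
axl = pd.DataFrame({
    "Patient": ["#1", "#1"],
    "OD1, OS2": [1, 2],
    "eyeID": ["#1 1", "#1 2"],
    "Pre C AXL": [23.5, 23.4],
})
merged = merge_on_eye([info, axl])
print(list(merged.columns))
print(eyes_missing_somewhere([info, axl]))
```

`validate="one_to_one"` makes the merge fail loudly if an eyeID is repeated within a sheet, instead of quietly multiplying rows.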

In [96]:
df

| | Patient_x | Sex (male = 1, female = 2 | OD1, OS2_x | Age | log UCVA | log BCVA | eyeID | Patient_y | OD1, OS2_y | Pre C AXL | ... | anterior chamber depth | Sim K's astigmatism.1 | Kmax.1 | Kmin.1 | Central corneal thickness.1 | 3-mm-zone irregularity.1 | 5-mm-zone irregularity.1 | pupil diameter.1 | white-to-white.1 | anterior chamber depth.1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | #1 | 2.0 | 1.0 | 9.0 | 0.69897 | 0.045757 | #1 1.0 | #1 | 1.0 | 23.57 | ... | 3.08 | 0.7 | 44.3 | 43.6 | 494.0 | 1.9 | 2.1 | 5.0 | 11.2 | 2.98 |
| 1 | #1 | NaN | 2.0 | NaN | 0.30103 | 0.0 | #1 2.0 | #1 | 2.0 | 23.46 | ... | 3.1 | 1.3 | 45.4 | 44.1 | 492.0 | 1.8 | 2.0 | 5.0 | 11.2 | 2.96 |
| 2 | #2 | 2.0 | 1.0 | 9.0 | 0.69897 | 0.0 | #2 1.0 | #2 | 1.0 | 24.2 | ... | 3.09 | 2.2 | 42.3 | 40.1 | 549.0 | 2.9 | 3.5 | 5.2 | 11.6 | 3.07 |
| 3 | #2 | NaN | 2.0 | NaN | 0.69897 | 0.045757 | #2 2.0 | #2 | 2.0 | 24.09 | ... | 3.04 | 1.5 | 42.5 | 40.9 | 553.0 | 3.6 | 3.4 | 4.4 | 11.5 | 3.02 |
| 4 | #3 | 2.0 | 1.0 | 9.0 | 0.522879 | 0.0 | #3 1.0 | #3 | 1.0 | 24.23 | ... | 3.14 | 0.7 | 41.5 | 40.8 | 568.0 | 2.6 | 3.8 | 4.4 | 11.4 | 3.11 |
| 5 | #3 | NaN | 2.0 | NaN | 0.522879 | 0.0 | #3 2.0 | #3 | 2.0 | 24.11 | ... | 3.15 | 0.4 | 42.0 | 41.6 | 572.0 | 1.4 | 1.8 | 4.6 | 11.5 | 3.1 |
| 6 | #4 | 2.0 | 1.0 | 9.0 | 0.522879 | 0.0 | #4 1.0 | #4 | 1.0 | 23.07 | ... | 3.13 | 0.9 | 44.2 | 43.4 | 511.0 | 2.0 | 2.3 | 4.9 | 11.1 | 3.16 |
| 7 | #4 | NaN | 2.0 | NaN | 0.522879 | 0.0 | #4 2.0 | #4 | 2.0 | 22.96 | ... | 2.99 | 0.8 | 44.0 | 43.2 | 519.0 | 2.4 | 2.9 | 5.0 | 11.2 | 3.11 |
| 8 | #5 | 2.0 | 1.0 | 9.0 | 0.522879 | 0.0 | #5 1.0 | #5 | 1.0 | 24.9 | ... | 3.27 | 0.6 | 41.0 | 40.4 | 534.0 | 3.3 | 3.2 | 4.9 | 11.9 | 3.26 |
| 9 | #5 | NaN | 2.0 | NaN | 0.522879 | 0.0 | #5 2.0 | #5 | 2.0 | 24.86 | ... | 3.08 | 1.5 | 41.7 | 40.2 | 529.0 | 6.2 | 8.1 | 5.2 | 12.2 | 3.25 |
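In the merged output, Age and Sex are blank on every second-eye row, which suggests that these patient-level values are recorded only on the first row of each patient, just like the patient ID. If that reading is right, they can be filled within each patient before modelling. The sketch below is an assumption-based illustration on toy data; `fill_patient_level` is a hypothetical helper.

```python
import pandas as pd

# Columns that describe the patient rather than the individual eye.
PATIENT_LEVEL = ["Age", "Sex (male = 1, female = 2"]


def fill_patient_level(df, patient_col="Patient_x", columns=PATIENT_LEVEL):
    """Return a copy with patient-level columns forward-filled per patient."""
    out = df.copy()
    # Filling within each patient group avoids carrying one patient's
    # values over to the next patient if a first row happens to be blank.
    out[columns] = out.groupby(patient_col)[columns].ffill()
    return out


toy = pd.DataFrame({
    "Patient_x": ["#1", "#1", "#2", "#2"],
    "Age": [9.0, None, 10.0, None],
    "Sex (male = 1, female = 2": [2.0, None, 1.0, None],
})
print(fill_patient_level(toy))
```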
