### Final Logistic Regression Model

#### $$\Theta=[\theta_0, \ldots, \theta_6]$$
#### $$\hat p=\sigma(X^T\Theta)=\frac{1}{1+\exp(-X^T\Theta)}$$
#### 1. intercept θ0: -0.40939392
#### 2. ∆MDS-UPDRS 3 θ1: 0.69060146
#### 3. ∆Axial 1 θ2: 0.19562998
#### 4. ∆Tremor θ3: 0.34653553
#### 5. ∆Limb Rigidity θ4: 0.085848
#### 6. ∆Common daily activities θ5: 0.16140227 
#### 7. ∆Bulbar θ6: 0.58873592

In [61]:
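import numpy as np
from sklearn.linear_model import LogisticRegression

# Build an unfitted model and set the final parameters by hand.
# penalty=None requires scikit-learn >= 1.2; older versions use penalty='none'.
model = LogisticRegression(penalty=None)
model.coef_ = np.array([[0.69060146, 0.19562998, 0.34653553, 0.085848, 0.16140227, 0.58873592]])  # θ1..θ6
model.intercept_ = np.array([-0.40939392])  # θ0
model.classes_ = np.array([0, 1])  # 0 = non-early progressor, 1 = early progressor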
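
As a sanity check on the formula, the predicted probability can be reproduced with plain NumPy. This is a minimal sketch that hard-codes the coefficients listed above; the rows in `X_demo` are made-up values for illustration only (not patient data), the name `predict_proba_manual` is hypothetical, and the features are assumed to be on the same scale that was used when the model was trained.

```python
import numpy as np

# Coefficients as listed above: intercept θ0 and weights θ1..θ6.
THETA_0 = -0.40939392
THETA = np.array([0.69060146, 0.19562998, 0.34653553,
                  0.085848, 0.16140227, 0.58873592])

def predict_proba_manual(X):
    """Return sigma(theta_0 + X @ theta) for each row of X."""
    z = THETA_0 + np.asarray(X, dtype=float) @ THETA
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical feature rows (illustrative values only, not patient data).
X_demo = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # all deltas zero: only the intercept acts
    [1.0, 0.5, -0.2, 0.0, 0.3, 1.0],
])
p_demo = predict_proba_manual(X_demo)
print(p_demo)

# Odds ratio per unit increase of each feature: exp(theta_j).
odds_ratios = np.exp(THETA)
print(odds_ratios)
```

If the hand-configured `model` behaves as intended, `model.predict_proba(X_demo)[:, 1]` should agree with `p_demo` up to floating-point error.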

### Baseline Variables: These are the features from the baseline data. The last column, `Class`, is the label (1 = early progressor, 0 = non-early progressor) and is based on the change in the MDS-UPDRS score between baseline and the fourth year. The column `Diff_BL_1yr` is the difference between baseline and 1 year in the summed MDS-UPDRS Part 3 score.

In [100]:
import pandas as pd
BL=pd.read_csv('./BL_Features.csv').drop(columns='Unnamed: 0')
BL.info()
BL.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 219 entries, 0 to 218
Data columns (total 39 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   PATNO                     219 non-null    int64  
 1   BMI                       219 non-null    float64
 2   SYSSUP                    219 non-null    float64
 3   HRSUP                     219 non-null    float64
 4   age                       219 non-null    float64
 5   gen                       219 non-null    int64  
 6   EDUCYRS                   219 non-null    int64  
 7   ageonset                  219 non-null    float64
 8   agediag                   219 non-null    float64
 9   DOMSIDE                   219 non-null    float64
 10  Cognitive                 219 non-null    float64
 11  Sleep                     219 non-null    float64
 12  Autonomic_Nervous_System  219 non-null    float64
 13  Bulbar                    219 non-null    float64
 14  Common_dai

|  | PATNO | BMI | SYSSUP | HRSUP | age | gen | EDUCYRS | ageonset | agediag | DOMSIDE | ... | DVS_LNS | MCATOT | QUIP_Sum | REM_Sum | SCOPA_TOT | STAI_Sum | DVT_SDM | UPSIT_Sum | Diff_BL_1yr | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3001 | 22.156529 | 146.0 | 63.0 | 65.1425 | 1 | 16 | 63.5918 | 64.2603 | 2.0 | ... | 17.0 | 29.0 | 1.0 | 4.0 | 12.0 | 51.0 | 48.330002 | 25.0 | 8.0 | 1 |
| 1 | 3002 | 28.280724 | 136.0 | 64.0 | 67.5781 | 2 | 16 | 65.5205 | 66.5041 | 1.0 | ... | 13.0 | 29.0 | 1.0 | 8.0 | 22.0 | 69.0 | 47.5 | 17.0 | 10.0 | 0 |
| 2 | 3003 | 27.717685 | 110.0 | 60.0 | 56.7178 | 2 | 16 | 51.8274 | 54.6685 | 2.0 | ... | 13.0 | 25.0 | 0.0 | 3.0 | 16.0 | 51.0 | 37.5 | 23.0 | 15.0 | 1 |
| 3 | 3010 | 28.650756 | 140.0 | 52.0 | 46.9657 | 1 | 16 | 45.9657 | 46.6068 | 1.0 | ... | 14.0 | 26.0 | 0.0 | 10.0 | 9.0 | 72.0 | 52.0 | 9.0 | 12.0 | 1 |
| 4 | 3012 | 24.598765 | 100.0 | 77.0 | 58.8192 | 1 | 16 | 55.5462 | 58.5342 | 2.0 | ... | 11.0 | 27.0 | 0.0 | 9.0 | 14.0 | 71.0 | 46.0 | 15.0 | 13.0 | 0 |
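
The `Unnamed: 0` column dropped in the loading cell is what pandas typically produces when a CSV was written together with its index and then read back without declaring it. A minimal sketch, assuming the files were saved with the `DataFrame.to_csv` defaults: passing `index_col=0` to `read_csv` avoids the extra column. The in-memory CSV below is a made-up stand-in, not the study file.

```python
import io
import pandas as pd

# Made-up two-row stand-in for a CSV written by DataFrame.to_csv() with its index.
csv_text = ",PATNO,BMI,Class\n0,3001,22.1,1\n1,3002,28.3,0\n"

# Approach used in the notebook: read everything, then drop the index column.
df_drop = pd.read_csv(io.StringIO(csv_text)).drop(columns='Unnamed: 0')

# Alternative: tell pandas that the first column is the index.
df_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Both approaches leave the same data columns.
print(list(df_drop.columns))
print(list(df_index.columns))
```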


### 1-Year Variables: These are the features from the 1-year data.

In [101]:
yr_1=pd.read_csv('./1_yr_Features.csv').drop(columns='Unnamed: 0')
yr_1.info()
yr_1.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 328 entries, 0 to 327
Data columns (total 27 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   PATNO                     328 non-null    int64  
 1   BMI                       328 non-null    float64
 2   SYSSUP                    328 non-null    float64
 3   HRSUP                     328 non-null    float64
 4   Cognitive                 328 non-null    float64
 5   Sleep                     328 non-null    float64
 6   Autonomic_Nervous_System  328 non-null    float64
 7   Bulbar                    328 non-null    float64
 8   Common_daily_act          328 non-null    float64
 9   Bed                       328 non-null    float64
 10  Gait                      328 non-null    float64
 11  Axial_Sub_1               328 non-null    float64
 12  Limb_Rig_Sub              328 non-null    float64
 13  Tremor_Sub                328 non-null    float64
 14  Append_Sub

|  | PATNO | BMI | SYSSUP | HRSUP | Cognitive | Sleep | Autonomic_Nervous_System | Bulbar | Common_daily_act | Bed | ... | Epworth_SUM | GDS_SUM | DVT_TOTAL_RECALL | DVS_LNS | MCATOT | QUIP_Sum | REM_Sum | SCOPA_TOT | STAI_Sum | DVT_SDM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3001 | 22.066947 | 139.0 | 66.0 | 4.0 | 2.0 | 6.0 | 0.0 | 1.0 | 0.0 | ... | 3.0 | 2.0 | 49.0 | 17.0 | 30.0 | 0.0 | 5.0 | 20.0 | 65.0 | 43.330002 |
| 1 | 3002 | 28.160551 | 120.0 | 70.0 | 3.0 | 2.0 | 3.0 | 7.0 | 5.0 | 1.0 | ... | 14.0 | 5.0 | 52.0 | 12.0 | 29.0 | 0.0 | 6.0 | 23.0 | 74.0 | 45.0 |
| 2 | 3003 | 26.97404 | 110.0 | 63.0 | 0.0 | 4.0 | 5.0 | 1.0 | 2.0 | 0.0 | ... | 6.0 | 1.0 | 49.0 | 11.0 | 28.0 | 0.0 | 4.0 | 17.0 | 47.0 | 33.75 |
| 3 | 3006 | 22.195131 | 145.0 | 62.0 | 2.0 | 0.0 | 0.0 | 5.0 | 4.0 | 1.0 | ... | 6.0 | 0.0 | 32.0 | 14.0 | 27.0 | 0.0 | 4.0 | 9.0 | 41.0 | 45.0 |
| 4 | 3010 | 30.09496 | 141.0 | 50.0 | 4.0 | 3.0 | 3.0 | 5.0 | 7.0 | 0.0 | ... | 6.0 | 5.0 | 39.0 | 11.0 | 26.0 | 0.0 | 12.0 | 10.0 | 75.0 | 51.0 |


### To combine the two files, merge on PATNO and drop rows with missing values
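
A small sketch of this merge on made-up data may help show what happens to the columns that both files share. Because both tables contain columns such as `BMI` and `Bulbar`, pandas appends `_x` (baseline) and `_y` (1-year) to them by default; the `suffixes` and `validate` arguments below are optional alternatives, not what the notebook uses. The delta computed at the end assumes that ∆ means the 1-year value minus the baseline value, which the text does not state explicitly, and the names `bl_demo`, `yr1_demo`, and `d_Bulbar` are hypothetical.

```python
import pandas as pd

# Made-up stand-ins for the baseline and 1-year tables (not study data).
bl_demo = pd.DataFrame({'PATNO': [1, 2, 3],
                        'Bulbar': [1.0, 0.0, 2.0],
                        'Class': [1, 0, 1]})
yr1_demo = pd.DataFrame({'PATNO': [1, 3, 4],
                         'Bulbar': [2.0, 2.5, 1.0]})

# Left merge keeps every baseline patient; patients absent from the
# 1-year table (PATNO 2 here) get NaN in the 1-year columns.
merged = bl_demo.merge(yr1_demo, on='PATNO', how='left',
                       suffixes=('_BL', '_1yr'), validate='one_to_one')

# dropna() returns a new DataFrame, so the result must be assigned to be kept.
complete = merged.dropna().copy()

# Assumed sign convention: delta = 1-year value minus baseline value.
complete['d_Bulbar'] = complete['Bulbar_1yr'] - complete['Bulbar_BL']
print(complete)
```

Descriptive suffixes make it harder to subtract the wrong pair of columns later, and `validate='one_to_one'` raises an error if a `PATNO` appears more than once in either table.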

In [104]:
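# Left merge keeps every baseline patient; columns present in both files
# get the suffixes _x (baseline) and _y (1-year).
x = BL.merge(yr_1, on='PATNO', how='left')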

In [106]:
# dropna() returns a new DataFrame without the rows that contain NaN; x itself is unchanged.
x.dropna()

|  | PATNO | BMI_x | SYSSUP_x | HRSUP_x | age | gen | EDUCYRS | ageonset | agediag | DOMSIDE | ... | Epworth_SUM_y | GDS_SUM_y | DVT_TOTAL_RECALL_y | DVS_LNS_y | MCATOT_y | QUIP_Sum_y | REM_Sum_y | SCOPA_TOT_y | STAI_Sum_y | DVT_SDM_y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3001 | 22.156529 | 146.0 | 63.0 | 65.1425 | 1 | 16 | 63.5918 | 64.2603 | 2.0 | ... | 3.0 | 2.0 | 49.0 | 17.0 | 30.0 | 0.0 | 5.0 | 20.0 | 65.0 | 43.330002 |
| 1 | 3002 | 28.280724 | 136.0 | 64.0 | 67.5781 | 2 | 16 | 65.5205 | 66.5041 | 1.0 | ... | 14.0 | 5.0 | 52.0 | 12.0 | 29.0 | 0.0 | 6.0 | 23.0 | 74.0 | 45.000000 |
| 2 | 3003 | 27.717685 | 110.0 | 60.0 | 56.7178 | 2 | 16 | 51.8274 | 54.6685 | 2.0 | ... | 6.0 | 1.0 | 49.0 | 11.0 | 28.0 | 0.0 | 4.0 | 17.0 | 47.0 | 33.750000 |
| 3 | 3010 | 28.650756 | 140.0 | 52.0 | 46.9657 | 1 | 16 | 45.9657 | 46.6068 | 1.0 | ... | 6.0 | 5.0 | 39.0 | 11.0 | 26.0 | 0.0 | 12.0 | 10.0 | 75.0 | 51.000000 |
| 4 | 3012 | 24.598765 | 100.0 | 77.0 | 58.8192 | 1 | 16 | 55.5462 | 58.5342 | 2.0 | ... | 8.0 | 4.0 | 34.0 | 13.0 | 30.0 | 0.0 | 11.0 | 25.0 | 75.0 | 43.750000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 213 | 4112 | 27.870370 | 129.0 | 57.0 | 53.4603 | 1 | 18 | 53.1456 | 53.1838 | 2.0 | ... | 6.0 | 4.0 | 56.0 | 17.0 | 27.0 | 0.0 | 3.0 | 9.0 | 79.0 | 52.000000 |
| 214 | 4113 | 22.892820 | 120.0 | 70.0 | 33.7178 | 2 | 14 | 30.9918 | 31.8384 | 1.0 | ... | 1.0 | 1.0 | 27.0 | 7.0 | 23.0 | 0.0 | 1.0 | 2.0 | 80.0 | 39.000000 |
| 215 | 4115 | 26.989619 | 140.0 | 75.0 | 66.5753 | 1 | 16 | 64.5589 | 66.4000 | 2.0 | ... | 2.0 | 2.0 | 53.0 | 12.0 | 26.0 | 0.0 | 2.0 | 9.0 | 50.0 | 51.000000 |
| 216 | 4117 | 30.986310 | 133.0 | 70.0 | 59.9315 | 2 | 13 | 58.5699 | 59.7014 | 2.0 | ... | 3.0 | 3.0 | 34.0 | 8.0 | 26.0 | 1.0 | 4.0 | 6.0 | 75.0 | 52.500000 |
