PPMI: Parkinson's Progression Markers Initiative
This repository contains code for processing data from the PPMI dataset. No data files are stored in this repository.
- Apply for data access at PPMI - Download Data.
- After your application is approved and you can log in:
  - Click on Download -> Study-Data.
  - Click the link on the left-hand side (bottom) where it says ALL.
  - Download ALL tabular data (csv format) and ALL documents and zip files [48.0 MB].
- Clone this GitHub repository.
- Save the downloaded files in the data_docs directory of this cloned repository. Extract any zip files if necessary.
- Run the create_ppmi_database.py script, which will create the PPMI database in the database directory:
cd scripts/python
python create_ppmi_database.py
- Install DB Browser for SQLite and use it to open the database that was created for you in the database directory.
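For readers curious about what the database-creation step involves, the following is a minimal sketch of loading a directory of CSV files into SQLite, one table per file. It is not the repository's create_ppmi_database.py; the helper names `quote_identifier` and `load_csvs_into_sqlite` are hypothetical, and it assumes each CSV has a header row with unique column names.

```python
import csv
import sqlite3
from pathlib import Path


def quote_identifier(name: str) -> str:
    """Quote a table or column name for safe use in SQLite statements."""
    return '"' + name.replace('"', '""') + '"'


def load_csvs_into_sqlite(csv_dir: str, db_path: str) -> list:
    """Load every *.csv file in csv_dir into its own table in db_path.

    Hypothetical helper (an assumption, not the repository's script).
    Returns the names of the tables that were created.
    """
    tables = []
    conn = sqlite3.connect(db_path)
    try:
        for csv_path in sorted(Path(csv_dir).glob("*.csv")):
            with open(csv_path, newline="", encoding="utf-8-sig") as handle:
                reader = csv.reader(handle)
                header = next(reader, None)
                if not header:
                    continue  # skip empty files
                width = len(header)
                table = quote_identifier(csv_path.stem)
                # NUMERIC affinity stores well-formed numbers as numbers, other text as text
                columns = ", ".join(f"{quote_identifier(c)} NUMERIC" for c in header)
                placeholders = ", ".join("?" for _ in header)
                conn.execute(f"DROP TABLE IF EXISTS {table}")
                conn.execute(f"CREATE TABLE {table} ({columns})")
                # Pad or trim each row to the header width; empty strings become NULL
                rows = (
                    [value if value != "" else None for value in (row + [""] * width)[:width]]
                    for row in reader
                )
                conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
                tables.append(csv_path.stem)
        conn.commit()
    finally:
        conn.close()
    return tables
```

It could be called as `load_csvs_into_sqlite("data_docs", "database/ppmi.db")`, where the database file name is a placeholder. Storing empty cells as NULL keeps missing answers distinguishable from zero scores.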
- DB Browser will give you information on all the tables in the database. You can browse the data by table. After you have extracted information into the data_docs directory (see Setup the database), you can find out what the columns represent by looking at the files in the docs directory:
  - Page_Descriptions.csv: a general idea of what the tables are named
  - Data_Dictionary.csv: a general idea of what the columns in the tables are named
  - Code_List.csv: a code book that tells us how certain variables are coded
  - Derived Variables Definition: a document that tells us how to create derived variables like UPDRS scores etc.
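The table and column overview that DB Browser shows can also be obtained from a script. The sketch below uses Python's built-in sqlite3 module; the function name `describe_database` is hypothetical, and the database path is whatever file the setup step produced.

```python
import sqlite3


def describe_database(db_path: str) -> dict:
    """Return {table_name: [column_name, ...]} for every table in the database."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists every object in the database; keep only tables
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            quoted = '"' + table.replace('"', '""') + '"'
            # PRAGMA table_info returns one row per column; index 1 is the column name
            schema[table] = [col[1] for col in conn.execute(f"PRAGMA table_info({quoted})")]
        return schema
    finally:
        conn.close()
```

The returned column names can then be looked up in Data_Dictionary.csv and Code_List.csv to see what each one means.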
- Use SQL scripts to extract information from the database.
WITH A AS ( /* Calculated Part III UPDRS score for everyone */
SELECT PATNO, EVENT_ID,
/* In SQLite the sum is NULL if any one of the items is NULL */
NP3SPCH + NP3FACXP + NP3RIGN + NP3RIGRU + NP3RIGLU + NP3RIGRL +
NP3RIGLL + NP3FTAPR + NP3FTAPL + NP3HMOVR + NP3HMOVL + NP3PRSPR +
NP3PRSPL + NP3TTAPR + NP3TTAPL + NP3LGAGR + NP3LGAGL + NP3RISNG +
NP3GAIT + NP3FRZGT + NP3PSTBL + NP3POSTR + NP3BRADY + NP3PTRMR +
NP3PTRML + NP3KTRMR + NP3KTRML + NP3RTARU + NP3RTALU + NP3RTARL +
NP3RTALL + NP3RTALJ + NP3RTCON AS UPDRS_SCORE
FROM NUPDRS3
WHERE PAG_NAME = 'NUPDRS3'
ORDER BY PATNO)
/* Extract UPDRS scores at Baseline and sort in descending order */
SELECT * FROM A WHERE EVENT_ID = 'BL' ORDER BY UPDRS_SCORE DESC
The output in DB Browser for SQLite looks something like this for the first few rows. It can be exported to a CSV file for further analysis.
PATNO | EVENT_ID | UPDRS_SCORE |
---|---|---|
42142 | BL | 72 |
42435 | BL | 72 |
40763 | BL | 71 |
42439 | BL | 67 |
40751 | BL | 66 |
42312 | BL | 62 |
50496 | BL | 61 |
51652 | BL | 61 |
41489 | BL | 60 |
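The CSV export can also be done without the GUI. The following sketch runs any SQL query against the database and writes the result, including a header row, to a CSV file. The function name `export_query_to_csv` is hypothetical and the paths passed to it are up to you.

```python
import csv
import sqlite3


def export_query_to_csv(db_path: str, query: str, out_path: str) -> int:
    """Run query against db_path, write the result to out_path, return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(query)
        # cursor.description holds one tuple per result column; item 0 is its name
        header = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        conn.close()
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

Passing the text of one of the SQL scripts as `query` produces a file that can be loaded directly into a spreadsheet or a statistics package.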
You can find more complex SQL scripts in the scripts directory.