EBEncoding - efficient bitwise encoding for temporal (medical) data
Episode Bitwise Encoding is an encoding method designed for abstracting multiple medication episodes of a patient related to a certain event, e.g., an adverse drug event. That said, the encoding is domain independent and can be applied to any temporal events.
The motivation for the encoding is to make time series data easy to consume. One time series (e.g., the usage of a drug over the past months) can be encoded as a single numeric value. Multiple time series (e.g., polypharmacy) can be encoded as a vector. Once encoded, the data can be analysed easily (e.g., fed into off-the-shelf machine learning algorithms). In our first use case, it is used for predicting adverse drug events for patients with mental health disorders.
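To make the idea concrete, here is a minimal sketch of how one time series could be packed into a single integer and several into a vector. It is not the code in EBEncoding.py: the function names `encode_episodes` and `encode_patient`, the `(start, end)` episode format, and the choice of one bit per time unit with unit 0 as the least significant bit are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical episode format: (start, end) unit indices, inclusive,
# within an observation window of n_units time units.
Episode = Tuple[int, int]


def encode_episodes(episodes: Iterable[Episode], n_units: int) -> int:
    """Pack episodes into one integer: bit i is set if time unit i is covered."""
    code = 0
    for start, end in episodes:
        lo = max(start, 0)            # clip to the observation window
        hi = min(end, n_units - 1)
        for unit in range(lo, hi + 1):
            code |= 1 << unit
    return code


def encode_patient(drug_episodes: List[List[Episode]], n_units: int) -> List[int]:
    """Encode several time series (e.g., several drugs) as a vector of integers."""
    return [encode_episodes(eps, n_units) for eps in drug_episodes]


# One drug taken in units 0-1 and again in units 4-5 of a 6-unit window.
drug_a = encode_episodes([(0, 1), (4, 5)], n_units=6)
print(format(drug_a, "06b"))  # 110011

# Two drugs for the same patient become a 2-element vector.
print(encode_patient([[(0, 1), (4, 5)], [(2, 3)]], n_units=6))  # [51, 12]
```

Under these assumptions, bitwise operators on the integers correspond to set operations on time units, e.g., `a & b` marks the units in which both drugs were taken.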
- (7 December 2016) Mimic III data encoding functions added, including a Postgres DAO and Mimic events encoding. Play with MEEncoder.py to get a feel for it.
- (11 November 2016) Cross correlation (similar to correlation in signal processing) between two encodings is implemented. The result is a list of 2-element tuples, in which the first element is the time shift and the second is the correlation value at that shift. This enables various temporal analyses between two encodings, e.g., which one occurs earlier than the other and by how many time units, or time delay analysis: at which time shift the correlation reaches its maximum. A negative time delay analysis is also on its way, which will be very useful for analysing the effectiveness of treatment episodes for certain symptoms/disorders.
An example of usage can be found at the function of
- (23 September 2016) Defined the classes EBEncoding and EBVector with operators. The update was put on a new branch, which was set as the default branch. A sample usage file was added and the application of the encoding/vectors/matrix in Adverse Drug Event analytics was implemented in the
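The cross correlation described in the 11 November 2016 entry can be sketched as follows. This is an illustration only, not the repository's implementation: the function names `cross_correlation` and `best_shift` are hypothetical, encodings are assumed to be plain integers with one bit per time unit, and the correlation value at a shift is assumed to be the number of time units in which both encodings are active after shifting the second one.

```python
from typing import List, Tuple


def cross_correlation(a: int, b: int, n_units: int,
                      max_shift: int) -> List[Tuple[int, int]]:
    """Return (time shift, correlation value) pairs for two bitwise encodings.

    Assumption: the correlation value is the count of overlapping set bits
    between `a` and `b` shifted by `shift` time units.
    """
    mask = (1 << n_units) - 1  # drop bits shifted outside the window
    result = []
    for shift in range(-max_shift, max_shift + 1):
        shifted = (b << shift) if shift >= 0 else (b >> -shift)
        overlap = a & shifted & mask
        result.append((shift, bin(overlap).count("1")))
    return result


def best_shift(corr: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Time delay analysis: the shift at which the correlation is highest."""
    return max(corr, key=lambda pair: pair[1])


a = 0b001100  # active in units 2 and 3
b = 0b000011  # active in units 0 and 1
corr = cross_correlation(a, b, n_units=6, max_shift=3)
print(corr)              # [(-3, 0), (-2, 0), (-1, 0), (0, 0), (1, 1), (2, 2), (3, 1)]
print(best_shift(corr))  # (2, 2)
```

In this sketch the maximum at shift 2 says that `b` lines up best with `a` when moved two time units toward higher unit indices, which is the kind of time delay the entry above refers to.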
EBEncoding.py contains the definitions of the encoding class and the vector class. Usage examples:
- the general usage example is available here
- the application of the encoding in Adverse Drug Event Analytics is here
- prefer encoding real EHR data? Check test_sepsis_encoding() in mimic events encoding if you have access to Mimic III.
Analytics using the encoding
- Association Analysis of Adverse Drug Events and Polypharmacy
- Drug-drug interaction analysis: applying SVD (Singular Value Decomposition) to the matrix of drug-drug interaction Episode Encodings over 47k Adverse Events has revealed some potential new knowledge. The top 5 singular vectors after removing known causes of the ADE are visualised here. The absolute y values represent the significance of each drug pair in terms of its correlation to the adverse event. (This study is ongoing work and more details will be added soon.)
## Questions?
This is my ongoing work (2016) at King's College London. For any questions, please email: email@example.com.
## Citation
If you find this useful, please cite the following publication.
- Wu, Honghan, Zina M. Ibrahim, Ehtesham Iqbal, and Richard JB Dobson. “Encoding Medication Episodes for Adverse Drug Event Prediction.” In Research and Development in Intelligent Systems XXXIII: Incorporating Applications and Innovations in Intelligent Systems XXIV, pp. 245-250. Springer International Publishing, 2016.