Python app for labelling ECG text files exported from the BARD EP system
exports
icons
ml
processing
settings
ui
README.md
main.py


BARD-labeller

Python app for labelling ECG text files exported from the BARD EP system. Labelling is done with simple drag & drop. Specific segments of an ECG can be exported to NumPy files in batches.

With thanks to Dr Ahran Arnold, Ben Cullin and Rohin Reddy.

About

This program is poorly documented because I am a terrible person. I apologise, but I am very happy to answer any questions if you email me at james@jph.am

Requirements

  • PyQt5
  • Pandas
  • SciPy
  • pywt (conda install pywavelets)
  • Statsmodels (conda install -c statsmodels statsmodels)
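Before launching the app, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of the repository; the helper name `missing_modules` is hypothetical, and the import names assume the usual ones for each package (PyWavelets is imported as `pywt`).

```python
from importlib.util import find_spec

# Import names for the packages listed above (PyWavelets is imported as pywt).
REQUIRED_MODULES = ("PyQt5", "pandas", "scipy", "pywt", "statsmodels")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```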

Running the program

Just run main.py from the root and you'll be greeted with:

welcome screen

Reporting cases

First, place all the exported *.txt files for a given case in their own subfolder within the "./data/" directory, e.g. "./data/case1/", then select that folder using the 'load patient' icon in the top left corner.
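To check that the folder layout is what the app expects, you could list each case folder and its text files. This is a small sketch that is not part of the program; the function name `list_cases` is hypothetical, and it only assumes the "./data/<case>/*.txt" layout described above.

```python
from pathlib import Path


def list_cases(data_dir="./data"):
    """Map each case subfolder name to the sorted names of its .txt exports."""
    root = Path(data_dir)
    cases = {}
    if not root.is_dir():
        return cases
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        cases[folder.name] = sorted(f.name for f in folder.glob("*.txt"))
    return cases


if __name__ == "__main__":
    for case, files in list_cases().items():
        print(f"{case}: {len(files)} text file(s)")
```

A case folder that shows zero text files has probably been exported to the wrong place.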

Second, fill in the procedure ID, procedure date and procedure type (protocol) on the left side. These parameters are associated with the patient (the active folder) rather than individual text files. This program comes with labelling protocols for 3 types of studies:

  1. Accessory pathway ablations
  2. Ventricular ectopy ablations
  3. AVNRT ablations

To add more protocols, edit the "./settings/settings.py" file, which should be fairly self-explanatory.

Third, select the text file you wish to label using the 'file selector' drop down box.

The plot can be moved by holding the left mouse button and dragging. It can be stretched vertically or horizontally by holding mouse button 2 and dragging. Zooming in and out is performed by scrolling.

Use the buttons on the left to add range sliders. Labelled segments appear green. Red buttons are those which are marked as 'mandatory' in the protocol settings.

A labelled plot might look like the following:

label example

To delete a label, double-click its entry in the label tables on the right-hand side.

Exporting cases

When you first choose to export, you'll be forced to select a folder under which all the labelled data exist. It doesn't matter if they're nested. The window will initially appear blank, until you choose a case type to filter and click the filter button in the bottom left. You should then get a view of all of the valid files, like below:

export example

You will then be able to export a dataset using the export option on the right.

Note:

  • Ensure the correct 'start' and 'stop' options are selected on the left. To export a whole labelled cardiac cycle for every study where it is present, you would choose 'rr START' to 'rr STOP', whereas to export a sinus QRS-T segment you would choose 'sinus pwave START' to 'sinus pwave STOP'.

  • After clicking filter, it will tell you how many valid studies will be exported. You can choose to export all of these, or exclude 1% or 5% of them as outliers (judged by duration) by choosing the '99%' or '95%' export range options. This allows you to easily exclude cases which may well be errors, or whose inclusion would greatly increase your sequence length for the sake of only a couple of cases.

  • Export buffer can be a positive or negative integer which allows you to add on or take off a number of datapoints from each cycle, e.g. to ensure you have an isoelectric line before your QRS, or after your T wave, if you chose that section to export.
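The export range and buffer options can be pictured with a short sketch. This is not the program's own code: it assumes the outliers dropped are the longest segments, that the buffer is applied to both ends of a segment, and that segments are clipped to the recording. The function names are hypothetical.

```python
import numpy as np


def keep_within_percentile(durations, keep_percent=99):
    """Return a boolean mask keeping segments no longer than the given percentile.

    Assumption: only the longest segments count as outliers.
    """
    durations = np.asarray(durations)
    cutoff = np.percentile(durations, keep_percent)
    return durations <= cutoff


def apply_buffer(start, stop, buffer, n_samples):
    """Widen (positive buffer) or narrow (negative buffer) a segment.

    Assumption: the buffer applies to both ends; the result is clipped
    to the valid sample range [0, n_samples].
    """
    new_start = max(0, start - buffer)
    new_stop = min(n_samples, stop + buffer)
    return new_start, new_stop


if __name__ == "__main__":
    durations = [100] * 99 + [1000]  # one suspiciously long segment
    mask = keep_within_percentile(durations, keep_percent=99)
    print(f"Keeping {mask.sum()} of {len(durations)} segments")
    print(apply_buffer(50, 150, buffer=10, n_samples=1000))
```

Dropping the single long segment here keeps the maximum sequence length at 100 samples rather than 1000, which is the trade-off the note above describes.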

Files are exported as NumPy arrays, which can easily be read into Python. They are written to the 'exports' folder within the program directory, nested in a directory structure identical to that of the imported data:

exported files
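If you only need the raw arrays, the exported files can be read back with plain NumPy. The sketch below is independent of the program's own classes; the helper name `load_exports` is hypothetical, and it only assumes that the exports are `.npy` files nested under a single folder.

```python
from pathlib import Path

import numpy as np


def load_exports(export_dir):
    """Load every .npy file under export_dir, keyed by its path relative to export_dir."""
    root = Path(export_dir)
    return {
        path.relative_to(root).as_posix(): np.load(path)
        for path in sorted(root.rglob("*.npy"))
    }


if __name__ == "__main__":
    for name, array in load_exports("./exports").items():
        print(name, array.shape)
```

Because the keys keep the nested folder names, they can be used to match each array back to its case.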

Processing exported cases

I imagine users will have their own ways of associating labels with the exported numpy files, e.g. depending on whether they want to perform classification or segmentation tasks.

However, the DataSeries class can be used to load a directory of exports, e.g. as follows:

dataseries = DataSeries(path="../../exports/ap_qrs_all",
                        labelfunction=DataSeries.get_labels_by_case_ap_left_right,
                        caselabels="C:\\Users\\James\\Box\\CLAIM-Data\\CLAIM-AP\\test.xlsx",
                        classes=('L','R'),
                        include_only=("NN2",1))

The DataSeries class has lots of (poorly documented) features, but this example would:

  • Load all npy files under the "ap_qrs_all" folder
  • Load the file 'test.xlsx' and use the supplied labelfunction to load labels from it. The example here uses the LeftVsRight column to assign the labels from the classes options, only including cases where the NN2 column is equal to 1.
  • Save the labels to the DataSeries.caselabels dictionary, where the key is the Case ID column and the value is the numpy array.

This can then easily be used to e.g. fit a neural network, in the following manner, using 4-fold cross-validation:

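# Note: downsample_ratio is not defined in this snippet; set it before the loop.
for fold in range(4):
    # Each call returns the training and test split for the requested fold.
    (train_x, train_y, train_n, train_caseids),\
    (test_x, test_y, test_n, test_caseids) = dataseries.get_train_test_data(reverse=True, fold_num=fold,
                                                                            downsample_ratio=downsample_ratio)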

    seq_length = train_x.shape[1]
    input_channels = train_x.shape[2]

    if fold == 0:
        print(f"Train: {train_x.shape} from {train_n} cases")
        print(f"Test: {test_x.shape} from {test_n} cases")

    # Build or load your model of choice here (e.g. a Keras model) and assign it to `model`

    model.fit(train_x, train_y, batch_size=32, epochs=10, validation_data=(test_x, test_y))
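The loop above returns case IDs alongside each split. As an illustration of why that matters, here is a minimal sketch of a case-wise k-fold split in plain NumPy, where all segments from one case land on the same side of the split. It is not how `DataSeries.get_train_test_data` is implemented (that is not documented here); the function name `casewise_folds` and its parameters are hypothetical.

```python
import numpy as np


def casewise_folds(case_ids, n_folds=4, seed=0):
    """Yield (train_idx, test_idx) pairs so that no case appears in both sets."""
    case_ids = np.asarray(case_ids)
    unique_cases = np.unique(case_ids)
    rng = np.random.default_rng(seed)
    rng.shuffle(unique_cases)  # shuffle cases, not individual segments
    for test_cases in np.array_split(unique_cases, n_folds):
        test_mask = np.isin(case_ids, test_cases)
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


if __name__ == "__main__":
    ids = ["a", "a", "b", "c", "c", "d", "e", "e"]  # one entry per segment
    for fold, (train_idx, test_idx) in enumerate(casewise_folds(ids)):
        print(fold, train_idx, test_idx)
```

Splitting by case rather than by segment avoids testing on beats from a patient the model has already seen during training.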