% report/results.tex
\section{Results} \label{results}

\subsection{A new efficient \py{} package}
Several algorithms to extract features from univariate time series have already
been implemented in the \py{} package \pyeeg{}\cite{bao_pyeeg:_2011}.
Unfortunately, some of them were critically slow and could therefore not realistically be used in the present study.
Preliminary investigation of the \pyeeg{} source code revealed that runtimes could be improved mainly by vectorising expressions and pre-allocating temporary arrays.
Therefore, a systematic reimplementation of all algorithms in \pyeeg{} was undertaken.
Very significant improvements in performance were achieved for almost all functions (table~\ref{tab:benchmark}).
Critically, sample and approximate entropies\cite{richman_physiological_2000} became usable in
reasonable time.
For instance, 40h of CPU time were originally required to compute sample entropy on all 5-second epochs in a 24h recording.
The improved implementation cut this down to 55min.
\input{./tables/benchmark}
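The kind of rewrite involved can be illustrated with a toy example (not the actual \pyeeg{} code): a Python-level loop replaced by a single vectorised \texttt{numpy} expression.

```python
import numpy as np

def abs_diff_loop(x):
    # Original style: one interpreter dispatch and one list append per sample
    out = []
    for i in range(len(x) - 1):
        out.append(abs(x[i + 1] - x[i]))
    return out

def abs_diff_vectorised(x):
    # Reimplemented style: a single C-level pass writing into a
    # pre-allocated output array
    return np.abs(np.diff(x))
```

On long recordings the vectorised form is typically orders of magnitude faster; the same principle, applied to the entropy estimators, underlies the speed-ups reported in table~\ref{tab:benchmark}.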

Importantly, several mathematical inconsistencies between the original code and the mathematical definitions were also detected.
This affected five of the eight reimplemented functions (table~\ref{tab:benchmark}, rightmost column).
Details of the corrections performed are provided, as notes, in the documentation of the new package (see appendix, section 2.3).
Numerical results for the other three functions were consistent throughout optimisation.

In order to facilitate feature extraction, several data structures and routines, which were not
available in \pyeeg{}, were also implemented in a new \py{} package named \pr{}.
Briefly, extensions of \texttt{numpy} arrays\cite{walt_numpy_2011} providing
meta-data, sampling frequency, and other attributes were used to represent time series.
User-friendly indexing with strings representing time was also developed.
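String-based time indexing could look like the following minimal sketch; the class name, constructor, and time-string format are hypothetical illustrations, not \pr{}'s actual API.

```python
import re
import numpy as np

def parse_time(s):
    # Convert a time string such as "1h30m" or "20s" into seconds
    # (the accepted format here is an assumption for illustration).
    units = {"h": 3600.0, "m": 60.0, "s": 1.0}
    return sum(float(v) * units[u] for v, u in re.findall(r"([\d.]+)([hms])", s))

class Signal:
    """Hypothetical time series: a numpy array plus a sampling frequency."""
    def __init__(self, data, fs):
        self.data = np.asarray(data)
        self.fs = fs  # sampling frequency (Hz)

    def __getitem__(self, key):
        # Allow slicing with time strings, e.g. sig["1m":"2m"]
        if isinstance(key, slice) and isinstance(key.start, str):
            start = int(parse_time(key.start) * self.fs)
            stop = int(parse_time(key.stop) * self.fs)
            return Signal(self.data[start:stop], self.fs)
        return self.data[key]
```

For example, with a 256\,Hz signal, `sig["1h":"1h10m"]` would select ten minutes of data without the user converting times to sample indices by hand.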
In addition, a container for time series of discrete annotation levels, each linked to a confidence level, was built.
Importantly, a container for multiple time series, which supports heterogeneous (between time series) sampling frequencies, was implemented.
The new package also provides visualization, input/output, and wrappers for resampling and discrete wavelet decomposition.
\subsection{Twenty variables can generate accurate predictions}
Including temporal information (see next section) will multiply the number of variables, rendering computation more difficult and prediction potentially less accurate.
Therefore, recursive feature elimination\cite{menze_comparison_2009} based on
Gini variable importance was undertaken.
Starting with all 164 variables, random forests were trained, and the number of features was reduced by a factor of $1.5$ by eliminating the least important variables.
For each iteration, the stratified cross-validation error (see material and methods) was computed (fig.~\ref{fig:variable_elimination}). Five replicates were performed.

\input{./figures/variable_elimination}
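The elimination loop described above can be sketched with \texttt{scikit-learn}; hyper-parameters (number of trees, folds) are assumed for illustration and do not reproduce the study's exact pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def recursive_elimination(X, y, factor=1.5, min_features=2, seed=0):
    # Repeatedly train a random forest, record the stratified CV error,
    # then keep only the top 1/factor fraction of features ranked by
    # Gini importance.
    features = np.arange(X.shape[1])
    errors = {}
    while True:
        rf = RandomForestClassifier(n_estimators=50, random_state=seed)
        cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
        acc = cross_val_score(rf, X[:, features], y, cv=cv).mean()
        errors[len(features)] = 1.0 - acc
        n_keep = int(len(features) / factor)
        if n_keep < min_features:
            break
        rf.fit(X[:, features], y)
        ranked = np.argsort(rf.feature_importances_)[::-1]
        features = features[ranked[:n_keep]]
    return errors
```

Plotting `errors` against the number of retained features reproduces the shape of the curve in fig.~\ref{fig:variable_elimination}: error falls quickly at first, then plateaus.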
The predictive accuracy globally increases with the number of variables.
Interestingly, using only the two most important variables already results in less than 20\% average error.
Additionally, the increase in predictive accuracy is very moderate for $p>9$. This indicates that dimensionality can be considerably reduced without largely impacting accuracy.
For further investigation, $p=21$ was considered to be a good compromise between error and computational load. The relative importances of these variables are listed in table~\ref{tab:importances}.

\subsection{Structural differences with ground truth}
\input{./tables/confus}
The global confusion matrix is presented in table~\ref{tab:confus}. The overall accuracy compared to manual scoring was 0.92.
For \gls{nrem} and wake, both specificity and positive predictive value were above 0.92. However,
for \gls{rem} epochs, the specificity is only 0.74, and the false detection rate is 15\%.

In order to investigate the structural differences between ground truth and the predicted states,
three metrics describing physiological properties of sleep were computed (fig.~\ref{fig:struct_assess}).

Prevalence of states (fig.~\ref{fig:struct_assess}A) is a widely used metric to
describe sleep patterns.
No significant difference was found between the ground truth and predicted prevalences ($p > 0.13$ for all; z test on the interaction terms of $\beta$ regression).

(t test on the interaction terms of a linear mixed model).

\subsection{Attribution of confidence score to predictions}

Since classification may be inaccurate, it would be interesting to associate a `confidence' score with each prediction.
An entropy-based confidence metric $c$ (eq.~\ref{eq:entropy}) was defined for this purpose.
In order to validate it, the average cross-validation error was computed for different ranges of confidence (fig.~\ref{fig:error}A).
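Assuming eq.~\ref{eq:entropy} is a normalised Shannon entropy of the classifier's class probabilities (an assumption; the equation gives the exact definition), $c$ can be computed as:

```python
import numpy as np

def confidence(probs):
    # c = 1 - H(p) / log(K): 1 for a unanimous vote over K classes,
    # 0 for a uniform one. Assumed form of the entropy-based metric;
    # the report's exact definition is given by its eq. (entropy).
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]  # convention: 0 * log(0) = 0
    H = -np.sum(p * np.log(p))
    return 1.0 - H / np.log(len(probs))
```

Applied to the per-epoch vote fractions of a random forest, values of $c$ near one indicate near-unanimous votes, and values near zero indicate maximally ambiguous epochs.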

\input{./figures/error}

As expected, the probability of misclassification decreases monotonically with $c$.
In addition, error rates tend to zero when the confidence value is one, and, for confidences close to zero, the predictor is very inaccurate.
These characteristics indicate that $c$ can be used as a supporting value for predictions.
One application of such a confidence level could be to provide users with an overall quality assessment.
In addition, it makes it possible to display confidence while visually inspecting a recording (fig.~\ref{fig:error}C) in order to facilitate the resolution of ambiguities.