# A DNN-Based Low Power ECG Co-Processor Architecture to Classify Cardiac Arrhythmia for Wearable Devices

Meenali Janveja, *Graduate Student Member, IEEE*, Rushik Parmar, Mayank Tantuway, and Gaurav Trivedi, *Member, IEEE*

Abstract—In this brief, a Deep Neural Network (DNN) based cardiac arrhythmia (CA) classifier is proposed that classifies ECG beats into normal beats and different types of arrhythmia beats. An optimized fixed-length beat is extracted from the time-domain ECG signal and fed as input to the proposed classifier. This fixed-size input beat obviates the need to extract handcrafted ECG features and aids in the optimization of the proposed design. The classifier presented in this brief exhibits classification accuracy better than or comparable to previously reported methods that use complex algorithms for CA classification with patient-independent (subject-oriented) approaches. Moreover, the proposed CA classifier consumes $8.75\mu W$ at 12 kHz when implemented in 180nm bulk CMOS technology. Its low power realization relative to well-known state-of-the-art methods makes it suitable for wearable healthcare devices.

Index Terms-ECG, arrhythmia, DNN, ASIC, low power.

# I. INTRODUCTION

CARDIOVASCULAR diseases (CVD) are one of the primary causes of mortality worldwide, causing millions of casualties every year. Among all the chronic CVDs, cardiac arrhythmia is the most recurrent and is responsible for the maximum number of fatalities due to cardiac arrests. Therefore, the development of wearable devices for timely detection of CA would enable on-time treatment and save millions of lives [1]. Recently, a number of wearable healthcare devices have become very popular because they monitor electrocardiogram (ECG) signals in real time, which helps lower the rate of hospitalization due to cardiac issues [2]. Nevertheless, presently available wearable ECG devices rely heavily on a remote hospital setup to diagnose diseases, which requires transmitting raw ECG data from the device to the remote setup via mobile phones or the Internet. This causes high power consumption and delays the diagnosis, hindering prompt treatment. However, advancements in machine learning techniques and semiconductor device technology enable embedding cardiac arrhythmia classifiers in wearable devices to classify CVD in real time, which reduces the need for data transmission and alerts an individual on time.

Manuscript received December 1, 2021; revised January 18, 2022; accepted January 19, 2022. Date of publication January 25, 2022; date of current version March 28, 2022. This brief was recommended by Associate Editor C. Condo. (Corresponding author: Meenali Janveja.)

The authors are with the Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati 781039, India (e-mail: meena176102001@iitg.ac.in; rushik\_parmar@iitg.ac.in; mayanktantuway100@gmail.com; trivedi@iitg.ac.in).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCSII.2022.3146036.

Digital Object Identifier 10.1109/TCSII.2022.3146036

The scientific community has developed several machine learning based models to classify cardiac arrhythmia. The CA classifiers proposed in [3] and [4] employ SVM to segregate CA beats from normal heartbeats. However, both works use handcrafted features for classification, which may limit the classifier's ability to deal with diverse ECGs. Also, their classification methods do not follow the American National Standard (ANSI/AAMI EC57:1998) prepared by the Association for the Advancement of Medical Instrumentation (AAMI). The authors of [5], [6], [7], [8] and [9] perform CA beat classification as per AAMI standards using different machine/deep learning models. The method reported in [8] has better classification accuracy than those reported in [5], [6], [7]. Despite its appreciable performance, the model used in [8] is not suitable for hardware implementation: it employs a complex deep neural network (DNN) with a large number of nodes in the hidden layers, which leads to higher resource utilization and makes it less suitable for wearable device applications. Zhao et al. [9] use a DNN for CA classification and report $13.34\mu W$ power consumption with a patient-specific approach. Although patient-specific classifiers perform better than patient-independent classifiers, they cannot handle ECG diversity and also require expert intervention in the diagnosis [8], making them less suitable for wearable device applications. Therefore, a low power VLSI architecture is proposed in this brief for a DNN based patient-independent, subject-oriented cardiac arrhythmia classifier with acceptable accuracy, making it suitable for wearable healthcare devices. The rest of this brief is organized as follows. Section II describes the implementation details of the proposed ECG co-processor. Section III discusses the experiments and results. Section IV concludes the manuscript.

# II. PROPOSED CARDIAC ARRHYTHMIA (CA) CLASSIFIER

A typical system-on-chip (SoC) for arrhythmia detection is broadly divided into two blocks. The first is an analog front end, which acquires the ECG signal and converts it to digital samples. The second is a co-processor, which performs ECG feature extraction and disease classification. The primary contribution of this manuscript is the design of a low power and area efficient ECG co-processor, whose implementation is described below.

1549-7747 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

### A. ECG Feature Extraction and Beat Alignment

The first step in classifying CA is to extract fixed-length vectors from the raw ECG signals, which are input to the classification block. The steps to extract the feature vector are described in Algorithm 1. The raw ECG signal is first filtered using a bandpass filter with a 0.5–40 Hz passband [10] to remove noise. The filtered signal is then passed through a feature delineation block, which extracts R peaks. Several methods have been reported previously for detecting R peaks in ECG signals. However, the methods reported in the literature employ complex mathematical operations, such as squaring operations [10], [11] or wavelet transforms [12]. These methods use more resources and memory, as they require storing both the ECG samples and intermediate output arrays. Therefore, in the proposed work, a simplified R peak extraction algorithm is employed, which uses simple difference and comparison operations to delineate R peaks and stores only the initial ECG samples.

Initially, 850 samples of ECG are fed serially to the feature delineation block, and the maximum value of these samples is found using a comparator. The ECG samples are stored for further processing, and the threshold for finding R peaks is set to $\frac{ECG_{max}}{2}$. Once the threshold is updated, a counter is started to traverse the stored ECG samples. An R peak is delineated if the following two conditions are satisfied. First, $ECG_i > threshold$, and second, a zero-crossing of the slope is obtained, i.e., $MSB(ECG_{i+1} - ECG_i) \oplus MSB(ECG_i - ECG_{i-1}) = 1$. Thus, the proposed architecture needs no intermediate sample values to be stored and uses only a comparator and an XOR operation to find the R peak, making the implementation resource and power efficient, as explained in lines 1–14 of Algorithm 1. Once the R peaks are marked, the respective RR intervals are computed as the difference between two consecutive R peaks. The beat segmentation block is then enabled, as shown in Figure 1.
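The detection rule above can be sketched in software as follows. This is a minimal Python sketch under stated assumptions, not the Verilog implementation described in this brief: the names `msb16`, `find_r_peaks` and `rr_intervals` are hypothetical, the samples are assumed to be integers, and the sign bit is read from a 16-bit two's complement view of each difference, as in line 11 of Algorithm 1.

```python
def msb16(value):
    """Sign bit (bit 15) of `value` viewed as a 16-bit two's complement word."""
    return ((value & 0xFFFF) >> 15) & 1


def find_r_peaks(ecg):
    """Indices i where ecg[i] exceeds half the window maximum and the slope
    changes sign between (i-1, i) and (i, i+1)."""
    threshold = max(ecg) >> 1                      # Th = ECG_max / 2 via a right shift
    peaks = []
    for i in range(1, len(ecg) - 1):
        sign_before = msb16(ecg[i] - ecg[i - 1])   # 1 if the signal was falling
        sign_after = msb16(ecg[i + 1] - ecg[i])    # 1 if the signal falls next
        if ecg[i] > threshold and (sign_after ^ sign_before) == 1:
            peaks.append(i)
    return peaks


def rr_intervals(peaks):
    """Differences between consecutive R peak indices, in samples."""
    return [b - a for a, b in zip(peaks, peaks[1:])]


# Illustrative (made-up) samples with two clear peaks.
samples = [0, 10, 50, 200, 60, 5, 0, 12, 40, 180, 30, 2]
peaks = find_r_peaks(samples)
print(peaks, rr_intervals(peaks))
```

Only the stored samples, one running maximum and two sign bits are needed per step, which is the property the text relies on for its resource savings.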

In the proposed work, an approach similar to [8] is employed for beat segmentation and alignment. An ECG beat is taken as the samples between the midpoints of two consecutive RR intervals, $RR_i/2$ and $RR_{i+1}/2$. The R peak is then aligned in the centre, and a fixed-size feature vector is generated by taking 208 samples from both sides of the R peak. Zero padding is performed if the number of samples between $RR_i/2$ and $RR_{i+1}/2$ is less than 417. However, it is observed that zero padding is applied to most ECG beats, because the span between $RR_i/2$ and $RR_{i+1}/2$ is shorter than 417 samples in most ECG excerpts. This unnecessary zero padding adds redundant hardware resources, as the padded zeros carry no information about the ECG signal. Thus, to reduce the redundant hardware, we gradually decrease the input layer size from 417 to 210 without any loss of information and achieve comparable accuracy for the multiclass classification. Further reduction in the beat size decreases the accuracy of the DNN drastically. Therefore, our proposed methodology employs an approach similar to [8], as explained in lines 15–20 of Algorithm 1, to generate feature vectors of size 210 instead

**Algorithm 1** Pseudocode for Beat Extraction and Classification

```
 1: Input: Filtered samples of ECG
 2: R Peak Extraction and Beat Alignment
 3: Initialize ECG_max = 0
 4: Serially input ECG data ECG_i and store it in register-based memory
 5: if ECG_i > ECG_max then
 6:     ECG_max = ECG_i
 7: end if
 8: Threshold for R peak: Th = ECG_max >> 1
 9: j, i <- 0
10: if ECG_i > Th and
11:    (ECG_{i+1} - ECG_i)[15] XOR (ECG_i - ECG_{i-1})[15] == 1 then
12:     R[j] = i
13:
14: end if
15: boundL = (R_{i-1} + R_i) >> 1 and boundR = (R_{i+1} + R_i) >> 1
16: if R_i - boundL < 105 then
17:     beat = zero_paddings + ECG_i, i in boundL to R_i
18: else
19:     beat = ECG_i, i in R_i - 105 to R_i    // similar process for the right side of the beat using boundR
20: end if
21: Arrhythmia Classifier
22: k <- 0                                     // k controls the layer number
23: Begin
24: for i = 0, ..., n_layer_{k+1} do           // controls neuron operations
25:     for j = 0, ..., n_layer_k do
26:         sum_i = sum_i + (input_sample_j * W_ij)
27:         j <- j + 1
28:     end for
29:     y_i = sum_i + B_i
30:     if k < 3 and y_i > 0 then
31:         out_i = y_i
32:     else
33:         out_i = 0
34:     end if
35:
36:     Repeat from 23 for other layers
37: end for
```



Fig. 1. Architecture for ECG Processing and DNN Implementation.

of 417. Reducing the beat size cuts resource utilization almost in half while maintaining acceptable performance.
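The segmentation and alignment step of lines 15–20 of Algorithm 1 can be sketched as follows. This is an illustrative Python sketch under stated assumptions rather than the hardware design: the function name `extract_beat` is hypothetical, and the split of the 210 samples into 105 samples before the R peak and 105 samples starting at the R peak is an assumption, since the text only states that the R peak is centred.

```python
HALF = 105  # half of the 210-sample beat


def extract_beat(ecg, r_prev, r_curr, r_next, half=HALF):
    """Return a fixed-length beat of 2*half samples centred on r_curr.

    Samples are taken only between the midpoints to the neighbouring R peaks;
    any shortfall on either side is filled with zeros.
    """
    bound_l = (r_prev + r_curr) >> 1           # midpoint to the previous R peak
    bound_r = (r_next + r_curr) >> 1           # midpoint to the next R peak

    left_start = max(bound_l, r_curr - half)   # never reach past the midpoint
    left = list(ecg[left_start:r_curr])
    left = [0] * (half - len(left)) + left     # zero padding on the far left

    right_end = min(bound_r, r_curr + half)
    right = list(ecg[r_curr:right_end])
    right = right + [0] * (half - len(right))  # zero padding on the far right

    return left + right


# Illustrative signal whose sample value equals its index.
signal = list(range(1000))
beat = extract_beat(signal, r_prev=100, r_curr=300, r_next=380)
print(len(beat), beat[105])
```

With these inputs the left midpoint lies 100 samples before the R peak, so 5 zeros are prepended, and the right midpoint lies 40 samples after it, so 65 zeros are appended.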

### B. Design and Implementation of DNN Based CA Classifier

Once heartbeat segmentation is completed, the fixed-size feature vector is fed to the DNN based classifier. Figure 1 illustrates the proposed architecture of the DNN. A DNN has input and hidden layers whose fundamental unit is the neuron, comprising a multiplier and an accumulator (MAC), whose operation can be expressed as $y = \sum_{i=1}^{N} W_i x_i + B$. Here, $W_i$ and $B$ are the weights and bias of a neuron, respectively, and $x_i$ are its inputs. The computational complexity and hardware cost of a DNN are therefore primarily determined by two factors: first, the size of the input and hidden layers and, second, the efficiency of the multiplier and adder units employed in the MAC. Accordingly, we employ a two-stage optimization to limit the power consumption and resource utilization of the proposed architecture. As already described, the design proposed in [8] has a large number of neurons and would consume large area and power if implemented in hardware. Therefore, we propose an optimized neural network suitable for resource-constrained applications. After reducing the input layer size and several epochs of training, the size of the DNN is optimized to $210 \times 35 \times 25 \times 5$, which sustains sufficient accuracy with modest computational resources in hardware. After optimizing the DNN layers, the pre-trained network is implemented in hardware using Verilog HDL.
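The effect of the input layer size on the amount of computation can be checked with a short calculation. The Python sketch below counts multiply-accumulate operations and biases for the $210 \times 35 \times 25 \times 5$ network; the 417-input variant with the same hidden layers is a hypothetical comparison point introduced here for illustration, not a network evaluated in this brief.

```python
def mac_count(layers):
    """Multiplications per inference for a fully connected network."""
    return sum(a * b for a, b in zip(layers, layers[1:]))


def bias_count(layers):
    """One bias per neuron in every layer after the input layer."""
    return sum(layers[1:])


proposed = [210, 35, 25, 5]
wide_input = [417, 35, 25, 5]   # hypothetical: same hidden layers, 417-sample beat

print(mac_count(proposed))                          # weights in the proposed network
print(mac_count(wide_input))                        # weights with a 417-sample input
print(mac_count(proposed) / mac_count(wide_input))  # fraction of MACs retained
print(bias_count(proposed))
```

Under this assumption the first layer dominates the count, so shrinking the beat from 417 to 210 samples removes close to half of the multiplications, in line with the reduction stated above.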

The weights and biases of the neurons are normally represented in floating-point format, which is not only complex but also consumes more power than fixed-point and integer arithmetic when implemented in hardware. Therefore, weights and biases are represented in a 16-bit fixed-point format, in which the 6 most significant bits hold the integer part and the remaining 10 bits hold the fractional part. Converting the weights from floating point to this fixed-point format does not incur any loss in accuracy and simplifies the implementation of the MAC unit.
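A conversion to this 16-bit format can be sketched as follows. This is a minimal Python sketch under stated assumptions: the helper names `to_q6_10` and `from_q6_10` are hypothetical, the sign is assumed to be carried in two's complement within the 6 integer bits, and rounding to nearest with saturation is an assumed policy that the text does not specify.

```python
FRAC_BITS = 10   # fractional bits
WORD_BITS = 16   # total word length


def to_q6_10(x):
    """Quantize a float to a 16-bit two's complement word with 10 fractional bits."""
    raw = int(round(x * (1 << FRAC_BITS)))
    lowest = -(1 << (WORD_BITS - 1))
    highest = (1 << (WORD_BITS - 1)) - 1
    raw = max(lowest, min(highest, raw))     # saturate (assumed policy)
    return raw & 0xFFFF                      # keep the low 16 bits


def from_q6_10(word):
    """Interpret a 16-bit word as a signed value with 10 fractional bits."""
    if word & 0x8000:                        # sign bit set: negative value
        word -= 1 << WORD_BITS
    return word / (1 << FRAC_BITS)


print(hex(to_q6_10(0.5)), hex(to_q6_10(-0.25)))
print(from_q6_10(to_q6_10(0.3)))
```

With 10 fractional bits the step size is $2^{-10}$, so a value rounded to nearest is off by at most $2^{-11}$, as long as it lies within the representable range.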

Further, as shown in Figure 1, the DNN has two dense layers and an output layer controlled by a DNN control unit implemented as a Finite State Machine (FSM). This control unit controls an intermediate FSM for every layer. As soon as the *Enable* signal of the DNN goes high, an extracted beat of 210 samples is input to the DNN. The control unit of layer 1 then becomes active and starts processing the inputs (lines 21–36 of Algorithm 1). Each input and its respective weight, stored in 2's complement format, are fed serially to the MAC unit.
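The layer-by-layer processing of lines 21–36 of Algorithm 1 can be modelled in software as follows. This Python sketch is an illustration under stated assumptions, not the FSM itself: the names `neuron` and `forward` are hypothetical, inputs and weights are assumed to be integers with 10 fractional bits, and ReLU is applied to the hidden layers only, which is a choice made for this sketch.

```python
FRAC_BITS = 10  # assumed fixed-point scaling of inputs and weights


def neuron(inputs, weights, bias, relu):
    """Serially multiply-accumulate one neuron, add its bias, optionally apply ReLU."""
    acc = 0
    for x, w in zip(inputs, weights):     # one input/weight pair per step
        acc += (x * w) >> FRAC_BITS       # rescale each product to 10 fractional bits
    y = acc + bias
    if relu and y <= 0:
        return 0                          # ReLU as a two-way selection
    return y


def forward(beat, layers):
    """Run the layers one after another.

    `layers` is a list of (weight_rows, biases) pairs; the last pair is the
    output layer and is evaluated without ReLU in this sketch.
    """
    activations = beat
    for k, (weight_rows, biases) in enumerate(layers):
        is_hidden = k < len(layers) - 1
        activations = [
            neuron(activations, row, b, relu=is_hidden)
            for row, b in zip(weight_rows, biases)
        ]
    return activations


# Toy 2-2-1 network with made-up weights (1024 represents 1.0).
toy_layers = [
    ([[512, 512], [-1024, 0]], [0, 0]),
    ([[1024, 1024]], [-512]),
]
print(forward([1024, 2048], toy_layers))
```

Each layer consumes the outputs of the previous one, mirroring the sequential operation that the control unit enforces.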

As explained above, a MAC unit consists of a multiplier and an adder that accumulates the final result. The multiplier requires shifters, half adders and full adders to perform its operation. A few previously reported methods use different kinds of approximate and fast adders and multipliers in the MAC unit to optimize power [5], [13]. However, approximate multipliers and adders affect the overall accuracy of the network, whereas fast adders and multipliers add area overhead. Therefore, we propose a novel, optimized sequential shift multiplier with fewer registers to minimize resource utilization. This multiplier employs a simple shift-by-1 operation and a single 32-bit adder, whose operations are controlled by a clock signal, as shown in Figure 1. It costs extra clock cycles, which increases latency, but it significantly reduces resource utilization. Since the pre-trained weights are less than one, 16 intermediate bits of the 32-bit multiplier output are taken as its final output without affecting overall accuracy. This 16-bit output is then sent to the accumulator for addition. The area and power consumption of the proposed MAC unit are reduced by $2.76\times$ and $2.34\times$, respectively, compared to a conventional MAC unit synthesized in 180nm technology, without any penalty in accuracy. After the weights and inputs are multiplied and accumulated by the MAC unit, the bias is added and the final output of a single neuron is obtained. Further, this MAC unit is shared with the other DNN layers, yielding a threefold improvement in resource utilization. The outputs obtained from the MAC are fed to the "ReLU" activation function, implemented using a simple multiplexer. The layers operate sequentially, each starting once the output of the previous layer is available, under the control of the DNN control unit. This sequential, shared operation contributes to the area and power savings of the proposed MAC unit over a conventional MAC unit realized in 180nm technology. The proposed co-processor performs beat-by-beat classification in real time at 12 kHz, a clock rate low enough for low power operation.
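The idea of a sequential shift-and-add multiplier can be illustrated with a short software model. This Python sketch is not the proposed circuit: the function names are hypothetical, signs are handled through magnitudes for simplicity, and the choice of which 16 intermediate bits are kept (dropping the 10 extra fractional bits of the product) is an assumption, since the text does not state the bit positions.

```python
def shift_add_multiply(a, b, width=16):
    """Multiply two signed integers using one conditional add and one shift per step."""
    negative = (a < 0) != (b < 0)
    multiplicand, multiplier = abs(a), abs(b)
    product = 0
    for _ in range(width):               # one iteration models one clock cycle
        if multiplier & 1:
            product += multiplicand      # the single adder
        multiplicand <<= 1               # shift-by-1
        multiplier >>= 1
    return -product if negative else product


def fixed_point_product(a_raw, b_raw, frac_bits=10):
    """Product of two words with `frac_bits` fractional bits, rescaled to the same format.

    The full product carries 2*frac_bits fractional bits; the arithmetic right
    shift drops the extra ones. The result is assumed to fit in 16 bits
    because the weights are below one.
    """
    return shift_add_multiply(a_raw, b_raw) >> frac_bits


# 0.5 * -0.25 = -0.125, i.e. -128 with 10 fractional bits.
print(fixed_point_product(512, -256))
```

The loop trades one multiplication for `width` add-and-shift steps, which corresponds to the latency cost and resource saving described above.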

During software training, a softmax function is used as the activation function of the final output layer for multiclass classification. The softmax function is used because it is differentiable at all points and gives an accurate estimate of the output posterior probabilities without information loss, making it suitable for multiclass classification. Since a pre-trained network is employed for the hardware implementation, the architecture is optimized by using a simple "max" function, implemented with a simple comparator, as the activation function of the final output layer instead of the softmax function. This further reduces resources and power consumption, making the design suitable for ultra-low power wearable devices.
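The substitution works because softmax is strictly increasing in each score, so the largest score always yields the largest probability. The Python sketch below checks this on made-up output scores; the ordering of the class labels in `CLASSES` is a hypothetical choice for illustration.

```python
import math

CLASSES = ["N", "S", "V", "F", "Q"]   # label order is an assumption of this sketch


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value, found with one comparison per element."""
    best = 0
    for i in range(1, len(values)):
        if values[i] > values[best]:
            best = i
    return best


scores = [1.2, -0.3, 3.4, 0.0, 2.9]   # made-up output-layer values
assert argmax(scores) == argmax(softmax(scores))
print(CLASSES[argmax(scores)])
```

The predicted class is therefore identical whether or not the exponentials are computed, and only the comparison is needed at inference time.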

# III. RESULTS AND DISCUSSIONS

This section presents a detailed discussion of the results obtained when the proposed design is validated with various test cases. The designed architecture is tested and verified on the MIT-BIH arrhythmia database [14]. As per the American National Standard (ANSI/AAMI EC57:1998) [15] issued by AAMI, the heartbeats are combined into five classes, namely, normal beat (N), ventricular ectopic beat (V), supraventricular ectopic beat (S), fusion of a normal and a ventricular ectopic beat (F) and unknown beat type (Q). We use two evaluation schemes to estimate the performance of the proposed classifier. The first is the subject-oriented approach described in [6] (Experiment 1), in which the training and test sets are completely different from each other as per the ANSI/AAMI EC57 standard. It therefore gives a more realistic estimate of the classifier's ability in practice. The second is the class-oriented approach, used to test the model on a larger database and to address the data imbalance issue in the MIT-BIH data. In this scheme, data is divided in the ratio of 75:25 for testing and training, respectively (Experiment 2). Note that the model trained for the subject-oriented approach according to the AAMI standard is the one used for the proposed hardware implementation. Further details about the data can be obtained from [6]. Also, as suggested by the ANSI/AAMI EC57 standard, we give more importance to the classification of the two majority classes of arrhythmia, Class S and Class V.

TABLE I
COMPARISON RESULTS FOR EXPERIMENT 1
(SUBJECT-ORIENTED SCHEME)

| Methodology           | Metric | [5]  | [6]  | [7]  | [8]   | Proposed |
|-----------------------|--------|------|------|------|-------|----------|
| Overall Accuracy (%)  |        | 85.9 | 86.4 | 89.9 | 94.7  | 91.6     |
| Class S (S vs. non-S) | Se (%) | 75.9 | 60.8 | 80.8 | 77.3  | 86.45    |
|                       | Sp (%) | 95.4 | 97.7 | 96.7 | 97.7  | 82.21    |
|                       | AUC    | NA   | NA   | NA   | 0.858 | 0.886    |
| Class V (V vs. non-V) | Se (%) | 77.7 | 81.5 | 82.2 | 93.7  | 83.37    |
|                       | Sp (%) | 98.8 | 96.4 | 99.0 | 98.8  | 94.06    |
|                       | AUC    | NA   | NA   | NA   | 0.951 | 0.925    |

TABLE II
COMPARISON RESULTS FOR EXPERIMENT 2
(CLASS-ORIENTED SCHEME)

| Method               | [6]  | [17]  | [18]  | [16]  | [19]  | [20]  | Proposed |
|----------------------|------|-------|-------|-------|-------|-------|----------|
| Overall Accuracy (%) | 99.3 | 99.41 | 94.90 | 99.85 | 94.30 | 98.3  | 97.35    |
| Se%                  | NA   | 96.08 | 99.13 | 99.85 | 93.18 | 95.51 | 94.83    |
| Sp%                  | NA   | NA    | 81.44 | NA    | NA    | NA    | 93.25    |

The performance metrics for Experiment 1 and Experiment 2 are presented in Table I and Table II, respectively. Table I shows that, for class V, the proposed classifier has sensitivity and specificity comparable to previously reported works. It also exhibits an overall accuracy of 91.6%, which is higher than that of [5] and [6] and comparable to that of [7], and only 3.1 percentage points below the classifier reported in [8]. However, the previously reported works use complex classifiers, which are not suitable for low power hardware implementation and can only be used in a hospital setup with a software platform. Also, [5], [6] and [7] use ECG features rather than a raw ECG beat as input to the classifier, which again adds to the hardware cost. In contrast, our proposed classifier is optimized for low power applications performing real-time arrhythmia detection. Table I also shows that our method has AUC values comparable to [8], indicating that it is a promising candidate for wearable healthcare applications.

Table II shows that, under the class-oriented scheme, the proposed classifier has an overall accuracy of 97.35% and a sensitivity of 94.83%, which is again better than or comparable to previously reported works employing class-oriented schemes. The method reported in [16] achieves the highest accuracy and sensitivity among our proposed method and the other state-of-the-art methods. It uses stacked convolutional and long short-term memory networks to classify ECG signals into only two classes. Our method, however, performs multiclass classification using a simpler model with comparable accuracy and sensitivity. For the class-oriented approach, the average AUC for Class S and Class V is 0.97, but because the corresponding data are unavailable for the other methods, AUC results could not be compared with other state-of-the-art approaches.
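For reference, the one-versus-rest sensitivity and specificity used in Table I follow the standard definitions $Se = TP/(TP+FN)$ and $Sp = TN/(TN+FP)$. The Python sketch below computes them from a confusion matrix; the function names are hypothetical and the 3-class counts are made up purely to exercise the code, not taken from the experiments in this brief.

```python
def one_vs_rest(confusion, k):
    """Sensitivity and specificity of class k against all other classes.

    confusion[t][p] is the number of beats of true class t predicted as p.
    """
    n = len(confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                        # class-k beats that were missed
    fp = sum(confusion[t][k] for t in range(n)) - tp   # other beats labelled as k
    tn = sum(map(sum, confusion)) - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)


def overall_accuracy(confusion):
    """Fraction of beats on the diagonal of the confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / sum(map(sum, confusion))


# Made-up counts for illustration only.
toy = [
    [90, 5, 5],
    [10, 80, 10],
    [0, 20, 80],
]
print(one_vs_rest(toy, 1), overall_accuracy(toy))
```

Because each class is scored against all the others, a classifier can show high overall accuracy while still having modest sensitivity for a minority class, which is why Table I reports the metrics separately for Class S and Class V.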

### A. Hardware Implementations

- 1) FPGA Implementation: The proposed CA classifier is first implemented on a Xilinx Virtex-7 FPGA for initial functional verification of the architecture. Table III presents the resource utilization of the proposed design. Table III shows that our design uses 1.52% of the total available resources, making it resource efficient.
- 2) ASIC Implementation: The DNN based CA classifier is implemented as an ASIC employing 180nm Bulk CMOS

TABLE III
FPGA RESOURCE UTILIZATION

| Resources       | Total Available | Resources Utilized | Percentage Utilization(%) |
|-----------------|-----------------|--------------------|---------------------------|
| Slice LUT       | 303600          | 11125              | 3.66                      |
| Slice REG       | 607200          | 4884               | 0.80                      |
| F7 MUX          | 151800          | 1080               | 0.71                      |
| F8 MUX          | 75900           | 255                | 0.33                      |
| IOB             | 600             | 24                 | 4                         |
| BUFGCTRL        | 32              | 1                  | 3.125                     |
| Total Resources | 1139132         | 17369              | 1.52                      |
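The totals in Table III can be cross-checked with a few lines of arithmetic. The Python sketch below simply re-adds the rows of the table; the dictionary and function names are hypothetical.

```python
# (total available, resources utilized) per row of Table III
TABLE_III = {
    "Slice LUT": (303600, 11125),
    "Slice REG": (607200, 4884),
    "F7 MUX": (151800, 1080),
    "F8 MUX": (75900, 255),
    "IOB": (600, 24),
    "BUFGCTRL": (32, 1),
}


def utilization(table):
    """Return total available, total used and overall percentage used."""
    available = sum(a for a, _ in table.values())
    used = sum(u for _, u in table.values())
    return available, used, 100.0 * used / available


available, used, percent = utilization(TABLE_III)
print(available, used, round(percent, 2))
```

The summed counts reproduce the "Total Resources" row of the table.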



| Parameter     | Value                                      |
|---------------|--------------------------------------------|
| CMOS Process  | SCL 180nm                                  |
| Area          | $1.32\,mm^2$ ($1.15\,mm \times 1.15\,mm$)  |
| Voltage       | 1.98V                                      |
| Frequency     | 12kHz                                      |
| Dynamic Power | $7.5403\,\mu W$                            |
| Static Power  | $1.2097\,\mu W$                            |

Fig. 2. The layout photograph of Co-processor and its specifications.

TABLE IV
COMPARISON RESULTS OF ASIC IMPLEMENTATION

| Method                 | [9]                   | [21]                | [22]                  | [3]                 | Proposed              |
|------------------------|-----------------------|---------------------|-----------------------|---------------------|-----------------------|
| Process                | 180nm                 | 65nm                | 40nm                  | 40nm                | 180nm                 |
| Frequency (Hz)         | 10k-25M               | 10k                 | 1M                    | 10k                 | 12k                   |
| VDD (V)                | 1.8                   | 1                   | 1                     | 1.1                 | 1.98                  |
| Power (W)              | $13.34\mu$            | $2.78\mu$           | 14.14m\*              | $3.76\mu$           | $8.75\mu$             |
| Area (mm<sup>2</sup>)  | 0.9250                | 0.112               | 0.135                 | 0.12                | 1.32                  |
| Model Utilized         | ANN                   | Naive Bayes         | SVM                   | WLC+SVM             | DNN                   |
| Evaluation Scheme      | Patient-Specific (PS) | Class-Oriented (CO) | Subject-Oriented (SO) | Class-Oriented (CO) | Subject-Oriented (SO) |
| Overall Accuracy (%)   | 99.39                 | 86                  | 88.06                 | 98.2                | 91.6                  |
| Figure of Merit (pJ)\*\* | $3.36 \times 10^{3}$ | NR\*\*\*            | $7.23 \times 10^{6}$  | NR\*\*\*            | $2.08 \times 10^{6}$  |
| Number of Classes      | 5                     | 2                   | 3                     | 2                   | 5                     |

\*Power is calculated from the energy and frequency reported in [22].

\*\*FoM is classification time × power.

\*\*\*NR: Not Reported.

technology available with SCL Chandigarh. After placement and routing using Synopsys IC Compiler, the design operates at 12 kHz, consuming $8.75\mu W$ in an area of $1.32mm^2$.

Table IV compares our proposed classifier with some well-known state-of-the-art hardware CA classifiers. Table IV shows that our proposed architecture exhibits higher accuracy than the methods reported in [21] and [22], while its overall accuracy is lower than that of the methods reported in [9] and [3]. The design in [3] achieves higher accuracy because it is designed for two-class classification only. In contrast, our method performs five-class classification as per the AAMI standards described in Section III. On the other hand, [9] reports better accuracy for five-class classification. However, the method in [9] uses a patient-specific approach instead of a subject-oriented approach to classify arrhythmia into five different classes. Also, [9] employs a conditional biasing scheme, in which the data from different classes are merged to perform classification, which leads to better accuracy because more data are available for a particular class at training time. In comparison, our method does not employ biasing of data in any form and is implemented as per the AAMI standards described in [8]. Note that in the subject-oriented method, the classifier is trained on diverse ECG beats and tested on completely blind data. It therefore performs better in practical wearable healthcare applications, classifying ECG beats correctly even for unknown individuals. However, the design reported in [9] exhibits the best performance in patient-specific classification.

Further, Table IV shows that our architecture has higher power consumption than the designs reported in [3] and [21]. These designs perform two-class classification only, which uses less power than our proposed design. Also, their Figure of Merit (FoM) cannot be compared because the necessary data are unavailable. Table IV further shows that our proposed architecture has a higher (that is, less favorable) FoM than the design reported in [9], for the following reasons. In [9], the design is implemented using a patient-specific approach at a fixed heart rate, which requires retraining for every individual under expert intervention for correct marking of the ECG beats. Further, this design employs conditional grouping and biased training, where data of some classes are merged with other classes. Thus, it requires only a small network to achieve high accuracy in the classification of grouped classes. However, the classification of merged classes is not recommended by AAMI. Also, in [9], classification time and power consumption are presented only for the classifier module, because the pre-processing of ECG signals is implemented in software.

On the other hand, in our proposed work, the complete co-processor, including ECG pre-processing and classification, is implemented in hardware to classify five classes of arrhythmia without any data grouping. Thus, the delay and power consumption associated with pre-processing lead to a higher FoM. Note that the proposed design performs classification in real time at only 12 kHz. Therefore, our design consumes less power than [9], which needs a higher operating frequency to sustain its operations. Moreover, the proposed design has a better FoM as well as lower power consumption than [22], in which both ECG pre-processing and the classifier are implemented in hardware to perform three-class classification with a subject-oriented approach. Therefore, our proposed design is a strong candidate for use in wearable devices.
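The figures quoted from Table IV and Fig. 2 can be related with simple arithmetic, using the table's definition of FoM as classification time × power. The Python sketch below uses only values reported in this brief; the implied classification time is derived here from that definition and is not a number reported in the text.

```python
POWER_W = 8.75e-6            # proposed design, total power
FOM_PROPOSED_PJ = 2.08e6     # Table IV
FOM_REF22_PJ = 7.23e6        # Table IV, design of [22]
DYNAMIC_UW = 7.5403          # Fig. 2 specification table
STATIC_UW = 1.2097

# FoM = classification time x power, so time = FoM / power (derived, not reported).
implied_time_s = FOM_PROPOSED_PJ * 1e-12 / POWER_W

# How many times smaller the proposed FoM is than that of [22].
ratio_vs_22 = FOM_REF22_PJ / FOM_PROPOSED_PJ

# Dynamic plus static power should give the total quoted in the text.
total_power_uw = DYNAMIC_UW + STATIC_UW

print(round(implied_time_s, 3), round(ratio_vs_22, 2), round(total_power_uw, 2))
```

Since FoM here is an energy per classification, a smaller value is better, which is why the ratio against [22] is read as an improvement while the comparison against [9] is not.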

# IV. CONCLUSION

In this brief, a low power, patient-independent, generic and adaptive ECG cardiac arrhythmia co-processor is proposed, incorporating a novel, simplified reduced-register shift multiplier to classify all five types of ECG beats as per AAMI standards. It employs a simple ECG pre-processing method and uses a complete ECG beat as input to the DNN, which allows the classifier to capture all the morphological changes in arrhythmic beats. The use of a simplified MAC unit, resource sharing and optimization of the softmax function make our co-processor area and power efficient. It consumes $8.75\mu W$ at 12 kHz with 91.6% accuracy when realized in 180nm CMOS technology. The proposed design has a 3.48× better FoM than [22], which also employs a patient-independent approach. Thus, the proposed ECG co-processor can be integrated into ultra-low power wearable healthcare devices for predicting cardiac arrhythmia efficiently.

# REFERENCES

- [1] A. B. de Luna, P. Coumel, and J. F. Leclercq, "Ambulatory sudden cardiac death: Mechanisms of production of fatal arrhythmia on the basis of data from 157 cases," *Amer. Heart J.*, vol. 117, no. 1, pp. 151–159, 1989.
- [2] A. L. Bui and G. C. Fonarow, "Home monitoring for heart failure management," *J. Amer. Coll. Cardiol.*, vol. 59, no. 2, pp. 97–104, 2012.
- [3] Z. Chen *et al.*, "An energy-efficient ECG processor with weak-strong hybrid classifier for arrhythmia detection," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 65, no. 7, pp. 948–952, Jul. 2018.
- [4] M. A. Sohail, Z. Taufique, S. M. Abubakar, W. Saadeh, and M. A. B. Altaf, "An ECG processor for the detection of eight cardiac arrhythmias with minimum false alarms," in *Proc. IEEE Biomed. Circuits Syst. Conf. (BioCAS)*, Nara, Japan, 2019, pp. 1–4.
- [5] P. De Chazal, M. O'Dwyer, and R. B. Reilly, "Automatic classification of heartbeats using ECG morphology and heartbeat interval features," *IEEE Trans. Biomed. Eng.*, vol. 51, no. 7, pp. 1196–1206, Jul. 2004.
- [6] C. Ye, B. V. K. V. Kumar, and M. T. Coimbra, "Heartbeat classification using morphological and dynamic features of ECG signals," *IEEE Trans. Biomed. Eng.*, vol. 59, no. 10, pp. 2930–2941, Oct. 2012.
- [7] S. Raj and K. C. Ray, "Sparse representation of ECG signals for automated recognition of cardiac arrhythmias," *Expert Syst. Appl.*, vol. 105, pp. 49–64, Sep. 2018.
- [8] S. S. Xu, M.-W. Mak, and C.-C. Cheung, "Towards end-to-end ECG classification with raw signal extraction and deep neural networks," *IEEE J. Biomed. Health Inform.*, vol. 23, no. 4, pp. 1574–1584, Jul. 2019.
- [9] Y. Zhao, Z. Shang, and Y. Lian, "A 13.34 μW event-driven patient-specific ANN cardiac arrhythmia classifier for wearable ECG sensors," *IEEE Trans. Biomed. Circuits Syst.*, vol. 14, no. 2, pp. 186–197, Apr. 2020.
- [10] W. J. Tompkins, *Biomedical Digital Signal Processing*. Englewood Cliffs, NJ, USA: Prentice-Hall, 1993.
- [11] S. K. Jain and B. Bhaumik, "An energy efficient ECG signal processor detecting cardiovascular diseases on smartphone," *IEEE Trans. Biomed. Circuits Syst.*, vol. 11, no. 2, pp. 314–323, Apr. 2017.
- [12] A. I. Kalyakulina *et al.*, "Finding morphology points of electrocardiographic-signal waves using wavelet analysis," *Radiophys. Quantum Electron.*, vol. 61, nos. 8–9, pp. 689–703, 2019.
- [13] M. E. Nojehdeh, L. Aksoy, and M. Altun, "Efficient hardware implementation of artificial neural networks using approximate multiply-accumulate blocks," in *Proc. IEEE Comput. Soc. Annu. Symp. VLSI (ISVLSI)*, Limassol, Cyprus, 2020, pp. 96–101.
- [14] G. B. Moody and R. G. Mark, "The impact of the MIT-BIH arrhythmia database," *IEEE Eng. Med. Biol. Mag.*, vol. 20, no. 3, pp. 45–50, May/Jun. 2001.
- [15] Testing and Reporting Performance Results of Cardiac Rhythm and ST Segment Measurement Algorithms, AAMI Recommended Practice/American National Standard AAMI EC57, 1998.
- [16] J. H. Tan *et al.*, "Application of stacked convolutional and long short-term memory network for accurate identification of CAD ECG signals," *Comput. Biol. Med.*, vol. 94, pp. 19–26, Mar. 2018.
- [17] T. J. Jun, H. J. Park, N. H. Minh, D. Kim, and Y.-H. Kim, "Premature ventricular contraction beat detection with deep neural networks," in *Proc. 15th IEEE Int. Conf. Mach. Learn. Appl. (ICMLA)*, Anaheim, CA, USA, 2016, pp. 859–864.
- [18] U. R. Acharya, H. Fujita, O. S. Lih, Y. Hagiwara, J. H. Tan, and M. Adam, "Automated detection of arrhythmias using different intervals of tachycardia ECG segments with convolutional neural network," *Inf. Sci.*, vol. 405, pp. 81–90, Sep. 2017.
- [19] Y.-C. Yeh, C. W. Chiou, and H.-J. Lin, "Analyzing ECG for cardiac arrhythmia using cluster analysis," *Expert Syst. Appl.*, vol. 39, no. 1, pp. 1000–1010, 2012.
- [20] S. K. Pandey and R. R. Janghel, "Automatic detection of arrhythmia from imbalanced ECG database using CNN model with SMOTE," *Aust. Phys. Eng. Sci. Med.*, vol. 42, no. 4, pp. 1129–1139, 2019.
- [21] N. Bayasi, T. Tekeste, H. Saleh, B. Mohammad, A. Khandoker, and M. Ismail, "Low-power ECG-based processor for predicting ventricular arrhythmia," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 24, no. 5, pp. 1962–1974, May 2016.
- [22] Y. Xu, Z. Chen, F. Li, and J. Meng, "A granular resampling method and adaptive speculative mechanism-based energy-efficient architecture for multiclass heartbeat classification," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 38, no. 11, pp. 2172–2176, Nov. 2019.