Data Collection and Analysis of Energy Consumption of Mobile Phones using Machine Learning Techniques


In modern society, smartphones are widely acknowledged to play a dominant role in everyday life. At the press of a button, one can catch up on current events worldwide, get in touch with people all over the world, and find various forms of entertainment. One of the features that makes smartphones so attractive is their portability, which they owe to running on batteries. However, batteries hold only a finite charge, so the lifespan of a device is directly tied to how it is used and to its charging strategy.

This thesis focuses on analysing mobile phone usage and predicting the battery’s energy drain. For data collection, the application “BatteryApp” was developed; it periodically records the device’s usage and battery information. Next, similar device usage patterns are grouped through Hierarchical Clustering, which does not require the number of clusters to be chosen a priori and places no restrictions on the distance function. Each resulting cluster is then assessed on its content in order to select the clusters with the highest information value. Lastly, the energy drain is predicted with four models: a simple linear model; two penalized variants of linear regression (Ridge and Lasso Regression); and a non-linear model from the Ensemble Learning category (eXtreme Gradient Boosted trees). The parameters are learned separately for each selected cluster.

Georgios Balaouras
Electrical & Computer Engineering Department
Aristotle University of Thessaloniki, Greece
October 2020


pip install numpy
pip install pandas
pip install scikit-learn
pip install scipy
pip install gower
pip install seaborn
pip install matplotlib
pip install xgboost
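
The clustering and per-cluster regression steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the data is synthetic, the feature columns are hypothetical, Manhattan distance from SciPy stands in for whichever distance the thesis uses (the `gower` package in the list above suggests a mixed-type distance, but that is an assumption), the cut threshold is arbitrary, and the XGBoost model is left out to keep the example short.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist
from sklearn.linear_model import Lasso, LinearRegression, Ridge

rng = np.random.default_rng(0)

# Synthetic stand-in for per-session usage features (NOT the thesis data).
# Hypothetical columns: mean CPU usage, mean brightness, share of time on WiFi.
light_use = rng.normal(loc=[20.0, 50.0, 0.2], scale=[2.0, 5.0, 0.05], size=(30, 3))
heavy_use = rng.normal(loc=[80.0, 200.0, 0.9], scale=[2.0, 5.0, 0.05], size=(30, 3))
X = np.vstack([light_use, heavy_use])

# Synthetic energy-drain target: linear in the features plus noise.
y = X @ np.array([0.05, 0.01, 1.0]) + rng.normal(scale=0.1, size=len(X))

# Agglomerative clustering works on a condensed pairwise-distance matrix,
# so any distance function can be plugged in at this point.
distances = pdist(X, metric="cityblock")
tree = linkage(distances, method="average")

# Cut the dendrogram at a distance threshold (arbitrary here) rather than
# fixing the number of clusters in advance.
labels = fcluster(tree, t=100.0, criterion="distance")

# Fit each model separately on every cluster; alpha values are placeholders.
model_factories = {
    "linear": LinearRegression,
    "ridge": lambda: Ridge(alpha=1.0),
    "lasso": lambda: Lasso(alpha=0.01),
}
scores = {}
for cluster_id in np.unique(labels):
    mask = labels == cluster_id
    for name, make_model in model_factories.items():
        model = make_model().fit(X[mask], y[mask])
        # In-sample R^2, for illustration only; a real evaluation
        # would use held-out sessions.
        scores[(int(cluster_id), name)] = model.score(X[mask], y[mask])

for key, value in sorted(scores.items()):
    print(key, round(value, 3))
```

Cutting the tree by distance keeps the number of clusters an outcome of the data rather than an input, which matches the motivation given for Hierarchical Clustering above.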

Raw data variables

Battery information

Name Description
level The current battery level [0-100].
temperature The current battery temperature in °C.
voltage The current battery voltage level in V.
technology String describing the technology of the battery.
status Categorical variable for the current battery status.
health Categorical variable for the current battery health.
availCapacityPercentage The current remaining battery capacity [0-100].

Phone usage information

Name Description
usage The estimation of the current CPU load [0-100].
WiFi Boolean variable if WiFi is enabled.
Cellular Boolean variable if Cellular Data Connection is enabled.
Hotspot Boolean variable if the device is used as a WiFi access point.
GPS Boolean variable if GPS is enabled.
Bluetooth Boolean variable if Bluetooth is enabled.
RAM The current percentage of available RAM [0-100].
Brightness The current screen brightness.
isInteractive Boolean variable if the user interacts with the device.

Logistic information

Name Description
_id Database id unique for each measurement.
ID The unique user ID used for anonymity reasons.
SampleFreq The user-specified sampling frequency (default: one sample every 10 seconds).
brandModel The brand and model of the device.
androidVersion The Android version of the device.
Timestamp Unix timestamp of each measurement.
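
As an illustration of how these variables might be combined, the sketch below builds a few hypothetical measurements, converts the Unix timestamps, and derives a simple drain rate per user. The sample values are invented, `Timestamp` is assumed to be in seconds (it may well be milliseconds in the real data), and "battery level dropped per hour" is only one possible definition of energy drain, not necessarily the one used in the thesis.

```python
import pandas as pd

# Hypothetical measurements using the column names from the tables above.
# The values are invented for illustration.
samples = pd.DataFrame(
    {
        "ID": ["user_a"] * 4,
        "Timestamp": [1600000000, 1600000010, 1600000020, 1600000030],
        "level": [80, 80, 79, 79],
        "usage": [35.0, 40.0, 55.0, 50.0],
        "WiFi": [True, True, True, False],
    }
)

# ASSUMPTION: timestamps are in seconds; use unit="ms" for milliseconds.
samples["time"] = pd.to_datetime(samples["Timestamp"], unit="s")
samples = samples.sort_values(["ID", "Timestamp"])


def drain_per_hour(group):
    """Battery level points lost per hour between first and last sample."""
    elapsed_hours = (group["Timestamp"].iloc[-1] - group["Timestamp"].iloc[0]) / 3600
    dropped = group["level"].iloc[0] - group["level"].iloc[-1]
    return dropped / elapsed_hours if elapsed_hours > 0 else float("nan")


# One drain value per anonymised user ID.
drain = {user: drain_per_hour(group) for user, group in samples.groupby("ID")}
print(drain)
```

With samples 10 seconds apart, the battery level changes rarely between consecutive rows, so a rate computed over a whole session is likely to be more stable than one computed between neighbouring measurements.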

Directory Structure

├── ..
 |       ├── BatteryApp: Contains the source code of BatteryApp.
 |       ├── experiment_1: All sessions of one user.
 |       ├── experiment_2: Sessions under 30 minutes of all users with at least 20 files.
 |       ├── experiment_3: Sessions under 30 minutes of all users with at least 20 files and the same total battery capacity.
 |       ├── server: Contains the scripts for hosting the server and the database.
 |       ├── preprocessing: Contains the scripts for exporting and preprocessing the data.
 |       ├── data/csvFiles: Contains the files as exported and checked from the Database.


Java Python Flask & Waitress MongoDB




1.1.2 & 1.4.4




Licence: GPL