This repository documents the tools used to analyze two datasets for the course ELEC-H550 - Embedded System Security at the Université libre de Bruxelles. The work addresses the research question: "How can datasets from IoT devices, including population-level and individual-level data, be used to analyze user behaviors, and what are the privacy and security implications of such analyses?"
- Virgile Devolder
- Martin Devolder
- Corentin Bouffioux
This repository lists and provides documentation for the two tools used to analyze the following datasets:
- Dataset One
  - This dataset covers approximately 62 million users across four countries. It is openly available and originates from a previous study published online. Python scripts were used to analyze device usage patterns, compare data across countries, and examine the outputs generated by specific devices. Python was chosen because it is well-suited for smaller datasets and offers practical libraries such as Matplotlib and pandas for data visualization and manipulation. The result is a cross-country view of IoT device usage.
  - Link: https://github.com/snudatalab/SmartSense
- Dataset Two
  - This dataset is part of the CIC IoT Dataset 2022, which simulates daily usage patterns of IoT devices. Our research focuses exclusively on these daily simulations to study user behavior. Rust was chosen for its performance and efficiency, which allow the dataset to be analyzed quickly. The analysis examines how users interact with individual IoT devices while remaining scalable.
  - Links: http://205.174.165.80/IOTDataset/CIC_IOT_Dataset2022/Dataset/5-Active/ and https://www.unb.ca/cic/datasets/iotdataset-2022.html
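As an illustration of the kind of cross-country comparison described for Dataset One, the following is a minimal sketch using pandas. It is not the repository's actual script: the column names (`country`, `device`), the country codes, and the sample rows are hypothetical placeholders, and the real SmartSense files have their own schema.

```python
import pandas as pd

# Hypothetical sample records. The country codes, device names, and column
# names are placeholders and do not reflect the real dataset schema.
records = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B", "B", "C"],
        "device": ["TV", "Light", "TV", "TV", "Speaker", "Light"],
    }
)


def usage_by_country(df: pd.DataFrame) -> pd.DataFrame:
    """Count events per country, with one column per device type."""
    return df.groupby(["country", "device"]).size().unstack(fill_value=0)


def usage_share(counts: pd.DataFrame) -> pd.DataFrame:
    """Turn per-country counts into per-country shares (each row sums to 1)."""
    return counts.div(counts.sum(axis=1), axis=0)


table = usage_by_country(records)
shares = usage_share(table)
print(table)
print(shares.round(2))
```

Working with shares rather than raw counts keeps countries with very different numbers of users comparable. A table like `shares` could then be drawn with Matplotlib through pandas, for example with `shares.plot(kind="bar", stacked=True)`.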
The tools and methodologies documented here aim to:
- Explore user behavior patterns through IoT device data.
- Highlight privacy and security implications associated with such analyses.
Each tool is explained in its own section, along with guidelines on how to apply it to its dataset. Refer to the corresponding documentation files for detailed usage instructions.
Each dataset's folder contains a "data" folder with all the generated graphs. To regenerate them, go to the tool folder linked to the dataset and follow the instructions there.
This work is part of the academic requirements for the course ELEC-H550 - Embedded System Security. Special thanks to the course instructors for their guidance and support.
This repository is licensed under the MIT License. See the LICENSE file for details.