Mock Pipeline

Heet Sankesara edited this page Nov 7, 2022 · 3 revisions

To run the mock pipeline, specify the mock configuration in config.yaml. Refer to Configuration for details on setting the configuration.

Mock Data

The mock pipeline downloads its data from the mockdata repository and saves it as a submodule. The repository contains sample RADAR data for illustrating a pipeline run.

The data is stored in .csv.gz format, which the I/O module reads and converts into a Spark DataFrame for further processing.
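
As a rough illustration of the file format only, a gzip-compressed CSV file can be read with Python's standard library. This is not the pipeline's I/O module, which produces a Spark DataFrame; the file name, column names, and helper functions below are hypothetical and exist only to keep the sketch self-contained.

```python
import csv
import gzip
import os
import tempfile


def write_sample(path):
    """Write a tiny, made-up sample file so the sketch can run on its own."""
    with gzip.open(path, mode="wt", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["user_id", "time", "steps"])  # hypothetical columns
        writer.writerow(["user_a", "2022-11-07T08:00:00", "120"])
        writer.writerow(["user_a", "2022-11-07T09:00:00", "80"])


def read_csv_gz(path):
    """Read a gzip-compressed CSV file into a list of dict rows."""
    # "rt" decompresses and decodes to text; newline="" is what the csv module expects.
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


sample_path = os.path.join(tempfile.mkdtemp(), "sample_steps.csv.gz")
write_sample(sample_path)
rows = read_csv_gz(sample_path)
```

Every value comes back as a string here, so any numeric or timestamp conversion would have to happen in a later step.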

Mock Features

PhoneBatteryChargingDuration

The amount of time each user's phone battery spends charging per day.
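
One possible way to compute such a duration is sketched below in plain Python. It assumes each battery sample is a (user_id, timestamp, is_charging) tuple and credits the gap between two consecutive samples to charging time when the earlier sample is marked as charging. Both the record layout and this attribution rule are assumptions made for illustration; the pipeline's actual feature may be defined differently.

```python
from collections import defaultdict
from datetime import datetime


def charging_duration_per_day(samples):
    """Return charging time in seconds keyed by (user_id, date).

    `samples` is an iterable of (user_id, iso_timestamp, is_charging)
    tuples; this layout is hypothetical.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, is_charging in samples:
        by_user[user_id].append((datetime.fromisoformat(timestamp), is_charging))

    totals = defaultdict(float)
    for user_id, points in by_user.items():
        points.sort(key=lambda point: point[0])
        # Walk over consecutive pairs of samples for this user.
        for (start, charging), (end, _) in zip(points, points[1:]):
            if charging:
                # The interval is credited to the date on which it starts.
                day = start.date().isoformat()
                totals[(user_id, day)] += (end - start).total_seconds()
    return dict(totals)


samples = [
    ("user_a", "2022-11-07T08:00:00", True),
    ("user_a", "2022-11-07T08:30:00", True),
    ("user_a", "2022-11-07T09:00:00", False),
    ("user_a", "2022-11-07T10:00:00", True),
    ("user_a", "2022-11-07T10:15:00", False),
]
durations = charging_duration_per_day(samples)
```

In this sketch an interval that crosses midnight is counted entirely on the day it starts; splitting it across both days would be a reasonable refinement.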

StepCountPerDay

The number of steps per day taken by each user.
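
The aggregation behind a feature like this can be sketched in plain Python as a sum of step counts grouped by user and calendar date. The (user_id, timestamp, steps) record layout is an assumption for illustration, and the pipeline itself operates on Spark DataFrames rather than Python lists.

```python
from collections import defaultdict
from datetime import datetime


def step_count_per_day(records):
    """Sum step counts per (user_id, date).

    `records` is an iterable of (user_id, iso_timestamp, steps) tuples;
    this layout is hypothetical.
    """
    totals = defaultdict(int)
    for user_id, timestamp, steps in records:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        totals[(user_id, day)] += int(steps)
    return dict(totals)


records = [
    ("user_a", "2022-11-06T23:50:00", 30),
    ("user_a", "2022-11-07T08:00:00", 120),
    ("user_a", "2022-11-07T09:00:00", 80),
    ("user_b", "2022-11-07T10:00:00", 50),
]
daily_steps = step_count_per_day(records)
```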

Mock Output

The output consists of two CSV files: phone_battery_charging_duration.csv and step_count_per_day.csv.