# Deep Activity Learning (DAL)

This project tries to outperform random forest activity learning (AL) with deep activity learning.

Goals:

- Compare random forest activity learning (AL) with deep activity learning
- Compare the previous AL features with a simpler feature vector
- Add domain adaptation and generalization to the deep network for further improvement

Steps:

- Preprocess the data: extract the desired features, create the time-series windows, and generate the cross-validation train/test splits (see generate_datasets.sh, preprocessing/, etc.); a windowing sketch follows this list
- Run and compare AL (al.py) and DAL (dal.py) on the datasets
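As a rough illustration of the windowing step, here is a minimal sketch of slicing a per-event feature sequence into fixed-length windows. This is not the code in preprocessing/; the window size, labeling rule, and function name are assumptions.

```python
import numpy as np

def make_windows(features, labels, window_size=100):
    """Slice a (num_events, num_features) sequence into fixed-length windows.

    Hypothetical sketch: each window is labeled with the activity of its
    last event; the real preprocessing may window and label differently.
    """
    windows, window_labels = [], []
    for start in range(0, len(features) - window_size + 1, window_size):
        end = start + window_size
        windows.append(features[start:end])
        window_labels.append(labels[end - 1])  # label of the final event
    return np.stack(windows), np.array(window_labels)

# Example: 1000 events with 10 features each and integer activity labels
x = np.random.rand(1000, 10).astype(np.float32)
y = np.random.randint(0, 5, size=1000)
windows, window_labels = make_windows(x, y)
print(windows.shape, window_labels.shape)  # (10, 100, 10) (10,)
```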

## Datasets

This code is designed to work with smart home datasets in the formats used on the CASAS website. To download some smart home data to preprocessing/orig, convert it into the appropriate annotated format, and output the result to preprocessing/raw, run:

```bash
./download_datasets.sh
```
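For orientation, the annotated format consists of timestamped sensor events, optionally tagged with activity begin/end markers, roughly like the illustrative (made-up) lines below:

```
2010-11-04 00:03:50.209589 M003 ON Sleeping begin
2010-11-04 00:03:57.399391 M003 OFF
2010-11-04 05:40:51.303739 M004 ON Sleeping end
```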

## Preprocessing

To apply the activity label and sensor translations, generate the desired feature representations and time-series windows, and create the .tfrecord files, run:

```bash
./generate_datasets.sh
```
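For reference, writing one window to a .tfrecord file looks roughly like the sketch below. The feature names ("x", "y") and layout are assumptions; see generate_tfrecord.py for the schema this repo actually uses.

```python
import numpy as np
import tensorflow as tf

def serialize_window(window, label):
    """Encode a (window_size, num_features) window and its integer label
    as a tf.train.Example. Feature names "x" and "y" are assumptions."""
    example = tf.train.Example(features=tf.train.Features(feature={
        "x": tf.train.Feature(
            float_list=tf.train.FloatList(value=window.flatten())),
        "y": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[int(label)])),
    }))
    return example.SerializeToString()

window = np.random.rand(100, 10).astype(np.float32)
with tf.io.TFRecordWriter("example.tfrecord") as writer:
    writer.write(serialize_window(window, label=3))
```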

Note: many of the Bash scripts use my multi-threading/processing script /scripts/threading to drastically speed up the preprocessing, so you'll want to either remove those statements or download that script.

## Running

### AL

To train AL (which uses random forests) and compute the results:

```bash
./al_results.sh
```
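The AL baseline boils down to fitting a random forest on the extracted feature vectors. As a minimal stand-in (scikit-learn with made-up data, not al.py's actual code):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Hypothetical train/test folds; in this repo the feature vectors
# come from the preprocessing step above
x_train = np.random.rand(500, 30)
y_train = np.random.randint(0, 5, size=500)
x_test = np.random.rand(100, 30)
y_test = np.random.randint(0, 5, size=100)

# Fit a random forest on the activity feature vectors and score it
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(x_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(x_test)))
```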

### DAL

If running locally on your computer, run the cross-validation training:

```bash
./dal_cv.sh
```

If running on a cluster (after editing kamiak_config.sh):

```bash
./kamiak_upload.sh
./kamiak_queue_all.sh flat --dataset=al.zip --features=al --flat
./kamiak_queue_all.sh flat-da --dataset=al.zip --features=al --flat --adapt

# On your local computer during training, to download the logs/models:
./kamiak_tflogs.sh
```

Then, to pick the best models based on the validation results above (unless using domain adaptation, in which case pick the last model) and evaluate them on the entire train and test sets for comparison with AL:

```bash
./dal_results.sh flat --features=al   # set "from" to either kamiak or cv
./dal_results.sh flat-da --features=al --last
```
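Conceptually, the model-selection step keeps the checkpoint with the highest validation accuracy, or simply the last checkpoint when --last is given; a hypothetical sketch of that choice:

```python
def pick_model(val_accuracies, use_last=False):
    """Choose a checkpoint step given {step: validation accuracy}.

    Hypothetical sketch of the selection dal_results.sh performs: with
    domain adaptation (--last), take the final checkpoint; otherwise
    take the checkpoint with the best validation accuracy.
    """
    if use_last:
        return max(val_accuracies)  # highest step number
    return max(val_accuracies, key=val_accuracies.get)

val_acc = {1000: 0.81, 2000: 0.86, 3000: 0.84}
print(pick_model(val_acc))                 # 2000: best validation accuracy
print(pick_model(val_acc, use_last=True))  # 3000: last checkpoint
```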