hwfluid/block-random
A block-random algorithm for learning on distributed, heterogeneous data

Python environments

The quickest way to set up the Python environment is with pipenv: run pipenv install and then pipenv shell.

Benchmarks on the EMNIST classification data sets

Reproducing the benchmark results from the paper entails the following steps:

  1. Download and unzip the EMNIST data sets into the ./data/emnist-dataset folder.

  2. Train the CNN on the different data sets with shuffled, sorted, and block-random orderings at different batch sizes:

    $ python classification-tests.py -d fashion
    $ python classification-tests.py -d digits
    $ python classification-tests.py -d letters
    $ python classification-tests.py -d byclass
    $ python classification-tests.py -d bymerge
    $ python classification-tests.py -d balanced
    $ python classification-tests.py -d mnist

    The outputs are stored in the batch_size_study directory.

  3. Plot the results: $ python plot.py
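For intuition, the block-random ordering compared above can be sketched as follows. This is a minimal illustration of the general idea, assuming that contiguous blocks of samples are kept intact and only the order of the blocks is shuffled; the function name and interface are hypothetical, not the repository's actual implementation (see the paper for the precise algorithm).

```python
import random

def block_random_order(n_samples, block_size, seed=None):
    """Illustrative sketch (not the repo's code): split the sample
    indices into contiguous blocks, shuffle the order of the blocks,
    and preserve the within-block order."""
    rng = random.Random(seed)
    # Partition indices 0..n_samples-1 into contiguous blocks.
    blocks = [list(range(i, min(i + block_size, n_samples)))
              for i in range(0, n_samples, block_size)]
    # Randomize only the block order, not the samples within a block.
    rng.shuffle(blocks)
    return [i for block in blocks for i in block]
```

This sits between a full per-sample shuffle (expensive on distributed data) and a sorted ordering (which biases each batch toward one class): each node can stream its local block sequentially while the global block order is still randomized.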

Predicting $\tau_{ij}$ for LES of channel flow

Reproducing the channel flow results from the paper entails the following steps:

  1. Generate the filtered data from the DNS: $ python scaling.py. This creates scaled.npy in the data directory, containing the filtered velocities, velocity gradients, and $\tau_{ij}$ terms.

  2. Generate the training and test data for the various runs:

    $ python gen_data.py -o shuffled-1m -p shuffled -b 16 -n 1000000
    $ python gen_data.py -o shuffled-16 -p shuffled -b 16
    $ python gen_data.py -o block-16 -p block -b 16
    $ python gen_data.py -o sorted-16 -p sorted -b 16
    
  3. Perform hyperparameter sweeps using the shell script:

    $ sh parameter_sweeps.sh 1
    $ sh parameter_sweeps.sh 2
    $ sh parameter_sweeps.sh 3
    $ sh parameter_sweeps.sh 4
    
  4. Plot comparisons of results: $ python compare_runs.py -r runs

  5. Train the models using different types of algorithms: $ sh model_runs.sh

  6. Plot a given model result: $ python plot_run.py -r runs/${directory_name}
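For context, the $\tau_{ij}$ predicted above is the subgrid-scale stress tensor that arises when the Navier-Stokes equations are filtered for LES. In the usual convention, with tildes denoting filtered quantities, it is defined as

```latex
\tau_{ij} = \widetilde{u_i u_j} - \tilde{u}_i \, \tilde{u}_j
```

i.e., the difference between the filtered product of velocities and the product of filtered velocities, which is the unclosed term the trained models aim to predict from the filtered velocities and gradients in scaled.npy.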

Citation for this work

@article{Mohan19,
    author    = {P. Mohan and M. T. Henry de Frahan and R. King and R. W. Grout},
    title     = {A block-random algorithm for learning on distributed, heterogeneous data},
    journal   = {arXiv:1903.00091},
    year      = {2019}
}
