Multiresolution Tensor Learning

This repository contains code for the multiresolution tensor learning model (MRTL).

Paper: Jung Yeon (John) Park, Kenneth (Theo) Carr, Stephan Zheng, Yisong Yue, Rose Yu. Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis. ICML 2020.

Requirements

Make sure Miniconda or Anaconda is installed (https://docs.anaconda.com/anaconda/install/). Then create and activate the environment using the provided environment file, environment.yml.

conda env create --name $NAME -f environment.yml
conda activate $NAME

Description

Description of subfolders:

data/: process raw data and create pytorch dataset
config/: global configuration parameters
train/: contains models
visualization/: plotting tools

Basketball

Dataset and Preprocessing

STATS SportVU player tracking data from the 2012-2013 NBA season was used (Yue et al., 2014). Because this data is proprietary, this repo contains only the preprocessing code.

See raw.py and read_raw.py.

Read raw data

python data/basketball/read_raw.py \
    --input-dir $RAW_DATA_DIR \
    --output-dir $OUTPUT_DIR

read_raw.py produces text files listing the used and discarded possessions, an intermediate pickle file containing all columns (cleaned_raw.pkl), and the final preprocessed data (full_data.pkl).
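
As a quick sanity check after preprocessing, the output can be inspected from Python. The sketch below assumes only that full_data.pkl is a standard pickle file located in $OUTPUT_DIR; the type and columns of the stored object are not documented here, so the helpers (hypothetical names `load_preprocessed` and `describe`, not part of this repo) simply load the object and report its type and length. If the file was written from a pandas object, pandas must be importable for unpickling to succeed.

```python
import pickle
from pathlib import Path


def load_preprocessed(output_dir, filename="full_data.pkl"):
    """Load a pickle written by the preprocessing step (hypothetical helper)."""
    path = Path(output_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; run read_raw.py first")
    with path.open("rb") as f:
        return pickle.load(f)


def describe(obj):
    """Return the type name and, when the object has one, its length."""
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size
```

For example, `describe(load_preprocessed("/path/to/output"))` returns a pair such as the type name and the number of entries. Only unpickle files that you generated yourself or otherwise trust.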

Training

Run prepare_bball.py and then run_bball.py with arguments to run a single experiment. prepare_bball.py creates the training dataset and sets miscellaneous parameters for a single trial (logger, seed, etc.).

python prepare_bball.py \
    --root-dir $RUN_DIR \
    --data-dir $DATA_DIR

python run_bball.py \
    --root-dir $RUN_DIR \
    --data-dir $DATA_DIR \
    --type $TYPE \
    --stop-cond $STOP_CONDITION \
    --batch_size $BATCH_SIZE \
    --sigma $SIGMA \
    --K $K \
    --step_size 1 \
    --gamma 0.95 \
    --full_lr $FULL_LR \
    --full_reg $FULL_REG \
    --low_lr $LOW_LR \
    --low_reg $LOW_REG
    --low_reg $LOW_REG

Helper scripts in src/ run 10 trials each of the fixed-resolution vs. multiresolution experiments and the stopping-condition experiments.
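
For readers who prefer to drive repeated trials from Python, the following is a minimal sketch of such a loop. It is not the code in src/. It assumes that each trial gets its own root directory named `trial_<i>` under the run directory, that prepare_bball.py is run once per trial before run_bball.py, and that the remaining run_bball.py arguments shown above are passed through unchanged via `extra_args`. The function names are hypothetical, and valid values for `--type` are not listed in this README, so `run_type` is left to the caller.

```python
import subprocess
import sys
from pathlib import Path


def build_trial_commands(run_dir, data_dir, run_type, n_trials=10, extra_args=()):
    """Return one (prepare, run) command pair per trial.

    Assumption: each trial uses its own root directory, run_dir/trial_<i>.
    """
    pairs = []
    for i in range(n_trials):
        root = str(Path(run_dir) / f"trial_{i}")
        prepare = [sys.executable, "prepare_bball.py",
                   "--root-dir", root, "--data-dir", str(data_dir)]
        run = [sys.executable, "run_bball.py",
               "--root-dir", root, "--data-dir", str(data_dir),
               "--type", run_type, *extra_args]
        pairs.append((prepare, run))
    return pairs


def run_trials(pairs):
    """Execute each pair in order, stopping at the first failing command."""
    for prepare, run in pairs:
        subprocess.run(prepare, check=True)
        subprocess.run(run, check=True)
```

Building the command lists separately from executing them makes it easy to print and review the commands before launching a long run.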

Climate

Dataset and Preprocessing

There are two datasets, one consisting of precipitation over the U.S. (PRISM) and one consisting of global sea surface salinity and sea surface temperature (EN4).

PRISM data was accessed at https://prism.oregonstate.edu/. Monthly precipitation (ppt) data from 1895-2019 was used. Data from 1895-1980 can be downloaded from the "Historical Past" page, and data from 1981-2018 can be downloaded from the "Recent Years" page.
EN4 data was accessed at https://www.metoffice.gov.uk/hadobs/en4/download-en4-2-1.html. Objective analyses from 1900-2018 were used.

To run our code, first download all raw data into a single directory (downloaded files are in .zip format). Then, unzip the files and aggregate the data using the following command.

python data/climate/extract_data.py \
    --data_dir $DATA_DIR

Next, preprocess the oceanic and precipitation data into separate data files for all resolutions, using the following command. The files are saved in the netCDF4 format.

python data/climate/get_multires.py \
    --data_dir $DATA_DIR

Training

Run run_climate.py with arguments to run a single experiment. The method argument should be one of {mrtl, fixed, random}. run_climate_stop_cond.py compares the various stopping conditions. Results are saved in $SAVE_DIR.

python run_climate.py \
    --data_dir $DATA_DIR \
    --save_dir $SAVE_DIR \
    --experiment_name $RUN_NAME \
    --method mrtl \
    --K $K

python run_climate_stop_cond.py \
    --data_dir $DATA_DIR \
    --save_dir $SAVE_DIR \
    --experiment_name $RUN_NAME \
    --n_trials $TRIALS \
    --K $K
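
To compare the three methods on the same data, one run_climate.py command can be generated per method. The sketch below only assembles the command lines from the flags shown above. The function name `climate_commands`, the `<prefix>_<method>` experiment naming, and the example values in the `__main__` block are illustrative assumptions, not conventions of this repo.

```python
import shlex

METHODS = ("mrtl", "fixed", "random")  # the three values listed above


def climate_commands(data_dir, save_dir, k, prefix="run"):
    """Build one run_climate.py command per method.

    Assumption: experiment names follow '<prefix>_<method>'; any naming works.
    """
    commands = []
    for method in METHODS:
        commands.append([
            "python", "run_climate.py",
            "--data_dir", data_dir,
            "--save_dir", save_dir,
            "--experiment_name", f"{prefix}_{method}",
            "--method", method,
            "--K", str(k),
        ])
    return commands


if __name__ == "__main__":
    # Placeholder paths and K value; replace with your own settings.
    for cmd in climate_commands("data", "results", 3):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the lists can be passed directly to `subprocess.run`.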
