This repository has been archived by the owner on Mar 25, 2023. It is now read-only.

Ziyuan Chen & Zhirong Chen, 2022 Summer Research @ ZJU

AllenHeartcore/AQOURSNet_rsch22su

AQOURSNet: Time2Graph Rework

Ziyuan Chen, Zhirong Chen | July 2022
Summer Research @ Yang Yang Lab, Zhejiang University

This work is protected under the MIT License.
Copyright (c) 2022 Ziyuan Chen & Zhirong Chen unless otherwise noted.


Diagram

Running the Program

$ pip install -r Requirements.txt
$ python main.py ucr_dataset/dataset [--argument ARGUMENT]

For instance,

$ python main.py ucr_dataset/Strawberry --smpratio 1 --tail mlp --lr 5e-4 --dtw --kmedians --amp

Possible arguments are presented below.

| Category | Argument | Description | Default |
|---|---|---|---|
| Dataset | dataset | Name of dataset | Required |
| | --seed | Random seed | 42 |
| | --device | Device to use | cuda if available, else cpu |
| Shapelet & Graph | --nshapelet | Number of shapelets to extract | 30 |
| | --nsegment | Number of segments for mapping | 20 |
| | --smpratio | Pos/Neg ratio for up/downsampling (set to 0 = disable biased sampling, forced to 0 for multi-class datasets) | 0 |
| | --maxiter | Max number of KMeans iterations | 300 |
| | --tol | Tolerance of KMeans | 0.0001 |
| | --percent | Percentile for pruning weak edges | 30 |
| GAT*** | --dhidden | Hidden dimension | 256 |
| | --dembed | Embedding dimension of graph (output dimension of GAT) | 64 |
| | --nlayer | Number of layers | 4 |
| | --nhead | Number of attention heads | 8 |
| | --negslope | Negative slope of LeakyReLU | 0.2 |
| | --dropout | Dropout rate | 0.5 |
| | --tail | Type of prediction tail (One of none, linear, mlp, resnet) | linear |
| Training | --nepoch | Number of epochs | 100 |
| | --nbatch | Number of mini-batches | 16 |
| | --optim | Optimization algorithm for learning (See torch.optim algorithms for a list) | Adam |
| | --lr | Learning rate | 0.001 |
| | --wd | Weight decay | 0.001 |
| | --amp | Switch* for using Automatic Mixed Precision (Forced to False unless device is cuda) | False |
| | --f1 | Switch* for reporting F1 score in place of loss (Forced to 0 for multi-class datasets) | False |
| Enhancements | --ts2vec | Switch* for using TS2Vec** | False |
| | --ts2vec-dhidden | Hidden dimension of TS2Vec encoder | 64 |
| | --ts2vec-dembed | Embedding dimension of TS2Vec encoder | 320 |
| | --ts2vec-nlayer | Number of layers in TS2Vec encoder | 10 |
| | --dtw | Switch* for using Dynamic Time Warping | False |
| | --dtw-dist | Pointwise distance function of DTW (See scipy.spatial.distance.cdist for a list) | euclidean |
| | --dtw-step | Local warping step pattern of DTW (See dtw/stepPattern.py for a list) | symmetric2 |
| | --dtw-window | Windowing function of DTW (One of none, sakoechiba, itakura, slantedband) | none |
| | --kmedians | Switch* for using KMedians in place of KMeans in clustering | False |

* Switches have action='store_true': their presence means True, and absence means False.
  Usage like ... --switch True or ... --switch False would result in a parsing error.

** The condensed ts2vec.py (--ts2vec options) has not been thoroughly tested. Use with caution.
  In case it fails, delete ts2vec.py, and clone yuezhihan/ts2vec under the same folder.

*** For fine-tuned GAT and out-of-the-box TS2Vec, refer to rong-hash/Time2GraphRework.
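
The switch behaviour described in the first note can be reproduced with a few lines of argparse. This is a minimal sketch, not the parser in main.py: it declares only four of the documented options, and the way they are declared here is an assumption.

```python
import argparse

# Hypothetical parser mirroring a few documented options;
# the actual definitions in main.py may differ.
parser = argparse.ArgumentParser()
parser.add_argument("dataset")
parser.add_argument("--lr", type=float, default=0.001)
parser.add_argument("--amp", action="store_true")  # switch
parser.add_argument("--dtw", action="store_true")  # switch

# Presence of a switch means True, absence means False.
args = parser.parse_args(["ucr_dataset/Strawberry", "--amp"])
print(args.amp, args.dtw, args.lr)

# Giving a switch an explicit value is rejected by argparse,
# which reports the stray token and exits.
try:
    parser.parse_args(["ucr_dataset/Strawberry", "--amp", "True"])
    rejected = False
except SystemExit:
    rejected = True
print("explicit value rejected:", rejected)
```

Because a store_true switch consumes no value, the trailing True is left over as an unrecognized argument, which is why the second call fails.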

Model Pipeline

  1. Data preparation
    • read_dataset
  2. Time series ---extract---> Shapelets
    • extract_shapelets (Wrapper)
    • kmeans
    • TS2VEC (Enhancement)
  3. Time series & shapelets ---embed---> Series embedding
    • embed_series
    • dtw (Enhancement)
  4. Series embedding ---construct---> Graph
    • adjacency_matrix
  5. Graph ---embed---> Graph embedding
    • GraphDataset (Wrapper)
    • graph_dataloader (Wrapper)
    • NeuralNetwork
    • GAT
  6. Graph embedding ---predict---> Predicted classes
    • MultilayerPerceptron
    • FCResidualNetwork
    • train, test (Wrapper)
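
Steps 3 and 4 above can be illustrated with a small NumPy sketch. It is not the code in construct_graph.py. The function names nearest_shapelets and transition_graph are hypothetical, and three things are assumptions made for illustration: that segments and shapelets have equal length, that plain Euclidean distance is used (no DTW), and that the percentile threshold is taken over all matrix entries.

```python
import numpy as np

def nearest_shapelets(series, shapelets, nsegment):
    """Split a series into nsegment equal chunks and return the index
    of the closest shapelet (Euclidean distance) for each chunk."""
    seg_len = len(series) // nsegment
    segments = series[: seg_len * nsegment].reshape(nsegment, seg_len)
    # Pairwise distances, shape (nsegment, nshapelet)
    dists = np.linalg.norm(segments[:, None, :] - shapelets[None, :, :], axis=2)
    return dists.argmin(axis=1)

def transition_graph(assignments, nshapelet, percent):
    """Count transitions between consecutive segment assignments, zero out
    edges below the given percentile, and normalize each row."""
    adj = np.zeros((nshapelet, nshapelet))
    for src, dst in zip(assignments[:-1], assignments[1:]):
        adj[src, dst] += 1
    # Assumption: threshold computed over all entries, zeros included
    threshold = np.percentile(adj, percent)
    adj[adj < threshold] = 0
    row_sums = adj.sum(axis=1, keepdims=True)
    # Rows without outgoing edges stay all-zero
    return np.divide(adj, row_sums, out=np.zeros_like(adj), where=row_sums > 0)

# Toy data: three shapelets of length 2 and one series of length 8
shapelets = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
series = np.array([0.0, 0.0, 1.0, 1.0, 5.0, 5.0, 0.0, 0.0])

assignments = nearest_shapelets(series, shapelets, nsegment=4)
adj = transition_graph(assignments, nshapelet=3, percent=30)
print(assignments)
print(adj)
```

In this toy case each segment coincides with one shapelet, so the resulting graph is the cycle 0 → 1 → 2 → 0 with unit edge weights.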

Call Hierarchy

  • main.py
    • utils.py (#1, #5 Wrapper)
    • construct_graph.py (#2 Wrapper, #2, #3, #4)
      • ts2vec.py (#2 Enhancement)
      • dtw (#3 Enhancement)
    • network.py (#5, #6)
      • xgboost - TO BE IMPLEMENTED
    • process.py (#5 Wrapper, #6 Wrapper)

Folder Structure

  • Stanford_CS224w - a prerequisite course
    • *.py - condensed code for GNNs like GCN, GraphSAGE, and GAT, adapted from the course materials
    • dataset - datasets required by the demos (raw only)
    • deepsnap - auxiliary code from snap-stanford@GitHub. Copyright (c) 2019 DeepSNAP Team
  • ref_papers - papers associated with shapelets providing essential background knowledge
  • cached_programs - historical versions and experiments of KMeans, SVM, MLP, ResNet, Time2Vec, hierarchies, etc.
    • WARNING: Code in the cache is not optimized for environment compatibility and may not run properly.
  • affiliated_licenses - LICENSEs for code segments from yuezhihan, subhadarship, DTW, and pyg-team.
  • ucr_dataset - a neatly formatted version of the UCR Dataset in compressed .npz
    • The numpy arrays contained in each file have keys train_data, train_label, test_data, test_label
    • *_data has shape (num_samples, num_features), *_label has shape (num_samples,)
  • presentation - presentation materials including slides and diagrams
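
The .npz layout described for ucr_dataset can be read with plain NumPy. The sketch below is an illustration, not the repository's read_dataset: the helper name load_ucr_npz is hypothetical, and a small synthetic file stands in for a real dataset so the example runs on its own.

```python
import os
import tempfile

import numpy as np

def load_ucr_npz(path):
    """Return (train_data, train_label, test_data, test_label) from an
    .npz file that uses the four keys listed above."""
    with np.load(path) as archive:
        return (
            archive["train_data"],
            archive["train_label"],
            archive["test_data"],
            archive["test_label"],
        )

# Write a toy file with the same keys and shape conventions, then read it back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Toy.npz")
    np.savez_compressed(
        path,
        train_data=np.zeros((4, 8)),          # (num_samples, num_features)
        train_label=np.array([0, 1, 0, 1]),   # (num_samples,)
        test_data=np.zeros((2, 8)),
        test_label=np.array([1, 0]),
    )
    train_x, train_y, test_x, test_y = load_ucr_npz(path)

print(train_x.shape, train_y.shape, test_x.shape, test_y.shape)
```

For a real dataset, the same helper would presumably be pointed at a file under ucr_dataset; the exact file naming there is not specified above.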

Credits

Easter Egg

Some of you may wonder where the name "AQOURSNet" actually comes from...