
repository directory #20

Closed
JimengShi opened this issue Oct 14, 2022 · 7 comments
Comments

@JimengShi commented Oct 14, 2022

Hi George, thanks for your open-source code. It is very clear and well organized.

But I am new to using shell scripts, so could you please give a directory tree of the entire repository? That would be very helpful for understanding its structure. I am confused about where I should put the downloaded data and where I should create the experiments folder. Currently, I am trying the following tree:

  • experiments
  • src
    - datasets
    - models
    - regression
    - utils
    - main.py
    - optimizers.py
    - options.py
    - running.py

After cd mvts_transformer, I run the following command:

python src/main.py --output_dir experiments --comment "regression from Scratch" --name FloodModeling1_fromScratch_Regression --records_file Regression_records.xls --data_dir Datasets/Regression/FloodModeling1/ --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 100 --lr 0.001 --optimizer RAdam --pos_encoding learnable --task regression

but it fails with: No files found using: Datasets/Regression/FloodModeling1/*

@xiqxin1 commented Oct 18, 2022

Hi, did you solve this problem? Could we have a discussion by email?

@gzerveas (Owner) commented Oct 18, 2022

Hi Jimeng, the path specified by --data_dir should exist and contain the files with the time series data (in your case, because you are using --data_class tsra, the data should be in the TSRA format). Furthermore, the Python interpreter should be able to resolve that path, which depends on your current directory when running python main.py. From the error you got, I am guessing that you were running the command from within the repo directory, but you hadn't created the directory hierarchy Datasets/Regression/FloodModeling1 inside that same directory. For this reason, the failsafe option is to specify an absolute path for --data_dir, pointing to wherever you put your data directory. Imagine all your data is in a directory called MyDataDir; if you keep it in your home directory (assuming you are on Linux/macOS), then you should specify: --data_dir ~/MyDataDir.
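For example, here is a quick sanity check you could run from the same place where you launch main.py (a minimal Python sketch, not code from this repo; ~/MyDataDir/Regression/FloodModeling1 is just the hypothetical location from above):

    # Illustrative only: check whether the data directory resolves from the
    # current working directory and whether globbing it finds any files
    # (the same kind of lookup that produces the "No files found using: ..." error).
    import glob
    import os

    data_dir = os.path.expanduser("~/MyDataDir/Regression/FloodModeling1")  # hypothetical absolute path
    pattern = os.path.join(data_dir, "*")
    files = glob.glob(pattern)
    if not files:
        print(f"No files found using: {pattern}")
    else:
        print(f"Found {len(files)} file(s), e.g. {os.path.basename(files[0])}")

If this still prints no files even with an absolute path, then the data simply isn't where you think it is.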

@JimengShi (Author)

Appreciate it, George. It was fixed when I specified the absolute path. Thanks!

@JimengShi (Author)

Sorry to bother you, George. I have one more question: could you give us some intuition about why you used only the Transformer encoder in your work? I did not find an explanation for this in the paper. Thanks!

@gzerveas (Owner)

No problem! I give some motivation behind using only the transformer encoder in Section 3.1 of the KDD proceedings version of the paper:

[screenshot: transformer_encoder_only, excerpt from Section 3.1 of the paper]

The main rationale behind this choice is that the decoder is first and foremost a component for generative tasks. The encoder builds a latent representation of the input, and the decoder, while looking at this latent input representation, learns to generate a statistically likely continuation of what it has already generated (which, during training, is what we are supplying as "ground truth"). If we were only interested in e.g. time-series forecasting, especially with a fluid future prediction horizon, then an encoder-decoder architecture might have been a good (or possibly, an even more suitable) choice.
However: (a) encoder-only architectures have also shown very strong results in generative modeling (e.g. autoregressive, GPT-style models), and (b) here we are also interested in a variety of tasks, such as classification and regression. A decoder architecture is unsuitable, or at least redundant, when dealing with such tasks; for example, it needs a whole sequence as an input, and there is no good (non-redundant) way of encoding the desired output (i.e. a class or a single value) as an input for the decoder module. Instead, what we need is for the model to extract a good latent representation of the input sequence (which is what the encoder does) and then use this to predict the single value we are interested in (for classification or regression), with the help of e.g. a single dense layer or a couple of dense layers.
Even in the case where we would like to do imputation of missing values, the encoder-decoder architecture is in fact equivalent to the encoder-only approach, but uses many more parameters. By contrast, if, for example, instead of a specified number of missing values/time steps we only provided the model with the beginning and the end of a time series and asked it to guess an appropriate middle part of undetermined length, then an encoder-decoder architecture would have been an appropriate choice (as there would be many possible fitting parts, with differing lengths). I hope this helps.
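To make the classification/regression point concrete, here is a minimal encoder-only sketch (in PyTorch, with made-up names and hyperparameters; it only illustrates the idea above and is not the actual model in this repo): the encoder produces a latent representation of the input sequence, which is pooled over time and passed to a single dense layer that outputs one value.

    # Minimal illustration of the encoder-only idea (not this repo's model):
    # encode the sequence, pool the latent representation, predict one value.
    import torch
    import torch.nn as nn

    class EncoderOnlyRegressor(nn.Module):
        def __init__(self, feat_dim, d_model=64, n_heads=8, num_layers=3, max_len=512):
            super().__init__()
            self.input_proj = nn.Linear(feat_dim, d_model)                 # project each time step
            self.pos_enc = nn.Parameter(torch.zeros(1, max_len, d_model))  # learnable positional encoding
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers)        # encoder only, no decoder
            self.head = nn.Linear(d_model, 1)                              # dense layer -> single value

        def forward(self, x):                           # x: (batch, seq_len, feat_dim)
            z = self.input_proj(x) + self.pos_enc[:, :x.size(1)]
            z = self.encoder(z)                         # latent representation of the input
            z = z.mean(dim=1)                           # pool over time steps
            return self.head(z).squeeze(-1)             # one predicted value per series

    # e.g. EncoderOnlyRegressor(feat_dim=9)(torch.randn(4, 100, 9)) has shape (4,)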

@xiqxin1 commented Oct 18, 2022

Hi George, thanks. Did you mean that, in order to keep the model's generality, we choose the encoder as the pre-trained model? (Referring to your comment: "A decoder architecture is unsuitable, or at least redundant, when dealing with such tasks; for example, it needs a whole sequence as an input, and there is no good ...")

@JimengShi (Author)

Clear explanations! It makes sense. @gzerveas
