
repository directory #20

Closed
JimengShi opened this issue Oct 14, 2022 · 7 comments
Comments

@JimengShi commented Oct 14, 2022

Hi George, thanks for your open-source code. It is very clear and well organized.

But I am new to using shell scripts, so could you please give a directory tree of the entire repository? That would be very helpful for understanding its structure. I am confused about where I should put the downloaded data and where I should create the experiments folder. Currently, I am trying the following tree:

  • experiments
  • src
    - datasets
    - models
    - regression
    - utils
    - main.py
    - optimizers.py
    - options.py
    - running.py

After cd mvts_transformer, I run the following command:

python src/main.py --output_dir experiments --comment "regression from Scratch" --name FloodModeling1_fromScratch_Regression --records_file Regression_records.xls --data_dir Datasets/Regression/FloodModeling1/ --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 100 --lr 0.001 --optimizer RAdam --pos_encoding learnable --task regression

but it fails with: No files found using: Datasets/Regression/FloodModeling1/*

@xiqxin1 commented Oct 18, 2022

Hi, did you solve this problem? Could we have a discussion by email?

@gzerveas (Owner) commented Oct 18, 2022

Hi Jimeng, the path specified by --data_dir should exist and contain the files with the time series data (in your case, because you are using --data_class tsra, the data should be in the TSRA format). Furthermore, the Python interpreter should be able to resolve that path, which depends on your current directory when running python main.py. From the error you got, I am guessing that you were running the command from within the repo directory, but you hadn't created the directory hierarchy Datasets/Regression/FloodModeling1 inside that same directory. For this reason, the failsafe option is to specify an absolute path for --data_dir, pointing to wherever you put your data directory. Imagine all your data is in a directory called MyDataDir; if you keep it in your home directory (assuming you are on Linux/macOS), then you should specify: --data_dir ~/MyDataDir.
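For example, here is a quick sanity check you could run from the same place where you launch main.py (a minimal Python sketch, not code from this repo; ~/MyDataDir/Regression/FloodModeling1 is just the hypothetical location from above):

    # Illustrative only: check whether the data directory resolves from the
    # current working directory and whether globbing it finds any files
    # (the same kind of lookup that produces the "No files found using: ..." error).
    import glob
    import os

    data_dir = os.path.expanduser("~/MyDataDir/Regression/FloodModeling1")  # hypothetical absolute path
    pattern = os.path.join(data_dir, "*")
    files = glob.glob(pattern)
    if not files:
        print(f"No files found using: {pattern}")
    else:
        print(f"Found {len(files)} file(s), e.g. {os.path.basename(files[0])}")

If this still prints no files even with an absolute path, then the data simply isn't where you think it is.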

@JimengShi (Author)

Appreciate it, George. It was fixed when I specified the absolute path. Thanks!

@JimengShi (Author)

Sorry to bother you, George. I have one more question: could you give us some intuition about why you used only the Transformer encoder in your work? I did not find an explanation for this in the paper. Thanks!

@gzerveas (Owner)

No problem! I give some motivation behind using only the transformer encoder in Section 3.1 of the KDD proceedings version of the paper:

[screenshot: transformer_encoder_only, excerpt from Section 3.1 of the paper]

The main rationale behind this choice is that the decoder is first and foremost a component for generative tasks. The encoder builds a latent representation of the input, and the decoder, while looking at this latent input representation, learns to generate a statistically likely continuation of what it has already generated (which, during training, is what we are supplying as "ground truth"). If we were only interested in e.g. time-series forecasting, especially with a fluid future prediction horizon, then an encoder-decoder architecture might have been a good (or possibly, an even more suitable) choice.
However: (a) encoder-only architectures have also shown very strong results in generative modeling (e.g. autoregressive, GPT-style models), and (b) here we are also interested in a variety of tasks, such as classification and regression. A decoder architecture is unsuitable, or at least redundant, when dealing with such tasks; for example, it needs a whole sequence as an input, and there is no good (non-redundant) way of encoding the desired output (i.e. a class or a single value) as an input for the decoder module. Instead, what we need is for the model to extract a good latent representation of the input sequence (which is what the encoder does) and then use this to predict the single value we are interested in (for classification or regression), with the help of e.g. a single dense layer or a couple of dense layers.
Even in the case where we would like to do imputation of missing values, the encoder-decoder architecture is in fact equivalent to the encoder-only approach, but uses many more parameters. By contrast, if, for example, instead of a specified number of missing values/time steps we only provided the model with the beginning and the end of a time series and asked it to guess an appropriate middle part of undetermined length, then an encoder-decoder architecture would have been an appropriate choice (as there would be many possible fitting parts, with differing lengths). I hope this helps.
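To make the classification/regression point concrete, here is a minimal encoder-only sketch (in PyTorch, with made-up names and hyperparameters; it only illustrates the idea above and is not the actual model in this repo): the encoder produces a latent representation of the input sequence, which is pooled over time and passed to a single dense layer that outputs one value.

    # Minimal illustration of the encoder-only idea (not this repo's model):
    # encode the sequence, pool the latent representation, predict one value.
    import torch
    import torch.nn as nn

    class EncoderOnlyRegressor(nn.Module):
        def __init__(self, feat_dim, d_model=64, n_heads=8, num_layers=3, max_len=512):
            super().__init__()
            self.input_proj = nn.Linear(feat_dim, d_model)                 # project each time step
            self.pos_enc = nn.Parameter(torch.zeros(1, max_len, d_model))  # learnable positional encoding
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers)        # encoder only, no decoder
            self.head = nn.Linear(d_model, 1)                              # dense layer -> single value

        def forward(self, x):                           # x: (batch, seq_len, feat_dim)
            z = self.input_proj(x) + self.pos_enc[:, :x.size(1)]
            z = self.encoder(z)                         # latent representation of the input
            z = z.mean(dim=1)                           # pool over time steps
            return self.head(z).squeeze(-1)             # one predicted value per series

    # e.g. EncoderOnlyRegressor(feat_dim=9)(torch.randn(4, 100, 9)) has shape (4,)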

@xiqxin1 commented Oct 18, 2022

Hi George, thanks. Did you mean that, in order to keep the model's generality, we choose the encoder as the pre-trained model? (Referring to your comment: "A decoder architecture is unsuitable, or at least redundant, when dealing with such tasks; for example, it needs a whole sequence as an input, and there is no good ...")

@JimengShi (Author)

Clear explanations! It makes sense. @gzerveas
