
the transformer to be applied to classification #18

Closed
hongjianyuan opened this issue Jul 27, 2020 · 9 comments
@hongjianyuan

How should I change the transformer so that it can be applied to classification, e.g. seq2seq (many-to-many)? What should I change in the last layer of the model?

@maxjcohen
Owner

Hi, I believe the most straightforward solution would be to keep the original architecture and only change the output module. Currently, I have a linear transformation followed by a sigmoid activation; I would start by simply replacing the activation with a softmax and see from there.
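
A minimal sketch of that change in PyTorch, assuming the output module is a linear projection followed by an activation (the class and attribute names below are hypothetical, not the repository's exact code):

```python
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Hypothetical output module: a linear projection followed by a softmax
    over the class dimension, in place of the original sigmoid."""
    def __init__(self, d_model: int, num_classes: int):
        super().__init__()
        self.linear = nn.Linear(d_model, num_classes)

    def forward(self, x):
        # x: (batch, seq_len, d_model) -> (batch, seq_len, num_classes)
        return torch.softmax(self.linear(x), dim=-1)
```

Note that if you train with `nn.CrossEntropyLoss`, you would return the raw logits instead, since that loss applies log-softmax internally.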

@hongjianyuan
Author

I currently want to input 250 features, segment them, and output a category for each of these 250 features. So I just need to change the output module to a softmax?

@maxjcohen
Owner

Yes, set d_input=250, set d_output to the number of classes, and replace the sigmoid with a softmax; you should have a functional segmentation algorithm.
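
For illustration, a sketch of what that setup might look like, assuming the repository's `Transformer` class accepts `d_input`, `d_model` and `d_output` constructor arguments (the import path, the example sizes, and the constructor call below are assumptions; check the actual signature in the repository):

```python
import torch
from tst import Transformer  # hypothetical import path; adjust to the repo's layout

num_classes = 4   # example value
d_input = 250     # number of input features, as discussed above
d_model = 64      # hidden dimension, chosen arbitrarily here

# Assumed constructor call; the real class may require additional arguments
# (number of heads, number of layers, etc.).
net = Transformer(d_input=d_input, d_model=d_model, d_output=num_classes)

x = torch.randn(8, 100, d_input)        # (batch, time, d_input)
out = net(x)                            # expected shape: (batch, time, num_classes)
probs = torch.softmax(out, dim=-1)      # softmax in place of the original sigmoid
```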

@hongjianyuan
Author

Thank you very much

@hongjianyuan
Author

> Yes, set d_input=250, set d_output to the number of classes, and replace the sigmoid with a softmax; you should have a functional segmentation algorithm.

If the output is the category of each of these 250 features, then the output shape would be something like 250*4?

@MJimitater

Hi @maxjcohen, thanks for your great repo!

Is it possible to change the transformer to perform sequence classification (many-to-one)?

@maxjcohen
Owner

Hi, nothing is stopping you from setting d_output = 1, in order for the Transformer to behave as a many-to-one model. In practice, every hidden state will be computed with a dimension d_model, and later aggregated in the last layer to output a single value. Note that this process is different from how traditional architectures, such as RNN-based networks, handle many-to-one predictions.
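
As a rough illustration of the aggregation described above (a hypothetical head, not the repository's exact layer), the per-time-step hidden states of dimension d_model can be pooled and projected to a single value:

```python
import torch
import torch.nn as nn

class ManyToOneHead(nn.Module):
    """Hypothetical many-to-one output head: mean-pool the per-time-step
    hidden states, then project them to a single value."""
    def __init__(self, d_model: int):
        super().__init__()
        self.linear = nn.Linear(d_model, 1)

    def forward(self, hidden):
        # hidden: (batch, seq_len, d_model)
        pooled = hidden.mean(dim=1)   # aggregate across time steps
        return self.linear(pooled)    # (batch, 1)
```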

@MJimitater

Thank you for your reply @maxjcohen! How exactly do you mean it's different? Different from the way an RNN model would take hidden states as further input?

@maxjcohen
Owner

RNNs carry a memory-like hidden state across time steps, while the Transformer has no notion of memory and computes all time steps in parallel instead.
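
A small contrast of the two behaviours, using standard `torch.nn` modules rather than this repository's code:

```python
import torch
import torch.nn as nn

seq = torch.randn(8, 100, 32)  # (batch, time, features)

# RNN: a hidden state is threaded from one time step to the next (sequential memory).
rnn = nn.RNN(input_size=32, hidden_size=64, batch_first=True)
rnn_out, h_n = rnn(seq)  # h_n is the memory carried across steps

# Transformer encoder layer: all time steps attend to each other in parallel,
# with no hidden state carried through time.
enc = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
enc_out = enc(seq)
```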
