
Is Cost actually a Layer? #1311

Closed
wangkuiyi opened this issue Feb 10, 2017 · 3 comments

@wangkuiyi
Collaborator

wangkuiyi commented Feb 10, 2017

It seems that it is.

Consider the following network for image classification:

image -> conv -> softmax

If we are going to train it by minimizing the squared error, we can add

image -> conv -> softmax \
                          -> squared_error
                 label --/

So it seems that the "model" should contain only the part from image to softmax, and at training time we add label and squared_error.
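To make the split concrete, here is a minimal NumPy sketch (the layer names are illustrative stand-ins, not a real framework API): the "model" is only `image -> conv -> softmax`, and at training time `label` and `squared_error` are stacked on top of it like one more layer.

```python
import numpy as np

# Hypothetical layer functions; a matmul stands in for a real convolution.
def conv(x, w):
    return x @ w

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def squared_error(pred, label):
    return ((pred - label) ** 2).sum()

# The "model" part: image -> conv -> softmax
w = np.random.randn(4, 3)
image = np.random.randn(2, 4)
pred = softmax(conv(image, w))

# At training time, append label and squared_error on top of the model;
# structurally the cost composes exactly like any other layer.
label = np.eye(3)[[0, 2]]
loss = squared_error(pred, label)
```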

@jacquesqiao
Member

Yes, I agree with you.
According to the discussion in the last few days, Cost is not a layer but an independent concept: the topology of the neural network only contains image -> conv -> softmax, and Cost is used during training and testing.

@wangkuiyi
Collaborator Author

wangkuiyi commented Feb 11, 2017

Hi @jacquesqiao, @helinwang and I learned something more from @emailweixu this afternoon: it is not flexible enough to express the to-be-trained network as a composition of a model and a cost.

@emailweixu gave us an example problem: suppose that we are going to learn a text embedding, f, of our inputs, such that if we have "search result A is closer to query Q than search result B" as a training instance, we should have sim(f(A), f(Q)) > sim(f(B), f(Q)).

In order to learn f, we need to construct the following 3-branch network:

A -> f -\
Q -> f --> cost
B -> f -/

In this example, the model is actually x -> f, but in the above network we would have to replicate the model three times in order to learn f. This is why we cannot say that the to-be-trained network is a composition of a model and a cost.
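A small sketch of this 3-branch setup (the embedding, similarity, and ranking cost below are toy stand-ins, not the real implementation): the model f appears three times in the to-be-trained network, but all three branches share one set of parameters.

```python
import numpy as np

W = np.random.randn(5, 3)  # shared parameters of the embedding f

def f(x):
    # Toy embedding: all three branches call this same function,
    # so the "model" is replicated but its parameters are shared.
    return x @ W

def sim(u, v):
    # Cosine similarity between two embeddings.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_cost(a, q, b, margin=0.1):
    # Hinge-style cost encouraging sim(f(A), f(Q)) > sim(f(B), f(Q)).
    return max(0.0, margin - (sim(f(a), f(q)) - sim(f(b), f(q))))

A, Q, B = (np.random.randn(5) for _ in range(3))
loss = rank_cost(A, Q, B)
```

The to-be-trained network is the three f branches plus rank_cost; the reusable model is just f.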

Actually, I remember that @reyoung once said that he thought that a model should include the cost. But a precise statement should be that the cost should be included in the to-be-trained network, instead of the model.

After this discussion, we reached the following conclusions:

  1. In order to do training, we need a to-be-trained network.
  2. This to-be-trained network might consist of one or more costs and one or more replications of the model.
  3. Once the to-be-trained network is trained, we should be able to extract the model part from it and use it for serving/inference.

@wangkuiyi
Collaborator Author

wangkuiyi commented Feb 11, 2017

Given this conclusion,

  1. cost is a layer, and
  2. it is not sufficient to define a topology as a "model" plus a "cost",

I am closing this issue and open #1315 for the discussion of the concept network.
