How to describe and use Network #1315
This summarizes an idea from @helinwang and @emailweixu that changes the concepts listed in #1297 into the following:
For how to describe networks and how to use them for convenient training, testing, and inference/serving, please see the following comments.
Example 1. Sharing Parameters between Layers
We use the 3-branch ranking model in this example. For your convenience, I copy-and-paste the model's topology as follows:
The following program trains the topology, including the cost, and then uses a sub-network of the trained topology for inference:
def f(x):
    e = paddle.layer.embedding(x, parameter_name="embedding")
    o = paddle.layer.softmax(e, parameter_name="semantic")
    return o
# Create 3 topologies (subnets); they share parameters because all
# corresponding layers have the same parameter names.
fA = f(paddle.layer.data(input_name="A"))
fB = f(paddle.layer.data(input_name="B"))
fQ = f(paddle.layer.data(input_name="Q"))
topology = paddle.layer.less_than(
    paddle.layer.cross_entropy(fA, fQ),
    paddle.layer.cross_entropy(fB, fQ))
# Derive parameters required in topology and create them in model.
parameters = paddle.parameters.create(topology)
# Estimate parameters used in topology from data.
paddle.train(topology, parameters, reader=read_ranking_model_data)
# Inference using fA (or fB or fQ, as they share their parameters).
[testA, testB, testQ] = read_ranking_model_data()
print "The sematic-vector of testA: ", paddle.infer(fA, parameters, testA) |
Example 2. Sharing Parameters between "Models"
We use GAN in this example.
def G(x):
    # over-simplified example as G has only one layer:
    return paddle.layer.fc(x, parameter_name="G")

def D(x, parameters_mutable):
    # again, over-simplified:
    return paddle.layer.fc(x, parameter_name="D", parameters_mutable=parameters_mutable)
# Construct the first topology, which contains both D and G.
# By training this topology, we update the parameters of G.
d0 = paddle.layer.should_be_false(
    D(G(paddle.layer.data()),
      False))  # Don't update the parameters of D here.
# Construct a second topology d1, which contains only D. By
# training this topology, we update the parameters of D. Note
# that d1 shares parameters with d0.
d1 = paddle.layer.should_be_true(D(paddle.layer.data(), True))
# Create parameters from a list of multiple topologies (models) so
# that parameters can be shared between these topologies.
parameters = paddle.parameters.create([d0, d1])
# Iterative training of GAN.
for ...:
    train(d0, parameters, reader=read_from_rng)
    train(d1, parameters, reader=read_from_realistic_images)
# Use d1 for inference:
print "D thinks a batch of images are realistic ", infer(d1, parameters, read_mnist_images) |
Maybe a parameter pool(...)? A to-be-trained neural network = a parameter pool + a training network topology. Is the ...
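A rough sketch of the "parameter pool" idea, i.e. a to-be-trained network being a pool of named parameters plus a training topology. The class and method names below are hypothetical, not part of any proposed API:
class ParameterPool(object):
    """Hypothetical pool of named parameters, shared by several topologies."""

    def __init__(self):
        self._params = {}  # parameter name -> value

    def get_or_create(self, name, shape, initializer):
        # Layers that use the same parameter_name get the same entry,
        # so parameter sharing falls out of the pool naturally.
        if name not in self._params:
            self._params[name] = initializer(shape)
        return self._params[name]

# A to-be-trained neural network = a parameter pool + a training topology:
#   pool = ParameterPool()
#   network = (pool, topology)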
Maybe instead of specifying which parameters not to update here (when building the topology), we can specify it in ...
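One way to read this suggestion: rather than passing parameters_mutable=False while building d0 (as in Example 2), the caller names the frozen parameters at training time. The frozen_parameters argument below is hypothetical, purely to illustrate the alternative:
# Hypothetical alternative: mutability is decided per training call,
# not baked into the topology.
paddle.train(d0, parameters, reader=read_from_rng,
             frozen_parameters=["D"])   # only G's parameters are updated
paddle.train(d1, parameters, reader=read_from_realistic_images)  # D is updated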
Inside the train function, add an event_handler callback. Attached is the code from our earlier discussion:
def train_reader():
    yield {'pixel': pixels, 'label': labels}  # return a data batch.
# The observe callback is used for plotting or logging the training process.
# The event parameter can be of various types; the intermediate training
# results are carried in the event instance.
def callback(event):
    if isinstance(event, FinishTrainOneBatch):
        print event.pass_id, event.batch_id, "Cost = ", event.cost, "Error Rate = ", event.metric[0]
        print "output layer's output is ", event.activation['output']
        if event.batch_id % 1000 == 0:  # We could even save a checkpoint inside the callback.
            with open('check_point_%d' % event.batch_id, 'w') as stream:
                optimizer.check_point(stream)
    else:
        pass
optimizer.train(train_reader=train_reader,
                test_reader=None,  # The test reader shares the train reader's format;
                                   # could be None if there is no test data.
                cost=CrossEntropy(input=model.topology.output_layer,  # the network's output layer
                                  label=DataReader("label")),  # the label comes from the data reader's 'label' field
                metric=[ErrorRateMetric(input=model.topology.output_layer,
                                        label=DataReader("label"))],  # same logic as above
                observe_callback=callback)
Added an issue for separating ...
If we need to put cost-related things in a special namespace, it could look like:
paddle.layer.cost.cross_entropy
paddle.layer.cost.less_than
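For illustration, the ranking cost from Example 1 would then be written as follows; only the namespace of the cost layers changes (this assumes the non-cost layers keep their current names):
topology = paddle.layer.cost.less_than(
    paddle.layer.cost.cross_entropy(fA, fQ),
    paddle.layer.cost.cross_entropy(fB, fQ))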
We'd thought that a DL framework should implement concepts like model and cost, but we realized that these are not flexible enough to describe deep learning problems. Instead, we need the concept of a network. For more about this derivation, please refer to #1311.
In this issue, we are going to figure out how we should build a network and its parameters, and how we can train a network and use part of it (the model) for inference/serving.
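To make that goal concrete, here is a minimal end-to-end sketch of the intended workflow, using only calls that appear in the examples above (f, some_reader, and test_input are placeholders):
# 1. Describe the network; the cost is just another set of layers.
fA = f(paddle.layer.data(input_name="A"))
fQ = f(paddle.layer.data(input_name="Q"))
topology = paddle.layer.cross_entropy(fA, fQ)

# 2. Derive and create the parameters the topology needs.
parameters = paddle.parameters.create(topology)

# 3. Train: estimate the parameters from data.
paddle.train(topology, parameters, reader=some_reader)

# 4. Use a sub-network (the "model") of the trained topology for inference/serving.
print paddle.infer(fA, parameters, test_input)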