
Paddle API v4 proposal #10152

Closed
helinwang opened this issue Apr 24, 2018 · 14 comments

@helinwang
Contributor

helinwang commented Apr 24, 2018

PyTorch uses Python to call each operator; the following API uses Python to call each compiled function.

fit_a_line.py

class FitALine(fluid.Program):
  DEFAULT_READER_FILE_PATH = './data.recordio'
  DEFAULT_READER_BATCH_SIZE = 128

  # The @network decorator will be used by the compiler to generate a
  # ProgramDesc block. It can optionally take inputs, which represent a
  # mapping from the input vars to vars in the block.
  @network()
  def train_step(self):
    reader = fluid.batch_reader(file=self.DEFAULT_READER_FILE_PATH,
                                batch_size=self.DEFAULT_READER_BATCH_SIZE,
                                shape=[[13], [1]],
                                dtype=['float32', 'float32'],
                                format='recordio')
    x, y = reader.next_item()
    with fluid.var_scope('prediction'):
      # Since we want to be able to access the same weights and bias during
      # inference, we need to namespace the variables.
      # The fluid.var_scope block guard creates a new UniqueNameGenerator when
      # we enter the block, and rolls back to the previous UniqueNameGenerator
      # when the block exits.
      y_predict = fluid.layers.fc(input=x, size=1, act=None)

    cost = fluid.layers.square_error_cost(input=y_predict, label=y)
    avg_cost = fluid.layers.mean(cost, name='avg_cost')
    sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
    sgd_optimizer.minimize(avg_cost)
    return avg_cost

  @network("x")
  def infer(self):
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    with fluid.var_scope('prediction'):
      return fluid.layers.fc(input=x, size=1, act=None)

main.py

fit_a_line = FitALine(batch_size=256).Compile()

for i in range(1000):
  avg_cost = fit_a_line.train_step()

y_results = fit_a_line.infer([3,4,6,3,5,7,8,6,5,4,1,5,8])

Transpiler

Trainer

> TRAINING_ROLE=TRAINER python main.py --distributed main=train_step

PServer

> TRAINING_ROLE=PSERVER paddle run train.py --distributed main=train_step
@helinwang changed the title from "Paddle API v4 - fetch data" to "Paddle API v4 - fetch data during training steps" on Apr 24, 2018
@cs2be
Contributor

cs2be commented Apr 24, 2018

This looks good. One question: when is the reader initialized? When the first iteration is run? We probably need to do this implicitly, maybe by creating a static or shared reader?
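For illustration, one way to do that implicitly would be to create the reader lazily on first use and share it across calls (the _shared_reader attribute and _get_reader helper below are hypothetical, not part of the proposal above):

class FitALine(fluid.Program):
  _shared_reader = None  # created once, reused by every train_step call

  def _get_reader(self):
    # Lazily create the shared reader the first time it is needed.
    if FitALine._shared_reader is None:
      FitALine._shared_reader = fluid.batch_reader(
          file=self.DEFAULT_READER_FILE_PATH,
          batch_size=self.DEFAULT_READER_BATCH_SIZE,
          shape=[[13], [1]],
          dtype=['float32', 'float32'],
          format='recordio')
    return FitALine._shared_reader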

@abhinavarora
Contributor

Is the assumption here that we can return a tuple if we want to capture multiple variables during the training steps? For example, we should be able to do return y_predict, avg_cost.
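Assuming that is allowed, the usage might look roughly like this (a sketch, not part of the proposal above):

  @network()
  def train_step(self):
    # ... same body as above ...
    return y_predict, avg_cost

# main.py: unpack both fetched values per step.
y_predict, avg_cost = fit_a_line.train_step()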

@varunarora

varunarora commented Apr 24, 2018

How about:

fit_a_line.py

...
  def train_step(self, batch_size):
    ...
    return self.program({ 'avg_cost': avg_cost })
...

main.py

from fit_a_line import FitALine

trainer = FitALine().train_step(batch_size=256)

for i in range(1000):
  cost = trainer()['avg_cost']

y_results = FitALine().infer([3,4,6,3,5,7,8,6,5,4,1,5,8])

@helinwang changed the title from "Paddle API v4 - fetch data during training steps" to "Paddle API v4 - explicit initialization, and fetch data during training steps" on Apr 24, 2018
@wangkuiyi
Collaborator

Thanks for this design, @cs2be and @helinwang! I have a few questions:

  1. Should users be able to define arbitrary methods with arbitrary names and call program.run('method_name'), or should only a few pre-determined methods with certain names be overloadable in class FitALine? If it is the latter, could we have a list of the pre-determined methods that are overloadable?

  2. It seems, and I just want to confirm, that each class (FitALine) will be compiled into a ProgramDesc and each method in the class into a block?

  3. How could we map method arguments to a block's inputs? Do we need to define RecordIO read operators in the blocks, and assume that each argument corresponds to a field?

@cs2be
Contributor

cs2be commented Apr 24, 2018

Hi @wangkuiyi,

  1. This is a good point; we are still debating this, since the transpiler may need to know which method the main program is in. Currently, for the transpilation step, we let users define which method is the main program.

  2. We are thinking that each method will be compiled to a ProgramDesc. This will allow users to run them in any order they want. We still need to refine this idea.

  3. This is a good point; I'll discuss with the team to see what a good solution for this is.

@helinwang
Contributor Author

helinwang commented Apr 25, 2018

Thanks for reviewing, @wangkuiyi! @cs2be and I discussed this; here are our replies:

Should users be able to define arbitrary methods with arbitrary names and call program.run('method_name'), or should only a few pre-determined methods with certain names be overloadable in class FitALine? If it is the latter, could we have a list of the pre-determined methods that are overloadable?

Many thanks to @cs2be for coming up with this idea. I think the power of this idea comes from the fact that it enables state to be shared across many methods:

program = FitALine().Compile()
# program.train knows the block ID,
# so program desc does not have to store function name.
program.train()
program.infer()

In the above code, train and infer share the same state (scope).

The user could come up with many things that they want to do: train, infer, train_step, save_model, load_model, ...
So I would prefer that we give the user the flexibility to define whatever methods they want. It may be more coherent with the programming-language idea: a programmer can write any method in their favorite programming language.

It seems, and I just want to confirm, that each class (FitALine) will be compiled into a ProgramDesc and each method in the class a block?

Yes, that is correct.

How could we map method arguments to block's inputs? Do we need to define RecordIO read operators in the blocks, and assume that each argument corresponds to a field?

Great point. @cs2be and I have discussed this and updated the example. The block input will be a var created by fluid.layers.data. The name mapping is indicated by @Input('x'), so program.infer([3,4,6,3,5,7,8,6,5,4,1,5,8]) knows which var to set [3,4,6,3,5,7,8,6,5,4,1,5,8] to.

  @Input('x')
  def infer(x):
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    return fluid.layers.fc(input=x, size=1, act=None)

Summary: PyTorch uses Python to call each operator; this API uses Python to call each compiled function.

@panyx0718
Contributor

I think this proposal looks mostly good!

One question:

> TRAINING_ROLE=TRAINER python main.py --distributed main=train_step

How does the Compile() logic magically know how to transpile a program into a trainer part and a parameter-server part?

In more detail, how does the transpiler know which operators and variables should be placed on the trainer and which on the parameter servers?

@helinwang
Contributor Author

@panyx0718 thanks for the review!

How does the Compile() logic magically know how to transpile a program into a trainer part and a parameter-server part?
In more detail, how does the transpiler know which operators and variables should be placed on the trainer and which on the parameter servers?

Our current transpiler implementation does "magically" make everything work. Indeed, it is hacky, because a lot of assumptions have been made. To improve it, the transpiler needs to know more information, such as "which operators belong to the optimizer", "which operators belong to the gradient-calculating pass", and "which operators are explicitly defined by the user". I think we should store this information in the program desc.
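To illustrate the idea only (none of the names below are real Fluid APIs; the Op class, the role constants, and split_for_distributed are made up for this sketch): if each op in the program desc recorded which pass created it, the split would not need heuristics.

from dataclasses import dataclass

FORWARD, BACKWARD, OPTIMIZE = 'forward', 'backward', 'optimize'

@dataclass
class Op:
  type: str
  role: str  # which pass created this op

def split_for_distributed(ops):
  # Optimizer ops go to the parameter server; forward/backward ops stay on the trainer.
  trainer_ops = [op for op in ops if op.role != OPTIMIZE]
  pserver_ops = [op for op in ops if op.role == OPTIMIZE]
  return trainer_ops, pserver_ops

ops = [Op('mul', FORWARD), Op('mul_grad', BACKWARD), Op('sgd', OPTIMIZE)]
trainer_ops, pserver_ops = split_for_distributed(ops)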

Do you think the API makes sense, regardless of the current transpiler implementation (or assuming we had implemented it in a non-hacky way)?

@abhinavarora
Contributor

Can we change fluid.var_scope to fluid.scope? Variable is a C++ concept in Paddle and should not be exposed to Python users.

@helinwang changed the title from "Paddle API v4 - explicit initialization, and fetch data during training steps" to "Paddle API v4 proposal" on Apr 26, 2018
@panyx0718
Contributor

panyx0718 commented Apr 26, 2018

@helinwang

I think the skeleton looks fine, though I still don't know how to achieve the following with the "--distributed" option:

the transpiler needs to know more information, such as "which operators belongs to the optimizers", "which operators belong to the gradient calculating pass", "which operators are explicitly defined by the user"

There can be several configuration options for "distributed" training.

@panyx0718
Contributor

@PaddleCI

Other Paddle team members can take a look at this proposal.

@jacquesqiao
Member

Can @network() support training multiple networks, like a GAN?

@helinwang
Contributor Author

helinwang commented Apr 26, 2018

@jacquesqiao thanks for reviewing! Yes, I think so. Multiple class methods annotated with @network() can share variables by using the block guard fluid.var_scope("shared_name_scope") when writing the Fluid program.
The "compiled program" generated by fluid.Program.Compile owns the scope, which contains the parameters of however many networks there are.

@shanyi15
Collaborator

Hello, this issue has not been updated in the past month, so we will close it today. If you still need to follow up after it is closed, feel free to reopen it and we will get back to you within 24 hours. We apologize for any inconvenience caused by the closure, and thank you for your support of PaddlePaddle!
