
value in file_outputs is not being passed to input parameters correctly #957

Closed

ritazh opened this issue Mar 11, 2019 · 5 comments

ritazh commented Mar 11, 2019

What happened:
The string value inception was written into a file (ritazh/k8s-ml@be7f520) and the file is set as the file output: file_outputs={'output': '/model_name.txt'}. The file content is then passed in as an input parameter here:

 op2 = dsl.ContainerOp(
     name='inception',
     image='elsonrodriguez/model-server:1.6',
     command=[
        '/usr/bin/tensorflow_model_server',
     ],
     arguments=[
        '--port', 9000,
        '--model_name', op1.output,
        '--model_base_path', 's3://mybucket/models/myjob2/export/inception/',
     ],
 )
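
For reference, op1 is whatever step writes the model name into /model_name.txt and exposes it via file_outputs. A hypothetical sketch (the name, image, and command below are placeholders, not the real pipeline's values; only the file_outputs mapping matters):

 op1 = dsl.ContainerOp(
     name='retrain',                              # placeholder name
     image='some-registry/retrain:latest',        # placeholder image
     command=['sh', '-c', "printf 'inception' > /model_name.txt"],
     file_outputs={'output': '/model_name.txt'},  # file content becomes op1.output
 )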

Expected:
The op2 pod will run with the argument --model_name=inception, since the file content written to /model_name.txt is passed into op2 via op1.output.

Actual:
kubectl describe po

spec:
  containers:
  - args:
    - --port
    - "9000"
    - --model_name
    - inception
    - --model_base_path
    - s3://mybucket/models/myjob2/export/inception/
    command:
    - /usr/bin/tensorflow_model_server

Job Output:

usage: /usr/bin/tensorflow_model_server
Flags:
	--port=8500                      	int32	port to listen on
	--enable_batching=false          	bool	enable batching

Hack that works:

arguments=[
        '--port=9000',
        '--model_name=inception',
        '--model_base_path=s3://mybucket/models/myjob2/export/inception/',
        op1.output,
     ]

kubectl describe po:

containers:
  - args:
    - --port=9000
    - --model_name=inception
    - --model_base_path=s3://mybucket/models/myjob2/export/inception/
    - inception
    command:
    - /usr/bin/tensorflow_model_server

The entire pipeline definition with the hack:
https://github.com/ritazh/pipelines/blob/fa2a3de2169ce9e8ceef36911ca5c6a4c2199d0d/samples/azure/retrain-pipeline.py#L87-L89

Ark-kun (Contributor) commented Mar 14, 2019

Expected:
op2 pod will run with the argument --model_name=inception as the file content written in

I'm not sure why this is expected.
On a Linux system, command-line arguments are an array of strings.
In Kubernetes, a container's command and args follow that design: they are arrays of strings.
In KF Pipelines, a ContainerOp's command and arguments follow that design: they are arrays of strings.

Each argument is a separate array entry. There is no special magical logic inserting = between some arguments; the arguments remain separate array items.

It's generally neither possible nor desirable to automatically join arguments with =.

So, the current behavior is by design (the design of Linux, Kubernetes, Argo, and KF Pipelines).

arguments=['--port', 9000, '--model_name', op1.output, '--model_base_path', 's3://mybucket/models/myjob2/export/inception/']

resolves to

arguments=['--port', 9000, '--model_name', 'inception', '--model_base_path', 's3://mybucket/models/myjob2/export/inception/']

Usually, programs do not require the = character between the parameter name and value (e.g. when using the argparse Python module).
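
For example, a minimal argparse sketch (flag names copied from the example above) accepts both the separate-argument and the =-joined forms:

 import argparse

 parser = argparse.ArgumentParser()
 parser.add_argument('--port', type=int)
 parser.add_argument('--model_name')
 parser.add_argument('--model_base_path')

 # Both invocation styles parse to the same result:
 print(parser.parse_args(['--port', '9000', '--model_name', 'inception']))
 print(parser.parse_args(['--port=9000', '--model_name=inception']))
 # Namespace(model_base_path=None, model_name='inception', port=9000)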

In your case, though, the program seems to require =. If you cannot ask the program author to fix the command-line argument parsing, you can use the following workaround.

BTW, I see that you pass ['--port', 9000] instead of ['--port=9000']. If you pass ['--port', 9000], you get ['--port', 9000]; if you need ['--port=9000'], then pass ['--port=9000'].

Workaround:

 arguments=[
    '--port=9000',
    '--model_name={}'.format(str(op1.output)),
    '--model_base_path=s3://mybucket/models/myjob2/export/inception/',
 ]
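
Put together, op2 with the workaround would look roughly like this (a sketch assembled from the snippets above; image and bucket path are copied from the original example):

 op2 = dsl.ContainerOp(
     name='inception',
     image='elsonrodriguez/model-server:1.6',
     command=['/usr/bin/tensorflow_model_server'],
     arguments=[
        '--port=9000',
        '--model_name={}'.format(str(op1.output)),
        '--model_base_path=s3://mybucket/models/myjob2/export/inception/',
     ],
 )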

Ark-kun self-assigned this Mar 14, 2019

ritazh (Author) commented Mar 15, 2019

Thanks for looking into this @Ark-kun. I applied the changes/workaround you suggested in the pipeline definition, and here is the resulting pipeline YAML:

- container:
      args:
      - --port=9000
      - --model_name={{inputs.parameters.retrain-output}}
      - --model_base_path
      - s3://mybucket/models/myjob2/export/inception/
      command:
      - /usr/bin/tensorflow_model_server

and the resulting pod created by the pipeline:

spec:
  containers:
  - args:
    - --port=9000
    - --model_name=inception
    - --model_base_path
    - s3://mybucket/models/myjob2/export/inception/
    command:
    - /usr/bin/tensorflow_model_server

But I'm still getting the same error from the serving pod:

usage: /usr/bin/tensorflow_model_server
Flags:
	--port=8500                      	int32	port to listen on
	--enable_batching=false          	bool	enable batching

Ark-kun (Contributor) commented Mar 15, 2019

I applied the changes/workaround you suggested in the pipeline definition, and here is the resulting pipeline YAML:

There might have been some mistake in the process.
You see, I've written '--model_base_path=s3://mybucket/models/myjob2/export/inception/', but it looks like you have --model_base_path, s3://mybucket/models/myjob2/export/inception/ (note the = vs ,).

Can you please check it?

P.S. You should probably pass the feedback to the program author that they should use more standard command-line parsing libraries that handle both --param=value and --param value syntaxes.

ritazh (Author) commented Mar 16, 2019

@Ark-kun You are right, and that worked! Thanks a lot for suggesting the changes. It might be useful to state that the pipeline requires the images to use standard command-line parsing, or to have an example that passes in the param values with =.

By the way, the code itself is just TensorFlow Serving, which requires us to pass the values this way: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/model_servers/main.cc#L92

Ark-kun (Contributor) commented Mar 19, 2019

Good to hear that everything is working now.

It might be useful to state that the pipeline requires the images to use standard command-line parsing

KF Pipelines does not impose any requirements on the program or its command-line parsing. It does not change the command line in any way. If you use separate arguments, it will pass them separately. If you use =-joined arguments, kfp will pass them like that. If you write something like +param1;-param2;@@param3==value3;$param4:=value4, then that's exactly the argument that will be passed.

have an example that passes in the param values with =.

That's a good idea. We should do that.

Ark-kun closed this as completed Mar 19, 2019