add: Make function_name parametrized #166

mrshu · 2020-02-14T08:09:01Z

Hello everyone,

First of all, thanks a ton for putting this tool/library together -- especially in resource-stranded environments, it does have a potential to literally save lives!

One small problem I was fighting with while using it was the score function it uses in the generated modules. When they are used as drop-in replacements for trained models, using score is a bit strange, as the API generally provides function like predict or predict_proba. It would therefore be of great help to me if this name could be dynamically changed and I would not have to do so manually.

Please do let me know if something like this sounds like a sensible addition. I'd be happy to update the code so that it reflect your vision, so please feel free to just let me know whenever that may be the case.

Thanks!

Currently m2cgen generates a module in various languages that has a
"score"/"Score" function/method. This is not always desirable, as many
of the trained modules that are to be exported may provide their
prediction via API functions with different names (such as predict).
This commit adds a way of specifying the name of the function both via
the CLI and in the exporters (that is, in the export_to_ funcitons)
by specifying the function_name option/parameter while keeping the
default set to "score"/"Score" for backwards compatilibity.

Signed-off-by: mr.Shu mr@shu.io

* Currently `m2cgen` generates a module in various languages that has a "score"/"Score" function/method. This is not always desirable, as many of the trained modules that are to be exported may provide their prediction via API functions with different names (such as `predict`). * This commit adds a way of specifying the name of the function both via the CLI and in the exporters (that is, in the `export_to_` funcitons) by specifying the `function_name` option/parameter while keeping the default set to "score"/"Score" for backwards compatilibity. Signed-off-by: mr.Shu <mr@shu.io>

* Add `function_name` to mock CLI args to be used in the tests. Signed-off-by: mr.Shu <mr@shu.io>

coveralls · 2020-02-14T14:41:00Z

Coverage increased (+0.04%) to 95.878% when pulling 6c2a081 on mrshu:mrshu/parametrize-function_name into d28cb00 on BayesWitnesses:master.

izeigerman · 2020-02-14T15:54:24Z

@mrshu Thanks a lot for submitting this PR together with such a detailed description!
I think this is a valuable updated and as far as I remember @StrikerRUS mentioned the importance of it some time ago too.
I'll take at it soon.

mrshu · 2020-02-14T22:31:07Z

Thank you for your kind words @izeigerman!

Right off the bat I'd like to tell you that the way the --function_name CLI option is handed probably introduces a bug -- it currently always defaults to "score" which was not the case for C# and Powershell before. To ensure that it works as before we'd have to deal with some conditional defaults (as in, after --language is set and --function_name is set, ensure Score is the default in case --language was C# or Powershell) and I was not exactly sure where would something like that make the most sense.

I am therefore looking forward to your review and any further discussion!

mrshu · 2020-02-14T22:35:47Z

Also, I am not completely sold on calling it function_name -- sklearn-porter for instance uses --method_name which may make a bit more sense in this case as well. Please do let me know what you think and I'll be happy to switch it over.

Thanks!

StrikerRUS · 2020-02-15T03:28:50Z

@izeigerman

I think this is a valuable updated and as far as I remember @StrikerRUS mentioned the importance of it some time ago too.

To be honest, I don't remember I was saying this 😃
I remember I spoke about argument name: #109 (comment).

@mrshu

To ensure that it works as before we'd have to deal with some conditional defaults

I'm not sure that it is worthy of further consideration. As we are still in alpha phase, we are free to change a lot without any deprecation warnings and worrying about the backward compatibility. Especially for just a naming conventions.

m2cgen/setup.py

Line 20 in d28cb00

"Development Status :: 3 - Alpha",

Please add a test for your PR. You may use the following tests as examples:

m2cgen/tests/test_cli.py

Lines 89 to 126 in d28cb00

    
           def test_class_name(): 
        
               infile = _get_pickled_trained_model() 
        
               mock_args = _get_mock_args( 
        
                   infile=infile, language="java", class_name="TestClassName") 
        
               generated_code = cli.generate_code(mock_args).strip() 
        
               assert generated_code.startswith("public class TestClassName") 
        
           def test_package_name(): 
        
               infile = _get_pickled_trained_model() 
        
               mock_args = _get_mock_args( 
        
                   infile=infile, language="java", package_name="foo.bar.baz") 
        
               generated_code = cli.generate_code(mock_args).strip() 
        
               assert generated_code.startswith("package foo.bar.baz;") 
        
           def test_module_name(): 
        
               infile = _get_pickled_trained_model() 
        
               mock_args = _get_mock_args( 
        
                   infile=infile, language="visual_basic", module_name="TestModule") 
        
               generated_code = cli.generate_code(mock_args).strip() 
        
               assert generated_code.startswith("Module TestModule") 
        
           def test_namespace(): 
        
               infile = _get_pickled_trained_model() 
        
               mock_args = _get_mock_args( 
        
                   infile=infile, language="c_sharp", namespace="Tests.ML") 
        
               generated_code = cli.generate_code(mock_args).strip() 
        
               assert generated_code.find("namespace Tests.ML {") != -1

* Add a test for the `--function_name` CLI option. Signed-off-by: mr.Shu <mr@shu.io>

mrshu · 2020-02-15T12:43:52Z

Thanks a lot for your comments @StrikerRUS!

I did add a test for the CLI part of the PR -- if you think it would be worth it, I am happy to add some for the "API" part as well.

StrikerRUS

@mrshu Thanks a lot for your PR!
Please check my comments below.

m2cgen/cli.py

m2cgen/exporters.py

StrikerRUS · 2020-02-16T01:20:29Z

m2cgen/exporters.py

    return _export(model, interpreter)


-def export_to_c_sharp(model, namespace="ML", class_name="Model", indent=4):
+def export_to_c_sharp(model, namespace="ML", class_name="Model", indent=4,
+                      function_name="Score"):


< Here and in all other places >
I believe it can be score for the consistency. Refer to #166 (comment)

I'm not sure that it is worthy of further consideration. As we are still in alpha phase, we are free to change a lot without any deprecation warnings and worrying about the backward compatibility. Especially for just a naming conventions.

Hey @StrikerRUS shouldn't we at least comply with basic code style principles of a target language? Like in this case method names are capitalized in C#. I believe this basic logic is worth keeping.

I really do not have a strong opinion here, but since C# seems to be using PascalCasing by definition, I would vote for being consistent with that.

To be honest, I don't have a strong opinion on keeping code style for default values too. I just proposed the way which is the easiest to maintain 🙂 .
And then we'll need to apply conditional default values for the CLI part: #166 (comment).

@StrikerRUS All right, I went ahead and implemented the "default values for the CLI" part such that they directly use the default value of the function_name keyword argument.

It relies on inspecting the export function (the inspect module introduced in Python 3.5 so that should not be an issue) and might therefore feel a bit hacky, but it allows us to specify the default only once (namely in the keyword argument) and be done with it -- it should automatically down-propagate to the CLI options as well.

I also added a test which should give us a greater confidence that this works.

If you do not like this approach, I am happy to revert that particular commit and just leave it as it was.

Thanks again!

@mrshu Thanks a lot for implementing conditional default values! I like your approach very much! Just please add a comment about that default value is conditional and will be set later based on the export function signature.

parser.add_argument( "--function_name", "-fn", dest="function_name", type=str, default=None, # some comment here help="Name of the function in the generated code.")

Thanks @StrikerRUS , I just did that.

Please feel free to point out a simpler version of the comment -- I may have overcomplicated it quite a bit.

m2cgen/interpreters/c/interpreter.py

m2cgen/interpreters/php/interpreter.py

izeigerman · 2020-02-17T03:34:04Z

@StrikerRUS, thanks for reviewing this PR!

To be honest, I don't remember I was saying this 😃

Oops, sorry. I confused 2 things 🤦‍♂

* Make sure function_name is actually used in the interpreters. Signed-off-by: mr.Shu <mr@shu.io>

* Make sure that the `function_name` docstring ends with a dot rather than the default value. * Add dot to various CLI argument's help messages. Signed-off-by: mr.Shu <mr@shu.io>

mrshu · 2020-02-17T21:43:01Z

@StrikerRUS Thanks for your feedback!

I believe I've addressed most of the things you've pointed out except for the C# one where I'd like to wait for your opinion.

If you still see something worth updating here, feel free to let me know -- I'll gladly update it.

Thanks!

* Make sure the default function_name (when not provided via arguments) gets backpropagated from the actual function definitions (in other words, make sure that the keyword arguments are respected when the function_name is not specified via CLI). Signed-off-by: mr.Shu <mr@shu.io>

* Add a comment on function_name being conditionally set later. Signed-off-by: mr.Shu <mr@shu.io>

StrikerRUS

@mrshu Thank you very much for your valuable contribution! LGTM!

izeigerman

Great stuff, thank you 👍

mrshu added 2 commits February 13, 2020 23:53

tests: Add function name to mock args

8695dd0

* Add `function_name` to mock CLI args to be used in the tests. Signed-off-by: mr.Shu <mr@shu.io>

mrshu requested a review from izeigerman February 14, 2020 15:32

tests: Test for the function_name CLI option

9e848c9

* Add a test for the `--function_name` CLI option. Signed-off-by: mr.Shu <mr@shu.io>

StrikerRUS requested changes Feb 16, 2020

View reviewed changes

interpreters: Actually use function_name

9fdc02a

* Make sure function_name is actually used in the interpreters. Signed-off-by: mr.Shu <mr@shu.io>

izeigerman mentioned this pull request Feb 17, 2020

Dart lang support #167

Merged

docstrings: Make function_name end with a dot

403d7a1

* Make sure that the `function_name` docstring ends with a dot rather than the default value. * Add dot to various CLI argument's help messages. Signed-off-by: mr.Shu <mr@shu.io>

StrikerRUS self-requested a review February 19, 2020 04:17

mrshu added 2 commits February 19, 2020 09:47

cli: Add a comment on function_name

6c2a081

* Add a comment on function_name being conditionally set later. Signed-off-by: mr.Shu <mr@shu.io>

StrikerRUS approved these changes Feb 20, 2020

View reviewed changes

izeigerman approved these changes Feb 20, 2020

View reviewed changes

izeigerman merged commit a4ef90f into BayesWitnesses:master Feb 20, 2020

StrikerRUS mentioned this pull request Feb 20, 2020

comply with naming style of languages #170

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add: Make function_name parametrized #166

add: Make function_name parametrized #166

mrshu commented Feb 14, 2020

coveralls commented Feb 14, 2020 •

edited

izeigerman commented Feb 14, 2020

mrshu commented Feb 14, 2020

mrshu commented Feb 14, 2020

StrikerRUS commented Feb 15, 2020

mrshu commented Feb 15, 2020

StrikerRUS left a comment

StrikerRUS Feb 16, 2020

izeigerman Feb 17, 2020

mrshu Feb 17, 2020

StrikerRUS Feb 19, 2020 •

edited

mrshu Feb 19, 2020 •

edited

StrikerRUS Feb 19, 2020

mrshu Feb 20, 2020

izeigerman commented Feb 17, 2020 •

edited

mrshu commented Feb 17, 2020

StrikerRUS left a comment

izeigerman left a comment

add: Make function_name parametrized #166

add: Make function_name parametrized #166

Conversation

mrshu commented Feb 14, 2020

coveralls commented Feb 14, 2020 • edited

izeigerman commented Feb 14, 2020

mrshu commented Feb 14, 2020

mrshu commented Feb 14, 2020

StrikerRUS commented Feb 15, 2020

mrshu commented Feb 15, 2020

StrikerRUS left a comment

Choose a reason for hiding this comment

StrikerRUS Feb 16, 2020

Choose a reason for hiding this comment

izeigerman Feb 17, 2020

Choose a reason for hiding this comment

mrshu Feb 17, 2020

Choose a reason for hiding this comment

StrikerRUS Feb 19, 2020 • edited

Choose a reason for hiding this comment

mrshu Feb 19, 2020 • edited

Choose a reason for hiding this comment

StrikerRUS Feb 19, 2020

Choose a reason for hiding this comment

mrshu Feb 20, 2020

Choose a reason for hiding this comment

izeigerman commented Feb 17, 2020 • edited

mrshu commented Feb 17, 2020

StrikerRUS left a comment

Choose a reason for hiding this comment

izeigerman left a comment

Choose a reason for hiding this comment

coveralls commented Feb 14, 2020 •

edited

StrikerRUS Feb 19, 2020 •

edited

mrshu Feb 19, 2020 •

edited

izeigerman commented Feb 17, 2020 •

edited