New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically Generated Go Bindings #1492

Open
wants to merge 3 commits into
base: master
from

Conversation

Projects
None yet
2 participants
@yaswagner
Contributor

yaswagner commented Aug 13, 2018

This pull request is an attempt at adding a Go binding generator to mlpack's codebase. The generated bindings include a C with C++ utility functions which can then be used by Go's cgo tool who lets Go packages call C code. The matrix used for the go bindings are from the gonum package (https://github.com/gonum).

The implementation of two parameters, precisely the matrix with dataset info parameter and the vectors of strings parameter, are still work in progress and are yet to be implemented.

GENERATING THE BINDING

First, to generate the bindings get mlpack in your go workspace:

$  go get https://github.com/yaswagner/mlpack/

Then in your terminal go to the mlpack folder create your build directory and build cmake as desired.
For example:

$ cd ${GOPATH}/src/mlpack.
$ mkdir build
$ cd build
$ cmake -D BUILD_GO_BINDINGS=ON -D BUILD_PYTHON_BINDINGS=OFF ../
$ make 

You can generate a specific binding to generate by using the mlpack_go_{method} target.
For instance, for the pca method:

$ make mlpack_go_pca

USING THE BINDINGS

To use the bindings in Go, you must import the mlpack package and gonum package.

For example:

import (
	"mlpack/build/src/mlpack/bindings/go/mlpack"
	"gonum.org/v1/gonum/mat"
)

Optional parameters need to be set in the methods config struct and then we can set our desired parameters.
For instance, for pca, we can initialize the config struct called param and set verbosity to true by doing:

param := mlpack.InitializePca()
param.Verbose = true

We can then pass our optional parameters and call the Pca method as such:

output := mlpack.Pca(input, param)

For output parameters we do not wish to use, we can use an underscore.
For example, for the perceptron, if we wish to only use the 'output_model' and not 'output'"

_, output_model := mlpack.Perceptron(param)

In order to get documentation about a method, we use 'godoc'.
For example, for the pca method we would type on the command line:

godoc mlpack/build/src/mlpack/bindings/go/mlpack Pca

TESTING THE BINDINGS

Finally, to test the binding, Go's 'testing' is used. Simply go to the go test file following directory as such:

$ cd ${GOPATH}/src/mlpack/src/bindings/go/tests/gofile

Then test by using the 'go test' command tool, as such:

$ go test -v
@rcurtin

This is really awesome. I'm so glad to see this support working. I need to do a little testing on it, but all the experiments I've done so far (on older revisions or your gobindings) branch have worked great. I left a lot of little comments that I hope are helpful, and I have a couple higher-level comments:

  • "Idiomatic" Go code uses camel-casing for method names, i.e., 'mlpack.DecisionTree', but I think the code you have here would do things like mlpack.Decision_tree. Should we change that by making a simple "underscores-to-camelcase" converter? I think maybe we should do it for method parameters too. Let me know what you think. (Also what to do with methods that should be capitalized is a bit tricky too... I suppose technically it should be "PCA" not "Pca", but I don't know if it makes any sense to make an acronym detector...)

  • Should we add add_go_binding() to all the relevant CMakeLists.txt files? I think it should build for just about anything except a binding with a PARAM_MATRIX_WITH_INFO(). If we add those, I can run a once-over test for each of the bindings by hand just as a sanity check (although the tests you wrote look good, so probably everything will work fine).

  • Do you think it would be possible to integrate the Go build fully with CMake, so that a user would only need to do the following?

    • Configure CMake with 'cmake ../', which would find Go if available.
    • Build with 'make' (or 'make go' or similar), which would generate a fully working Go binding, and if the user set their GOPATH right (or something like this) they could use the binding directly as it sits in build/src/mlpack/bindings/go/.
    • Install with 'make install', and the Go bindings will be dropped in wherever the right place is for Go packages along with any necessary .so files that were built. This step could also make it possible to have a standalone package that could be uploaded to Go's repositories, so I think you could type 'go get mlpack' and bindings get installed (precompiled or not). (I'm not so sure on the last step, maybe my idea is a bit misguided.)

Let me know what you think. And overall great work this summer. I'm not sure what emoji to use, I guess somewhere in the comments I tried the product of multiple emojis, so I can do that again: 👍 * 💯 * 🥇

Once we can get each of the little issues handled, we can merge this in and then I'll update the rest of the website and everything to advertise Go bindings, which should hopefully be attractive to some Go users who want to do machine learning.

# nothing and leave this file.
if (NOT BUILD_GO_BINDINGS)
not_found_return("Not building Go bindings.")
endif ()

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

The not_found_return() macro was defined for the Python bindings, so instead we should modify this a little bit---

macro (go_not_found_return message)
  message(STATUS "${message}")
  macro (add_go_binding name)
    # Do nothing.
  endmacro ()

  return()
endmacro ()

And then we can use go_not_found_return() instead.

Also I think we should add a check for whether Go is on the system. Looks like CMake doesn't have a package finder script for this, so maybe we would need to adapt some support or something. I'll look into this and see what I can find---the simple option would be just using CMake to find the location of the go executable; the more in-depth option would be finding some CMake code that sets GO_PATH right for the build, etc., etc.

@rcurtin

rcurtin Aug 18, 2018

Member

The not_found_return() macro was defined for the Python bindings, so instead we should modify this a little bit---

macro (go_not_found_return message)
  message(STATUS "${message}")
  macro (add_go_binding name)
    # Do nothing.
  endmacro ()

  return()
endmacro ()

And then we can use go_not_found_return() instead.

Also I think we should add a check for whether Go is on the system. Looks like CMake doesn't have a package finder script for this, so maybe we would need to adapt some support or something. I'll look into this and see what I can find---the simple option would be just using CMake to find the location of the go executable; the more in-depth option would be finding some CMake code that sets GO_PATH right for the build, etc., etc.

# Add a macro to build a go binding.
macro (add_go_binding name)
if (BUILD_GO_BINDINGS)

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

Oh, I see, you're handling the case where the bindings aren't enabled down here. But if BUILD_GO_BINDINGS is not true, actually CMake will not make it down here because it will return after the check at the top of the file. So I think it might be better to have a separate definition of add_go_binding for the case where the bindings are not being built.

@rcurtin

rcurtin Aug 18, 2018

Member

Oh, I see, you're handling the case where the bindings aren't enabled down here. But if BUILD_GO_BINDINGS is not true, actually CMake will not make it down here because it will return after the check at the top of the file. So I think it might be better to have a separate definition of add_go_binding for the case where the bindings are not being built.

add_dependencies(mlpack_go_${name} build_go_${name})
add_dependencies(mlpack_go_${name} go_copy)
add_dependencies(mlpack_go_${name} go_util)

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

Crazy how much overhead is needed here to make this all work. But it does work. :)

@rcurtin

rcurtin Aug 18, 2018

Member

Crazy how much overhead is needed here to make this all work. But it does work. :)

@@ -0,0 +1,49 @@
/*
* @file generate_h_${PROGRAM_NAME}.cpp

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

A minor issue, but shouldn't the file be named generate_h.cpp.in not generate_go.h.in?

@rcurtin

rcurtin Aug 18, 2018

Member

A minor issue, but shouldn't the file be named generate_h.cpp.in not generate_go.h.in?

@@ -0,0 +1,49 @@
/*
* @file generate_go_${PROGRAM_NAME}.cpp

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

Actually I think this one should be named generate_go.cpp.in, since it's a(n unconfigured) C++ file not a Go file.

@rcurtin

rcurtin Aug 18, 2018

Member

Actually I think this one should be named generate_go.cpp.in, since it's a(n unconfigured) C++ file not a Go file.

d.name != "help" && d.name != "info" &&
d.name != "version")
inputOptions.push_back(it->first);
}

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

Here also I think we are not using inputOptions or outputOptions.

@rcurtin

rcurtin Aug 18, 2018

Member

Here also I think we are not using inputOptions or outputOptions.

std::string def = "nil";
if (std::is_same<T, bool>::value)
def = "false";

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

This is only enabled if the type has a serialize() function, so bool can't match here. Also this seems to be the same code as the overload above, so you could change the SFINAE and use only one function (assuming they are exactly the same, I think they are):

template<typename T>
void PrintMethodConfig(
    const util::ParamData& d,
    const size_t indent,
    const typename boost::enable_if<arma::is_arma_type<T>>::type* = 0,
    const typename boost::enable_if<data::HasSerialize<T>>::type* = 0)
@rcurtin

rcurtin Aug 18, 2018

Member

This is only enabled if the type has a serialize() function, so bool can't match here. Also this seems to be the same code as the overload above, so you could change the SFINAE and use only one function (assuming they are exactly the same, I think they are):

template<typename T>
void PrintMethodConfig(
    const util::ParamData& d,
    const size_t indent,
    const typename boost::enable_if<arma::is_arma_type<T>>::type* = 0,
    const typename boost::enable_if<data::HasSerialize<T>>::type* = 0)
namespace go {
/**
* Print parameter with it's default value for a standard option type.

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

Tiny typo that jumped out at me, this should be its. Sorry for being pedantic. :)

@rcurtin

rcurtin Aug 18, 2018

Member

Tiny typo that jumped out at me, this should be its. Sorry for being pedantic. :)

)
func TestRunBindingNoFlag(t *testing.T) {
t.Log("Test that when we run the binding correctly (with correct input parameters), we get the expected output.")

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

Can you make sure the lines are 80 characters or less please? (I guess the automatic style checker doesn't check .go files.)

@rcurtin

rcurtin Aug 18, 2018

Member

Can you make sure the lines are 80 characters or less please? (I guess the automatic style checker doesn't check .go files.)

@@ -0,0 +1,168 @@
/**

This comment has been minimized.

@rcurtin

rcurtin Aug 18, 2018

Member

I can't comment on the right file, so this comment is kinda off in nowhere, but do we need to include the file src/mlpack/bindings/go/tests/gofile/gofile? It seems like it's a binary file, maybe a build artifact or something?

@rcurtin

rcurtin Aug 18, 2018

Member

I can't comment on the right file, so this comment is kinda off in nowhere, but do we need to include the file src/mlpack/bindings/go/tests/gofile/gofile? It seems like it's a binary file, maybe a build artifact or something?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment