Fix Type Error (Check type of Every parameter) #1717

Yashwants19 · 2019-02-13T08:16:21Z

Address #1710

TODOS

Casting 2-d to 1-d array or 1-d to 2-d array
Type check for every parameter
Add Test for Thrown Exceptions

rcurtin

This seems to work just fine for copy_all_inputs. Do you want to address the types of every input parameter in a separate PR or this one? I'm happy to approve this one as-is if you can handle the couple simple comments I left.

src/mlpack/bindings/python/print_pyx.cpp

Yashwants19 · 2019-02-16T01:10:11Z

Which do you suggest: should I use this PR or open another PR?

rcurtin · 2019-02-16T06:25:57Z

Up to you, no difference from my side.

Yashwants19 · 2019-02-16T06:48:35Z

I will go for new PR :)

Yashwants19 · 2019-02-18T17:51:10Z

Hi @rcurtin, please review this. Do I need to change anything?

rcurtin · 2019-02-18T17:53:33Z

Hi @Yashwants19, didn't I already say somewhere else that I will review everything when I have a chance? You can see there are nearly 60 PRs open, most of which need review. I do all these reviews in my free time, and I don't have infinite free time.

Yashwants19 · 2019-02-18T17:56:48Z

Sorry sir, my mistake.

rcurtin

Hi @Yashwants19, thanks for your continued work on these issues. I'm a bit confused because I see two things being solved here and neither is completely done, but they are on the right track. We can do both in this one PR, or you can split them into two PRs if you like. The two things are:

making sure copy_all_inputs has the right type. That code seems to be just fine, but we should add this type check for every parameter, not just copy_all_inputs. We should also add a test for this (make sure that an exception is thrown when we call with a wrong type).
casting a 2-d array to a 1-d array when needed; that also seems to work, but the 1-d to 2-d case doesn't seem to be done yet and we should add tests for those too.

Let me know what you'd like to do and how you'd like to proceed. Definitely this is the right direction so we are almost there. 👍
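
For illustration, the first item (a type check that raises, plus a test that the exception is thrown) could look roughly like this in plain Python. This is only a sketch: `check_param_type` is a hypothetical helper rather than anything in the bindings, and the message format is borrowed from the error strings that appear in this PR.

```python
import unittest


def check_param_type(name, value, expected_type):
    """Raise TypeError if value is not an instance of expected_type.

    Hypothetical helper; the real bindings emit equivalent checks inline.
    """
    if not isinstance(value, expected_type):
        raise TypeError("'%s' must have type '%s'!" %
                        (name, expected_type.__name__))
    return value


class TestTypeChecks(unittest.TestCase):
    def test_correct_type_is_accepted(self):
        self.assertEqual(check_param_type('int_in', 12, int), 12)

    def test_wrong_type_raises(self):
        # A string where an int is expected should raise TypeError.
        with self.assertRaises(TypeError):
            check_param_type('int_in', '12', int)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTypeChecks)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```
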

src/mlpack/bindings/python/print_input_processing.hpp

rcurtin · 2019-02-21T15:09:33Z

src/mlpack/bindings/python/print_input_processing.hpp

+	  << ".shape[0] == 1 or " << d.name << "_tuple[0].shape[1] == 1:" 
+	  << std::endl;
+      std::cout << prefix << "  " << prefix << "  " << d.name << "_tuple[0] = " 
+	  << d.name << "_tuple" << "[0].ravel()" << std::endl;


I think you can put these two bits together ("_tuple" << "[0].ravel()"). And actually I think you can simplify the code overall if only lines 135-139 are in the body of the if---everything else appears to be the same.
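
To make the intent of the printed condition concrete: a 2-D input with a single row or a single column gets flattened to 1-D. The sketch below mimics that on nested lists so that it runs without NumPy. `ravel_if_vector` is a hypothetical name, and the generated code calls `.ravel()` on the array instead.

```python
def ravel_if_vector(rows):
    """Flatten a 2-D nested list that has one row or one column.

    Pure-Python stand-in (an assumption for illustration) for checking
    shape[0] == 1 or shape[1] == 1 and then calling .ravel().
    """
    n_rows = len(rows)
    n_cols = len(rows[0]) if n_rows > 0 else 0
    if n_rows == 1 or n_cols == 1:
        return [value for row in rows for value in row]
    # Genuinely two-dimensional input is left untouched.
    return rows


row_vector = ravel_if_vector([[1, 2, 3]])
col_vector = ravel_if_vector([[1], [2], [3]])
matrix = ravel_if_vector([[1, 2], [3, 4]])
```
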

src/mlpack/bindings/python/print_input_processing.hpp

Yashwants19 · 2019-02-22T06:56:36Z

We can add a task list here. I can do both tasks in this PR.

rcurtin

@Yashwants19 looking good, let me know when the rest of the implementation is done. We should also add tests for each of the situations you are fixing. 👍

src/mlpack/bindings/python/print_input_processing.hpp

Yashwants19 · 2019-03-10T03:55:41Z

Hi @rcurtin, I was waiting for your review. I will complete the implementation as soon as possible. Thank you :)

Yashwants19 · 2019-03-10T13:48:56Z

Hi @rcurtin This is ready for review.

Yashwants19 · 2019-03-12T15:50:57Z

Hi @rcurtin, please tell me whether I should add more types to the type check, and I will add them as soon as possible.

rcurtin

Nice work, looking good. 👍 (Sorry it takes me so long to get to these. I'd have loved to merge this earlier, but I'm just not able to find the time to review immediately.)

I left a couple comments that could help simplify the code. Let me know what you think or if I can clarify any of those.

Also, the tests look great, but we should also test the following cases:

Test passing two-dimensional unsigned matrix to ucol_in.
Test passing one-dimensional array to matrix_in. (Might need to add another parameter for this, since I think matrix_in expects a certain size.)
Test passing one-dimensional unsigned array to matrix_in.
Test passing one-dimensional array to matrix_with_info_in.

Those should be super easy adaptations of tests you've already written. 👍

Thanks again for the hard work with this. This will significantly improve the user experience from Python.

src/mlpack/bindings/python/print_input_processing.hpp

rcurtin · 2019-03-14T01:23:37Z

src/mlpack/bindings/python/print_input_processing.hpp

+    {
+      std::cout << prefix << "if " << name << " is not " << def << ":"
+        << std::endl;
+    }


It looks like the output here is different for bools... why? Maybe I missed something.

Here I first check that the parameter is a bool, and then I pass the parameter as True or False.

I see, but the code is not doing the isinstance() check if the default value is not False.

I think you are doing the def == "False" check as another way of checking that the type T is bool, but honestly I think we can avoid the check entirely if we use a function like GetPrintableType<>() for the type part of the isinstance() call.

I see, the different code is needed for bool because the default for all the others (which can be checked without a type) is None, but with bool we have to do the value check on the inside.

Anyway, it would probably be nice to add an empty line after the code above (but before the SetParam[] is printed), just to break the code up logically a bit. 👍
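
The distinction discussed in this thread can be sketched in Python. Optional non-bool parameters default to None, so presence can be tested before any type check. An optional bool defaults to False, so the type check has to come first and the value check goes inside it. `process_optional` and `process_flag` are hypothetical names, and the `params` dict merely stands in for the SetParam[] calls.

```python
def process_optional(name, value, expected_type, params):
    """Non-bool case: the default is None, so test presence first."""
    if value is not None:
        if isinstance(value, expected_type):
            params[name] = value
        else:
            raise TypeError("'%s' must have type '%s'!" %
                            (name, expected_type.__name__))


def process_flag(name, value, params):
    """Bool case: the default is False, so test the type first."""
    if isinstance(value, bool):
        if value is not False:
            params[name] = value
    else:
        raise TypeError("'%s' must have type 'bool'!" % name)


params = {}
process_optional('int_in', 12, int, params)         # passed, so it is set
process_optional('double_in', None, float, params)  # not passed, skipped
process_flag('flag1', True, params)                 # passed, so it is set
process_flag('flag2', False, params)                # default, skipped
```
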

rcurtin · 2019-03-14T01:24:32Z

src/mlpack/bindings/python/print_input_processing.hpp

+    {
+      std::cout << prefix << "  if isinstance(" << name << ", int):"
+        << std::endl;
+    }


Actually I think you can use GetPrintableType() here... take a look (get_printable_type.hpp) and tell me what you think.

std::cout << prefix << " if isinstance(" << name << ", " << GetPrintableType<T>(d) << "):" << std::endl;

That could simplify this code quite a lot.

If we use GetPrintableType() in place of int, float, or str, the result is not recognized by Python, as there is no datatype named double or string.

That's a fair point. The GetPrintableType<>() functions are used solely for printing documentation, so I would be okay adapting the typenames returned by that function to be the correct Python type. Specifically I can see in the code that double should change to float and string to str. But this still does not fully solve the problem for list of ... and the matrix types.
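
As a rough Python-side picture of that renaming: the printed isinstance() line has to use Python's type names, so double becomes float and string becomes str. The table and the `isinstance_line` helper below are hypothetical and exist only to illustrate the idea. In the PR itself the renaming would happen inside the C++ GetPrintableType<>() functions.

```python
# Assumed mapping from C++ type names to Python type names; only the
# double -> float and string -> str renames come from the discussion.
PYTHON_TYPE_NAMES = {
    'int': 'int',
    'double': 'float',
    'string': 'str',
    'bool': 'bool',
}


def isinstance_line(param_name, cpp_type_name):
    """Build the text of an isinstance() check for a scalar parameter."""
    python_name = PYTHON_TYPE_NAMES[cpp_type_name]
    return "if isinstance(%s, %s):" % (param_name, python_name)


line = isinstance_line('double_in', 'double')
```
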

I think that for list types, we should make another overload of this function PrintInputProcessing() that is specifically for vector types. This would mean using SFINAE to make the following function:

template<typename T>
void PrintInputProcessing(
    const util::ParamData& d,
    const size_t indent,
    const typename boost::enable_if<util::IsStdVector<T>>::type* = 0,
    const typename boost::disable_if<arma::is_arma_type<T>>::type* = 0,
    const typename boost::disable_if<data::HasSerialize<T>>::type* = 0,
    const typename boost::disable_if<std::is_same<T,
        std::tuple<data::DatasetInfo, arma::mat>>>::type* = 0)

and the implementation of that function would be basically the same as the one I am commenting on here, except that we would need to test two things: isinstance(name, list) and, if the length of the list is greater than 0, isinstance(name[0], PT) where PT is the Python type of the list element (you could access this type in C++ as T::value_type, so I think you could call GetPrintableType<typename T::value_type>() to get the right result here).

In addition, because of the SFINAE signature, you'll have to add the following parameter to all of the other PrintInputProcessing() overloads (except the one at the bottom of the file):

const typename boost::disable_if<util::IsStdVector<T>>::type* = 0

Do let me know if I can clarify any of this. SFINAE code can be ugly and confusing...
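
On the Python side, the two checks described for list parameters might come out roughly as follows. This is a sketch under assumptions: `check_list_param` is a hypothetical helper, the error wording is invented for illustration, and the real generated code would print these checks inline.

```python
def check_list_param(name, value, element_type):
    """Check that value is a list whose first element has element_type.

    Hypothetical helper mirroring the two checks described above; only
    the first element is inspected, and an empty list passes.
    """
    if not isinstance(value, list):
        raise TypeError("'%s' must have type 'list'!" % name)
    if len(value) > 0:
        if not isinstance(value[0], element_type):
            raise TypeError("'%s' must have type 'list of %ss'!" %
                            (name, element_type.__name__))
    return value


ok = check_list_param('vector_in', [1, 2, 3], int)
empty = check_list_param('str_vector_in', [], str)

try:
    check_list_param('str_vector_in', [1, 2], str)
    rejected = False
except TypeError:
    rejected = True
```

Inspecting only the first element keeps the check cheap, at the cost of not catching a mixed-type list.
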

src/mlpack/bindings/python/print_input_processing.hpp

src/mlpack/bindings/python/tests/test_python_binding_main.cpp

src/mlpack/bindings/python/tests/test_python_binding.py

Yashwants19 · 2019-03-15T08:45:12Z

Hi @rcurtin This PR is set to be merged

rcurtin · 2019-03-16T19:19:47Z

src/mlpack/bindings/python/print_input_processing.hpp

+    {
+      std::cout << prefix << "if " << name << " is not " << def << ":"
+        << std::endl;
+    }


I see, but the code is not doing the isinstance() check if the default value is not False.

I think you are doing the def == "False" check as another way of checking that the type T is bool, but honestly I think we can avoid the check entirely if we use a function like GetPrintableType<>() for the type part of the isinstance() call.

rcurtin · 2019-03-16T19:34:09Z

src/mlpack/bindings/python/print_input_processing.hpp

+    {
+      std::cout << prefix << "  if isinstance(" << name << ", int):"
+        << std::endl;
+    }


That's a fair point. The GetPrintableType<>() functions are used solely for printing documentation, so I would be okay adapting the typenames returned by that function to be the correct Python type. Specifically I can see in the code that double should change to float and string to str. But this still does not fully solve the problem for list of ... and the matrix types.

I think that for list types, we should make another overload of this function PrintInputProcessing() that is specifically for vector types. This would mean using SFINAE to make the following function:

template<typename T>
void PrintInputProcessing(
    const util::ParamData& d,
    const size_t indent,
    const typename boost::enable_if<util::IsStdVector<T>>::type* = 0,
    const typename boost::disable_if<arma::is_arma_type<T>>::type* = 0,
    const typename boost::disable_if<data::HasSerialize<T>>::type* = 0,
    const typename boost::disable_if<std::is_same<T,
        std::tuple<data::DatasetInfo, arma::mat>>>::type* = 0)

and the implementation of that function would be basically the same as the one I am commenting on here, except that we would need to test two things: isinstance(name, list) and, if the length of the list is greater than 0, isinstance(name[0], PT) where PT is the Python type of the list element (you could access this type in C++ as T::value_type, so I think you could call GetPrintableType<typename T::value_type>() to get the right result here).

In addition, because of the SFINAE signature, you'll have to add the following parameter to all of the other PrintInputProcessing() overloads (except the one at the bottom of the file):

const typename boost::disable_if<util::IsStdVector<T>>::type* = 0

Do let me know if I can clarify any of this. SFINAE code can be ugly and confusing...

rcurtin · 2019-03-16T19:38:21Z

src/mlpack/bindings/python/tests/test_python_binding_main.cpp

+  {
+    arma::Mat<size_t> out =
+        move(CLI::GetParam<arma::Mat<size_t>>("s_umatrix_in"));
+    out.row(0)*= 2.0;


You should be able to just do out *= 2 here, assuming your intention is to multiply every element by 2 (I think that all inputs to smatrix_in and s_umatrix_in have only one row).

src/mlpack/bindings/python/tests/test_python_binding_main.cpp

Yashwants19 · 2019-03-17T09:19:11Z

Sorry @rcurtin, please ignore that. I accidentally closed this pull request while typing and had to reopen it.

Yashwants19 · 2019-03-17T12:32:36Z

Hi @rcurtin I have made the changes as suggested

rcurtin

Hey @Yashwants19, thanks again for the hard work. 👍 I think we are just about ready to merge this, which will be really nice since it solves so many problems. I only have one more major comment about the code (see the comments on to_matrix() and to_matrix_with_info() throwing TypeErrors on their own), plus some tiny style issues I found, and when I reviewed the tests more in-depth I had a couple more questions and things to point out.

Thanks!

rcurtin · 2019-03-20T01:42:19Z

src/mlpack/bindings/python/print_input_processing.hpp

+    {
+      std::cout << prefix << "if " << name << " is not " << def << ":"
+        << std::endl;
+    }


I see, the different code is needed for bool because the default for all the others (which can be checked without a type) is None, but with bool we have to do the value check on the inside.

Anyway, it would probably be nice to add an empty line after the code above (but before the SetParam[] is printed), just to break the code up logically a bit. 👍

rcurtin · 2019-03-20T01:42:36Z

src/mlpack/bindings/python/print_input_processing.hpp

  std::cout << prefix << "# Detect if the parameter was passed; set if so."
      << std::endl;
  if (!d.required)
  {
-    std::cout << prefix << "if " << name << " is not " << def << ":"
+    if (GetPrintableType<T>(d)== "bool")


Tiny little style issue; this should be GetPrintableType<T>(d) == "bool".

rcurtin · 2019-03-20T01:44:59Z

src/mlpack/bindings/python/print_input_processing.hpp

  }
  std::cout << std::endl; // Extra line is to clear up the code a bit.
 }

-/**
- * Print input processing for a matrix type.
- */


Can you add a comment to this method? "Print input processing for a vector type." or similar would be fine.

rcurtin · 2019-03-20T01:45:48Z

src/mlpack/bindings/python/print_input_processing.hpp

+   *  if param_name is not None:
+   *    if isinstance(param_name, list):
+   *      if len(param_name) > 0 :
+   *        if isinstance(param_name[0],str):


More little style issues---we can do

if len(param_name) > 0:

and

if isinstance(param_name[0], str):

rcurtin · 2019-03-20T01:47:19Z

src/mlpack/bindings/python/print_input_processing.hpp

+ *   else:
+ *     raise TypeError("'param_name' must have type 
+ *         '(np.ndarray,pd.DataFrame,pd.Series)'!")
+ */


This comment would seem more appropriate in the body of the function, like the rest of the methods. 👍

rcurtin · 2019-03-20T01:55:33Z

src/mlpack/bindings/python/tests/test_python_binding.py

@@ -640,13 +677,13 @@ def testMatrixAndInfoPandas(self):

    for j in range(10):
      self.assertEqual(output['matrix_and_info_out'][j, 4], z[cols[4]][j])
-
+      


No need to add blank spaces to the line (sorry, another trivial comment. At least they are easy to fix :)).

rcurtin · 2019-03-20T01:56:29Z

src/mlpack/bindings/python/tests/test_python_binding.py

-    x = pd.DataFrame(np.random.rand(10, 4), columns=list('abcd'))
-    x['e'] = pd.Series(['a', 'b', 'c', 'd', 'a', 'b', 'e', 'c', 'a', 'b'],
+    x = pd.DataFrame(np.random.rand(9, 4), columns=list('abcd'))
+    x['e'] = pd.Series(['a', 'b', 'c', 'd', 'a', 'b', 'e', 'c', 'a' ],


Why the switch to 9 from 10 elements?

src/mlpack/bindings/python/tests/test_python_binding.py

rcurtin · 2019-03-20T02:00:32Z

src/mlpack/bindings/python/tests/test_python_binding.py

+    output = test_python_binding(string_in='hello',
+                                 int_in=12,
+                                 double_in=4.0,
+                                 matrix_and_info_in=z)


Hmm, technically I think the idea would be that we pass x['e'] directly here?

Here I pass x i.e. 1-D and it is without column from matrix_and_info_in

Well, now I'm a little bit confused. I thought that this test was testing that passing a Pandas series worked, but it seems like you're just passing a 2-D Pandas dataframe with one column.

Previously I was testing whether we can pass a 1-D matrix after casting it to 2-D (as matrix_and_info_in_reshape) from arma_numpy.numpy_to_mat_d().

Ah, okay, but I don't think that what we're passing is actually a 1-d Pandas object. For that we'd need to pass a Series object, so I guess we'd need to do z[0] here in the call for matrix_and_info_in.

By the way, everything else in the PR looks good to me, so I'm ready to approve it once we handle this last comment.

Yashwants19 · 2019-03-20T16:41:52Z

Hi @rcurtin This is ready for review. I have made the changes as suggested

rcurtin · 2019-03-21T00:25:40Z

src/mlpack/bindings/python/mlpack/matrix_utils.py

+    if not hasattr(x, '__len__') and \
+        not hasattr(x, 'shape') and \
+        not hasattr(x, '__array__'):
+      raise TypeError("given argument is not array-like")


Ah, sorry, I meant that the check is already in these functions, and it's this one right here. So I think we can leave to_matrix() and to_matrix_with_info() as they were before the changes you made, since they already throw TypeErrors if the input is not array-like. (A nice side-effect of checking this way, just for __len__, shape, and __array__, is that other array-like things can also be passed to Python bindings, not just numpy and Pandas matrices.)
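
A small standalone sketch of that duck-typed check shows which inputs it lets through. The condition is the one from the diff hunk above, while the `check_array_like` wrapper and the `ShapeOnly` class are hypothetical and exist only for this illustration.

```python
def check_array_like(x):
    """Raise TypeError unless x has __len__, shape, or __array__."""
    if not hasattr(x, '__len__') and \
            not hasattr(x, 'shape') and \
            not hasattr(x, '__array__'):
        raise TypeError("given argument is not array-like")


class ShapeOnly:
    """Hypothetical container exposing only a shape attribute."""
    shape = (3, 2)


# Lists, tuples, and anything with a shape attribute are accepted.
accepted = []
for candidate in ([1, 2, 3], (1.0, 2.0), ShapeOnly()):
    check_array_like(candidate)
    accepted.append(type(candidate).__name__)

# A bare scalar has none of the three attributes and is rejected.
try:
    check_array_like(5)
    scalar_rejected = False
except TypeError:
    scalar_rejected = True
```

One consequence of testing for `__len__` alone is that a plain string also counts as array-like under this check.
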

Yashwants19 · 2019-03-21T23:32:47Z

Hi @rcurtin I have made the changes as suggested.

Yashwants19 · 2019-03-23T13:42:33Z

Hi @rcurtin This PR is set to be merged

rcurtin · 2019-03-24T23:01:49Z

src/mlpack/bindings/python/mlpack/matrix_utils.py

@@ -81,7 +81,10 @@ def to_matrix_with_info(x, dtype, copy=False):

  if isinstance(x, np.ndarray):
    # It is already an ndarray, so the vector of info is all 0s (all numeric).
-    d = np.zeros([x.shape[1]], dtype=np.bool)
+    if len(x.shape) < 2:
+      d = np.zeros(0, dtype=np.bool)


Shouldn't this just be 1 in this case though? Correct me if I'm wrong. This might work and run, but I suspect that there is an invalid memory access from the mlpack side if d really has length 0. This comment applies to the other d = np.zeros(0, dtype=np.bool) too.

Hi @rcurtin, I was also a bit confused about this. Thank you for correcting me. :)
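
The length question can be illustrated without NumPy. The sketch below assumes, as suggested in the review comment, that a 1-D input counts as a single dimension, so its info vector has length 1 rather than 0. `info_vector` is a hypothetical name, and a plain list of False values stands in for the np.zeros(..., dtype=np.bool) array.

```python
def info_vector(shape):
    """Return an all-False info vector for an array of the given shape.

    Assumes a 1-D input is treated as a single dimension, so the vector
    has length 1; a 2-D input gets one entry per column (shape[1]).
    """
    if len(shape) < 2:
        return [False]
    return [False] * shape[1]


one_d = info_vector((10,))
two_d = info_vector((10, 4))
```
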

Yashwants19 · 2019-03-24T23:20:49Z

Hi @rcurtin, if no one is working on #1492 (Go bindings), I would like to continue it.

rcurtin

@Yashwants19 thanks so much for your hard work on this. I think it is ready for merge. As for the Go bindings, please do feel free to pick them up. If I remember right I had reviewed the PR so there were a lot of comments to be addressed. Also it will need to be adapted to the new Markdown bindings, which changed a few things, but I can help with that when the time comes. 👍

Yashwants19 · 2019-03-26T11:17:02Z

Thanks @rcurtin, without your help this couldn't have been implemented. I will start working on the Go bindings as soon as my exams are over. :)

mlpack-bot

Second approval provided automatically after 24 hours. 👍

rcurtin · 2019-03-28T15:44:13Z

@Yashwants19 thanks so much for the hard work on this. It's really great to have this merged and it fixes some very big problems. I'll resolve all the related tickets and hopefully nobody will need to reopen them. :)

mlpack-bot bot added s: needs review s: unanswered s: unlabeled labels Feb 13, 2019

rcurtin added c: automatic bindings t: bugfix and removed s: unlabeled labels Feb 14, 2019

rcurtin reviewed Feb 15, 2019

rcurtin removed the s: unanswered label Feb 15, 2019

rcurtin mentioned this pull request Feb 15, 2019

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). #1710

Closed

rcurtin reviewed Feb 21, 2019

Yashwants19 changed the title ~~Fix Value Error (Check type copy_all_inputs)~~ Fix Type Error (Check type of Every parameter) Feb 25, 2019

rcurtin reviewed Mar 9, 2019

rcurtin reviewed Mar 14, 2019

Yashwants19 force-pushed the master branch 2 times, most recently from 834dcac to bb3302b Compare March 15, 2019 02:55

rcurtin reviewed Mar 16, 2019

Yashwants19 closed this Mar 17, 2019

Yashwants19 force-pushed the master branch from d32eec1 to 153899e Compare March 17, 2019 07:37

Yashwants19 reopened this Mar 17, 2019

Yashwants19 force-pushed the master branch from a79e915 to 72d143e Compare March 17, 2019 10:34

ADD TYPE CHECK

19874ad

ADD CONVERSION FROM 2D TO 1D ADD CONVERSION FROM 1D TO 2D ADD TEST FOR EXPECTION THROWN AND CONVERSIONS

Yashwants19 force-pushed the master branch from 72d143e to 19874ad Compare March 17, 2019 10:44

rcurtin reviewed Mar 20, 2019

Yashwants19 added 5 commits March 20, 2019 21:28

ADD MORE TEST AS SUGGESTED

ce78260

Update print_input_processing.hpp

b1a9fc0

Add type check in python file

bfd4e06

Resolve Style Checks

404753d

Resolve 1-D pass from matrix_and_info_in

c939807

rcurtin reviewed Mar 21, 2019

Yashwants19 added 2 commits March 21, 2019 07:20

Update matrix_utils.py

a85c720

Update test_python_binding.py

6aca7d9

Yashwants19 added 2 commits March 23, 2019 16:14

Update matrix_utils.py

c1d3171

Update test_python_binding.py

2907ec1

rcurtin reviewed Mar 24, 2019

Update matrix_utils.py

f853797

rcurtin approved these changes Mar 25, 2019

mlpack-bot bot approved these changes Mar 26, 2019

rcurtin merged commit f853797 into mlpack:master Mar 28, 2019
