Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Type Error (Check type of Every parameter) #1717

Merged
merged 11 commits into from Mar 28, 2019

Conversation

Projects
None yet
2 participants
@Yashwants19
Copy link
Contributor

commented Feb 13, 2019

Address #1710

TODOS

  • Casting 2-d to 1-d array or 1-d to 2-d array
  • Type check for every parameter
  • Add Test for Thrown Exceptions
@rcurtin
Copy link
Member

left a comment

This seems to work just fine for copy_all_inputs. Do you want to address the types of every input parameter in a separate PR or this one? I'm happy to approve this one as-is if you can handle the couple simple comments I left.

Show resolved Hide resolved src/mlpack/bindings/python/print_pyx.cpp Outdated
Show resolved Hide resolved src/mlpack/bindings/python/print_pyx.cpp Outdated
@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Feb 16, 2019

You suggest should I use this PR or go for another PR ??

@rcurtin

This comment has been minimized.

Copy link
Member

commented Feb 16, 2019

Up to you, no difference from my side.

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Feb 16, 2019

I will go for new PR :)

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Feb 18, 2019

Hi @rcurtin Please review this. If I have to change something or not??

@rcurtin

This comment has been minimized.

Copy link
Member

commented Feb 18, 2019

Hi @Yashwants19, didn't I already say somewhere else that I will review everything when I have a chance? You can see there are nearly 60 PRs open, most of which need review. I do all these reviews in my free time, and I don't have infinite free time.

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Feb 18, 2019

Sorry sir my mistake.

@rcurtin
Copy link
Member

left a comment

Hi @Yashwants19, thanks for your continued work on these issues. I'm a bit confused because I see two things being solved here and neither are completely done, but they are on the right track. We can do both in this one PR, or you can split them into two PRs if you like. The two things are:

  • making sure copy_all_inputs has the right type. That code seems to be just fine, but we should add this type check for every parameter, not just copy_all_inputs. We should also add a test for this (make sure that an exception is thrown when we call with a wrong type).
  • casting a 2-d array to a 1-d array when needed; that also seems to work, but the 1-d to 2-d case doesn't seem to be done yet and we should add tests for those too.

Let me know what you'd like to do and how you'd like to proceed. Definitely this is the right direction so we are almost there. 👍

Show resolved Hide resolved src/mlpack/bindings/python/print_input_processing.hpp Outdated
Show resolved Hide resolved src/mlpack/bindings/python/print_input_processing.hpp
Show resolved Hide resolved src/mlpack/bindings/python/print_input_processing.hpp Outdated
<< ".shape[0] == 1 or " << d.name << "_tuple[0].shape[1] == 1:"
<< std::endl;
std::cout << prefix << " " << prefix << " " << d.name << "_tuple[0] = "
<< d.name << "_tuple" << "[0].ravel()" << std::endl;

This comment has been minimized.

Copy link
@rcurtin

rcurtin Feb 21, 2019

Member

I think you can put these two bits together ("_tuple" << "[0].ravel()"). And actually I think you can simplify the code overall if only lines 135-139 are in the body of the if---everything else appears to be the same.

Show resolved Hide resolved src/mlpack/bindings/python/print_input_processing.hpp
@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Feb 22, 2019

We can add task list here. Can do both task on this PR only.

@Yashwants19 Yashwants19 changed the title Fix Value Error (Check type copy_all_inputs) Fix Type Error (Check type of Every parameter) Feb 25, 2019

@rcurtin
Copy link
Member

left a comment

@Yashwants19 looking good, let me know when the rest of the implementation is done. We should also add tests for each of the situations you are fixing. 👍

Show resolved Hide resolved src/mlpack/bindings/python/print_input_processing.hpp Outdated
@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 10, 2019

Hi @rcurtin I was waiting for your review. I will complete these implementation as soon as possible. Thank You :)

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 10, 2019

Hi @rcurtin This is ready for review.

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 12, 2019

Hi @rcurtin Suggest me should I add more type for typecheck and I will add them as soon as possible

@rcurtin
Copy link
Member

left a comment

Nice work, looking good. 👍 (Sorry it takes me so long to get to these. I'd have loved to merge this earlier, but I'm just not able to find the time to review immediately.)

I left a couple comments that could help simplify the code. Let me know what you think or if I can clarify any of those.

Also, the tests look great, but we should also test the following cases:

  • Test passing two-dimensional unsigned matrix to ucol_in.
  • Test passing one-dimensional array to matrix_in. (Might need to add another parameter for this, since I think matrix_in expects a certain size.)
  • Test passing one-dimensional unsigned array to matrix_in.
  • Test passing one-dimensional array to matrix_with_info_in.

Those should be super easy adaptations of tests you've already written. 👍

Thanks again for the hard work with this. This will significantly improve the user experience from Python.

Show resolved Hide resolved src/mlpack/bindings/python/print_input_processing.hpp Outdated
{
std::cout << prefix << "if " << name << " is not " << def << ":"
<< std::endl;
}

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 14, 2019

Member

It looks like the output here is different for bools... why? Maybe I missed something.

This comment has been minimized.

Copy link
@Yashwants19

Yashwants19 Mar 14, 2019

Author Contributor

Here first I check the parameter as it is bool and then I pass the parameter as True or False

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 16, 2019

Member

I see, but the code is not doing the isinstance() check if the default value is not False.

I think you are doing the def == "False" check as another way if checking that the type T is bool, but honestly I think we can avoid the check entirely if we use a function like GetPrintableType<>() for the type part of the isinstance() call.

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 20, 2019

Member

I see, the different code is needed for bool because the defaults for all the others (which can be checked without a type) is None, but with bool we have to do the value check on the inside.

Anyway, it would probably be nice to add an empty line after the code above (but before the SetParam[] is printed), just to break the code up logically a bit. 👍

{
std::cout << prefix << " if isinstance(" << name << ", int):"
<< std::endl;
}

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 14, 2019

Member

Actually I think you can use GetPrintableType() here... take a look (get_printable_type.hpp) and tell me what you think.

std::cout << prefix << "  if isinstance(" << name << ", " << GetPrintableType<T>(d)
    << "):" << std::endl;

That could simplify this code quite a lot.

This comment has been minimized.

Copy link
@Yashwants19

Yashwants19 Mar 14, 2019

Author Contributor

If we use GetPrintableType() at the place of int or float or str it is not recognized by python as there is not datatype name double or string

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 16, 2019

Member

That's a fair point. The GetPrintableType<>() functions are used solely for printing documentation, so I would be okay adapting the typenames returned by that function to be the correct Python type. Specifically I can see in the code that double should change to float and string to str. But this still does not fully solve the problem for list of ... and the matrix types.

I think that for list types, we should make another overload of this function PrintInputProcessing() that is specifically for vector types. This would mean using SFINAE to make the following function:

template<typename T>
void PrintInputProcessing(
    const util::ParamData& d,
    const size_t indent,
    const typename boost::enable_if<util::IsStdVector<T>::type* = 0,
    const typename boost::disable_if<arma::is_arma_type<T>>::type* = 0,
    const typename boost::disable_if<data::HasSerialize<T>>::type* = 0,
    const typename boost::disable_if<std::is_same<T,
        std::tuple<data::DatasetInfo, arma::mat>>>::type* = 0)

and the implementation of that function would be basically the same as the one I am commenting on here, except that we would need to test two things: isinstance(name, list) and, if the length of the list is greater than 0, isinstance(name[0], PT) where PT is the Python type of the list element (you could access this type in C++ as T::value_type, so I think you could call GetPrintableType<T::value_type>() to get the right result here.

In addition, because of the SFINAE signature, you'll have to add the following parameter to all of the other PrintInputProcessing() overloads (except the one at the bottom of the file):

const typename boost::disable_if<util::IsStdVector<T>>::type* = 0

Do let me know if I can clarify any of this. SFINAE code can be ugly and confusing...

Show resolved Hide resolved src/mlpack/bindings/python/print_input_processing.hpp
Show resolved Hide resolved src/mlpack/bindings/python/print_input_processing.hpp Outdated
Show resolved Hide resolved src/mlpack/bindings/python/print_input_processing.hpp Outdated
Show resolved Hide resolved src/mlpack/bindings/python/tests/test_python_binding_main.cpp
Show resolved Hide resolved src/mlpack/bindings/python/tests/test_python_binding.py
Show resolved Hide resolved src/mlpack/bindings/python/tests/test_python_binding.py Outdated

@Yashwants19 Yashwants19 force-pushed the Yashwants19:master branch 2 times, most recently from 834dcac to bb3302b Mar 15, 2019

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 15, 2019

Hi @rcurtin This PR is set to be merged

{
std::cout << prefix << "if " << name << " is not " << def << ":"
<< std::endl;
}

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 16, 2019

Member

I see, but the code is not doing the isinstance() check if the default value is not False.

I think you are doing the def == "False" check as another way if checking that the type T is bool, but honestly I think we can avoid the check entirely if we use a function like GetPrintableType<>() for the type part of the isinstance() call.

{
std::cout << prefix << " if isinstance(" << name << ", int):"
<< std::endl;
}

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 16, 2019

Member

That's a fair point. The GetPrintableType<>() functions are used solely for printing documentation, so I would be okay adapting the typenames returned by that function to be the correct Python type. Specifically I can see in the code that double should change to float and string to str. But this still does not fully solve the problem for list of ... and the matrix types.

I think that for list types, we should make another overload of this function PrintInputProcessing() that is specifically for vector types. This would mean using SFINAE to make the following function:

template<typename T>
void PrintInputProcessing(
    const util::ParamData& d,
    const size_t indent,
    const typename boost::enable_if<util::IsStdVector<T>::type* = 0,
    const typename boost::disable_if<arma::is_arma_type<T>>::type* = 0,
    const typename boost::disable_if<data::HasSerialize<T>>::type* = 0,
    const typename boost::disable_if<std::is_same<T,
        std::tuple<data::DatasetInfo, arma::mat>>>::type* = 0)

and the implementation of that function would be basically the same as the one I am commenting on here, except that we would need to test two things: isinstance(name, list) and, if the length of the list is greater than 0, isinstance(name[0], PT) where PT is the Python type of the list element (you could access this type in C++ as T::value_type, so I think you could call GetPrintableType<T::value_type>() to get the right result here.

In addition, because of the SFINAE signature, you'll have to add the following parameter to all of the other PrintInputProcessing() overloads (except the one at the bottom of the file):

const typename boost::disable_if<util::IsStdVector<T>>::type* = 0

Do let me know if I can clarify any of this. SFINAE code can be ugly and confusing...

{
arma::Mat<size_t> out =
move(CLI::GetParam<arma::Mat<size_t>>("s_umatrix_in"));
out.row(0)*= 2.0;

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 16, 2019

Member

You should be able to just do out *= 2 here, assuming your intention is to multiply every element by 2 (I think that all inputs to smatrix_in and s_umatrix_in have only one row).

Show resolved Hide resolved src/mlpack/bindings/python/tests/test_python_binding_main.cpp

@Yashwants19 Yashwants19 force-pushed the Yashwants19:master branch from d32eec1 to 153899e Mar 17, 2019

@Yashwants19 Yashwants19 reopened this Mar 17, 2019

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 17, 2019

Sorry @rcurtin just ignore that I accidentally closed this Pull request while typing and had to reopen it

@Yashwants19 Yashwants19 force-pushed the Yashwants19:master branch from a79e915 to 72d143e Mar 17, 2019

ADD TYPE CHECK
ADD CONVERSION FROM 2D TO 1D
ADD CONVERSION FROM 1D TO 2D
ADD TEST FOR EXPECTION THROWN AND CONVERSIONS

@Yashwants19 Yashwants19 force-pushed the Yashwants19:master branch from 72d143e to 19874ad Mar 17, 2019

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 17, 2019

Hi @rcurtin I have made the changes as suggested

@rcurtin
Copy link
Member

left a comment

Hey @Yashwants19, thanks again for the hard work. 👍 I think we are just about ready to merge this, which will be really nice since it solves so many problems. I only have one more major comment about the code (see the comments on to_matrix() and to_matrix_with_info() throwing TypeErrors on their own), plus some tiny style issues I found, and when I reviewed the tests more in-depth I had a couple more questions and things to point out.

Thanks!

{
std::cout << prefix << "if " << name << " is not " << def << ":"
<< std::endl;
}

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 20, 2019

Member

I see, the different code is needed for bool because the defaults for all the others (which can be checked without a type) is None, but with bool we have to do the value check on the inside.

Anyway, it would probably be nice to add an empty line after the code above (but before the SetParam[] is printed), just to break the code up logically a bit. 👍

std::cout << prefix << "# Detect if the parameter was passed; set if so."
<< std::endl;
if (!d.required)
{
std::cout << prefix << "if " << name << " is not " << def << ":"
if (GetPrintableType<T>(d)== "bool")

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 20, 2019

Member

Tiny little style issue; this should be GetPrintableType<T>(d) == "bool".

}
std::cout << std::endl; // Extra line is to clear up the code a bit.
}

/**
* Print input processing for a matrix type.
*/

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 20, 2019

Member

Can you add a comment to this method? Print input processing for a vector type. or similar would be fine.

* if param_name is not None:
* if isinstance(param_name, list):
* if len(param_name) > 0 :
* if isinstance(param_name[0],str):

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 20, 2019

Member

More little style issues---we can do

if len(param_name) > 0:

and

if isinstance(param_name[0], str):
* else:
* raise TypeError("'param_name' must have type
* '(np.ndarray,pd.DataFrame,pd.Series)'!")
*/

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 20, 2019

Member

This comment would seem more appropriate in the body of the function, like the rest of the methods. 👍

@@ -640,13 +677,13 @@ def testMatrixAndInfoPandas(self):

for j in range(10):
self.assertEqual(output['matrix_and_info_out'][j, 4], z[cols[4]][j])

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 20, 2019

Member

No need to add blank spaces to the line (sorry, another trivial comment. At least they are easy to fix :)).

x = pd.DataFrame(np.random.rand(10, 4), columns=list('abcd'))
x['e'] = pd.Series(['a', 'b', 'c', 'd', 'a', 'b', 'e', 'c', 'a', 'b'],
x = pd.DataFrame(np.random.rand(9, 4), columns=list('abcd'))
x['e'] = pd.Series(['a', 'b', 'c', 'd', 'a', 'b', 'e', 'c', 'a' ],

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 20, 2019

Member

Why the switch to 9 from 10 elements?

Show resolved Hide resolved src/mlpack/bindings/python/tests/test_python_binding.py
Show resolved Hide resolved src/mlpack/bindings/python/tests/test_python_binding.py
output = test_python_binding(string_in='hello',
int_in=12,
double_in=4.0,
matrix_and_info_in=z)

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 20, 2019

Member

Hmm, technically I think the idea would be that we pass x['e'] directly here?

This comment has been minimized.

Copy link
@Yashwants19

Yashwants19 Mar 20, 2019

Author Contributor

Here I pass x i.e. 1-D and it is without column from matrix_and_info_in

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 21, 2019

Member

Well, now I'm a little bit confused. I thought that this test was testing that passing a Pandas series worked, but it seems like you're just passing a 2-D Pandas dataframe with one column.

This comment has been minimized.

Copy link
@Yashwants19

Yashwants19 Mar 21, 2019

Author Contributor

Previously I was testing that if we can pass 1-D matrix after casting it to 2-D (as matrix_and_info_in_reshape) from arma_numpy.numpy_to_mat_d()

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 22, 2019

Member

Ah, okay, but I don't think that what we're passing is actually a 1-d Pandas object. For that we'd need to pass a Series object, so I guess we'd need to do z[0] here in the call for matrix_and_info_in.

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 22, 2019

Member

By the way, everything else in the PR looks good to me, so I'm ready to approve it once we handle this last comment.

Yashwants19 added some commits Mar 20, 2019

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 20, 2019

Hi @rcurtin This is ready for review. I have made the changes as suggested

if not hasattr(x, '__len__') and \
not hasattr(x, 'shape') and \
not hasattr(x, '__array__'):
raise TypeError("given argument is not array-like")

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 21, 2019

Member

Ah, sorry, I meant that the check is already in these functions, and it's this one right here. So I think we can leave to_matrix() and to_matrix_with_info() as they were before the changes you made, since they already throw TypeErrors if they are not array-like. (A nice side-effect of this way of checking, just for __len__, shape, and __array__, means that other array-like things can also be passed to Python bindings, not just numpy and Pandas matrices.)

Yashwants19 added some commits Mar 21, 2019

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 21, 2019

Hi @rcurtin I have made the changes as suggested.

Yashwants19 added some commits Mar 23, 2019

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 23, 2019

Hi @rcurtin This PR is set to be merged

@@ -81,7 +81,10 @@ def to_matrix_with_info(x, dtype, copy=False):

if isinstance(x, np.ndarray):
# It is already an ndarray, so the vector of info is all 0s (all numeric).
d = np.zeros([x.shape[1]], dtype=np.bool)
if len(x.shape) < 2:
d = np.zeros(0, dtype=np.bool)

This comment has been minimized.

Copy link
@rcurtin

rcurtin Mar 24, 2019

Member

Shouldn't this just be 1 in this case though? Correct me if I'm wrong. This might work and run, but I suspect that there is an invalid memory access from the mlpack side if d really has length 0. This comment applies to the other d = np.zeros(0, dtype=np.bool) too.

This comment has been minimized.

Copy link
@Yashwants19

Yashwants19 Mar 24, 2019

Author Contributor

Hi @rcurtin I was also bit confused about this. Thank you for correcting me. :)

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 24, 2019

Hi @rcurtin if no one is working on #1492(go bindings). I would like to continue this.

@rcurtin
Copy link
Member

left a comment

@Yashwants19 thanks so much for your hard work on this. I think it is ready for merge. As for the Go bindings, please do feel free to pick them up. If I remember right I had reviewed the PR so there were a lot of comments to be addressed. Also it will need to be adapted to the new Markdown bindings, which changed a few things, but I can help with that when the time comes. 👍

@Yashwants19

This comment has been minimized.

Copy link
Contributor Author

commented Mar 26, 2019

Thanks @rcurtin without your help, it couldn't be implemented. I will start working on Go bindings as soon as my exams are over. :)

@mlpack-bot

mlpack-bot bot approved these changes Mar 26, 2019

Copy link

left a comment

Second approval provided automatically after 24 hours. 👍

@rcurtin rcurtin merged commit f853797 into mlpack:master Mar 28, 2019

6 checks passed

LaTeX Documentation Checks Build finished.
Details
Memory Checks Build finished.
Details
Static Code Analysis Checks Build finished.
Details
Style Checks Build finished.
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@rcurtin

This comment has been minimized.

Copy link
Member

commented Mar 28, 2019

@Yashwants19 thanks so much for the hard work on this. It's really great to have this merged and it fixes some very big problems. I'll resolve all the related tickets and hopefully nobody will need to reopen them. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.