error when scoring an image using any trained classifier that isn't "Fast Gentle Boosting" #178

Closed
daviddao opened this Issue May 4, 2016 · 17 comments

Contributor

daviddao commented May 4, 2016

From CellProfiler Forum Post:

Hello CPA team!

When using the classifier in CPA I get the following error when trying to score an image using any trained classifier that isn't "Fast Gentle Boosting".

An error occurred in the program:

TypeError: ufunc 'isinf' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Traceback (most recent call last):
  File "cpa\classifier.pyc", line 1460, in OnScoreImage
  File "cpa\classifier.pyc", line 1509, in ScoreImage
  File "cpa\generalclassifier.pyc", line 78, in FilterObjectsFromClassN
  File "cpa\multiclasssql.pyc", line 127, in FilterObjectsFromClassN
  File "cpa\multiclasssql.pyc", line 157, in processData
  File "numpy\lib\type_check.pyc", line 374, in nan_to_num
  File "numpy\lib\ufunclike.pyc", line 113, in isposinf

I also get a similar error when scoring all images, or when trying to fetch positive/negative/uncertain objects; the last four lines of the call stack are the same. Looking at the source for cpa/multiclasssql.py, the try-except in processData calls np.nan_to_num (which raises the TypeError) once in the try block and once in the except block. It may be that cell_data is not getting cleaned up properly before being passed to np.nan_to_num. For example, when np.nan_to_num is called on a NumPy array containing a None, the above TypeError gets raised.

I didn't see any issues related to this on the CPA GitHub page so I figured I'd share it here.

I'm on 64-bit Windows 8.1 and using the CPA nightly build. The error also occurs when using the current stable 2.2.1 CPA build.

@daviddao daviddao added the Bug label May 4, 2016

Collaborator

jhung0 commented May 4, 2016

Can you ask whether they have any idea what data types are there?

Collaborator

jhung0 commented May 4, 2016

Is there a dictionary or something similar in the data?
np.nan_to_num(np.array([{}]))
reproduces the error.

Collaborator

jhung0 commented May 4, 2016

The same error also occurs if the array's dtype is object.
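For anyone who wants to reproduce this outside CPA, here is a minimal sketch that assumes only NumPy; the sample values and the helper name isinf_fails are made up for illustration. A None in the input makes NumPy infer dtype object, and np.isinf (which the traceback shows being reached from np.nan_to_num) has no implementation for object arrays. Newer NumPy releases may handle object arrays in np.nan_to_num differently, so the sketch calls np.isinf directly.

```python
import numpy as np

# Made-up sample: one measurement is missing (None), as a SQL NULL would be.
cell_data = np.array([[1.5, None], [2.0, 3.0]])
print(cell_data.dtype)  # object


def isinf_fails(arr):
    """Return True if np.isinf rejects the array with a TypeError."""
    try:
        np.isinf(arr)
    except TypeError:
        return True
    return False


print(isinf_fails(cell_data))                  # object array with a None
print(isinf_fails(np.array([1.0, np.nan])))    # plain float array
```

The float array is accepted even though it contains NaN, which suggests the None (and the resulting object dtype) is the trigger rather than missing values as such.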

Contributor

daviddao commented May 5, 2016

Can we check for that automatically? Return an error if dtype is object.

Collaborator

jhung0 commented May 5, 2016

yeah we can add a line to check the type
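A type check of this kind could look roughly like the sketch below. It is not the code that went into CPA; the function name check_numeric and the error message are hypothetical, and it assumes the data arrives as something np.asarray can convert.

```python
import numpy as np


def check_numeric(cell_data):
    """Return cell_data as an array, or raise if its dtype is not numeric.

    Hypothetical helper: an object dtype usually means the data contains
    None or some other non-numeric Python object.
    """
    arr = np.asarray(cell_data)
    if not np.issubdtype(arr.dtype, np.number):
        raise TypeError("data type is %s; expected a numeric array" % arr.dtype)
    return arr


print(check_numeric([[1.0, 2.0], [3.0, 4.0]]).dtype)

try:
    check_numeric([[1.0, None], [3.0, 4.0]])
except TypeError as err:
    print("rejected:", err)
```

Failing early with the dtype in the message makes the cause easier to spot than the ufunc error raised deep inside NumPy.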

Collaborator

jhung0 commented May 6, 2016

Now an exception should be raised if the type is object. If not, it should print the data type.

jonchar commented May 23, 2016

I posted in the forums, but forgot to cross-post here:

I've installed the latest nightly (with commit 1ada1d61), loaded my training set, trained the classifier, and tried to score an image but hit the new exception: Exception: data type is object.

I'm going to guess this means there may be some missing values in the data exported from the CellProfiler pipeline? Is there an easy way to fill these or make sure the types are consistent in my data set?

Collaborator

jhung0 commented May 24, 2016

Ok, I might have found the problem...
Try it after 5966d42

jonchar commented May 25, 2016

Just downloaded the latest nightly and received essentially the original error:

An error occurred in the program:
TypeError: ufunc 'isinf' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Traceback (most recent call last):
  File "cpa\classifier.pyc", line 1460, in OnScoreImage
  File "cpa\classifier.pyc", line 1509, in ScoreImage
  File "cpa\generalclassifier.pyc", line 78, in FilterObjectsFromClassN
  File "cpa\multiclasssql.pyc", line 127, in FilterObjectsFromClassN
  File "cpa\multiclasssql.pyc", line 160, in processData
  File "numpy\lib\type_check.pyc", line 374, in nan_to_num
  File "numpy\lib\ufunclike.pyc", line 113, in isposinf

Is there a way I can sanitize my input? Or alternatively, something I can look for in my data set that would help debug this? NaNs or missing values perhaps?

Collaborator

jhung0 commented May 26, 2016

Sorry about that.
If you could look through your data and try to find anything that might be a problem, that would be really helpful! You could use a MySQL/SQLite table viewer or the CPA table viewer (on the per-object table).
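If scrolling through a table viewer is impractical, NULL values can also be counted per column with a short script. This is only a sketch for a SQLite database: the table name per_object, the column names, and the helper count_nulls are assumptions for the demo, so substitute the names from your own properties file. The table name is interpolated into the SQL, so use it only with names you trust.

```python
import sqlite3


def count_nulls(conn, table):
    """Return {column: NULL count} for every column with at least one NULL."""
    cur = conn.cursor()
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in cur.execute(f'PRAGMA table_info("{table}")')]
    nulls = {}
    for col in columns:
        query = f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" IS NULL'
        (n,) = cur.execute(query).fetchone()
        if n:
            nulls[col] = n
    return nulls


# Demo on an in-memory database with made-up measurements.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE per_object "
    "(ImageNumber INTEGER, ObjectNumber INTEGER, Intensity REAL, MassDisplacement REAL)"
)
conn.executemany(
    "INSERT INTO per_object VALUES (?, ?, ?, ?)",
    [(1, 1, 0.5, 0.1), (1, 2, 0.7, None), (2, 1, 0.6, None)],
)
print(count_nulls(conn, "per_object"))  # {'MassDisplacement': 2}
```

Columns that show up in the result are the ones that come back as None when loaded into Python.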

Collaborator

jhung0 commented May 26, 2016

Also, something should be printed: either "data type 1 ...." or "data type 2 ....".
Let me know what gets printed.

jonchar commented May 26, 2016

I definitely have some values that are None when I load the per-object table in the CPA table viewer.

jonchar commented May 26, 2016

I didn't see anything else printed with the error message.

Collaborator

jhung0 commented May 27, 2016

I had a line handling None, but it might have messed up the dtype of the numpy array and so led to the error.
Please try 1ad708d
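To illustrate the dtype point, here is one way to handle None while keeping a float array. It is a sketch under stated assumptions, not the change made in 1ad708d: the function name clean_cell_data is hypothetical, and it assumes every value that is not None can be converted with float().

```python
import numpy as np


def clean_cell_data(rows):
    """Build a float64 array from rows, turning None into 0.0.

    None is mapped to NaN first so the array is created with a float dtype
    instead of object; np.nan_to_num then replaces NaN with 0.0 and
    infinities with large finite values.
    """
    as_float = np.array(
        [[np.nan if value is None else float(value) for value in row] for row in rows],
        dtype=np.float64,
    )
    return np.nan_to_num(as_float)


rows = [(1.5, None), (2.0, float("inf"))]
cleaned = clean_cell_data(rows)
print(cleaned.dtype)
print(cleaned)
```

Replacing the None before the array is built means np.nan_to_num only ever sees a float64 array.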

jonchar commented May 27, 2016

On the latest nightly I'm able to score an image using a random forest. I haven't tried the other classifiers yet, but will do soon. Thanks!

jonchar commented May 31, 2016

Just reporting here that I can successfully use the "Score Image" or "Score All" functions with all of the classifiers using the version after 1ad708d.

For reference, the problem arose from having None types in the data set. These values were in columns relating to center of mass / mass displacement measurements within primary and tertiary objects in the data set.

Thanks again!

@jhung0 jhung0 closed this Jun 2, 2016
