unorderable types: str() > int() #29

simonm3 · 2016-10-31T20:08:59Z

I get the above error message. Works fine if I exclude the object columns.

simonm3 · 2016-10-31T21:02:48Z

Some columns contained errors, e.g. a numeric column had missing values encoded as the string " NA".
It would be useful if the failure reported the offending columns rather than just crashing.
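For anyone hitting the same thing, here is a minimal sketch of how the offending columns could be located before profiling. It is not part of pandas-profiling; the helper name `find_suspect_values` and the sample data are made up for illustration. It assumes a "suspect" column is an object column in which some, but not all, values parse as numbers.

```python
import pandas as pd


def find_suspect_values(df):
    """Map each suspect object column to the values that fail numeric parsing.

    Hypothetical helper: a column is reported only when at least one value
    parses as a number and at least one does not.
    """
    suspects = {}
    for name in df.select_dtypes(include="object").columns:
        col = df[name].dropna()
        # Values that cannot be parsed become NaN instead of raising.
        parsed = pd.to_numeric(col, errors="coerce")
        bad = col[parsed.isna()]
        if 0 < len(bad) < len(col):
            suspects[name] = bad.tolist()
    return suspects


# Made-up data: "price" is numeric apart from one " NA" string.
df = pd.DataFrame({
    "price": [1.5, " NA", 3.0, 4.25],
    "name": ["a", "b", "c", "d"],
    "count": [1, 2, 3, 4],
})
report = find_suspect_values(df)
print(report)
```

Purely textual columns such as `name` are skipped because none of their values parse as numbers.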

JosPolfliet · 2016-11-04T09:24:16Z

Good point, I never thought about column types that are not numeric, character or dates. They should indeed just be ignored with a warning message that the type is not supported for analysis.

simonm3 · 2016-11-04T10:46:49Z

As well as reporting them as ignored, it would be useful to show frequency
counts if possible. For example, if a column has the values "P", "1" and "1.0", then it
is clear what the problem is.
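A rough sketch of what such frequency counts might look like, using plain pandas rather than anything in pandas-profiling. The sample column is invented. Counting the `repr` of each value is an assumption made here so that the string "1" and the integer 1 show up as separate entries.

```python
import pandas as pd

# Made-up column mixing strings and numbers.
col = pd.Series(["P", "1", "1.0", "1", 1, 1.0, "P"], name="grade")

# How many values of each Python type the column holds.
type_counts = col.map(lambda v: type(v).__name__).value_counts()

# How often each distinct value occurs; repr keeps '1' (str) apart from 1 (int).
value_counts = col.map(repr).value_counts()

print(type_counts)
print(value_counts)
```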

arsenyinfo · 2016-12-19T15:41:22Z

It would also be useful if one could convert these columns to a single type: e.g. if I have both str and int within one column, it's probably a good idea to treat it as str when building the report.

I'd suggest adding a parameter to ProfileReport for this case with three options: exclude the mixed-type columns, cast them to str, or just raise an exception.
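To make the three options concrete, here is a sketch written as a standalone preprocessing function. The function name `handle_mixed_columns` and the `strategy` values are hypothetical; this is not the ProfileReport API, just one way the behaviour could look.

```python
import pandas as pd


def is_mixed(col):
    """True if the non-null values of the column have more than one Python type."""
    return len({type(v) for v in col.dropna()}) > 1


def handle_mixed_columns(df, strategy="raise"):
    """Hypothetical helper: apply one of three strategies to mixed-type columns."""
    mixed = [c for c in df.columns if df[c].dtype == object and is_mixed(df[c])]
    if strategy == "exclude":
        # Drop the mixed-type columns entirely.
        return df.drop(columns=mixed)
    if strategy == "cast":
        # Convert every value to str (note: missing values become strings too).
        out = df.copy()
        for c in mixed:
            out[c] = out[c].astype(str)
        return out
    if strategy == "raise":
        if mixed:
            raise TypeError(f"columns with mixed types: {mixed}")
        return df
    raise ValueError(f"unknown strategy: {strategy!r}")


# Made-up data: "mixed" holds both str and int.
df = pd.DataFrame({"n": [1, 2, 3], "mixed": ["a", 1, "b"]})
excluded = handle_mixed_columns(df, "exclude")
cast = handle_mixed_columns(df, "cast")
```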

@JosPolfliet would you like to get a PR on this?

JosPolfliet · 2016-12-20T11:06:47Z

Yes @arsenyinfo, if you have time feel free to send a PR and I will review! Thanks.

arsenyinfo · 2016-12-28T09:49:59Z

@JosPolfliet PTAL at PR above. Thanks!

JosPolfliet · 2016-12-28T10:28:29Z

My friends, I haven't forgotten this, I am just travelling and haven't made the time. I'm back on Saturday and will take a look then. Happy holidays!

simonm3 · 2017-03-05T09:51:01Z

In addition to the mixed-type fields, there is a list type which currently also crashes the whole report. As a short-term fix, it would be good to flag it as an unsupported type. In the longer term, it would be useful to see the most common values and their distribution.
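As an illustration of the longer-term idea, the most common items in a list column can be counted with plain pandas. This sketch assumes `Series.explode`, which is available in pandas 0.25 and later. The sample data is made up and nothing here is pandas-profiling behaviour.

```python
import pandas as pd

# Made-up list column.
tags = pd.Series([["a", "b"], ["c", "b", "c"], ["b", "b"]])

# One row per list element, then count how often each element occurs.
item_counts = tags.explode().value_counts()

# Distribution of list lengths.
length_counts = tags.map(len).value_counts()

print(item_counts)
print(length_counts)
```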

JosPolfliet · 2017-03-06T09:02:56Z

I thought this was fixed in the last version. It was related to a Pandas bug. PR was closed but this issue wasn't (mea culpa).

Can you share a working example of when it fails?

simonm3 · 2017-03-06T10:34:23Z

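import pandas as pd
from pandas_profiling import ProfileReport

# Two numeric columns plus one column whose values are Python lists
a = pd.DataFrame(dict(a=[1, 2, 3], b=[4, 5, 6], mylist=[["item1", "item2"], ["item3", "item2", "item3"], ["item2", "item2"]]))
ProfileReport(a)

TypeError: unhashable type: 'list'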
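The error most likely comes from operations that need to hash each value, such as counting distinct values, which Python lists do not support. As a stopgap, the affected columns can be found up front with something like the sketch below. The helper names are hypothetical and not part of pandas-profiling.

```python
import pandas as pd


def is_hashable(value):
    """True if hash(value) succeeds; lists, dicts and sets raise TypeError."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


def unhashable_columns(df):
    """Hypothetical helper: names of object columns holding any unhashable value."""
    return [
        name
        for name in df.columns
        if df[name].dtype == object and not all(is_hashable(v) for v in df[name])
    ]


a = pd.DataFrame(dict(
    a=[1, 2, 3],
    b=[4, 5, 6],
    mylist=[["item1", "item2"], ["item3", "item2", "item3"], ["item2", "item2"]],
))
bad = unhashable_columns(a)
print(bad)

# The remaining columns can then be profiled without the list column.
safe = a.drop(columns=bad)
```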

simonm3 · 2017-03-06T12:49:37Z

BTW this is a really great package, but can't you change the name to something without underscores, hyphens and capital letters? E.g.

from pandaseda import eda
eda()

conradoqg · 2018-01-04T22:15:57Z

Hey,

With the above PR #82 merged, you can have dicts, lists and other object types in your dataframe. The profiler will mark those fields as unsupported (since there is not much analysis that we can do).

If in the future we find a nice way to report something related to that data type we can open a feature request.

About the package naming, I think it's a good suggestion and we should open a specific issue for this.
@romainx can you do that?

I think we can close this issue after those changes.

Best

romainx · 2018-01-06T15:52:33Z

OK, I've just created a new issue for the name (#87).

arsenyinfo mentioned this issue Dec 20, 2016

Multiple strategies to handle mixed types columns #33

Closed

conradoqg mentioned this issue Jan 2, 2018

Improve types handling #82

Merged

conradoqg closed this as completed Jan 6, 2018

romainx mentioned this issue Jan 6, 2018

pandas-profiling new name ! #87

Closed
