Joblib problems after using sweetviz #95
Comments
Hi @fior-di-latte-byte, thanks for the detailed report! This is the first time I've heard of something like this, but it's definitely possible that at some point in processing, the source dataframe gets modified somehow. A lot of processing happens, but it's generally on copies of the data. I know "Boolean" columns can get their values standardized to 0/1 from Y/N etc., but I don't think that happens in-place. I thought a column named "index" would get renamed to "df_index", but I don't think that happens in-place either. I tried a couple of before-and-after tests to see if anything changes, but I haven't seen anything obvious yet.
Hi, I already imagined this error would be hard to reconstruct. But who knows, maybe someone else will show up with the same problem. If not, even better ;-) Anyway, thanks again.
Hi @fior-di-latte-byte, So it would be some global state or variable that is confusing joblib. I noticed But it's probably something similar/related. Perhaps something used to set up
Hi @fbdesignpro @fior-di-latte-byte, I ran into the same error yesterday, so I'm posting a way to replicate the FloatingPointError here. I also found that this behavior doesn't depend on modifying the DataFrame data, as I show in the notebook. I hope you find it helpful.
Hi Guys, I dug into the code of SweetViz and found this line of code The other workaround is to put I hope this helps
Hello,
first off, I'd like to thank you a lot in the name of many for this awesome package.
I am using sweetviz to export a feature report in a data science context. After that, the features are postprocessed using an sklearn (0.24.1) ColumnTransformer that (among other things) uses a TargetEncoder for the categories (category-encoders 2.2.2).
Here comes the odd thing:
Whenever I use
sv.analyze(features)
before using the ColumnTransformer(...).fit_transform(features), it throws a rather long joblib error, complaining that a FloatingPointError occurred in a worker that uses the TargetEncoder. Let me be clear: If I ditch that single line
sv.analyze(features)
everything works smoothly [on multiple systems]. This leads me to conjecture that sweetviz may leave some kind of artifact behind that the ColumnTransformer subsequently stumbles over.
Maybe somebody has an idea what might be the reason for this behaviour?
Thanks in advance!
I am using macOS Big Sur and Python 3.8.2.
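
One way to test the "leftover artifact" conjecture is to compare global state before and after the call. The sketch below checks one candidate only: NumPy's floating-point error settings, chosen because FloatingPointError is the exception NumPy raises when that handling is set to "raise". Nothing in this thread confirms that this is what sweetviz changes, so treat it as an assumption. The helper `call_and_diff_errstate` and the function `leaky_stand_in` are hypothetical names used for illustration; the stand-in simulates a library call that alters the settings and does not restore them.

```python
import numpy as np


def call_and_diff_errstate(func, *args, **kwargs):
    """Call func and report which NumPy floating-point error settings it changed."""
    before = np.geterr()  # dict with keys 'divide', 'over', 'under', 'invalid'
    result = func(*args, **kwargs)
    after = np.geterr()
    changed = {key: (before[key], after[key])
               for key in before if before[key] != after[key]}
    return result, changed


def leaky_stand_in():
    """Hypothetical stand-in for a library call that alters global state."""
    np.seterr(divide="raise")  # changed here and never restored
    return "report"


saved = np.geterr()  # snapshot so the settings can be restored later
result, changed = call_and_diff_errstate(leaky_stand_in)
print(changed)  # maps each changed setting to its (before, after) values

# With the leaked setting, a division by zero raises instead of warning.
try:
    np.array([1.0]) / np.array([0.0])
    raised = False
except FloatingPointError:
    raised = True

np.seterr(**saved)  # restore whatever was active before the call
```

In the workflow described above, the same check would be `call_and_diff_errstate(sv.analyze, features)`. An empty dictionary would rule this particular hypothesis out. If the settings do differ, restoring them with `np.seterr(**saved)` before calling fit_transform, or wrapping that call in `with np.errstate(...)`, would be a possible workaround to try.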