IllegalArgumentException when using SimpleImputer for data sourced from a JSON file #769

Closed
jamalromero opened this issue Apr 20, 2024 · 1 comment

Comments

@jamalromero

An exception is thrown when using SimpleImputer on data read from a JSON file. It occurs only when the data contains numerical columns; with the numerical columns removed, SimpleImputer fits and applies without error. The following code reads two JSON files: one with numerical data and one with strings only.
Example:

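import smile.data.DataFrame;
import smile.feature.imputation.SimpleImputer;
import smile.io.Read;

// Case 1: string columns only. Fit and apply complete without error.
DataFrame df = Read.json(Util.getFilePath("test-json-without-numerical.json"));
System.out.println(df);
System.out.println("=========== Using SimpleImputer with non numerical data ===========");
SimpleImputer.fit(df).apply(df);
// Case 2: same data plus int/long columns. apply() throws IllegalArgumentException.
df = Read.json(Util.getFilePath("test-json-with-numerical.json"));
System.out.println(df);
System.out.println("=========== Using SimpleImputer with numerical data ===========");
SimpleImputer.fit(df).apply(df);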

See the attached files. Columns nk (int), hc (long), and t (long) are numerical.
Console log:

[hh: String, ll: String, a: String, c: String, tz: String, cy: String, g: String, h: String, gr: String, al: String, l: String]
+---------+--------------------+--------------------+---+----------------+-------+------+------+---+--------------+-------+
|       hh|                  ll|                   a|  c|              tz|     cy|     g|     h| gr|            al|      l|
+---------+--------------------+--------------------+---+----------------+-------+------+------+---+--------------+-------+
|1.usa.gov| 42.576698, -70.9...|Mozilla/5.0 (Wind...| US|America/New_York|Danvers|A6qOVH|wfLQtf| MA|en-US,en;q=0.8|orofrog|
|     j.mp| 40.218102, -111....|GoogleMaps/Roches...| US|  America/Denver|  Provo|mwszkS|mwszkS| UT|          null|  bitly|
+---------+--------------------+--------------------+---+----------------+-------+------+------+---+--------------+-------+

=========== Using SimpleImputer with non numerical data ===========
[hh: String, ll: String, a: String, c: String, tz: String, g: String, h: String, gr: String, al: String, l: String, t: long, cy: String, hc: long, nk: int]
+---------+--------------------+--------------------+---+----------------+------+------+---+--------------+-------+----------+-------+----------+---+
|       hh|                  ll|                   a|  c|              tz|     g|     h| gr|            al|      l|         t|     cy|        hc| nk|
+---------+--------------------+--------------------+---+----------------+------+------+---+--------------+-------+----------+-------+----------+---+
|1.usa.gov| 42.576698, -70.9...|Mozilla/5.0 (Wind...| US|America/New_York|A6qOVH|wfLQtf| MA|en-US,en;q=0.8|orofrog|1331923247|Danvers|1331822918|  1|
|     j.mp| 40.218102, -111....|GoogleMaps/Roches...| US|  America/Denver|mwszkS|mwszkS| UT|          null|  bitly|1331923249|  Provo|1308262393|  0|
+---------+--------------------+--------------------+---+----------------+------+------+---+--------------+-------+----------+-------+----------+---+

=========== Using SimpleImputer with numerical data ===========
Exception in thread "main" java.lang.IllegalArgumentException: Impute non-floating primitive types
	at smile.feature.imputation.SimpleImputer.lambda$apply$0(SimpleImputer.java:119)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfInt.accept(ForEachOps.java:205)
	at java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
	at java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:712)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
	at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	at java.base/java.util.concurrent.ForkJoinPool.helpComplete(ForkJoinPool.java:2145)
	at java.base/java.util.concurrent.ForkJoinTask.awaitDone(ForkJoinTask.java:420)
	at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:668)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfInt.evaluateParallel(ForEachOps.java:189)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
	at java.base/java.util.stream.IntPipeline.forEach(IntPipeline.java:463)
	at java.base/java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:620)
	at smile.feature.imputation.SimpleImputer.apply(SimpleImputer.java:98)
	at com.lixusnet.Data.getBitlyUsaGov(Data.java:32)
	at com.lixusnet.Data.main(Data.java:17)
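
The message "Impute non-floating primitive types" suggests that the imputer rejects int and long columns. That would fit a general Java constraint: primitive integer types have no NaN value to mark a missing entry, whereas double does. The following is a minimal plain-Java sketch of that idea, not Smile's implementation; the class ImputeSketch and its methods imputeMean and widen are hypothetical names used only for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not Smile's SimpleImputer.
class ImputeSketch {
    /** Replaces NaN entries with the mean of the non-missing entries. */
    public static double[] imputeMean(double[] column) {
        double mean = Arrays.stream(column)
                .filter(v -> !Double.isNaN(v))
                .average()
                .orElse(Double.NaN); // all entries missing: nothing to impute from
        return Arrays.stream(column)
                .map(v -> Double.isNaN(v) ? mean : v)
                .toArray();
    }

    /** Widens a long column to double so that NaN can represent a missing entry. */
    public static double[] widen(long[] column) {
        return Arrays.stream(column).asDoubleStream().toArray();
    }

    public static void main(String[] args) {
        // A double column can carry a missing marker (NaN); a long column cannot.
        double[] imputed = imputeMean(new double[] {1.0, Double.NaN, 3.0});
        System.out.println(Arrays.toString(imputed)); // prints [1.0, 2.0, 3.0]

        // Values taken from the hc column in the console log above.
        double[] hc = widen(new long[] {1331822918L, 1308262393L});
        System.out.println(hc.length); // prints 2
    }
}
```

Widening is lossless for the sample values of t and hc shown in the log, since they are far below the 2^53 limit up to which double represents integers exactly.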

test-json-with-numerical.json
test-json-without-numerical.json

@haifengl
Owner

The fix is in the master branch now. Thanks.
