Can't create boolean columns of all false #326

Closed
1 task done
erp12 opened this issue Apr 11, 2021 · 1 comment
Labels
bug Something isn't working

Comments

@erp12
Collaborator

erp12 commented Apr 11, 2021

  • I have read through the quick start and installation sections of the README.

Info

| Info | Value |
| --- | --- |
| Operating System | MacOS |
| Geni Version | 0.3.8 |
| JDK | 1.8 |
| Spark Version | 3.0.2 |

Problem / Steps to reproduce

It seems impossible to create a boolean column whose values are all false using records->dataset, because such a column is inferred as a null column. Here is a failing test.

(fact "should work for bool columns"
    (let [dataset (g/records->dataset
                    @tr/spark
                    [{:i 0 :s "A" :b false}
                     {:i 1 :s "B" :b false}
                     {:i 2 :s "C" :b false}])]
      (instance? Dataset dataset) => true
      (g/schema dataset) => (g/->schema {:i :long
                                         :s :string
                                         :b :bool})
      (g/collect-vals dataset) => [[0 "A" false]
                                   [1 "B" false]
                                   [2 "C" false]]))

and here is the output.

FAIL On records->dataset - should work for bool columns at (dataset_creation_test.clj:143)
Expected:
#<org.apache.spark.sql.types.StructType@2e83f3f5 StructType(StructField(i,LongType,true), StructField(s,StringType,true), StructField(b,BooleanType,true))>
Actual:
#<org.apache.spark.sql.types.StructType@67b8b180 StructType(StructField(i,LongType,true), StructField(s,StringType,true), StructField(b,NullType,true))>

FAIL On records->dataset - should work for bool columns at (dataset_creation_test.clj:146)
Expected:
[[0 "A" false] [1 "B" false] [2 "C" false]]
Actual:
([0 "A" nil] [1 "B" nil] [2 "C" nil])
Diffs: in [0 2] expected false, was nil
              in [1 2] expected false, was nil
              in [2 2] expected false, was nil

The same behavior applies to map->dataset and table->dataset. If any of the booleans are true, then the schema is understood correctly.
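The symptom (all-false columns become NullType, while a single true fixes the schema) is what you would expect if type inference picks a sample value with a truthiness check, since false and nil are both falsy in Clojure. That is only a guess; the issue does not confirm the cause. Here is a minimal plain-Clojure sketch of the pitfall. The helper names `first-truthy`, `first-non-nil`, and `infer-type` are hypothetical and are not Geni's actual functions.

```clojure
;; Hypothetical helpers for illustration only; not taken from Geni's source.

(defn first-truthy
  "Returns the first truthy value in xs. Both nil and false are skipped."
  [xs]
  (first (filter identity xs)))

(defn first-non-nil
  "Returns the first value in xs that is not nil. false is kept."
  [xs]
  (first (remove nil? xs)))

(defn infer-type
  "Maps a sample value to a type keyword; :null when no sample was found."
  [sample]
  (cond
    (nil? sample)     :null
    (boolean? sample) :bool
    (integer? sample) :long
    (string? sample)  :string
    :else             :unknown))

(def all-false [false false false])
(def some-true [false true false])

(infer-type (first-truthy all-false))   ;; => :null  (matches the reported symptom)
(infer-type (first-truthy some-true))   ;; => :bool  (a single true is enough)
(infer-type (first-non-nil all-false))  ;; => :bool  (nil-only check keeps false)
```

If the real inference code has this shape, testing for nil explicitly rather than relying on truthiness would keep all-false columns typed as booleans.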

@erp12 added the bug label Apr 11, 2021
@erp12 changed the title from "Can't create boolean columns with records->dataset" to "Can't create boolean columns" Apr 11, 2021
@erp12 changed the title from "Can't create boolean columns" to "Can't create boolean columns of all false" Apr 11, 2021
@erp12
Collaborator Author

erp12 commented Apr 22, 2021

I confirmed that #327 fixes this issue. Thanks!

@erp12 erp12 closed this as completed Apr 22, 2021