-
Notifications
You must be signed in to change notification settings - Fork 394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Skip serializing cardinality estimates in FeatureDistributions #447
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, discussed internally.
Codecov Report
@@ Coverage Diff @@
## master #447 +/- ##
==========================================
+ Coverage 86.93% 86.93% +<.01%
==========================================
Files 337 337
Lines 11096 11102 +6
Branches 362 364 +2
==========================================
+ Hits 9646 9652 +6
Misses 1450 1450
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments
- minor fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
core/src/test/scala/com/salesforce/op/filters/FeatureDistributionTest.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/com/salesforce/op/filters/FeatureDistributionTest.scala
Outdated
Show resolved
Hide resolved
FeatureDistributionType.Scoring) | ||
FeatureDistribution.toJson(Seq(fd1)) shouldNot include ("cardEstimate") | ||
|
||
val json = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You already computed this string on Line 249, you can store it in a variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that json doesn't contains cardEstimate obj, which ignored while serializing, i construct another example with it, which should be tested that fromJson can still deserialize this obj without ignoring it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can generate it too:
val jsonWithCardEstimate = Serialization.write(fd1)(DefaultFormats +
EnumEntrySerializer.json4s[FeatureDistributionType](FeatureDistributionType))
|"summaryInfo":[],"moments":{"m0":1,"m1":1.0,"m2":0.0,"m3":0.0,"m4":0.0}, | ||
|"cardEstimate":{"valueCounts":{"foo":1,"bar":2}},"type":"Scoring"}] | ||
|""".stripMargin | ||
FeatureDistribution.fromJson(json) match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more concisely:
FeatureDistribution.fromJson(json) shouldBe Success(Seq(fd1))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need the test case for being able to deserialize where cardEstimate is not present in JSON
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this test case is doing so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I meant the complementary case then without cardEstimate in JSON
Related issues
Tokens of text field are printing out while internal usage.
Describe the proposed solution
Ignore cardinality field in serialization
Additional context
Previous PR #420 & #438