Add minimum rows for scoring set in RawFeatureFilter + rewrite tests to use data generators #250
Changes from 10 commits
@@ -241,14 +241,16 @@ class OpWorkflowTest extends FlatSpec with PassengerSparkFixtureTest {
    val fv = Seq(age, gender, height, weight, description, boarded, stringMap, numericMap, booleanMap).transmogrify()
    val survivedNum = survived.occurs()
    val pred = BinaryClassificationModelSelector().setInput(survivedNum, fv).getOutput()

    val wf = new OpWorkflow()
      .setResultFeatures(pred)
      .withRawFeatureFilter(Option(dataReader), Option(simpleReader),
        maxFillRatioDiff = 1.0) // only height and the female key of maps should meet this criteria
      .withRawFeatureFilter(Option(dataReader), None, maxFillRatioDiff = 1.0)
    val data = wf.computeDataUpTo(weight)

    // Since there are < 500 rows in the scoring set, only the training set checks are applied here, and the only
    // removal reasons should be null indicator - label correlations
    data.schema.fields.map(_.name).toSet shouldEqual
      Set("key", "height", "survived", "stringMap", "numericMap", "booleanMap")
      Set("booleanMap", "description", "height", "stringMap", "age", "key", "survived", "numericMap")
Once things are parameterized, let's set them so that the tests remain the same.
Fixed.
I was just about to suggest the same ;)

  }

  it should "return a model that transforms the data correctly" in {
Let's not hard-code this; it should be a parameter that the user can override.
Good point, fixed.
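The request above, a minimum scoring-set size that callers can override instead of a hard-coded 500, could look roughly like the following. This is a minimal, self-contained sketch and not the PR's code: `MinScoringRowsSketch`, `DefaultMinScoringRows`, `checksToApply`, and the `Checks` type are all hypothetical names, and the only detail taken from the thread is the figure 500 from the test comment.

```scala
// Hypothetical sketch: none of these names come from the PR itself.
object MinScoringRowsSketch {
  // Default threshold; 500 mirrors the figure mentioned in the test comment.
  val DefaultMinScoringRows: Int = 500

  // Which set of checks a filter would run.
  sealed trait Checks
  case object TrainingOnly extends Checks
  case object TrainingAndScoring extends Checks

  // Scoring-set checks run only when a scoring set exists and has at least
  // `minScoringRows` rows; otherwise only the training-set checks apply.
  // The threshold is a parameter with a default, so callers can override it.
  def checksToApply(
    scoringRowCount: Option[Long],
    minScoringRows: Int = DefaultMinScoringRows
  ): Checks = scoringRowCount match {
    case Some(n) if n >= minScoringRows => TrainingAndScoring
    case _                              => TrainingOnly
  }
}
```

Under these assumptions, `checksToApply(Some(8L))` falls back to the training-only checks, while `checksToApply(Some(8L), minScoringRows = 0)` keeps the scoring-set comparison enabled. The second form is the kind of override that would let an existing test on a small fixture keep its original expectations.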