fix: don't throw on invalid columns in DropColumns #1695

Merged
merged 22 commits into from
Oct 26, 2022
Commits
9b616dd
remove unused imports
niehaus59 May 12, 2022
e52919b
Merge remote-tracking branch 'upstream/master'
niehaus59 Jun 7, 2022
ad733ec
batch prompts
niehaus59 Jun 21, 2022
ac81dec
fix merge conflict
niehaus59 Jun 21, 2022
8c7854e
fix merge conflicts
niehaus59 Jun 22, 2022
d9e1863
Merge branch 'manieh/batch-prompts-2'
niehaus59 Jun 22, 2022
29133a5
Merge remote-tracking branch 'upstream/master'
niehaus59 Jun 24, 2022
cce533c
Merge remote-tracking branch 'upstream/master'
niehaus59 Jun 24, 2022
d791955
Merge remote-tracking branch 'upstream/master'
niehaus59 Jun 27, 2022
c394754
Merge remote-tracking branch 'upstream/master'
niehaus59 Jul 13, 2022
18ea6de
Merge remote-tracking branch 'upstream/master'
niehaus59 Jul 25, 2022
e78062d
Merge remote-tracking branch 'upstream/master'
niehaus59 Aug 1, 2022
af6af01
Merge remote-tracking branch 'upstream/master'
niehaus59 Aug 2, 2022
1f126b0
Merge remote-tracking branch 'upstream/master'
niehaus59 Aug 4, 2022
d51e734
Merge remote-tracking branch 'upstream/master'
niehaus59 Aug 10, 2022
5a0b067
Merge branch 'master' of https://github.com/niehaus59/SynapseML
niehaus59 Aug 12, 2022
841cf0a
Merge remote-tracking branch 'upstream/master'
niehaus59 Aug 25, 2022
e67bbce
Merge branch 'master' of https://github.com/niehaus59/SynapseML
niehaus59 Aug 25, 2022
ac40e41
Merge remote-tracking branch 'upstream/master'
niehaus59 Aug 25, 2022
493c8a3
Merge remote-tracking branch 'upstream/master'
niehaus59 Oct 25, 2022
72fed31
don't throw for invalid columns in DropColumn
niehaus59 Oct 25, 2022
305b5cc
remove now unused verify method from DropColumns
niehaus59 Oct 25, 2022
@@ -38,28 +38,14 @@ class DropColumns(val uid: String) extends Transformer with Wrappable with Defau
    */
   override def transform(dataset: Dataset[_]): DataFrame = {
     logTransform[DataFrame]({
-      verifySchema(dataset.schema)
       dataset.toDF().drop(getCols: _*)
     })
   }
 
   def transformSchema(schema: StructType): StructType = {
-    verifySchema(schema)
     val droppedCols = getCols.toSet
     StructType(schema.fields.filter(f => !droppedCols(f.name)))
   }
 
   def copy(extra: ParamMap): DropColumns = defaultCopy(extra)
-
-  private def verifySchema(schema: StructType): Unit = {
-    val providedCols = schema.fields.map(_.name).toSet
-    val invalidCols = getCols.filter(!providedCols(_))
-
-    if (invalidCols.length > 0) {
-      throw new NoSuchElementException(
-        s"DataFrame does not contain specified columns: ${invalidCols.reduce(_ + "," + _)}")
-    }
-
-  }
-
 }
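
The behaviour of `transformSchema` once `verifySchema` is gone can be sketched without Spark. The snippet below is only an illustration, not code from this PR: `Field`, `DropColumnsSketch`, `dropFields`, and the column names are hypothetical stand-ins for `StructField` and the transformer. It applies the same set-based filter as `transformSchema`, which leaves the schema unchanged when a requested column does not exist. This is consistent with Spark's `Dataset.drop`, which is documented as a no-op for column names missing from the schema.

```scala
// Hypothetical stand-in for Spark's StructField so the sketch runs without Spark.
final case class Field(name: String, dataType: String)

object DropColumnsSketch {
  // Same filter as transformSchema in the diff: keep every field
  // whose name is not in the set of columns to drop.
  def dropFields(schema: Seq[Field], cols: Seq[String]): Seq[Field] = {
    val droppedCols = cols.toSet
    schema.filter(f => !droppedCols(f.name))
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical schema; the real makeBasicDF() columns are not shown in the PR.
    val schema = Seq(Field("a", "int"), Field("b", "string"))

    // A column that does not exist is ignored: the schema comes back unchanged.
    assert(dropFields(schema, Seq("four")) == schema)

    // Existing columns are still dropped when mixed with unknown ones.
    assert(dropFields(schema, Seq("a", "four")) == Seq(Field("b", "string")))
  }
}
```

Because the filter only tests set membership, no separate validation pass is needed to make unknown column names harmless.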
@@ -42,10 +42,13 @@ class DropColumnsSuite extends TestBase with TransformerFuzzing[DropColumns] {
 
   test("Invalid column specified") {
     try {
-      new DropColumns().setCol("four").transform(makeBasicDF())
-      fail()
+      val df = makeBasicDF()
+      new DropColumns().setCol("four").transform(df)
+      val result = new DropColumns().setCol("four").transform(df)
+      assert(df.schema == result.schema)
     } catch {
-      case _: NoSuchElementException =>
+      case _: Exception =>
+        fail("DropColumns should not throw when for invalid column input")
     }
   }
 