fix: fix cognitive service errors #1176

serena-ruan · 2021-09-01T06:05:44Z

No description provided.

serena-ruan · 2021-09-01T07:41:25Z

/azp run

azure-pipelines · 2021-09-01T07:41:35Z

Azure Pipelines successfully started running 1 pipeline(s).

serena-ruan · 2021-09-01T08:55:01Z

/azp run

azure-pipelines · 2021-09-01T08:55:11Z

Azure Pipelines successfully started running 1 pipeline(s).

serena-ruan · 2021-09-01T10:46:15Z

/azp run

azure-pipelines · 2021-09-01T10:46:24Z

Azure Pipelines successfully started running 1 pipeline(s).

serena-ruan · 2021-09-01T15:14:52Z

/azp run

azure-pipelines · 2021-09-01T15:15:02Z

Azure Pipelines successfully started running 1 pipeline(s).

mhamilton723

Thank you!!

mhamilton723 · 2021-09-01T15:44:33Z

cognitive/src/main/scala/com/microsoft/ml/spark/cognitive/CognitiveServiceBase.scala

@@ -91,7 +91,9 @@ trait HasServiceParams extends Params {
  }

  protected def shouldSkip(row: Row): Boolean = getRequiredParams.exists { p =>
-    emptyParamData(row, p)
+    if (emptyParamData(row, p))
+      throw new NullPointerException(s"required param undefined: $p")


This will never throw false if the error is in here. What I'm thinking is that we look at the required params in the transformSchema function to ensure all required params are set with either a value or a column. We can then call transformSchema before we transform to add validation there too

nit: Should be an IllegalArgumentException

In CognitiveServiceBase, transformSchema and transform both call getInternalTransformer first. In that function we create a new SimpleHTTPTransformer and call setInputParser(getInternalInputParser(schema)). getInternalInputParser calls inputFunc, which calls shouldSkip, and that's why I added the check here. I know it will never throw false if the error happens here, so do we want to silence the error, catch it later, and return null?

We could have the logic in getInternalTransformer. I think the logic should live in the "control-plane" which executes on the head node rather than code inside a mapPartitions like shouldSkip
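A minimal sketch of the driver-side check suggested above, assuming each required param is set either to a scalar value or to a column name. `ServiceParamSketch` and `RequiredParamCheck` are hypothetical stand-ins for illustration and are not the library's actual types. The check runs once before any per-row work and uses IllegalArgumentException, as the nit above suggests.

```scala
// Hypothetical stand-in for a service param: set either to a scalar value
// or to the name of a DataFrame column. Not the library's real type.
final case class ServiceParamSketch(
    name: String,
    isRequired: Boolean,
    value: Option[Any] = None,
    column: Option[String] = None)

object RequiredParamCheck {
  // Intended to run once on the head node (for example from a
  // transformSchema-style method), not inside per-row code.
  def validate(params: Seq[ServiceParamSketch], schemaColumns: Set[String]): Unit = {
    params.filter(_.isRequired).foreach { p =>
      val hasValue = p.value.isDefined
      val hasColumn = p.column.exists(schemaColumns.contains)
      if (!hasValue && !hasColumn) {
        throw new IllegalArgumentException(s"required param undefined: ${p.name}")
      }
    }
  }
}
```

With a check like this in the control plane, per-row code such as shouldSkip could go back to returning a plain Boolean for rows whose column data is empty.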

mhamilton723 · 2021-09-01T15:47:36Z

cognitive/src/main/scala/com/microsoft/ml/spark/cognitive/TextTranslator.scala

-          .map(x => x.map(y => Map("Text" -> y))).toJson.compactPrint, ContentType.APPLICATION_JSON))
+      val textVal = getValueOpt(r, text)
+      if (textVal.nonEmpty) {
+        val content = textVal.get.getClass.getName match {


Chris' second issue was caused by having nulls within a batch. Perhaps we should add that case as a test and ensure this new func can handle it. I think my TA batching logic had to get pretty hairy to handle that, unfortunately. Perhaps we can use similar logic. Also, we might want to consider a better solution than just batching elements into single arrays, but we can push that to a later PR since that is the current behavior in TA.

For nulls in text, the result will be null, since the request itself would become invalid. But if toLanguage is set to null, it will trigger the "required param undefined" error above. What behavior are we expecting exactly? I noticed that TA has reshapeToArray logic for turning strings into arrays, but I didn't understand Chris's problem with the output schema (there's an unpackBatchUDF dealing with the returned JSON in TA). Could you explain more about that issue?

I think the error comes from nulls in text that would ordinarily be skipped, but because they are in a batch they mess up the whole batch.
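One way to keep nulls from spoiling a batch, sketched under assumptions: send only the non-null texts to the service, then put the results back at their original positions. `NullSafeBatch` and the `callService` parameter are hypothetical names for illustration. This is not the TA batching logic mentioned above.

```scala
object NullSafeBatch {
  // batch: one entry per row, None where the text was null.
  // callService: hypothetical stand-in for the HTTP call; it is assumed to
  // return exactly one result per input, in the same order.
  def translateBatch(
      batch: Seq[Option[String]],
      callService: Seq[String] => Seq[String]): Seq[Option[String]] = {
    val present = batch.flatten
    if (present.isEmpty) {
      // Nothing to send: every row keeps a null result.
      batch.map(_ => None)
    } else {
      val results = callService(present)
      require(results.length == present.length, "service must return one result per input")
      val it = results.iterator
      // Walk the original batch in order, consuming one result per non-null row.
      batch.map(opt => opt.map(_ => it.next()))
    }
  }
}
```

A test for the null-in-batch case could then assert that the output has the same length as the input and that the null rows stay null.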

mhamilton723 · 2021-09-01T15:48:27Z

cognitive/src/main/scala/com/microsoft/ml/spark/cognitive/TextTranslator.scala

@@ -171,6 +211,8 @@ class Translate(override val uid: String) extends TextTranslatorBase(uid)

  def setToLanguage(v: Seq[String]): this.type = setScalarParam(toLanguage, v)

+  def setToLanguage(v: String): this.type = setScalarParam(toLanguage, Seq(v))


Nice! We might need to add this to python methods too

I guess this will automatically work? We're calling self._java_obj = self._java_obj.setXXX(value) and the java object will find the corresponding set function that matches the value type?

You're right!
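For reference, the overload pattern from the diff can be shown in isolation. `TranslateSketch` is a hypothetical stand-in that stores the value in a plain field instead of calling setScalarParam. The String overload just wraps its argument and delegates to the Seq[String] one.

```scala
// Hypothetical stand-in for the Translate class; only the setter
// overloads are modeled here.
final class TranslateSketch {
  private var toLanguage: Seq[String] = Seq.empty

  def setToLanguage(v: Seq[String]): this.type = {
    toLanguage = v
    this
  }

  // Convenience overload: a single language is wrapped into a Seq.
  def setToLanguage(v: String): this.type = setToLanguage(Seq(v))

  def getToLanguage: Seq[String] = toLanguage
}
```

Both calls end up storing a Seq[String], so downstream code only needs to handle one shape.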

serena-ruan · 2021-09-02T07:03:37Z

/azp run

azure-pipelines · 2021-09-02T07:03:47Z

Azure Pipelines successfully started running 1 pipeline(s).

serena-ruan · 2021-09-03T09:56:13Z

/azp run

azure-pipelines · 2021-09-03T09:56:22Z

Azure Pipelines successfully started running 1 pipeline(s).

serena-ruan · 2021-09-06T06:15:25Z

/azp run

azure-pipelines · 2021-09-06T06:15:34Z

Azure Pipelines successfully started running 1 pipeline(s).

mhamilton723

Love how you have applied these fixes to whole lib

mhamilton723 · 2021-09-08T22:14:40Z

cognitive/src/main/scala/com/microsoft/ml/spark/cognitive/TextTranslator.scala

+          post.setHeader("Content-Type", "application/json; charset=UTF-8")
+
+          val json = textAndTranslations.head.getClass.getTypeName match {
+            case "scala.Tuple2" => textAndTranslations.map(


This looks a little fishy. Is there a reason we can't use regular pattern matching here?

Yes, because we added the function def setTextAndTranslation(v: (String, String)): this.type = setScalarParam(textAndTranslation, Seq(v)). In this case, even though the getValue function casts the value to Seq[(String, String)], the underlying element type is still Tuple2, and for the Tuple case I can't cast it to Seq[Row] and use Map("Text" -> s.getString(0), "Translation" -> s.getString(1)) directly...
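A sketch of what regular pattern matching on each element could look like, assuming the elements arrive as Seq[Any]. `TextAndTranslationPayload` is a hypothetical name. Only the Tuple2 case is shown so the snippet stays free of Spark dependencies; a Row case would be one more `case` clause in the real code.

```scala
object TextAndTranslationPayload {
  // Matches on the runtime shape of each element instead of comparing
  // class names as strings.
  def toPayload(items: Seq[Any]): Seq[Map[String, String]] = items.map {
    case (text: String, translation: String) =>
      Map("Text" -> text, "Translation" -> translation)
    case other =>
      throw new IllegalArgumentException(s"unsupported element: $other")
  }
}
```

Matching per element avoids the need to cast the whole sequence to a single element type up front.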

mhamilton723 · 2021-09-08T22:16:49Z

cognitive/src/test/scala/com/microsoft/ml/spark/cognitive/split1/TranslatorSuite.scala

-                          df: DataFrame,
-                          expectString: String): Boolean = {
-    val results = translator
+                          df: DataFrame): DataFrame = {


Might want to rename this method to better reflect what it does

serena-ruan · 2021-09-09T06:41:08Z

/azp run

azure-pipelines · 2021-09-09T06:41:19Z

Azure Pipelines successfully started running 1 pipeline(s).

serena-ruan · 2021-09-10T03:25:43Z

/azp run

azure-pipelines · 2021-09-10T03:25:54Z

Azure Pipelines successfully started running 1 pipeline(s).

* fix: fix setLinkedService issues in Synapse (#1177)
* doc: add predictive maintenence notebook squash
* fix: fix cog service test flakes
* feat: add NERPii
* fix: fix scala style error
* fix: rename NERPii to PII
* fix: fix anomaly detector test cases
* fix: fix cognitive service errors (#1176)
  fix Left & Right errors
  Enhancement for text translator
* chore: Add script to clean and back up ACR
* fix: fix setLinkedService in Synapse
* initial commit
* resolving comments
* updated opinions test

Co-authored-by: wenqing xu <80103478+xuwq1993@users.noreply.github.com>
Co-authored-by: Mark <mhamilton723@gmail.com>
Co-authored-by: xuwq1993 <wenqx@microsoft.com>
Co-authored-by: serena-ruan <82044803+serena-ruan@users.noreply.github.com>

fix: fix Left & Right errors

946d73d

serena-ruan force-pushed the serena/bugFix branch from a94d4b8 to 946d73d Compare September 1, 2021 06:06

serena-ruan changed the title ~~fix: fix Left & Right errors~~ fix: fix cognitive service errors & support autoconversion of Seq[String] Sep 1, 2021

serena-ruan changed the title ~~fix: fix cognitive service errors & support autoconversion of Seq[String]~~ fix: fix cognitive service errors Sep 1, 2021

add triggering error if required params are not set

081797f

throw error in shouldSkip instead

d856987

serena-ruan added 2 commits September 1, 2021 17:56

fix flaky confidence value in DescribeImage

280122f

support autoconversion of string to Seq[string] for toLanguage & Text

e75cfba

fix inputFunc

e667728

mhamilton723 requested changes Sep 1, 2021

fix Exception type & add null test

8bb1f52

serena-ruan added 2 commits September 3, 2021 17:54

support text as stringType in dataframe for translators & add reshaping as Array & supplement tests

b0d3be6

update format

c124660

serena-ruan and others added 2 commits September 6, 2021 11:23

Merge branch 'master' into serena/bugFix

4a7553e

catch required params not set error at head node & add more tests to ensure the error is thrown if requried params are not set

7dd0e9f

serena-ruan marked this pull request as ready for review September 6, 2021 06:51

serena-ruan requested a review from mhamilton723 September 7, 2021 02:57

mhamilton723 requested changes Sep 8, 2021

serena-ruan and others added 2 commits September 9, 2021 11:17

Merge branch 'master' into serena/bugFix

af052b1

address comments

b330bb3

mhamilton723 approved these changes Sep 9, 2021

Merge branch 'master' into serena/bugFix

50a574b

serena-ruan merged commit d85aae8 into microsoft:master Sep 10, 2021
