Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-3904] [SQL] add constant objectinspector support for udfs #2762

Closed
wants to merge 8 commits into from

Conversation

chenghao-intel
Copy link
Contributor

In HQL, we convert all of the data type into normal ObjectInspectors for UDFs, most of cases it works, however, some of the UDF actually requires its children ObjectInspector to be the ConstantObjectInspector, which will cause exception.
e.g.
select named_struct("x", "str") from src limit 1;

I updated the method wrap by adding the one more parameter ObjectInspector(to describe what it expects to wrap to, for example: java.lang.Integer or IntWritable).

As well as the unwrap method by providing the input ObjectInspector.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@chenghao-intel
Copy link
Contributor Author

test this please.

@SparkQA
Copy link

SparkQA commented Oct 11, 2014

QA tests have started for PR 2762 at commit 06581e3.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 11, 2014

QA tests have started for PR 2762 at commit 06581e3.

  • This patch merges cleanly.

@@ -578,6 +578,7 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter {
"multi_join_union",
"multiMapJoin1",
"multiMapJoin2",
"udf_named_struct",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should put it after "udf_month"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, moved.

@SparkQA
Copy link

SparkQA commented Oct 11, 2014

QA tests have finished for PR 2762 at commit 06581e3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21637/Test FAILed.

@SparkQA
Copy link

SparkQA commented Oct 11, 2014

QA tests have finished for PR 2762 at commit 06581e3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected case class Keyword(str: String)
    • class SqlLexical(val keywords: Seq[String]) extends StdLexical
    • case class FloatLit(chars: String) extends Token
    • class SqlParser extends AbstractSparkSQLParser
    • case class SetCommand(kv: Option[(String, Option[String])]) extends Command
    • case class ShellCommand(cmd: String) extends Command
    • case class SourceCommand(filePath: String) extends Command
    • case class SetCommand(kv: Option[(String, Option[String])], output: Seq[Attribute])(

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21638/Test FAILed.

@SparkQA
Copy link

SparkQA commented Oct 11, 2014

QA tests have started for PR 2762 at commit 335a952.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 11, 2014

QA tests have finished for PR 2762 at commit 335a952.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21645/Test FAILed.

@SparkQA
Copy link

SparkQA commented Oct 12, 2014

QA tests have started for PR 2762 at commit cb97576.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 12, 2014

QA tests have finished for PR 2762 at commit cb97576.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21661/
Test PASSed.

val baseValue = child.eval(input).asInstanceOf[Row]
if (baseValue == null) null else baseValue(ordinal)
val baseValue = child.eval(input)
if (baseValue == null) null else baseValue.asInstanceOf[Row](ordinal)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

@chenghao-intel
Copy link
Contributor Author

Thanks, I've rebased the latest master, and solved the issues you guys raised.

@chenghao-intel
Copy link
Contributor Author

test this please.

deferedObjects(i).asInstanceOf[DeferredObjectAdapter].set(
() => {
children(idx).eval(input)
}, argumentInspectors(i))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argumentInspector can be set once during deferedObjects creation, rather than execution per row.

@gvramana
Copy link
Contributor

unfortunately :( I also have worked and implemented the same as part of #2802
Anyways I will rework on my pull request for supporting UDAF and GenericUDAF, once your changes are merged.

@chenghao-intel
Copy link
Contributor Author

Thank you @gvramana , I've updated the code as you suggested.

@chenghao-intel
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Oct 20, 2014

QA tests have started for PR 2762 at commit 49d442b.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 20, 2014

QA tests have finished for PR 2762 at commit 49d442b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DeferredObjectAdapter(oi: ObjectInspector) extends DeferredObject

@tianyi
Copy link
Contributor

tianyi commented Oct 22, 2014

Is there any admin could review this PR ASAP? We got some PR blocked by this one. like #2542 @marmbrus @rxin

@liancheng
Copy link
Contributor

test this please

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22345 has started for PR 2762 at commit 782722f.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22345 has finished for PR 2762 at commit 782722f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DeferredObjectAdapter(oi: ObjectInspector) extends DeferredObject

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22345/
Test FAILed.

@chenghao-intel
Copy link
Contributor Author

It passed the unit test in my local with the master rebased. Seems weird.

@chenghao-intel
Copy link
Contributor Author

retest this please. @liancheng :)

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #490 has started for PR 2762 at commit 782722f.

  • This patch merges cleanly.

@marmbrus
Copy link
Contributor

Also, this needs to be updated again (I think it conflicted with your CTAS patch).

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #490 has finished for PR 2762 at commit 782722f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DeferredObjectAdapter(oi: ObjectInspector) extends DeferredObject

@chenghao-intel
Copy link
Contributor Author

Rebased!
test this please.

@liancheng
Copy link
Contributor

test this please

oi match {
case x: ConstantObjectInspector => x.getWritableConstantValue
case x: PrimitiveObjectInspector => a match {
// TODO what if x.preferWritable() == true? reuse the writable?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that the Writable related code was reverted after rebase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently the oi is should not be "preferWritable" as toInspector doesn't return that. Even if we return an new instance of Writable here, it's the same as the preferWritable ObjectInspector does internally.
As you suggested we don't want to dynamically check the oi type, I will keep that for future improvement, and to reuse the writable object.

@SparkQA
Copy link

SparkQA commented Oct 29, 2014

Test build #22404 has started for PR 2762 at commit bcacfd7.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 29, 2014

Test build #22404 has finished for PR 2762 at commit bcacfd7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DeferredObjectAdapter(oi: ObjectInspector) extends DeferredObject

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22404/
Test PASSed.

@marmbrus
Copy link
Contributor

Thanks for working on this. Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants