Skip to content

[compiler] Add SJavaString / SJavaBytes / SJavaArrayString #10625

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jul 30, 2021

Conversation

tpoterba
Copy link
Contributor

No description provided.

@tpoterba tpoterba added the WIP label Jun 29, 2021
@tpoterba tpoterba force-pushed the java-string branch 2 times, most recently from 96362ba to 69ef775 Compare June 30, 2021 12:54
@tpoterba tpoterba force-pushed the java-string branch 2 times, most recently from 360e172 to 19d1ea3 Compare July 21, 2021 01:54
Copy link
Collaborator

@chrisvittal chrisvittal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work. Found a few minor issues.

Comment on lines +260 to +269
val v1 = cb.newLocal[String]("hamming_str_1", sc1.loadString())
val v2 = cb.newLocal[String]("hamming_str_2", sc2.loadString())

val m = v1.loadLength().cne(v2.loadLength())
val l1 = cb.newLocal[Int]("hamming_len_1", v1.invoke[Int]("length"))
val l2 = cb.newLocal[Int]("hamming_len_2", v2.invoke[Int]("length"))
val m = l1.cne(l2)

IEmitCode(cb, m, {
cb.whileLoop(i < v1.loadLength(), {
cb.ifx(v1.loadByte(i).cne(v2.loadByte(i)),
cb.whileLoop(i < l1, {
cb.ifx(v1.invoke[Int, Char]("charAt", i).toI.cne(v2.invoke[Int, Char]("charAt", i).toI),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a behavioral change. The old version calculated the hamming distance on bytes this one does so on character sequences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, the old implementation was wrong.

@@ -19,7 +19,7 @@ object ShuffleOrdering {
val bcode1 = x.asInstanceOf[SCanonicalShufflePointerCode].binaryRepr
val bcode2 = y.asInstanceOf[SCanonicalShufflePointerCode].binaryRepr
val ord = BinaryOrdering.make(bcode1.st, bcode2.st, ecb)
ord.compareNonnull(cb, x.asString.asBytes(), y.asString.asBytes())
ord.compareNonnull(cb, x.asInstanceOf[SCanonicalShufflePointerCode].binaryRepr, y.asInstanceOf[SCanonicalShufflePointerCode].binaryRepr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't this just use bcode1 and bcode2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, yeah.

Comment on lines 47 to 48
private[this] def memoizeWithBuilder(cb: EmitCodeBuilder, name: String, sb: SettableBuilder): SBinaryValue = {
val s = new SJavaBytesSettable(sb.newSettable[Array[Byte]]("sjavabytes_memoize"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name is unused, maybe need to use SJavaBytesSettable.apply?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just need to use name in the string passed here.

Comment on lines 59 to 60
private[this] def memoizeWithBuilder(cb: EmitCodeBuilder, name: String, sb: SettableBuilder): SValue = {
val s = new SJavaStringSettable(sb.newSettable[String]("sjavastring_memoize"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name unused, same as SJavaBytes comment

case _ =>
val sv = code.asIndexable.memoize(cb, "scode_array_string")
val arr = cb.newLocal[Array[String]]("scode_array_string", Code.newArray[String](sv.loadLength()))
sv.foreach(cb) { case (cb, idx, elt) =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need this new foreach method rather than foreachDefined here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, that will work.

Copy link
Collaborator

@chrisvittal chrisvittal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

now test batch needs to pass

@danking danking merged commit 42ed97d into hail-is:main Jul 30, 2021
daniel-goldstein pushed a commit to daniel-goldstein/hail that referenced this pull request Aug 3, 2021
…0625)

* [compiler] Add SJavaString / SJavaBytes / SJavaArrayString

* some fixes

* fix bytes

* add check

* fix array wrap

* fix java array virtual type and settable tuple types

* fix arraylen

* fix rg

* addressed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants