Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[query] Fix accidental deletion of tabix files #12056

Merged
merged 2 commits into from
Jul 27, 2022

Conversation

chrisvittal
Copy link
Collaborator

When cleaning up a text export, we delete all files that our successful
tasks didn't produce. We weren't adding tabix files to this set, and so
were deleting tabix files that were created during vcf export. Now, we
explicitly add tabix files to our 'to keep' list.

When cleaning up a text export, we delete all files that our successful
tasks didn't produce. We weren't adding tabix files to this set, and so
were deleting tabix files that were created during vcf export. Now, we
explicitly add tabix files to our 'to keep' list.
val files = cb.memoize(Code.newArray[String](partPaths.loadLength() * 2))
partPaths.forEachDefined(cb) { case (cb, i, file: SStringValue) =>
val path = file.loadString(cb)
cb += (files(i * 2) = path)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be creating invalid code. I'm getting lir blocks that are not well formed.

This is the stack trace of the block formation that is later failing the 'block well formed' assertion here

E           java.lang.Thread.getStackTrace(Thread.java:1564)
E           is.hail.lir.Block.<init>(X.scala:343)
E           is.hail.asm4s.Code$.apply(Code.scala:54)
E           is.hail.asm4s.Value$$anon$3.get(Code.scala:1282)
E           is.hail.asm4s.package$.valueToCode(package.scala:287)
E           is.hail.asm4s.package$.intToCode(package.scala:358)
E           is.hail.expr.ir.VCFExportFinalizer.$anonfun$writeMetadata$3(MatrixWriter.scala:768)
E           is.hail.expr.ir.VCFExportFinalizer.$anonfun$writeMetadata$3$adapted(MatrixWriter.scala:766)
E           is.hail.types.physical.stypes.concrete.SIndexablePointerValue.$anonfun$forEachDefined$4(SIndexablePointer.scala:101)
E           is.hail.asm4s.CodeBuilderLike.ifx(CodeBuilder.scala:97)
E           is.hail.asm4s.CodeBuilderLike.ifx$(CodeBuilder.scala:88)
E           is.hail.expr.ir.EmitCodeBuilder.ifx(EmitCodeBuilder.scala:38)
E           is.hail.types.physical.stypes.concrete.SIndexablePointerValue.$anonfun$forEachDefined$2(SIndexablePointer.scala:99)
E           is.hail.asm4s.CodeBuilderLike.$anonfun$whileLoop$1(CodeBuilder.scala:119)
E           is.hail.asm4s.CodeBuilderLike.$anonfun$whileLoop$1$adapted(CodeBuilder.scala:119)
E           is.hail.asm4s.CodeBuilderLike.whileLoop(CodeBuilder.scala:114)
E           is.hail.asm4s.CodeBuilderLike.whileLoop$(CodeBuilder.scala:107)
E           is.hail.expr.ir.EmitCodeBuilder.whileLoop(EmitCodeBuilder.scala:38)
E           is.hail.asm4s.CodeBuilderLike.whileLoop(CodeBuilder.scala:119)
E           is.hail.asm4s.CodeBuilderLike.whileLoop$(CodeBuilder.scala:119)
E           is.hail.expr.ir.EmitCodeBuilder.whileLoop(EmitCodeBuilder.scala:38)
E           is.hail.types.physical.stypes.concrete.SIndexablePointerValue.forEachDefined(SIndexablePointer.scala:96)
E           is.hail.expr.ir.VCFExportFinalizer.writeMetadata(MatrixWriter.scala:766)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so with this change, I can get the tests to pass. @tpoterba @patrick-schultz why?

-      val files = cb.memoize(Code.newArray[String](partPaths.loadLength() * 2))
+      val len = partPaths.loadLength()
+      val files = cb.memoize(Code.newArray[String](len * 2))
       partPaths.forEachDefined(cb) { case (cb, i, file: SStringValue) =>
         val path = file.loadString(cb)
-        cb += (files(i * 2) = path)
+        cb += files.update(i, path)
         // FIXME(chrisvittal): this will put the string ".tbi" in generated code, we should just access the htsjdk value
-        cb += (files(i * 2 + 1) = Code.invokeStatic2[htsjdk.tribble.util.ParsingUtils, String, String, String]("appendToPath", path, htsjdk.samtools.util.FileExtensions.TABIX_INDEX))
+        cb += files.update(cb.memoize(i + len), Code.invokeStatic2[htsjdk.tribble.util.ParsingUtils, String, String, String]("appendToPath", path, htsjdk.samtools.util.FileExtensions.TABIX_INDEX))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about this one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some logic changes, like i*2 -> i in the middle replacement. Are you saying that before this change, there was a malformed lir, and not after?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, I see you just changed the layout of files. I also can't see why this should affect the lir generated.

Copy link
Collaborator

@patrick-schultz patrick-schultz Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If IntelliJ's "desugar scala code" is to be believed, the old version wasn't actually desugaring to an update. That might explain it because the path wasn't getting used, so the enclosed block stayed empty. I'd expect explicitly calling update is the important change here. The other changes look fine.

@danking danking merged commit 75e852f into hail-is:main Jul 27, 2022
daniel-goldstein pushed a commit to daniel-goldstein/hail that referenced this pull request Aug 8, 2022
* [query] Fix accidental deletion of tabix files

When cleaning up a text export, we delete all files that our successful
tasks didn't produce. We weren't adding tabix files to this set, and so
were deleting tabix files that were created during vcf export. Now, we
explicitly add tabix files to our 'to keep' list.

* fixes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants