[query] Fix accidental deletion of tabix files#12056
Conversation
When cleaning up a text export, we delete all files that our successful tasks didn't produce. We weren't adding tabix files to this set, and so were deleting tabix files that were created during vcf export. Now, we explicitly add tabix files to our 'to keep' list.
| val files = cb.memoize(Code.newArray[String](partPaths.loadLength() * 2)) | ||
| partPaths.forEachDefined(cb) { case (cb, i, file: SStringValue) => | ||
| val path = file.loadString(cb) | ||
| cb += (files(i * 2) = path) |
There was a problem hiding this comment.
This seems to be creating invalid code. I'm getting lir blocks that are not well formed.
This is the stack trace of the block formation that is later failing the 'block well formed' assertion here
E java.lang.Thread.getStackTrace(Thread.java:1564)
E is.hail.lir.Block.<init>(X.scala:343)
E is.hail.asm4s.Code$.apply(Code.scala:54)
E is.hail.asm4s.Value$$anon$3.get(Code.scala:1282)
E is.hail.asm4s.package$.valueToCode(package.scala:287)
E is.hail.asm4s.package$.intToCode(package.scala:358)
E is.hail.expr.ir.VCFExportFinalizer.$anonfun$writeMetadata$3(MatrixWriter.scala:768)
E is.hail.expr.ir.VCFExportFinalizer.$anonfun$writeMetadata$3$adapted(MatrixWriter.scala:766)
E is.hail.types.physical.stypes.concrete.SIndexablePointerValue.$anonfun$forEachDefined$4(SIndexablePointer.scala:101)
E is.hail.asm4s.CodeBuilderLike.ifx(CodeBuilder.scala:97)
E is.hail.asm4s.CodeBuilderLike.ifx$(CodeBuilder.scala:88)
E is.hail.expr.ir.EmitCodeBuilder.ifx(EmitCodeBuilder.scala:38)
E is.hail.types.physical.stypes.concrete.SIndexablePointerValue.$anonfun$forEachDefined$2(SIndexablePointer.scala:99)
E is.hail.asm4s.CodeBuilderLike.$anonfun$whileLoop$1(CodeBuilder.scala:119)
E is.hail.asm4s.CodeBuilderLike.$anonfun$whileLoop$1$adapted(CodeBuilder.scala:119)
E is.hail.asm4s.CodeBuilderLike.whileLoop(CodeBuilder.scala:114)
E is.hail.asm4s.CodeBuilderLike.whileLoop$(CodeBuilder.scala:107)
E is.hail.expr.ir.EmitCodeBuilder.whileLoop(EmitCodeBuilder.scala:38)
E is.hail.asm4s.CodeBuilderLike.whileLoop(CodeBuilder.scala:119)
E is.hail.asm4s.CodeBuilderLike.whileLoop$(CodeBuilder.scala:119)
E is.hail.expr.ir.EmitCodeBuilder.whileLoop(EmitCodeBuilder.scala:38)
E is.hail.types.physical.stypes.concrete.SIndexablePointerValue.forEachDefined(SIndexablePointer.scala:96)
E is.hail.expr.ir.VCFExportFinalizer.writeMetadata(MatrixWriter.scala:766)
There was a problem hiding this comment.
Okay, so with this change, I can get the tests to pass. @tpoterba @patrick-schultz why?
- val files = cb.memoize(Code.newArray[String](partPaths.loadLength() * 2))
+ val len = partPaths.loadLength()
+ val files = cb.memoize(Code.newArray[String](len * 2))
partPaths.forEachDefined(cb) { case (cb, i, file: SStringValue) =>
val path = file.loadString(cb)
- cb += (files(i * 2) = path)
+ cb += files.update(i, path)
// FIXME(chrisvittal): this will put the string ".tbi" in generated code, we should just access the htsjdk value
- cb += (files(i * 2 + 1) = Code.invokeStatic2[htsjdk.tribble.util.ParsingUtils, String, String, String]("appendToPath", path, htsjdk.samtools.util.FileExtensions.TABIX_INDEX))
+ cb += files.update(cb.memoize(i + len), Code.invokeStatic2[htsjdk.tribble.util.ParsingUtils, String, String, String]("appendToPath", path, htsjdk.samtools.util.FileExtensions.TABIX_INDEX))There was a problem hiding this comment.
There are some logic changes, like i*2 -> i in the middle replacement. Are you saying that before this change, there was a malformed lir, and not after?
There was a problem hiding this comment.
Never mind, I see you just changed the layout of files. I also can't see why this should affect the lir generated.
There was a problem hiding this comment.
If IntelliJ's "desugar scala code" is to be believed, the old version wasn't actually desugaring to an update. That might explain it because the path wasn't getting used, so the enclosed block stayed empty. I'd expect explicitly calling update is the important change here. The other changes look fine.
* [query] Fix accidental deletion of tabix files When cleaning up a text export, we delete all files that our successful tasks didn't produce. We weren't adding tabix files to this set, and so were deleting tabix files that were created during vcf export. Now, we explicitly add tabix files to our 'to keep' list. * fixes
When cleaning up a text export, we delete all files that our successful
tasks didn't produce. We weren't adding tabix files to this set, and so
were deleting tabix files that were created during vcf export. Now, we
explicitly add tabix files to our 'to keep' list.