-
Notifications
You must be signed in to change notification settings - Fork 243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[query] Fix accidental deletion of tabix files #12056
[query] Fix accidental deletion of tabix files #12056
Conversation
When cleaning up a text export, we delete all files that our successful tasks didn't produce. We weren't adding tabix files to this set, and so were deleting tabix files that were created during vcf export. Now, we explicitly add tabix files to our 'to keep' list.
val files = cb.memoize(Code.newArray[String](partPaths.loadLength() * 2)) | ||
partPaths.forEachDefined(cb) { case (cb, i, file: SStringValue) => | ||
val path = file.loadString(cb) | ||
cb += (files(i * 2) = path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be creating invalid code. I'm getting lir blocks that are not well formed.
This is the stack trace of the block formation that is later failing the 'block well formed' assertion here
E java.lang.Thread.getStackTrace(Thread.java:1564)
E is.hail.lir.Block.<init>(X.scala:343)
E is.hail.asm4s.Code$.apply(Code.scala:54)
E is.hail.asm4s.Value$$anon$3.get(Code.scala:1282)
E is.hail.asm4s.package$.valueToCode(package.scala:287)
E is.hail.asm4s.package$.intToCode(package.scala:358)
E is.hail.expr.ir.VCFExportFinalizer.$anonfun$writeMetadata$3(MatrixWriter.scala:768)
E is.hail.expr.ir.VCFExportFinalizer.$anonfun$writeMetadata$3$adapted(MatrixWriter.scala:766)
E is.hail.types.physical.stypes.concrete.SIndexablePointerValue.$anonfun$forEachDefined$4(SIndexablePointer.scala:101)
E is.hail.asm4s.CodeBuilderLike.ifx(CodeBuilder.scala:97)
E is.hail.asm4s.CodeBuilderLike.ifx$(CodeBuilder.scala:88)
E is.hail.expr.ir.EmitCodeBuilder.ifx(EmitCodeBuilder.scala:38)
E is.hail.types.physical.stypes.concrete.SIndexablePointerValue.$anonfun$forEachDefined$2(SIndexablePointer.scala:99)
E is.hail.asm4s.CodeBuilderLike.$anonfun$whileLoop$1(CodeBuilder.scala:119)
E is.hail.asm4s.CodeBuilderLike.$anonfun$whileLoop$1$adapted(CodeBuilder.scala:119)
E is.hail.asm4s.CodeBuilderLike.whileLoop(CodeBuilder.scala:114)
E is.hail.asm4s.CodeBuilderLike.whileLoop$(CodeBuilder.scala:107)
E is.hail.expr.ir.EmitCodeBuilder.whileLoop(EmitCodeBuilder.scala:38)
E is.hail.asm4s.CodeBuilderLike.whileLoop(CodeBuilder.scala:119)
E is.hail.asm4s.CodeBuilderLike.whileLoop$(CodeBuilder.scala:119)
E is.hail.expr.ir.EmitCodeBuilder.whileLoop(EmitCodeBuilder.scala:38)
E is.hail.types.physical.stypes.concrete.SIndexablePointerValue.forEachDefined(SIndexablePointer.scala:96)
E is.hail.expr.ir.VCFExportFinalizer.writeMetadata(MatrixWriter.scala:766)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so with this change, I can get the tests to pass. @tpoterba @patrick-schultz why?
- val files = cb.memoize(Code.newArray[String](partPaths.loadLength() * 2))
+ val len = partPaths.loadLength()
+ val files = cb.memoize(Code.newArray[String](len * 2))
partPaths.forEachDefined(cb) { case (cb, i, file: SStringValue) =>
val path = file.loadString(cb)
- cb += (files(i * 2) = path)
+ cb += files.update(i, path)
// FIXME(chrisvittal): this will put the string ".tbi" in generated code, we should just access the htsjdk value
- cb += (files(i * 2 + 1) = Code.invokeStatic2[htsjdk.tribble.util.ParsingUtils, String, String, String]("appendToPath", path, htsjdk.samtools.util.FileExtensions.TABIX_INDEX))
+ cb += files.update(cb.memoize(i + len), Code.invokeStatic2[htsjdk.tribble.util.ParsingUtils, String, String, String]("appendToPath", path, htsjdk.samtools.util.FileExtensions.TABIX_INDEX))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure about this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some logic changes, like i*2
-> i
in the middle replacement. Are you saying that before this change, there was a malformed lir, and not after?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, I see you just changed the layout of files
. I also can't see why this should affect the lir generated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If IntelliJ's "desugar scala code" is to be believed, the old version wasn't actually desugaring to an update
. That might explain it because the path
wasn't getting used, so the enclosed block stayed empty. I'd expect explicitly calling update
is the important change here. The other changes look fine.
* [query] Fix accidental deletion of tabix files When cleaning up a text export, we delete all files that our successful tasks didn't produce. We weren't adding tabix files to this set, and so were deleting tabix files that were created during vcf export. Now, we explicitly add tabix files to our 'to keep' list. * fixes
When cleaning up a text export, we delete all files that our successful
tasks didn't produce. We weren't adding tabix files to this set, and so
were deleting tabix files that were created during vcf export. Now, we
explicitly add tabix files to our 'to keep' list.