New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix flagstat output file encoding #840

Merged
merged 1 commit into from Sep 28, 2015

Conversation

Projects
None yet
3 participants
@ryan-williams
Member

ryan-williams commented Sep 28, 2015

Lesson: don't pick random encodings from DataOutputStream to output your files in!

Also, cleans up some extra whitespace.

@@ -121,7 +121,8 @@ class FlagStat(protected val args: FlagStatArgs) extends BDGSparkCommand[FlagSta
val fs = FileSystem.get(conf)
val path = new Path(outputPath)
val outputStream = fs.create(path, true)
outputStream.writeUTF(output)
outputStream.writeBytes(output)

This comment has been minimized.

@fnothaft

fnothaft Sep 28, 2015

Member

Why writeBytes instead of writeChars?

@fnothaft

fnothaft Sep 28, 2015

Member

Why writeBytes instead of writeChars?

This comment has been minimized.

@ryan-williams

ryan-williams Sep 28, 2015

Member

if all output chars are < 128 (which should always be true of this output), then writeBytes should actually be UTF8, whereas writeChars will output two bytes per char, and the former seems preferable to me? I was thinking that the existing println behavior is just doing one byte per char but I don't actually know, this is not my area of expertise. lmk what you think.

@ryan-williams

ryan-williams Sep 28, 2015

Member

if all output chars are < 128 (which should always be true of this output), then writeBytes should actually be UTF8, whereas writeChars will output two bytes per char, and the former seems preferable to me? I was thinking that the existing println behavior is just doing one byte per char but I don't actually know, this is not my area of expertise. lmk what you think.

This comment has been minimized.

@fnothaft

fnothaft Sep 28, 2015

Member

Oh, I see. I am OK either way.

@fnothaft

fnothaft Sep 28, 2015

Member

Oh, I see. I am OK either way.

This comment has been minimized.

@ryan-williams

ryan-williams Sep 28, 2015

Member

fwiw I just tested with writeChars and my shell displays the results just like it did before, which was my only litmus test, so I'm fine either way

@ryan-williams

ryan-williams Sep 28, 2015

Member

fwiw I just tested with writeChars and my shell displays the results just like it did before, which was my only litmus test, so I'm fine either way

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Sep 28, 2015

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/945/
Test PASSed.

AmplabJenkins commented Sep 28, 2015

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/945/
Test PASSed.

@fnothaft fnothaft merged commit 2d5c7d3 into bigdatagenomics:master Sep 28, 2015

1 check passed

default Merged build finished.
Details
@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft
Member

fnothaft commented Sep 28, 2015

Thanks @ryan-williams!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment