-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bugs in IndexFactory and BinningIndexBuilder. #906
Conversation
Codecov Report
@@ Coverage Diff @@
## master #906 +/- ##
==============================================
+ Coverage 65.564% 65.76% +0.196%
- Complexity 7374 7401 +27
==============================================
Files 528 529 +1
Lines 31981 31968 -13
Branches 5467 5463 -4
==============================================
+ Hits 20968 21022 +54
+ Misses 8868 8801 -67
Partials 2145 2145
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this @cmnbroad! I learned a lot from this fix. Some minor comments and some the stream problem with my suggestions.
Looking forward to get this in and retrieve the tribble/tabix indexing support in htsjdk and derived projects!
public AsciiLineReader(final InputStream is){ | ||
// NOTE: This will wrap the input stream in a PositionalBufferedStream even if its already a PositionalBufferedStream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this constructor could be safer by using a private method to wrap the input stream:
private PositionalBufferedStream wrapStream(final InputStream is) {
if (is instanceof PositionalBufferedStream) {
return (PositionalBufferedStream) is;
} else if (is instanceof BlockCompressedInputStream) {
throw new IllegalArgumentException("Cannot use AsciiLineReader with BlockCompressedInputStream; use BlockCompressedAsciiLineReader instead");
} else {
return new PositionalBufferedStream(is);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it should be illegal to pass a BCS to AsciiLineReader (my comment that it violates the contract might be an overstatement). Perhaps we should just change the doc to be more explicit about the fact that it provides buffering, which affects the positions returned.
if (bufferedInputStream instanceof PositionalBufferedStream) { | ||
pbs = (PositionalBufferedStream) bufferedInputStream; | ||
public LocationAware makeIndexableSourceFromStream(final InputStream inputStream) { | ||
if (inputStream instanceof BlockCompressedInputStream) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to deal with this in a new static method AsciiLineReader.from(InputStream is)
and deprecate the constructor. This will be safer for the users of AsciiLineReader
and will simplify the code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like that would put an unnecessary constraint on AsciiLineReader (it should be legal to wrap an AsciiLineReader around a BCS). Having said that, its confusing because AsciiLineReader confers both a line orientation, as well as buffering, which is what messes up the positions. This would be a lot simpler if the buffering were separate.
Also, given how brittle all this code is, I like the idea of having this behavior and the comments explicitly included right here, where the explicit control over the buffering/positioning matters (this method is specifically to support indexing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we had to duplicate this code in GATK anyway, I added a "from" method per this suggestion.
* An AsciiLineReader implementation that wraps a BlockCompressedInputStream and provides no additional buffering. | ||
* Useful for cases where we need to preserve virtual file pointers in the underlying stream, such as during indexing. | ||
*/ | ||
public class BlockCompressedAsciiLineReader extends AsciiLineReader { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the static method is done in AsciiLineReader
, this could be a private inner class to avoid direct usages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think now that we have that static method we should make this either a package protected class or an inner class of AsciiLineReader. I don't think we want to expose it unless we have a use case for that.
@@ -28,7 +28,11 @@ | |||
|
|||
/** | |||
* A wrapper around an {@code InputStream} which performs it's own buffering, and keeps track of the position. | |||
* | |||
* | |||
* TODO: This class implements Positional, which in turn extends LocationAware, which requires preservation of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be fixed by throwing an IllegalArgumentException
if the InputStream
is a BlockCompressedInputStream
as a temporary solution. It is much better than violate the contract and assume that it is working as expected...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that would be the right thing to do, but htsjdk is kind of painted into a corner at this point. There are several other code paths that would have to change, but worse, BinaryFeatureCodec (and thus BCF2Codec) use PositionalBufferedStream for a SOURCE type parameter, even though BCF can be block compressed. So that codec actually depends on being able to wrap a PositionalBufferedStream around a block compressed stream.
@@ -92,6 +88,7 @@ public void testWriteIndex(final File inputFile, final IndexFactory.IndexType ty | |||
Assert.assertTrue(tempIndex.exists()); | |||
// load the generated index | |||
final Index loadedIndex = IndexFactory.loadIndex(tempIndex.getAbsolutePath()); | |||
//TODO: This not very useful; the test can pass even if the generated index is unuseable for queries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this test only for checking if writing the index does not blow up, and if the in-memory index and the one written have the same contents. For testing if it is usable for queries, it should be a different test (which actually does not require to write the index down, I guess).
} | ||
|
||
@Test(dataProvider = "bedTabixIndexTestData") | ||
public void testBedTabixIndex( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice test!
@@ -102,6 +114,8 @@ public void testQueryProvidedItemsAmount() throws IOException { | |||
) // create TabixIndex straight from plaintext VCF | |||
.write(plainTextVcfIndexFile); // write it | |||
|
|||
//TODO: you can pass in a .tbi file as the index for a plain .vcf, but if you *don't* pass in the file name and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But is it correct to use a .tbi index with a plain text file? As you noted in a previous comment, this is not allowed by tabix... I suggest to blow up in this case!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did an initial review pass -- haven't looked at the tests yet, so a second pass will be needed.
} else { | ||
pbs = new PositionalBufferedStream(bufferedInputStream); | ||
return new AsciiLineReaderIterator(new AsciiLineReader(inputStream)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about this case? Seems like you're not "indexable" if you reach this else clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly; it depends on the type of the inputStream. I left this in since it represents the previous behavior for this method. If inputStream isn't wrapping a block compressed stream, and reports it's position correctly, I think it would work. We could throw instead, but if we we're going to do that, we could deprecate this method and replace it method with two overloads with the correct stream argument types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we should have a test case for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there at least be a warning if we reach this? It seems very possible that you're going to try indexing stdin or something like that here.
* An AsciiLineReader implementation that wraps a BlockCompressedInputStream and provides no additional buffering. | ||
* Useful for cases where we need to preserve virtual file pointers in the underlying stream, such as during indexing. | ||
*/ | ||
public class BlockCompressedAsciiLineReader extends AsciiLineReader { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class needs unit tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
|
||
/** | ||
* Read a single line of inpu, advnced the underlying stream only enough to read the line. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inpu -> input
advnced -> advanced
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
}; | ||
|
||
public String readLine(final PositionalBufferedStream stream) { | ||
throw new IllegalArgumentException("A BlockCompressedAsciiLineReader class cannot be used to read from a PositionalBufferedStream"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnsupportedOperationException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -50,6 +52,10 @@ | |||
*/ | |||
public BinningIndexBuilder(final int referenceSequence, final int sequenceLength) { | |||
this.referenceSequence = referenceSequence; | |||
// Initially set each window to -1 so we can distinguish between windows that have no overlapping | |||
// features, and those who's lowest offset is 0, which is a valid (virtual file) offset for feature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who's -> whose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -33,6 +34,7 @@ | |||
* Builder for a BinningIndexContent object. | |||
*/ | |||
public class BinningIndexBuilder { | |||
private static final int UNINITIALIZED_WINDOW = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a unit test for this class that tests with a record at position 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. There were previously no unit tests so now there are tests for a few basic cases.
@@ -243,6 +243,7 @@ private void readHeader() throws IOException { | |||
try { | |||
is = ParsingUtils.openInputStream(path, wrapper); | |||
if (hasBlockCompressedExtension(new URI(URLEncoder.encode(path, "UTF-8")))) { | |||
// TODO: TEST/FIX THIS! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you ensure that a github ticket is filed for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* feature query (otherwise the feature reader can terminate prematurely if the | ||
* header is large). | ||
* @param lineIterator | ||
* @return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fill in @return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@Override | ||
public Object readActualHeader(LineIterator reader) { | ||
public Object readActualHeader(final LineIterator lineIterator) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add unit test for this method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
final PositionalBufferedStream pbs; | ||
if (bufferedInputStream instanceof PositionalBufferedStream) { | ||
pbs = (PositionalBufferedStream) bufferedInputStream; | ||
public LocationAware makeIndexableSourceFromStream(final InputStream inputStream) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you file a ticket against LocationAware
to either make all implementation obey the interface contract, or change the interface to reflect the actual behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also PR fixes #943, which is a tracking ticket for the BinningIndexBuilder issue fixed as part of this PR. |
Round 1 done. Back to @droazen. |
return new PositionalBufferedStream(fileStream); | ||
} catch (final FileNotFoundException e) { | ||
throw new TribbleException.FeatureFileDoesntExist("Unable to open the input file, most likely the file doesn't exist.", inputFile.getAbsolutePath()); | ||
} catch (final IOException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets make sure we don't get npr in these exceptions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
final File inputVCF, | ||
final Interval queryInterval) throws IOException { | ||
// copy the original file and create the index for the copy | ||
final File tmpVCF = new File(IOUtil.createTempDir("testCreateTabixIndexFromVCF", null), inputVCF.getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleanup this temp dir afterwards
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@Test(dataProvider = "bcfDataFactory") | ||
public void testCreateLinearIndexFromBCF(final File inputBCF) throws IOException { | ||
// copy the original file and create the index for the copy | ||
final File tmpBCF = new File(IOUtil.createTempDir("testCreateIndexFromBCF", null), inputBCF.getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same cleanup issue here I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
index.writeBasedOnFeatureFile(tmpBCF); | ||
|
||
try(final VCFFileReader originalReader = new VCFFileReader(inputBCF,false); | ||
final VCFFileReader tmpReader = new VCFFileReader(tmpBCF, new File(tmpBCF.getAbsolutePath()+ ".idx"),true)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hardcoded file extension
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
final List<Interval> queryIntervals | ||
) throws Exception { | ||
// copy the input file and create an index for the copy | ||
final File tmpBed = new File(IOUtil.createTempDir("testBedTabixIndex", null), inputBed.getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
temp dirs all over the place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 This looks good to me modulo the few minor issues. Merge when those are addressed.
final TabixIndex tabixIndexGz = IndexFactory.createTabixIndex(tmpVCF, new VCFCodec(), null); | ||
tabixIndexGz.writeBasedOnFeatureFile(tmpVCF); | ||
|
||
try(final VCFFileReader originalReader = new VCFFileReader(inputVCF,false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need space after try
, and while
below.
BEDCodec bedCodec = new BEDCodec(); | ||
try (final InputStream is = ParsingUtils.openInputStream(gzippedBedFile.getPath()); | ||
final BlockCompressedInputStream bcis = new BlockCompressedInputStream(is)) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
open {
should go on previous line, unless the line is too long
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For multiline resource blocks I think this way is clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A matter of taste :)
One data point, oracle's Java documentation: https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html
@DataProvider(name = "bedDataProvider") | ||
public Object[][] getLinearIndexFactoryTypes(){ | ||
return new Object[][] { | ||
new Object[] { new File(TestUtils.DATA_DIR, "bed/Unigene.sample.bed") }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary new Object[]
, and there shouldn't be a space around the contents of an array initializer, e.g.
public Object[][] getLinearIndexFactoryTypes(){
return new Object[][] {
{new File(TestUtils.DATA_DIR, "bed/Unigene.sample.bed")},
{new File(TestUtils.DATA_DIR, "bed/Unigene.sample.bed.gz")}
};
}
final File discontinuousFile = new File(TestUtils.DATA_DIR + "bed/disconcontigs.bed"); | ||
final BEDCodec bedCodec = new BEDCodec(); | ||
@DataProvider(name = "bedDataProvider") | ||
public Object[][] getLinearIndexFactoryTypes(){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing space before {
// for ( int mid = 0; mid <= end; mid += 1000000 ) { | ||
// tests.add(new Object[]{0, mid, mid+1000000, end}); | ||
// } | ||
List<Object[]> tests = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be shorter to write:
return new Object[][] {{1226943, 1226943, 1226943, 2000000}};
|
||
public class BinningIndexBuilderTest extends HtsjdkTest { | ||
|
||
static int referenceSequenceIndex = 19; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be private
, final
and all uppercase, e.g. REFERENCE_SEQUENCE_INDEX
@jrobinso We're planning to make some changes to the way indexing works in order to fix several bugs, and wanted to get out in front of any issues that might arise for IGV. Amongst other things, with this PR, when indexing we no longer close the input SOURCE/stream after reading the header (this affected indexing for at least one codec in GATK). Can you take a look and see if this will require any IGV changes ? Thx. |
I don't see any IGV impact, and if there are I will deal with it in IGV. |
a7a4eb5
to
74e64ff
Compare
public final String readLine(final PositionalBufferedStream stream) throws IOException{ | ||
// TODO: This should be deprecated and made private/merged with readLine, or made | ||
// static, as it reads from the provided input stream rather than the wrapped input stream | ||
@Deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missed this before, but could you add a deprecation date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and a javadoc comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put an @deprecated inside the javadoc, with the date, and added an @see to the "from" method. I can't figure out how to anchor to the OTHER readLine method though. Maybe I could if I swapped the order of the methods....
* TODO: This class implements LocationAware, which requires preservation of virtual file pointers on BGZF inputs. | ||
* However, if the inputStream wrapped by this class is a BlockCompressedInputStream, it violates that contract by | ||
* wrapping the stream and returning positional file offsets instead. | ||
*/ | ||
public AsciiLineReader(final InputStream is){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we deprecate this constructor and eventually make it private? Is there a case outside of this class that we would want to call this constructor instead of from()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. We could do that. There are a bunch of internal uses of it, but thats easily fixed. But should we deprecate the other one too (the one that takes a PositionalBufferedInputStream), since "from" is the preferred method ? Otherwise its weird to leave the PBS constructor, but not have a BCS constructor (you'd have to use .from for that case). It would be most consistent to just have .from, but thats less BWC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right that it would make sense to also deprecate the other one. What's BWC stand for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry...backward compatible. I'll deprecate the other one as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cmnbroad A few last minute comments. Sorry!
@lbergelson I may have added more questions than answers... See what you think. |
@cmnbroad Before merging this, can we ensure that there are clear and complete instructions for codec authors on how to write a codec that is properly indexable? (eg., in a comment to |
@droazen Done. We should let @lbergelson have one more look at the previous set of changes I made when he gets to it. |
@cmnbroad Looks good to me. 👍 Merge when ready. |
8eb7f93
to
6a7fe97
Compare
Squashed and rebase once more. Will merge when tests pass. |
protected AsciiLineReader() {}; | ||
|
||
/** | ||
* TODO: This class implements LocationAware, which requires preservation of virtual file pointers on BGZF inputs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you want TODO here. This sounds more like a warning or a note to the user of the class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it is a TODO, but you're right the TODO doesn't need to be in the javadoc, just the warning.
* However, if the inputStream wrapped by this class is a BlockCompressedInputStream, it violates that contract by | ||
* wrapping the stream and returning positional file offsets instead. | ||
* | ||
* @deprecated 8/8/2017 see {@link #from} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the format you want here is use {@link #from(InputStream)}
. Saying "see" to me sounds like @see
which has a different usage in javadoc. You can provide the types of the parameters if you want to specify an overridden method.
Also it's not great form to use a date, since the date when the software released is not yet fixed, or easy for a user to determine given a .jar they've got lying around. If you want to include this information, use the version number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the "see"s to "use". AFACIT, we've been putting deprecation dates, I think mostly for our own benefit, since we have existing deprecated stuff but we don't know when it was deprecated. As soon as the maintainer groups defines the deprecation rules, we'll need that info...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've started adding the dates so we can get a rough sense of how long something has been deprecated. We could always look in the blame which is more accurate, but it's a pain if you're scanning through lots of things. We could provide the version number, but our deprecation removal policy (which doesn't really exist in any official form) is "has this been deprecated for a long time" rather than "has this been deprecated for N versions".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbergelson I left the date in - any input on that ? Phil is probably right that its of limited use to the user, but not sure if you guys have discussed the plan for deprecation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind - just saw your comment....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cmnbroad Yeah, it's really not intended for the users except for them to say "oh, yeah, this is ancient and I shouldn't use it." A version number wouldn't be a bad idea, but then we have to be sure we guess what version the commit is going into, which is awkward. Checking the blame is the ultimate guide, but it's a pain to do in bulk.
/** | ||
* Read a single line of input, advance the underlying stream only enough to read the line. | ||
*/ | ||
public String readLine() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are overridden, they need an @Override
annotation. Also there's no need to javadoc an overridden method, as javadoc will copy down the base class method doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
private char[] lineBuffer; | ||
private int lineTerminatorLength = -1; | ||
|
||
protected AsciiLineReader() {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a great use of an empty constructor. If subclassed improperly a user ends up with a class that's in an invalid state. It will only work correctly if certain methods are overridden, which are not part of a documented API.
The proper solution is to create an abstract base class (or interface with default methods) that both AsciiLineReader
and BlockCompressedAsciiLineReader
inherit from. That makes it explicit which methods must be overridden and prevents having inherited fields in BlockCompressedAsciiLineReader
that are invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed thats a better solution, but it its not source compatible. We recognize that we need to do something more comprehensive to fix the underlying problems, and there are a couple of issue tickets to represent that. Ideally we'll only break compatibility once for these issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
6a7fe97
to
1cc12f9
Compare
1cc12f9
to
bebc5ae
Compare
Description
Fixes #393 and a few other issues when creating indices through IndexFactory.create*Index methods that take a file:
Checklist