Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1.16.0 Pre-Release issues and discussion #197

Closed
attipaci opened this issue Nov 9, 2021 · 42 comments · Fixed by #202 or #208
Closed

1.16.0 Pre-Release issues and discussion #197

attipaci opened this issue Nov 9, 2021 · 42 comments · Fixed by #202 or #208
Assignees
Milestone

Comments

@attipaci
Copy link
Collaborator

attipaci commented Nov 9, 2021

Please feel free to use this ticket to note any issues you might encounter with any of the 1.16.0 release candidates, between now and the finalized 1.16.0 release (anticipated in a month or two perhaps).

@attipaci attipaci added this to the 1.16.0 milestone Nov 9, 2021
@attipaci attipaci self-assigned this Nov 9, 2021
@attipaci
Copy link
Collaborator Author

attipaci commented Nov 10, 2021

One issue with rc1 is casting decimal header values to integers, which was possible in prior versions but throws a NumberFormatException in the 1.16.0-rc1. It's an easy change to allow down-casting (which may involve rounding and/or truncation), so we can restore the old behavior and allow 'parsing' decimal types as integers routinely.

@attipaci attipaci linked a pull request Nov 10, 2021 that will close this issue
@attipaci attipaci pinned this issue Nov 10, 2021
@attipaci
Copy link
Collaborator Author

After a fair bit of struggle with github Actions, and getting it to use GPG without obscure errors, automated signed releases (SNAPSHOT and staged) seem to be working... One step closer to the official release of 1.16.0...

@mbtaylor
Copy link
Contributor

I haven't so far managed to look at the 1.16.0 RC in detail, but the introduction of the new method checkTruncated() in nom.tam.util.ArrayDataInput is a breaking change that means my 1.15.2-compliant code no longer compiles (I have classes which implement this interface). Have you considered declaring this method with a default implementation?

@attipaci
Copy link
Collaborator Author

@mbtaylor, that's exactly the sort of thing I was hoping to find out before the official release. :-)

As for finding a resolution, the release is baselined for Java 8, which does not yet support default methods -- and at this point I'd be reluctant to bump it further up. But, I could work around it either by making it a static method or else by putting in somewhere where it is FITS-specific (since, really it's only an issue with Fits files...). I'll come up with something and release an rc3 next Monday...

If you find more trip-ups in the meantime, let me know...

@attipaci
Copy link
Collaborator Author

OK, I think if I make it a static method Fits.checkTruncated(ArrayDataInput) it should not stand in the way. I could even reduce the visibility to package level, since only Fits and Header really use it, and there is no reason why it should be exposed to the public anyway... Let me know if you disagree...

This was linked to pull requests Nov 18, 2021
@attipaci attipaci removed a link to a pull request Nov 19, 2021
@mbtaylor
Copy link
Contributor

Thank you, I haven't really grasped so far how checkTruncated is used, but that sounds like it will address my problem.

But for the record I believe that java 8 (which I agree should be the target platform for this release) does support default methods; the following code compiles under java 8 for me:

public interface A {
    void reset();
    default boolean checkTruncated() {
        return false;
    }
}

@attipaci
Copy link
Collaborator Author

You are right. For some reason I remembered default methods as having been introduced in Java 9 only... ;-).

@attipaci
Copy link
Collaborator Author

attipaci commented Nov 20, 2021

@mbtaylor, BTW, your change of adding markSupported() to the same interface is a similarly breaking change for anyone else who has implemented ArrayDataInput, like you. I think we'll want to make it a default method returning true for that very reason... I'll include that in rc3...

@mbtaylor
Copy link
Contributor

Good point, thanks.

@attipaci attipaci reopened this Nov 21, 2021
@attipaci
Copy link
Collaborator Author

attipaci commented Nov 21, 2021

Hello all,

rc3 is now released for testing. Provided it fares well, let's aim for a 1.16.0 release in two weeks. If you want to run your own tests, but you will not be able to do it in the proposed time-frame, please communicate it here what (reasonable) timeline you require, and we'll adjust the plans accordingly. If new avoidable regressions are found, we'll keep making release candidates as long as needed, and push back the final release accordingly...

@mbtaylor
Copy link
Contributor

One behaviour change has provoked a new failure in my unit tests: attempting to create a nom.tam.fits.HeaderCard with a long value now results in a HeaderCard instance that actually represents multiple 80-character records (after the first they are CONTINUEs), rather than failing. I can see what's going on and work round it, but the behaviour is a bit surprising for a class named "HeaderCard" if you're not expecting it. I have not attempted to work out whether there are pitfalls for code that may be manipulating HeaderCard instances related to this change.

@mbtaylor
Copy link
Contributor

I still have some other unit test failures using 1.16.0-rc3, but I've run out of time to track them down. It's possible the problems result from bugs in my code.

@attipaci
Copy link
Collaborator Author

attipaci commented Nov 22, 2021

I agree that HeaderCard is confusing. Before long strings, there was a 1:1 correspondence between this object and an 80-character header record, but long strings have turned it upside down a long time ago. This is why Header has had separate getNumberOfPhysicalCards() vs getNumberOfCards() methods for a while now. That's nothing new in 1.16. What's new is that long strings are now enabled by default since as of 2016 they are now part of the FITS standard. Maybe you can try to disable them via FitsFactory.setLongStringsEnabled(false), if it's not something you would use normally -- or just to see if that's what breaks things on your end...

@attipaci
Copy link
Collaborator Author

Hi @mbtaylor,

I was wondering if you made progress on getting your code work with 1.16, or if you found any other avoidable regressions? Let me know if you need more time to sort things out, beyond next week...

@mbtaylor
Copy link
Contributor

mbtaylor commented Dec 2, 2021

There is an error in my tests that I've tried hard to track down, but I've not been able to get to the bottom of it.

I tried setting FitsFactory.setLongStringsEnabled(false) and the other backward compatibility settings discussed in #161, but that didn't fix it.

A bisect identified the commit which introduced this failure as d29355e, which is the big one rewriting the I/O stream classes. It appears to be the case that following that commit some calls to BufferedDataInputStream.readByte() are throwing an EOFException when there should be still bytes in the stream; at least the same code path using versions prior to that commit finds bytes there. But whether it's the readByte call at fault or some earlier behaviour that's exhausting the stream, I can't be certain. It might be something else going wrong, or even workarounds for incorrect behaviour by the earlier implementation now doing the wrong thing. This behaviour is buried deep in unit tests that use other parts of my code, so it's not easy to extract a standalone example.

Sorry not to be able to come up with something more transparent, but I really don't have time to look at this much more.

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 2, 2021

Thanks Mark! I very much appreciate your efforts, and I think you may have given enough details for me to track it down. My guess is that it may be a mishandling of an underlying read() returning 0 in the new implementation. If so, that should be easy to track down and fix. I'll let you know when I have found the likely culprit, and we'll move on to an rc4 with it then.

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 2, 2021

OK, I did find two instances where if a read(byte[], int, int) returned 0 (i.e. no data is immediately available from the input -- but not at the end of stream) an EOFException would be thrown. One of these could have got triggered in BufferedDataInputStream (the other would only affect BufferedFile).

I also recall that the old versions of read(byte[], int, int) calls did not follow the canonical contract of the method from InputStream -- which made it very confusing when it behaved which way. So, it's also possible that it's a bug in the old code...

In any case, I'll prepare an rc4 with the fixes for the above mentioned read handling. I'll release it by Monday (or perhaps a bit before on Sunday). In the meantime, you can also check it out from my 1.16.0-rc4 branch if you want to take a quick look before then...

@attipaci attipaci linked a pull request Dec 2, 2021 that will close this issue
@attipaci
Copy link
Collaborator Author

attipaci commented Dec 3, 2021

Hi Mark!

I think I can reproduce the error, but will spend some more time to understand it better...

But, basically, it is java.io.BufferedInputStream that causes this behavior. I switched to BufferedInputstream because I see no reason why we want to run with a home-brewed implementation of the same. Anyway, the following little test code exposes the issue:

    public void testCloggedInputStream() throws Exception {
        final Random random = new Random();
        
        InputStream is = new InputStream() {
            @Override
            public int read() throws IOException {
                throw UnsupportedOperationException("not implemented");
            }

            @Override
            public int read(byte[] b, int from, int length) throws IOException {
                int i = random.nextBoolean() ? 1 : 0;
                System.out.print(i);
                return i;
            }       
        };
        
        BufferedInputStream bis = new BufferedInputStream(is);
        
        for(int i=0; i<10; i++) System.out.println(" -> " + bis.read());

        in.close();
    }

The underlying InputStream.read(byte[], int, int) returns 1 or 0 randomly. Accordingly when it returns 0, you'd expect the BufferedInpuStream.read() to block, but instead it returns -1 (EOF)... What's really annoying is that the non-blocking read() behavior is not at all mentioned in the Java API docs of BufferedInputStream. Grrrr!

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 3, 2021

As for your test code, I think you might be reading from a stream before something provides data into it, expecting the read to block. One way to work around would be that if you know more data is on the way, is to either catch the EOFException as long as you expect more data, or else to call available() before attempting to read...

In general I think that best practice is avoid blocking reads in the library, since it can result in a program hang without giving the user a chance to do something about it. In that regard, throwing the exception is not a bad thing as it gives the control back to the user, who can then keep trying to read again, or else move on...

Nevertheless, for backward compatibility I'll see if I can reinstate the blocking read with BufferedInputStream somehow (provided it's doable without getting too ugly...)

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 3, 2021

I think I found a workable solution, by delegating reads to the underlying input stream whenever the read of BufferedInputStream returns -1. I also added a ArrayInputStream.setAllowBlocking(boolean) and .isAllowBlocking() methods s.t. the user can control whether or not blocking reads behavior is desired. (The default is to allow blocking reads for back compatibility). And, I added a bunch of tests to verify the expected behavior both when blocking is enabled and disabled.

I think this should resolve your issue hopefully. If so, I'll aim for the 1.16 release on or around Dec 13, after giving rc4 a bit of time to discover any other lingering issues...

@mbtaylor
Copy link
Contributor

mbtaylor commented Dec 3, 2021

@attipaci, thanks for your efforts so far. I haven't tried your solution yet, but I'm not sure I agree with your analysis. First, your test code violates the InputStream contract. From the java.io.InputStream javadocs of read(byte[] b, int off, int len):

This method blocks until input data is available, end of file is detected, or an exception is thrown.

If len is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at end of file, the value -1 is returned; otherwise, at least one byte is read and stored into b.

So you're not allowed to return a 0 from that method (unless len==0); the fact that you do may be the reason that BufferedInputStream is displaying unexpected behaviour.

My code is reading from a stream that may not be ready; it's supplied from another thread which is decoding an input base64-encoded stream, so the data should be along eventually, but depending on thread scheduling it may or may not be ready when the read is initiated. I'm using DataInput.readByte() exactly so that it will block until data is available, or throw an EOFException if no more is going to come. A loop polling availability, or throw/catch of EOFException, is not a good solution for that. (In case you want to take a closer look, the exception is thrown at line 91 of IOUtils.java).

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 3, 2021

Hi Mark!

I don't disagree with you. But the fact is that Java's own java.io.BufferedInputStream itself violates that contract -- which is what the test code above demonstrates... It took a little working around to get to behave according to the contract you mention for InputStream...

@mbtaylor
Copy link
Contributor

mbtaylor commented Dec 3, 2021

The test code demonstrates java.io.BufferedInputStream doing the wrong thing if it's constructed using a broken InputStream. I suspect (though admittedly haven't proved) it wouldn't do that if your InputStream instance was behaving correctly.

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 3, 2021

Good point! I did not read the InputStream contract carefully enough... Thanks for pointing that out :-). In that case I'm not sure if the changes to ArrayInputStream will do anything for you, or if they are even necessary... I'll try to think of other ways that things could be going wrong...

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 3, 2021

I undid the changes toArrayInputStream (due to my misreading the InputStream contract), but left the other EOF related changes in place. At this point I'm pretty sure that readByte() will only throw an EOFException if the underlying read() returns -1...

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 3, 2021

Mark,

Here is the old read(byte[], int, int) from BufferedDataInputStream before my changes:

    @Override
    public int read(byte[] obuf, int offset, int length) throws IOException {
        int total = 0;
        int remainingToRead = length;
        int currentOffset = offset;
        while (remainingToRead > 0) {
            // Use just the buffered I/O to get needed info.
            int xlen = super.read(obuf, currentOffset, remainingToRead);
            if (xlen <= 0) {
                if (total == 0) {
                    throw new EOFException();
                } else {
                    return total;
                }
            } else {
                remainingToRead -= xlen;
                total += xlen;
                currentOffset += xlen;
            }
        }
        return total;
    }

As you can see, it threw an EOFException instead of returning -1 at the end of file, which is clearly not in the InputStream contract. And, the corresponding old implementation of readByte() was simply:

    @Override
    public byte readByte() throws IOException {
        return (byte) read();
    }

Which means, that if you have overwritten either of the BufferedDataInputStream.read() methods with one that follows the contract of InputStream, returning -1 at the end of file consistently -- the old implementation of readByte() would not have thrown an exception but would return 0xff instead at the end of file. The new implementation of readByte() on the other hand will throw an EOFException if the underlying read() returns -1. Perhaps, that's at the root of your test failures?

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 3, 2021

I added a last-minute refactoring of the new (in 1.16) classes of ArrayDecoder / ArrayEncoder, to InputDecoder and OutputEncoder, which I think convey their functions more intuitively. In the unlikely case that someone already adapted their code to use the new classes, please forgive my whim...

@attipaci attipaci reopened this Dec 5, 2021
@attipaci
Copy link
Collaborator Author

attipaci commented Dec 5, 2021

rc4 is now out, pushing the final 1.16.0 release to December 13 or later (depending on how things go from here...)

@mbtaylor
Copy link
Contributor

mbtaylor commented Dec 6, 2021

@attipaci sorry, but the behaviour in my tests has not changed much. I still see

...
Caused by: java.io.EOFException
        at nom.tam.util.FitsDecoder.readByte(FitsDecoder.java:180)
        at nom.tam.util.FitsInputStream.readByte(FitsInputStream.java:295)
        at uk.ac.starlink.util.IOUtils.skipBytes(IOUtils.java:91)
        ...

when I believe there are bytes remaining (though not yet available()) in that stream.
(at nom-tam-fits-1.16.0-rc4)

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 6, 2021

Hi Mark!

I'm not too surprised.

It would be good to track down where this comes from. As I mentioned, at this point I'm pretty sure the library does not throw an EOFException unless the underlying read() returns -1. (I checked through all EOFException references in the lib).

So, you could either try to diagnose why this happens in your code -- perhaps by a deriving a new FitsInputStream subclass, which overrride the read(...) calls, by calling super.read(...), and printing / logging diagnostics that you can use when these return -1.... Or else, if you can come up with an isolated test code that can demonstrate that the library itself misbehaves in a clear way... (If so, we can integrate such a test with the corresponding fix)...

We should try to conclude it either way this week (or maybe next). I'm not in a rush to release 1.16, but I also don't want to delay it unnecessarily...

-- A.

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 6, 2021

I just noticed that your exception originates from your own implementation of skipBytes(). I think that's probably a good place to start. You could try using FitsInputStream.skip(long) or .skipAllBytes(long) directly -- the first makes a best attempt and returns the number of bytes actually skipped, while the second wraps the first to throw an exception in the required number of bytes could not be skipped entirely. The implementation will try to use the skip(long) of the underlying stream, and if that fails, it will try to read() the bytes from the stream. Based on your trace you must be doing something similar too...

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 6, 2021

Ooops, I did find an issue with the FitsInputStream.skipAllBytes(long) implementation. I'll prepare a fix on my fork right away... If you can test from there that's great, or else I can make a quick rc5...

@mbtaylor
Copy link
Contributor

mbtaylor commented Dec 6, 2021

Yes I can pull from your fork.

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 6, 2021

Well, it's not as simple as I first thought... What caught my eye is that it is normal for skip(long) to return 0, but we threw an EOFException, the same as for -1. So that's another thing that's not according to the canonical contract. But looking at the old code, 0 return was also handled the same as -1, so maybe it's not it after all... I'll let you know if I find something definitive...

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 6, 2021

Mark, in the meantime maybe you want to look at BufferedDataInputStream here before any of my changes. See if you can find out anything from the mess of what skipAllBytes() used to be... It's definitely different from the new behavior, but I want to understand which one is correct, or what may be missed in the new one...

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 6, 2021

So, the best I could find so far is that the new skipAllBytes() in FitsInputStream (l.275), has a condition:

        if (got != toSkip) {
            throw new EOFException("Reached end-of-stream after skipping " + got + " of " + toSkip);
        }

which would not handle negative arguments to skipAllBytes() properly. It should probably be if (got < toSkip) instead... Could it be you call this method with a negative value???

@mbtaylor
Copy link
Contributor

mbtaylor commented Dec 6, 2021

No, I'm not calling skipAllBytes, just InputStream.skip(long) or DataInput.skipBytes(int) (I don't think it matters which) followed by DataInput.readByte() if they yield zero.

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 6, 2021

Well, you could replace your DataInput.readByte() with an InputStream.read() to cut out the middle-man, and handle the -1 yourself -- that way your read will not go through the nom-tam-fits lib at all, and you can see what really goes on...

@mbtaylor
Copy link
Contributor

mbtaylor commented Dec 7, 2021

Both of those signal end of file, so I suspect that something has gone wrong with earlier operations on the stream, but I can't work out what. It's possible that the older behaviour was wrong or that somehow or other there's something wrong in my code provoking the change of behaviour. I've tried hard to track down what's wrong and failed but I really can't spend more time on this. I don't want to hold up your release any longer.

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 7, 2021

Hi Mark!

That's my feeling about it too -- that the old version mishandled the EOF, and hid an underlying problem for you. But, I wanted to be sure (and in the meantime, I still found a few things that needed a bit of tweaking :-). The InputStream.read() check seems to exonerate the 1.16.0 library code. As such, I will plan to go ahead with the release next week. But, if you do find anything worrisome before then, give me a howl!

Thanks again for your efforts :-). I really appreciate your feedback and catching some of the problems I created...

After the release, I'll be taking a break from nom-tam-fits lib for a while, but I'll be back for the next update or release (maybe I will try to adhere to a 6-month release cycle...). I hope you'll be back for more too!

@mbtaylor
Copy link
Contributor

mbtaylor commented Dec 8, 2021

OK, I'll mention one other item. Since some of the method signatures have changed at this release, 1.16.0 does not function as a "drop-in" runtime replacement for 1.15.2. For instance, the following code will compile against either 1.15.2 or 1.16.0:

import nom.tam.fits.Header;
public class Y {
    public static void main( String[] args ) throws Exception {
        Header hdr = new Header();
        hdr.addValue("SIMPLE", true, "Standard FITS format");
    }
}

However if I compile it against 1.15.2 and then run it against 1.16.0-rc4, it fails with:

Exception in thread "main" java.lang.NoSuchMethodError: nom.tam.fits.Header.addValue(Ljava/lang/String;ZLjava/lang/String;)V
        at Y.main(Y.java:5)

since the method signature that source call resolves to is different in the two versions (HeaderCard addValue(String,Boolean,String) vs. void addValue(String,boolean,String)). I'm not saying that has to be changed, or that you haven't already thought about it, but it should be understood and flagged in the release notes.

@attipaci
Copy link
Collaborator Author

attipaci commented Dec 8, 2021

True, and I have not thought much about drop-in replacements before. I'll be sure to comment on that in the release description. I'm not too worried about it though. Downstream software is either build from source(s), like in the case of Debian packages, or else the requisite jars are bundled with the app, to ensure that the versions are compatible. Drop-in replacements of shared libraries rarely work, as anyone who has ever tried drop-in replacing .so libs on Linux knows :-)

@attipaci attipaci unpinned this issue Dec 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants