Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hetbsf.tst fails for BZIP2 on s390x #535

Closed
Fish-Git opened this issue Dec 17, 2022 · 12 comments
Closed

hetbsf.tst fails for BZIP2 on s390x #535

Fish-Git opened this issue Dec 17, 2022 · 12 comments
Labels
BUG The issue describes likely incorrect product functionality that likely needs corrected.

Comments

@Fish-Git
Copy link
Member

The hetbsf.tst performs two identical tests: the first using a ZLIB compressed .het tape as input, the second using a BZIP2 compressed .het tape as input. On s390x systems/guests, the second BZIP2 test always fails:

   >>>>> line  8786: Received unexpected wait state:  000A0000 00EEEEEE
   >>>>> line  8800: Storage compare mismatch.
                     Want: R:00000500 E5D6D3F1  
                     Got:  R:00000500 FFFFFFFF
   >>>>> line  8805: Storage compare mismatch.
                     Want: R:00000510 0048  
                     Got:  R:00000510 FFFF
   >>>>> line  8810: Storage compare mismatch.
                     Want: R:00000510 E5D6D3F1  
                     Got:  R:00000510 FFFFFFFF
   Test "S/370 HET (BZIP2) tape BSF into Load Point":  1 OK compares.  4 failures.

Further research has revealed the failure is due to an incorrect API call to the BZ2_bzBuffToBuffDecompress libbz2 (bz2) function in source member hetlib.c's het_read function:

hyperion/hetlib.c

Lines 964 to 984 in 5857860

#if defined( HET_BZIP2 )
case HETHDR_FLAGS1_BZLIB:
slen = HETMAX_BLOCKSIZE;
rc = BZ2_bzBuffToBuffDecompress( sbuf,
(void *) &slen,
tbuf,
tlen,
0,
0 );
if (rc != BZ_OK)
{
rc = errno;
free_aligned( tbuf );
errno = rc;
return( HETE_DECERR );
}
tlen = slen;
break;
#endif /* defined( HET_BZIP2 ) */

The problem stems from the fact that the bzip2 BZ2_bzBuffToBuffDecompress API is actually defined in header bzlib.h as:

BZ_EXTERN int BZ_API(BZ2_bzBuffToBuffDecompress) (
      char*         dest,
      unsigned int* destLen,
      char*         source,
      unsigned int  sourceLen,
      int           small,
      int           verbosity
   );

but Hercules's het_read function in hetlib.c is passing unsigned long values (instead of unsigned int values) for the destLen (slen) and sourceLen (tlen) parameters:

hyperion/hetlib.c

Lines 765 to 772 in 5857860

DLL_EXPORT int
het_read( HETB *hetb, void *sbuf )
{
char *tptr;
int rc;
unsigned long slen;
int flags1, flags2;
unsigned long tlen;

 
Recall that s390x systems (just like most other non-Windows systems) are LP64, not LLP4.  (where 'long' is 64 bits, not 32 bits like an 'int' is)

When the slen and tlen variables in the het_read function are properly declared as unisgned int, the test passes.

The same bug exists in the het_write function as well, where the BZ2_bzBuffToBuffCompress API call is being made.

This GitHub Issue is being created to simply document the bug and the resulting upcoming fix, which I will be making shortly.
 

@Fish-Git Fish-Git added BUG The issue describes likely incorrect product functionality that likely needs corrected. IN PROGRESS... I'm working on it! (Or someone else is!) labels Dec 17, 2022
Fish-Git added a commit that referenced this issue Dec 17, 2022
@Fish-Git
Copy link
Member Author

Fixed by commit 7ab077e.

Closing.

@Fish-Git Fish-Git removed the IN PROGRESS... I'm working on it! (Or someone else is!) label Dec 17, 2022
@wrljet
Copy link
Member

wrljet commented Dec 17, 2022

You're closing before it's tested on the Spark64?

@Fish-Git
Copy link
Member Author

You're closing before it's tested on the Spark64?

Yes. Based on my analysis I am 99.999%+ confident the Sparc64 bug is identical and thus the fix just committed will also fix it too.

@Fish-Git
Copy link
Member Author

Yes. Based on my analysis I am 99.999%+ confident the Sparc64 bug is identical and thus the fix just committed will also fix it too.

(SIGH!)   Or at least it would if I had implemented it correctly! (the way I actually tested it!)

Sorry about that.   :(

New commit d53fc0c replaces previous commit.

@wrljet
Copy link
Member

wrljet commented Dec 17, 2022

OK, with commit d53fc0c the tests now pass on Hercules emulated s390x zLinux Ubuntu 18, real hardware s390x zLinux Centos 7, and Sun Sparc64 NetBSD 9.

(just mentioning for completeness, the original "fix", commit 7ab077e resulted in compiler warnings related to the fix, failed ZLIB test on s390x, and a crash on the Sparc64.)

@Fish-Git
Copy link
Member Author

the original "fix", commit 7ab077e resulted in compiler warnings related to the fix, failed ZLIB test on s390x, and a crash on the Sparc64.)

Yep. I saw the compiler warnings too, which is how I knew I f**ked up.

Glad to hear all is well now. I'm still having trouble testing it on my system though. I do a git pull and it says up to date, but after building it (make followed by make install), for some reason it's still saying it's at version g2e984f9a, which I don't understand! I must have done something wrong. I'm in the middle of doing everything all over again, but as you know it takes freaking forever.

@wrljet
Copy link
Member

wrljet commented Dec 18, 2022

Using the same system under Windows 7 (although God knows what you've done to it), what I did in zLinux:

cd ~/herctest/hyperion
git fetch
git pull
cd build
make
make check

And yes, it took freakin' forever.

@Fish-Git
Copy link
Member Author

Fish-Git commented Dec 18, 2022

Using the same system under Windows 7 (although God knows what you've done to it), what I did in zLinux:

I don't know what I was doing wrong or why it wasn't working, but I decided to just do a hyperion-buildall.sh -v -p, which, as expected, took forever, but eventually finished successfully, and now my Hercules properly reports version gd53fc0c8.

FWIW I was doing the exact same thing (except for the git fetch which I wasn't doing; is it important?), and each time, after the make (followed by a make install of course), it would still report the same version. I don't know why.

But the problem's moot. I did manage to get the system build using your hercules-helper script and all is fine now.

I'm not sure what you mean by "(although God knows what you've done to it)". Except for a single alias command (cls=clear) and adding . (current directory) to the $PATH, it's exactly the same system you sent me.

@wrljet
Copy link
Member

wrljet commented Dec 18, 2022

Fish,

If you just run what I did, perhaps it'll skip the _dynamic_version so the informational display will be out of date?

As far as git fetch goes... I hate git and don't get it. So I don't know if it's needed or not. Probably it isn't needed.

I'm not sure what you mean by "(although God knows what you've done to it)". Except for a single alias command (cls=clear) and adding . (current directory) to the $PATH, it's exactly the same system you sent me.

I was being cute.

Having . in the PATH is considered by some to be a security risk.

Bill

@wrljet
Copy link
Member

wrljet commented Dec 18, 2022

The fact that it runs forever, and everything actually works, is a pretty good thing, don't you think?

@Fish-Git
Copy link
Member Author

Fish-Git commented Dec 18, 2022

If you just run what I did, perhaps it'll skip the _dynamic_version so the informational display will be out of date?

Probably. My *nix skills are weak.

I'm not sure what you mean by "(although God knows what you've done to it)".

I was being cute.

Ah. My bad. Missed that.  :)

Having . in the PATH is considered by some to be a security risk.

I'm aware of that. Given how I use my system however it's not a real concern to me. It's more convenient for me to be able to enter the name of a command or script without having to always prefix it with ./. If I want to run the script foobar in my current directory, I want to be able to enter just foobar to run it, not ./foobar. It's just a personal preference of mine.

@Fish-Git
Copy link
Member Author

The fact that it runs forever, and everything actually works, is a pretty good thing, don't you think?

Absolutely! It works quite well! I'm quite proud of your hard work, and you should be too!  :))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BUG The issue describes likely incorrect product functionality that likely needs corrected.
Projects
None yet
Development

No branches or pull requests

2 participants