Erroneous CCW chain results in EQUIPMENT CHECK #660

Closed
SE20225 opened this issue May 11, 2024 · 14 comments
Labels
BUG The issue describes likely incorrect product functionality that likely needs corrected.

Comments

@SE20225

SE20225 commented May 11, 2024

During the last few months, I have spent some time updating an old program so that it can execute channel programs without getting 'disturbed' by the SEEK+SetFileMask prefixing that is standard in MVS.

On XA, authorized programs could turn it off by setting a bit in an IO control block, but this is not possible on older MVS (or at least I am not aware of how to do it), so I had to add code effectively introducing another IOS Driver. So now I have this running under Hercules and MVS-TK5.

In particular I intend to run ECKD (DX+LR, which MVS-TK5 does not use). "Very good exercise, boys," as my professor used to say when he did not want to do something himself.

While developing this, I used a simple (single) NOP CCW, and for the next test I tried the next old chain in my code. It happens to be a SEARCH ID, TIC back, WRITE CKD chain. It is a very well known fact that a SEEK must precede this, and then of course it runs perfectly. But I tried without prefixing and it failed, though not as I expected.

I get an EQC and when traced in Hercules it seems that the code somehow runs amok and produces a varying number of trace entries before passing end of track and gets the EQC. Since then I have reread all my older DASD books and it is not clearly documented what error should surface (does anybody have a real 2314 for testing?) but it seems that a NoRecordFound could be more appropriate (since we could be searching wrong track).

Only with 3880 (and newer) is it clear from the book that a CMD REJ with invalid sequence (02) should be expected. The 4.6 release of Hercules that I am using has code to detect this, but it is effective only with 3990. My 3380 gets attached to a 3880 (without J/K support).

Tried 3350 and get the same EQC. With 3390 (which uses a 3990 CU), I get the CMDREJ 02 as coded in ckddasd.c.

I think that something is wrong here. This should not give an EQC.

I am providing an edited Hercules log with IO trace and a small piece of SAVECORE dump of a code to recreate, and the virtual volume I am using.

Set the device number in the instruction starting at loc 100 (hex) and start in Supervisor State at loc 100.

It seems that the actual contents of the track are irrelevant.

This is a very low severity 'problem'. What more shall I provide?

It is certainly much more important that Hercules correctly executes correctly coded CCW chains than that it gives correct error responses to all possible ways to miscode DASD I/O. What about a read or write without a search?

Is there some simple way to test if a track address of CCHH has been established or not for the current CCW chain?

andersedlund@telia.com

@Fish-Git
Member

> Since then I have reread all my older DASD books and it is not clearly documented what error should surface ... but it seems that a NoRecordFound could be more appropriate (since we could be searching wrong track).

Do you have "GA32-0274-05 3990,9390 Storage Control Reference"?

What about "SA22-1025-00 System-390 Internal Disk Subsystem - Reference Guide (Multiprise 3000)"?

> I am providing an edited Hercules log with IO trace and a small piece of SAVECORE dump of a code to recreate, ...

Providing the actual assembler source code would be better.

> 19:25:58 HHC01413I Hercules version 4.6.0.10941-SDL-g65c97fd6-modified
> 19:25:58 HHC01414I (C) Copyright 1999-2023 by Roger Bowler, Jan Jaeger, and others
> 19:25:58 HHC01417I ** The SDL 4.x Hyperion version of Hercules **
> 19:25:58 HHC01415I Build date: Feb 15 2024 at 11:52:18
>
> The modifications are from issue 615 and in ckddasd.c

Unacceptable. Please try again with an unmodified version of Hercules. Preferably version 4.7, the most recent officially released version.

> ... and the virtual volume I am using.

I see no virtual volume attached to this issue. GitHub has limits on the size of the files that you can attach. I'm guessing that's probably why it wasn't attached. If you could upload it somewhere and just provide a download URL for us, that would be great. Or, if it's not too big, you could upload it to my SoftDevLabs FTP server and then I could provide the download URL for others. (But I prefer that you upload it yourself somewhere if you don't mind.)

> What more shall I provide?

  1. The assembler source code to your test.
  2. The virtual volume you are testing with.
  3. The Hercules log of your test that illustrates the problem using an unmodified version of Hercules.

Fish-Git added the BUG, QUESTION... ("A question was asked but has not been answered yet, -OR- additional feedback is requested.") and Researching... ("The issue is being looked into or additional information is being gathered/located.") labels May 11, 2024
@SE20225
Author

SE20225 commented May 15, 2024

I installed 4.7 and the EQC recreated nicely. Somehow, it has
escaped me that 4.7 had become available. I guess I have unintentionally somehow
turned off getting messages sent from Hercules when issues change etc.

I am enclosing a Hercules log with IO trace enabled. From MVS.
This time I will zip the directory so you will find the files, I hope.

Interestingly, the handloop code that I have causes Hercules 4.7 to crash and take
a dump. I am enclosing a couple of them. I thought (wrongly) that this was a result
of fooling around with trace functions or running on a half MP, but it fails with
a uniprocessor without trace too! Same CORE file ran fine on my (modified) 4.6
system. The IO trace when running under MVS went OK and is in the log file.

I could send my assembler code but of course it takes a MVS to run. I am copying
the relevant CCW chains here. I guess you could easily fit them into some
code you already have.

Under MVS this is what runs:

WRICCW   CCW   SIDE,SEEKAD,CC,5                                       
*                                                                     
         CCW   TIC,*-8,0,0                                            
*                                                                     
         CCW   WCKD,DACOUNT,CD,8   DATA CHAIN WRITE OF 8 BYTES COUNT  
*                                                                     
         CCW   WCKD,QRAS,CD,1      SEVERAL WRITES OF ONE BYTE
         CCW   WCKD,QRAS,CD,1      2                                  
         CCW   WCKD,QRAS,CD,1      3                                  
         CCW   WCKD,QRAS,CD,1      4                                  
         CCW   WCKD,QRAS,CD,1      5                                  
         CCW   WCKD,QRAS,CD,1      6                                  
         CCW   WCKD,QRAS,CD,1      7                                  
         CCW   WCKD,QRAS,CD,1      8                                  
*                                                                     
         CCW   WCKD,QRAS,0,392     AND THE REMAINING BYTES TO 400     
* 
SEEKAD   DC    X'0002000000' 
DACOUNT  DS    0D                                                              
         DC    X'0002000001'       COUNT FIELD                        
         DC    X'00'               KL = 0                             
         DC    H'400'              DL = 400                           
*

SIDE is X'31', TIC is X'08', and WCKD is X'1D'.
Sorry about the confusing label SEEKAD; it is the 5-byte CCHHR search argument, not a seek address.
QRAS is any address where 400 bytes of garbage follow.
This 'misuse' of data chaining causes an OVERRUN error with traditional parallel channels.
In real life this could occur with bad interface cables and MVS has code to set up
WRITE INHIBIT for those paths to avoid the situation where the machine has to write zeros
because the channel did not deliver data quickly enough. So it is rather an UNDERRUN !
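
As a rough illustration of the MVS chain above, here is a minimal Python sketch that packs each CCW as an 8-byte format-0 CCW (opcode, 24-bit data address, flag byte, unused byte, 16-bit count) and adds up the counts of the data-chained write. The helper names `ccw0`, `mvs_chain` and `data_chained_length`, and every storage address used in the example, are assumptions made for illustration; only the opcodes, flags and counts come from the chain as posted.

```python
import struct

SIDE, TIC, WCKD = 0x31, 0x08, 0x1D   # opcodes as quoted in the post
CD, CC = 0x80, 0x40                  # chain-data and chain-command flag bits


def ccw0(op, addr, flags, count):
    """Pack one format-0 CCW: opcode, 24-bit address, flags, zero byte, count."""
    return struct.pack(">B3sBBH", op, addr.to_bytes(3, "big"), flags, 0, count)


def mvs_chain(base, seekad, dacount, qras):
    """Build the 12-CCW chain; all four addresses are hypothetical inputs."""
    chain = [
        ccw0(SIDE, seekad, CC, 5),    # SEARCH ID EQUAL on the 5-byte CCHHR
        ccw0(TIC, base, 0, 0),        # TIC *-8: back to the search
        ccw0(WCKD, dacount, CD, 8),   # first 8 bytes: the count field
    ]
    chain += [ccw0(WCKD, qras, CD, 1) for _ in range(8)]  # eight 1-byte pieces
    chain.append(ccw0(WCKD, qras, 0, 392))                # remainder, CD off
    return chain


def data_chained_length(chain, start=2):
    """Sum the counts of the data-chained run that begins at index `start`."""
    total = 0
    for ccw in chain[start:]:
        _op, _addr, flags, _zero, count = struct.unpack(">B3sBBH", ccw)
        total += count
        if not flags & CD:            # last CCW of the data chain
            break
    return total


chain = mvs_chain(base=0x001000, seekad=0x002000, dacount=0x002008, qras=0x003000)
print(data_chained_length(chain))     # 8 + 8*1 + 392
```

The data-chained WRITE CKD therefore transfers the same 408 bytes as a single write would, just delivered in ten pieces.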

In the handloop this is simplified to

WRICCW   CCW   SIDE,SEEKAD,CC,5                                       
*                                                                     
         CCW   TIC,*-8,0,0                                            
*                                                                     
         CCW   WCKD,DACOUNT,0,408  
* 
SEEKAD   DC    X'0002000000' 
DACOUNT  DS    0D                                                                   
         DC    X'0002000001'       COUNT FIELD   REC 1                        
         DC    X'00'               KL = 0                             
         DC    H'400'              DL = 400 
         DS    400X                          
*            
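
To make the byte counts in the simplified chain concrete, the following Python sketch packs the 5-byte search argument and the 8-byte count field (CC, HH, R, KL, DL) and shows where the WRITE CKD length of 408 comes from. The function names and the zero-filled data area are assumptions for illustration; the field values are the ones in the DC statements above.

```python
import struct


def search_argument(cyl, head, rec):
    """Pack the 5-byte CCHHR argument used by SEARCH ID."""
    return struct.pack(">HHB", cyl, head, rec)


def count_field(cyl, head, rec, key_len, data_len):
    """Pack the 8-byte count field: CC HH R KL DL, all big-endian."""
    return struct.pack(">HHBBH", cyl, head, rec, key_len, data_len)


search_arg = search_argument(2, 0, 0)        # X'0002000000' (record 0)
count = count_field(2, 0, 1, 0, 400)         # record 1, KL = 0, DL = 400
write_buffer = count + bytes(400)            # hypothetical zero-filled data

print(search_arg.hex(), count.hex(), len(write_buffer))
```

The write length is the 8-byte count field plus KL + DL bytes, which gives 8 + 0 + 400 = 408 here.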

As for documentation on what error should be produced by this incomplete chain, it is very
clear that modern equipment will give a command reject. So in the best of worlds, let's
assume that the older machines did this too. CMDREJ (or NRF) has the advantage that it
is not retried the way an EQC is.

When reading the reference books for 2314 and older they primarily describe how the CCWs
are to be used in running text. The 3830/3330 book has a section for each CCW which (to my surprise)
actually states: NONE for Chaining and Special Requirements. Same for the 2305!
When other than None is stated, it is typically related to Set File Mask.
For 3880 it is clarified that with F/C 3005 (J/K) there is a requirement of a preceding
Seek or Locate, and more recent books also state that a CMD REJ will occur.

Of course there was always a requirement for Seek AND Search in almost all successful DASD
CCW chains.
I am enclosing a copy of a DASD Introduction book. On pages 3-27 to 3-28 is a description of
the 'StandAloneSeek' that was used to avoid tying up channels while the physical seek
is in progress. This illustration really suggests that the follow-up chain can start with
a search, assuming then that the CU keeps the CCHH information from one chain to the next.
I doubt this. As I remember there is a seek in the second chain too, but if the access
is in place, it should be fast. And with a 'shared DASD' config there will be at least
one different machine which could touch the same volume between the two chains!
Unfortunately the IECIOSAM source code with the TurnKey system is not complete, so I
am lost here. I'll continue to look into this and am also reading about special considerations
for R0 (which is seek'd to here) mentioned in ckddasd.c.

For 2314 a command reject is explained as invalid command code but with another bit set
it means 'invalid sequence of commands'. And this scheme is refined for later machines.

So there are requirements. But whether you would get a CMD REJ or possibly(?) NRF
on other machines is not really important. As long as correct chains run.
No reason to get an EQC.

I hope you can recreate using any of the chains above.

andersedlund@telia.com
338001.zip

@SE20225 SE20225 closed this as completed May 15, 2024
@SE20225 SE20225 reopened this May 15, 2024
@SE20225
Author

SE20225 commented May 15, 2024

I could successfully recreate the EQC by running the handloop in 'step mode'. Hercules remained up.

@Fish-Git
Member

> I could send my assembler code but of course it takes a MVS to run.

Not necessarily. I could more than likely get it to compile just fine with SATK's ASMA assembler (which is what I normally use and highly recommend for any/all stand alone tests. Let me know if you need any help getting it installed and/or configured).

> Interestingly, the handloop code that I have causes Hercules 4.7 to crash and take
> a dump.

Is that included in the .zip? Does it always occur? That is to say, can it be easily and reliably recreated?

p.s. Have you tried 4.8-DEV? Does it behave the same?

@SE20225
Author

SE20225 commented May 19, 2024

No problem downloading the SATK-ASMA stuff and I expect no problem in coding a test/recreate program, but it takes some reading before understanding how the SATK process works. I'll model new code on what I received from the ICKDSF related changes, 615.

How do I download the 4.8-DEV code? Or rather from where? Is it readily executable or does it require compilation like any 4.x source code?

Anders

@Fish-Git
Member

Fish-Git commented May 19, 2024

> ... but it takes some reading before understanding how the SATK process works.

Yes, it seems quite intimidating at first, but give me a few (minutes? hours? days? centuries?) and I'll try to whip up a Quick Start guide for you. There are only a couple of things you really need to do before you're up and running. It's not as complicated as it may seem.

> How do I download the 4.8-DEV code? Or rather from where?

You can't. At the moment(*), you have to build it for yourself.

> Is it readily executable or does it require compilation like any 4.x source code?

No, there are no pre-built executables for the 4.8-DEV code (yet(**)). At the current time you still have to build it for yourself. I recommend using Bill Lewis's most excellent Hercules Helper product. It automates the entire process for you, making it incredibly easy to build any version of Hercules for yourself directly from the repository (which is what we recommend).

Not sure what you mean by "like any 4.x source code" though. Pre-built downloadable binaries for ALL official releases of Hercules have always been available for Windows. Only the Linux crowd needs to build Hercules for themselves (due to the plethora of different distributions and platforms out there). But for those using Windows, downloadable pre-builts have always been available for every 4.x release from 4.1 all the way to 4.7.


(*) As I understand it, Bill (and someone else?) is supposedly working on trying to automate the building of the current development branch of our repository each time there is a new commit made. This way, anyone can download a "bleeding edge" still-under-development version of Hercules at any time without having to build it for themselves. But AFAIK it's not ready yet.

(**) ibid.

@wrljet
Member

wrljet commented May 19, 2024

Fish,

> (*) As I understand it, Bill (and someone else?) is supposedly working on trying to automate the building of the current development branch of our repository each time there is a new commit made. This way, anyone can download a "bleeding edge" still-under-development version of Hercules at any time without having to build it for themselves. But AFAIK it's not ready yet.

There's really nothing preventing me from checking that in, other than embarrassment of the crufty code.

@Fish-Git
Member

Fish-Git commented Jun 3, 2024

> There's really nothing preventing me from checking that in, other than embarrassment of the crufty code.

Understood. I leave that to you.

You've certainly proven you're an extremely competent Hercules developer who knows wtf they're doing, so I leave it to you to commit your changes whenever you dang well feel like it! (i.e. whenever you're fricking good and ready to!)  :)

Fish-Git removed the QUESTION... label Jun 3, 2024
@SE20225
Author

SE20225 commented Jun 3, 2024 via email

@Fish-Git
Member

Fish-Git commented Jun 3, 2024

> I am currently in hospital for some minor planned surgery.

Minor or not, we all, of course, wish you a satisfactory outcome and a speedy recovery!

> As for testing with the leading edge 4.8 development code, I still have not been able to figure out how to find and download it.

Building the development branch of Hercules is identical to building the current release, with the exception of one additional command to checkout the development branch of the repository (called develop):

git  clone  https://github.com/SDL-Hercules-390/hyperion.git  hyperion-dev
cd   hyperion-dev
git  checkout  develop

I believe Bill's "Hercules Helper" also supports checking out and building the develop branch too:

$ cd ~
$ git clone https://github.com/wrljet/hercules-helper.git
$ mkdir herctest && cd herctest
$ ~/hercules-helper/hercules-buildall.sh  --flavor=sdl-hyperion  --auto

@wrljet
Member

wrljet commented Jun 3, 2024

The develop branch is the default for Hercules-Helper.

That can be changed using a custom --config= file.
If more info on that is desired, just ask.

Bill

@SE20225
Author

SE20225 commented Jun 5, 2024

In the attached HERCPB06.zip file you will find:

  • GH660.CORE     which is a loadcore file with a handloop based on the code I received for issue #615.
     
    The first test runs successfully and performs a very standard SEEK+SEARCH+Write CKD chain.
     
    The second one is the same but starting directly on the search. On my Hercules code, which is release 4.7, it does produce the EQC. At least running like this shows that the Hercules DASD CU does not allow any information about the 'current track' to survive from one chain to the next.
     
    I think you can easily try it on the most recent development code to show whether it reproduces the problem or whether it happens to have been fixed in the interim.

  • LOG_CORE     is the Hercules log from the run.

  • GH660.asm and GH660.asmlist     are the source and assembler output.
     
    I tested it by adding tests to the #615 code and made a few changes while trying this out. I would say improvements of course. They are summarized within the first 100 lines or so.

  • 338001.282     could really be any 3380 volume where Cyl 2 Trk 0 will be written on.

This remains a very low priority problem. You have to start with a hopeless CCW chain and the result is not disastrous, just a waste of retries.

For my part, it is more interesting to resume trying old CCW chains from my MVS CE days than to install Hercules code which you probably already have. And loading the .core file and trying it should not take long.

Is there any reason to believe that this "bug" could have been fixed in the interim? Maybe I will run into other odd situations, and now I have better tools for giving you an easily reproducible and verifiable test case.

Anders

@Fish-Git
Member

Fish-Git commented Jun 5, 2024

Thanks, Anders!

I will download your .zip file and give your test a whirl, and then get back to you.

Fish-Git added a commit that referenced this issue Jun 6, 2024
Fish-Git removed the Researching... label Jun 6, 2024
@Fish-Git
Member

Fish-Git commented Jun 6, 2024

That didn't take long!  :)

The chaining requirement checks were all coded to test for only 3990 control units instead of for both 3990 and 3880 control units.

You can add cu=3990 to your dasd statement to work around the problem until 4.8 comes out. This will prevent the EQPCK from occurring and cause the CMD REJ to occur instead. (The default cu= for 3380 dasd is of course 3880, which was causing the chaining requirement test to be skipped. That led to an eventual Equipment Check when the device's buffer overflowed, with message HHC00419E "error attempting to read past end of track", because the missing Seek left the track orientation bad and the device got stuck in a read count loop.)
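
The effect described above can be pictured with a small toy model in Python. This is only a sketch of the behavior as explained in this comment, not the code in ckddasd.c: the function `run_chain`, the string opcodes and the two sets of "checked" control unit types are all hypothetical names chosen for illustration.

```python
CU_3880, CU_3990 = 0x3880, 0x3990

# Hypothetical stand-ins for "which CU types get the chaining requirement check".
CHECKED_BEFORE_FIX = {CU_3990}
CHECKED_AFTER_FIX = {CU_3990, CU_3880}


def run_chain(opcodes, cu_type, checked_cu_types):
    """Toy model: report what a chain ends with on the given control unit type."""
    track_established = False         # nothing carries over from an earlier chain
    for op in opcodes:
        if op == "SEEK":
            track_established = True
        elif op == "SEARCH ID" and not track_established:
            if cu_type in checked_cu_types:
                return "CMD REJ"      # invalid command sequence is detected
            return "EQC"              # check skipped: runs past end of track
    return "OK"


bad_chain = ["SEARCH ID", "TIC", "WRITE CKD"]
good_chain = ["SEEK"] + bad_chain

print(run_chain(bad_chain, CU_3880, CHECKED_BEFORE_FIX))   # EQC
print(run_chain(bad_chain, CU_3990, CHECKED_BEFORE_FIX))   # CMD REJ
print(run_chain(bad_chain, CU_3880, CHECKED_AFTER_FIX))    # CMD REJ
print(run_chain(good_chain, CU_3880, CHECKED_AFTER_FIX))   # OK
```

In this model, specifying cu=3990 corresponds to passing `CU_3990` as `cu_type`, which is why the workaround yields the CMD REJ even before the fix.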

The fix I just committed (41c8617) will appear in the next release. Until then, if you want the fix, you'll have to build the 4.8-DEV development version (develop branch) yourself. You can use Bill Lewis's most excellent Hercules Helper product to do that for you. Otherwise you'll have to wait for 4.8 to be released.

Closing as resolved.

@Fish-Git Fish-Git closed this as completed Jun 6, 2024