Incorrect calls to getrusage() #217

Open
PeterCoghlan opened this issue Apr 11, 2017 · 41 comments

Comments

@PeterCoghlan

I had a problem with the CPU timer (SPT and STPT instructions). I found the error, which was one I had made myself, and fixed it. In the course of investigating the problem, I came across code like this in two locations in clock.c:

    rc = getrusage((int)sysblk.cputid[regs->cpuad], &rusage);
    if (unlikely(rc == -1))
        result = host_tod();
    else
        result  = timeval2etod(&rusage.ru_utime);

The first argument appears to be the thread-id of a CPU thread; however, all the documentation I can find for getrusage() suggests that the first argument is supposed to be a small integer constant and that a thread-id is not valid in this position.

I inserted some debugging code to output the value of rc and errno and found that (under Linux on a Raspberry Pi at least), getrusage() fails every time it is called with rc = -1 and errno = EINVAL. This is in accordance with the expected behaviour documented in 'man getrusage' on the same platform. If nothing else, this means the use of unlikely() is inappropriate here.

It seems that there is no getrusage() in Windows and a getrusage() replacement is provided in w32util.c. However, this variant appears to have been coded to cope with a thread-id in the first argument position. (I don't have anything running Windows so I can't test whether this theory is correct.) My feeling is that this is not a valid implementation of getrusage() and if it is to be retained the way it is, it should be renamed to something else to avoid confusion.

There is a similar call to getrusage() in hsccmd.c for the qproc command:

            if (getrusage((int)sysblk.cputid[i], &rusage) == 0)
            {
                ...
            }

As no fallback method is provided in this case, it appears to me that under Linux anyway, getrusage() will always fail here and the block of code will never get executed.

There are two more calls to getrusage() in diagmssf.c like this:

            if (IS_CPU_ONLINE(i))
            {
                /* Get CPU times in microseconds */
                getrusage((int)sysblk.cputid[i], &usage);
                uCPU[i] = timeval2us(&usage.ru_utime);
                tCPU[i] = uCPU[i] + timeval2us(&usage.ru_stime);
            }

There is no check here to see whether getrusage() failed, and it looks like it always will (on Linux at least), so the results are likely to be incorrect.

I think it may be possible to rework the code in clock.c so that getrusage() is called correctly but I don't see how to do this in the other cases.

My suggestion is to remove the calls to getrusage() in clock.c, hsccmd.c and diagmssf.c and replace them with calls to a new function implemented elsewhere which takes a thread-id as an argument and calls platform specific code depending on a macro set in hostopts.h.
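
A minimal sketch of what such a wrapper might look like on a POSIX host (the function name, the guard macro and the fallback are illustrative only, not actual Hercules code):

    #include <pthread.h>
    #include <time.h>

    /* Return the CPU time consumed by thread 'tid' in microseconds,
       or -1 if the host provides no per-thread CPU clock.            */
    static long long thread_cputime_us( pthread_t tid )
    {
    #if defined( HAVE_PTHREAD_GETCPUCLOCKID )
        clockid_t        clkid;
        struct timespec  ts;

        if (pthread_getcpuclockid( tid, &clkid ) == 0
            && clock_gettime( clkid, &ts ) == 0)
            return (long long) ts.tv_sec * 1000000LL + ts.tv_nsec / 1000;
    #endif
        return -1;   /* caller falls back to wall-clock or process time */
    }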

@jphartmann
Contributor

Originally, getrusage() got the CPU usage of the process or the process and its children.

As of the linux 2.6.26 kernel, you can also obtain thread usage by:

getrusage(RUSAGE_THREAD_SELF, ...

The getrusage you quote above is not likely to give a meaningful answer on Linux. How about checking the return code? EINVAL means that you are asking for something you cannot get.

Thread usage was ported to/from FreeBSD 8.1. Other UNIXen must be investigated; it is not clear to me that they would return EINVAL, but it is likely that RUSAGE_THREAD_SELF is undefined on such a beast.
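
For illustration, a sketch of the thread-self form with the return code checked as suggested (on Linux the constant is spelled RUSAGE_THREAD, as noted below, and it reports usage for the calling thread only):

    #include <sys/resource.h>

    /* Fill 'ru' with the calling thread's own CPU usage; -1 if unsupported */
    static int thread_self_usage( struct rusage* ru )
    {
    #if defined( RUSAGE_THREAD )
        if (getrusage( RUSAGE_THREAD, ru ) == 0)
            return 0;             /* ru->ru_utime / ru->ru_stime are valid  */
        /* EINVAL here means this 'who' value is not supported on this host */
    #endif
        return -1;                /* caller falls back to another method    */
    }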

@ivan-w
Contributor

ivan-w commented Apr 11, 2017

I read RUSAGE_THREAD (not RUSAGE_THREAD_SELF) in my documentation for getrusage() - but it's a detail.

I'm interested in seeing this through because I'm getting some issues running z/VM under z/VM which MAY be related to a CPU Timer issue (but I haven't finished my analysis yet).

@dasdman
Contributor

dasdman commented Apr 11, 2017 via email

@dasdman
Contributor

dasdman commented Apr 11, 2017 via email

@jphartmann
Contributor

Neither the FreeBSD manpage nor the Linux one mentions specifying a thread ID. If it "works", that is by the UNIX weenie's definition of "works". Stay away from that.

@dasdman
Contributor

dasdman commented Apr 11, 2017 via email

@ivan-w
Contributor

ivan-w commented Apr 11, 2017

I think there are multiple situations :

  • Getting CPU/Wait time usage from within the thread (for SPT/STPT)
    -- The OS provides a facility for getting thread time info from within the thread (Linux, Windows, ?)
    -- The OS does not provide such facility
  • Getting CPU/Wait time from another thread (diagmss, panel command, panel display,...)
    -- The OS provides a facility for getting thread time info from another thread (Windows)
    -- The OS provides a facility for getting thread time info from within the thread only (Linux)
    -- The OS does not provide such facility

If there is no facility to retrieve thread run time information, there is not much that can be done other than falling back to process times or wall times.
For systems that only provide thread time info to self, it's probably possible, for a CPU thread, to use some exchange mechanism to request the information from the thread (this is already done for other purposes such as CPU synchronization, Broadcast PTLB, etc.).
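
A rough sketch of such an exchange mechanism (illustrative names only, not Hercules code, and it assumes the CPU thread polls regularly, which a CPU sitting in a wait state would not):

    #include <pthread.h>
    #include <sys/resource.h>

    /* One request slot per emulated CPU */
    typedef struct {
        pthread_mutex_t lock;
        pthread_cond_t  done;
        int             pending;   /* request flag polled by the CPU thread */
        struct rusage   usage;     /* filled in by the CPU thread itself    */
    } USAGE_REQ;

    /* Called by the CPU thread at a convenient point in its execution loop */
    static void cpu_thread_poll_usage_request( USAGE_REQ* r )
    {
        pthread_mutex_lock( &r->lock );
        if (r->pending)
        {
    #if defined( RUSAGE_THREAD )
            getrusage( RUSAGE_THREAD, &r->usage );
    #endif
            r->pending = 0;
            pthread_cond_signal( &r->done );
        }
        pthread_mutex_unlock( &r->lock );
    }

    /* Called by any other thread that wants that CPU thread's usage */
    static void request_cpu_usage( USAGE_REQ* r, struct rusage* result )
    {
        pthread_mutex_lock( &r->lock );
        r->pending = 1;
        while (r->pending)
            pthread_cond_wait( &r->done, &r->lock );
        *result = r->usage;
        pthread_mutex_unlock( &r->lock );
    }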

I'm personally more concerned about the 1st case (and its sub-cases)... The rest is usually only needed for information purposes, not for anything nanosecond accurate !

@dasdman
Contributor

dasdman commented Apr 12, 2017 via email

@jphartmann
Contributor

You'll find the standard here:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/getrusage.html

Note that it does not mention RUSAGE_THREAD.

Also note that the contents of the returned structure are not specified.

@ivan-w
Contributor

ivan-w commented Apr 12, 2017

John,

Of course it doesn't. RUSAGE_THREAD is a linuxism (and possibly a GNUism). But I cannot find in posix (overall) or posix threads a capability to retrieve thread specific run time information.

The layout of the "rusage" structure isn't defined per se by getrusage(), but the fields the structure should include are defined in :
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_resource.h.html

@jphartmann
Contributor

getrusage() has been there as long as I remember. It is both in SVr4 and 4.3BSD and thus AIX, but not the thread support, of course.

@PeterCoghlan
Author

I have one more suggestion.

It appears to me that at the moment the fallback method in the event that thread specific CPU time information is not available is to use wall clock time for the CPU timer. My suggestion is that the fallback method should be to use wall clock time less whatever time the particular Hercules CPU in question has spent in a wait state. This ought to be a much better approximation than assuming the CPU has been running all the time.

@ivan-w
Contributor

ivan-w commented Apr 13, 2017

Peter,
That would be odd since the CPU timer is supposed to keep running when the CPU is in a wait state.

@PeterCoghlan
Author

Well, now I'm confused!

Do the times returned by getrusage() include the time that a CPU thread spends waiting, I wonder? I assumed that they don't and that the reason for having the calls to getrusage() was that the CPU timer only ran when the CPU was executing instructions.

If the CPU timer runs all the time then perhaps using wall clock time is the right thing to do and there is no reason to be calling getrusage() at all?

@ivan-w
Contributor

ivan-w commented Apr 13, 2017

Peter,

No.. Because the wall time may not reflect the time the CPU thread is actually executing instructions. The thread may be preempted by the OS to execute another thread, some paging may occur, the CPU instruction execution code may make system calls which may suspend the thread, or the system running hercules may itself be virtualized.

Using the 'process' getrusage() would, on the other hand, mean that for one CPU thread you are counting the time spent running ALL threads in Hercules between two points in time.
The only solution for an accurate CPU Timer maintenance is to be able to get the actual usage time for the thread while it is executing instructions.

@PeterCoghlan
Author

Ok. It sounds like the time spent executing instructions by the CPU thread would then need to have the time spent by the same CPU in a wait state added to it to come up with an accurate figure.

@ivan-w
Contributor

ivan-w commented Apr 13, 2017

Peter..

Yeah... We know that ! The issue (as reported) is exactly that : Getting the time spent by the CPU thread executing instructions. The rest is housekeeping.

@ivan-w
Contributor

ivan-w commented Apr 13, 2017

For info... I tried that : (just to see) :


#if defined(RUSAGE_THREAD)
    if(pthread_equal(sysblk.cputid[regs->cpuad],pthread_self()))
    {
        rc = getrusage(RUSAGE_THREAD, &rusage);
    }
    else
#endif
    {
        rc = getrusage((int)sysblk.cputid[regs->cpuad], &rusage);
    }
    if (unlikely(rc == -1))
        result = host_tod();
    else
        result  = timeval2etod(&rusage.ru_utime);

    return (result);

It doesn't work.. AT ALL ! (the CPU Timer gets really crazy values).

(Example : a program running in a loop, disabled...)

(Wait ~16 seconds)
17:45:04 HHC02324I CP00: PSW=0008000080042010 INST=B208C42A SPT 1066(12) set_cpu_timer
17:45:04 HHC02326I CP00: R:00042430:K:06=00000010 00000000 00000000 00000000 ................
...
(1st STPT)
17:45:04 HHC02324I CP00: PSW=0008100080042418 INST=B209C432 STPT 1074(12) store_cpu_timer
17:45:04 HHC02326I CP00: R:00042438:K:06=00000000 00000000 00000010 00000001 ................
(0 ?????)
....
(2nd STPT)
17:45:04 HHC02324I CP00: PSW=0008100080042418 INST=B209C432 STPT 1074(12) store_cpu_timer
17:45:04 HHC02326I CP00: R:00042438:K:06=2D9E184C 009EB600 00000010 00000001 ...<............
(What the ????)
....
(3rd STPT)
17:45:04 HHC02324I CP00: PSW=0008100080042418 INST=B209C432 STPT 1074(12) store_cpu_timer
17:45:04 HHC02326I CP00: R:00042438:K:06=2D9E184B FB0B2700 00000010 00000001 ................
(We're decrementing alright...)

--Ivan

@ivan-w
Contributor

ivan-w commented Apr 13, 2017

Side comment... Apparently Posix Threads have a per thread clocking mechanism

pthread_getcpuclockid() used in conjunction with clock_gettime()... Could you use this ?

--Ivan

@Fish-Git
Contributor

Fish-Git commented Apr 13, 2017

(sigh!) Sorry about that. Here are the corrected (non-framed) links:

@dasdman
Contributor

dasdman commented Apr 13, 2017

From the comments, it appears that folks are repeating the research already done to even post what I already posted (2017-04-12 11:32:40 UTC) -- well BEFORE the current barrage of responses! Would it be possible to actually move the comments and research ahead?

@ivan-w
Contributor

ivan-w commented Apr 13, 2017

True ! My bad - missed your previous comment about pthread_getcpuclockid() et al.

So..
Question 1 : Do we need to invoke the pthread_getcpuclockid() function every time or just at thread creation (and store it in an array - one for each CPU - like we do for the thread id) ?
Question 2 (for Fish) : What would it take to implement a fthread__getcpuclockid() and clock_gettime() ?

--Ivan
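
For concreteness, the "ask once at thread creation and store it" variant from Question 1 might look something like this (names and array size are illustrative only):

    #include <pthread.h>
    #include <time.h>

    #define MAX_CPUS 8                          /* illustrative size only     */

    static clockid_t cpu_clockid[ MAX_CPUS ];   /* one per emulated CPU       */

    /* Called once by each CPU thread as it starts up */
    static void remember_cpu_clockid( int cpu )
    {
        if (pthread_getcpuclockid( pthread_self(), &cpu_clockid[ cpu ] ) != 0)
            cpu_clockid[ cpu ] = CLOCK_REALTIME;    /* fallback: wall clock   */
    }

    /* Any thread can later read that CPU thread's accumulated CPU time */
    static int cpu_thread_time( int cpu, struct timespec* ts )
    {
        return clock_gettime( cpu_clockid[ cpu ], ts );
    }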

@jphartmann
Contributor

Q1: Yes.

@ivan-w
Contributor

ivan-w commented Apr 13, 2017

John,

Why ? Are there any conditions that would cause the clock id returned by the pthread_getcpuclockid() to change within the lifespan of the thread ?

@jphartmann
Contributor

Sorry. Perhaps you could ask one question only?

Q1: No, Yes.

@Fish-Git
Contributor

Fish-Git commented Apr 13, 2017

Question 2 (for Fish) : What would it take to implement a fthread__getcpuclockid() and clock_gettime()?

Coding identical functionality in Windows should be trivial, I should think. We're basically almost already doing the same thing with our current getrusage() implementation.
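
For reference, a hypothetical sketch of the Windows side using GetThreadTimes() (this is not the actual w32util.c code):

    #include <windows.h>

    /* Sum of user + kernel time for a thread handle, in microseconds */
    static long long w32_thread_cputime_us( HANDLE hThread )
    {
        FILETIME       ftCreate, ftExit, ftKernel, ftUser;
        ULARGE_INTEGER k, u;

        if (!GetThreadTimes( hThread, &ftCreate, &ftExit, &ftKernel, &ftUser ))
            return -1;

        k.LowPart  = ftKernel.dwLowDateTime;
        k.HighPart = ftKernel.dwHighDateTime;
        u.LowPart  = ftUser.dwLowDateTime;
        u.HighPart = ftUser.dwHighDateTime;

        /* FILETIME values count 100-nanosecond intervals */
        return (long long) ((k.QuadPart + u.QuadPart) / 10);
    }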

@Peter-J-Jansen
Contributor

I think I have fixed this problem.

The issue was first described by Peter Coghlan (as Issue #217 in the official Hyperion GitHub), and has now been solved along the lines as initially described by Mark "dasdman" L. Gaubatz, and commented on by Ivan, John, and Fish (all of which can be read up on GitHub).

The solution as described by Mark needed some amendments, and I would like to receive feedback on the choices I made in the process. The extra problems to solve were these :

(1) Mark's original getrusage() produced two statistics : thread user time, and thread system time.  Fish's Windows equivalent implementation for getrusage() supplies the Windows thread user and kernel times for these.  Only when running under Windows does the Hercules qproc command correctly show these CPU thread user and kernel times.  This qproc behavior has been retained as is.

(2) The clock_gettime() replacement for getrusage() has been mentioned on the 'net to produce the sum of both user and system time.  I have assumed that this is correct, even though I'd like further confirmation that it is.  The Windows clock_gettime() equivalent for thread CPU time therefore returns the sum of the Windows user and kernel times.

(3) The original "diagmssf.c" code for DIAG 204 delivered the VM virtual CPU time as the getrusage() user time (uCPU); the VM total CPU time (tCPU) was computed as the sum of the getrusage() user+system time; the VM wait time (wCPU) was computed as the difference tCPU-uCPU.  This would imply that the VM wait time equals the getrusage() system time.  I don't think that is correct.  In order to correct this I have added and computed regs->waittime_accumulated in "timer.c" to set wCPU. (This requires OPTION_MIPS_COUNTING to be #defined, but after inspection, I think that is already implicitly required anyway in other places.)  With uCPU set to clock_gettime() thread user+system time, tCPU is now computed as uCPU+wCPU. I believe this delivers the more accurate DIAG 204 data.

(4) The clock_gettime() for thread user+system CPU time needs the thread clockid.  Following the thinking as voiced by Ivan and John, I've added sysblk.cpuclockid[cpu] so that only one pthread_getcpuclockid() call per CPU thread is needed.  

(5) The Windows equivalent for thread clockid simulates this as the negative of the thread_id, and the clock_gettime() equivalent then uses this negative to call the existing Windows getrusage() to obtain and sum the Windows thread user and kernel times.  Fish may wish to improve or beautify this simple but straightforward concept.   

I'd welcome feedback on the above.

Cheers,

Peter J.

Peter-J-Jansen added a commit that referenced this issue Jan 23, 2018
The issue was first described by Peter Coghlan (as Issue #217 in the official Hyperion GitHub), and has now been solved along the lines as initially described by Mark "dasdman" L. Gaubatz, and commented on by Ivan, John, and Fish (all of which can be read up on GitHub).

The solution as described by Mark needed some amendments, and required some extra problems to be solved.  These included :

 (1) Mark's original getrusage() produced two statistics : thread user time, and thread system time.  Fish's Windows equivalent implementation for getrusage() supplies the Windows thread user and kernel times for these.  Only when running under Windows does the Hercules qproc command correctly show these CPU thread user and kernel times.  This qproc behavior has been retained as is.

 (2) The clock_gettime() replacement for getrusage() has been mentioned on the 'net to produce the sum of both user and system time.  I have assumed that this is correct, even though I'd like further confirmation that it is.  The Windows clock_gettime() equivalent for thread CPU time therefore returns the sum of the Windows user and kernel times.

 (3) The original "diagmssf.c" code for DIAG 204 delivered the VM virtual CPU time as the getrusage() user time (uCPU); the VM total CPU time (tCPU) was computed as the sum of the getrusage() user+system time; the VM wait time (wCPU) was computed as the difference tCPU-uCPU.  This would imply that the VM wait time equals the getrusage() system time.  I don't think that is correct.  In order to correct this I have added and computed regs->waittime_accumulated in "timer.c" to set wCPU. (This requires OPTION_MIPS_COUNTING to be #defined, but after inspection, I think that is already implicitly required anyway in other places.)  With uCPU set to clock_gettime() thread user+system time, tCPU is now computed as uCPU+wCPU. I believe this delivers the more accurate DIAG 204 data.

 (4) The clock_gettime() for thread user+system CPU time needs the thread clockid.  Following the thinking as voiced by Ivan and John, I've added sysblk.cpuclockid[cpu] so that only one pthread_getcpuclockid() call per CPU thread is needed.

 (5) The Windows equivalent for thread clockid simulates this as the negative of the thread_id, and the clock_gettime() equivalent then uses this negative to call the existing Windows getrusage() to obtain and sum the Windows thread user and kernel times.
Peter-J-Jansen added a commit that referenced this issue Jan 23, 2018
There are (were) problems when running VM 2nd level under VM, in that the 2nd level VM would suffer from ABENDs (EXT105, DSP005 and PRG009).  They would occur intermittently, even on idling systems.  When left that way overnight, the 2nd level VM system would automatically restart, but in the morning the CPDUMP files would show these problems.  Additionally, it was observed that the 2nd level VM system would intermittently go into a CPU loop, exhibiting 100% CPU but on a single CPU only.  The ABENDs were observed to occur at that time.  The analysis of the ABENDs showed a probably incorrect CPU Timer emulation in Hyperion.

This was originally believed to be related to Hyperion issue #217, but this was proven to not be the case.  Rather, the problem has now been determined to be caused by the CPU Timer implementation being based on thread_cputime().  That CPU Timer emulation relied on three variables which lacked the protection of a critical section.  Also, basing the CPU Timer emulation on thread_cputime() would imply that a running mainframe CPU Timer would be stopped whenever the Hyperion host pre-empted the host CPU thread emulating the mainframe CPU, which is incorrect.  Additionally, the Windows emulation of thread_cputime() suffers from a poor resolution (1/16th of a second).

This commit reverts the CPU Timer emulation to the earlier one, to not rely on thread_cputime(), but on hw_clock() and hw_tod.  The processing around DIAG 204 and CPU MAXRATES has not been altered and still uses thread_cputime_us().
@superdave

There seem to be a few Unix implementations (including OS X 10.11, which I'm running, but also at least Solaris 11) that lack pthread_getcpuclockid(), clock_gettime(), or both.

@superdave

On earlier OS X, it looks like Mach's thread_info() is the way to go (yuck).

https://github.com/st3fan/osx-10.9/blob/master/xnu-2422.1.72/osfmk/mach/thread_info.h

On Solaris, I'm not quite sure what the answer is; it's surmised that getrusage() with RUSAGE_LWP is probably correct, since a thread is an LWP in Solaris.

I don't think there's a consistent solution for every Unix OS for this since threads arrived well after the various paths of Unix diverged and GNU/Linux does its own thing anyway.
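
For what it's worth, a rough sketch of the Mach thread_info() route mentioned above (untested; it assumes THREAD_BASIC_INFO reports the calling thread's user and system time):

    #include <mach/mach.h>

    /* Calling thread's user + system CPU time in microseconds, or -1 */
    static long long mach_thread_cputime_us( void )
    {
        thread_basic_info_data_t info;
        mach_msg_type_number_t   count = THREAD_BASIC_INFO_COUNT;
        mach_port_t              self  = mach_thread_self();
        kern_return_t            kr;

        kr = thread_info( self, THREAD_BASIC_INFO,
                          (thread_info_t) &info, &count );
        mach_port_deallocate( mach_task_self(), self );

        if (kr != KERN_SUCCESS)
            return -1;

        return (long long) (info.user_time.seconds
                          + info.system_time.seconds) * 1000000LL
             + info.user_time.microseconds
             + info.system_time.microseconds;
    }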

@PeterCoghlan
Author

Thanks to everyone for their contributions to this issue.

I originally reported this issue only because I found that the calls to getrusage() were failing all the time resulting in inefficiency due to falling back to an alternative method of calculating the CPU timer value.

Having read the many responses and studied the issue some more, I have become convinced that trying to use thread specific CPU usage time to calculate the value of the Hercules CPU timer is not the right way to go and will never yield the correct results.

Given that the CPU timer is to run both when the CPU in question is running and when the CPU in question is waiting, I believe that the correct solution is to use a CPU specific offset from wall clock time to calculate the current value of the CPU timer except when the CPU in question is stopped, in which case its CPU timer is also stopped.

That methods of calculating thread specific CPU usage are variably implemented and system specific is all the more reason not to use them, on top of their not being suitable for the task at hand.
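
For concreteness, a minimal sketch of the offset-from-wall-clock scheme described above (names and structure are illustrative, not the Hercules code; because the CPU timer decrements, the stored value is simply an epoch minus the current wall clock):

    #include <stdint.h>

    typedef struct {
        int64_t epoch;    /* wall-clock value at which the timer would read zero */
        int64_t frozen;   /* value saved when the CPU was manually stopped       */
        int     stopped;  /* nonzero while the CPU is manually stopped           */
    } CPUTIMERCTX;

    /* 'now' is the current wall-clock time in CPU-timer units */

    static void set_cpu_timer_value( CPUTIMERCTX* t, int64_t now, int64_t value )
    {
        t->epoch = now + value;                     /* SPT                       */
    }

    static int64_t store_cpu_timer_value( CPUTIMERCTX* t, int64_t now )
    {
        return t->stopped ? t->frozen               /* frozen while stopped      */
                          : t->epoch - now;         /* else decrements in step   */
    }                                               /* with the wall clock       */

    static void stop_cpu( CPUTIMERCTX* t, int64_t now )
    {
        t->frozen  = t->epoch - now;                /* freeze the current value  */
        t->stopped = 1;
    }

    static void start_cpu( CPUTIMERCTX* t, int64_t now )
    {
        t->epoch   = now + t->frozen;               /* resume from frozen value  */
        t->stopped = 0;
    }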

@dasdman
Contributor

dasdman commented Feb 7, 2018 via email

@PeterCoghlan
Author

The understanding I have come to is that "CPU" in the title "CPU timer" is there purely to indicate that each CPU has one. It is not there to suggest that this timer has anything to do with measuring anything associated with the CPU. It is just a timer. Its purpose is to measure time, like the clock on the wall. It should keep going at the same rate whether the CPU is running or waiting and it only stops when the CPU is manually stopped.

It could be argued that it is a more realistic emulation of hardware by Hercules to slow a particular CPU timer when the associated Hercules CPU emulation thread is not getting much execution time, but I disagree. A real hardware CPU timer is not going to diverge from running at wall clock speed as long as the associated CPU is not manually stopped. Attempting to measure CPU usage of CPU threads may lead to more realistic emulation in one respect but it is being traded for a loss of accuracy elsewhere. I feel the smaller gain in accuracy does not justify the larger loss of accuracy.

Regarding CP emulation of CPU timers, the CPU that CP is emulating has the same characteristics, architecture and instruction set as the real CPU running the emulation, so perhaps it may be feasible for CP to adjust its emulated CPU timers based on what its emulated CPUs are doing. However, in Hercules, the CPU(s) running the emulation is/are completely unknown quantities with characteristics, architectures and instruction sets which are almost always completely different to the CPUs being emulated. I feel it is unrealistic and prone to error to equate usage of CPU time by an emulation thread in the Hercules environment with emulated CPU time made available by that thread.

@jphartmann
Contributor

In an overcommitted LPAR environment, the CPU timer will not follow the wall clock.

Incidentally, there is no longer a CPU timer register; the value is computed by millicode on demand.

@PeterCoghlan
Author

Can I put it another way and say that when we are not emulating an LPAR environment, the CPU timer will follow the wall clock?

If it is desirable to be able to very accurately emulate an overcommitted LPAR environment, I think we need a way to measure emulated CPU time made available by Hercules CPU emulation threads. I don't think measuring the CPU usage time of the unknown quantity that is the emulating CPU or CPUs is the way to do this. Perhaps it would be possible to assign a theoretical time taken by each instruction emulated by Hercules, individually sum the times required for all the instructions emulated by each particular CPU thread, add the time spent waiting by each emulated CPU (should that be wall clock time waiting or some other definition of time waiting, and based on what?) and use that figure to drive the CPU timer for each Hercules emulated CPU? Will this produce a more accurate result in the case of an overcommitted LPAR environment? Perhaps, depending on the accuracy of the instruction timings that were assigned, but noting that in real hardware, differing operand data will likely result in differing timings and we have not accounted for that. Will it produce a more accurate result in a non-LPAR environment? If it is more accurate than using the wall clock, I'll eat my hat.

@ivan-w
Contributor

ivan-w commented Feb 7, 2018

In an overcommitted LPAR environment, the CPU timer will not follow the wall clock.

That's not what the POO says...

When the TOD-clock-steering facility is not installed,
and both the CPU timer and the TOD clock are running,
the stepping rates are synchronized such that
both are stepped at the same rate.

So unless someone issued a PTFF instruction (to account for a leap second or a NTP update..)

And no matter what...

How can the CPU timer run SLOWER when executing instructions rather than being in a wait state ?

--Ivan

@jphartmann
Contributor

In an LPAR environment, a partition can be suspended; in that state its CPU timer is not decremented.

As there are no longer CPU timer registers in the hardware, it is futile to argue about what something that isn't there does or does not do.

@PeterCoghlan
Author

There is not necessarily a conflict there. Perhaps the CPU timers (all of them) and TOD clock will all run equally slow in an overcommitted LPAR environment when compared to the wall clock?

Thinking some more, if all the CPU timers in a particular environment - LPAR, Hercules instance or whatever don't run at the same rate all the time (excluding the stopped CPU case), timing similar events on different (emulated) CPUs using the CPU timer will yield different results. This is unlikely to be the desired effect.

Here's another case to chew on. Consider Hercules emulating two CPUs while running on single CPU hardware. Using the thread CPU usage model, each emulated CPU timer could end up running at less than half wall clock speed. Exactly the same setup running on a four CPU host would likely have its emulated CPU timers running much faster.

@ivan-w
Contributor

ivan-w commented Feb 7, 2018

In an LPAR environment, a partition can be suspended; in that state its CPU timer is not decremented.

That's the equivalent of a manual stopped state... Nothing to do with a CPU overcommit in a PR/SM or z/VM hypervisor...

As there no longer are CPU timer registers in the hardware, it is futile to argue what something that isn't there does or does not.

That's an implementation detail of some three letter vendor - nothing to do with what this implementation does !

@Fish-Git
Contributor

Fish-Git commented Feb 7, 2018

For those of us with little or no experience with LPAR hardware, what is "an overcommitted LPAR environment"?

@ivan-w
Contributor

ivan-w commented Feb 7, 2018

For those of us with little or no experience with LPAR hardware, what is "an overcommitted LPAR environment"?

That's a good question. I understand memory overcommit (where more memory is allocated than what is available) - which may entail paging/swapping if the hypervisor allows it - and permanent storage overcommit in the case of thin-provisioned storage resources - but CPU ?

I guess it would mean the sum of the LPAR absolute shares exceeds 100% of the available resources in the assigned resource pool... But that's dubious !

@Peter-J-Jansen
Contributor

Am I correct in assuming that when an LPAR partition is in the suspended state, all its CPUs are in the stopped state ? If not, in what state would all its CPUs be otherwise ?

The PoP specifies that a CPU timer is decremented except when its CPU is in the stopped or check-stop state, and I believe that this is being catered for in the Hercules implementation.

Fish-Git referenced this issue in SDL-Hercules-390/hyperion Apr 4, 2018
Fish-Git pushed a commit to SDL-Hercules-390/hyperion that referenced this issue Apr 4, 2018
Closes https://github.com/Fish-Git/hyperion/pull/83

gappelman pushed a commit to gappelman/hyperion that referenced this issue Apr 10, 2020
Various fixes for local 3270 devices connected using tn3270: