Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add timed statistics to instruction counting #588

Merged
merged 9 commits into from Aug 26, 2023
Merged

add timed statistics to instruction counting #588

merged 9 commits into from Aug 26, 2023

Conversation

76bav
Copy link
Contributor

@76bav 76bav commented Aug 1, 2023

Add timed statistics to instruction counting

sample output:

00:27:00 HHC02292I Inst '47'   count    215794590 (15%) time      4665559 (0.021620)
00:27:00 HHC02292I Inst '58'   count    164770795 (12%) time      3808130 (0.023112)
00:27:00 HHC02292I Inst '91'   count    107608676 ( 7%) time      2167367 (0.020141)
00:27:00 HHC02292I Inst 'A704' count    100009950 ( 7%) time      2807802 (0.028075)
00:27:00 HHC02292I Inst '41'   count     66744429 ( 4%) time      1480319 (0.022179)
00:27:00 HHC02292I Inst '50'   count     66068882 ( 4%) time      1643570 (0.024877)
00:27:00 HHC02292I Inst '18'   count     49057127 ( 3%) time      1091045 (0.022240)
00:27:00 HHC02292I Inst 'D2'   count     32589787 ( 2%) time       876037 (0.026881)
00:27:00 HHC02292I Inst '1F'   count     28897666 ( 2%) time       586237 (0.020287)
00:27:00 HHC02292I Inst '07'   count     28708214 ( 2%) time       833780 (0.029043)
00:27:00 HHC02292I Inst '95'   count     28216550 ( 2%) time       592255 (0.020990)
00:27:00 HHC02292I Inst 'BF'   count     22686289 ( 1%) time       524232 (0.023108)
00:27:00 HHC02292I Inst '98'   count     20248859 ( 1%) time       551115 (0.027217)
00:27:00 HHC02292I Inst '43'   count     19863982 ( 1%) time       389031 (0.019585)
00:27:00 HHC02292I Inst '55'   count     19733596 ( 1%) time       434151 (0.022001)
00:27:00 HHC02292I Inst '45'   count     19700337 ( 1%) time       570025 (0.028935)
00:27:00 HHC02292I Inst 'D5'   count     19028215 ( 1%) time       376478 (0.019785)
00:27:00 HHC02292I Inst '90'   count     18097914 ( 1%) time       408490 (0.022571)
00:27:00 HHC02292I Inst '12'   count     16809393 ( 1%) time       347684 (0.020684)
00:27:00 HHC02292I Inst '59'   count     15475891 ( 1%) time       317297 (0.020503)
...
00:27:00 HHC02292I Inst 'B241' count           12 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'ED24' count           12 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'E30D' count           12 ( 0%) time            1 (0.083333)
00:27:00 HHC02292I Inst '29'   count            8 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst '6E'   count            8 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'E397' count            8 ( 0%) time            2 (0.250000)
00:27:00 HHC02292I Inst 'FD'   count            7 ( 0%) time          102 (14.571428)
00:27:00 HHC02292I Inst 'DE'   count            6 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'B299' count            4 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'B38C' count            4 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'B384' count            4 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'B3B5' count            3 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst '22'   count            3 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst '2C'   count            3 ( 0%) time            1 (0.333333)
00:27:00 HHC02292I Inst 'B277' count            3 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'E8'   count            2 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'E32E' count            2 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'B25D' count            2 ( 0%) time            1 (0.500000)
00:27:00 HHC02292I Inst 'B99F' count            2 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst '97'   count            1 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'B311' count            1 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'DF'   count            1 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'FC'   count            1 ( 0%) time            0 (0.000000)
00:27:00 HHC02292I Inst 'E30C' count            1 ( 0%) time            0 (0.000000)

rewrite without loop
@76bav 76bav marked this pull request as draft August 1, 2023 17:14
@76bav 76bav marked this pull request as ready for review August 1, 2023 17:19
Copy link
Member

@Fish-Git Fish-Git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UPDATE:  Added patch containing required changes as well as first two suggested changes.

Hi Andrey!

Sorry it has taken me so long to review your PR.

At the minimum, you will need to make two changes before I can accept your Pull Request.

  1. The first is extremely minor: in opcode.h where elapsed_usecs is defined, you need to declare the variable first, then use it. Yes, I know with modern compilers that's no longer necessary, but I'm using an older compiler.  :)

Change:

            struct timeval end_time;                                \
            struct timeval dur;                                     \
            gettimeofday(&end_time, NULL);                          \
            timeval_subtract(&sysblk.start_time, &end_time, &dur);  \
            U64 elapsed_usecs = (dur.tv_sec * 1000000) + dur.tv_usec;\

To:

            struct timeval end_time;                                \
            struct timeval dur;                                     \
            U64 elapsed_usecs;                                      \
                                                                    \
            gettimeofday(&end_time, NULL);                          \
            timeval_subtract(&sysblk.start_time, &end_time, &dur);  \
            elapsed_usecs = (dur.tv_sec * 1000000) + dur.tv_usec;   \
  1. The second change is to revert the changes you made to the SLA instruction in general2.c. It needs to be put back to the way it was. It needs to match current SDL Hyperion git.

Yes, I realize that's probably why you wanted to add the instruction timing support to the icount command: because you noticed how inefficient certain basic instructions where (like SLA), and you are correct: they are inefficient, and yes they do need to be changed to be more efficient!

BUT... That's not the purpose of this Pull Request, is it? No. This Pull Request should be only for your icount changes. You should create a separate Pull Request if you want to change the SLA instruction to make it more efficient.

Finally, while not required, I would like to make a few suggestions as well:

  1. Rename all occurrences of ICOUNT_INST to BEG_COUNT_INSTR, and all occurrences of your new ICOUNT_INST_ macro to END_COUNT_INSTR instead.
  2. Rename all occurrences of OPTION_INSTRUCTION_COUNTING to OPTION_INSTR_COUNT_AND_TIME instead.
  3. It would be nice I think if you could add a new sort option to the icount command, to allow sorting and displaying of the instructions in descending time sequence, with the most inefficient (slowest; i.e. greatest elapsed usecs) listed first.

These are only suggestions and are purely optional. Should you decide to not make them, I will be glad to accept your Pull Request anyway. The only required changes you need to make are the two I listed at the beginning of this comment.

All in all, I like you change very much! It should prove to be very helpful with identifying those instructions that are the best candidates for improvement! Well done!  :)

I look forward to your updated pull request!  :)

p.s. Be sure to do a 'pull' on your working repository clone too, to pickup the changes that have been made since you created it. You're missing a bunch of commits that have been these past few weeks. Not necessarily required, but certainly helpful.


(patch file containing required changes as well as first two suggested changes):

--- hyperion-76bav-1/cmdtab.h	2023-08-22 19:05:46.132620400 -0700
+++ hyperion-76bav/cmdtab.h	2023-08-22 18:35:40.257330200 -0700
@@ -2181,7 +2181,7 @@
 COMMAND( "hao",                     hao_cmd,                SYSPROGDEVEL,       hao_cmd_desc,           hao_cmd_help        )
 #endif
 COMMAND( "http",                    http_cmd,               SYSCONFIG,          http_cmd_desc,          http_cmd_help       )
-#if defined( OPTION_INSTRUCTION_COUNTING )
+#if defined( OPTION_INSTR_COUNT_AND_TIME )
 COMMAND( "icount",                  icount_cmd,             SYSCMDNOPER,        icount_cmd_desc,        icount_cmd_help     )
 #endif
 #if defined( OPTION_IODELAY_KLUDGE )
--- hyperion-76bav-1/featall.h	2023-08-22 19:05:46.181623200 -0700
+++ hyperion-76bav/featall.h	2023-08-22 18:34:53.660665000 -0700
@@ -75,7 +75,7 @@
 #define MAX_DEVICE_THREAD_IDLE_SECS 300 /* 5 Minute thread timeout   */
 //efine OPTION_LONG_HOSTINFO            /* Detailed host & logo info */
 #undef  OPTION_FOOTPRINT_BUFFER /* 2048 ** Size must be a power of 2 */
-#undef  OPTION_INSTRUCTION_COUNTING     /* First use trace and count */
+#define OPTION_INSTR_COUNT_AND_TIME     /* First use trace and count */
 #undef  MODEL_DEPENDENT_STCM            /* STCM, STCMH always store  */
 #define OPTION_NOP_MODEL158_DIAGNOSE    /* NOP mod 158 specific diags*/
 
--- hyperion-76bav-1/general2.c	2023-08-22 19:05:46.185623400 -0700
+++ hyperion-76bav/general2.c	2023-08-22 17:56:26.531704800 -0700
@@ -837,9 +837,8 @@
 int     r1, r3;                         /* Register numbers          */
 int     b2;                             /* effective address base    */
 VADR    effective_addr2;                /* effective address         */
-U32     n, n2;                          /* 32-bit operand values     */
-U32     j;                              /* Integer work areas        */
-U64     n1;
+U32     n, n1, n2;                      /* 32-bit operand values     */
+U32     i, j;                           /* Integer work areas        */
 
     RS(inst, regs, r1, r3, b2, effective_addr2);
 
@@ -854,32 +853,24 @@
         return;
     }
 
-    if (n > 31)  /* special case result will be zero */
+    /* Load the numeric and sign portions from the R1 register */
+    n1 = regs->GR_L(r1) & 0x7FFFFFFF;
+    n2 = regs->GR_L(r1) & 0x80000000;
+
+    /* Shift the numeric portion left n positions */
+    for (i = 0, j = 0; i < n; i++)
     {
+        /* Shift bits 1-31 left one bit position */
+        n1 <<= 1;
 
-        if (regs->GR_L(r1))     /* if contain anything sing will change */
+        /* Overflow if bit shifted out is unlike the sign bit */
+        if ((n1 & 0x80000000) != n2)
             j = 1;
-        regs->GR_L(r1) = 0;     /* always will be zero */
-    }
-    else
-    {
-        /* Load the numeric and sign portions from the R1 register */
-        n1 = regs->GR_L(r1); // & 0x7FFFFFFF; 
-        n2 = regs->GR_L(r1) & 0x80000000;
-
-        if (n2) // if sing non zero fill hi DWORD of U64 by 1
-            n1 |= 0xFFFFFFFF00000000;
-        n1 <<= n;
-        /* check for sign change */
-        j = (n2 ? 
-                  ((n1 & 0xFFFFFFFF00000000) ^ 0xFFFFFFFF00000000) /* some 0 bits shifted left */
-                : (n1 & 0xFFFFFFFF00000000)                        /* some 1 bits shifted left */
-            )
-            || (0x80000000 & n1) != n2;                            /* sign changed             */
-        /* Load the updated value into the R1 register */
-        regs->GR_L(r1) = (n1 & 0x7FFFFFFF) | n2;
     }
 
+    /* Load the updated value into the R1 register */
+    regs->GR_L(r1) = (n1 & 0x7FFFFFFF) | n2;
+
     /* Condition code 3 and program check if overflow occurred */
     if (j)
     {
--- hyperion-76bav-1/hscemode.c	2023-08-22 19:05:46.198624100 -0700
+++ hyperion-76bav/hscemode.c	2023-08-22 18:35:40.239329200 -0700
@@ -2685,7 +2685,7 @@
 }
 
 
-#if defined( OPTION_INSTRUCTION_COUNTING )
+#if defined( OPTION_INSTR_COUNT_AND_TIME )
 /*-------------------------------------------------------------------*/
 /* icount command - display instruction counts                       */
 /*-------------------------------------------------------------------*/
@@ -2928,7 +2928,7 @@
 
     return 0;
 }
-#endif /* defined( OPTION_INSTRUCTION_COUNTING ) */
+#endif /* defined( OPTION_INSTR_COUNT_AND_TIME ) */
 
 
 /*-------------------------------------------------------------------*/
--- hyperion-76bav-1/hstructs.h	2023-08-22 19:05:46.207624700 -0700
+++ hyperion-76bav/hstructs.h	2023-08-22 18:38:17.959350300 -0700
@@ -1075,7 +1075,11 @@
         gid_t   rgid, egid, sgid;
 #endif /*!defined(NO_SETUID)*/
 
-#if defined( OPTION_INSTRUCTION_COUNTING )
+/*-------------------------------------------------------------------*/
+/*          Individual instruction counting and timing               */
+/*-------------------------------------------------------------------*/
+
+#if defined( OPTION_INSTR_COUNT_AND_TIME )
 
         bool    icount;                 /* true = enabled, else not. */
         struct timeval start_time;      /* OPCODE start time         */
@@ -1124,7 +1128,7 @@
         U64 imapedT[256];
         U64 imapxxT[256];
 
-#define IMAP_SIZE \
+  #define IMAP_SIZE \
             ( sizeof(sysblk.imap01) \
             + sizeof(sysblk.imapa4) \
             + sizeof(sysblk.imapa5) \
@@ -1166,7 +1170,7 @@
             + sizeof(sysblk.imapedT) \
             + sizeof(sysblk.imapxxT) )
 
-#endif // defined( OPTION_INSTRUCTION_COUNTING )
+#endif // defined( OPTION_INSTR_COUNT_AND_TIME )
 
         char    *cnslport;              /* console port string       */
         char    *sysgport;              /* SYSG console port string  */
--- hyperion-76bav-1/opcode.h	2023-08-22 19:06:10.244999500 -0700
+++ hyperion-76bav/opcode.h	2023-08-22 18:35:40.267330800 -0700
@@ -222,12 +222,12 @@
 OPCD_DLL_IMPORT int iprint_router_func( int arch_mode, BYTE inst[], char mnemonic[], char* prtbuf );
 
 /*-------------------------------------------------------------------*/
-/*               Individual instruction counting                     */
+/*          Individual instruction counting and timing               */
 /*-------------------------------------------------------------------*/
 
-#if defined( OPTION_INSTRUCTION_COUNTING )
+#if defined( OPTION_INSTR_COUNT_AND_TIME )
 
-#define ICOUNT_INST( _inst, _regs )                                 \
+  #define BEG_COUNT_INSTR( _inst, _regs )                           \
     do                                                              \
     {                                                               \
         if (sysblk.icount)                                          \
@@ -305,24 +305,26 @@
         }                                                           \
     } while (0)
 
-#else // !defined( OPTION_INSTRUCTION_COUNTING )
+#else // !defined( OPTION_INSTR_COUNT_AND_TIME )
 
-#define ICOUNT_INST(_inst, _regs)
+  #define BEG_COUNT_INSTR( _inst, _regs )
 
-#endif // defined( OPTION_INSTRUCTION_COUNTING )
+#endif
 
-#if defined( OPTION_INSTRUCTION_COUNTING )
+#if defined( OPTION_INSTR_COUNT_AND_TIME )
 
-#define ICOUNT_INST_(_inst, _regs)                                  \
+  #define END_COUNT_INSTR( _inst, _regs )                           \
     do                                                              \
     {                                                               \
         if (sysblk.icount)                                          \
         {                                                           \
             struct timeval end_time;                                \
             struct timeval dur;                                     \
+            U64 elapsed_usecs;                                      \
+                                                                    \
             gettimeofday(&end_time, NULL);                          \
             timeval_subtract(&sysblk.start_time, &end_time, &dur);  \
-            U64 elapsed_usecs = (dur.tv_sec * 1000000) + dur.tv_usec;\
+            elapsed_usecs = (dur.tv_sec * 1000000) + dur.tv_usec;   \
                                                                     \
             switch ((_inst)[0]) {                                   \
             case 0x01:                                              \
@@ -388,14 +390,11 @@
         }                                                           \
     } while (0)
 
+#else // !defined( OPTION_INSTR_COUNT_AND_TIME )
 
-#else // !defined( OPTION_INSTRUCTION_COUNTING )
-#define ICOUNT_INST_(_inst, _regs)
-#endif // defined( OPTION_INSTRUCTION_COUNTING )
-
-
-
+  #define END_COUNT_INSTR( _inst, _regs )
 
+#endif
 
 /*-------------------------------------------------------------------*/
 /*                         SIE macros                                */
@@ -1466,9 +1465,9 @@
 #define EXECUTE_INSTRUCTION( _oct, _ip, _regs )                       \
 do {                                                                  \
     FOOTPRINT( (_ip), (_regs) );                                      \
-    ICOUNT_INST( (_ip), (_regs) );                                    \
+    BEG_COUNT_INSTR( (_ip), (_regs) );                                \
     (_oct)[ fetch_hw( (_ip) )]( (_ip), (_regs) );                     \
-    ICOUNT_INST_( (_ip), (_regs) );                                   \
+    END_COUNT_INSTR( (_ip), (_regs) );                                \
 } while (0)
 
 #if defined( FEATURE_073_TRANSACT_EXEC_FACILITY )
@@ -1478,9 +1477,9 @@
   do {                                                                \
       CHECK_TXF_CONSTRAINTS( (_ip), (_regs) );                        \
       FOOTPRINT( (_ip), (_regs) );                                    \
-      ICOUNT_INST( (_ip), (_regs) );                                  \
+      BEG_COUNT_INSTR( (_ip), (_regs) );                              \
       (_oct)[ fetch_hw( (_ip) )]( (_ip), (_regs) );                   \
-      ICOUNT_INST_( (_ip), (_regs) );                                 \
+      END_COUNT_INSTR( (_ip), (_regs) );                              \
   } while (0)
 
   #undef  TXF_UNROLLED_EXECUTE

@76bav
Copy link
Contributor Author

76bav commented Aug 25, 2023

Hi David,
I accept all of your suggestions.
I think about sorting option later.
For now, I want to play with "SLA", "AP", "SP" and probably "SLDA".

Best regards
Andrey

@76bav 76bav requested a review from Fish-Git August 25, 2023 18:44
Copy link
Member

@Fish-Git Fish-Git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, Andrey! Thanks!

@Fish-Git
Copy link
Member

For now, I want to play with "SLA", "AP", "SP" and probably "SLDA".

Are you familiar with Harold Grovesteen's (one of our developes) most excellent SATK/ASMA product?

It allows you to easily construct standalone test programs that you can then easily run in Hercules. The Hercules project itself makes extensive use of it to create the many "QA" Quality Assurance tests we have that are used to verify proper Hercules functionality, including many instructions. It's great!

I only mention it because I have taken an Instruction Timing test program someone wrote a long time ago for MVS (called "MIPS370" I think) and have converted it into a standalone program that can be easily run under Hercules.

I haven't released it yet but it seems to work quite well:

I believe it should provide much more accurate and meaningful instruction timing report that your patched OPTION_INSTR_COUNT_AND_TIME technique currently provides.

Of course, I suppose on the grand scale of things it doesn't really matter which technique you use, as long as it is able to show your changes you want to make actually improve a given instruction's speed and performance.

ANYWAY... good luck to you in your endeavor! I look forward to seeing your proposed improvements to the instructions you listed!

p.s. You should probably include the SLAG and SLAK instruction along with your proposed SLA and SLDA improvements. They're all doing the same thing.

@Fish-Git Fish-Git merged commit 434aefb into SDL-Hercules-390:develop Aug 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants