Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update makefiles to build with local tools instead of /build/toolchain #2

Closed
derekbruening opened this issue Nov 27, 2014 · 4 comments

Comments

@derekbruening
Copy link
Contributor

From derek.br...@gmail.com on February 11, 2009 15:46:52

The makefiles are currently assuming that the VMware toolchain is present.
For a proprietary product, IMHO having the toolchain standardized and
either in the repository or on a server is the right way to go, in order to
properly build old versions.

However, for an open source project we want to support building on as wide
a variety of toolchains as possible, and we don't really need to fix bugs
in old versions (though we could support that by listing not only minimum
versions but also maximum in our Makefiles). (Plus, we can't exactly
commit, say, the Windows DDK into our repository.) So instead of only
supporting a single version, long-term we should try to expand support.
There are many issues there as different versions of a compiler bring up
different subtle issues.

Here is a short list of what we currently require:

Linux:

  • Require gcc built to target both 32-bit and 64-bit
  • Require gas built to target both 32-bit and 64-bit
  • Require gas >= 2.18.50 for -msyntax=intel -mmnemonic=intel
  • Require gas >= 2.16 for fxsaveq
  • Require gcc >= 3.4 for -fvisibility: though we do have a workaround
    Windows:
  • Require cl for intrinsics
  • Require cl >= 14.0 to target both 32-bit and 64-bit and for C99 macro varargs
  • Require cygwin perl and other cygwin tools
    Both:
  • Require doxygen >= 1.5.1, 1.5.3+ better

We can split future work (such as switching to cmake, getting version #s
from top level instead of being hardcoded in certain places, parallelizing
the build, re-enabling dependency checking, etc.) off into separate Issues
but for now this issues covers getting things working for the initial
developers (we'll reduce the priority once that's covered).

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=2

@derekbruening
Copy link
Contributor Author

From derek.br...@gmail.com on February 12, 2009 13:02:45

(note that Google Code does not automatically include commit logs into issues)
I just committed revision 5 that did the following:

issue #2: build using local tools

  • removed /build/toolchain references and PATH setting
  • re-added make/as-2.18.50 for building on older Linux
  • added missing license to top of trunk/Makefile
  • changed default trunk/Makefile target to be one arch package
  • made must_escape_from static to work with the new INTERNAL=1
    all-symbols-visible setting (put in for better callstacks)
    to avoid undefined symbol error at runtime
  • made -dT variable-controlled for older ld versions

I also made a decision to simplify the exports dirs
and move away from the old model to match the new release model:

  • moved install dirs include/, bin/, lib32/, lib64/ into exports/
    and eliminated exports/x____*/
  • "make install" is now more like "make exports" since we always
    put into install dirs, but no longer have windows vs linux in exports/
  • api/docs/Makefile looks for ../../exports/include
    or for $DYNAMORIO_HOME, which should now be set to /exports/

I believe Linux is in pretty good shape now.
Leaving open for getting Windows to build more easily.
Then we'll open separate issues for the future work listed above and resolve this.

@derekbruening
Copy link
Contributor Author

From derek.br...@gmail.com on February 14, 2009 07:53:45

We don't build on RHEL3:

If typedef conflicts arise (such as on RHEL3), currently you'll have

to manually resolve by defining one of our DR_DO_NOT_DEFINE_*

defines (DR_DO_NOT_DEFINE_uint, DR_DO_NOT_DEFINE_ushort, etc.).

Eventually we'll have a pre-make step that automates this.

@derekbruening
Copy link
Contributor Author

From derek.br...@gmail.com on February 15, 2009 10:37:59

r14 : build windows without vmware toolchain

  • added support for building with either Visual Studio 2005 or the Vista
    SDK, combined with either the 2003 DDK or Vista WDK. With the Vista SDK
    the Vista WDK is required (for ml64.exe), and DRgui can only be built if
    Visual Studio 98 is available.
  • added instructions to wiki on how to set up and build: wiki/ HowToBuild ( r15 , r16 )
  • added support for building core with just the DDK
  • fix two compilation issues that show up with DDK cl 14.00.50727.220 but
    don't with SDK cl 14.00.50727.762:
    • libutil/utils.c _wopen() 3rd arg optional-ness
    • core/win32/injector.c time_forbidden_function

@derekbruening
Copy link
Contributor Author

From derek.br...@gmail.com on February 16, 2009 11:44:15

split issues mentioned in 1st post as issue #17 , issue #18 , and issue #19 resolving this one since developers with certain setups can now build

Status: Verified

cmannett85-arm added a commit that referenced this issue Feb 2, 2023
This patch adds the appropriate macros, tests and codec entries
to encode the following variants:
LDFF1B  { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1B  { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1B  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1B  { <Zt>.B }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #3}]
LDFF1H  { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1H  { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1SB { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1SB { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]
LDFF1W  { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]

Issue #3044
cmannett85-arm added a commit that referenced this issue Feb 2, 2023
#5850)

This patch adds the appropriate macros, tests and codec entries
to encode the following variants:
LDFF1B  { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1B  { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1B  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1B  { <Zt>.B }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #3}]
LDFF1H  { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1H  { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1SB { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1SB { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]
LDFF1W  { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]

Issue #3044
jackgallagher-arm added a commit that referenced this issue Mar 3, 2023
This patch adds the appropriate macros, tests and codec entries
to encode the following variants:
LD1B    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #3]
LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1SB   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1B  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #3]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
PRFB    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D]
PRFD    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, LSL #3]
PRFH    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, LSL #1]
PRFW    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, LSL #2]
ST1B    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, LSL #3]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]
ST1H    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, LSL #1]
ST1H    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]
ST1W    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, LSL #2]
ST1W    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]

Issue: #3044
jackgallagher-arm added a commit that referenced this issue Mar 3, 2023
This patch adds the appropriate macros, tests and codec entries to
encode the following variants:
LD1B    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #3]
LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1SB   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1B  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #3]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
PRFB    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D]
PRFD    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, LSL #3]
PRFH    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, LSL #1]
PRFW    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, LSL #2]
ST1B    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, LSL #3]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]
ST1H    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, LSL #1]
ST1H    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]
ST1W    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, LSL #2]
ST1W    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]

Issue: #3044
jackgallagher-arm added a commit that referenced this issue Mar 3, 2023
This patch adds the appropriate macros, tests and codec entries
to encode the following variants:
LD1B    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1B    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #3]
LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1H    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LD1H    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1SB   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1SB   { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1SH   { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LD1SH   { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1W    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #2]
LD1W    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1B  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1B  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #3]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1H  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LDFF1H  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1SB { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LDFF1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1W  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #2]
LDFF1W  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
PRFB    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
PRFB    <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]
PRFD    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #3]
PRFD    <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #3]
PRFH    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #1]
PRFH    <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #1]
PRFW    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #2]
PRFW    <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #2]
ST1B    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1B    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #3]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1H    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #1]
ST1H    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1H    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #1]
ST1H    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]
ST1W    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #2]
ST1W    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1W    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #2]
ST1W    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]

Issue: #3044
jackgallagher-arm added a commit that referenced this issue Mar 3, 2023
This patch adds the appropriate macros, tests and codec entries to
encode the following variants:
LD1B    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1B    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #3]
LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1H    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LD1H    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1SB   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1SB   { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1SH   { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LD1SH   { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1W    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #2]
LD1W    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1B  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1B  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #3]
LDFF1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LDFF1H  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1H  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LDFF1H  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1SB { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LDFF1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LDFF1W  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1W  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #2]
LDFF1W  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
PRFB    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
PRFB    <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]
PRFD    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #3]
PRFD    <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #3]
PRFH    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #1]
PRFH    <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #1]
PRFW    <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #2]
PRFW    <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #2]
ST1B    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1B    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #3]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1H    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #1]
ST1H    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1H    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #1]
ST1H    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]
ST1W    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #2]
ST1W    { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1W    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #2]
ST1W    { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]

Issue: #3044
joshua-warburton added a commit that referenced this issue Mar 10, 2023
This patch adds the appropriate macros, tests and codec entries
to encode the following variants:
LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD1H    { <Zt>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1H    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1SH   { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD1W    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD2D    { <Zt1>.D, <Zt2>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD2H    { <Zt1>.H, <Zt2>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD2W    { <Zt1>.S, <Zt2>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD3D    { <Zt1>.D, <Zt2>.D, <Zt3>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD3H    { <Zt1>.H, <Zt2>.H, <Zt3>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD3W    { <Zt1>.S, <Zt2>.S, <Zt3>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD4D    { <Zt1>.D, <Zt2>.D, <Zt3>.D, <Zt4>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD4H    { <Zt1>.H, <Zt2>.H, <Zt3>.H, <Zt4>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD4W    { <Zt1>.S, <Zt2>.S, <Zt3>.S, <Zt4>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LDNT1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LDNT1H  { <Zt>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LDNT1W  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST1H    { <Zt>.<Ts> }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST1W    { <Zt>.<Ts> }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
ST2D    { <Zt1>.D, <Zt2>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST2H    { <Zt1>.H, <Zt2>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST2W    { <Zt1>.S, <Zt2>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
ST3D    { <Zt1>.D, <Zt2>.D, <Zt3>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST3H    { <Zt1>.H, <Zt2>.H, <Zt3>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST3W    { <Zt1>.S, <Zt2>.S, <Zt3>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
ST4D    { <Zt1>.D, <Zt2>.D, <Zt3>.D, <Zt4>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST4H    { <Zt1>.H, <Zt2>.H, <Zt3>.H, <Zt4>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST4W    { <Zt1>.S, <Zt2>.S, <Zt3>.S, <Zt4>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
STNT1D  { <Zt>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
STNT1H  { <Zt>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
STNT1W  { <Zt>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]

issues: #3044

Change-Id: Ic9d21cd27f8b2a52bd8bc2ced76ba79f5deae69a
joshua-warburton added a commit that referenced this issue Mar 11, 2023
…5904)

This patch adds the appropriate macros, tests and codec entries to
encode the following variants:

LD1D    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD1H    { <Zt>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1H    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1H    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1SH   { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1SH   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1SW   { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD1W    { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD1W    { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD2D    { <Zt1>.D, <Zt2>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD2H    { <Zt1>.H, <Zt2>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD2W    { <Zt1>.S, <Zt2>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD3D    { <Zt1>.D, <Zt2>.D, <Zt3>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD3H    { <Zt1>.H, <Zt2>.H, <Zt3>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD3W    { <Zt1>.S, <Zt2>.S, <Zt3>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD4D    { <Zt1>.D, <Zt2>.D, <Zt3>.D, <Zt4>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD4H    { <Zt1>.H, <Zt2>.H, <Zt3>.H, <Zt4>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD4W    { <Zt1>.S, <Zt2>.S, <Zt3>.S, <Zt4>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LDNT1D  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LDNT1H  { <Zt>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LDNT1W  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
ST1D    { <Zt>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST1H    { <Zt>.<Ts> }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST1W    { <Zt>.<Ts> }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
ST2D    { <Zt1>.D, <Zt2>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST2H    { <Zt1>.H, <Zt2>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST2W    { <Zt1>.S, <Zt2>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
ST3D    { <Zt1>.D, <Zt2>.D, <Zt3>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST3H    { <Zt1>.H, <Zt2>.H, <Zt3>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST3W    { <Zt1>.S, <Zt2>.S, <Zt3>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
ST4D    { <Zt1>.D, <Zt2>.D, <Zt3>.D, <Zt4>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST4H    { <Zt1>.H, <Zt2>.H, <Zt3>.H, <Zt4>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST4W    { <Zt1>.S, <Zt2>.S, <Zt3>.S, <Zt4>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
STNT1D  { <Zt>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
STNT1H  { <Zt>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
STNT1W  { <Zt>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
jackgallagher-arm added a commit that referenced this issue Apr 20, 2023
This patch adds the appropriate macros, tests and codec entries
to encode the following variants:
LD1RQB  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQB  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>]
LD1RQD  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQD  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD1RQH  { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQH  { <Zt>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1RQW  { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQW  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]

Issue: #3044
jackgallagher-arm added a commit that referenced this issue Apr 21, 2023
This patch adds the appropriate macros, tests and codec entries to
encode the following variants:
LD1RQB  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQB  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>]
LD1RQD  { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQD  { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD1RQH  { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQH  { <Zt>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1RQW  { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQW  { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]

Issue: #3044
jackgallagher-arm added a commit that referenced this issue Apr 26, 2023
This patch adds the appropriate macros, tests and codec entries
to encode the following variants:
PRFB    <prfop>, <Pg>, [<Xn|SP>, <Xm>]
PRFH    <prfop>, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
PRFW    <prfop>, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
PRFD    <prfop>, <Pg>, [<Xn|SP>, <Xm>, LSL #3]

Issue: #3044
jackgallagher-arm added a commit that referenced this issue Apr 27, 2023
This patch adds the appropriate macros, tests and codec entries to
encode the following variants:
PRFB    <prfop>, <Pg>, [<Xn|SP>, <Xm>]
PRFH    <prfop>, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
PRFW    <prfop>, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
PRFD    <prfop>, <Pg>, [<Xn|SP>, <Xm>, LSL #3]

Issue: #3044
derekbruening added a commit that referenced this issue Aug 8, 2023
Switches from using the tid in scheduler_launcher to distinguish
inputs to the input ordinal.  Tid values can be duplicated so they
should not be used as unique identifiers across workloads.

Tested: No automated test currently relies on the launcher; it is
there for experimentation and as an example for how to use the
scheduler, so we want it to use the recommended techniques.  I ran it
on the threadsig app and confirmed record and replay are using
ordinals:
  ===========================================================================
  $ rm -rf drmemtrace.*.dir; bin64/drrun -stderr_mask 12 -t drcachesim -offline -- ~/dr/test/threadsig 16 2000 && bin64/drrun -t drcachesim -simulator_type basic_counts -indir drmemtrace.*.dir > COUNTS 2>&1 && clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 2000 -record_file record.zip > RECORD 2>&1 && clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -replay_file record.zip > REPLAY 2>&1 && tail -n 4 RECORD REPLAY
  Estimation of pi is 3.141592674423126
  Received 89 alarms
  ==> RECORD <==
  Core #0: 16 15 16 15 16 0 15 16 15 8 16 6 5 7
  Core #1: 9 3 12 16 11 16 8 0 16 0 16 1 16
  Core #2: 3 14 16 14 16 0 15 16 8 16 2 6 8 1 10
  Core #3: 13 3 13 9 11 12 16 6 16 6 16 2 4

  ==> REPLAY <==
  Core #0: 16 15 16 15 16 0 15 16 15 8 16 6 5 7
  Core #1: 9 3 12 16 11 16 8 0 16 0 16 1 16
  Core #2: 3 14 16 14 16 0 15 16 8 16 2 6 8 1 10
  Core #3: 13 3 13 9 11 12 16 6 16 6 16 2 4
  ===========================================================================

Issue: #5843
derekbruening added a commit that referenced this issue Aug 9, 2023
Switches from using the tid in scheduler_launcher to distinguish inputs
to the input ordinal. Tid values can be duplicated so they should not be
used as unique identifiers across workloads.

Tested: No automated test currently relies on the launcher; it is there
for experimentation and as an example for how to use the scheduler, so
we want it to use the recommended techniques. I ran it on the threadsig
app and confirmed record and replay are using ordinals:
```
$ rm -rf drmemtrace.*.dir; bin64/drrun -stderr_mask 12 -t drcachesim -offline -- ~/dr/test/threadsig 16 2000 && bin64/drrun -t drcachesim -simulator_type basic_counts -indir drmemtrace.*.dir > COUNTS 2>&1 && clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 2000 -record_file record.zip > RECORD 2>&1 && clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -replay_file record.zip > REPLAY 2>&1 && tail -n 4 RECORD REPLAY
Estimation of pi is 3.141592674423126
Received 89 alarms
==> RECORD <==
Core #0: 16 15 16 15 16 0 15 16 15 8 16 6 5 7
Core #1: 9 3 12 16 11 16 8 0 16 0 16 1 16
Core #2: 3 14 16 14 16 0 15 16 8 16 2 6 8 1 10
Core #3: 13 3 13 9 11 12 16 6 16 6 16 2 4

==> REPLAY <==
Core #0: 16 15 16 15 16 0 15 16 15 8 16 6 5 7
Core #1: 9 3 12 16 11 16 8 0 16 0 16 1 16
Core #2: 3 14 16 14 16 0 15 16 8 16 2 6 8 1 10
Core #3: 13 3 13 9 11 12 16 6 16 6 16 2 4
 ```

Issue: #5843
derekbruening added a commit that referenced this issue Aug 11, 2023
Removes the original (but never implemented) report_time() heartbeat
design in favor of the simulator passing the current time to a new
version of next_record().

Implements QUANTUM_TIME by recording the start time of each input when
it is first scheduled and comparing to the new time in next_record().
Switches are only done at instruction boundaries for simplicity of
interactions with record-replay and skipping.

Adds 2 unit tests.

Adds time support with wall-clock time to the scheduler_launcher.
This was tested manually on some sample traces.  For threadsig traces,
with DEPENDENCY_TIMESTAMPS, the quanta doesn't make a huge differences
as the timestamp ordering imposes significant constraints.  I added an
option to ignore the timestamps ("-no_honor_stamps") and there we
really see the effects of the smaller quanta with more context switches.

===========================================================================
With timestamp deps and a 2ms quantum (compare to 2ms w/o deps below):
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 2000 -sched_time -verbose 1 -honor_stamps
Core #0: 15 12 1 15 1 15 7 12 15 7 6 9 5
Core #1: 13 10 11 15 12 10 15 12 10 15 10 11 10 15 10 8 2
Core #2: 16 11 15 10 11 15 11 15 11 15 4 7 12 4 0 14
Core #3: 3 1 15 12 10 15 12 10 1 12 15 7 15 4 15

===========================================================================
Without, but a long quantum of 20ms:
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 20000 -sched_time -verbose 1 -no_honor_stamps
Core #0: 0 5 8 12 16 4 11 15
Core #1: 1 4 9 14 0 7 9 0
Core #2: 2 6 10 13 1 6 13 1
Core #3: 3 7 11 15 2 3 8 10 14 16

===========================================================================
Without, but a smaller quantum of 2ms:
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 2000 -sched_time -verbose 1 -no_honor_stamps
Core #0: 0 5 9 13 1 7 9 13 0 4 8 11 15 3 7 9 13 3 5 10 13 1 7 6 12 14 7 8 11 0 7 8 10 16 3 4 9 15 14 2 6 11 0 1 5 10 16 7 8 12 13 3 8 6 15 0 9 11 13
Core #1: 1 4 8 12 16 2 6 11 15 1 5 10 14 2 8 11 16 1 7 9 15 0 4 9 15 0 2 6 12 16 3 5 12 13 1 5 10 16 7 8 12 13 3 4 9 15 0 1 5 10 16 7 2 9 13 1 15
Core #2: 2 7 10 14 0 4 8 12 16 2 6 12 16 1 5 10 15 0 4 6 12 14 2 8 11 16 3 5 10 13 1 4 9 15 14 2 6 11 0 1 5 10 16 7 8 12 13 3 4 9 11 14 4 10 11 14 4 16 0
Core #3: 3 6 11 15 3 5 10 14 3 7 9 13 0 4 6 12 14 2 8 11 16 3 5 10 13 1 4 9 15 14 2 6 11 0 7 8 12 13 3 4 9 15 14 2 6 11 14 2 6 15 0 1 5 12 16 2 12 1

===========================================================================
Without, but a tiny quantum of 200us:
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 200 -sched_time -verbose 1 -no_honor_stamps
Core #0: 0 4 7 11 15 2 6 10 14 1 7 9 12 16 5 10 12 4 8 3 11 12 1 8 11 16 7 8 15 12 0 6 12 7 13 10 2 8 15 16 2 3 15 6 11 7 13 6 10 1 8 5 10 1 3 6 14 11 7 15 2 4 12 13 5 9 10 15 6 9 10 7 6 2 1 11 3 14 16 12 13 5 1 11 6 8 2 10 0 7 5 10 0 3 14 1 15 6 7 5 4 0 12 14 9 10 16 14 8 10 11 13 7 9 0 13 3 9 1 13 3 5 2 16 14 4 15 0 6 4 15 0 13 12 8 10 1 3 4 15 2 14 3 5 11 16 13 6 15 10 2 12 6 4 9 12 6 15 10 1 7 8 11 2 14 13 5 10 1 12 8 5 0 14 8 3 4 16 6 15 11 2 1 8 3 4 0 14 7 4 2 6 14 7 11 10 1 9 13 2 14 12 3 5 10 0 9 13 11 8 12 7 16 10 0 3 5 10 0 3 13 11 15 12 13 11 8 3 13 15 9 12 7 2 8 0 4 6 15 9 3 6 15 14 12 5 15 8 0 3 6 2 0 1 6 13 0 3 6 2 11 9 4 2 0 10 4 2 0 10 4 13 0 10 4 13 0 1 15 2 12 1 15 2 0 11 15 6 13 1 15 4 16 14 11 4 16 14 10 4 16 14 11 5 2 13 9 3 4 6 1 11 7 2 16 15 12 4 5 1 12 10 6 13 9 4 2 5 15 3 10 5 15 12 11 16 1 14 7 2 13 9 4 10 6 8 14 3 2 15 7 4 0 6 8 4 0 6 7 12 3 16 6 1 4 16 15 9 12 3 5 13 8 11 0 6 1 4 16 2 7 14 10 5 13 12 3 0 9 8 11 10 6 1 14 16 2 7 11 0 9 1 4 3 5 13 12 3 5 13 4 11 10 6 8 15 11 5 8 4 11 5 13 15 3 6 8 15 11 16 2 7 12 3 5 8 4 14 16 2 15 12 0 5 13 4 14 3 2 4 14 3 8 10 1 16 6 13 4 7 5 13 10 16 11 9 2 12 3 5 6 10 1 7 0 15 12 14 8 2 10 3 11 9 15 16 7 13 2 12 3 13 2 0 16 14 5 6 16 14 5 15 4 1 11 9 6 10 14 2 0 16 14 9 6 12 14 8 4 10 9 0 13 1 12 8 15
Core #1: 1 5 ... <ommitted rest for space but all are as long as Core #0>
===========================================================================

Issue: #5843
derekbruening added a commit that referenced this issue Aug 15, 2023
Removes the original (but never implemented) report_time() heartbeat
design in favor of the simulator passing the current time to a new
version of next_record().

Implements QUANTUM_TIME by recording the start time of each input when
it is first scheduled and comparing to the new time in next_record().
Switches are only done at instruction boundaries for simplicity of
interactions with record-replay and skipping.

Adds 2 unit tests.

Adds time support with wall-clock time to the scheduler_launcher. This
was tested manually on some sample traces. For threadsig traces, with
DEPENDENCY_TIMESTAMPS, the quanta doesn't make a huge differences as the
timestamp ordering imposes significant constraints. I added an option to
ignore the timestamps ("-no_honor_stamps") and there we really see the
effects of the smaller quanta with more context switches.

With timestamp deps and a 2ms quantum (compare to 2ms w/o deps below):
```
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 2000 -sched_time -verbose 1 -honor_stamps
Core #0: 15 12 1 15 1 15 7 12 15 7 6 9 5
Core #1: 13 10 11 15 12 10 15 12 10 15 10 11 10 15 10 8 2
Core #2: 16 11 15 10 11 15 11 15 11 15 4 7 12 4 0 14
Core #3: 3 1 15 12 10 15 12 10 1 12 15 7 15 4 15
```
Without, but a long quantum of 20ms:
```
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 20000 -sched_time -verbose 1 -no_honor_stamps
Core #0: 0 5 8 12 16 4 11 15
Core #1: 1 4 9 14 0 7 9 0
Core #2: 2 6 10 13 1 6 13 1
Core #3: 3 7 11 15 2 3 8 10 14 16
```
Without, but a smaller quantum of 2ms:
```
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 2000 -sched_time -verbose 1 -no_honor_stamps
Core #0: 0 5 9 13 1 7 9 13 0 4 8 11 15 3 7 9 13 3 5 10 13 1 7 6 12 14 7 8 11 0 7 8 10 16 3 4 9 15 14 2 6 11 0 1 5 10 16 7 8 12 13 3 8 6 15 0 9 11 13
Core #1: 1 4 8 12 16 2 6 11 15 1 5 10 14 2 8 11 16 1 7 9 15 0 4 9 15 0 2 6 12 16 3 5 12 13 1 5 10 16 7 8 12 13 3 4 9 15 0 1 5 10 16 7 2 9 13 1 15
Core #2: 2 7 10 14 0 4 8 12 16 2 6 12 16 1 5 10 15 0 4 6 12 14 2 8 11 16 3 5 10 13 1 4 9 15 14 2 6 11 0 1 5 10 16 7 8 12 13 3 4 9 11 14 4 10 11 14 4 16 0
Core #3: 3 6 11 15 3 5 10 14 3 7 9 13 0 4 6 12 14 2 8 11 16 3 5 10 13 1 4 9 15 14 2 6 11 0 7 8 12 13 3 4 9 15 14 2 6 11 14 2 6 15 0 1 5 12 16 2 12 1
```
Without, but a tiny quantum of 200us:
```
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 200 -sched_time -verbose 1 -no_honor_stamps
Core #0: 0 4 7 11 15 2 6 10 14 1 7 9 12 16 5 10 12 4 8 3 11 12 1 8 11 16 7 8 15 12 0 6 12 7 13 10 2 8 15 16 2 3 15 6 11 7 13 6 10 1 8 5 10 1 3 6 14 11 7 15 2 4 12 13 5 9 10 15 6 9 10 7 6 2 1 11 3 14 16 12 13 5 1 11 6 8 2 10 0 7 5 10 0 3 14 1 15 6 7 5 4 0 12 14 9 10 16 14 8 10 11 13 7 9 0 13 3 9 1 13 3 5 2 16 14 4 15 0 6 4 15 0 13 12 8 10 1 3 4 15 2 14 3 5 11 16 13 6 15 10 2 12 6 4 9 12 6 15 10 1 7 8 11 2 14 13 5 10 1 12 8 5 0 14 8 3 4 16 6 15 11 2 1 8 3 4 0 14 7 4 2 6 14 7 11 10 1 9 13 2 14 12 3 5 10 0 9 13 11 8 12 7 16 10 0 3 5 10 0 3 13 11 15 12 13 11 8 3 13 15 9 12 7 2 8 0 4 6 15 9 3 6 15 14 12 5 15 8 0 3 6 2 0 1 6 13 0 3 6 2 11 9 4 2 0 10 4 2 0 10 4 13 0 10 4 13 0 1 15 2 12 1 15 2 0 11 15 6 13 1 15 4 16 14 11 4 16 14 10 4 16 14 11 5 2 13 9 3 4 6 1 11 7 2 16 15 12 4 5 1 12 10 6 13 9 4 2 5 15 3 10 5 15 12 11 16 1 14 7 2 13 9 4 10 6 8 14 3 2 15 7 4 0 6 8 4 0 6 7 12 3 16 6 1 4 16 15 9 12 3 5 13 8 11 0 6 1 4 16 2 7 14 10 5 13 12 3 0 9 8 11 10 6 1 14 16 2 7 11 0 9 1 4 3 5 13 12 3 5 13 4 11 10 6 8 15 11 5 8 4 11 5 13 15 3 6 8 15 11 16 2 7 12 3 5 8 4 14 16 2 15 12 0 5 13 4 14 3 2 4 14 3 8 10 1 16 6 13 4 7 5 13 10 16 11 9 2 12 3 5 6 10 1 7 0 15 12 14 8 2 10 3 11 9 15 16 7 13 2 12 3 13 2 0 16 14 5 6 16 14 5 15 4 1 11 9 6 10 14 2 0 16 14 9 6 12 14 8 4 10 9 0 13 1 12 8 15
Core #1: 1 5 ... <ommitted rest for space but all are as long as Core #0>
```

Issue: #5843
derekbruening added a commit that referenced this issue Aug 18, 2023
Adds printing '.' for every record and '-' for waiting to the
scheduler unit tests and udpates all the expected output.  This makes
it much easier to understand some of the results as now the lockstep
timing all lines up.

Adds -print_every to the launcher and switches to printing letters for
a better output of what happened on each core (if #inputs<=26).
Example:
```
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 60000 -print_every 5000
Core #0: GGGGGGGGG,HH,F,B,G,I,A,CC,G,BB,A,FF,AA,GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
Core #1: D,C,D,B,H,FF,EE,CC,II,AA,C,D,G,HH,D,G,II,G,I,G,I,G,HH,BB,II,BB,C,H,I,AA,C,F,I,H,II,AA,C,H,A,H,F,CC,DD,C,BB,HH,CC,F,BB,C,D,H,BB,D,B,EE,I,E,DD,B,F,H,A,D,C,D,E,B,D,I,D,AA,E,DD,EE,CC,II,C,D,I,AA,DD,B,E,I,D,C,E,FF,E,BB,EE,FF,E,AA,D,E,DD,H,BB,HH,D,H,BB,I,AA,II,H,A,FF,H,I,HH,DD,I,H,F,DD,I,A,HH,AA,CC,BB,CC,BB,D,B,FF,H,F,D,I,DD,FF,C,A,C,AA,F,AA,EE,A,D,E,FF,AA,F,A,E,A,E,DD,EE,F,E,F,A
Core #2: F,E,F,C,F,H,I,B,HH,II,FF,CC,G,H,DD,E,A,G,H,G,DD,G,F,D,A,H,I,FF,H,C,A,CC,II,A,FF,C,I,F,CC,B,FF,C,B,H,CC,B,D,B,DD,B,F,I,F,II,D,A,DD,I,D,H,E,H,I,D,HH,FF,BB,II,AA,EE,B,A,BB,E,II,A,BB,A,HH,E,AA,E,F,A,DD,HH,F,H,A,E,I,FF,I,B,F,II,A,FF,D,H,DD,I,AA,F,D,FF,AA,D,A,HH,A,H,F,A,FF,C,F,B,F,C,F,AA,B,FF,D,F,DD,B,C,H,CC,B,C,E,D,EE,C,E,D,EE,F,DD,E,F,D,A,DD,E,D,EE,D,E,D,AA,D,A,DD,F,D,C,D
Core #3: E,A,F,A,D,I,DD,BB,AA,BB,DD,G,EE,AA,H,G,D,B,G,B,G,II,F,HH,B,AA,I,B,A,HH,CC,HH,F,A,FF,C,HH,BB,F,D,F,C,FF,H,C,FF,DD,AA,I,B,II,AA,I,A,B,A,F,A,C,I,B,H,A,F,C,A,C,EE,F,D,EE,CC,E,BB,E,DD,E,CC,B,EE,C,EE,B,I,E,D,E,II,H,B,EE,I,EE,B,II,F,EE,A,D,AA,DD,HH,F,A,F,HH,D,A,II,H,F,II,FF,CC,B,AA,F,A,C,FF,D,C,D,CC,B,C,DD,H,I,F,CC,A,F,C,FF,E,A,DD,E,D,A,FF,AA,EE,F,DD,FF,E,F,EE,FF,AA,EEEEEEEEEEEEEEEE
```

Issue: #5843
derekbruening added a commit that referenced this issue Aug 21, 2023
Adds printing '.' for every record and '-' for waiting to the scheduler
unit tests and udpates all the expected output. This makes it much
easier to understand some of the results as now the lockstep timing all
lines up.

Adds -print_every to the launcher and switches to printing letters for a
better output of what happened on each core (if #inputs<=26). Example:
```
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 60000 -print_every 5000
Core #0: GGGGGGGGG,HH,F,B,G,I,A,CC,G,BB,A,FF,AA,GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
Core #1: D,C,D,B,H,FF,EE,CC,II,AA,C,D,G,HH,D,G,II,G,I,G,I,G,HH,BB,II,BB,C,H,I,AA,C,F,I,H,II,AA,C,H,A,H,F,CC,DD,C,BB,HH,CC,F,BB,C,D,H,BB,D,B,EE,I,E,DD,B,F,H,A,D,C,D,E,B,D,I,D,AA,E,DD,EE,CC,II,C,D,I,AA,DD,B,E,I,D,C,E,FF,E,BB,EE,FF,E,AA,D,E,DD,H,BB,HH,D,H,BB,I,AA,II,H,A,FF,H,I,HH,DD,I,H,F,DD,I,A,HH,AA,CC,BB,CC,BB,D,B,FF,H,F,D,I,DD,FF,C,A,C,AA,F,AA,EE,A,D,E,FF,AA,F,A,E,A,E,DD,EE,F,E,F,A
Core #2: F,E,F,C,F,H,I,B,HH,II,FF,CC,G,H,DD,E,A,G,H,G,DD,G,F,D,A,H,I,FF,H,C,A,CC,II,A,FF,C,I,F,CC,B,FF,C,B,H,CC,B,D,B,DD,B,F,I,F,II,D,A,DD,I,D,H,E,H,I,D,HH,FF,BB,II,AA,EE,B,A,BB,E,II,A,BB,A,HH,E,AA,E,F,A,DD,HH,F,H,A,E,I,FF,I,B,F,II,A,FF,D,H,DD,I,AA,F,D,FF,AA,D,A,HH,A,H,F,A,FF,C,F,B,F,C,F,AA,B,FF,D,F,DD,B,C,H,CC,B,C,E,D,EE,C,E,D,EE,F,DD,E,F,D,A,DD,E,D,EE,D,E,D,AA,D,A,DD,F,D,C,D
Core #3: E,A,F,A,D,I,DD,BB,AA,BB,DD,G,EE,AA,H,G,D,B,G,B,G,II,F,HH,B,AA,I,B,A,HH,CC,HH,F,A,FF,C,HH,BB,F,D,F,C,FF,H,C,FF,DD,AA,I,B,II,AA,I,A,B,A,F,A,C,I,B,H,A,F,C,A,C,EE,F,D,EE,CC,E,BB,E,DD,E,CC,B,EE,C,EE,B,I,E,D,E,II,H,B,EE,I,EE,B,II,F,EE,A,D,AA,DD,HH,F,A,F,HH,D,A,II,H,F,II,FF,CC,B,AA,F,A,C,FF,D,C,D,CC,B,C,DD,H,I,F,CC,A,F,C,FF,E,A,DD,E,D,A,FF,AA,EE,F,DD,FF,E,F,EE,FF,AA,EEEEEEEEEEEEEEEE
```

Issue: #5843
jackgallagher-arm added a commit that referenced this issue Jan 8, 2024
When debugging i#6499 we noticed that drcachesim was producing 0 byte
read/write records for some SVE load/store instructions:

```
 ifetch       4 byte(s) @ 0x0000000000405b3c a54a4681   ld1w   (%x20,%x10,lsl #2) %p1/z -> %z1.s
 read         0 byte(s) @ 0x0000000000954e80 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e84 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e88 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e8c by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e90 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e94 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e98 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e9c by PC 0x0000000000405b3c
 ifetch       4 byte(s) @ 0x0000000000405b4
```

This turned out to be due to drdecode being linked into drcachesim
twice: once into the drcachesim executable, once into libdynamorio.
drdecode uses a global variable to store the SVE vector length to use
when decoding so we end up with two copies of that variable and only
one was being initialized.
To fix this properly we would need to refactor the libraries so that
there is only one copy of the sve_veclen global variable, or change the
way that the decoder gets the vector length so its no longer stored in
a global variable. In the mean time we have a workaround which
makes sure both copies of the variable get initialized and drcachesim
produces correct results.

With that workaround in place however, the results were still wrong.
For expanded scatter/gather instructions when you are using an offline
trace, raw2trace doesn't have access to the load/store instructions
from the expansion, only the original app scatter/gather instruction.
It has to create the read/write records using only information from the
original scatter/gather instruction and it uses the size of the memory
operand to determine the size of each read/write. This works for x86
because the x86 IR uses the per-element data size as for the memory
operand of scatter/gather instructions. This doesn't work for AArch64
because the AArch64 codec uses the maximum data transferred
(per-element data size * number of elements) like other SIMD load/store
instructions.

We plan to make the AArch64 IR consistent with x86 by changing it to
use the same convention as x86 for scatter/gather instructions but in
the mean time we can work around the inconsistency by fixing the size
in raw2trace based on the instruction's opcode.

Issues: #6499, #5365
jackgallagher-arm added a commit that referenced this issue Jan 18, 2024
When debugging i#6499 we noticed that drcachesim was producing 0 byte
read/write records for some SVE load/store instructions:

```
 ifetch       4 byte(s) @ 0x0000000000405b3c a54a4681   ld1w   (%x20,%x10,lsl #2) %p1/z -> %z1.s
 read         0 byte(s) @ 0x0000000000954e80 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e84 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e88 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e8c by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e90 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e94 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e98 by PC 0x0000000000405b3c
 read         0 byte(s) @ 0x0000000000954e9c by PC 0x0000000000405b3c
 ifetch       4 byte(s) @ 0x0000000000405b4
```

This turned out to be due to drdecode being linked into drcachesim
twice: once into the drcachesim executable, once into libdynamorio.
drdecode uses a global variable to store the SVE vector length to use
when decoding so we end up with two copies of that variable and only one
was being initialized.
To fix this properly we would need to refactor the libraries so that
there is only one copy of the sve_veclen global variable, or change the
way that the decoder gets the vector length so its no longer stored in a
global variable. In the mean time we have a workaround which makes sure
both copies of the variable get initialized and drcachesim produces
correct results.

With that workaround in place however, the results were still wrong. For
expanded scatter/gather instructions when you are using an offline
trace, raw2trace doesn't have access to the load/store instructions from
the expansion, only the original app scatter/gather instruction. It has
to create the read/write records using only information from the
original scatter/gather instruction and it uses the size of the memory
operand to determine the size of each read/write. This works for x86
because the x86 IR uses the per-element data size as for the memory
operand of scatter/gather instructions. This doesn't work for AArch64
because the AArch64 codec uses the maximum data transferred (per-element
data size * number of elements) like other SIMD load/store instructions.

We plan to make the AArch64 IR consistent with x86 by changing it to use
the same convention as x86 for scatter/gather instructions but in the
mean time we can work around the inconsistency by fixing the size in
raw2trace based on the instruction's opcode.

Issues: #6499, #5365, #5036
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant