This repository has been archived by the owner on Jan 26, 2024. It is now read-only.

AMDGPU: Implement Target ID Proposal
Changes from Austin Kerbow, Konstantin Zhuravlyov, Tony Tye, Yaxun Liu

Change-Id: Id14bfd70464114d2dca9249ce2472657da534aa8
kzhuravl committed Nov 23, 2020
1 parent 61220cd commit 1ccc5cc
Showing 249 changed files with 10,108 additions and 5,852 deletions.
117 changes: 106 additions & 11 deletions clang/docs/ClangCommandLineReference.rst
@@ -26,6 +26,14 @@ Add <dir> to search path for binaries and object files used implicitly

Add directory to framework include search path

.. option:: -Mallocatable=<arg>

Select semantics for assignments to allocatables (F03 or F95)

.. option:: -Minform=<arg>

Set error level of messages to display

.. option:: -ObjC

Treat source input files as Objective-C inputs
@@ -324,6 +332,10 @@ Make the next included directory (-I or -F) an indexer header map

Enforce targets of indirect branches and function returns

.. option:: -menable-unsafe-fp-math

Allow unsafe floating-point math optimizations which may decrease precision

.. option:: -mharden-sls=<arg>

Select straight-line speculation hardening scope
@@ -476,12 +488,16 @@ Only modify files with a filename contained in the provided directory path

.. option:: --offload-arch=<arg>, --cuda-gpu-arch=<arg>, --no-offload-arch=<arg>

CUDA offloading device architecture (e.g. sm\_35), or HIP offloading target ID in the form of a device architecture followed by target ID features delimited by a colon. Each target ID feature is a pre-defined string followed by a plus or minus sign (e.g. gfx908:xnack+:sram-ecc-). May be specified more than once.
CUDA offloading device architecture (e.g. sm\_35), or HIP offloading target ID in the form of a device architecture followed by target ID features delimited by a colon. Each target ID feature is a pre-defined string followed by a plus or minus sign (e.g. gfx908:xnack+:sramecc-). May be specified more than once.
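
The target ID syntax described above can be illustrated with a minimal Python sketch. It is not the parser used by clang; the function name ``parse_target_id`` is hypothetical, and the sketch checks only the shape of each feature (a name followed by ``+`` or ``-``), not whether the name is one of the pre-defined feature strings.

```python
from typing import Dict, Tuple


def parse_target_id(target_id: str) -> Tuple[str, Dict[str, bool]]:
    """Split 'arch:feat+:feat-' into (arch, {feature_name: enabled}).

    Hypothetical helper for illustration only; it does not validate
    feature names against any pre-defined list.
    """
    # The device architecture comes first; features follow, colon-delimited.
    arch, *feature_tokens = target_id.split(":")
    if not arch:
        raise ValueError("missing device architecture")

    features: Dict[str, bool] = {}
    for token in feature_tokens:
        # Each feature must be a non-empty name followed by '+' or '-'.
        if len(token) < 2 or token[-1] not in "+-":
            raise ValueError(f"feature {token!r} must end in '+' or '-'")
        features[token[:-1]] = token[-1] == "+"
    return arch, features


print(parse_target_id("gfx908:xnack+:sramecc-"))
# ('gfx908', {'xnack': True, 'sramecc': False})
```

A plain architecture such as ``gfx908`` yields an empty feature mapping, which in this sketch simply means no feature was mentioned.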

.. option:: -p, --profile

.. option:: -pagezero\_size<arg>

.. option:: -parallel-jobs=<arg>

Number of parallel jobs

.. option:: -pg

Enable mcount instrumentation
@@ -1601,7 +1617,7 @@ Enable support for int128\_t type

.. option:: -ffp-contract=<arg>

Form fused FP ops (e.g. FMAs): fast (everywhere) \| on (according to FP\_CONTRACT pragma) \| off (never fuse). Default is 'fast' for CUDA/HIP and 'on' otherwise.
Form fused FP ops (e.g. FMAs): fast (fuses across statements disregarding pragmas) \| on (only fuses in the same statement unless dictated by pragmas) \| off (never fuses) \| fast-honor-pragmas (fuses across statements unless dictated by pragmas). Default is 'fast' for CUDA, 'fast-honor-pragmas' for HIP, and 'on' otherwise.
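
Why fusing matters numerically can be shown with a small Python sketch. It does not invoke clang; it emulates a fused multiply-add by computing ``a * b + c`` exactly with ``fractions.Fraction`` and rounding once, then compares that with the ordinary two-rounding result. The operand values are chosen for illustration only.

```python
from fractions import Fraction

# Operands chosen so the exact product 1 - 2**-60 is not representable
# as a double and rounds to 1.0 when the multiply is rounded on its own.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Unfused: the product is rounded first, then the sum is rounded.
unfused = a * b + c

# Emulated fused multiply-add: exact arithmetic, one rounding at the end.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(unfused)            # 0.0
print(fused == -2.0**-60)  # True
```

The two results differ, so whether the compiler is allowed to fuse across statements can change program output, which is what the ``-ffp-contract`` modes control.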

.. option:: -ffp-exception-behavior=<arg>

@@ -1737,6 +1753,12 @@ Specify the maximum alignment to enforce on pointers lacking an explicit alignme

Enable heap memory profiling

.. program:: clang1
.. option:: -fmemory-profile=<directory>
.. program:: clang

Enable heap memory profiling and dump results into <directory>

.. option:: -fmerge-all-constants, -fno-merge-all-constants

Allow merging of constants
@@ -1845,6 +1867,10 @@ Directly create compilation output files. This may lead to incorrect incremental
.. option:: -fno\_pch-validate-input-files-content
.. program:: clang

.. program:: clang2
.. option:: -fno\_prebuilt-implicit-modules
.. program:: clang

.. option:: -fnoxray-link-deps

.. option:: -fobjc-abi-version=<arg>
@@ -1969,6 +1995,10 @@ Validate PCH input files based on content if mtime differs

Load the named plugin (dynamic shared object)

.. option:: -fprebuilt-implicit-modules

Look up implicit modules in the prebuilt module path

.. option:: -fpreserve-as-comments, -fno-preserve-as-comments

.. option:: -fprofile-arcs, -fno-profile-arcs
@@ -2128,6 +2158,8 @@ char is signed

.. option:: -fsigned-zeros, -fno-signed-zeros

Allow optimizations that ignore the sign of floating point zeros

.. option:: -fsized-deallocation, -fno-sized-deallocation

Enable C++14 sized global deallocation functions
@@ -2323,6 +2355,22 @@ Generate verbose assembly output

Enables dead virtual function elimination optimization. Requires -flto=full

.. option:: -fvisibility-dllexport=<arg>

The visibility for dllexport definitions \[-fvisibility-from-dllstorageclass\]

.. option:: -fvisibility-externs-dllimport=<arg>

The visibility for dllimport external declarations \[-fvisibility-from-dllstorageclass\]

.. option:: -fvisibility-externs-nodllstorageclass=<arg>

The visibility for external declarations without an explicit DLL dllstorageclass \[-fvisibility-from-dllstorageclass\]

.. option:: -fvisibility-from-dllstorageclass, -fno-visibility-from-dllstorageclass

Set the visibility of symbols in the generated code from their DLL storage class

.. option:: -fvisibility-global-new-delete-hidden

Give global C++ operator new and delete declarations hidden visibility
@@ -2339,6 +2387,10 @@ When -fvisibility-inlines-hidden is enabled, static variables in inline C++ memb

Give global types 'default' visibility and global functions and variables 'hidden' visibility by default

.. option:: -fvisibility-nodllstorageclass=<arg>

The visibility for definitions without an explicit DLL export class \[-fvisibility-from-dllstorageclass\]
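
How the four ``-fvisibility-*dllstorageclass``-related options divide up symbols can be sketched as a lookup in Python. This is an illustration of the descriptions above, not clang's implementation: the class ``DllVisibilityOptions`` and function ``visibility_for`` are hypothetical, and the visibility values in the usage lines are placeholders rather than documented defaults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DllVisibilityOptions:
    """One field per option value (hypothetical container)."""
    dllexport: str                  # -fvisibility-dllexport=
    nodllstorageclass: str          # -fvisibility-nodllstorageclass=
    externs_dllimport: str          # -fvisibility-externs-dllimport=
    externs_nodllstorageclass: str  # -fvisibility-externs-nodllstorageclass=


def visibility_for(opts: DllVisibilityOptions,
                   storage_class: Optional[str],
                   is_definition: bool) -> str:
    """Pick the option that applies to a symbol.

    storage_class is 'dllexport', 'dllimport', or None (no explicit class).
    Combinations the option descriptions do not mention raise ValueError.
    """
    if is_definition:
        if storage_class == "dllexport":
            return opts.dllexport
        if storage_class is None:
            return opts.nodllstorageclass
    else:
        if storage_class == "dllimport":
            return opts.externs_dllimport
        if storage_class is None:
            return opts.externs_nodllstorageclass
    raise ValueError("combination not covered by the described options")


# Placeholder values chosen for illustration only.
opts = DllVisibilityOptions("default", "hidden", "default", "hidden")
print(visibility_for(opts, "dllexport", is_definition=True))   # default
print(visibility_for(opts, None, is_definition=False))         # hidden
```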

.. option:: -fvisibility=<arg>

Set the default symbol visibility for all global declarations
@@ -2663,6 +2715,10 @@ Align selected branches (fused, jcc, jmp) within 32-byte boundary

Legacy option to specify code object ABI V2 (-mnocode-object-v3) or V3 (-mcode-object-v3) (AMDGPU only)

.. option:: -mcode-object-version=<version>

Specify code object ABI version. Defaults to 4. (AMDGPU only)

.. option:: -mconsole<arg>

.. program:: clang1
@@ -2827,6 +2883,10 @@ Use software floating point

.. option:: -mspeculative-load-hardening, -mno-speculative-load-hardening

.. option:: -msram-ecc, -mno-sram-ecc

Legacy option to specify SRAM ECC mode (AMDGPU only)

.. option:: -mstack-alignment=<arg>

Set the stack alignment
@@ -2959,13 +3019,9 @@ AMDGPU

Specify CU (-mcumode) or WGP (-mno-cumode) wavefront execution mode (AMDGPU only)

.. option:: -msram-ecc, -mno-sram-ecc

Specify SRAM ECC mode (AMDGPU only)

.. option:: -mxnack, -mno-xnack

Specify XNACK mode (AMDGPU only)
Legacy option to specify XNACK mode (AMDGPU only)

ARM
---
@@ -3537,6 +3593,10 @@ a Fortran input.

.. option:: -J<arg>

.. option:: -byteswapio

Swap byte-order for unformatted input/output

.. option:: -cpp

.. option:: -faggressive-function-elimination, -fno-aggressive-function-elimination
@@ -3545,10 +3605,16 @@ a Fortran input.

.. option:: -fall-intrinsics, -fno-all-intrinsics

.. option:: -fallow-fortran-gnu-ext, -fno-allow-fortran-gnu-ext

Allow Fortran GNU extensions

.. option:: -fautomatic, -fno-automatic

.. option:: -fbackslash, -fno-backslash

Treat backslash as C-style escape character

.. option:: -fbacktrace, -fno-backtrace

.. option:: -fblas-matmul-limit=<arg>
@@ -3587,18 +3653,18 @@ a Fortran input.

.. option:: -ff2c, -fno-f2c

.. option:: -ffixed-form, -fno-fixed-form

.. option:: -ffixed-line-length-<arg>

.. option:: -ffpe-trap=<arg>

.. option:: -ffree-form, -fno-free-form

.. option:: -ffree-line-length-<arg>

.. option:: -ffrontend-optimize, -fno-frontend-optimize

.. option:: -ffunc-args-alias, -fno-func-args-alias

Function argument may alias (equivalent to ansi alias)

.. option:: -fimplicit-none, -fno-implicit-none

.. option:: -finit-character=<arg>
@@ -3627,6 +3693,10 @@ a Fortran input.

.. option:: -fmodule-private, -fno-module-private

.. option:: -fno-fortran-main

Don't link in Fortran main

.. option:: -fpack-derived, -fno-pack-derived

.. option:: -fprotect-parens, -fno-protect-parens
@@ -3661,6 +3731,10 @@ a Fortran input.

.. option:: -funderscoring, -fno-underscoring

.. option:: -fuse-flang-math-libs, -fno-use-flang-math-libs

Use Flang internal runtime math library instead of LLVM math intrinsics.

.. option:: -fwhole-file, -fno-whole-file

.. option:: -imultilib <arg>
@@ -3669,6 +3743,26 @@ a Fortran input.

.. option:: -static-libgfortran

Fortran format Group
~~~~~~~~~~~~~~~~~~~~
.. option:: -ffixed-form, -fno-fixed-form

Enable fixed-form format for Fortran

.. option:: -ffree-form, -fno-free-form

Enable free-form format for Fortran

Flang runtime library Group
===========================
.. option:: -no-flang-libs

Do not link against Flang libraries

.. option:: -static-flang-libs

Link using static Flang libraries

Linker flags
============
Flags that are passed on to the linker
@@ -3764,3 +3858,4 @@ undef all system defines
.. option:: -z <arg>

Pass -z <arg> to the linker
