Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge master to hlsl-2021 #3722

Merged
merged 85 commits into from Apr 30, 2021
Merged

Conversation

hekota
Copy link
Member

@hekota hekota commented Apr 29, 2021

Fixes #3696

sooknarine and others added 30 commits Mar 4, 2021
ShaderOpTest::CreatePipelineState needs to check the return code from ID3D12Device2::CreatePipelineState so failures immediately stop here.

Otherwise we might get a failure in a subsequent call like when CommandListRefs::CreateForDevice() calls ID3D12Device::CreateCommandQueue() - this one will fail with return code 0x887a0005 - The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.
An unrelated earlier change mistakenly removed the return on a line
calling the Scatterer constructor for an argument value. As a result,
the constructor is called, but the resulting object which would insert
new extraction ops in the entry block is not used at all. Instead,
the default case is encountered where the vector PHI op is used as an
insertion point, which places the extractions after the new scalar PHIs
that use them. This is caught in the loop case by an assert because the
assumptions that caused LCSSA to be skipped for this BB are faulty.
Cycles in the dependency graph can result in phis and other instructions
failing to get a wave sensitivity result after analysis. This can only
happen when nothing in the loop has wave sensitive operations.

By collecting phis that are unknown and marking them as non-sensitive
after analyze is complete and rerunning it, we can correctly evaluate
their dependants too.

Because incompletely processed blocks might still be wave-sensitive,
unknown phis are only added to be reprocessed when their preds have all
be processed fully.

In addition to cycles preventing wave sensitivity determination on
dependant operations, they can prevent entering blocks that depend on
values with cyclic dependencies. Even when reprocessing all unknown
phis, this can result in not all values getting a determination.

To resolve this, all blocks are added to the block list from the start
to ensure that they are all processed eventually. Since instruction
processing is more intelligent and targetted, this change is accompanied
by a change that only processes a single block from the worklist before
attempting to address queued instructions again.

To catch wave-sensitive instructions sooner, when visiting an
instruction, all arguments are checked for wave sensitivity even after
an unknown argument is found whereas before an unknown argument ended
the search.

Add some tricky loop tests that failed in ToT and various intermediate
stages of the change. Add a more aggressive assert to check for failures
to find the wave-sensitivity.

Fixed microsoft#2556 (again)
Comparing two unbound arrays where one was initilizing the other
resulted in infinitely incrementing both and never reaching a
conclusion.

This adds detection to FlattenedTypeIterator for the situation whereupon
the comparison is terminated and the result returned.

A simple test of the assignment is added expecting the correct error.
microsoft#3549)

merge from microsoft#3545

(cherry picked from commit 7e1a660)

authored-by: Justin Holewinski <jholewinski@nvidia.com>
An esoteric build produced errors as a result of only using assigned
variables in asserts. This makes a dummy use of them to quiet it.
The mesh shader path for QuadReadTest expects at least one render target. Without, the test framework attempts to clear a nonexistent RTV resource:

D3D12 ERROR: ID3D12CommandList::ClearRenderTargetView: Specified CPU descriptor handle ptr=0xCCCCCCCCCCCCCCCC does not refer to a location in a descriptor heap.

The commit in this PR simply adds an RTV, and will proceed through to test completion.
…ft#3558)

* Take care case dynamic indexing on vector start with offset.
Instead of verifying that the args of an known phi were not known to be
sensitive, it was checking the users, which can very well be wave
sensitive
…mum precision variables (microsoft#3546)

Currently, arrays of min16float etc. types don't get a RelaxedPrecision decoration for the OpVariable. This has some real performance implications since at least some hardware can fit more RelaxedPrecision variables into registers, and maybe also groupshared memory.

This pull request add the missing RelaxedPrecision decoration for OpVariables of array types in a similar way vector and matrix minimum precision types are already handled.
Technically, the test matched the spec, but the D3D header didn't. I'm
more interested in making this work than who was right though ;)
* Support -export-shaders-only for linker.
This extension adds qualifiers for payload structures accompanied with semantic checks and code generation. This feature is opt-in for SM 6.6 libraries.  The information added by the developer is stored in the DXIL type system and a new metadata node is emitted during code generation. The metadata is not necessary for correct translation of DXIL, so it may be safely ignored, but it provides hints to unlock potential optimizations in payload storage between DXR shader stages.
…icrosoft#3576)

- VFS captures output files for duration of test, enabling:
- %dxl test IDxcLinker
- add -D to FileCheck args to supply defined variables
- report failing RUN command when not consumed by FileCheck or XFail
…s = 0 (microsoft#3578)

- Fix missing component type and count for Texture2DMS in ResProps
- Fix reflection for Texture2DMS with default NumSamples
…ing (microsoft#3586)

* Use D3D12 headers from the same Windows 10 SDK version as cmake is using (cmake variable CMAKE_VS_WINDOWS_TARGET_PLATFORM_VERSION)

* For ARM64EC builds detect SDK version from VSDevCmd environment variable instead of a fixed version in hctbuild.cmd

* Add option -show-cmake-log to hctbuild and enable it in AppVeyor
adam-yang and others added 25 commits Mar 30, 2021
PR microsoft#3636 (commit 6b44d61) changes the definition of
`llvm_unreachable(msg)`. Now it evaluates the `msg` even when
NDEBUG flag is defined.

For the following code,
```
static const char *const BundlingNotImplementedMsg =
  "Aligned bundling is not implemented for this object format";

void MCObjectStreamer::EmitBundleAlignMode(unsigned AlignPow2) {
  llvm_unreachable(BundlingNotImplementedMsg);
}
```

`llvm_unreachable(BundlingNotImplementedMsg)` did not evaluate
`BundlingNotImplementedMsg`. After PR microsoft#3636, it is evaluated and it
results in "undefined object" for `BundlingNotImplementedMsg`.
The mark convergent pass is meant to prevent unwanted moving of
operations on derivative op input. It was previously only run on pixel
shaders. Because derivatives are supported in CS/MS/AS shaders as part
of shader model 6.6, it needs to be run on these stages for that target
too.
Amidst catching the CS/AS/MS cases where convergent markers weren't
getting generated, it was pointed out that libs may need these too.
dbg.value can occasionally return a null value. (Hit this in a customer (343) shader via PIX.) This is expected. From IntrinsicInst.cpp:

  // When the value goes to null, it gets replaced by an empty MDNode.
microsoft#3656)

spirv-val reports a validation error when we use variable offset for
OpImage*. We have to rely on spirv-val for the validation instead of
doing it in DXC. In particular, when the HLSL code uses the loop
invariant variable for the Sample() intrinsics, it should be a constant
value after conducting the loop unrolling. Since we rely on spirv-opt
for the loop unrolling, we should not report the validation error in
DXC.

Fixes microsoft#3575
* Save root sig for entry for lib.
…rosoft#3664)

Change the mapping of SV_ShadingRate from FragSizeEXT (VK_EXT_fragment_density_map) to PrimitiveShadingRateKHR/ShadingRateKHR (VK_KHR_variable_rate_fragment_shading).
)

When calling a method using an object, we have to correctly pass "this"
object. When the class of the object used for the method call has
multiple base classes, it does not use the correct base class. It simply
uses "this" object. This commit fixes the issue.

Fixes microsoft#3644
…serts/crashes (microsoft#3683)

When a diagnostic that should be emitted with DiagnoceOnce has already been emitted, an invalid DiagnosticsBuilder is returned to prohibit another emittance of the diagnostic. The current implementation, however, does not set IsActive=false and the DiagnosticsEngine tries to emit a diagnostic with an invalid DiagID. This function is currently only used with payload access qualifiers to emit a warning only once if qualifiers are dropped because PAQs are disabled because of the SM version or the user opted-out. (Update the corresponding test accordingly).
@hekota hekota requested review from pow2clk and tex3d Apr 29, 2021
tex3d
tex3d approved these changes Apr 29, 2021
Copy link
Contributor

@tex3d tex3d left a comment

Looks good to merge ;)

@AppVeyorBot
Copy link

@AppVeyorBot AppVeyorBot commented Apr 29, 2021

@hekota hekota merged commit b12f133 into microsoft:hlsl-2021 Apr 30, 2021
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet