Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ASAN] Adjust asan instrumented GlobalVariable size to not include redzone #66666

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

skc7
Copy link
Contributor

@skc7 skc7 commented Sep 18, 2023

Add SanitizerMetadata to instrumented GlobalVariables if not present. This is used by AsmPrinter to identify the globals which are instrumented and only emit the actual size without the redzones.

@skc7 skc7 changed the title [WIP][AMDGPU] Adjust asan instrumented GlobalVaribale size to not include redzone [WIP][AMDGPU] Adjust asan instrumented GlobalVariable size to not include redzone Sep 18, 2023
@skc7 skc7 changed the title [WIP][AMDGPU] Adjust asan instrumented GlobalVariable size to not include redzone [ASAN] Adjust asan instrumented GlobalVariable size to not include redzone Sep 19, 2023
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Sep 19, 2023
@skc7 skc7 requested a review from vpykhtin September 19, 2023 12:50
@skc7
Copy link
Contributor Author

skc7 commented Sep 20, 2023

Clang codegen seems to add the SanitizerMetadata for globals, but when asan pass is run, some globals seems to have lost that. So I'm checking in asan pass if globals hasSanitizerMetadata() and if not present, adding one, with NoAddress set to false.
This is used in AsmPrinter, to identify the asan instrumented globals with padded redzones.

@@ -2445,6 +2445,11 @@ bool ModuleAddressSanitizer::InstrumentGlobals(IRBuilder<> &IRB, Module &M,
// zero so we can copy the metadata over as is.
NewGlobal->copyMetadata(G, 0);

// Set sanitizer metadata for newly created global,
// if it doesn't have it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we comment here that this may be a workaround for metadata lost between front end and this instrumentation pass?

@@ -2445,6 +2445,11 @@ bool ModuleAddressSanitizer::InstrumentGlobals(IRBuilder<> &IRB, Module &M,
// zero so we can copy the metadata over as is.
NewGlobal->copyMetadata(G, 0);

// Set sanitizer metadata for newly created global,
// if it doesn't have it.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clang codegen seems to add the SanitizerMetadata for globals, but when asan pass is run, some globals seems to have lost that. So I'm checking in asan pass if globals hasSanitizerMetadata() and if not present, adding one, with NoAddress set to false.
This is used in AsmPrinter, to identify the asan instrumented globals with padded redzones.

@skc7 skc7 requested a review from bcahoon September 25, 2023 15:26
@bcahoon
Copy link
Contributor

bcahoon commented Sep 27, 2023

Sorry if I'm asking a naive question, but doesn't changing .size for the globals remove the redzone space that is needed?

Copy link
Collaborator

@vitalybuka vitalybuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memtag test is not for Asan,

@vitalybuka
Copy link
Collaborator

This change looks suspicious.
Please wait for @hctim review.

Copy link
Collaborator

@hctim hctim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, clang/test/CodeGen/memtag-globals-asm.cpp is for MTE Globals, not ASan - and the sizes of the GVs should be multiple-of-16 bytes: https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst#compilation

What problem are you trying to solve here?

@skc7
Copy link
Contributor Author

skc7 commented Sep 28, 2023

Yeah, clang/test/CodeGen/memtag-globals-asm.cpp is for MTE Globals, not ASan - and the sizes of the GVs should be multiple-of-16 bytes: https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst#compilation

What problem are you trying to solve here?

ASAN pass identifies the global variables that needs to be instrumented and replaces them with new globals with size equal to actual size + redzone size. To identify such instrumented global variables, added SanitizerMetadata to the new global, which will have NoAddress set to false(which implies asan instrumented global). At asm printer stage, such gloabal would be identified and actual value without redzone size would be emitted.

This change was done under assumption that any target would only want the actual size of global in the elf and not the padded size. AMDGPU needs this change. Please let me know if it causes issue with other targets?

@hctim
Copy link
Collaborator

hctim commented Sep 28, 2023

Yeah, clang/test/CodeGen/memtag-globals-asm.cpp is for MTE Globals, not ASan - and the sizes of the GVs should be multiple-of-16 bytes: https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst#compilation
What problem are you trying to solve here?

ASAN pass identifies the global variables that needs to be instrumented and replaces them with new globals with size equal to actual size + redzone size. To identify such instrumented global variables, added SanitizerMetadata to the new global, which will have NoAddress set to false(which implies asan instrumented global). At asm printer stage, such gloabal would be identified and actual value without redzone size would be emitted.

This change was done under assumption that any target would only want the actual size of global in the elf and not the padded size. AMDGPU needs this change. Please let me know if it causes issue with other targets?

Under MTE globals (not asan, but the test you're changing in memtag-globals-asm.cpp), there are no redzones - the round-to-16-byte size is what we want to be in the ELF.

Copying the sanitizer metadata over seems fine, but reusing NoAddress as an identifier that this is a sanitizer-instrumented GV is wrong. Other GVs that are explicitly not asan-ified (e.g. by using the __attribute__((no_sanitize("address"))) attribute) end up with the same attribute.

Also, to the premise, why is ANDGPU unhappy about having the sizeof(GV) == global+redzone in the ELF?

@b-sumner
Copy link

b-sumner commented Sep 28, 2023

The reason this is a problem is that AMD language runtimes provide queries for the size of device global symbols, and functions to copy data to and from device global variables. The runtime gets the needed information form the ELF symbol table. This approach works fine when the ELF size reflects the actual size of the variable. But when it reflects the instrumented size of the variable then it causes problems, e.g. reading from the poisoned redzone. In general it doesn't seem like instrumentation should have such an effect.

@skc7
Copy link
Contributor Author

skc7 commented Sep 28, 2023

Yeah, clang/test/CodeGen/memtag-globals-asm.cpp is for MTE Globals, not ASan - and the sizes of the GVs should be multiple-of-16 bytes: https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst#compilation
What problem are you trying to solve here?

ASAN pass identifies the global variables that needs to be instrumented and replaces them with new globals with size equal to actual size + redzone size. To identify such instrumented global variables, added SanitizerMetadata to the new global, which will have NoAddress set to false(which implies asan instrumented global). At asm printer stage, such gloabal would be identified and actual value without redzone size would be emitted.
This change was done under assumption that any target would only want the actual size of global in the elf and not the padded size. AMDGPU needs this change. Please let me know if it causes issue with other targets?

Under MTE globals (not asan, but the test you're changing in memtag-globals-asm.cpp), there are no redzones - the round-to-16-byte size is what we want to be in the ELF.

Copying the sanitizer metadata over seems fine, but reusing NoAddress as an identifier that this is a sanitizer-instrumented GV is wrong. Other GVs that are explicitly not asan-ified (e.g. by using the __attribute__((no_sanitize("address"))) attribute) end up with the same attribute.

Also, to the premise, why is ANDGPU unhappy about having the sizeof(GV) == global+redzone in the ELF?

@hctim,
As @b-sumner replied, amd language runtime requires to know the actual global variable size rather than padded size. Do you suggest any other way of identifying such instrumented global variables (instead of checking NoAddress as false)?

@skc7
Copy link
Contributor Author

skc7 commented Oct 11, 2023

@hctim,
We want to identify global variables instrumented by asan at AsmPrinter stage. I can think of two approaches here:

  • Add SanitizerMetadata::NoAddress(false) at asan pass and also check if global is present in llvm.compiler.used list.

  • Create a new attribute (For Ex: "AsanInstrumentedVariable") and attach it to instrumented global variable at asan pass. Identify that GV at asmPrinter stage, using the attached attribute.

Please let me know if any of these approaches work here?

@hctim
Copy link
Collaborator

hctim commented Oct 12, 2023

My gut feeling is that it's a really bad idea to have a global variable whose symtab size differs from the underlying GV size. So I tested against lld, gold, and ld, and they all seem to end up with int-typed GVs having a filesize of 32 bytes and an ELF st_size of 4 bytes, and the runtime seems okay, but I think this breaks ELF rules:

https://docs.oracle.com/cd/E19683-01/816-1386/chapter6-79797/index.html#:~:text=Symbol%20Values%22.-,st_size,-Many%20symbols%20have

st_size: Many symbols have associated sizes. For example, a data object's size is the number of bytes contained in the object. This member holds 0 if the symbol has no size or an unknown size.

I suspect this could break relinkers and various other things.

It doesn't seem clear to me why amdgpu has problems with copying the extra redzone padding. We may also actually use the redzone for metadata and would expect that it would be consistent.

@b-sumner
Copy link

@hctim any implementation will have a problem when copying the redzone padding. That is simply because the redzone is poisoned (hence the name). Any access to a poisoned location will be reported and the application terminated. That is how ASAN works.

@hctim
Copy link
Collaborator

hctim commented Oct 13, 2023

My assumption is that you have some driver code or preloaded DSO that effectively implements copy_to_amdgpu, which would do something with the symtab.

Can you just make your driver not be asan-ified (either by not building it with -fsanitize=address or using __attribute__((no_sanitize("address"))) on the copy_to_amdgpu function? Then you'd copy the right quantity of data, and the access of the redzone would be ignored as there's no instrumentation there.

This seems like a more appropriate thing than making st_other symtab entries not actually the right size.

@b-sumner
Copy link

@hctim as far as I'm concerned, the symbol is already not the right size. The right size of a global variable of type float is 4, not 32. And disabling ASAN checking in the copy mechanism would reduce the usefulness of the address sanitizer.

But I take your point about linker optimizations. We are looking at alternatives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category compiler-rt:sanitizer llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants