[GVN] Invalidate ICF cache when clearing the instructions #68145

Closed

xgupta
Contributor

@xgupta xgupta commented Oct 3, 2023

This fixes #48805

/home/user/llvm-project/llvm/lib/Analysis/InstructionPrecedenceTracking.cpp:93: void llvm::InstructionPrecedenceTracking::validate(const llvm::BasicBlock *) const: Assertion `It->second == &Insn && "Cached first special instruction is wrong!"' failed.

@llvmbot
Collaborator

llvmbot commented Oct 3, 2023

@llvm/pr-subscribers-llvm-transforms

Changes

Full diff: https://github.com/llvm/llvm-project/pull/68145.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/GVN.cpp (+1)
diff --git a/llvm/lib/Transforms/Scalar/GVN.cpp b/llvm/lib/Transforms/Scalar/GVN.cpp
index bc54846ccf0ad2d..f7a905c2e13c4d4 100644
--- a/llvm/lib/Transforms/Scalar/GVN.cpp
+++ b/llvm/lib/Transforms/Scalar/GVN.cpp
@@ -2799,6 +2799,7 @@ bool GVNPass::processBlock(BasicBlock *BB) {
       salvageDebugInfo(*I);
       removeInstruction(I);
     }
+    ICF->clear();
     InstrsToErase.clear();
 
     if (AtStart)

Contributor

@nikic nikic left a comment

Clearing ICF as a whole is not the correct way to fix this. It needs to be incrementally invalidated, as other parts of GVN already do. The first step would be to provide a test case for the assertion failure.
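The difference between clearing the whole cache and invalidating it incrementally can be sketched with a toy model. This is not LLVM's `ImplicitControlFlowTracking`; `Inst`, `Block`, `FirstSpecialCache`, and `eraseInstruction` are hypothetical names made up for illustration, and "special" is reduced to a plain flag. The assumption is only that the tracker caches, per block, the first instruction that may not pass control to its successor.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-ins for an instruction and a basic block (hypothetical types).
struct Inst {
  std::string Name;
  bool Special; // models "may not transfer execution to the next instruction"
};

struct Block {
  std::vector<Inst *> Insts;
};

// Hypothetical tracker caching the first special instruction of each block.
class FirstSpecialCache {
  std::unordered_map<const Block *, const Inst *> Cache;

  // Recompute the answer from scratch by walking the block.
  static const Inst *scan(const Block &BB) {
    for (const Inst *I : BB.Insts)
      if (I->Special)
        return I;
    return nullptr;
  }

public:
  // Return the first special instruction, filling the cache lazily.
  const Inst *getFirstSpecial(const Block &BB) {
    auto It = Cache.find(&BB);
    if (It == Cache.end())
      It = Cache.emplace(&BB, scan(BB)).first;
    return It->second;
  }

  // Incremental invalidation: drop only the entry that I can affect.
  void removeInstruction(const Block &BB, const Inst *I) {
    auto It = Cache.find(&BB);
    if (It != Cache.end() && It->second == I)
      Cache.erase(It);
  }

  // Blunt invalidation: forget every block, including unaffected ones.
  void clear() { Cache.clear(); }

  bool isCached(const Block &BB) const { return Cache.count(&BB) != 0; }

  // A cached entry, if present, must agree with a fresh scan.
  bool validate(const Block &BB) const {
    auto It = Cache.find(&BB);
    return It == Cache.end() || It->second == scan(BB);
  }
};

// Erase I from BB, notifying the tracker first so the cache stays consistent.
inline void eraseInstruction(FirstSpecialCache &ICF, Block &BB, Inst *I) {
  ICF.removeInstruction(BB, I);
  for (auto It = BB.Insts.begin(); It != BB.Insts.end(); ++It) {
    if (*It == I) {
      BB.Insts.erase(It);
      break;
    }
  }
}
```

In this sketch, erasing an instruction through `eraseInstruction` drops at most one cache entry and leaves every other block's cached answer in place, whereas `clear()` forces all blocks to be rescanned on their next query.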

@w2yehia
Contributor

w2yehia commented Oct 4, 2023

We hit this on AIX while building python with -flto. Here's a reduced testcase:

; ModuleID = 'reduced.bc'
target datalayout = "E-m:a-Fi64-i64:64-n32:64-S128-v256:256:256-v512:512:512"
target triple = "powerpc64-ibm-aix7.3.0.0"

%struct.PyGetSetDef = type { ptr, ptr, ptr, ptr, ptr }
%struct.anon.28 = type { ptr, ptr, ptr, ptr, ptr, ptr }

@_PyMem_Raw = external global %struct.PyGetSetDef
@__profc_PyMem_SetAllocator = external global [5 x i64]
@__profc_PyMem_RawFree = external global [1 x i64]

define i32 @_PyMem_SetDefaultAllocator() #0 {
entry:
  %call = call fastcc i32 @pymem_set_default_allocator()
  ret i32 0
}

define fastcc i32 @pymem_set_default_allocator() #0 {
entry:
  store ptr @_PyMem_RawFree, ptr getelementptr inbounds (%struct.PyGetSetDef, ptr @_PyMem_Raw, i64 0, i32 4), align 8
  %pgocount6.i31 = load i64, ptr getelementptr inbounds ([5 x i64], ptr @__profc_PyMem_SetAllocator, i64 0, i64 1), align 8
  store i64 %pgocount6.i31, ptr @__profc_PyMem_SetAllocator, align 8
  ret i32 0
}

; Function Attrs: nounwind willreturn
declare void @_PyMem_RawFree() #1

define void @PyMem_RawFree(ptr %ptr) #0 {
entry:
  %pgocount = load i64, ptr %ptr, align 8
  %0 = or i64 %pgocount, 1
  store i64 %0, ptr @__profc_PyMem_RawFree, align 8
  %1 = load ptr, ptr getelementptr inbounds (%struct.PyGetSetDef, ptr @_PyMem_Raw, i64 0, i32 4), align 8
  tail call void %1(ptr null, ptr null)
  ret void
}

declare fastcc void @pathconfig_set_from_config(ptr nocapture)

define void @_PyConfig_InitPathConfig(i1 %cmp.not.i) #0 {
entry:
  %pathconfig.i11 = alloca [0 x [0 x %struct.anon.28]], i32 0, align 8
  %prefix.i.i.i.i = getelementptr %struct.anon.28, ptr %pathconfig.i11, i64 0, i32 1
  call fastcc void @pathconfig_set_from_config(ptr %pathconfig.i11)
  br i1 %cmp.not.i, label %if.end.i, label %config_calculate_pathconfig.exit

if.end.i:                                         ; preds = %entry
  %0 = load ptr, ptr %pathconfig.i11, align 8
  %cmp.not.i.i = icmp eq ptr %0, null
  %1 = load ptr, ptr %prefix.i.i.i.i, align 8
  %cmp.not.i66.i = icmp eq ptr %1, null
  %or.cond = select i1 %cmp.not.i.i, i1 %cmp.not.i66.i, i1 false
  br i1 %or.cond, label %config_calculate_pathconfig.exit, label %common.ret

common.ret:                                       ; preds = %config_calculate_pathconfig.exit, %if.end.i
  ret void

config_calculate_pathconfig.exit:                 ; preds = %if.end.i, %entry
  %call.i89.i = call i32 @_PyMem_SetDefaultAllocator()
  %2 = load ptr, ptr %pathconfig.i11, align 8
  call void @PyMem_RawFree(ptr %2)
  call void @PyMem_RawFree(ptr %prefix.i.i.i.i)
  br label %common.ret
}

attributes #0 = { "target-cpu"="pwr7" "target-features"="+altivec,+bpermd,+extdiv,+isa-v206-instructions,+vsx,-aix-small-local-exec-tls,-crbits,-crypto,-direct-move,-htm,-isa-v207-instructions,-isa-v30-instructions,-power8-vector,-power9-vector,-privileged,-quadword-atomics,-rop-protect,-spe" }
attributes #1 = { nounwind willreturn }

@xgupta
Contributor Author

xgupta commented Oct 5, 2023

Thanks, @w2yehia. Can you also share how you got this test case, for future reference? I mean, how do you run llvm-reduce for an LTO issue?

@nikic
Contributor

nikic commented Oct 5, 2023

Somewhat reduced test case for -passes=gvn:

declare void @_PyMem_RawFree() nounwind willreturn

define i64 @test(ptr %p) {
entry:
  %a = alloca [2 x ptr], align 8
  %a2 = getelementptr ptr, ptr %a, i64 1
  call void null(ptr %a)
  br i1 false, label %if, label %exit

if:
  %p0 = load ptr, ptr %a, align 8
  %p1 = load ptr, ptr %a2, align 8
  br label %exit

exit:
  store ptr @_PyMem_RawFree, ptr %p
  %p2 = load ptr, ptr %a, align 8
  %pgocount.i = load i64, ptr %p2, align 8
  %fn = load ptr, ptr %p
  tail call void %fn()
  %res = load i64, ptr %a2, align 8
  ret i64 %res
}

@nikic
Contributor

nikic commented Oct 5, 2023

Fixed in 46aac94. We need to remove users from ICF before performing RAUW.
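Why the ordering matters can be illustrated with a toy model. This is a sketch under assumptions, not the code from the fix: `Value`, `Call`, `Tracker`, and `replaceAllUses` are hypothetical, and the model assumes that replacing a call's callee can change whether the call counts as "special" (for example, an indirect call becoming a direct call to a `nounwind willreturn` function, as the reduced test case suggests). If the cache is not told about the users before the replacement, its entry goes stale.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical toy IR value: either an opaque pointer or a well-behaved function.
struct Value {
  bool KnownToReturn; // models a callee marked nounwind willreturn
};

struct Call {
  const Value *Callee;
  // A call is "special" when it is not guaranteed to pass control onward.
  bool isSpecial() const { return !Callee->KnownToReturn; }
};

struct Block {
  std::vector<Call *> Calls;
};

// Hypothetical tracker caching the first special call of each block.
class Tracker {
  std::unordered_map<const Block *, const Call *> Cache;

  // Recompute the answer from scratch by walking the block.
  static const Call *scan(const Block &BB) {
    for (const Call *C : BB.Calls)
      if (C->isSpecial())
        return C;
    return nullptr;
  }

public:
  // Return the first special call, filling the cache lazily.
  const Call *getFirstSpecial(const Block &BB) {
    auto It = Cache.find(&BB);
    if (It == Cache.end())
      It = Cache.emplace(&BB, scan(BB)).first;
    return It->second;
  }

  // Drop the cached entry for BB if the cached call uses V.
  void removeUsersOf(const Value *V, const Block &BB) {
    auto It = Cache.find(&BB);
    if (It != Cache.end() && It->second && It->second->Callee == V)
      Cache.erase(It);
  }

  // A cached entry, if present, must agree with a fresh scan.
  bool validate(const Block &BB) const {
    auto It = Cache.find(&BB);
    return It == Cache.end() || It->second == scan(BB);
  }
};

// Replace every use of Old with New, optionally invalidating the cache first.
inline void replaceAllUses(Tracker &T, Block &BB, const Value *Old,
                           const Value *New, bool InvalidateFirst) {
  if (InvalidateFirst)
    T.removeUsersOf(Old, BB);
  for (Call *C : BB.Calls)
    if (C->Callee == Old)
      C->Callee = New;
}
```

With `InvalidateFirst` set to `false`, the tracker still points at a call that is no longer special, so `validate` returns `false`, which is the toy analogue of the "Cached first special instruction is wrong!" assertion. With it set to `true`, the entry is dropped before the replacement and recomputed on the next query.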

@xgupta
Contributor Author

xgupta commented Oct 5, 2023

Thanks for the fix, @nikic. How do you get the .bc/.ll file for llvm-reduce, i.e. which flag is used when building Python?

@xgupta xgupta closed this Oct 5, 2023
@xgupta xgupta deleted the GVN branch October 5, 2023 10:23
@xgupta xgupta restored the GVN branch October 5, 2023 10:23
@xgupta xgupta deleted the GVN branch October 5, 2023 10:23
@w2yehia
Contributor

w2yehia commented Oct 5, 2023

> Thanks, @w2yehia, Can you also share how you got this test case for future purposes, I mean how do you run llvm-reduce for lto issue?

I used -save-temps to dump the intermediate IR, picked the *.0.2.internalize.bc file, ran it through opt with "-passes=lto<O3>", and was able to reproduce the failure. After that I ran bugpoint, which took a few hours but was slowly making progress, so I took its partially reduced IR and fed it into llvm-reduce, which finished in a few minutes.

@xgupta
Contributor Author

xgupta commented Oct 5, 2023

Thanks for the reply. I set the variables for -save-temps:

export CC=/home/shivam/llvm/new/llvm-project/build/bin/clang CXX=/home/shivam/llvm/new/llvm-project/build/bin/clang++ CFLAGS="-save-temps" CXXFLAGS="-save-temps" LDFLAGS="-save-temps"

And then ran:
./configure --with-lto --with-ensurepip=upgrade --enable-optimizations

It crashes, and I checked the generated .bc files:
find . -type f -name '*.bc'

There is no *.0.2.internalize.bc, only these:

_collectionsmodule.bc, getplatform.bc, _operator.bc, typeobject.bc, ceval.bc, string_parser.bc, ast.bc, namespaceobject.bc, gcmodule.bc, longobject.bc, pylifecycle.bc, bootstrap_hash.bc, signalmodule.bc, ast_opt.bc, myreadline.bc, hamt.bc, rangeobject.bc, errnomodule.bc, genobject.bc, fileobject.bc, pystrcmp.bc, errors.bc, pyfpe.bc, token.bc, pytime.bc, unicodeobject.bc, methodobject.bc, getpath.bc, setobject.bc, symtablemodule.bc, bltinmodule.bc, picklebufobject.bc, odictobject.bc, _math.bc, pystrhex.bc, bytearrayobject.bc, capsule.bc, asdl.bc, structmember.bc, obmalloc.bc, pyarena.bc, moduleobject.bc, _weakref.bc, getcopyright.bc, pyctype.bc, stringio.bc, bytesobject.bc, mystrtoul.bc, _abc.bc, preconfig.bc, _threadmodule.bc, pystate.bc, context.bc, memoryobject.bc, modsupport.bc, _functoolsmodule.bc, dtoa.bc, descrobject.bc, _iomodule.bc, Python-ast.bc, tupleobject.bc, importdl.bc, atexitmodule.bc, _warnings.bc, main.bc, pymath.bc, unicodectype.bc, faulthandler.bc, symtable.bc, bufferedio.bc, accu.bc, iterobject.bc, classobject.bc, _localemodule.bc, mysnprintf.bc, bytesio.bc, frozenmain.bc, textio.bc, getbuildinfo.bc, traceback.bc, unionobject.bc, _tracemalloc.bc, config.bc, sliceobject.bc, getversion.bc, codeobject.bc, _testembed.bc, exceptions.bc, timemodule.bc, formatter_unicode.bc, thread.bc, ast_unparse.bc, python.bc, peg_api.bc, complexobject.bc, marshal.bc, dynamic_annotations.bc, floatobject.bc, cellobject.bc, pythonrun.bc, interpreteridobject.bc, enumobject.bc, compile.bc, bytes_methods.bc, dynload_shlib.bc, dictobject.bc, getopt.bc, fileutils.bc, xxsubtype.bc, codecs.bc, _sre.bc, future.bc, pegen.bc, sysmodule.bc, object.bc, boolobject.bc, structseq.bc, parser.bc, weakrefobject.bc, genericaliasobject.bc, abstract.bc, getargs.bc, initconfig.bc, _stat.bc, import.bc, pyhash.bc, itertoolsmodule.bc, posixmodule.bc, listobject.bc, suggestions.bc, getcompiler.bc, iobase.bc, hashtable.bc, _codecsmodule.bc, frozen.bc, call.bc, fileio.bc, pwdmodule.bc, frameobject.bc, pathconfig.bc, pystrtod.bc, tokenizer.bc, funcobject.bc

And none of them fails when run through opt with -passes=lto.

Is this not right?

@w2yehia
Contributor

w2yehia commented Oct 5, 2023

On AIX, we use libLTO.so (LTOCodeGenerator), which has the -save-temps option.
This option emits IR at intermediate steps of link-time LTO.
If you're using lld, you can pass -save-temps directly (i.e. without -plugin-opt=).
When I try it on ppclinux, I get these files for a simple hello world program:

a.out.0.0.preopt.bc  a.out.0.2.internalize.bc  a.out.0.4.opt.bc  a.out.0.5.precodegen.bc

Try a hello world testcase first, before applying it to the python link.

@w2yehia
Contributor

w2yehia commented Oct 5, 2023

@nikic thanks for fixing; I couldn't find out if this was reviewed. If it wasn't, maybe we should?

@xgupta
Contributor Author

xgupta commented Oct 6, 2023

> on AIX, we use libLTO.so (LTOCodegenerator) and it has the -save-temps option. This option emits IR at intermediate steps of link-time LTO. If you're using lld, you can pass -save-temps directly (i.e. without -plugin-opt=)

Thanks. I got this working by using -Wl,-plugin-opt=save-temps, so I thought I'd add some documentation: #68389.

Successfully merging this pull request may close these issues.

Clang/LLVM TOT/12.x unable to build Python TOT