Collaborator

@tru tru commented Oct 19, 2023

On Windows, if you pass /lldltocache:D:\tmp to lld and D: is not mounted, lld fails to create the cache directory D:\tmp, but the error message is pretty hard to understand:

c:\code\llvm\llvm-project\out\debug>bin\lld-link.exe /lldltocache:D:\tmp hello.obj
LLVM ERROR: no such file or directory

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Exception Code: 0xC000001D

This led one of our users to report it as a crash. I have added a slightly better message, so it now says:

c:\code\llvm\llvm-project\out\debug>bin\lld-link.exe /lldltocache:D:\tmp hello.obj
LLVM ERROR: Can't create cache directory: D:\tmp

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.

I am not sure this should be a fatal error, because it's not something that should really be reported as a bug to LLVM. But at least the new message gives a bit more visibility into what to change.

@llvmbot
Member

llvmbot commented Oct 19, 2023

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-coff

@llvm/pr-subscribers-llvm-support

Author: Tobias Hieta (tru)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/69575.diff

1 Files Affected:

  • (modified) llvm/lib/Support/Caching.cpp (+1-1)
diff --git a/llvm/lib/Support/Caching.cpp b/llvm/lib/Support/Caching.cpp
index f20f08a865c76ff..722746e7b440390 100644
--- a/llvm/lib/Support/Caching.cpp
+++ b/llvm/lib/Support/Caching.cpp
@@ -145,7 +145,7 @@ Expected<FileCache> llvm::localCache(const Twine &CacheNameRef,
       // ensures the filesystem isn't mutated until the cache is.
       if (std::error_code EC = sys::fs::create_directories(
               CacheDirectoryPath, /*IgnoreExisting=*/true))
-        return errorCodeToError(EC);
+        return createStringError(EC, Twine("Can't create cache directory: ") + CacheDirectoryPath);
 
       // Write to a temporary to avoid race condition
       SmallString<64> TempFilenameModel;
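
The idea behind the diff above, attaching the offending path to an otherwise bare error code, can be illustrated outside LLVM. What follows is a minimal standalone sketch that uses only the C++17 standard library; it is not the code in `Caching.cpp`. The helper name `createCacheDir` and the exact message layout (including the trailing OS error text) are assumptions made for illustration.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: tries to create the cache directory. On failure it
// returns a message that names the path rather than only the OS error text.
// On success (including "directory already exists") it returns nullopt.
std::optional<std::string> createCacheDir(const fs::path &dir) {
  assert(!dir.empty() && "cache directory path must not be empty");
  std::error_code ec;
  fs::create_directories(dir, ec); // an existing directory is not an error
  if (ec) {
    return "Can't create cache directory: " + dir.string() + " (" +
           ec.message() + ")";
  }
  return std::nullopt;
}
```

A caller that receives a message from this sketch can tell at a glance which path was rejected, which is the property the one-line change is after.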

@github-actions

github-actions bot commented Oct 19, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Contributor

@teresajohnson teresajohnson left a comment

Change looks fine but can you add a test? Looks like unfortunately the other error conditions here are not tested anywhere, but let's just start by adding one for the error modified here. There are some existing caching tests in llvm/test/ThinLTO/X86/ (look for the cache-dir flag) that can be used as examples.

@tru
Collaborator Author

tru commented Oct 19, 2023

I can, but I wasn't sure we usually tested error strings. Guess I was under the impression that we had tests for the error condition. I'll look into it tomorrow.

What do you think about the fact that it calls a fatal error here? Wouldn't a normal error be enough, so that you don't get a scary traceback just for a missing directory?

@teresajohnson
Contributor

> What do you think about the fact that it calls a fatal error here? Wouldn't a normal error be enough, so that you don't get a scary traceback just for a missing directory?

I don't think the caching code itself is reporting it as a fatal error; it simply returns an error from localCache(), and I guess the linker client is reporting it as a fatal error? I'm not sure what the linker typically does for errors like this.
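
The layering described here, where the library hands back an error value and the client decides how severe it is, might look roughly like the sketch below. It is a standalone illustration and does not use LLVM's `Error` or `Expected` types; `openCache`, `runLink`, `Policy`, and `LinkResult` are hypothetical names, and the empty-path check merely stands in for a real directory-creation failure.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Library layer (hypothetical): reports failure by value and never
// terminates the process. An empty path stands in for a real failure.
std::optional<std::string> openCache(const std::string &dir) {
  if (dir.empty())
    return std::string("Can't create cache directory: <empty path>");
  return std::nullopt;
}

// Client-side policy: what to do when the cache cannot be opened.
enum class Policy { Fatal, DisableCache };

struct LinkResult {
  int exitCode;           // 0 on success, non-zero on failure
  bool cacheEnabled;      // whether the link ran with a cache
  std::string diagnostic; // empty when nothing went wrong
};

// Client layer (hypothetical): the same library error becomes either a hard
// failure or a warning, depending on the policy the client picked.
LinkResult runLink(const std::string &cacheDir, Policy policy) {
  std::optional<std::string> err = openCache(cacheDir);
  if (!err)
    return {0, true, ""};
  if (policy == Policy::Fatal)
    return {1, false, "error: " + *err};
  assert(policy == Policy::DisableCache);
  return {0, false, "warning: " + *err + "; continuing without a cache"};
}
```

The point of the sketch is only that the severity decision lives in `runLink`, not in `openCache`; which policy a real linker should choose is exactly the open question in this thread.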

Collaborator

Usually error messages are all lower case, I think?

Collaborator

(& any test coverage for this?)

Collaborator Author

That might be true for other parts, but this file has all error messages capitalized. I think I'll follow the current convention in the file for now, and someone can happily do an NFC change to fix that later.

@tru
Collaborator Author

tru commented Oct 20, 2023

> > What do you think about the fact that it calls a fatal error here? Wouldn't a normal error be enough, so that you don't get a scary traceback just for a missing directory?

> I don't think the caching code itself is reporting it as a fatal error; it simply returns an error from localCache(), and I guess the linker client is reporting it as a fatal error? I'm not sure what the linker typically does for errors like this.

That's true, the fatal error actually comes from the LTO backend:

https://github.com/llvm/llvm-project/blob/main/llvm/lib/LTO/LTOBackend.cpp#L407

@tru
Collaborator Author

tru commented Oct 20, 2023

I pushed a test, but it's not yet working 100%; I am struggling with this a bit. It seems that since we are "dying" via report_fatal_error, the normal not <cmd> construct does not work: lit still thinks the test has failed.

Lit output
➜ bin/llvm-lit -sa ../lld/test/COFF/lto-cache-errors.ll
llvm-lit: /home/tobias/code/llvm-project/llvm/utils/lit/lit/llvm/config.py:487: note: using ld.lld: /home/tobias/code/llvm-project/build/bin/ld.lld
llvm-lit: /home/tobias/code/llvm-project/llvm/utils/lit/lit/llvm/config.py:487: note: using lld-link: /home/tobias/code/llvm-project/build/bin/lld-link
llvm-lit: /home/tobias/code/llvm-project/llvm/utils/lit/lit/llvm/config.py:487: note: using ld64.lld: /home/tobias/code/llvm-project/build/bin/ld64.lld
llvm-lit: /home/tobias/code/llvm-project/llvm/utils/lit/lit/llvm/config.py:487: note: using wasm-ld: /home/tobias/code/llvm-project/build/bin/wasm-ld
FAIL: lld :: COFF/lto-cache-errors.ll (1 of 1)
******************** TEST 'lld :: COFF/lto-cache-errors.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /home/tobias/code/llvm-project/build/bin/opt -module-hash -module-summary /home/tobias/code/llvm-project/lld/test/COFF/lto-cache-errors.ll -o /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.o
+ /home/tobias/code/llvm-project/build/bin/opt -module-hash -module-summary /home/tobias/code/llvm-project/lld/test/COFF/lto-cache-errors.ll -o /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.o
RUN: at line 4: /home/tobias/code/llvm-project/build/bin/opt -module-hash -module-summary /home/tobias/code/llvm-project/lld/test/COFF/Inputs/lto-cache.ll -o /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp2.o
+ /home/tobias/code/llvm-project/build/bin/opt -module-hash -module-summary /home/tobias/code/llvm-project/lld/test/COFF/Inputs/lto-cache.ll -o /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp2.o
RUN: at line 5: rm -Rf /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache && mkdir /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache
+ rm -Rf /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache
+ mkdir /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache
RUN: at line 6: chmod 444 /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache
+ chmod 444 /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache
RUN: at line 9: not /home/tobias/code/llvm-project/build/bin/lld-link /lldltocache:/home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache/cache /out:/home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp3 /entry:main /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp2.o /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.o 2>&1 | /home/tobias/code/llvm-project/build/bin/FileCheck /home/tobias/code/llvm-project/lld/test/COFF/lto-cache-errors.ll
+ not /home/tobias/code/llvm-project/build/bin/lld-link /lldltocache:/home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache/cache /out:/home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp3 /entry:main /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp2.o /home/tobias/code/llvm-project/build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.o
+ /home/tobias/code/llvm-project/build/bin/FileCheck /home/tobias/code/llvm-project/lld/test/COFF/lto-cache-errors.ll

--

********************
********************
Failed Tests (1):
  lld :: COFF/lto-cache-errors.ll


Testing Time: 0.12s
  Failed: 1

Another thing that bothered me is that setting the permissions of the directory to make the creation fail only works on Linux, so this test would have to be disabled on Windows, and I was not able to find any prior art for how to handle this.

@teresajohnson
Contributor

> That's true, the fatal error actually comes from the LTO backend:

Ah, OK, it's there, not in the linker. In any case, given the layering here, I think it is harder to have the client distinguish between errors that should be fatal and those that should not. So let's just leave it as is for now, with your improved message. I'm open to suggestions on how to improve this, though.

> I pushed a test, but it's not yet working 100%; I am struggling with this a bit. It seems that since we are "dying" via report_fatal_error, the normal not <cmd> construct does not work: lit still thinks the test has failed.

I think you need "not --crash".

> Another thing that bothered me is that setting the permissions of the directory to make the creation fail only works on Linux, so this test would have to be disabled on Windows, and I was not able to find any prior art for how to handle this.

I'm not very familiar with Windows, but I think for these purposes a Linux-only test is fine.

Member

https://llvm.org/docs/CodingStandards.html#error-and-warning-messages The majority of places prefer no capitalization.

tru and others added 6 commits October 23, 2023 12:37
@tru
Collaborator Author

tru commented Oct 23, 2023

I think this is ready now.

@tru tru force-pushed the thieta/cache_error_msg branch from 7e0fe4e to 6c12034 Compare October 23, 2023 10:43
Collaborator

@dwblaikie dwblaikie left a comment

Looks OK to me - might want to check @MaskRay's concerns are covered too.

@tru
Collaborator Author

tru commented Oct 24, 2023

@teresajohnson @MaskRay Ok to land this now?

; RUN: chmod 444 %t.cache

;; Check emit warnings when we can't create the cache dir
; RUN: not --crash lld-link /lldltocache:%t.cache/cache /out:%t3 /entry:main %t2.o %t.o 2>&1 | FileCheck %s
Contributor

Make the cache dir name something more distinctive, like nonexistant, and check that it appears in the output below, e.g. can't create cache directory: {{.*}}nonexistant

Collaborator Author

Done.

; RUN: chmod 444 %t.cache

;; Check emit warnings when we can't create the cache dir
; RUN: not --crash lld-link /lldltocache:%t.cache/cache /out:%t3 /entry:main %t2.o %t.o 2>&1 | FileCheck %s
Contributor

Also, the --crash is worrying. I'm guessing there's a report_fatal_error call somewhere to which we should be passing false as the second parameter, so that this is treated as a user error rather than a crash. Should we have a follow-up change to fix that?

Collaborator Author

Yeah, we discussed the report_fatal_error above. I was under the impression that it always "crashed", but if it's just a bool passed to that function, that's probably a good fix. I can follow up on that in a new PR if everyone agrees that this is a good way to handle it. cc @teresajohnson @dwblaikie
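
The difference between a process that dies like a crash and one that exits with an ordinary error status can be sketched without LLVM. The standalone POSIX example below is only an illustration: `dieFatal`, `runInChild`, and `Outcome` are hypothetical names, and mapping the boolean to `abort()` versus a plain non-zero exit is an assumption about the general idea, not a copy of `report_fatal_error`.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for a fatal-error reporter with a crash-diagnostics
// switch. This is not LLVM code; the abort-vs-exit mapping is an assumption.
[[noreturn]] void dieFatal(bool genCrashDiag) {
  if (genCrashDiag)
    std::abort(); // killed by SIGABRT: the parent sees a crash
  std::_Exit(1);  // ordinary non-zero exit: the parent sees a plain failure
}

enum class Outcome { Crashed, ExitedWithError, Other };

// Runs dieFatal in a child process and classifies how the child ended.
Outcome runInChild(bool genCrashDiag) {
  pid_t pid = fork();
  if (pid < 0)
    return Outcome::Other; // fork failed
  if (pid == 0)
    dieFatal(genCrashDiag); // child: never returns
  int status = 0;
  [[maybe_unused]] pid_t waited = waitpid(pid, &status, 0);
  assert(waited == pid && "waitpid should report the child we forked");
  if (WIFSIGNALED(status))
    return Outcome::Crashed;
  if (WIFEXITED(status) && WEXITSTATUS(status) == 1)
    return Outcome::ExitedWithError;
  return Outcome::Other;
}
```

Under these assumptions, `runInChild(true)` is classified as `Crashed` and `runInChild(false)` as `ExitedWithError`. That is the kind of difference that decides whether a test needs `not --crash` or plain `not`.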

@@ -0,0 +1,20 @@
; REQUIRES: x86
Member

https://maskray.me/blog/2021-08-08-toolchain-testing#the-test-checks-at-the-wrong-layer

A better test is llvm-lto2 --cache-dir in llvm/test/ThinLTO/X86

Collaborator Author

My first iteration was there, but it doesn't seem to follow the same code path; I never got the error message that I got from lld-link.

I will merge this now, since a test is better than no test, but I might follow up with a test in the llvm layer when I make the changes to report_fatal_error, if people think that's a good change.

@tru tru merged commit a6d509f into llvm:main Oct 26, 2023
@tru tru deleted the thieta/cache_error_msg branch October 26, 2023 06:31
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Oct 26, 2023