-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace manual mmap with llvm::MemoryBuffer #1032
Conversation
@sscalpone could you check if this has the defective fd behaviour you mentioned on the mailing list? I don't believe it should as llvm::MemoryBuffer doesn't keep the fd around after it has mmapped a file. |
370c29b
to
4430289
Compare
If the open file descriptor is closed, then the file is not mapped into the virtual address space. |
Per the POSIX standard, this isn't the case. See https://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html: |
lib/Parser/CMakeLists.txt
Outdated
@@ -33,7 +33,7 @@ add_library(FortranParser | |||
) | |||
|
|||
target_link_libraries(FortranParser | |||
FortranCommon | |||
FortranCommon LLVMSupport |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General style seems to be one entry per library/file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
So the limit on the number of simultaneous open file descriptors won't be a problem? That's my only concern here. Have you tested this with a large number of |
It shouldn't be a problem, as we shouldn't have that many fds open at any one time. I don't have a test case that would have enough files to reach the fd limit though. Clang and Swift both use this MemoryBuffer class for their source file managing so I assume this has already been considered. We should check it though, I'll see if I can randomly generate a case. |
I've tried it with a file that |
lib/Parser/parsing.cpp
Outdated
@@ -29,12 +29,13 @@ const SourceFile *Parsing::Prescan(const std::string &path, Options options) { | |||
} | |||
} | |||
|
|||
std::stringstream fileError; | |||
std::string fileError_buf; | |||
llvm::raw_string_ostream fileError{fileError_buf}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is llvm::raw_string_ostream better than std::stringstream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use of llvm::raw_ostream over std::ostream is mandated for new code going in to llvm and this has been bought up on the mailing list with respect to f18, so if our goal is to submit to llvm need to change over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no good reason not to use std::stringstream and it's not clear that's what that link says. Right above there it says:
Note that using the other stream headers (
<sstream>
for example) is not problematic in this regard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does say that sstream is not problematic in the regard that it doesn't introduce global objects with non-static constructors. However, it goes on to say:
New code should always use raw_ostream for writing
Which I think is fairly clear, and we certainly are new code from LLVM's perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the context of the quote "New code should always use...." pertains to writing files. In this case, stringstream seems better and appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this was specifically clarified by Hal on a community call a few weeks ago, that the policy is that the use of any of the stream libraries including std::stringstream
is highly discouraged for new code.
Could you explain why stringstream is better and more appropriate here? std::stringstream
and llvm::raw_string_ostream
have the same interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why stringstream is better and more appropriate here?
It's part of the language.
std::stringstream
andllvm::raw_string_ostream
have the same interface.
This change shows a case where that's not the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why stringstream is better and more appropriate here?
And llvm::raw_string_ostream
is a part of the library of the project we are supposed to be a part of, and is preferred by that project for a number of technical reasons that are outlined in the documentation. I fail to see how this makes std::stringstream
better.
This change shows a case where that's not the case.
Are you referring to the fact that a separate buffer needs to be stored here? That is a minor change in the construction of the class but the observable interface as used here is identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this on the flang technical call on March 9 and agreed:
- The LLVM Coding Guidelines are unclear on whether or not sstream should be allowed
- The long-time LLVM community folks felt that the intention is for sstream not to be used at all in preference to LLVM APIs.
- We would make the change that David suggests in F18
- Johannes would start a thread to clarify the wording in the LLVM Coding Standards to make clear the intent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of std::stringstream
in Polly was removed: llvm/llvm-project@0e93f3b.
lib/Parser/preprocessor.cpp
Outdated
std::string error_buf; | ||
llvm::raw_string_ostream error{error_buf}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string error_buf; | |
llvm::raw_string_ostream error{error_buf}; | |
std::string errorBuf; | |
llvm::raw_string_ostream error{errorBuf}; |
lib/Parser/source.cpp
Outdated
bool is_dir = false; | ||
auto er = llvm::sys::fs::is_directory(path, is_dir); | ||
if (!er && !is_dir) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool is_dir = false; | |
auto er = llvm::sys::fs::is_directory(path, is_dir); | |
if (!er && !is_dir) { | |
bool isDir{false}; | |
auto er{llvm::sys::fs::is_directory(path, isDir)}; | |
if (!er && !isDir) { |
lib/Parser/source.cpp
Outdated
} | ||
return wrote; | ||
std::size_t RemoveCarriageReturns(llvm::MutableArrayRef<char> buf) { | ||
auto end = llvm::remove_if(buf, [](const char c) { return c == '\r'; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know if the lambda is inlined? If not, we should probably check the performance vs the original loop & memmove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original loop actually has O(n^2) complexity if I am not reading it wrong, whereas the complexity of remove_if is O(n). Regarding the specific question though, the lambda does get inlined by both compilers I've tested it on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you think that the loop is O(n**2)? It makes one pass over the source, one iteration per \r
, and moves each of the other bytes exactly once. And the new version doesn't use memchr
, which has been highly optimized for SIMD. I'd like to see actual measured performance comparisons before signing off on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memmove is an O(n) function and is called inside an O(n) loop. In the worst case where every character is a carriage return n^2 operations will have been performed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
O(#of lines) * O(average line length) == O(#of lines) * O(#total bytes / #lines) == O(#total bytes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we move forward here?
Clearly we don't want to degrade performance unnecessarily, but that is not the only concern. The coding style of the proposed new implementation is certainly closer to what the LLVM community would expect and this particular code has been requested to be re-written - http://lists.llvm.org/pipermail/llvm-dev/2020-February/139464.html - before we upstream.
Given the data shown, I don't think it is clear that David's proposed new implementation is systematically worse than the current implementation. It seems that results vary depending probably on how optimised your C and C++ standard libraries are for the system you are on,
If we agree on that, I think we should make the proposed change on the grounds of coding style alignment to LLVM to unblock upstreaming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dealing with DOS line endings is something that we have to do most often on Windows than elsewhere, so the x86 timings seem to indicate to me that we'll want to retain the fast approach in NVIDIA's product sources. You can do whatever you think you have to in your LLVM fork.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in order to make any statements about windows we would have to test there rather than Linux; their C and C++ standard library implementations are completely different to the ones on Linux and therefore very likely to have different performance characteristics. In addition my macOS results still show remove_if
as being faster than the hand rolled function here, and that's also on x86. Again I suspect that this is because the C standard library on macOS is different to on Linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function does also have to run on every platform, regardless of whether the input actually has carriage returns or not. So the performance matters everywhere not just on windows on x86.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sscalpone
So that we can progress the rest of the patch, David has removed this change from the review. Can we merge as-is?
@sscalpone @tskeith is this ready to merge now? |
215aad7
to
bc57d30
Compare
8c6701a
to
8d882cc
Compare
@sscalpone rebased on top of the LLVM streams patch |
@DavidTruby This PR causes a fatal check when compiling Nyx/Exec/Scaling in my sandbox. I don't know if the problem is with the PR or if the PR is exposing a different issue.
I can't share the Nyx source tree that I'm using; however, I think it is based on this: The failing file is zero length.
|
@sscalpone I seem to be able to reproduce this with empty files. I'll fix it and add a lit test for empty files. |
@sscalpone Should be fixed now. I've left it as a separate commit so you can review the fix, please let me know if it's ok to squash. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DavidTruby Please squash. Thanks!
The previous code had handling for cases when too many file descriptors may be opened; this is not necessary with MemoryBuffer as the file descriptors are closed after the mapping occurs. MemoryBuffer also internally handles the case where a file is small and therefore an mmap is bad for performance; such files are simply copied to memory after being opened. Many places elsewhere in the code assume that the buffer is not empty, and the old file opening code handles this by replacing an empty file with a buffer containing a single newline. That behavior is now kept in the new MemoryBuffer based code.
6a0af38
to
d34df84
Compare
@sscalpone I've squashed this to one commit and written a more descriptive commit message |
The previous code had handling for cases when too many file descriptors may be opened; this is not necessary with MemoryBuffer as the file descriptors are closed after the mapping occurs. MemoryBuffer also internally handles the case where a file is small and therefore an mmap is bad for performance; such files are simply copied to memory after being opened. Many places elsewhere in the code assume that the buffer is not empty, and the old file opening code handles this by replacing an empty file with a buffer containing a single newline. That behavior is now kept in the new MemoryBuffer based code. Original-commit: flang-compiler/f18@d34df84 Reviewed-on: flang-compiler/f18#1032
…morybuffer Replace manual mmap with llvm::MemoryBuffer Original-commit: flang-compiler/f18@35f7def Reviewed-on: flang-compiler/f18#1032
The previous code had handling for cases when too many file descriptors may be opened; this is not necessary with MemoryBuffer as the file descriptors are closed after the mapping occurs. MemoryBuffer also internally handles the case where a file is small and therefore an mmap is bad for performance; such files are simply copied to memory after being opened. Many places elsewhere in the code assume that the buffer is not empty, and the old file opening code handles this by replacing an empty file with a buffer containing a single newline. That behavior is now kept in the new MemoryBuffer based code. Original-commit: flang-compiler/f18@d34df84 Reviewed-on: flang-compiler/f18#1032
Fixes #840