Replace manual mmap with llvm::MemoryBuffer #1032

DavidTruby · 2020-02-27T15:18:52Z

Fixes #840

DavidTruby · 2020-02-27T15:21:02Z

@sscalpone could you check if this has the defective fd behaviour you mentioned on the mailing list? I don't believe it should as llvm::MemoryBuffer doesn't keep the fd around after it has mmapped a file.

klausler · 2020-02-27T16:24:53Z

> @sscalpone could you check if this has the defective fd behaviour you mentioned on the mailing list? I don't believe it should as llvm::MemoryBuffer doesn't keep the fd around after it has mmapped a file.

If the open file descriptor is closed, then the file is not mapped into the virtual address space.

DavidTruby · 2020-02-27T16:27:27Z

> If the open file descriptor is closed, then the file is not mapped into the virtual address space.

Per the POSIX standard, this isn't the case. See https://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html:
"The mmap() function adds an extra reference to the file associated with the file descriptor fildes which is not removed by a subsequent close() on that file descriptor. This reference is removed when there are no more mappings to the file."
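The POSIX behaviour quoted above can be checked with a small stand-alone program. The following is only a sketch, not code from this PR: the helper name and the scratch file template are hypothetical, and it assumes a POSIX system and non-empty contents (mmap rejects a zero length).

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: writes `contents` to a scratch file, maps it, closes
// the descriptor, and only then reads the bytes back through the mapping.
// Returns an empty string on any failure. `contents` must not be empty.
std::string ReadThroughMappingAfterClose(const std::string &contents) {
  char path[] = "/tmp/mmap_demo_XXXXXX";  // hypothetical scratch file template
  int fd = mkstemp(path);
  if (fd < 0) {
    return {};
  }
  unlink(path);  // the file lives on while the fd or a mapping refers to it
  ssize_t written = write(fd, contents.data(), contents.size());
  if (written != static_cast<ssize_t>(contents.size())) {
    close(fd);
    return {};
  }
  void *addr = mmap(nullptr, contents.size(), PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);  // the mapping holds its own reference to the file
  if (addr == MAP_FAILED) {
    return {};
  }
  // Reading here happens after close(); the mapping is still valid.
  std::string result(static_cast<const char *>(addr), contents.size());
  munmap(addr, contents.size());
  return result;
}
```

If the mapping were torn down by close(), the read after close() would fault rather than return the original bytes.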

kiranchandramohan · 2020-02-27T16:42:12Z

lib/Parser/CMakeLists.txt

@@ -33,7 +33,7 @@ add_library(FortranParser
 )

 target_link_libraries(FortranParser
-  FortranCommon
+  FortranCommon LLVMSupport


General style seems to be one entry per library/file.

klausler · 2020-02-27T16:42:22Z

> > If the open file descriptor is closed, then the file is not mapped into the virtual address space.
>
> Per the POSIX standard, this isn't the case. See https://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html:
> "The mmap() function adds an extra reference to the file associated with the file descriptor fildes which is not removed by a subsequent close() on that file descriptor. This reference is removed when there are no more mappings to the file."

So the limit on the number of simultaneous open file descriptors won't be a problem? That's my only concern here. Have you tested this with a large number of INCLUDE files?

DavidTruby · 2020-02-27T16:46:26Z

> So the limit on the number of simultaneous open file descriptors won't be a problem? That's my only concern here. Have you tested this with a large number of INCLUDE files?

It shouldn't be a problem, as we shouldn't have that many fds open at any one time. I don't have a test case that would have enough files to reach the fd limit though.

Clang and Swift both use this MemoryBuffer class to manage their source files, so I assume this has already been considered. We should check it though; I'll see if I can randomly generate a case.

DavidTruby · 2020-02-27T16:53:20Z

> We should check it though, I'll see if I can randomly generate a case.

I've tried it with a file that INCLUDEs 10,000 other files, which is well over my fd limit per process on my system. I don't get any errors and get correct output from -fget-symbols-sources.
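A stress case of this kind can be generated mechanically. The sketch below is not the generator used for the test described above; the function name, file naming scheme, and file contents are all hypothetical. It writes a number of tiny include files into the current directory, plus a main program that INCLUDEs each one.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical generator: writes `count` one-line include files and a main
// program that INCLUDEs all of them. Returns the main file's name, or an
// empty string on I/O failure.
std::string WriteIncludeStressCase(const std::string &prefix, int count) {
  std::string mainName = prefix + "_main.f90";
  std::FILE *mainFile = std::fopen(mainName.c_str(), "w");
  if (!mainFile) {
    return {};
  }
  std::fputs("program stress\n", mainFile);
  for (int i = 0; i < count; ++i) {
    std::string incName = prefix + "_inc" + std::to_string(i) + ".f90";
    std::FILE *inc = std::fopen(incName.c_str(), "w");
    if (!inc) {
      std::fclose(mainFile);
      return {};
    }
    // Each include file declares one distinct variable.
    std::fprintf(inc, "  integer :: v%d\n", i);
    std::fclose(inc);
    std::fprintf(mainFile, "  include '%s'\n", incName.c_str());
  }
  std::fputs("end program stress\n", mainFile);
  std::fclose(mainFile);
  return mainName;
}
```

Calling it with a count above the per-process fd limit and compiling the resulting main file would exercise the scenario discussed here.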

tskeith · 2020-02-27T17:49:21Z

lib/Parser/parsing.cpp

@@ -29,12 +29,13 @@ const SourceFile *Parsing::Prescan(const std::string &path, Options options) {
    }
  }

-  std::stringstream fileError;
+  std::string fileError_buf;
+  llvm::raw_string_ostream fileError{fileError_buf};


How is llvm::raw_string_ostream better than std::stringstream?

Use of llvm::raw_ostream over std::ostream is mandated for new code going into LLVM, and this has been brought up on the mailing list with respect to f18, so if our goal is to submit to LLVM we need to change over.

There is no good reason not to use std::stringstream and it's not clear that's what that link says. Right above there it says:

> Note that using the other stream headers (<sstream> for example) is not problematic in this regard

It does say that sstream is not problematic in the regard that it doesn't introduce global objects with non-static constructors. However, it goes on to say:

> New code should always use raw_ostream for writing

Which I think is fairly clear, and we certainly are new code from LLVM's perspective.

I believe the context of the quote "New code should always use...." pertains to writing files. In this case, stringstream seems better and appropriate.

I believe this was specifically clarified by Hal on a community call a few weeks ago, that the policy is that the use of any of the stream libraries including std::stringstream is highly discouraged for new code.

Could you explain why stringstream is better and more appropriate here? std::stringstream and llvm::raw_string_ostream have the same interface.

> Could you explain why stringstream is better and more appropriate here?

It's part of the language.

> std::stringstream and llvm::raw_string_ostream have the same interface.

This change shows a case where that's not the case.

> It's part of the language.

And llvm::raw_string_ostream is a part of the library of the project we are supposed to be a part of, and is preferred by that project for a number of technical reasons that are outlined in the documentation. I fail to see how this makes std::stringstream better.

> This change shows a case where that's not the case.

Are you referring to the fact that a separate buffer needs to be stored here? That is a minor change in the construction of the class but the observable interface as used here is identical.

We discussed this on the flang technical call on March 9 and agreed:

- The LLVM Coding Guidelines are unclear on whether or not sstream should be allowed.
- The long-time LLVM community folks felt that the intention is for sstream not to be used at all in preference to LLVM APIs.
- We would make the change that David suggests in F18.
- Johannes would start a thread to clarify the wording in the LLVM Coding Standards to make clear the intent.

The use of std::stringstream in Polly was removed: llvm/llvm-project@0e93f3b.

tskeith · 2020-02-27T17:50:04Z

lib/Parser/preprocessor.cpp

+    std::string error_buf;
+    llvm::raw_string_ostream error{error_buf};


Suggested change

-    std::string error_buf;
-    llvm::raw_string_ostream error{error_buf};
+    std::string errorBuf;
+    llvm::raw_string_ostream error{errorBuf};

tskeith · 2020-02-27T17:52:32Z

lib/Parser/source.cpp

+    bool is_dir = false;
+    auto er = llvm::sys::fs::is_directory(path, is_dir);
+    if (!er && !is_dir) {


Suggested change

-    bool is_dir = false;
-    auto er = llvm::sys::fs::is_directory(path, is_dir);
-    if (!er && !is_dir) {
+    bool isDir{false};
+    auto er{llvm::sys::fs::is_directory(path, isDir)};
+    if (!er && !isDir) {

sscalpone · 2020-03-05T14:11:31Z

lib/Parser/source.cpp

-  }
-  return wrote;
+std::size_t RemoveCarriageReturns(llvm::MutableArrayRef<char> buf) {
+  auto end = llvm::remove_if(buf, [](const char c) { return c == '\r'; });


Do you know if the lambda is inlined? If not, we should probably check the performance vs the original loop & memmove.

The original loop actually has O(n^2) complexity if I am not reading it wrong, whereas the complexity of remove_if is O(n). Regarding the specific question though, the lambda does get inlined by both compilers I've tested it on.

Why do you think that the loop is O(n**2)? It makes one pass over the source, one iteration per \r, and moves each of the other bytes exactly once. And the new version doesn't use memchr, which has been highly optimized for SIMD. I'd like to see actual measured performance comparisons before signing off on this.

Memmove is an O(n) function and is called inside an O(n) loop. In the worst case where every character is a carriage return n^2 operations will have been performed.

O(#of lines) * O(average line length) == O(#of lines) * O(#total bytes / #lines) == O(#total bytes).
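To make the two approaches under discussion concrete, here is a stand-alone sketch of each. Neither function is the code from this PR or from the existing source.cpp; the names are hypothetical and the first function is only written in the style the thread describes (memchr to find each \r, memmove to shift the run of bytes before it). In this form each kept byte is moved at most once, so both variants do work linear in the buffer size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Sketch of a memchr/memmove loop: one iteration per '\r', and each run of
// bytes between consecutive '\r' characters is shifted down exactly once.
std::size_t RemoveCRsWithMemmove(char *buf, std::size_t bytes) {
  std::size_t wrote = 0;
  char *p = buf;
  while (bytes > 0) {
    void *cr = std::memchr(p, '\r', bytes);
    if (!cr) {
      std::memmove(buf + wrote, p, bytes);  // tail after the last '\r'
      wrote += bytes;
      break;
    }
    std::size_t run = static_cast<std::size_t>(static_cast<char *>(cr) - p);
    std::memmove(buf + wrote, p, run);  // bytes before this '\r'
    wrote += run;
    p += run + 1;  // skip the run and the '\r' itself
    bytes -= run + 1;
  }
  return wrote;
}

// Sketch of the remove_if formulation: a single pass that compacts in place.
std::size_t RemoveCRsWithRemoveIf(char *buf, std::size_t bytes) {
  char *end = std::remove_if(buf, buf + bytes,
                             [](char c) { return c == '\r'; });
  return static_cast<std::size_t>(end - buf);
}
```

Both return the number of bytes kept; which one is faster in practice depends on the standard library implementations involved, which is the point the measurements in this thread were meant to settle.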

How can we move forward here?

Clearly we don't want to degrade performance unnecessarily, but that is not the only concern. The coding style of the proposed new implementation is certainly closer to what the LLVM community would expect and this particular code has been requested to be re-written - http://lists.llvm.org/pipermail/llvm-dev/2020-February/139464.html - before we upstream.

Given the data shown, I don't think it is clear that David's proposed new implementation is systematically worse than the current implementation. The results seem to vary, probably depending on how optimised the C and C++ standard libraries are for the system you are on.

If we agree on that, I think we should make the proposed change on the grounds of coding style alignment to LLVM to unblock upstreaming.

Dealing with DOS line endings is something that we have to do more often on Windows than elsewhere, so the x86 timings seem to indicate to me that we'll want to retain the fast approach in NVIDIA's product sources. You can do whatever you think you have to in your LLVM fork.

I think in order to make any statements about Windows we would have to test there rather than on Linux; its C and C++ standard library implementations are completely different to the ones on Linux and therefore very likely to have different performance characteristics. In addition, my macOS results still show remove_if as being faster than the hand-rolled function here, and that's also on x86. Again, I suspect that this is because the C standard library on macOS is different to the one on Linux.

This function also has to run on every platform, regardless of whether the input actually has carriage returns or not. So the performance matters everywhere, not just on Windows on x86.

@sscalpone
So that we can progress the rest of the patch, David has removed this change from the review. Can we merge as-is?

DavidTruby · 2020-03-12T01:14:55Z

@sscalpone @tskeith is this ready to merge now?

DavidTruby · 2020-03-19T17:44:21Z

@sscalpone rebased on top of the LLVM streams patch

sscalpone · 2020-03-22T15:24:36Z

@DavidTruby This PR causes a fatal check when compiling Nyx/Exec/Scaling in my sandbox. I don't know if the problem is with the PR or if the PR is exposing a different issue.

fatal internal error: CHECK(range_.Contains(at)) failed at /home/sjs/work/pgi/f18/pr1032/f18/lib/Parser/provenance.cpp(389)

I can't share the Nyx source tree that I'm using; however, I think it is based on this:
https://github.com/AMReX-Astro/Nyx/tree/master/Exec/Scaling.

The failing file is zero length.

% pwd
.../pr1032/build
% touch zero.f90
% ls -al zero.f90
-rw-rw-r-- 1 xxx yy 0 Mar 22 08:46 zero.f90
% tools/f18/bin/f18 zero.f90

fatal internal error: CHECK(range_.Contains(at)) failed at /home/sjs/work/pgi/f18/pr1032/f18/lib/Parser/provenance.cpp(389)
Aborted

DavidTruby · 2020-03-23T15:23:24Z

> @DavidTruby This PR causes a fatal check when compiling Nyx/Exec/Scaling in my sandbox. I don't know if the problem is with the PR or if the PR is exposing a different issue.
>
> fatal internal error: CHECK(range_.Contains(at)) failed at /home/sjs/work/pgi/f18/pr1032/f18/lib/Parser/provenance.cpp(389)
>
> I can't share the Nyx source tree that I'm using; however, I think it is based on this:
> https://github.com/AMReX-Astro/Nyx/tree/master/Exec/Scaling.
>
> The failing file is zero length.
> % pwd
> .../pr1032/build
> % touch zero.f90
> % ls -al zero.f90
> -rw-rw-r-- 1 xxx yy 0 Mar 22 08:46 zero.f90
> % tools/f18/bin/f18 zero.f90
>
> fatal internal error: CHECK(range_.Contains(at)) failed at /home/sjs/work/pgi/f18/pr1032/f18/lib/Parser/provenance.cpp(389)
> Aborted

@sscalpone I seem to be able to reproduce this with empty files. I'll fix it and add a lit test for empty files.

DavidTruby · 2020-03-23T17:46:53Z

@sscalpone Should be fixed now. I've left it as a separate commit so you can review the fix, please let me know if it's ok to squash.

sscalpone

@DavidTruby Please squash. Thanks!

The previous code had handling for cases when too many file descriptors may be opened; this is not necessary with MemoryBuffer as the file descriptors are closed after the mapping occurs. MemoryBuffer also internally handles the case where a file is small and therefore an mmap is bad for performance; such files are simply copied to memory after being opened. Many places elsewhere in the code assume that the buffer is not empty, and the old file opening code handles this by replacing an empty file with a buffer containing a single newline. That behavior is now kept in the new MemoryBuffer based code.

DavidTruby · 2020-03-24T13:39:15Z

@sscalpone I've squashed this to one commit and written a more descriptive commit message

The previous code had handling for cases when too many file descriptors may be opened; this is not necessary with MemoryBuffer as the file descriptors are closed after the mapping occurs. MemoryBuffer also internally handles the case where a file is small and therefore an mmap is bad for performance; such files are simply copied to memory after being opened. Many places elsewhere in the code assume that the buffer is not empty, and the old file opening code handles this by replacing an empty file with a buffer containing a single newline. That behavior is now kept in the new MemoryBuffer based code.

Original-commit: flang-compiler/f18@d34df84
Reviewed-on: flang-compiler/f18#1032
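The empty-file rule described in the commit message can be illustrated without LLVM. The following is only a sketch of the idea, not the MemoryBuffer-based code from this PR: the helper name is hypothetical and it reads the file with C stdio rather than mapping it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: reads a whole file into `out`. An empty file yields a
// buffer holding a single newline, so later code never sees an empty buffer.
// Returns false if the file cannot be opened.
bool ReadSourceOrNewline(const std::string &path, std::string &out) {
  std::FILE *f = std::fopen(path.c_str(), "rb");
  if (!f) {
    return false;
  }
  out.clear();
  char chunk[4096];
  std::size_t n = 0;
  while ((n = std::fread(chunk, 1, sizeof chunk, f)) > 0) {
    out.append(chunk, n);
  }
  std::fclose(f);
  if (out.empty()) {
    out = "\n";  // stand-in contents for a zero-length source file
  }
  return true;
}
```

With a rule like this in place, a zero-length file such as the zero.f90 reproducer reaches the rest of the front end as a one-byte buffer rather than an empty range.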

…morybuffer Replace manual mmap with llvm::MemoryBuffer Original-commit: flang-compiler/f18@35f7def Reviewed-on: flang-compiler/f18#1032

DavidTruby requested a review from sscalpone February 27, 2020 15:19

DavidTruby force-pushed the memorybuffer branch 2 times, most recently from 370c29b to 4430289 Compare February 27, 2020 16:10

kiranchandramohan reviewed Feb 27, 2020

tskeith reviewed Feb 27, 2020

isuruf mentioned this pull request Feb 27, 2020

Support Windows build compilers to match LLVM as closely as possible. #971

Open

DavidTruby changed the title ~~Replaced manual mmap with llvm::MemoryBuffer~~ Replace manual mmap with llvm::MemoryBuffer Mar 2, 2020

sscalpone reviewed Mar 5, 2020

lib/Semantics/mod-file.cpp

sscalpone reviewed Mar 5, 2020

lib/Parser/source.cpp

DavidTruby force-pushed the memorybuffer branch from 215aad7 to bc57d30 Compare March 12, 2020 13:02

tskeith reviewed Mar 12, 2020

lib/Parser/source.cpp (outdated)

DavidTruby force-pushed the memorybuffer branch 2 times, most recently from 8c6701a to 8d882cc Compare March 19, 2020 17:31

sscalpone approved these changes Mar 24, 2020

DavidTruby force-pushed the memorybuffer branch from 6a0af38 to d34df84 Compare March 24, 2020 13:37

sscalpone merged commit 35f7def into flang-compiler:master Mar 24, 2020
