Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add compiler caching support for CI checks #2624

Merged
merged 33 commits into from
Jun 16, 2023
Merged

Add compiler caching support for CI checks #2624

merged 33 commits into from
Jun 16, 2023

Conversation

daljit46
Copy link
Member

@daljit46 daljit46 commented May 2, 2023

This pull request add supports for compiler caching in the CI checks workflow. It uses sccache which is a compiler cache that speeds up recompilation by caching previous compilations by automatically detecting whether the same compilation is already cached. This allows us to leverage the free cache (up to 10GB) provided by Github Actions. At the moment, it only works with Clang/GCC builds on Linux and MacOS.

In our current workflow, all the checks are done by compiling from scratch and for example, a typical CI build with Clang on Linux takes about 35-40 minutes. My preliminary tests show that using sccache we can complete a cached build in about 5-10 minutes.

I've decided to use sccache instead of the more popular ccache because in my experience it has been more reliable on Windows (particularly with MSVC) and because they provide a pre-made Github Action to facilitate integration. However, this is not necessary and we could use ccache if wanted (integrating it should be fairly straightforward).

Any feedback welcome!

@daljit46
Copy link
Member Author

daljit46 commented May 11, 2023

Ok so I think most things are working now for all platforms. The improvements in build times are significant and should considerably improve our build and test cycle.
The only problem that remains now is that on MSYS2, the meshconvert test is failing for some reason and also gcc 12/13 triggers warning during compilation. Our current CI workflow was all tested against gcc 9 and thus these warnings weren't there. I think we should fix all the warnings before we merge this in.
Also upgrading all workflows to use the latest gcc and clang is probably a good idea.

@daljit46
Copy link
Member Author

Ok, so it seems that even fixing the warnings doesn't solve the issue of the binary test failing. @Lestropie @jdtournier could you take a look at why the meshconvert test fails? It's very strange that this is only failing on Windows.

@jdtournier
Copy link
Member

I can replicate on my Win10 laptop - though I also get failures for mrhistmatch & tckconvert. Go figure...
I'll investigate when I have a minute.

@jdtournier
Copy link
Member

jdtournier commented May 19, 2023

OK, mrhistmatch & tckconvert failures were simply due to the absence of diff & bc on my system... Easy fix.

For meshconvert, the issue is due to opening the file in text mode, which only differs from binary mode on Windows systems. Because these VTK files can be written as text or as binary, I'm not sure how to handle this. If we change to always opening these files in binary mode, the other tests fail (for the VTK text files). But since the binary files start as text and switch to binary halfway through, it's going to require switching modes halfway through or something... I'll have a go at that in a minute.

@daljit46
Copy link
Member Author

For meshconvert, the issue is due to opening the file in text mode, which only differs from binary mode on Windows systems.

But why didn't the check fail on the previous Windows CI workflow?

@jdtournier
Copy link
Member

Ok, having looked into this, I'm starting to think this might be a compiler bug... The offset reported by input stream's tellg() when the file is opened in non-binary mode is completely off, which is what is causing the issue when the file re-opened in binary mode. I'll investigate further when I get the chance, but it might be a while...

@jdtournier
Copy link
Member

OK, a bit more context around the meshconvert failure. I wrote this simple test command:


test_tellg()

#include <fstream>
#include <iostream>
#include <string>

int main (int argc, char* argv[])
{
  std::ifstream in (argv[1], std::ios_base::in);
  std::string line;
  std::cerr << "offset = " << in.tellg() << "\n";

  for (int n = 1; n <= 5; ++n) {
    std::getline (in, line);
    if (!in) break;
    std::cerr << "line " << n << ": \"" << line << "\" (size = " << line.size() << "), current offset = " << in.tellg() << "\n";
  }
}

and ran it on the testing/binaries/data/meshconvert/in_le.vtk file, which is on our testing repo (and one of the input files where the testing currently fails), and this is what it produced (on MSYS2, clang 64):

offset = 0
line 1: "# vtk DataFile Version 1.0" (size = 26), current offset = 3499
line 2: "ù♥@#J{â/,-└¾²Èxy2@/·" (size = 21), current offset = 7253
@Ûò▓ 3: "└╩2─▒.n↨@ä×═¬¤§
qý5└'âúõı╔!@&³R?o¬      @¥ƒ" (size = 43), current offset = -1

So the offset is already wrong as soon as we're read the first line...

@jdtournier
Copy link
Member

Yet more context on this issue... Here's another test cpp file that demonstrates the issue. This time, it writes a few lines of text (with a file open in binary mode), then dumps binary data, with single bytes incrementing in value and wrapping around for a long as requested.


test_tellg.cpp

#include <fstream>
#include <iostream>
#include <string>

int main ()
{
  using namespace std;

  {
    ofstream out ("testfile.txt", ios_base::binary);
    out << "first line of text\nsecond line of text\nthird line of text\nsome binary after this line\n";

    vector<char> data (1000);
    char n = 0;
    for (auto& c : data) {
      //if (n==26) n++;    <-- uncomment and it works as expected
      c = n++;
    }
    out.write (data.data(), data.size());
  }


  ifstream in ("testfile.txt", ios_base::in);
  string line;
  cerr << "offset = " << in.tellg() << "\n";

  for (int n = 1; n <= 5; ++n) {
    getline (in, line);
    if (!in) break;
    cerr << "line " << n << ": \"" << line << "\" (size = " << line.size() << "), current offset = " << in.tellg() << "\n";
  }
}

It turns out that things behave as expected as long as the file does not contain the ASCII 26 (substitute) character. Output from code above:

offset = 0
line 1: "first line of text" (size = 18), current offset = 993
line 2: "ïîìÄÅÉæÆôöòûùÿÖÜø£Ø׃áíóúñѪº¿®¬½¼¡«»░▒▓│┤ÁÂÀ©╣║╗╝¢¥┐└┴┬├─┼ãÃ╚╔╩╦╠═╬¤ðÐÊËÈıÍÎÏ┘┌█▄¦Ì▀ÓßÔÒõÕµþ" (size = 93), current offset = -1

and output of same command with the commented line enabled (i.e. skipping the character with ASCII code 26):

offset = 0
line 1: "first line of text" (size = 18), current offset = 19
line 2: "second line of text" (size = 19), current offset = 39
line 3: "third line of text" (size = 18), current offset = 58
line 4: "some binary after this line" (size = 27), current offset = 86
line 5: "☺☻♥♦♣♠ " (size = 10), current offset = 97

According to Wikipedia, this ASCII 26 character used to mark the end-of-file, so maybe that has something to do with it. But regardless, the problematic character is encountered much later in the file, and yet it affects the file offsets reported way earlier. Feels like there's a glitch somewhere in MSYS2...

@jdtournier
Copy link
Member

OK, one solution that might work is to always open the file in binary mode, and strip any trailing carriage return '\r' characters from the end of each line after every getline() call. Not pretty, but it seems to work on that toy example above...

@daljit46
Copy link
Member Author

Hmm that's quite strange. I compiled this MSVC (second example) and get:

offset = 0
line 1: "first line of text" (size = 18), current offset = 988
line 2: "second line of text" (size = 19), current offset = 1009
line 3: "third line of text" (size = 18), current offset = 1029
line 4: "some binary after this line" (size = 27), current offset = 1058
line 5: "☺☻♥♦♣♠ " (size = 10), current offset = 1070

so this may be a specific Windows issue.

@daljit46
Copy link
Member Author

daljit46 commented May 24, 2023

So it seems that as per this answer and here, tellg() is not guaranteed to give you the byte offset if you're reading the file in text mode, but only in binary mode (on Unix systems this doesn't make a difference). So I think we cannot rely on tellg giving us exact byte offsets if we are reading the files in text mode. This might be relevant.

@jdtournier
Copy link
Member

So it seems that as per this answer and here, tellg() is not guaranteed to give you the byte offset if you're reading the file in text mode, but only in binary mode

Yes, I came across similar information, but this seems worse than that. In that first link, note the answer states:

As a matter of fact tellg() is not to return the offset of a byte in a stream, but a pos_type descriptor which is reusable by seekg().

And this is confirmed on the cppreference page for ftell():

If the stream is open in text mode, the value returned by this function is unspecified and is only meaningful as the input to std::fseek.

What we get is clearly not re-usable in any sense - it's just plain wrong.

But given the situation seems to affect pretty much all implementations on Windows (if I understand that second link correctly), I think the only sensible thing to do is always open all files in binary mode on all platforms, and strip out carriage returns if needed. We already open all output files in binary mode by default, and always write Unix newlines (no std::endl) to make sure outputs are always identical no matter what platform files were generated on. This issue strongly suggests we need to adopt a similar approach on the input side too...

@Lestropie
Copy link
Member

Can we move the MSYS2 meshconvert discussion to a separate issue? It's happening on master and is unrelated to the contents of this PR.

@daljit46 daljit46 marked this pull request as ready for review June 16, 2023 11:49
@daljit46 daljit46 requested a review from a team June 16, 2023 11:49
@jdtournier jdtournier merged commit 584cfe2 into dev Jun 16, 2023
5 checks passed
@jdtournier jdtournier deleted the compilercaching branch June 16, 2023 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants