Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src/comp/Depend: use unique names for cpp input #52

Merged
merged 1 commit into from
Mar 28, 2020

Conversation

cbiffle
Copy link
Contributor

@cbiffle cbiffle commented Feb 12, 2020

Previously, the input to the C preprocessor was always deposited into a
file called _t_o_p.c, while the output was given a unique name. This
proved problematic for build systems that invoke several bsc processes
concurrently, using the same output directory.

This change renames the input file to match the output file using the
tmpnam-generated unique-ish name.

With this change, I can build the Mergesort tutorial projects using a
Ninja file and 4 cores. It takes a little under 50% of the time compared
to Ninja with -j1.

Previously, the input to the C preprocessor was always deposited into a
file called _t_o_p.c, while the output was given a unique name. This
proved problematic for build systems that invoke several bsc processes
concurrently, using the same output directory.

This change renames the input file to match the output file using the
tmpnam-generated unique-ish name.

With this change, I can build the Mergesort tutorial projects using a
Ninja file and 4 cores. It takes a little under 50% of the time compared
to Ninja with -j1.
@quark17
Copy link
Collaborator

quark17 commented Feb 12, 2020

I don't think it's necessary to add a different kind of name generator. Just define topName like this:

tempName <- tmpNam
let topName = tempName ++ ".c"
let tmpNameOut = tempName ++ ".out"

Unless there's a reason this won't work, that I'm missing? if not, I can check in that fix.

@cbiffle
Copy link
Contributor Author

cbiffle commented Feb 12, 2020

The issue is that the generated file contains

#include "src/Top.bs" (for example)

and no -I directives are passed to cc. So my options, as I see it, are:

  1. Deposit the file in the same place (which actually appears to be the cwd, not the build dir) with a random-er name, or
  2. Work out how to get the CWD in a way that is compatible with bsc's full range of supported GHC versions and generate a -I directive.

The first option was simpler for me, but I'm totally open to the second!

@quark17
Copy link
Collaborator

quark17 commented Feb 14, 2020

Ah, ok, thanks.

BSC wants to run the C preprocessor on a non-C source file. It generates a top.c that contains only an include statement for the C file (with possibly a relative path from the CWD). Because of that, cc needs to have a proper search path, either implied by the location of top.c or given with -I.

Why don't we just run cc on the original source file, and then we don't need any include path? Is it because the file needs to have an appropriate extension, so we generate a file ending in .c? If so, couldn't we just copy the source to tmpName + ".c" and run on that?

Even if we did want to keep the current structure, we could easily do it this way: generate top.c with an include statement mentioning just the base name of the source filename; and to cc provide -I with the directory part of the filename. That avoids having to know what CWD is, and uses information that we already have -- we just split the source filename into directory and non-directory. I was going to this a try, since it seemed easiest.

@quark17
Copy link
Collaborator

quark17 commented Feb 15, 2020

Extracting the path and providing it with -I worked, so I pushed that as commit e5af869.

@quark17 quark17 closed this Feb 15, 2020
@quark17
Copy link
Collaborator

quark17 commented Mar 14, 2020

I hadn't properly tested my changes before committing them and I see in the testsuite (which I'm working to make public) that a behavior change was flagged. Specifically, the filename that appears in messages was mangled. This is because BSC encodes relative locations within absolute paths, using /// to mark the rest of the string as relative. In my commit, I used basename to construct the include statement in the C file and dirName to construct the argument to -I -- so instead of messages that mention Foo.bsv, the messages would have /abs/path/to//Foo.bsv. At the very least, I should have instead used FileNameUtil.getRelativeFilePath (which strips through to /// when it exists) and . for the -I path. However, this still creates the name ./Foo.bsv in the messages, when we'd prefer to have just Foo.bsv.

It's still unclear to me why we don't call CPP directly on the BSV file. If we do that, there's no need for -I which is what is adding ./ to the filename. If we have to create a new C file with an include statement, then we have to create that locally, as you originally proposed. Without knowing whether it's safe to run CPP directly on the BSV file, I'll just implement your fix.

@quark17 quark17 reopened this Mar 28, 2020
quark17 added a commit that referenced this pull request Mar 28, 2020
This reverts commit e5af869,
which solved the issue of unique names but mangled the paths
in position info.  At the very least, it should have used
"getRelativeFilePath" rather than "baseName" and "dirName",
but that still has imperfect names.  So revert this commit
and we'll accept PR #52 instead.
@quark17 quark17 merged commit 8bcfc18 into B-Lang-org:master Mar 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants