-
Notifications
You must be signed in to change notification settings - Fork 540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactored argument parsing #170
Conversation
A try build with the patch from bug 1324892 and using this version of sccache: |
That's a bummer. :-( |
} | ||
} | ||
|
||
pub fn to_str(&self) -> Option<&'static str> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably more conventional to name this as_str
, since it's not allocating anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH, as_str
doesn't usually return an Option (never in libstd, at least)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, as a data point OsStr::to_str
doesn't allocate on unix (and returns an Option<&str>)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then it seems fine as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that's a lot of new code! Overall this looks great, even considering the complexity of the new code. (It would be nice if we didn't have to have a binary search implementation here, but I understand why you do given the constraints.)
I bet some of this could be done in a nicer way with a procedural macro, that might be interesting to experiment with in advance of them stabilizing.
The only thing I'm really hung up on here is your use
statements for enum variant names. That's not something I've seen used in all the Rust code I've read, and I think the namespace pollution isn't worth it. If you want it because it makes writing the argument arrays easier then maybe a macro would smooth that over?
Other than that just some minor things that could be tweaked.
src/compiler/args.rs
Outdated
/// Transforms a parsed argument into an iterator. | ||
impl IntoIterator for Argument { | ||
type Item = OsString; | ||
type IntoIter = std::vec::IntoIter<OsString>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a little unfortunate to allocate a Vec
every time into_iter
gets invoked for any Argument
. Unfortunately without having generators in Rust it gets a bit verbose to write this in a simpler way. This was the best I could come up with:
https://play.rust-lang.org/?gist=148372bc1c25ad8174cb8cf74009f8ab&version=stable
src/compiler/args.rs
Outdated
|
||
/// How a value is passed to an argument with a value. | ||
#[derive(PartialEq, Clone, Debug)] | ||
pub enum ArgDisposition<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be generic over T
? Do you use it with anything other than Delimiter
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes normalize() non awkward to call.
src/compiler/args.rs
Outdated
Concatenated(T), | ||
} | ||
|
||
pub use self::ArgDisposition::*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty unconventional Rust style. Is this just to shorten typing for the code in the compiler implementations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it? That's technically how you get Ok(), Err(), Some() and None for Result and Option...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find a strong guideline about it anywhere, but clippy has a lint for glob importing enum variants.
src/compiler/args.rs
Outdated
#[derive(PartialEq, Clone, Debug)] | ||
pub enum ArgumentValue { | ||
StringVal(OsString), | ||
PathVal(PathBuf), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably leave off the Val
suffix on these names, since you would generally write them as ArgumentValue::StringVal
anyway.
src/compiler/args.rs
Outdated
pub fn unwrap_path(self) -> PathBuf { | ||
match self { | ||
PathVal(p) => p, | ||
StringVal(_) => panic!("Can't unwrap_path an ArgumentValue::StringVal"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to just do PathBuf::from(s)
here? AFAIK that doesn't allocate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that you only get a Path if the option was meant to be one.
type Info = T; | ||
|
||
fn search(&self, key: &str) -> Option<&Self::Info> { | ||
match (self.0.search(key), self.1.search(key)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it matters in practice, but this will wind up searching both arrays every time, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and there's no way around that.
src/compiler/gcc.rs
Outdated
// Refuse to cache arguments such as "-include@foo" because they're a | ||
// mess. See https://github.com/mozilla/sccache/issues/150#issuecomment-318586953 | ||
if let Argument::WithValue(_, ref v, CanBeSeparated(_)) = item.arg { | ||
if OsString::from(v.clone()).to_string_lossy().starts_with("@") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use OsStrEx::starts_with
here as well.
src/compiler/gcc.rs
Outdated
return CompilerArguments::CannotCache("@"); | ||
} | ||
} | ||
match item.data { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like how this code worked out!
src/compiler/gcc.rs
Outdated
None => { | ||
match item.arg { | ||
Argument::Raw(ref val) => { | ||
if input_arg.is_some() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this regress your change to make linking invocations produce NotCompilation
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doh, yes (why didn't this comment appear on the main PR page?)
for item in ArgsIter::new(it, &ARGS[..]) { | ||
match item.data { | ||
Some(TooHard) => { | ||
return CompilerArguments::CannotCache(item.arg.to_str().expect( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make TooHard
contain a &'static str
and just return that in the CannotCache
, but it's not particularly pressing.
src/compiler/c.rs
Outdated
@@ -77,6 +77,19 @@ impl ParsedArguments { | |||
} | |||
} | |||
|
|||
pub fn normalized_extension(file: &Path) -> Option<&'static str> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had actually mentioned this on a different PR a while ago, but it might be nicer to have an enum of C dialects to use here.
src/compiler/c.rs
Outdated
@@ -276,6 +291,7 @@ pub fn hash_key(compiler_digest: &str, | |||
let mut m = Digest::new(); | |||
m.update(compiler_digest.as_bytes()); | |||
m.update(CACHE_VERSION); | |||
m.update(extension.as_bytes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you should rev CACHE_VERSION
here, since you're changing the hash inputs.
src/compiler/gcc.rs
Outdated
// Set extension to one matching the language set with -x | ||
let lang = item.arg.get_value().map(OsString::from); | ||
let lang = lang.as_ref().map(|a| a.to_string_lossy()); | ||
extension = match lang.as_ref().map(|a| a.as_ref()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this wants an ArgumentValue::as_str
or something like that? I guess you'd have to return a Cow<str>
because of the to_string_lossy
in there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That still requires 3 lines thanks to the borrow checker.
src/compiler/gcc.rs
Outdated
@@ -291,7 +299,7 @@ where | |||
|
|||
CompilerArguments::Ok(ParsedArguments { | |||
input: input.into(), | |||
extension: extension, | |||
extension: extension.expect("extension should always be set"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can drop this if you rewrite the bit above to be like:
let extension = match extension {
None => return CompilerArguments::CannotCache("unknown source extension"),
Some(e) => e,
};
I wrote a macro to do this for required options in the Rust compiler implementation.
Oh heh, I forgot that the "review changes" button applies to the entire PR. Oh well. The second patch looks fine modulo those few little comments, but I'd really like to see the change to not use the enum variant names in the first patch. |
Each compiler essentially had its own, rather minimalist, argument handling. It didn't allow to fully handle the vast breadth of arguments that compilers can take, and wouldn't allow to find out which arguments are related to preprocessing, and which aren't. Furthermore, gcc-type arguments are very lax, and allow things like "-includefile" and "-include file" to mean the same thing, making things even harder. And then there's rust, that allows "--emit=dep-info" and "--emit dep-info" to mean the same thing. We introduce a new generic API that can handle all sorts of arguments, based on flat descriptions of the arguments that can be handled, and proceed with defining more of the arguments that the various compilers support. While the argument list for gcc/clang is rather exhaustive, it is still based on a combination of what ccache handles, and what sccache used to handle, and might still miss some of the arguments that can take values. The argument list for msvc is not exhaustive and might need some improvement. For instance, I haven't actually checked whether some of the arguments allow non-concatenated forms.
First and foremost, to generically convey the type of compilation, we normalize the file extension, and will use that information to feed the right -x argument to the preprocessor and compiler. When the compiler command sccache is invoked with already contains a -x argument, we set the normalized file extension accordingly, such that it matches what the caller wants, rather than what the source file extension says. On the preprocessor end, we then always pass one of `-x c`, `-x c++`, `-x objective-c` or `-x objective-c++` according to the normalized file extension. On the compiler end, we always pass one of `-x cpp-output`, `-x c++-cpp-output`, `-x objective-c-cpp-output` or `-x objective-c++-cpp-output` accordingly. Note that we used to pass `-x objc-cpp-output` to gcc, but that's a deprecated form that it now warns about. The new form has been available since at least gcc 4.3, so is fine to use. And because a same source compiled as C or C++ will yield different object code, include the normalized extension when computing the hashed value. Fixes mozilla#163.
Some(Language) => { | ||
let lang = item.arg.get_value().map(OsString::from); | ||
let lang = lang.as_ref().map(|a| a.to_string_lossy()); | ||
language = match lang.as_ref().map(|a| a.as_ref()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a heck of a pile of as_ref
s.
Language::C => "c", | ||
Language::Cxx => "c++", | ||
Language::ObjectiveC => "objective-c", | ||
Language::ObjectiveCxx => "objective-c++", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort of a bummer that there are so many ways to spell these that you can't just have a common method on Language
.
Thanks for the nice refactor! |
See the descriptive commit messages. I verified that this fixes the issues with openvr on OSX with the new SDK (with @luser's patch from bug 1324892).
Unfortunately, this doesn't fix the large number of warnings, and it looks like the only solution for those is to build from the actual source instead of from the preprocessed output :(
This actually makes the refactor largely unnecessary, because fixing that will make distinguishing between preprocessor and other args much less useful. I do think it makes things clearer, though, so I didn't scrape it all.