Recommend using set/unset gitattributes (vs. true/false strings) #4624
This makes sense to me. I'm not sure what the historical reason for using the string "false" was, though, as that choice was made over 5 years ago. That said, we can't start suggesting this as it won't work for the negation thanks to: That would need to be changed to account for both the old behaviour and the correctly interpreted behaviour you're proposing. Additional tests would also need to be added. Feel free to update this PR with those changes if you want.
Technically, it isn't, but I can make the change once we've got Linguist working
Can you clarify what doesn't work? When I opened the issue, I wasn't able to test linguist behavior locally because I couldn't get it running after battling Ruby stuff for about 30 minutes. But I did test it on GitHub and setting/unsetting attributes does seem to work as expected (i.e., GitHub correctly interprets them the same way that git does). Here's a demo: https://github.com/cespare/misc/pull/4/files There, using
When I was trying to understand the code, it seemed that this work is done by the but then I couldn't figure out what the
Sure. The negation, eg The behaviour is confirmed locally too:
You're not testing the Linguist functionality yet 😄.
If the ... to then determine if Linguist should treat the file as generated or not when performing the language analysis.
🤔 I'm starting to question myself now I actually play with this directly 😁
Are you saying that GitHub runs code which is not linguist but which uses the
Oh, I see my mistake now.
It's not regardless of the value... it needs to equate to
Yes. There are many places we use Linguist methods directly without running a full language analysis.
Oh yes, and I don't claim to have any knowledge of where all of these places are and why 😁
I'm not sure I follow. I was asking if GitHub is reading the You mentioned things like "language detection" and "full language analysis" a few times, but I'm not sure why they necessarily need to be involved. I used
Yeah, I was assuming/hoping that rugged returns
Sorry about the confusion. Yes, the diff code is taking advantage of some of the Linguist methods but it is reading the
Yes it is, and it's not the problem I've highlighted. That's why I said your example is only showing how the diffing code behaves and not the language detection, which is really Linguist's primary function. I see my initial explanation didn't really make this clear. Sorry.
It does. I think the problem with the language analysis comes about because the
This quick change seems to do the trick for negation:

diff --git a/lib/linguist/lazy_blob.rb b/lib/linguist/lazy_blob.rb
index 389cc243..cf5a9294 100644
--- a/lib/linguist/lazy_blob.rb
+++ b/lib/linguist/lazy_blob.rb
@@ -38,7 +38,7 @@ module Linguist
     end

     def documentation?
-      if attr = git_attributes['linguist-documentation']
+      if attr = git_attributes['linguist-documentation'].to_s
         boolean_attribute(attr)
       else
         super
@@ -46,7 +46,7 @@ module Linguist
     end

     def generated?
-      if attr = git_attributes['linguist-generated']
+      if attr = git_attributes['linguist-generated'].to_s
         boolean_attribute(attr)
       else
         super
@@ -54,7 +54,7 @@ module Linguist
     end

     def vendored?
-      if attr = git_attributes['linguist-vendored']
+      if attr = git_attributes['linguist-vendored'].to_s
         return boolean_attribute(attr)
       else
         super
@@ -102,7 +102,7 @@ module Linguist

     # Returns true if the attribute is present and not the string "false".
     def boolean_attribute(attribute)
-      attribute != "false"
+      attribute.to_s.eql?('true') ? true : false
     end

     def load_blob!

... though I've not thoroughly tested this or thought this through too closely (it's the end of my day).
I was using "unset" in the specific sense of the gitattributes man page; that is, explicitly marked as unset via I don't think your proposed changes really make sense to me. Strings, even empty strings, are always truthy in Ruby, so
will always take the first branch no matter what, correct? I need to think about this more too. I'll try to get this Ruby code working on my machine again so I can try stuff out locally...
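To make that objection concrete, here is a minimal Ruby sketch. The names generated_with_to_s and fallback are hypothetical stand-ins (for a predicate such as generated? and for the call to super); this is not Linguist code. It also assumes that the attribute lookup yields nil when nothing is specified for a path.

```ruby
# Stand-in for the proposed `boolean_attribute` from the diff above.
def boolean_attribute(attribute)
  attribute.to_s.eql?('true')
end

# Hypothetical stand-in for a predicate such as `generated?`.
# `raw` plays the role of git_attributes['linguist-generated'];
# `fallback` plays the role of `super` (the default detection).
def generated_with_to_s(raw, fallback)
  if attr = raw.to_s # to_s always returns a String, and every String is truthy
    boolean_attribute(attr)
  else
    fallback # never reached, even when raw is nil
  end
end

# Assumed possible lookup results: unspecified, set, unset, and two string values.
[nil, true, false, 'true', 'false'].each do |raw|
  puts "#{raw.inspect} -> #{generated_with_to_s(raw, :default_detection).inspect}"
end
```

In this sketch every input, including nil, is answered by boolean_attribute, so the fallback is never returned. An unspecified attribute would therefore be read as "off" instead of deferring to the default detection.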
This pull request has been automatically marked as stale because it has not had recent activity, and will be closed if no further activity occurs. If this pull request was overlooked, forgotten, or should remain open for any other reason, please reply here to call attention to it and remove the stale status. Thank you for your contributions.
@lildude any thoughts about my last message?
Ah yes. Good point. As I said, I didn't put much thought or testing into my quick suggestion 😄 We're still going to need to come up with a solution for Linguist.
Sorry about the delay in coming back to this. I've worked out the necessary code changes so that we can honour the PR coming 🔜.
I'm going to pull your doc changes into my PR and make it a co-authored commit when I merge so you don't lose the credit.
Closing in favour of #4780.
@lildude thanks!
Hi! I was going to file an issue, but the change I want is small (doc changes in README.md) so I just sent a PR for discussion.
For context, see the git documentation for gitattributes. There are four states for an attribute: "set", "unset", "set to a value", and "unspecified". All of linguist's special attributes except for linguist-language are boolean attributes. For these, linguist considers the option to be "on" if the attribute is set or set to the string "true". It considers the option to be "off" if the attribute is unset or set to the string "false". That is, these are equivalent to linguist:
as are these:
I found this confusing: linguist planted the notion that the strings "true" and "false" are special, so when I read
on the gitattributes man page I did not understand that those strings are not special to gitattributes, only linguist. (The gitattributes man page could probably be more clear too.)
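As a rough illustration of the mapping described above, here is a small Ruby sketch. The helper name linguist_flag is hypothetical, and the sketch assumes that an attribute lookup reports the four states as true (set), false (unset), a String (set to a value), and nil (unspecified). It is not how linguist is implemented.

```ruby
# Hypothetical helper: classify a boolean linguist attribute the way the
# description above reads it. Assumed inputs: true (set), false (unset),
# a String (set to a value), nil (unspecified).
def linguist_flag(value)
  case value
  when true, 'true'   then :on          # set, or set to the string "true"
  when false, 'false' then :off         # unset, or set to the string "false"
  when nil            then :unspecified # no opinion expressed for this path
  else                     :other_value # any other string; not covered above
  end
end

[true, 'true', false, 'false', nil].each do |value|
  puts "#{value.inspect} -> #{linguist_flag(value)}"
end
```

The point of the sketch is that "true" and "false" only gain meaning in the two string branches, which are linguist's convention. The true, false, and nil branches correspond to states that gitattributes itself defines.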
It's obviously too late to change linguist's behavior, but I propose that we change the documentation to promote set/unset vs. "true"/"false". Set/unset is the functionality for boolean options built right into gitattributes; "true"/"false" is special to linguist. (I initially stumbled onto this while looking at the gitattributes file at my company and wondering why binary is specified like that but linguist-generated=true uses "true".)
Furthermore, note that the examples for linguist-vendored and linguist-documentation in the README currently mix the set/unset and "true"/"false" styles:
This seems extra confusing.
Finally, I think the example on the help page Customizing how changed files appear on GitHub should be changed from
to
(This might not be the right place to request that change.)