Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.gitattributes linguist-vendored overrides not suppressing vendored files in diff #2206

Closed
dougnukem opened this issue Mar 6, 2015 · 51 comments

Comments

@dougnukem
Copy link

I'm trying to ensure that certain paths are suppressed in pull-request diffs and are treated as vendored files.

I setup a .gitattributes file specifying those paths as linguist-vendored as indicated here https://github.com/github/linguist#overrides

"Vendored files are also hidden by default in diffs on github.com."

node_modules/* linguist-vendored
cordova/* linguist-vendored
Godeps/_workspace/* linguist-vendored

Then I created a pull-request that adds a file in node_modules/ but I'm still seeing that file unsuppressed in the pull-request diff.

Am I doing something wrong or misunderstanding the purpose of the .gitattributes linguist-vendored?

@seanhealy
Copy link

I'm also seeing this and wondering the exact same thing.

It seems related to #714 which was closed years ago.

@dougnukem
Copy link
Author

What's odd is that it seems like node_modules/* is an explicit ignored file already so I shouldn't need to add it to the linguist-vendored but I'd like to be able to use the .gitattributes to manage per-project diff/pull-request ignored files.

@arfon
Copy link
Contributor

arfon commented Mar 6, 2015

Hi @dougnukem thanks for the question. Unfortunately the diff suppression isn't customizable, it's handled by the generated class here: https://github.com/github/linguist/blob/master/lib/linguist/generated.rb#L53-L69

vendored is only really used for language classifications/statistics and not for behaviour on GitHub.com. We'd eventually like to add support for overriding behaviour like this using the .gitattributes data but it's more complicated than just adding a new attribute similar to this one as we don't actually have git data available on the front-end machines when generating these diffs.

Also, incase it isn't obvious from this comment, this statement is unfortunately incorrect and we need to update the docs:

Vendored files are also hidden by default in diffs on github.com.

@dougnukem
Copy link
Author

Thanks for the clarification @afron.

So even without the .gitattributes linguist-vendored suppressing the diffs it looks like specifically node_modules is listed in generated.rb. So shouldn't anything in the node_modules be suppressed in the pull-request diff (https://github.com/dougnukem/linguist-test/pull/2/files)?

https://github.com/github/linguist/blob/master/lib/linguist/generated.rb#L57

https://github.com/github/linguist/blob/master/lib/linguist/generated.rb#L235-L240

   # Internal: Is the blob part of node_modules/, which are not meant for humans in pull requests.
    #
    # Returns true or false.
    def node_modules?
      !!name.match(/node_modules\//)
    end

@arfon
Copy link
Contributor

arfon commented Mar 6, 2015

Yep, I would expect that to work. This is definitely going to need further investigation...

@arfon arfon added the Bug label Mar 6, 2015
@seanhealy
Copy link

Thanks for your help @arfon!

@arfon
Copy link
Contributor

arfon commented Mar 15, 2015

@seanhealy this issue with the existing generated? check is now fixed.

@dougnukem
Copy link
Author

@arfon thank 👍 everything looks good now for generated file suppression in diffs!!

Are there any plans to extend that to allow per repo configuration using .gitattributes?

@arfon arfon removed the Bug label Mar 17, 2015
@arfon
Copy link
Contributor

arfon commented Mar 17, 2015

Are there any plans to extend that to allow per repo configuration using .gitattributes?

We'd definitely like to do this but it's going to require significant engineering work on GitHub.com so I don't have a timeline sorry.

@IonicaBizau
Copy link

From the REAMDE content:

Checking code you didn't write, such as JavaScript libraries, into your git repo is a common practice, but this often inflates your project's language stats and may even cause your project to be labeled as another language. By default, Linguist treats all of the paths defined in lib/linguist/vendor.yml as vendored and therefore doesn't include them in the language statistics for a repository. Vendored files are also hidden by default in diffs on github.com.

Use the linguist-vendored attribute to vendor or un-vendor paths.

$ cat .gitattributes
special-vendored-path/* linguist-vendored
jquery.js linguist-vendored=false

So, having <path> linguist-vendored=false is supposed not suppress the commit diffs.

jQuery.throwable is an example where this feature would be useful. The diffs are suppressed for the commits in the jquery.throwable.js file (except the first commit and the commits where there are changes in other files as well).

After a discussion with @jdennes I pushed the content of this repository into one of my private test repositories and added the .gitattribtes file:

public_html/js/jquery.throwable.js linguist-vendored=false

However, the diffs are still suppressed.

image

Is there any alternative/workaround for this? If not, how to understand the quoted readme content?

/cc @arfon

@arfon
Copy link
Contributor

arfon commented Apr 3, 2015

Hi @IonicaBizau - I'm sorry for the confusion about this. Basically our docs are wrong here. :-\

linguist-vendored=false will remove the file from the language statistics we calculate but it won't yet do what you want which is to change the behaviour of the diff suppression on GitHub.com. Because of the way we handle Git data in GitHub (the separation of file servers where Git operations happen and the application servers that run GitHub.com) we don't actually have the gitattribute data available to the front-end servers when we render the views.

Again, sorry for the confusion here. I'll open a PR to correct the docs.

@arfon
Copy link
Contributor

arfon commented Apr 22, 2015

/ cc #2349

@aocenas
Copy link

aocenas commented May 19, 2015

+1 for linguist-vendored=false in .gitattributes overriding the default diff suppression. We have a lot of our own code living in node_modules directory for various reasons and this is a real pain for us.

But more pain than that diffs are suppressed by default, is that when there are only suppressed diffs, the page layout is collapsed to 980px instead of being full width. This means that even if I manually click all suppressed diffs and open them, they usually have lots of wrapped lines which is unreadable in most circumstances. @arfon is there a possibility to do something about that? Like having full width layout for all diffs even if all are collapsed? Thanks.

@arfon
Copy link
Contributor

arfon commented May 19, 2015

@arfon is there a possibility to do something about that? Like having full width layout for all diffs even if all are collapsed?

Possibly although I'm not really involved in product development stuff like that here at GitHub so if you have a feature request then please email support@github.com so the request can be tracked.

+1 for linguist-vendored=false in .gitattributes overriding the default diff suppression. We have a lot of our own code living in node_modules directory for various reasons and this is a real pain for us.

I hear you loud and clear on this 😄. I'd love to get this shipped.

@hsuzukiml
Copy link

Hi @arfon, I wonder what's the status of this?
Content classifiers should always have options for manual overrides.

@arfon
Copy link
Contributor

arfon commented Oct 15, 2015

Still a work in progress I'm afraid.

@hsuzukiml
Copy link

Ok. I appreciate the update. Thanks for working on this!

@hsuzukiml
Copy link

Also, I wonder if this will make into the enterprise version, and whether it's more effective for me to try to raise this issue in a different channel?

@arfon
Copy link
Contributor

arfon commented Oct 16, 2015

Also, I wonder if this will make into the enterprise version, and whether it's more effective for me to try to raise this issue in a different channel?

@hsuzukiml - when we get this completed it should make it into GitHub Enterprise yes. It never hurts to email our Enterprise support teams to let them know this is a feature you'd like to see supported 😸

@dweinstein
Copy link

I tried using the linguist-vendored=false option to opt out of vendored for a nested node_modules directory that I want to see diffs in PRs and have the code indexed for search. so basically it's the inverse of the issue title... is this something that should actually work?

@arfon
Copy link
Contributor

arfon commented Feb 8, 2016

is this something that should actually work?

No, I'm afraid that's currently not supported 😞

@anru
Copy link

anru commented Mar 8, 2016

I have the same desire as @aocenas
cc @arfon

@ianwremmel
Copy link

Is this still being worked on? I'm working on several javascript projects that are committing into packages/node_modules. It would be great not to deal with clicking "load diff" in every code review.

@myitcv
Copy link

myitcv commented May 2, 2017

Bizarrely this did seem to be working a couple of months ago (sorry I can't be more specific)... but now seems to no longer be working. For example:

https://github.com/myitcv/react/pull/65/files

everything under _vendor should be considered vendored per:

https://github.com/myitcv/react/blob/1307fd5c32ce8beb4698aadd0db53c33628e7dea/.gitattributes#L1

@pklada
Copy link

pklada commented Jun 8, 2017

Did linguist-generated diff suppression ever make it to github enterprise? Trying to get it working on a project with lots of generated files, but it doesn't seem to be working (tried after reading: https://robots.thoughtbot.com/github-diff-supression)

@jspiro
Copy link

jspiro commented Jun 8, 2017 via email

@lildude
Copy link
Member

lildude commented Jun 8, 2017

@pklada it should have. Enterprise 2.10 was released this week so will offer very similar functionality as GitHub.com right now. I'm not sure off the top of my head when this was made available on GitHub but it may also be in 2.9 released on 1 March.

You can hover your pointer over the octocat icon at the bottom of any page on your instance to determine the version you're running.

@pklada
Copy link

pklada commented Jun 8, 2017

^ ah okay awesome. We're still on 2.8, going to try again after upgrading.

@myitcv
Copy link

myitcv commented Jul 27, 2017

Ok, having re-read this thread, I'm a little confused as to what behaviour to expect on github.com (not enterprise) with respect to linguist-vendored and diff suppression.

https://github.com/github/linguist/pull/2304/files (from back in 2015) added the following wording to the README.md:

Use the linguist-vendored attribute to vendor or un-vendor paths. Please note, overriding the vendored (or un-vendored) status of a file only affects the language statistics for the repository and not the behavior in diffs on github.com.

But that wording no longer appears in README.md.

So does that mean linguist-vendored is, at the time of writing, expected to suppress diffs on github.com?

@lildude
Copy link
Member

lildude commented Jul 27, 2017

@myitcv

So does that mean linguist-vendored is, at the time of writing, expected to suppress diffs on github.com?

Nope. That section was incorrect when it was added to the readme and removed in #2714. To quote the PR OP:

Vendored files are treated just like any other files when diffing on github.com.

... that is, they are not suppressed.

@myitcv
Copy link

myitcv commented Jul 27, 2017

@lildude - thanks for clarifying.

To ask the perhaps obvious follow up question, given that linguist-vendored does not influence diff suppression, is there any way to suppress diffs for certain paths? .gitattributes or otherwise?

@lildude
Copy link
Member

lildude commented Jul 27, 2017

Yup linguist-generated will suppress diffs as pointed out further up in this thread.

@myitcv
Copy link

myitcv commented Jul 27, 2017

@lildude - thanks. Apologies, I mentally skipped anything to do with linguist-generated on the basis vendored does not imply generated.

Nonetheless, that solution works well.

@gluons
Copy link

gluons commented Aug 8, 2017

Does linguist-generated exclude generated files from language stat?

@lukes
Copy link

lukes commented Aug 15, 2017

@gluons
Copy link

gluons commented Aug 15, 2017

@lukes But I've already added linguist-generated in my repo but it still show in language stats. (dist folder)

@lukes
Copy link

lukes commented Aug 15, 2017

@gluons Hmm, weird. According to this https://github.com/github/linguist#vendored-code everything in this file should be excluded automatically without you needing to add anything https://github.com/github/linguist/blob/master/lib/linguist/vendor.yml so not sure why it's not working for you. Maybe raise an issue?

@gluons
Copy link

gluons commented Aug 15, 2017

@lukes OK. I'll try to adjust my repo again. If it still not work, I'll raise an issue.

@lildude
Copy link
Member

lildude commented Aug 15, 2017

@gluons

But I've already added linguist-generated in my repo but it still show in language stats. (dist folder)

Don't confuse counting towards the language stats with having the language correctly identified and returned in the search results you get when clicking the language bar. The search results return all files idetified whilst the language stats only counts those it should.

I've checked your repo and the dist files are not being included in the language stats calculation.

@lildude
Copy link
Member

lildude commented Aug 15, 2017

@gluons, if it helps the JavaScript total comes from two files: autoload.js and karma.conf.js. All others are ignored by Linguist.

@gluons
Copy link

gluons commented Aug 15, 2017

@lildude Right! I'm going adjust to my .gitattributes. Thank you. 🖖

@gluons
Copy link

gluons commented Aug 15, 2017

I've checked your repo and the dist files are not being included in the language stats calculation.

@lildude How can I check it by myself?

@lildude
Copy link
Member

lildude commented Aug 15, 2017

@gluons you'll need to install Linguist on your system and then check the repo using bundle exec linguist --breakdown as per https://github.com/github/linguist#usage

@gluons
Copy link

gluons commented Aug 15, 2017

@lildude OK. 🙇 Thank you. I'll try.

@rattrayalex-stripe
Copy link

rattrayalex-stripe commented Sep 12, 2017

For the benefit of others scrolling through a rather long thread, using this in your .gitattributes should now suppress diffs for files with the extension .whatever on Github.com and Github Enterprise 2.10 and greater:

*.whatever linguist-generated

(please correct me if I have summarized poorly)

@lildude
Copy link
Member

lildude commented Oct 25, 2017

To summarise, as of today, 25 Oct 2017:

  • vendored files are not suppressed in diffs
  • generated files are suppressed in diffs
  • any file/directory can be marked as generated using an over-ride: *.whatever linguist-generated
  • any file/directory can be excluded from being detected as generated using an over-ride: *.whatever linguist-generated=false
  • any file/directory can be marked as vendored using an over-ride: *.whatever linguist-vendored
  • any file/directory can be excluded from being detected as vendored using an over-ride: *.whatever linguist-vendored=false

All of this is detailed in the README.md so I think we can close this issue now.

If you are not seeing this behaviour or it changes in future, please open a new issue.

@lildude lildude closed this as completed Oct 25, 2017
@github-linguist github-linguist locked and limited conversation to collaborators Oct 25, 2017
@lildude
Copy link
Member

lildude commented Oct 25, 2017

I've locked the conversation as it is already incredibly long and hard to follow and I'd prefer not to add more complexity if it can be avoided. A lot has changed in over 2 years so it's best to keep things "fresh".

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests