Generator output contains duplicate lines #205

Open
huush opened this issue Nov 20, 2023 · 3 comments

Comments


huush commented Nov 20, 2023

The Generator concatenates the complete content of the individual *.gitattributes files, so the result contains duplicated lines.

Is this an issue?

For example, "* text=auto" will ordinarily be overridden by subsequent rules, reading top to bottom.

What happens if there are then multiple lines of "* text=auto" throughout the file? Does this cause some form of reset of the overrides? Can this generated file actually be used, or do we first need to filter this combined file and remove all duplication?

Author

huush commented Nov 20, 2023

Also, the linked Generator does not appear to pick up the files in the Global and Community folders.

Member

alexkaratarakis commented Dec 3, 2023

> What happens if there are then multiple lines of "* text=auto" throughout the file? Does this cause some form of reset of the overrides? Can this generated file actually be used, or do we first need to filter this combined file and remove all duplication?

That is a good point! Let's explore. I made a toy repository with just 2 files:

.gitattributes
 test.cpp

git check-attr --all test.cpp reports the attributes that git applies to the file. Here are the results for each of four .gitattributes files:
(git version 2.34.1)

(1)

.gitattributes:
* text=auto
*.cpp binary
*.cpp text

Output:
test.cpp: binary: set
test.cpp: diff: unset
test.cpp: merge: unset
test.cpp: text: set

(2)

.gitattributes:
*.cpp binary
*.cpp text
* text=auto

Output:
test.cpp: binary: set
test.cpp: diff: unset
test.cpp: merge: unset
test.cpp: text: auto

(3)

.gitattributes:
* text=auto
*.cpp text
*.cpp binary

Output:
test.cpp: binary: set
test.cpp: diff: unset
test.cpp: merge: unset
test.cpp: text: unset

(4)

.gitattributes:
*.cpp text
*.cpp binary
* text=auto

Output:
test.cpp: binary: set
test.cpp: diff: unset
test.cpp: merge: unset
test.cpp: text: auto

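The results are interesting. It seems like the last matching entry takes precedence for each attribute, so text is auto, set, or unset depending on which entry comes last. binary also seems to "win" regardless of where it is: it is always set in the above cases, and it even unsets text when it comes last. That would be consistent with binary being a macro for -diff -merge -text, since nothing later in these files touches binary, diff, or merge.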
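To make the observed behavior easier to reason about, here is a rough Python model of the resolution rule the four cases suggest. It is only a sketch under assumptions: the function names (parse_attr, resolve) are made up for illustration, patterns are matched with plain fnmatch against the file name (no directories, no negation), and the only macro handled is binary, treated as -diff -merge -text. It is not how git is implemented.

```python
from fnmatch import fnmatch

# Assumption: model only the built-in "binary" macro (-diff -merge -text).
MACROS = {"binary": ["-diff", "-merge", "-text"]}


def parse_attr(token):
    """Turn one attribute token into a (name, state) pair."""
    if token.startswith("-"):
        return token[1:], "unset"
    if token.startswith("!"):
        return token[1:], "unspecified"
    if "=" in token:
        name, value = token.split("=", 1)
        return name, value
    return token, "set"


def resolve(lines, path):
    """Apply every matching line top to bottom; later lines overwrite
    earlier ones, one attribute at a time."""
    attrs = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *tokens = line.split()
        if not fnmatch(path, pattern):
            continue
        for token in tokens:
            name, state = parse_attr(token)
            attrs[name] = state
            if state == "set":
                # Expand a macro into the attributes it stands for.
                for extra in MACROS.get(name, []):
                    extra_name, extra_state = parse_attr(extra)
                    attrs[extra_name] = extra_state
    return attrs


if __name__ == "__main__":
    case_1 = ["* text=auto", "*.cpp binary", "*.cpp text"]
    case_3 = ["* text=auto", "*.cpp text", "*.cpp binary"]
    print(resolve(case_1, "test.cpp"))
    print(resolve(case_3, "test.cpp"))
```

Under this model, a repeated "* text=auto" later in a combined file would reset text to auto for every file it matches, which is what cases (2) and (4) show.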

This isn't immediately obvious, so a Generator could flatten and deduplicate entries to avoid possible confusion, while making sure the result matches what git would have done if all entries were present.
On the other hand, keeping a separate section for each .gitattributes file you used is very similar to what one would do manually (without a generator), and it allows further manual modification and future updates, because every section is exactly as it appears in the source.
I think it could go either way (maybe an option?).
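As one possible shape for the deduplication option, here is a small Python sketch. It is hypothetical and not part of any existing Generator: dedupe_keep_last is an invented name, and it only drops exact duplicate rule lines (after collapsing whitespace). It keeps the last occurrence of each rule, on the assumption that later matching lines win, so removing an earlier identical line should not change the result.

```python
def dedupe_keep_last(lines):
    """Drop exact duplicate rule lines, keeping the LAST occurrence.

    Comments and blank lines are left untouched so that each section
    still reads like its source file.
    """
    seen = set()
    kept_reversed = []
    for line in reversed(lines):
        rule = " ".join(line.split())  # collapse runs of whitespace
        is_rule = bool(rule) and not rule.startswith("#")
        if is_rule:
            if rule in seen:
                continue  # an identical rule appears later; drop this one
            seen.add(rule)
        kept_reversed.append(line)
    return list(reversed(kept_reversed))


if __name__ == "__main__":
    combined = [
        "* text=auto",
        "*.cpp text",
        "# C#",
        "* text=auto",
        "*.cs text",
    ]
    for line in dedupe_keep_last(combined):
        print(line)
```

Keeping the first occurrence instead would be the riskier choice, because it could move a rule such as "* text=auto" ahead of entries it currently overrides.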

Note that the behavior would be easily observable. If binary is set, then we get the following with git diff:

diff --git a/test.cpp b/test.cpp
index cc3f56d..1df2af6 100644
Binary files a/test.cpp and b/test.cpp differ

whereas if it is not set, then:

diff --git a/test.cpp b/test.cpp
index cc3f56d..1df2af6 100644
--- a/test.cpp
+++ b/test.cpp
@@ -1 +1,2 @@
 // blah
+// blah2

Similarly, with customized diff options, the diff would show the "winning" setting.

Member

alexkaratarakis commented Dec 3, 2023

> Also, the linked Generator appears to not pick up the files in Global and Community

You are right. The subfolders are recent additions. The generator(s) have not been updated to take this into account.
