
Ignore files that are ignored by git.#745

Closed
krobelus wants to merge 1 commit into nextcloud:master from krobelus:gitignore

@krobelus

This commit adds a new configuration option, similar to "sync hidden files".
By default, any file that would be ignored by git is not synced.
This introduces a dependency on libgit2.

Related issue: #26.

Some notes:

  • Is this something that we want? Otherwise we might turn it off by default so as not to change the behaviour.
  • I added git2 as a CMake dependency, but I don't know if this is enough to make it build everywhere.
  • Most of the code is basically copied from the logic that excludes hidden files, which could use some refactoring. In particular, the setting for whether to sync is stored per folder, but it can only be set once. Perhaps we want to improve on that before merging this.

@juliusknorr
Member

Thanks for your pull request @krobelus.

Is this something that we want? Otherwise we might turn it off by default as to not change the behaviour.

Can you describe the use case for this feature a bit more? I'm a bit concerned that it is rather specific, and I don't think that we should introduce such specific settings. cc @nextcloud/designers for the settings.

@jpnurmi
Member

jpnurmi commented Nov 5, 2018

Would it be a lot of work to teach the existing file exclusion system to handle gitignore's pattern formats? Then we could perhaps have a bit more generic "Import..." button to allow loading .gitignore or any other text files with a set of patterns? :)
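
A minimal sketch of what importing patterns from a text file might involve, assuming gitignore-style syntax (blank lines and `#` comments skipped, a leading `!` for negation, a trailing `/` for directory-only patterns). The names `IgnoreRule` and `parse_ignore_text` are hypothetical and not part of the client; escape sequences such as `\#` are not handled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IgnoreRule:
    pattern: str    # pattern text without the "!" prefix or trailing "/"
    negated: bool   # True for "!pattern" lines, which re-include matches
    dir_only: bool  # True for "pattern/" lines, which match directories only


def parse_ignore_text(text: str) -> list[IgnoreRule]:
    """Turn the text of a gitignore-style file into a list of rules (hypothetical helper)."""
    rules = []
    for raw in text.splitlines():
        line = raw.rstrip()
        # Blank lines and "#" comments carry no pattern.
        if not line or line.startswith("#"):
            continue
        negated = line.startswith("!")
        if negated:
            line = line[1:]
        dir_only = line.endswith("/")
        if dir_only:
            line = line.rstrip("/")
        if line:
            rules.append(IgnoreRule(line, negated, dir_only))
    return rules
```

Parsing is the easy part; how each rule is then matched against paths is where gitignore differs from simpler glob lists.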

@krobelus
Author

krobelus commented Nov 8, 2018

Files ignored by git are mostly automatically generated, so one may want to avoid uploading them.
I understand this is very specific; it does several different things at once and as such may be hard to implement in a generic way :(
It involves two distinct features:
- load exclude patterns from files
- have different exclude patterns for different directories (as in #26)
I reckon it wouldn't be too hard to implement loading text files containing some patterns.
But then we'd first need to handle relative exclude paths via directory-specific excludes, which is probably not easy to get right.
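
To illustrate why relative excludes need directory scoping, here is a rough sketch in which each pattern only applies to paths below the directory it was loaded from. The function `is_excluded` and the `rules_by_dir` mapping are hypothetical, and `fnmatchcase` is a stand-in for real gitignore matching (its `*` also matches `/`, which gitignore's does not).

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_excluded(path: str, rules_by_dir: dict[str, list[str]]) -> bool:
    """Return True if `path` matches a pattern scoped to one of its parent directories.

    `rules_by_dir` maps a directory (relative to the sync root) to the
    patterns that were loaded from that directory.
    """
    target = PurePosixPath(path)
    for directory, patterns in rules_by_dir.items():
        try:
            relative = target.relative_to(PurePosixPath(directory))
        except ValueError:
            continue  # path is outside this directory, so its patterns do not apply
        if any(fnmatchcase(relative.as_posix(), pattern) for pattern in patterns):
            return True
    return False
```

With rules such as `{"project/web": ["build/*"]}`, a file under `project/web/build/` is excluded while `project/api/build/` is left alone, which is the behaviour a single global pattern list cannot express.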

@jancborchardt
Member

I'd also like to hear specific use-cases or user stories for this. Especially a new dependency is not to be taken lightly I guess? @rullzer @camilasan

@icewind1991
Member

The use case here is syncing software projects without the overhead of syncing things like js/php/etc. dependencies (which often include a large number of small files that are slow to sync).

Having the client respect .gitignore files makes for an easy way of excluding all those files without any additional configuration.

@juliusknorr
Member

Would be fine by me. @rullzer @camilasan, what do you think?

@Stijn98s

would be amazing

@bes1002t
Member

bes1002t commented Apr 1, 2019

Why just gitignore? I think this is a little bit too specific. Why not make it possible to simply ignore the paths listed in a given file?

@Stijn98s

Stijn98s commented Apr 2, 2019

That would be a better option

@microtronics

Please add this feature!

@jenschurchill

Absolutely need this.
The Nextcloud client has been syncing for days, trying to process the black hole that is node_modules!

I would personally be quite happy with a more generic option that states: if a given file or folder exists inside a parent folder, ignore the entire parent folder. A parent folder with a .gitignore file or .git folder inside is usually already distributed to other computers and synced with git clone/pull/push, and I have no interest in having it automatically synced across my various PCs (in fact, I would very much prefer it wasn't!).

I would point out that ignore files such as .gitignore don't simply contain paths but patterns, so bes1002t's suggestion of scanning files for paths wouldn't work.

To my knowledge, gitignore set the standard for how ignore files tend to work, so the following file types use the same pattern format and might also be considered for inclusion:

gitignore, npmignore, coffeelintignore, dockerignore, atomignore, vscodeignore, eslintignore, prettierignore, stylelintignore

@krobelus
Author

Might be hard to get right. An easy way to resolve this on the user's end is to only add the bare repository (.git) to your Nextcloud (and not the working copy). Or use rsync --filter="dir-merge,- .gitignore" to sync files that are not ignored.

@jsaraiva

jsaraiva commented Aug 24, 2019

Although it doesn't solve the underlying issue (+1 for that, btw), you can skip the node_modules folder syncing by specifying the pattern "node_modules/" (without double quotes, with a slash at the end) in the client's ignore dialog. This may help you with your immediate problem.
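
As a rough illustration of what a directory-only pattern like `node_modules/` means in gitignore-style matching, the sketch below treats a path as skipped when any of its parent directories has that name. The helper `under_ignored_dir` is hypothetical and does not reflect how the client itself evaluates its ignore list.

```python
from pathlib import PurePosixPath


def under_ignored_dir(path: str, dir_name: str) -> bool:
    """Return True if any parent directory of `path` is named `dir_name`."""
    # Drop the final component (the file itself) so that a plain file
    # that happens to be called `dir_name` is not treated as a directory.
    parents = PurePosixPath(path).parts[:-1]
    return dir_name in parents
```

This also shows the limitation: a name-based rule hits every directory with that name, at any depth.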

This feature would really come in handy for me because of cases such as:

  • Ruby on Rails' tmp and log folders (I don't use ActiveStorage)
  • .NET's bin and obj folders
  • Unity's Library, obj, and Logs folders
  • Cordova's platforms and plugins folders
  • ReactNative's Android and iOS build folders
  • Other miscellaneous files, ignored by git too

I obviously can't just ignore most of these folders, as those names are pretty generic, and I would risk excluding relevant files.

@jenschurchill

jenschurchill commented Aug 24, 2019

@krobelus
I'm not sure I follow. If you're suggesting manually rsync'ing to avoid Nextcloud sync hell (hundreds of thousands of small files making the Nextcloud client eat CPU and battery like crazy before one notices, stops it, adds a new pattern, and resumes the client), I'd say that is very poor UX. But I believe you meant something else?

@jsaraiva
100% agree. Although I did add node_modules/ to the ignore patterns list, that wouldn't always be possible; as you state, it could easily have been a less specific folder name. I think @krobelus's work here is very much needed, and I realize that I am actually gravitating towards a different feature request: if ignore patterns supported ignoring the path matched by a capturing group, the regex (^\/.*)\/\.git\/ would completely solve the problem I'm facing and lend a lot more power to the ignore filter.

ie. "/home/user/projects/a-project/.git/" ends up ignoring "/home/user/projects/a-project"
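
A small sketch of how that regex behaves, using Python's `re` module purely for illustration (the client's ignore list does not support this; `repo_root_of` is a hypothetical helper). Group 1 captures everything before `/.git/`, which is the folder that would then be ignored.

```python
import re
from typing import Optional

# The pattern from the comment: group 1 is the path that precedes "/.git/".
REPO_ROOT = re.compile(r"(^\/.*)\/\.git\/")


def repo_root_of(path: str) -> Optional[str]:
    """Return the captured parent folder if `path` lies inside a .git directory."""
    match = REPO_ROOT.match(path)
    return match.group(1) if match else None
```

Because `.*` is greedy, a path containing more than one `/.git/` segment captures up to the last one, so nested repositories would need extra care.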

That being the case, I will stop polluting this thread and simply state that @krobelus's idea gets a +1 from me; it would be awesome.

Edit: Escaped backslashes in regex

@dmnq-f
Contributor

dmnq-f commented Sep 10, 2019

I think this issue was, in its general ambition, resolved by the great contribution of #1374, am I right? I would be happy if you could agree or disagree here so we can get this case closed and communicated 👍 🚀

@krobelus
Author

@DominiqueFuchs yeah, that's a nice solution for the directory-specific excludes.
It might be possible to add support for file exclusion systems like git's on top of that in future, though I doubt that's really necessary.
Closing, thank you!

krobelus closed this on Sep 10, 2019
@brainchild0

brainchild0 commented Oct 13, 2019

Out of curiosity, do other users really want a working copy of a Git repository to be synchronized to a Nextcloud server? Git and NC are both solutions for sharing and versioning files with servers and other clients, but for very different use cases. For me, if a Git working directory were in an NC-shared folder (a scenario I specifically avoid), then I would prefer that NC ignore the entire directory (not the one named .git but the one containing the one named .git).

I definitely understand the desire to keep build targets from synchronizing, but everything that is not a target is either scratch, which has limited long-term value, or source, which would, and certainly could, be replicated into a remote repository.
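
For illustration, the "ignore the directory containing .git" behaviour could be sketched roughly as below, assuming a plain filesystem walk. The function `find_working_copies` is hypothetical; `.git` is checked among files as well as directories because linked worktrees and submodules use a `.git` file.

```python
import os


def find_working_copies(sync_root: str) -> list[str]:
    """Return directories under `sync_root` that contain a `.git` entry."""
    found = []
    for current, dirnames, filenames in os.walk(sync_root):
        if ".git" in dirnames or ".git" in filenames:
            found.append(current)
            dirnames[:] = []  # do not descend into a working copy
    return sorted(found)
```

A sync client could treat every returned directory as excluded in its entirety, without needing to understand .gitignore patterns at all.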

@jenschurchill

@brainchild0 No, I would not want that at all.
If you read the last couple of comments I left, #745 (comment) and #745 (comment) (or maybe read the entire thread?), I think you'll notice that there are different concerns, and this issue is not trying to solve the one you're referring to.

@Stijn98s

@brainchild0 I use Nextcloud to share files between my laptop and desktop, and git projects are also synced in those folders. For me it's just so the files are synced between my laptop and desktop, not for versioning. That is why I would prefer that the excludes in the .gitignore are added to the Nextcloud excludes, because these are almost always library files and take up a lot of space.

@microtronics

For me it's the exact same issue: syncing between laptop and desktop...

@brainchild0

@Stijn98s @microtronics So you work on a development project alternating one system with the other, and you want your source files to be synchronized as soon as they change somewhere?

I suppose your use case is one I had not considered, as it is not how I normally work, but I understand the issue.

Not to be pedantic, but NC synchronizes at the file level, and has the potential to introduce failures in a development project because of a tree state not representing a snapshot described by some atomic commit operation. It also lacks any facility for conflict resolution at the granularity below the file.

You may be finding that you are having some success with using NC for your development projects, but have you considered just using a private working branch on a server? It could even be a different server from the one you might share with your colleagues. It wouldn't be automated (unless you ran a background robot), but it would give you assurance that operations are atomic at the tree level. Once you are ready to merge into a shared branch, you can rebase your history so no one sees your extra commits.
