Next Gen Ignores: Requirements #2491

Open
calmh opened this issue Nov 19, 2015 · 95 comments

Comments

Member

@calmh calmh commented Nov 19, 2015

This issue replaces the ones that relate to fundamental issues with how our ignores work. The following is the up to date list of requirements:

Certain Requirements

  • Must be understandable by someone who understands the interface in use by Dropbox, OneDrive, etc.
  • Must be a simple tree view with checkboxes
    • Root of tree (=folder root) checked by default
    • Can uncheck root and then check just the branches that we want synced
  • Must be able to black list extensions
  • Must live in config (not .stignore files)
  • Must "live update". That is, if I ignore a file that was going to be pulled, we should not pull it. If I unignore a file that we are missing, it should be pulled shortly thereafter.
  • Must be able to specify if ignored files should be preserved or removed on directory delete.
    • Per what though? File extension? Directory?
  • Power user access to flexible, underlying patterns
  • Be able to remove files for folders that are not picked, as in remove things that are now ignored.
  • Have inclusion and exclusion mode, ignore everything, include X, versus include everything, ignore X.

Maybe?

  • Per directory file extension black lists ("never sync *.tmp")
  • Per directory file extension white lists ("only sync *.doc")
  • Should sync between devices
  • Ability to ignore a directory by the presence of a special file in it (cachedir.tag).

Comments and discussion below. A project maintainer will keep the list above in sync as the requirements change or are clarified. Note that this is not a list of problems with the current system, but requirements for a new system.

@crccheck crccheck commented Nov 19, 2015

Why not also keep .stignore, but make it sync between devices as .gitignore and .hgignore do? That way, there's a developer-friendly way to keep files out. As someone who had to delete 100k node_modules files yesterday (not to mention .DS_Store, packages, build, .tox, etc), that would have saved me 3 gigs of bandwidth and time.

There's two distinct user stories:

  1. As a user of Syncthing who also knows Dropbox, I want a familiar interface for selectively syncing files so I only sync what I care about
  2. As a developer using Syncthing, I want a familiar interface for selectively syncing files globally so I only sync what I care about without additional set up every new install.

Member Author

@calmh calmh commented Nov 19, 2015

A .fooignore file might come naturally to developers, but that's really the only demographic that expects a hidden file with a bizarre extension (think Windows). I think it's a developer accident that it exists at all in its current form today - it's really a piece of folder configuration as much as anything else. Not having it in the folder itself also works better with read only folders - perhaps you want to share the contents of a DVD-ROM... (This also requires handling the non-existence of .stfolder, but there's a ticket for that.)

Whether it syncs or not is really somewhat beside the point of the existence of the file, but I do think there should be at least an option to sync the patterns, yes.

(You can do that today with a little #include hackery as well.)


To bring it more in line with the actual ticket; what's the actual requirement that having an .stignore on disk fulfills?

Contributor

@cdhowie cdhowie commented Nov 19, 2015

I don't care where the ignore data is stored, but patterns allow me to express ignores in a concise manner and with more flexibility than a tree option. Please at least retain this functionality, if not in a file called .stignore then somewhere else. None of the options provided in this ticket summary will meet my needs.

If the current pattern-based ignore option is removed I'm afraid I may have to stop upgrading Syncthing, as versions without pattern-based ignore wouldn't meet my needs.

Member Author

@calmh calmh commented Nov 19, 2015

@cdhowie What are the needs, that are not covered by the tree options and the extension white/black lists?

Contributor

@cdhowie cdhowie commented Nov 19, 2015

There is no accommodation for prefixes in the current proposal. I can't be the only one with a ._* ignore.

Neither is there a way to express a pattern along the lines of "ignore all objects named bin under directory foo" whereas we can do that today with

/foo/bin
/foo/**/bin

I'm not asking you not to implement a nicer interface for new users, I'm asking you not to sacrifice the power users on the altar of the masses.
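To make the two patterns above concrete, here is a minimal sketch of how such rooted globs could be evaluated. It is a simplification under stated assumptions, not Syncthing's actual matcher: `glob_to_regex` and `is_ignored` are hypothetical names, every pattern is treated as rooted at the folder, `*` stays within one path segment, and `**` may cross `/`.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified, rooted ignore glob into an anchored regex.

    Assumed rules (not Syncthing's real implementation):
    '**' matches any characters including '/', '*' matches within a
    single path segment, everything else matches literally.
    """
    pattern = pattern.lstrip("/")
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")        # may cross directory boundaries
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # stays inside one segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")


def is_ignored(path: str, patterns) -> bool:
    """Return True if any pattern matches the folder-relative path."""
    return any(glob_to_regex(p).match(path) for p in patterns)


PATTERNS = ["/foo/bin", "/foo/**/bin"]
```

With these two lines, `foo/bin` and `foo/a/b/bin` are ignored while `bar/bin` and `foo/cabin` are not, which is the kind of rule a pure tree view plus extension list cannot express.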

Member Author

@calmh calmh commented Nov 19, 2015

Well, currently the masses are being led to the slaughter en masse as they try to work the ignore system, so some sort of balance there is required. :D

The prefix thing is a good point, and should probably be supported somehow. I need to think about the last one...

Member

@AudriusButkevicius AudriusButkevicius commented Nov 19, 2015

So the two could coexist, but power users will have to grasp the order in which they get executed, etc.

Member Author

@calmh calmh commented Nov 19, 2015

Well, yes. At some point whatever the user is configuring, graphically or text based or however, is going to get compiled into some list of match objects of some kind. Today that's a list of regexps, which unfortunately is a bit inefficient to evaluate. But something like that; whatever it ends up being could of course be exposed in some way so that they can be input directly into the config or something.
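As a rough illustration of "compiled into some list of match objects", the sketch below turns a hypothetical simple-mode configuration (checked folders plus blacklisted extensions, root unchecked) into an ordered list of regex matchers where the first match decides. The names `Matcher`, `compile_simple_config` and `is_ignored`, and the ordering of the rules, are assumptions made for the example, not an existing design.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Matcher:
    regex: re.Pattern
    ignore: bool  # True: matching paths are ignored; False: they are synced


def compile_simple_config(checked_dirs, blacklisted_exts):
    """Compile a hypothetical simple-mode selection into ordered matchers."""
    matchers = []
    # Assumption: the extension blacklist is evaluated first, so it wins
    # even inside a checked folder.
    for ext in blacklisted_exts:
        matchers.append(Matcher(re.compile(r".*\." + re.escape(ext)), True))
    # A checked folder and everything beneath it is synced.
    for d in checked_dirs:
        matchers.append(Matcher(re.compile(re.escape(d.strip("/")) + r"(/.*)?"), False))
    # Root unchecked: anything not matched above is ignored.
    matchers.append(Matcher(re.compile(r".*"), True))
    return matchers


def is_ignored(path, matchers):
    """First matcher whose regex matches the whole path decides."""
    for m in matchers:
        if m.regex.fullmatch(path):
            return m.ignore
    return False
```

For example, `compile_simple_config(["photos/2015"], ["tmp"])` syncs `photos/2015/a.jpg` but ignores `photos/2015/a.tmp` and `photos/2014/a.jpg`. The same list could in principle be what an advanced view exposes for direct editing.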

Contributor

@cdhowie cdhowie commented Nov 19, 2015

whatever it ends up being could of course be exposed in some way so that they can be input directly into the config or something.

I would absolutely be happy with that resolution, if there is a (somewhat) easy way to edit this in the GUI. Like an "advanced" button at the bottom of the simple UI that opens the list of compiled patterns, letting me fiddle with them.

(I'd be willing to contribute dev time to making that happen, as this is an important feature for me.)

Member Author

@calmh calmh commented Nov 19, 2015

Possibly. The way I could envision it in that case is that you'll probably have a simple default layout with a tree view and a list of extensions or something. Switch to advanced mode and you get the compiled patterns or something and can edit them at will. But once you've done that, you probably won't be able to switch back, as what you created in the advanced mode might not be expressible in the simple mode.

Contributor

@cdhowie cdhowie commented Nov 19, 2015

But once you've done that, you probably won't be able to switch back, as what you created in the advanced mode might not be expressible in the simple mode.

Same thing I was thinking. Or, you can switch back, but you'll get a warning that your manually-modified patterns will get destroyed if you do.

Member

@AudriusButkevicius AudriusButkevicius commented Nov 19, 2015

I think we can keep what we have today and keep it the same way it is today, just add a thing before ignores which is the dir selector.

Member Author

@calmh calmh commented Nov 19, 2015

Okay I added that to the requirements

Contributor

@cdhowie cdhowie commented Nov 19, 2015

Thanks @calmh. If you are in need of dev power to implement this feature let me know. I don't have a lot of time but I'm willing to spare some for ST.

@mkroehnert mkroehnert commented Nov 19, 2015

@crccheck keep in mind that storing .stignore files inside the directories to be synced leaks metadata if you copy the directory to a USB stick for example.
You might not want other people to know which program you are using for synchronization and which files are ignored.

@Schroedingers-Cat Schroedingers-Cat commented Nov 20, 2015

I think it is important to differentiate between ignoring as part of setting up a new repository and ignoring as part of setting up a new node.
When you create a new repository and want to exclude, for example, all binaries prefix_*.extension from the subdirectory /foo/bin, you don't want to see these files on any new node you sync this folder to. That's why exclusions expressed at this stage as part of the folder setup should be considered essential information and thus be synced automatically to any new node. What is excluded at this stage should never get any attention from Syncthing (for that repository).
In contrast to that is ignoring as part of setting up a new node. Users likely have a device to which they don't want to sync the entire repository but only a specific subfolder. This information is device specific and should only be cared about on that node.
To summarize, ignoring at the repository level is some global filter that every device applies while ignoring at the node level is a filter that is specific to that device only.

Ignoring at the node level is probably what most users want as this is what Dropbox/Onedrive etc. offer. You have a tree list and uncheck a folder/file you don't want to download to that node. A simple interface with a folder tree, maybe even a file list, with checkboxes should suffice.
Ignoring at the repository level is what advanced users do. They care about not syncing Thumbs.db or some config.dat files of a subdirectory containing device-specific information like its screen resolution. Even something simple like excluding files based on their extension means that they activated the option "show file extensions" in the explorer (which is what most users don't do). I don't think that these users need a more user friendly way to deal with exclusions because if they can express their needs in a form like "filter out all docx-files" they should be able to put it like *.docx. So instead of switching between simple mode and editable compiled patterns (you called it "advanced mode"), start with the simple folder tree list that configures at the node level and add a button called something like global ignore patterns or advanced mode where power users can input ignore patterns (like we have now) for the repository level.
This way, power users have fine grained control about what sort of data a syncthing repository contains and both most users and power users have a simple method to control what portion of that syncthing repository a certain node downloads/uploads to other nodes.
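The two levels described above can be pictured as two filters applied in sequence. The following is only a sketch under assumptions: `should_sync` is a hypothetical helper, repository-level ignores are reduced to file-name globs, and the node-level selection is a plain list of checked subfolders.

```python
from fnmatch import fnmatchcase
from posixpath import basename


def should_sync(path, repo_ignores, node_selected):
    """Apply the repository-level filter first, then the node-level one."""
    # Repository level: name globs that every device applies.
    if any(fnmatchcase(basename(path), glob) for glob in repo_ignores):
        return False
    # Node level: this device only takes the subtrees it has checked.
    return any(path == d or path.startswith(d + "/") for d in node_selected)


# Illustrative data only.
REPO_IGNORES = ["Thumbs.db", "*.docx"]   # shared by all nodes
LAPTOP = ["projects", "photos"]          # per-device selections
PHONE = ["photos/2015"]
```

Here `photos/2015/Thumbs.db` is skipped on every node, while `projects/main.c` reaches the laptop but not the phone.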

That is why I like @AudriusButkevicius's suggestion:

I think we can keep what we have today and keep it the same way it is today, just add a thing before ignores which is the dir selector.

This is good for all users. Plus it doesn't involve rewriting the entire system, just adding the "simple node-level" sync system and its GUI. (Not that the system would be simple to implement, but simple to use).

Also, repository-level ignore patterns don't need to reside in a .stignore file, they could also be saved within the syncthing configuration folder and synced between nodes as metadata.

Member

@AudriusButkevicius AudriusButkevicius commented Nov 20, 2015

Also, another thing people asked for is synced ignores and global ignores (applied to all folders)

@sneak sneak commented Nov 28, 2015

If set to a non-random sync priority (e.g. newestFirst), the puller getting hung up on a directory delete with ignored files inside it (e.g. .DS_Store) means the folder stops syncing until the logs are checked for why (and the directory manually deleted). The default of 'random' seems to mitigate the impact of this, if I'm understanding it correctly.

@katrinleinweber katrinleinweber commented Nov 30, 2015

I'd like to chip in a pretty good example of visualising exclusions from a folder tree: Crashplan. The greyed-out & red-marked files are excluded via filters:

[Image: Crashplan file list with exclusion rule results]

What Syncthing could improve over this design is the ability to right-click a file or folder & quickly access the menu to change the rules. E.g. the context menu could have the option in-/exclude files/folders like this one, depending on whether an already excluded or included file is right-clicked. This would open an edit filters panel on the side, in which changes trigger a preview in the folder tree. The preview could highlight & explain what would happen to the file/folder set (in colorblind-safe manner).

Including a specific file or sub-folder that is excluded by a rule should be as simple as clicking the greyed-out check-box.

Member

@capi capi commented Nov 30, 2015

What would also be good would be a policy per directory (as advanced setting), if new sub-directories are to be added automatically or not. For some directories it's desirable to automatically add new children, for others, it's not.

Member

@imsodin imsodin commented Jan 2, 2020

As the sync part is just a maybe point in this issue, and there may (sic!) be a good enough solution to achieve it within the current system, let's discuss that in a separate place (and open an issue if it turns out to be doable/actionable): https://forum.syncthing.net/t/syncing-ignore-patterns/14272

I already put up some initial thoughts there as an incentive to actually move the discussion there ;)

@gabriel-fallen gabriel-fallen commented Jan 4, 2020

What do you think about ignoring everything mentioned in .gitignore/.hgignore/etc. found in the directories subject to synchronization, relative to the ignore file? The same way recent utilities like the Silver Searcher, Ripgrep and fd-find behave by default?

Obviously, as with those programs, we need a way to switch off this default, but as long as this feature is aimed at developers we can just place special un-ignore files in appropriate directories.

@tengwar tengwar commented Jan 4, 2020

@gabriel-fallen IIRC the utility-independent file name that most tools like ripgrep support is .ignore.

And in the first post there is a requirement that says:

Must live in config (not .stignore files)

So I don't think a file-based solution is going to be accepted.

@gabriel-fallen gabriel-fallen commented Jan 5, 2020

@tengwar first, Ripgrep supports exactly .gitignore: https://github.com/BurntSushi/ripgrep/blob/master/GUIDE.md#automatic-filtering

Second, the point is to be tool-dependent: support .gitignore, .hgignore and some other existing popular ignore files from revision control system and such.

The thing is I'm syncing a directory with some Git repos inside it, so the repos already have .gitignore files containing things I'm not going to sync. So I had to go through all these ignore files and copy-paste patterns into Syncthing, while Syncthing can handle that completely automatically and transparently.

@amra amra commented Jan 9, 2020

Second, the point is to be tool-dependent: support .gitignore, .hgignore and some other existing popular ignore files from revision control system and such.

This is fine. Settings in a file are simple and straightforward. But .gitignore/.hgignore only set files to ignore. I would like to configure files/folders to download (opposite to ignore) and ignore the rest. That is not supported in .gitignore.

An example:
I share a library (music/video/images) between PC and smartphone. The smartphone allows just a subset of the library (to save space) and ignores everything else. Now the PC adds a new music folder, which will be immediately downloaded on the smartphone because the new folder is not mentioned in the ignore settings. I would just like to set some folders in the library to download.

The configuration must enable patterns for ignore as well as for allow.
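The library example can be sketched as an allow-list check: a path syncs only if it lies under a folder the device has explicitly chosen, so anything new is skipped by default. `is_synced` and `PHONE_ALLOWED` are hypothetical names used for illustration and do not correspond to any existing Syncthing setting.

```python
def is_synced(path, allowed_folders):
    """Allow-list mode: sync a path only if it lies under an allowed folder."""
    path = path.strip("/")
    for folder in allowed_folders:
        folder = folder.strip("/")
        if path == folder or path.startswith(folder + "/"):
            return True
    # Not under any allowed folder: ignored, including folders that
    # were added on another device after the list was written.
    return False


# Illustrative selection for the smartphone.
PHONE_ALLOWED = ["music/jazz", "images/2019"]
```

A folder such as `music/new-album` created later on the PC is then never downloaded to the phone unless it is added to the list.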

@gabriel-fallen gabriel-fallen commented Jan 11, 2020

@amra I'm not saying syncthing should adopt .gitignore/.hgignore as the only ignore/allow config. I'm saying it would be nice of syncthing to respect these ignore configs when and if it encounters them while walking a directory tree.

Yeah, whitelist is a very nice thing to have, especially on receiving side as in your example. Totally agree.

Member

@ProactiveServices ProactiveServices commented Jan 13, 2020

@gabriel-fallen the problem with accepting third-party software ignore lists is several-fold: they are likely to have differing syntax today and tomorrow may be different still; the ignores are for different purposes: I may not want $privatedata to be pushed to a remote git repo, but do want it synced to my other machines; and which ignore pattern system is to take precedence?

@gabriel-fallen gabriel-fallen commented Jan 13, 2020

@ProactiveServices I can't say about all third-party software, but all of Git, Mercurial, Subversion and Syncthing use globs for ignore patterns. And I really doubt these applications will throw backwards compatibility out of the window anytime soon. 😉

Supporting all third-party software out there is clearly unrealistic and unnecessary. But support for the most widespread ones would be nice.

Regarding ignoring different things I totally agree. In my first comment I said there should be a way to "ignore ignore" or a whitelist. As long as this feature is aimed at advanced users (programmers) a file-based solution is acceptable in my view.

Member Author

@calmh calmh commented Jan 13, 2020

We're not going to support other programs' ignore patterns. .gitignore looks quite similar to our ignore patterns, but works completely differently (last match wins, cannot unignore children of ignored parents, for example). It is not simply a matter of "it's also globs" -- different semantics, even if the syntax is similar.
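The difference in semantics can be shown with the same two lines evaluated both ways. This is a deliberately small sketch: it matches bare file names with Python's `fnmatch`, ignores path handling entirely, and the function names are hypothetical. Syncthing's documentation describes first-match-wins; last-match-wins is the .gitignore behaviour mentioned above.

```python
from fnmatch import fnmatchcase


def parse(line):
    """Split an optional leading '!' (negation) from the glob."""
    return (False, line[1:]) if line.startswith("!") else (True, line)


def ignored_first_match(name, lines):
    """The first matching line decides."""
    for line in lines:
        ignore, glob = parse(line)
        if fnmatchcase(name, glob):
            return ignore
    return False


def ignored_last_match(name, lines):
    """The last matching line decides."""
    result = False
    for line in lines:
        ignore, glob = parse(line)
        if fnmatchcase(name, glob):
            result = ignore
    return result


LINES = ["!keep.log", "*.log"]
```

With `LINES`, `keep.log` is kept under first-match-wins but ignored under last-match-wins, so reading one tool's file with the other tool's rules silently changes the result.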

@gabriel-fallen gabriel-fallen commented Jan 13, 2020

@calmh OK I see. Thanks for the clear decision.

ibizaman added a commit to ibizaman/syncthing that referenced this issue Feb 24, 2020

@finsterwalder finsterwalder commented May 3, 2020

I'm a new Syncthing user.
(Thanks so much for this really great tool! And in particular for the QNAP version!)
So I just had my first fight with Syncthing patterns...

I think that the current patterns are not bad, but have some problems that make the ideas posted in this thread a little difficult. I would suggest modifying the ignore pattern handling slightly and then adding a UI on top of that, which generates ignore patterns under the hood.
And, as discussed above, an "advanced" mode should be added to allow for more tricky patterns for powerusers.

So what are the problems with the current ignore patterns and how should they change?
(Keep in mind, that my understanding of the current system may be wrong, so please correct me, when I am)

  1. sub-sub-folder (un-)ignore is cumbersome, because I have to take care of all the intermediate levels. This makes a simple GUI, where I check a folder three levels deep more complicated.
    With path /a/b/c/d/... when I want to ignore /a but not /a/b/c/..., I just want to write down:
    !/a/b/c
    /a
    And not
    !/a/b/c
    /a/b/
    !/a/b
    /a

  2. Patterns ending with a slash behave unexpectedly. As a first time user I thought that a trailing slash would clearly state that it only matches a directory. But instead it means only match the directory's content. So currently a/ and a/** are actually the same and that's unexpected and misleading. I don't really see a use case for that.

  3. There is no way to clearly differentiate between directories and files. (Follows from point two above).
    Now I can't write a pattern that ignores files, but not directories that match the same pattern or vice versa.
    (e.g. ignore sub-directories starting with a dot but include files starting with a dot, that reside in the same directory)
    I would change the behaviour of a trailing slash to mean directory only. So /a*/ would mean any directory starting with a, but not files starting with a.
    /.*/ -- ignore directories starting with a dot, not just their content
    !/.* -- do not ignore files starting with a dot

  4. ** matching slashes is strange. ** should only match any number of sub-directories. But patterns like ab**cd behave strangely and are really complicated to understand and predict. I would restrict the usage of ** to full path matches only. So only allow ** in between two slashes, at the start before a slash or at the end after a slash. ab**cd should be forbidden. ab/**/cd would be allowed, as would **/ab and ab/**. ab*/**/*cd would do the same as ab**cd does now, but would be more explicit and easier to understand.
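The restriction proposed in point 4 is easy to state as a check. A minimal sketch, assuming `/` is the only separator and using the hypothetical name `double_star_is_whole_segment`:

```python
def double_star_is_whole_segment(pattern: str) -> bool:
    """Return True if every '**' in the pattern fills an entire path segment."""
    return all(seg == "**" for seg in pattern.split("/") if "**" in seg)
```

Under this rule `ab/**/cd`, `**/ab`, `ab/**` and `ab*/**/*cd` are accepted, and `ab**cd` is rejected.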

The simple treeview GUI really just needs patterns to include and exclude specific files and directories. That's doable, in particular when point 1 above is fixed.

So a GUI could translate all the checkboxes into specific include and exclude patterns and vice versa. An additional textbox could be provided for advanced patterns.
I think it should actually be possible to combine the two. So when I have whatever patterns in my list, I should be able to filter all the ones, that specify a specific file or directory (no wildcards) and translate them into a treeview and vice versa without losing anything.
So I could even show an advanced mode to edit all patterns and then switch back to treeview. Best of both worlds! :-)
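One way to picture the split between tree view and advanced textbox is to partition the pattern list into literal rooted paths and everything else. This is a sketch under assumptions: `split_patterns` is a hypothetical helper, only `*`, `?` and `[` are treated as wildcards, and lines beginning with `//` are treated as comments and left with the advanced patterns.

```python
WILDCARD_CHARS = set("*?[")


def split_patterns(lines):
    """Partition ignore lines into tree-view entries and advanced patterns.

    Tree entries are (path, ignored) pairs for rooted patterns without
    wildcards; every other line is returned unchanged as 'advanced'.
    """
    tree, advanced = [], []
    for line in lines:
        negated = line.startswith("!")
        body = line[1:] if negated else line
        rooted = body.startswith("/") and not body.startswith("//")
        if rooted and not (WILDCARD_CHARS & set(body)):
            tree.append((body.strip("/"), not negated))
        else:
            advanced.append(line)
    return tree, advanced
```

For `["!/a/b/c", "/a", "*.tmp"]` this yields the tree entries `("a/b/c", False)` and `("a", True)` plus the advanced pattern `*.tmp`. Note that the split drops the relative order between the two groups, so a lossless round trip would also need to record where each line sat in the original list, since order matters when the first match wins.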

@xxxserxxx xxxserxxx commented Jun 18, 2020

Is the design to have the ignore file synced or not? It appears so, from the current ticket description. If so, this seems an odd decision for the purpose. For most use cases, wouldn't you want per device settings for selective synchronization, and isn't that defeated by syncing the ignore file(s)?

@pagdot pagdot commented Jun 18, 2020

You could probably ignore the ignore file

Member

@imsodin imsodin commented Jun 18, 2020

The issue tracker is not the right place to ask questions, please use https://forum.syncthing.net for that.

@shish shish commented Jun 19, 2020

wouldn't you want per device settings for selective synchronization

Now that you mention it, I wonder if the reason that this is taking so long to go anywhere is because we're treating "selective sync" (typically a per-device setting looking at folders) and "ignore *.tmp files" (typically a global setting looking at globs) as if they are a single feature? If we instead treated them as two separate features which interact with each other, would one (or both) have a chance of getting implemented?

@xxxserxxx xxxserxxx commented Jun 19, 2020

@guziks, in #193, mentioned some great use cases. That ticket was closed in favor of this ticket, so I assume someone decided this design covers those use cases. Most of those use cases specifically talk about situations where some machines sync some parts of a share while ignoring others: the set of shared/ignored files differs between machines.

The proposal should be explicit about whether the excludes file will be synced or local only, and it'd be nice to figure out how (if it's shared) all of the use cases as proposed by @guziks will be addressed. Right now, the only live ticket about this feature specifies implementation requirements without describing the problems (situations, use cases) the requirements address (and the ones they don't).

@imsodin: from the ticket description:

Comments and discussion below. A project maintainer will keep the list above in sync as the requirements change or are clarified.

I'm asking for clarification of the ticket, so I think it's appropriate to comment here.

@hlovdal hlovdal commented Oct 11, 2020

I have read through all the comments here and one takeaway is that syncthing ought to have something probably named Folder ignore profiles.

  • A shared folder can have zero or more folder ignore profiles associated.
    • Zero means behaviour exactly like today.
    • Examples: desktop, laptop and tablet.
  • When attaching a new device, the device with the folder content sends a list of all the currently associated folder ignore profiles to the blank device.
  • In addition to filling in the device id, the user must select which folder ignore profile to use where "None" is an additional choice.
  • The selected folder ignore profile will be used to create an initial .stignore file on the new blank device.
  • The .stignore file along with any files it includes (recursively) are the first files that are synchronized. Synchronization of all other files is delayed until this is done.
  • After this initial transfer, the .stignore file (and included files) is never automatically updated again, even if the profile is modified.
  • Folder ignore profiles are configured in folder settings in a new tab next to "Ignore Patterns".
  • Folder ignore profiles are not stored within the shared folder.
  • Folder ignore profiles should probably be synchronized between all the devices (and then needs conflict handling).

The above can be implemented today without any dependency on any new ignore behaviour, and it will fully address the forget and race condition issues that @madumlao raised.

The above should represent a complete MVP. Some smaller additional features that could be added later:

  • Force propagation of folder ignore profile updates. If you update say the "mobile" profile on your desktop computer, you can choose that this update should be sent to all devices. If a device receives a profile update that matches the profile that initially was used, then it can be applied automatically if configured to do so or have it as a pending question for the user to confirm before doing the update. This means that profiles need a unique id so that renaming from "mobile" to "mobile phone" does not break things.
  • Allow for re-selection of profile.
  • A folder ignore profile could either be folder specific or global (possible to share between multiple shared folders).

The force propagation thing might not be that important since it effectively already can be achieved with a simple #include .stignore.mobile.
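A folder ignore profile as outlined above could be modelled roughly as follows. This is a sketch only: `IgnoreProfile` and `initial_stignore` are hypothetical names, the id is a random UUID chosen for illustration, and the generated header uses the `//` comment syntax of .stignore files.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class IgnoreProfile:
    name: str                     # display name, free to change
    patterns: list                # ignore lines in .stignore syntax
    # A stable id, so renaming "mobile" to "mobile phone" keeps the link
    # between a device and the profile it was created from.
    profile_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def initial_stignore(profile):
    """Render the initial .stignore text for a newly attached device."""
    if profile is None:           # the "None" choice: behave exactly like today
        return ""
    header = f"// generated from profile {profile.name} ({profile.profile_id})\n"
    return header + "".join(line + "\n" for line in profile.patterns)
```

For instance, `initial_stignore(IgnoreProfile("mobile", ["/videos", "*.iso"]))` produces a comment line followed by the two patterns, and the text is written once and not regenerated when the profile later changes.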
