Enable key:value completions & allow removal of tags with --force-cv#87
Enable key:value completions & allow removal of tags with --force-cv#87jneidel wants to merge 4 commits intonovoid:masterfrom
Conversation
|
I don't understand the idea behind renaming "controlled_vocabulary" (a term often used in literature related to personal management or personal informatics) with "constant_vocabulary". Would you elaborate on that? |
|
I don't like colon as separator although it's the most commonly used separator for key value pairs: it's an illegal character for Windows file systems: https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file Therefore it would impose issues when people start adapting colons as parts of file names on non-Windows file systems and then face issues when trying to copy files to USB thumb drives or share via network services. Would you provide a PR without colons? |
I am aware of the windows problem, but why must stand in the way of completion on linux? Even as it is right now, one can just use colon separated tags. I only got the idea of key:value tags from your great how to do tagging article, in which you clearly mention the windows caveat. |
I introduced the variable "controlled_vocabulary" in #80. |
I see. However, the term is problematic in my opinion. In the books, "controlled vocabulary" (CV) stands for the general idea of having a user-curated set of vocabulary - in this case: tags. When I understand you correctly, both variable names deal with user-curated tags. Therefore, I would rename both variables in order to avoid misunderstanding - just as I did misunderstand the nature of the variable's purpose. In general, I'd say that "controlled vocabulary" should always related to the content of .filetags files, that do represent the concept of CV within filetags implementation. If a variable is not just containing that data, it should not be named like CV. If we have multiple derived data structures, better to avoid the CV term and use descriptive names that differ. |
Yes. This is not a Linux-only tool despite the fact that (most likely) we both are not using Windows-based file systems at all.
I would even say that filetags must make sure that colons may not entered by the user by accident. I don't think that I've introduced any user input sanitation check to avoid illegal characters. Maybe this is the right moment to think about introducing such a check.
I know what your intentions are and I'm really pissed myself that Microsoft did that stupid design decision back in the old days (or inherited from CP/M and/or QDOS). Colons are very common characters for all sorts of things that also might be present in file names. However, when somebody is starting with filetags, introducing colons to file names (because not unlike we both, the user fancies colons for particular purposes), I don't want the user to face issues when working with Windows people. This always falls back to filetags one way or the other. Therefore, I would suggest that you modify your PR so that it nudges for dashes (only) instead. If documentation is also adapted accordingly, I really would like to see that idea in filetags if I don't find anything that would stop working which did work before with respect to user experience: format of .filetags files, text completion as well as Python UI completion, ... |
Both vocabulary and constant_vocabulary contain the structured contents of the .filetags file. So they contain the CV, name fits.
Sure, this can be combined with a operating system = windows check. This way the colon improvement for completion can be applied and windows users can be protected from (unknowingly) shooting themselves in the foot 🙂
In terms of backward compatibility (which is what you are referring to here) the changes would means:
|
But their content differs, otherwise one of them would be redundant. You wrote:
So I would suggest to use "user_controlled_vocabulary" and maybe something like "filtered_controlled_vocabulary" - depending on the use-case which you have a better understanding of at this moment. If my assumption is true, this would clarify the situation.
No, I would prevent colons on any operating system as the generated file names should not cause issues when transferred later-on to a different computer/user/system.
I'd prefer to not let the user enter problematic characters at all instead of issuing an error or warning which requires a manual change by the user. It's better to avoid errors (if possible) than to notify about them later. I'm not sure if this is possible with the two input methods filetags does have: CLI + GUI. If it's not possible to prevent while entering, we have to accept the warning I guess.
Yes. Once again: I would love to be able to use colons for exactly that use-case myself as it's standard for almost any implementation of key-value-pairs I know of. My personal preference is to value the priority of "avoid filename issues" higher than "use the standard delimiter for key-value-pairs".
Yes. I want to avoid that as much as possible in the first place. Most users probably won't even realize what the issue at hand is if the tool used to copy/sync does not provide meaningful help. For example, I doubt that NextCloud sync or Syncthing.net would help the user here: they would simply not synchronize and the user would assume, that there is a bug or similar. I do have similar issue when syncing with Syncthing to Android: somehow, Google decided that problematic characters (Windows file systems included!) are not allowed on Android storage even though it could handle them itself. (This was different for a couple of years where this limitation was not in place as far as I remember.)
Again: I don't want different behavior depending on the current platform used. Even when using Linux, colons must be avoided. There are edge cases where the user still might shoot himself in the foot: |
Okay, I got you. That argument makes sense.
Yes, it's possible. The --force-cv does the same thing, warns the user of non cv tags and blocks submission. That mechanism/approach can be reused to do input validation for colons.
--force-cv also covers that. That input would also be validated. Would you want the colon input validation/blocking to be disablable via e.g. |
Great to know.
No, I would not add an additional parameter for that. In my head, this doesn't make much sense. Instead, I would prevent all problematic characters in the input (as explained above) and optionally introduce a new parameter like "--allow-problematic-characters" for those people who really do know what they are doing. This is also the same group of people who read manpages and online help output. 😉 |
|
I prefer if
resonates well with me. Since Prior to a dive into / an edit of #!/usr/bin/env python3
"""Constrain choices with Python's argparse.
The constraints on a garment's size and color basically is one of the
examples in Ken Youens-Clarks's _Tiny Python Projects_ appendix about
argparse. The idea of a blacklist of strings however is not.
"""
import argparse
BLACKLIST = [":", ";", "*"]
def get_args():
"""Collect the arguments."""
parser = argparse.ArgumentParser(
description="Choices", formatter_class=argparse.ArgumentDefaultsHelpFormatter
)
parser.add_argument(
"name",
metavar="str",
help="Name to greet",
)
parser.add_argument(
"-c",
"--color",
metavar="str",
help="Color to choose",
choices=["red", "yellow", "blue", "colorless"],
default="colorless",
)
parser.add_argument(
"-s",
"--size",
metavar="size",
type=int,
choices=range(1, 11),
default=5,
help="The size of the garment",
)
# addition to the book:
parser.add_argument(
"-S",
"--separator",
metavar="str",
help="Define a separator enclosed in single/double quotes.",
type=validate_string,
default="_",
)
return parser.parse_args()
def validate_string(element: str) -> str:
"""Prevent using a string of the blacklist."""
if element in BLACKLIST:
raise argparse.ArgumentTypeError(
# a more verbose note:
# f"invalid value `{element}` is among the forbidden strings {BLACKLIST}"
# less verbose note:
f"`{element}` is blacklisted (cf. documentation)"
)
return element
def main():
"""Join the functionalities."""
args = get_args()
print(f"{args.name} uses as separator the string `{args.separator}`.")
print(f"The garment is {args.color} and of size {args.size}.")
if __name__ == "__main__":
main() |
|
@jneidel Please note I speculate it might be better to solve the errors ahead of merging a new flag |
|
Ok, I will add a new PR which will prohibit problematic characters from being used in tags (colon for now). For non-interactive input (-t), argparse can be used to validate. |
and differentiate it from controlled_vocabulary. Follows discussion on novoid#87
Review by commit.
Todo
Significant changes
Improvement: Enable key:value completions
Given the .filetags:
Completion on the colon does not work as expected, see this demo for a before and after:

Fix: Allow removal of tags not in controlled vocabulary with
--force-cvFollow #80
With
--force-cvactivated it was previously not possible to remove tags unless they were in the cv.This is inconvenient. When you rename or remove a tag and then go to cleanup you couldn't use the usual command (which includes --force-cv). Now removal of non-cv tags is possible.
See demo before and after:

Tests