-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify parsing of facts with a clear tag barrier #663
Conversation
Hmm, Aside from that, I like the proposal and the code improvements. |
Yeah. That was something that I deliberately changed in the test suite. Right now the code was allowing I'd rather not add any special case to be backwards compatible on this specific input. It's not really possible to support this old syntax with the liberal names of categories. But if the consensus is that we can forbid hash in categories and activities, then I can do something but I'd don't believe that it would be a good move. |
I added a supplementary change to ensure backwards compatibility with the double comma syntax, this is nice for external users (like GTG) who already adopted this change. And I got rrid of a few style issues that my code editor kept highlighting on me (unrelated to my changes, except that they are in the file that I modified). |
Yeah, but at the same time here was a test to ensure that without the double comma it would parse entirely differently... it's exactly the kind of unexpected difference that create confusion for users. |
Why would a user expect to get the same thing with or without the documented barrier ? |
Because a barrier ought to be not needed when you don't want to put anything after the barrier... Anyway, I have been running this code for a while and it works nicely for me. @matthijskooijman and others, can you review and merge if you are satisfied ? |
@rhertzog, I've been meaning to look at this issue and others, but haven't found the time yet. Hopefully Soon(tm), but it is still on my mind. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is a change for the better -- supposing that test cases are sufficient evidence that changes also work as intended.
I had an initial look at this, though not good enough to really grasp everything. Here's some initial thoughts, though:
As I said, I haven't fully reviewed this yet (a bit too tired to grasp it completely right now), hope to get a closer look soon. |
I you don't do this, you lose some backward compatibility where you could put tags right after the description without any comma.
Right, documentation updated.
I don't see why you want to delay any longer something that's ready. I would not be opposed in general if really required but given our delays of review of each small step, I don't think it's reasonable to self-inflict us this.
The main restriction is that
Yes. I just added a test to make this explicit:
|
Thanks for your updates and response!
My thought was that adding a new feature, such as parsing tags within the description, might need a bit more consideration than reverting the double comma, which we should do ASAP. But you're right, maybe I'm overly cautious and we should rather move forward. Considering this PR a bit more, I think I approve of the changes it makes. There are some compatibility changes, but I indeed think it becomes more obvious for the user. I haven't looked close enough at the code yet to completely approve the PR. @rhertzog, could you maybe have a stab at writing a changelog entry for this change? Given some things will no longer work with this version, I would think it to be useful to provide a verbose changelog entry for this, and I suspect you have the best overview of what changed exactly. |
Given that this PR makes some breaking changes in the parsing of the Hamster activity strings, I'm pinging some people from clients that use Hamster: I've looked over the code/tests/doc changes and they look very good to me. I paid most attention to the tests and documentation because I'm not strong on regular expressions. |
@GeraldJansen, is there a summary of the new parsing rules somwhere? |
@mwilck Simple version: |
@GeraldJansen, thanks. This looks good to me. I never had much love for the double comma. (the time/date parsing is still an issue I think, it lacks internationalization ("4PM") - but that's a totally independent topic). |
No objections by me - I just pass through what the user enters onto the DBus method. However I plan to check the result of the DBus call more thoroughly. Can I assume that hamster will return some error text that I can display? |
This PR will eat all hashes in the description, and create tags instead. This PR will also introduce numerous regressions, some mentioned in #657. |
I second that. I regularly use hashes in the description to reference tickets. For me automatic conversion is troublesome, because I have to transfer the whole information into our companies time tracking at a later point. If the information is distributed across several fields, I have to re-combine that again. One could argue, that it might be helpful to use the tags to filter specific tickets, but that does not work in my case either. I work with multiple ticketing systems, therefore a ticket number is not unique and I quickly have duplicates. No help. For those reasons I don't use hamster's tag feature at all. |
As far as I know there are two changes that could be considered regressions. 1) Commas can no longer be embedded in the activity name. This feature was requested by a single user and its low importance has already been discussed a few times. 2) People who use tags but not descriptions will now have to insert an extra comma before the tags. This has been mentioned above and does not seem to have created concerns for maintainers of clients. |
Perhaps the automatic harvesting of tags from the description could be turned on or off by a boolean configuration setting (say "Extract tags from description"). It should likely be off by default given that it is a new feature. At any rate, it seems like the hash symbol should not be stripped. |
The numbers are too low to evaluate the importance solely on the number of expressed requests. |
The current code does not parse tags from the description that start with a number, for exactly this reason. Would that solve your concern? Or are you using some letter-based prefix that still causes problems here?
So you are saying you are using If so, I guess that either or both of @GeraldJansen's suggestions might help here:
And:
I was also thinking about the latter before, but didn't want to go into too much bikeshedding. Now that there is a workflow that would be helped by not stripping the tags from the description, I guess this might be worth considering. |
Ah, I didn't realize that.
Yes, that would resolve my personal concern, since I mainly handle number-based tickets.
I don't use letter-based tickets, but some people in my company do (e.g. from Jira). No hamster users though, but I try to convince them :-).
No, I'm using descriptions like these But to be honest, the behavior is not new. I learned to work around this by either stripping the hashes myself oder wrapping the ticket numbers in brackets (e.g.
That sounds like a feasible and reasonable solution to me.
If automatic harvesting is optional, this is not a necessity to me. But it also sounds reasonable to leave the information intact. |
It would be good to move this forward, but I'm not sure if we're there yet. I'm lacking time and brainspace to really dive into the code again for a full review, but I'll try to respond to previous comments at least. @rhertzog wrote:
One additional (implied) suggestion would be to simply skip all automatic parsing of tags entirely, i.e. @mwilck wrote:
And I wrote:
@rhertzog, I think you haven't responded to these specifically yet? In particular, do you feel that auto-detecting tags is really added value, or just a mechanism to keep (partial) compatibility in the parsing rules? If the latter, would be inclined to omit it. @rhertzog also wrote:
Is this really true? I would consider that the string-based interface is an interface between the user and hamster itself, and the external client is just a transparent pass-through? Then it makes sense that if the user configures hamster, then this interface between hamster and the user changes according to their preferences. If an external client wants to create a fact with fully predictable values, it should just submit a JSON fact instead? One caveat with this viewpoint is maybe when an external client wants to offer a string-based UI, but also apply autocompletion or suggestions in that UI, which does require that the external client know about the parsing rules (but maybe we have dbus calls for autocompletion already?). Having said that, I do think that avoiding preferences where possible is a good strategy in general. A bit longer ago, I also wrote this:
I still think this is an important point, since we need the changelog entry before releasing a version with this change included, and I expect having a good writeup of user-visible changes will also help with the discussion around this PR before merging. |
I did reply to this here: #663 (comment) It certainly adds value as I use it and I feel it's more natural to use it that way. At the same time, it's also a nice way to avoid most of the regressions.
I propose:
|
Yes it's true, here's an example: They build the string from the task and they submit it to Hamster with AddFact(). |
Ah indeed, missed that. Thanks.
Ok, so that would speak in favor of keeping the auto-tag-discovery as now implemented. I'm still not convinced that we need it for backward compatibility (reading more closely I realized that we're now keeping compatibility for tags directly following the description, but not for tags directly following the category, which was previously supported and is now intentionally dropped. But if the feature is useful by itself for others (I have no need for it), I can live with having it. Any other opinions?
Ok, looks good, but I think it's incomplete. Rather than pointing out what I think is missing, I dove into the code again and tried to come up with a more complete suggestion:
Does this look ok and complete? If so, do we consider these changes acceptable? In summary, this removes the need for using a double comma and makes parsing a bit more predictable, at the expense of no longer allowing |
Right, but I would argue that that client should be using AddFactJSON instead of relying on the user-targeted string (de)serialization scheme like they do now. |
Yes, your changelog entry looks good.
I agree but that's how it is. And it's still best to not deliberately break too much when we can avoid it. |
Yes. |
Upgrading to Ubuntu 22.04 brought the issues fixed in this PR back to my attention (e.g. double comma, tag removal from comments, etc.). Since it's unclear if/when a new release will happen, I switched to rhertzog's code branch, following the steps below: # Clone the official Hamster repository
$ git clone https://github.com/projecthamster/hamster.git
$ cd hamster
# Checkout rherzog's pull request #663
$ git fetch origin pull/663/head:fact-parser-changes
$ git checkout fact-parser-changes
# optional: clean-up (in case you already had a build)
$ ./waf clean distclean
# configure and build hamster package
$ ./waf configure build
# check for running hamster processes and stop them
$ pgrep -af hamster
$ pkill -f hamster-service
$ pkill -f hamster-windows-service
# confirm all hamster processes have been stopped
$ pgrep -af hamster
# install hamster (note: you have to use bash, fish will fail)
$ ( umask 0022 && sudo ./waf install; )
# optional: confirm functionality by adding a single comma separated command
$ hamster add "activity@work, test PR using a #tag within the comment and a single comma" Maybe that's helpful to others. I'm running this setup for some time now and am very happy :-D. |
Yeah, it would be really nice to see movement again and my branch merged. I realized I lost my own changes when I updated the Debian package to fix it for Python 3.11. :-( |
This PR reverts the use of a double comma to indicate the start of the description, a breaking change introduced in Hamster 3.0, back to the previous use of a single comma, as discussed in projecthamster#657. Likewise, the double comma needed before tags, in the case of descriptions containing the #hash pattern, is also reverted to the use of a single comma. This change requires two restrictions to the parsing rules. Firstly, no comma is allowed in the activity name (projecthamster#270). Secondly, tags may not be separated by a comma when entered on the commandline (ie. just use `#tag1 #tag2`, not `#tag1, #tag2`).
Now we use a comma, followed by one one or more optional spaces, followed by a hash as a clear sign that the tags list has been started. Obviously there are cases where we want to use tags within our description as they are part of it. We will restore this in the next commit.
If the tag list is not introduced by a comma, it's supposed to be part of the description... so we copy the embedded tags to the tag list and we drop the leading hash to keep the plain word in the description. Tags embedded in the description are more restricted in their syntax, they can't contain spaces and must not start with a digit (we want to avoid creating tags for things like bug numbers). So this basically converts a fact like "Coding, fix #bugs in #hamster" into "Coding, fix bugs in hamster, #bugs #hamster". Note that I disabled the round trip test for with description containing a hash because even though they are equivalent, the manually created fact is not exactly the same as the parsed one... the parsed one has extracted the tags from the description but not the manually created one.
v3 with its double comma has been out for a while and apparently GTG is already using that syntax. So we should deal nicely with submission using that syntax. It doesn't cost much to accept multiple commas instead of a single one. Ensure we keep this working with a unit test.
I saw this documented so we should make sure it actually works.
9d57c2c
to
2b5c059
Compare
Automatically generated build artifacts for commit 2b5c059 (note: these links will expire after some time):
To test this PR, download and unzip the flatpak application above and then install with:
|
I would also like to see this PR resolved before we release. Looking at previous discussion, I see everyone involved is positive about making these changes, except ederag. This change does make some backward incompatible change (people can no longer use comma's in their activity or category), but that seems acceptable IMHO (since it just adds a limitation to be observed by the user, compatibility with existing implementations that use the double-comma for serializing and submitting facts is kept). I've rebased this PR (and force pushed to your branch @rhertzog, so pull carefully) to get a cleaner history when merging it, and so a new flatpak build is generated that allows easier testing. I want to do a few more tests - in particular what happens when an existing fact does have a comma in the activity, and to check if this change plays well with the gnome-shell extension. |
I have pulled and tested this PR a little bit. An existing activity with embedded comma is displayed and reported just fine as long as it is not edited. As soon as you edit it the new parsing rules kick in. For example, "some,task@cat" becomes activity="some", category=None, description="task@cat". This is ugly but I think it is okay and will hardly affect any users given that Hamster didn't allow commas in the activity for most of its existence. |
I also checked xfce4-hamster-plugin against this PR and did non find any weird behaviour. That was to be expected because the plugin simply passes the string to Hamster to be parsed. |
Coming back to this PR after a while, it seems testing by @GeraldJansen was ok (and I think I did a bit of testing myself too earlier this year), so I think we can merge this as-is if nobody objects soon :-) |
Yay \o/ |
This PR builds on top #662 to further simplify the parsing facts by creating a clear barrier between the description and the tags. But to be nice with the user, we support a subset of tags within the description that we copy over to the tag list and that we keep as plain words within the description.