The print-unique command needs to allow field selection for comparison #1046
Comments
Interesting idea, though I'm not totally clear on the real world use cases for more powerful uniqueness checking. If you feel this is valuable enough, would you like to try mocking up the UI and docs here?
It's possible that I'm using the wrong tool, but here is my scenario. The only digital format I can get out of my bank is an XLS sheet of the last N transactions. I can download these whenever I want, but the result is inevitably duplicate transactions. I'm converting the XLS to CSV, then importing to Ledger. Here is a sample set of problem entries from the resulting ledger:
Note that there are three wire transfers here, but only two of them are unique (with "unique" codes). Deduplicating these works pretty easily with Then there are two entries for fees associated with the two wire transfers. Normally this would be duplicated too but in this case the fee was added later and the first time I imported it didn't have the fee, a later download did. Deduplicating the fee transactions is harder. The description line should have been unique (by chance of my including the year date in the memo) but the XLS only has truncated values, so these two years are showing the same description line. They have different codes, but This results in the awkward output of
Ideally I would be able to use
(And yes, my bank's exports are inconsistent in their use of number formatting! I do clean that up in the next step by filtering the ledger through a print with an explicit commodity format declaration.)
Great example, thanks. Though I have to admit I'm still not really clear. The current print-unique was used for something I don't remember. print-unique --fields=FIELDS (with some default set of fields) sounds good. Except, if we can avoid options it's always better. Why not always check all fields?
In my use case, checking all fields would be more useful than the current behavior, BUT I could imagine a use case for not being so strict. For a journal that has been imported and then modified, it would be nice to still be able to flush out duplicates even if comments have been added or categories tweaked. As it stands, I'm a little unsure what use case the command currently works for.
So I guess:
Related to #943 but not quite the same...
The print-unique command is of limited use without some configuration as to what it actually compares. At the moment it compares only part of the description: the text field proper, without the code that is elsewhere considered part of the description. In my use case, processing imported transactions, I'm actually looking for unique codes (or, in one case, a combination of the code + description). I can also conceive of wanting to match the amount too, to get unique transactions and not just unique payees. What fields are compared should be configurable.
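To make the request concrete, here is a minimal Python sketch of field-selectable deduplication. It is not hledger code and does not reflect how print-unique is implemented; the Txn record, the unique_by helper, and the sample values are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Sequence


@dataclass(frozen=True)
class Txn:
    """Hypothetical, simplified transaction record (not hledger's data model)."""
    date: str
    code: str
    description: str
    amount: str


def unique_by(txns: Iterable[Txn], fields: Sequence[str]) -> Iterator[Txn]:
    """Yield the first transaction seen for each distinct combination of `fields`."""
    seen = set()
    for txn in txns:
        # The comparison key is built only from the fields the caller selected.
        key = tuple(getattr(txn, name) for name in fields)
        if key not in seen:
            seen.add(key)
            yield txn


# Made-up sample data: the first two entries are the same transaction
# downloaded twice; the third shares their description but not their code.
txns = [
    Txn("2019-01-02", "A1", "Wire transfer", "100.00"),
    Txn("2019-01-02", "A1", "Wire transfer", "100.00"),
    Txn("2019-02-03", "B2", "Wire transfer", "250.00"),
]

by_code = list(unique_by(txns, ["code"]))                # keeps A1 and B2
by_description = list(unique_by(txns, ["description"]))  # keeps only the first
```

Comparing on the description alone collapses the two distinct wire transfers into one, while comparing on the code (or on code plus description, or with the amount added) keeps both and still drops the re-downloaded duplicate.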