New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intelligent import autofill #26

Open
hsoft opened this Issue Jun 22, 2013 · 57 comments

Comments

Projects
None yet
3 participants
@hsoft
Owner

hsoft commented Jun 22, 2013

When importing, it's always the same. There are some meaningless descriptions and they are always replaced by the same thing. There should be way to do that automatically.

First, for this feature to work, it must be possible to edit descriptions, payee and checkno in the import table. The moneyGuru document would maintain a list of all replacement that were made. For example, if, in the import table, I change "MEANINGLESS CRAP BLAH ** BLAH" to "Groceries", moneyGuru will remember it.

Then, there will be the "Smart Replace" (or some cleverer name) button in the import window that, as soon as there's one value that can be replaced, will be enabled. When clicked, a menu with all possible replacements will pop up (there will also be a "Perform all" item). When the user clicks on an item, a replacement occurs.

These replacements will be saved with the moneyGuru file.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jun 22, 2013

[not-tagged:"suggestion" tagged:"feature" bulk edit command]

@hsoft

This comment has been minimized.

Owner

hsoft commented Jun 22, 2013

There's a problem with the way I planned this. Smart replace should also cover account transfers, which rules out the edit-in-import-table concept. I haven't thought this through yet, but the first idea to come to mind would be to remember values at import for each imported transaction, so that we can identify them later and make correct replacements.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jun 22, 2013

A request for this on GS

Import rules would be very nice. I import all using CSV files, and most of the transactions are recurrent (groceries at the same store once a week for example), it would be nice to be able to automatically set the "To" here. It would be also nice to be able to set "From" and "Payee". The rules could be based on the "description", "payee", "to" and "from" field.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jun 22, 2013

Another request on GS

I like your program (MoneyGuru), but I think it may be improved by automatic assignment of categories based on descriptions. I import all my data from from CSV, and have to assign categories manually, which is very long and tiring.
Is it possible to assign a category automatically to the same string? For example, if I once assigned category "salary" to the description string with my company details, the program "remembers" it and automatically assigns this category every time I import data. A further step could be regular expressions defined by user to assign categories to multiple transactions. For example, if I shopped in different Tesco's, the strings will contain this name (Tesco) and they can easily be assigned a user-defined category ("shopping" or "food" or whatever).

@hsoft

This comment has been minimized.

Owner

hsoft commented Jun 22, 2013

Smart replace should also cover account transfers ... the first idea to come to mind would be to remember values at import for each imported transaction, so that we can identify them later and make correct replacements.

The CSV lists from my bank actually show a unique transaction ID.
I tried feeding that column both to "Tansaction ID" and to "Check #".
However, moneyGuru does not recognize transfers from my checkings account to my savings account as one single transfer. I end up with two disjunct transfers: one is going from my checkings account and the other is coming into my savings account.

If this the feature you're contemplating, you could:

  • Add import support for a transaction identifier ("Tansaction ID" is there allready but what it is used for, actually?)
  • Simply check for unassigned transactions with same date and same amount on existing assets/liabilities during the import run, and (propose to) merge those transactions

I guess such a simple check does the trick for a personal finance app.
Your typical user would import transfers for two or three bank accounts only.
Identifying transactions between those boils down to matching up the unassigned, no?

Regexp replace for transfer targets would be just awesome. Ideally it would predict my success rate by telling me how many previously assigned transfers to my "groceries" account match the super fancy expression I am typing (and my failure rate, the matches with transfers assigned to other accounts).

Really like your work. Keep it up!

@hsoft

This comment has been minimized.

Owner

hsoft commented Jun 22, 2013

The transaction ID is used for automatically "binding" transactions to existing ones during imports. This feature comes from OFX import where it's not possible to have the same transaction ID from two different accounts. I agree that in the case of CSV imports, it could also be used to connect transfer accounts. However, I'm afraid this new feature could bring all sorts of new complications. I'd have to think about it. Meanwhile, I created #233 for this feature.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jun 22, 2013

@andrew-hill

This comment has been minimized.

andrew-hill commented May 5, 2014

Another similar piece of software I recently reviewed let's you set up or learn rules, and then apply them by selecting transactions and pressing the auto fill button.

This way it happens after the import step, so may avoid some of the difficulties.

Could be problematic if it's not easy to tell which transactions need auto filling (maybe up to user's own process or workflow, or a tag on transactions that's set once they've been autofilled, or a "last import" view like in iPhoto, etc for the most recently imported data)

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Nov 21, 2014

I'd like to take a stab at this ticket.

This and #166 are probably the most important tickets for my own use.

I do all my management once a month or I let it go for a couple of months and I'm curious about where I am, and then update if I have a need. Every time I start again, I have a small script that collects up my transactions (e-statements from my credit cards and banks), eliminates double transactions, renames things based on an occurrence file, etc. So general pre-processing. If it doesn't work out, I look at the problem, roll back, make a change to the pre-processing, and try again.

It would be good to be able to integrate this kind of work into moneyguru, or make the process "plugin-accessible". I'm going to read through, experiment a little, and post some ideas.

@hsoft

This comment has been minimized.

Owner

hsoft commented Nov 21, 2014

Awesome! as always, don't hesitate to ask me if you need any guidance.

This is a long standing ticket which I always wanted to implement but I've always seen it as too big to fit my will (as it was with the amount drawing thing). I'm glad to see you're not afraid of big tickets :)

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Nov 27, 2014

I started to write a wiki page for the feature here. It's just started, and my initial thoughts are an outlook style set of rules.

If [Column Name] [is/is not] [equal to/contains/matches] [value/regular expression] Then
[map to account name/other actions] [account name or other values].

Some of the easier rules will be added automatically when the user edits a meaningless name (MEANINGLESS CRAP BLAH ** BLAH to groceries) which would create the rule

If [Account Name] [is] [equal to] [MEANINGLESS CRAP BLAH ** BLAH] then [map to account name] [groceries].

The user could create a group or set of these rules, and save them much like csv layouts are saved. This would also be a very good place to provide a plugin interface that plugin authors could write there own custom logic for importing.

Again, I'll provide more of this in a formal way with the wiki page.

@hsoft

This comment has been minimized.

Owner

hsoft commented Nov 29, 2014

It's a great writeup and questions that arise from your proposition are kind of similar to the discussion in #13 (which is much older than 2013. these dates are due to me having to move issues to github): Do we really want to include that much "UI complexity" into the app?

The thing is that, like with calculated transactions, implementing the actual matching rules is rather simple. The complex part comes at the UI level and that complexity comes down not only on the developer, but also the user. I don't know about you, but I find designing mail filters in a mail client is tiresome.

In #13, we tried to find if there wasn't some cleverer way of solving the problem (and then we never implemented anything :) ). I suspect that if the feature in this ticket is implemented with a good old matching rules editor, it will be less used than if it's implemented in some cleverer way.

In the wiki, we mention cases of variable values, such as GBF Market Foods UIXXXCDE ** 3 and GBF Market Foods UIAASVXS ** 2. Instead of spending time implementing a UI, couldn't we send time on developing a nice heuristics algorithm that properly matches those strings?

To prevent mis-autofill, maybe that autofilling procedures could involve some kind of confirmation dialog where we show replacements that are about to take place and then allow the user to cancel some of them. We could then "teach" the algo to work better.

Also, very early in moneyguru development, we were asking ourselves "what if moneyguru was a personal finance manager for developers?". I say that because we could make it so that default autofill rules could be enhanced by plugins. Code is much more expressive than select boxes in a row.

These are just ideas I throw around to fuel your thoughts.

@hsoft

This comment has been minimized.

Owner

hsoft commented Nov 29, 2014

(now that I re-read stuff from #13, I realize that there's a lot of context missing and it's hard to make sense of it. it wasn't a very good idea to reference it here... nevermind that.)

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Nov 29, 2014

Yes, I'm not done with the wiki (I was just editing it as you appended this). It's basically my thoughts as I go along figuring out how things currently work, and finally get to the implementation.

My thought is that I'd get this document from a fluid state to a more defined one. My other thought that the first few iterations would be to find a target in the core GUI code that would be a good target for a well designed component that implements a defined interface that could be used for a plugin.

If we did implement an learning heuristic or an "e-mail rules" style UI interface, the back-end would use this component style with well defined interface, making plugins extensible. This would be very much like the Trac style of writing plugins. For instance, provide an IImportActionProvider interface, which has some methods for defining some abstraction of the swap_* functions in core.gui.import_window.ImportWindow. Maybe then a IImportConditionalProvider, able to provide a boolean value based on a transaction or account data.

I don't have something concrete in mind for interfaces yet. But whatever features we choose to implement, we'd do so first defining an abstract interface that plugin authors could come behind us and change our silly mistakes without having to learn all the cross-platform toolkit communication shtuff which is where I am buried in reading currently :).

I do really appreciate the feedback though. I don't expect you to read through the whole thing all the time, I'll let you know when it's finished and you can read it if you'd like. It's more just a sounding board for my own thoughts instead of diving into code changes up front.

@hsoft

This comment has been minimized.

Owner

hsoft commented Nov 29, 2014

find a target in the core GUI code that would be a good target for a well designed component that implements a defined interface that could be used for a plugin.

+1 . Plugins came way too late in moneyguru's design, but if we can think about them when implementing new features, it can only be good. The relationship between UI layers and plugins is a tricky problem, however.

brownnrl added a commit to brownnrl/moneyguru that referenced this issue Dec 5, 2014

(hsoft#26) Intelligent Import Autofill
First crack at abstracting some current import action "fixing" logic into
plugins.  Created an ImportActionPlugin in core.plugin, and used that as
a base class to create a one-class-per-action hierarchy.  There is two
way communication between the actions and the import winodw (hence the
use of Broadcaster/Listener classes). From what I can see, all previous
functionality is retained (though I still have to sweep through the tests).
@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Dec 5, 2014

If you have a chance to quickly glance over the above commit, I've taken the "Fix Import" logic and abstracted it into plugins. The base class for import actions is core.plugin.ImportActionPlugin, but the currently used plugins themselves reside in core.gui.import_window. Each plugin that doesn't have a NAME can be considered an abstract base class.

All functionality should still be retained, I've tested manually with some test import files. I have to fix all the tests (but not change their intention!), to confirm that for certain though.

This is just a first pass attempt at getting a plugin API going, but it isn't hard to see even someone with limited programming experience extend these individual actions by dropping in a quick file in the plugins directory.

from core.plugin import ImportActionPlugin

class MySwitcherAction(ImportActionPlugin):
    NAME = "My Switcher Plugin"
    ACTION_NAME = "Switch with Durdy!"

    def perform_action(self, selected_pane, panes, transactions):
    "If Hurdy in the description, replace with Durdy and throw him a quick 20."
        # or match against regex's whatever as comments indicate earlier
        for txn in transactions:
            txn.description = txn.description.replace("Hurdy", "Durdy")
            txn.amount += 20

More work to follow.

@hsoft

This comment has been minimized.

Owner

hsoft commented Dec 5, 2014

Sweet. Very elegant start, congratulations!

The only problem I see is with plugin subclasses living in the core code, but as you already mention in this ticket, a plugin overhaul is needed to bring this ticket to completion (something I totally agree with). So this isn't really a problem because these plugins will be moved elsewhere before the end.

Side note: the example you just gave is not a very good one because transaction.amount += 20 can't work. With splits and all, you can't increase the amount of a txn like that.

History note: I once implemented a new transaction model ( #16 ) that would allow this code to work, but users didn't like it and I reverted the feature back to typical splits :(

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Dec 5, 2014

Yes, the plugin architecture. How do you want to handle recommendations about that? It goes to a higher level then this ticket. New ticket? Maybe wait a little until concrete suggestions build up?

I don't know about your point mixing plugin and core code. I think Trac's pretty vibrant plugin eco-system of plugins came from them implementing interfaces and then immediately using those interfaces in their core code. See, for example:

https://github.com/edgewall/trac/blob/trunk/trac/ticket/api.py

They declare a bunch of interfaces, and then use them in their components. The purpose of the interfaces is to define some publicly accessible methods for communicating with components. The purpose of using a Component class is to register plugins and track who is using which interfaces. It also gives developers a reference implementation for every interface, which they know is used code rather then an experimental or nifty-never-used feature because it's part of the core.

Aside from that, the plugins that were defined are refactored core code. Originally, I had moved the specific bits into the plugin_examples folder, but I was continually deleting my local cache of them so they would be copied and regenerated. It's just a lot easier to have it live alongside the place it's used. The only difference between what was there and now we elevated it into another class, and we're just using the plugin architecture to make sure it's extensible. Our use of it in the core also ensures its, well... usable!

On the transaction.amount assignment, I didn't think about it at the time... but do imports rows generate more than one split per line? If they don't, this should work, right? Not best practice though...

@hsoft

This comment has been minimized.

Owner

hsoft commented Dec 5, 2014

Not all imported files can yield transactions with more than two splits, but QIF and OFX can.

As for the plugin thing, I don't have a strong opinion about this and I trust that the end result will be elegant. Whatever you prefer will be fine by me.

But yeah, we should create a new ticket for the plugin overhaul. I'll do that right away.

@hsoft

This comment has been minimized.

Owner

hsoft commented Dec 5, 2014

Oh, I also wanted to comment about this section in your wiki page

moneyGuru already has some form of limited auto-matching builtin. You might want to check the current implementation. Right now, it only does auto-matching when importing OFX file that have reference IDs in it (which are typically guaranteed to be unique). Then there's this whole "binding" mechanism with the little "lock" icon in between and all. Have you seen this? It might influence your analysis.

(there are example OFX files with references in them in the test suite if you want to try it out)

brownnrl added a commit to brownnrl/moneyguru that referenced this issue Dec 8, 2014

(hsoft#26) Intelligent Import Autofill
Fixed all tests, missed some logic regarding the filtering of panes
to only those able to perform an action.  Also, some of the date-names
were not being set until after the "selected pane" update method was
called.  The user would not notice it, but the tests expected the name
to be set more immediately.
@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Dec 8, 2014

I hadn't read it at the time of writing the wiki entry, and my only experience had been with the CSV files at the time. I've since read the documentation on importing into a new account, and I will be experimenting with the feature and reviewing the code before making any additional assumptions or changes.

All tests now pass in the last commit. None of the tests had to be changed (which I guess is a testament to your high level testing).

So at this point, I'm going to do some feature review as you stated, and after I have a solid grasp try to determine how the user experience might change... do we make the import table edit-able, how would users see before/after changes, etc. Then go from there.

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Dec 8, 2014

Attached are my initial thoughts as to how I'm going to proceed with this ticket form here. I'll update the wiki before progressing, but the "import pipline" (image attached) will look something like this:

  1. From the loader, any import action plugin marked as "always apply" will be a pre-processing step just before the reaching the intelligent import autofill heuristic.
  2. The heuristic will do it's best to just change the existing import fields based on prior user edits, looking for string similarities, etc. (this needs to be formally thought out and written up)
  3. An extensible "bind transactions" plugin will take the result of that and apply user defined rules for binding transactions together.
  4. At this point the user is given the chance to provide feedback. If all is well, they complete the import. If not, they can perform steps 5 or 6.
  5. Apply any one of a number of selectable actions. Goto step 7.
  6. Unbind an existing transaction (as it is now), or hand edit the unbound import data. Goto step 7.
  7. The intelligent import autofill feature (plugin?) takes the changes and records them for future imports. Goto step 3.

So, in our bubble diagram, "Apply select action" (step 5) is already implemented (we still need to collect up the plugins from the plugin directory). "Always apply actions" (step 1) is also easily implemented. The only difference I might suggest is that these actions be able to be performed when selected by the user on single or multiple selected transactions as well as the entire pane or set of panes.

Under "Hand edit data", the import table will have to accept edits. The binding/unbinding via drag and drop is as moneyguru currently acts, but I was thinking of adding a third state of "questionable bind" where the lock would be followed by a question mark. This would indicate to the user that the automated binding rules were applied, and they should review the result of the application of these rules closely before continuing.

Finally, we have the intelligent import autofill, which will take a first crack based on the pre-processing steps and then allow the user to make their edits and record the changes. We'll want to provide a pane for configuration preferences (whatever they might be) or the ability to completely turn off the intelligent import autofill in case it acts outside of reason for some users. Again, it's behavior needs to be formally written up in the wiki proposal.

See the attached image.

import_pipeline

@hsoft

This comment has been minimized.

Owner

hsoft commented Dec 8, 2014

I don't think I've ever thought out the import process as deeply, so I have questions rather than critiques.

  1. Is there a specific reason for the loop going back to the autofill/bind processes? I can't think of a situation where that would be needed. Do you have a scenario in mind?
  2. Which extensible parts will be needing the Store API we were talking about earlier?
  3. What do you mean by "overrides col user data"?

Looks good to me. I haven't looked at your latest commit yet, I'll do so as soon as I have free time.

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Dec 9, 2014

Keep in mind that "always apply actions", "bind transactions", and the extended "select actions" will for the majority of users be the same as current functionality or no-ops. That is where your custom code will go. This concept pretty much rules out the UI layer complexity representing rules, replaced by creating dead simple plugin code. So with that in mind...

Your first question... (1.)

We have the intelligent import autofill which changes transaction attributes, and binding rules that bind based on transaction attributes. So the transaction attributes need to be changed before the binding rules take effect. And if the transaction attributes were incorrectly changed, the binding rules will need another try at the corrected transaction.

Let's say I had an import of my credit card statement. I have a binding rule that says: amounts from checking to pay my credit card on the same date in the same amount should always bind to the similar one from my checking account. I only pay my cards from my checking account. My credit card statements represents these transfers as "Date Posted, Payment - Thank you, (transaction ID), -400.23". I don't see it bind right away because the transfer isn't correct. I change it from "Payment - Thank you" to "Checking". This edit needs to go back to the intelligent import autofill so it remembers that change for the future. The binding rules now have the correct account "Checking" for the transfer, and bind to the transaction from my bank checking account statement that already existed from a prior input.

Note that now that I read it, is this binding? To me, it means the other side of a transaction between two accounts as well as previously imported transactions from a file with an overlapping date range as the one currently being imported. I think that's correct.

Your second question... (2.)

So we want to provide this as an option for plugin developers. So I could see someone developing an "always apply" plugin that the plugin developer tells users to go to their data directory and edit the CSV file where the first column is original descriptions and the right is replacements. This might not work for every case, but it's the first example I could think of where I would want to have some kind of file. When I do the intelligent import autofill, I'd expect that I'd like to take a hard look at customizations available there as well. But it's for whatever data that the plugin developer wants to store between application runs.

Your third question... (3.) That's my sloppy writing... it's "overrides w/ user data" which is my shorthand for "overrides with user data".

I don't know if you need to review the commit, but feel free. It was just a clean up of the original commit and making the tests work. I did change the per-index name replacements to a total refresh of the names for all actions per your one comment. I plan on going through your reviews that are cosmetic or "good design or code style" and implement them together to minimize your time.

brownnrl added a commit to brownnrl/moneyguru that referenced this issue Dec 12, 2014

(hsoft#26) Intelligent Import Autofill
Started some experimentation on making the import table editable by
assigning each pane it's own document object (with a few minor tweaks).

This also gives import action plugins a document object through which
they can make their changes before it is progressed onto the working
document.

brownnrl added a commit to brownnrl/moneyguru that referenced this issue Jan 7, 2015

(hsoft#26) Intelligent Import Autofill
This will likely be the last commit on this prototype branch before
composing the demonstration and choosing a way forward.  Added an
``ExampleRuleAutofill`` plugin which shows feasibility for a statistics
based plugin.
@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 7, 2015

I'm freezing the issue_26 prototype branch of my fork where it is warts and all, and will be using it to construct a short video (~5-10 minute) demonstration of the highlights.

Feel free to pull it down on your end and experiment under the caveat that it's alpha and backed by minimal if any new-feature tests (though the current test suite passes) and applicable only to the Qt end.

My plan is to present the new features in the video demo and discuss the prototype implementation, and we can negotiate which features and implementation details are a "go" or "no-go" and break them into individual chunks. I'll provide a suggested list in a yes/no format so you don't need to spend a lot of time on that.

After we're agreed about what works and won't work, I'll create a new branch off of develop and use a test-first migration of the prototype code in a feature-by-feature path that we have negotiated. For example, converting the existing swaps into import action plugins first, then one new plugin at a time, ensure the Cocoa side matches functionality, next plugin, etc.

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 8, 2015

I edited the wiki page to include a breakdown of what's required feature by feature. Note that I still owe you and will do a video, but I wanted to create the document for my own purposes and to organize my thoughts.

The first feature is blocking, and would need to be done for each subsequent feature. But each subsequent feature can be implemented independently of the others.

  • ImportActionPlugin replacement of existing five actions (involves BaseDocument / ImportDocument creation)
  • Allow import actions to be performed on individually selected rows.
  • Highlight changes to entries, revert highlighted changes
  • Make the import table editable
  • Implement transaction matching based on similarity (fuzzy matching)
  • Implement autofill of transfers as an import action plugin

The first would have to be a "yes" to proceed with the others, but the remaining can be a "yes"/"no". I'd prefer to implement each, but won't do it if you don't think there is a prospect of it being merged or want me to focus on one over the others then I'll work within that.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jan 10, 2015

I don't have much to add, it's a good plan. Of course, if you're up to the task, I have no objection to you implementing the bulk of it. The tricky part will be the Document refactoring which will need extensive review.

As I wrote earlier, I'm more than willing to help you out, so if you end up stranded, let me know.

Also, of course, there's a prospect of your code being merged. These new features in your proposal look really interesting.

As for the UUID proposal, I'm not very warm to it. I see it as a serialization tool more than anything else and I wouldn't add it to a plugin API without having considered alternatives first.

However, after having read your wiki page and looked at the code in import_window, I see the problem. Unfortunately, this code was never written with "re-cooking" in mind. There are two possibilities I see right off the bat, the first one sounding better than the second to me:

  1. Refer to Split instances rather than Entry. They are stable and are not regenerated upon cooking. There's possibly complications coming with this and the matches list in AccountPane will probably get a bit more complex, but in my head, it sounds feasible.
  2. Try to reuse Entry instances during cooking. Possibly simpler than point 1, but in my head, there's a red light flashing telling me it sounds like a bug-inducing idea.
@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 11, 2015

  1. I thought the same thing, and tried it. Can't refer to the split
    instance directly because if you change something about the transaction
    (say a check number or swap the description/payee) through the document
    then it eventually calls from_transaction in the Transaction class. This
    shallow copies the entire splits list and your match is invalid. I had
    some luck with using a transaction instance and a numeric split index tuple
    as a key to the entry match. But the use of a uuid that identifies the
    split after the shallow copy seemed more reliable. If someone changes the
    structure of the split, then that uuid would regenerate with the creation
    of the new split instance. It's a read only attribute created only at
    construction so analogous to the identity returned from id().
  2. I wanted to stay away from the internals of the oven. That would be my
    worry as well.
    On Jan 10, 2015 2:40 PM, "Virgil Dupras" notifications@github.com wrote:

I don't have much to add, it's a good plan. Of course, if you're up to the
task, I have no objection to you implementing the bulk of it. The tricky
part will be the Document refactoring which will need extensive review.

As I wrote earlier, I'm more than willing to help you out, so if you end
up stranded, let me know.

Also, of course, there's a prospect of your code being merge. These new
features in your proposal look really interesting.

As for the UUID proposal, I'm not very warm to it. I see it as a
serialization tool more than anything else and I wouldn't add it to a
plugin API without having considered alternatives first.

However, after having read your wiki page and looked at the code in
import_window, I see the problem. Unfortunately, this code was never
written with "re-cooking" in mind. There are two possibilities I see right
off the bat, the first one sounding better than the second to me:

  1. Refer to Split instances rather than Entry. They are stable and are
    not regenerated upon cooking. There's possibly complications coming with
    this and the matches list in AccountPane will probably get a bit more
    complex, but in my head, it sounds feasible.
  2. Try to reuse Entry instances during cooking. Possibly simpler than
    point 1, but in my head, there's a red light flashing telling me it sounds
    like a bug-inducing idea.


Reply to this email directly or view it on GitHub
#26 (comment).

@hsoft

This comment has been minimized.

Owner

hsoft commented Jan 11, 2015

Ok, then let's go for a UUID, but let's not expose it directly and simply override Entry.__eq__() so it compares UUID. Thus, we would do if entry1 == entry2 rather than if entry1 is entry2.

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 11, 2015

Will do.
On Jan 11, 2015 9:56 AM, "Virgil Dupras" notifications@github.com wrote:

Ok, then let's go for a UUID, but let's not expose it directly and simply
override Entry.eq() so it compares UUID. Thus, we would do if entry1
== entry2 rather than if entry1 is entry2.


Reply to this email directly or view it on GitHub
#26 (comment).

brownnrl added a commit to brownnrl/moneyguru that referenced this issue Jan 13, 2015

(hsoft#26) Intelligent Import Autofill
Starting with some tests for import actions as part of the
first feature addition described in the wiki.
@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 15, 2015

I prototyped the equality overrides between entries I think it will work out well.

I've also been doing a refactor of Document into a BaseDocument (first action of the first task in the wiki). The BaseDocument holds methods where model data is actually modified, while the Document class does actual UI communication (such as querying for global scope, async autosave, dirty flags, etc) and handles the undo/redo actions. In some instances it was difficult to separate the two (sometimes an action is modified between its creation point and the cook in a conditional way so no way to move it), so protected methods with optional parameters are used to pass actions to the BaseDocument when there is no other alternative. This strategy is also used to pass lists that are common between the actions and the actual data modifications.

The intent is that the BaseDocument can be used absent a UI (so from a python script, in plugins, or as part of a general API). When complete, it will be a first pass for you to either modify or approve to reduce work on your end. I haven't run into any specific issues or concerns so far, but if you have concerns about the approach or if you think I'm tracking in the wrong direction I'm open to suggestions. Or if you'd rather do it as well, that's fine. I'll continue otherwise, left to do is schedules, save/load from xml, import entries, etc.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jan 17, 2015

I'm sorry to say we might be on a false start regarding the Document refactoring. The limit to what constitute BaseDocument, IMO, shouldn't be related to UI/not-UI, but rather "the minimum needed by both Document and ImportDocument".

The approach you seem to have taken is to rename Document to BaseDocument and then proceeded to bubble up features into a newly created Document class, but I think it should have been the other way around. What I see in the current code is a BaseDocument with too much stuff in it. This is the list of what I think is not needed in ImportDocument, but feel free to correct me if I'm wrong (I haven't experimented a lot with your new import UI features):

  • Being a Repeater subclass. We don't need to repeat notifications from self.app because no Listener connect to our ImportDocument.
  • schedules and budgets
  • all document properties (ahead months etc.)
  • date range mechanics
  • global scope querying (this is only needed for schedule spawn editing)
  • account exclusion mechanics
@hsoft

This comment has been minimized.

Owner

hsoft commented Jan 17, 2015

("false start" may be too strong a term. bubbling up features to Document will eventually work, it will just be more work, in the end, than having went the other way around)

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 18, 2015

Well, the refactor is pretty much done right now (though I'm open to redoing it). See the next comment with the video + gist.

This is the way that I see the code base, or at least my impression of it.

The Document class has multiple purposes. The first purpose is to serve as an API for the model, where it is very difficult if not impossible to perform manipulations on the core model data without duplicating the same code which is in the Document. That is the kind of thing that your plugin developers will want access to.

It also has things the user cares about. Undo, redo, save points, communication with windows and views, autosaving, stopping of edition, and querying for global scope.

The attempt at the refactor was to separate these purposes. In the import window, there isn't anyway to undo/redo right now. Also if perform actions through the document class a modest number of times, it records every transaction change each time (which is a waste of resources in my mind). So you wouldn't want your ImportDocument to have access to those things. But you also wouldn't want to limit the plugin developer's freedom in creating unique plugins by restricting their ability to create splits, modify entries, etc by restricting them to a set number of calls available in a BaseDocument.

So that was my mindset when I refactored. Also, I wanted to give some talking points as well, and I didn't have a clear direction as to what you specifically would want to see. I'm completely open to suggestions about alternative ways to refactor.

Discussing some of your points:

  • Repeater subclass: I thought there might be some instances of plugins that would be interested in when transactions or other attributes were modified in their respective documents.
  • schedules and budgets: I had always wondered why you weren't able to import these from other moneyguru documents... but I see your point.
  • document properties: These values are somewhat fixed, but default_currency is a property which is valid in the ImportDocument.
  • date range mechanics: I would agree, you basically just want the "All transactions date range"
  • global scope querying: My thought was that a plugin developer could set a state variable indicating global state was True or False
  • account exclusion mechanics: Yeah, probably of no use here. Again, I did it with the thought that someone might want to use BaseDocument solely as a more general manipulation API.
@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 18, 2015

I made a short video illustrating the benefits of just the base functionality.

https://www.youtube.com/watch?v=Bhv6yL40quA&feature=youtu.be

I also created a gist with the plugin which is showcased.

https://gist.github.com/brownnrl/27b04ac4c40d5b3fe190

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 18, 2015

Also if you wanted to draw down the issue_26_base branch, checkout document.py back to develop, and refactor to your ideal state I would be OK with that. Just make sure the ImportDocument class is similar to its current state and the new tests pass.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jan 18, 2015

I think I will, I'm rather uncomfortable with letting this much go into BaseDocument. But not before justifying myself :)

YAGNI

The base principle here is that You Ain't Gonna Need It, which is something that I try to follow and a method of development that is particulartly suitable with high level tests. I'm really afraid of over-engineering.

However, I also recognise the new "unknown possibilities" principle introduced with plugins and I follow you on that, so I'm willing to step a bit on the YAGNI side.

However, if I can prove that something cannot possibly be used by a plugin, I hope that we'll agree to keep it out of BaseDocument.

Repeater

The Repeater is not really there in Document for convenience, but out of necessity because Listener can only listen to one Broadcaster (early design decision). So for anyone to listen to both notifications from Application and Document, repeating is needed.

In the case of ImportDocument, it doesn't emit notifications by itself (yet). There is no reason to keep the repeater there. If a plugin really wants to listen to Application, it can connect directly to it.

If ImportDocument ever starts emitting its own notifications, then we'll bring Repeater back.

Schedules and budgets

Maybe that someday we will allow the import of schedules and budgets (but please, let it be in the scope of another ticket :) ), but for now, it's impossible. So even if a plugin added budgets to our ImportDocument, it would be useless because they would not end up being imported.

Properties

The problem with properties is that they shouldn't live in the ImportDocument by themselves, but rather only be proxies of a Document instance, which is where they really live. When you open a Document Properties tab in the app, you don't edit the properties of an ImportDocument, you edit the properties of a Document.

So maybe that we can keep some of those properties in BaseDocument (well, probably only default_currency), but we'll definitely have to remove the properties save/restore routines from there because they really don't belong in ImportDocument.

Also, we should probably initialize ImportDocument from a Document instance rather than an Application instance. This way, it will be able to fetch its base properties (well, probably only default_currency) from it.

Date range

If we don't use it, we might as well not have the code in there. It's also faster because even with the "All transaction" range, there's always an analysis of all transactions going on at each change to re-evaluate the bounds of the current date range (yes, even with all transactions, we need an upper bound, otherwise, we could have tables with an infitity of schedule spawns. It's max(transaction.date) + ahead_months.

Global scope

That flag is there only for schedule spawn editing. If we don't have schedules in our ImportDocument, that flag is useless.

Account exclusion

Easy one, not possibly useful in any way in ImportDocument.

Next step

As I offered earlier, I really don't mind doing it. I don't think I'll be able to finish it today, but I'll try not to slow you down too much.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jan 18, 2015

About the Repeater thing: I constantly forget that we'll need notifications for when we bring in transaction editing from within the Import Window (that is in fact the whole point of ImportDocument)!

So yes, we'll end up subclassing Repeater. But since the issue_26_base branch is about enabling plugins, can we postpone that part until we're actually merging transaction editing?

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 18, 2015

Yes, so my plan is that I am going to close my modifications of issue_26_base, and create a branch forking from issue_26_base for the independent additional features as described in the wiki. I'll take any additional changes you make (maybe try to keep them to the Document/BaseDocument/ImportDocument classes?) and I'll merge them back into the additional feature branches when you feel it's at a good point. If there are any conflicts I'll try to resolve them on my end, or get your input. This way we can still work in parallel and not blocking.

We'll at the very least need change_transaction and change_entry, and possibly change_transactions in the the ImportDocument. We will also need any dependencies those methods have, and of course the core structures accounts, transactions, oven but not schedules/budgets?.

We can postpone the Repeater subclassing until it's needed for editing.

I would like to get the import work wrapped up within the next month Feb timeframe?

hsoft added a commit that referenced this issue Jan 18, 2015

Extract BaseDocument from Document in preparation for #26
The role of `BaseDocument` is to be the minimal common denominator for
`Document` and the upcoming `ImportDocument` from #26

`BaseDocument` only has transactions/entries related public methods and
doesn't (yet) broadcast any notification.

The extraction isn't 100% clean, as we can see in the comments I've
added: I couldn't figure out a way to entirely break free of any
schedule spawn related code. That's why we end up propagating
`global_scope` everywhere. It's not very pretty, but it's the best I
could come with.

Tox is still happy.

The next step will be to start merging this branch with `issue_26_base`
from @brownnrl and see if everything happens happily.
@hsoft

This comment has been minimized.

Owner

hsoft commented Jan 18, 2015

Synchronized! I was finishing up my first commit as you commented. My plan was actually to merge my work with your issue_26_base branch. If you prefer to do it, I don't object.

I think that my commit referenced above satisfies the conditions in your previous comment.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jan 18, 2015

There's the docstring matter that I pushed away. In your branch, you added @samedoc. I thought it would be best to see how the final result looks like to see how we handle where docstrings live in this new world.

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jan 18, 2015

Looks good, I will try to merge and test on my end tomorrow if I have time, but definitely within the week. Can we start a PR for issue_26_base into hsoft/develop just after that? Or would you want to wait for the other additional features?

@hsoft

This comment has been minimized.

Owner

hsoft commented Jan 19, 2015

Yeah, we can open a PR after that. We will end up merging it all, it's just that I prefer to review one chunk at a time.

brownnrl added a commit to brownnrl/moneyguru that referenced this issue Jan 20, 2015

(hsoft#26) Intelligent Import Autofill
First pass at defining a simple test and some logic around the
import autofill.

brownnrl added a commit to brownnrl/moneyguru that referenced this issue Mar 15, 2015

(hsoft#420) Issue hsoft#26 base feature implementation.
Added a good bit of comments and documentation to the import_window file.
@hsoft

This comment has been minimized.

Owner

hsoft commented Jun 24, 2015

Time flies and I'd like to release a v2.9 soon (to test that big refactoring in the wild!), which would make this ticket out of scope for this release. I hope you don't mind @brownnrl.

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jun 26, 2015

Sorry I've been MIA on this, I feel bad about it. Been really busy for the past 3 months selling our old home and buying a new one (which MG was very, very helpful with if not crucial for figuring out what we can afford based on our worth). Pretty much sucked up all my time outside of work. We're moved in, but I've been trying to get the home network set up / electrical / etc. When I can get some stable free time, hope to get back to committing soon.

@hsoft

This comment has been minimized.

Owner

hsoft commented Jun 27, 2015

No problem at all, I understand completely and I'm not trying to pressure you! I just wanted to make sure that you were OK with me taking over some ticket that we had been reserving for you.

@brownnrl

This comment has been minimized.

Contributor

brownnrl commented Jun 28, 2015

No issues, I am not one to stand in the way of progress. I just got my home PC out of the box. Not the fun unboxing, the unboxing having previously boxed unboxing. :) So it may still be a bit before I'm back to MG contributions, hopefully soon, but probably not before you'll put out the 2.9 release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment