False predictions when importing several files at once #77
Comments
Hi, I think that what you want to achieve is to apply the decorator to your importer class. For example, before:
class Importer(importer.ImporterProtocol):
    # ...
...and after applying the decorator:
@PredictPostings()
class Importer(importer.ImporterProtocol):
    # ...
That's it. Note: You may prefer alternative methods of applying the decorators, so that you can still unit-test the undecorated importer classes. But the simple solution above is sufficient to get you started.
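One way to read the note about alternative methods is to call the decorator as a plain function instead of using the `@` syntax, so that the undecorated class remains available for unit tests. The sketch below illustrates only that pattern. `predict_postings` and `PlainImporter` are hypothetical stand-ins written for this example; they are not the smart_importer or beancount APIs.

```python
# Toy stand-in for a class decorator such as PredictPostings.
# ASSUMPTION: this is NOT the real smart_importer implementation; it only
# wraps `extract` and tags each entry so the effect is visible.
def predict_postings():
    def decorate(cls):
        original_extract = cls.extract

        class Decorated(cls):
            def extract(self, file, existing_entries=None):
                entries = original_extract(self, file, existing_entries)
                return [f"{entry} [predicted]" for entry in entries]

        Decorated.__name__ = cls.__name__
        return Decorated

    return decorate


class PlainImporter:
    """Hypothetical undecorated importer: easy to unit-test on its own."""

    def extract(self, file, existing_entries=None):
        return [f"txn from {file}"]


# Apply the decorator by calling it, instead of writing @predict_postings()
# above the class, so both the plain and the decorated class stay available.
SmartImporter = predict_postings()(PlainImporter)

plain_result = PlainImporter().extract("a.csv")
smart_result = SmartImporter().extract("a.csv")
```

With this layout, tests can target `PlainImporter` directly, while the import configuration would use `SmartImporter`.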
Thank you for your reply. Do I need to use any additional commands (to train the model)? Now I am using
No additional commands needed. Data from Technically, the decorator wraps the importer's
Did it work? Can we close this issue?
I've made it work. Surprisingly, for some import files it works perfectly, but for others not at all; I provide examples below. I am still getting a warning, and I am not sure whether it's an important one:
I have 5 Chase import files and 2 PayPal import files: https://puu.sh/C31l7/851b3a0da3.png So basically, in my case, pretty much all predictions are right in some import files and wrong in others. The "wrongness" is that the model puts more accounts into the transactions. Here is an example where pretty much all transaction predictions for the import file have 4 accounts: https://puu.sh/C32nU/7e18db510f.png
It seems I found the pattern: the first file the importer processes gets very accurate predictions. Then predictions for ChaseXXX1_Activity_20181115.CSV will be correct, and for all other files incorrect, including ChaseXXX2_Activity_20181115.CSV. But if I delete ChaseXXX1_Activity_20181115.CSV, so that we now have 4 files, then predictions for ChaseXXX2_Activity_20181115.CSV will be correct, and for the others incorrect. If we use several importers and have the following import files: then predictions for ChaseXXX1_Activity_20181115.CSV and PaypalAT.CSV will be correct, and for all others incorrect. Could you suggest what the problem is and how it could be solved? As per your suggestions, I've applied the smart importers like this
Hm, difficult to say.
I can, for now, only guess, but one idea for an explanation is this: Is it possible that your program (beancount or fava) re-uses importer instances when it is told to import several files? Such re-use would make perfect sense for regular importers, but smart importers could end up using false training data. EDIT, Note:
1. "What do you mean by saying the predictions are incorrect? In which way are they incorrect?" They are completely off. Here is an example: if the correct transaction is
then the prediction can be
2. "Are there any differences between the CSV files, regarding their content (e.g., are they for the same account or for different accounts), regarding what training data should be used, and regarding your expectation about correct vs. incorrect predictions?" The files are pretty much the same; it is the order in which they appear in the downloads folder that matters.
3. "How do you start the import, e.g., through beancount's commandline api or through fava? When you start the import, do you tell the program to import several files at once?" I am using the command line, for example: "do you tell the program to import several files at once?"
4. "I can, for now, only guess, but one idea for an explanation is this: Is it possible that your program (beancount or fava) re-uses importer instances when it is told to import several files? Such re-use would make perfect sense for regular importers, but smart importers could end up using false training data." That's what I think too. It works correctly when I place 1 file in the downloads folder. I just wanted to make it work with all 7 files, but it's OK. Not a big deal, I will just import them 1 at a time.
Thank you for sharing this information. I think we have now sufficiently narrowed down the problem: False predictions when importing several files at once. Next steps: This will need some debugging to confirm the suspicion that importer instances are cached and re-used, which would lead to false training data being used for the predictions.
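To make the suspicion concrete, here is a toy model of how a re-used instance could produce exactly the reported pattern (first file right, all later files wrong). It is only an illustration of the suspected failure mode, which is still unconfirmed at this point in the thread. `CachingImporter`, the file names, and the account names are all hypothetical, and the "training" is reduced to a dictionary lookup.

```python
class CachingImporter:
    """Hypothetical importer that 'trains' once and keeps the result.

    ASSUMPTION: this mimics the suspected bug only; it is not taken from
    smart_importer, beancount, or fava.
    """

    def __init__(self):
        self._model = None  # filled on the first call, never refreshed

    def extract(self, file, training_data):
        if self._model is None:
            # "Training" happens only for the first file this instance sees.
            self._model = training_data[file]
        # Every later file is predicted with the first file's model.
        return self._model


FILES = ["chase1.csv", "chase2.csv", "paypal.csv"]
TRAINING = {
    "chase1.csv": "Assets:Chase1",
    "chase2.csv": "Assets:Chase2",
    "paypal.csv": "Assets:Paypal",
}

# One shared instance for all files: only the first prediction is right.
shared = CachingImporter()
reused_predictions = [shared.extract(f, TRAINING) for f in FILES]

# A fresh instance per file: every prediction uses matching training data.
fresh_predictions = [CachingImporter().extract(f, TRAINING) for f in FILES]
```

If the real cause is similar, importing one file at a time would work around it, because each run starts with a fresh instance.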
@johannesjh Hi, so it should work correctly using hooks? Could you please explain how to apply them (hooks) to my sample file described in the OP, #77 (comment)?
@gety9: Instead of applying the decorators to the importer classes, you should apply the hooks to importer instances, as outlined in the README.
...for your import configuration, this roughly translates to:
chase_importer = chase.Importer(...)
paypal_importer = paypal.Importer(...)
CONFIG = [
    apply_hooks(chase_importer, [PredictPostings(), PredictPayees()]),
    apply_hooks(paypal_importer, [PredictPostings(), PredictPayees()])
]
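For readers wondering what it means to apply hooks to an instance rather than decorate a class, the following is a minimal sketch of the general idea: the instance's `extract` method is replaced by a wrapper that runs each hook on the extracted entries. `toy_apply_hooks`, `ToyImporter`, and the two tagging hooks are hypothetical names for this example, and the hook signature is an assumption; the README remains the reference for the real `apply_hooks`.

```python
def toy_apply_hooks(importer, hooks):
    """Wrap one importer INSTANCE so that hooks post-process its entries.

    ASSUMPTION: simplified illustration, not smart_importer's apply_hooks.
    """
    original_extract = importer.extract  # bound method of this instance

    def extract_with_hooks(file, existing_entries=None):
        entries = original_extract(file, existing_entries)
        for hook in hooks:
            entries = hook(file, entries, existing_entries)
        return entries

    # Only this instance is patched; the class stays unchanged.
    importer.extract = extract_with_hooks
    return importer


class ToyImporter:
    """Hypothetical importer used only for this demonstration."""

    def extract(self, file, existing_entries=None):
        return [f"txn from {file}"]


def tag_postings(file, entries, existing_entries):
    return [f"{entry} +postings" for entry in entries]


def tag_payees(file, entries, existing_entries):
    return [f"{entry} +payee" for entry in entries]


hooked = toy_apply_hooks(ToyImporter(), [tag_postings, tag_payees])
untouched = ToyImporter()

hooked_result = hooked.extract("chase.csv")
untouched_result = untouched.extract("chase.csv")
```

Because the wrapper lives on the instance, other instances of the same class keep their plain behaviour, which also keeps the importer class itself easy to test.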
@johannesjh @yagebu thank you guys!
Guys, hi,
It's more a question on usage than an issue; I hope you can explain it to me. I've read the Quick Start and the Documentation, but since I am using several importers and one of them has 2 "modes" (credit card and checking), I can't understand how to apply the directions provided.
I have the following folder structure:
at.beancount looks like this, /paypal/__init__.py and /chase/__init__.py like this.
Using
bean-extract -e at.beancount at.import ../Downloads/ > temp.beancount
gives me a temp.beancount file similar to this. Then I manually put in the correct accounts and get this.
I'd like to automate this last manual part with smart_importer. As far as I understand, I don't need @PredictPayees(), but only @PredictPostings(). But I can't understand in which importer file to insert them (in at.import, or in /chase/__init__.py and /paypal/__init__.py) and where exactly :) A Python programmer helped me with the importers, but he is not available now, so I have to figure it out on my own.