code changes via PM from u/fio_smiles #26

torbengb · 2017-10-20T10:21:16Z

code changes via PM from u/fio_smiles to address issue #25: "I don't know if that will work with Python 3+". Let's review and test it, see if we can build on it.
I will test it on my Linux tonight.

"I don't know if that will work with Python 3+" Let's review and test it, see if we can build on it.

nocalla

What I'd consider here is doing a Python version check before each of the relevant steps and only then doing what that version needs. Wasn't there another error you were getting? We also need to check which version of the config parser to import before importing it.

nocalla · 2017-10-20T13:31:20Z

PM him back and get him to join Github so we can credit him properly!

nocalla · 2017-10-20T13:31:57Z

Or her! Accidental sexism, sorry!

torbengb · 2017-10-20T14:19:27Z

I'm working on it via PM :-) S/he is already intrigued by GitHub and might sign up. Would be cool!

torbengb · 2017-10-20T21:16:18Z

Okay I'm having trouble running this modified code. As-is I got this error:

Traceback (most recent call last):
  File "newtest.py", line 27, in <module>
    from future import unicode_literals # issue #25
ImportError: No module named future

So I tried installing future with the usual command sudo python -m pip install future but then it said:

Could not import setuptools which is required to install from a source distribution.
Please install setuptools.

So I did: sudo python -m pip install setuptools followed by sudo python -m pip install future. Then I got:

Traceback (most recent call last):
  File "newtest.py", line 27, in <module>
    from future import unicode_literals # issue #25
ImportError: cannot import name unicode_literals

I don't know what unicode_literals is but I tried installing that like above, but failed ("No matching distribution found for unicode_literals").

Without a better understanding of Python, I'm going to have to pass on this code change. Sorry.

nocalla · 2017-10-20T21:19:33Z

I'll see if I can work it out over the weekend.

fiosmiles · 2017-10-21T04:41:26Z

She :)
Not a developer and just hacked about with this to make it work, but I'll take a look next week and see if I can help.

It's future which may have been lost in formatting, see this link:
http://python-future.org/unicode_literals.html

fiosmiles · 2017-10-21T04:42:29Z

Ahhh, sorry, the automatic formatting is ruining it. There are two underscores before and after the word future; if you follow the link it should make more sense.
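To make the spelling concrete: the import needs the double underscores, because `__future__` is part of Python itself, whereas `future` is a separate package on PyPI that has no `unicode_literals` name to import. The sketch below compiles two made-up source snippets to show the correct spelling and, as an aside, the rule that the import has to come first in the file. The helper name `compiles` is hypothetical and not part of bank2ynab.

```python
# Two tiny example modules held as strings, so they can be compiled safely here.
GOOD = 'from __future__ import unicode_literals\nvalue = "ä"\n'
BAD = 'value = "ä"\nfrom __future__ import unicode_literals\n'


def compiles(source):
    """Return True if the source text compiles, False on a SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(GOOD))  # the import is the first statement
print(compiles(BAD))   # the import comes after other code
```

In a real script the line is simply written once, above every other import.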

torbengb · 2017-10-21T09:43:04Z

Welcome @fiosmiles!! Great to see you on board!

I've added the underscores (see new commit linked above) and now there's just a single issue left, but I can't figure out how to solve it:

  File "bank2ynab.py", line 81, in clean_data
    transaction_reader = csv.reader(transaction_file, delimiter = delim)
TypeError: "delimiter" must be string, not unicode

I've tried appending .encode('utf-8') or .decode('utf-8') to the assignment a few lines above, but it doesn't seem to make a difference. The specific line being referenced puzzles me because there's nothing in there:

 output_data = []

That's when I force the script to run as Python 2 (using the Linux command python2 bank2ynab.py). When I run it as Python 3 (python3 bank2ynab.py), it works fine.
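One possible way around this error, sketched under assumptions: with unicode_literals active on Python 2, a literal such as ";" is a unicode object, and the Python 2 csv module wants a plain byte string for the delimiter. Wrapping the value in str() gives the native string type on either version, as long as the delimiter is ASCII. The helper name `native_delimiter` and the sample row are made up for illustration, and the demonstration lines at the bottom are written for Python 3.

```python
import csv
import io


def native_delimiter(delim):
    """Return delim as the interpreter's native str type.

    On Python 2 this turns a unicode delimiter into a byte string
    (ASCII delimiters only); on Python 3 a str stays a str.
    """
    return str(delim)


# Demonstration (Python 3): parse one semicolon-separated line.
sample = io.StringIO("2017-10-21;Bäckerei;-3.50\n")
rows = list(csv.reader(sample, delimiter=native_delimiter(";")))
print(rows)
```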

nocalla · 2017-10-21T11:39:21Z

I haven't played with this yet. That future thing is interesting!

Edit: I see what you meant about auto formatting now @fiosmiles!

Does the CSV without the newline output correctly on python 3? The reason I had it was because I was getting extra blank rows between each row without.

torbengb · 2017-10-21T12:00:46Z

Does the CSV without the newline output correctly on python 3?

Yes it does. I think the "wb" setting in line 125 did the trick.

nocalla · 2017-10-21T12:28:26Z

I think I ran into trouble with that on Windows. Will report back later.

nocalla · 2017-10-21T18:43:22Z

I'm getting errors.
For the write_data change to "wb":
Traceback (most recent call last):
  File "bank2ynab.py", line 174, in <module>
    main()
  File "bank2ynab.py", line 167, in main
    write_data(file, output)
  File "bank2ynab.py", line 129, in write_data
    writer.writerow(row)
TypeError: a bytes-like object is required, not 'str'

And for encoding the delimiter:
Traceback (most recent call last):
  File "bank2ynab.py", line 174, in <module>
    main()
  File "bank2ynab.py", line 166, in main
    output = clean_data(file)
  File "bank2ynab.py", line 84, in clean_data
    transaction_reader = csv.reader(transaction_file, delimiter = delim)
TypeError: "delimiter" must be string, not bytes
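The first error is what Python 3 raises when csv writes text to a file opened with "wb". Python 2's csv module wants a binary file, while Python 3's wants a text file opened with newline="". A minimal sketch of a version switch follows; the helper name `open_csv_for_writing`, the UTF-8 encoding and the sample rows are assumptions, not the project's actual code.

```python
import csv
import os
import sys
import tempfile


def open_csv_for_writing(path):
    """Open path the way the csv module expects on the running interpreter."""
    if sys.version_info[0] >= 3:
        # Python 3: text mode; newline="" avoids extra blank rows on Windows.
        return open(path, "w", newline="", encoding="utf-8")
    # Python 2: binary mode; csv writes byte strings and its own line endings.
    return open(path, "wb")


# Demonstration with a throwaway file.
path = os.path.join(tempfile.mkdtemp(), "out.csv")
with open_csv_for_writing(path) as handle:
    writer = csv.writer(handle)
    writer.writerow(["Date", "Payee", "Amount"])
    writer.writerow(["2017-10-21", "Example", "-3.50"])
```

The newline="" argument is also the documented way to avoid the extra blank rows between records mentioned earlier in the thread.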

fiosmiles · 2017-10-21T22:28:45Z

@nocalla is there a sample file that everyone is using? If so, can you point me at it and I'll try that file, or if not can you share your file or some version of it and I'll try it on my end.

fiosmiles · 2017-10-21T22:31:43Z

Also, this is something I was looking at when I hacked this to work for me on 2.7. Maybe the switch is worth implementing if it's just not compatible with 3+.

https://www.reddit.com/r/ynab/comments/77d7rt/tools_bank2ynab_heres_a_script_that_helps_with/

fiosmiles · 2017-10-21T22:32:35Z

Wrong link, correction: https://stackoverflow.com/questions/29840849/writing-a-csv-file-in-python-that-works-for-both-python-2-7-and-python-3-3-in

nocalla · 2017-10-21T22:53:45Z

There isn't a test file. We should definitely make one that has umlauts and ends with a newline! I'll have a read of the link and see what I can make of it, if I can work out what's going wrong on my end.

torbengb · 2017-10-21T23:09:43Z

Here's a test file, feel free to add more in that folder: https://github.com/torbengb/bank2ynab/tree/master/test-data
I hope I anonymized it sufficiently :-)

nocalla · 2017-10-22T08:42:41Z

I like this solution for the CSV issue.

nocalla · 2017-10-22T09:14:19Z

Also, we have to account for the renamed configparser! I think both of you have installed the Python 3 version as a workaround, but we should make it dependent on the pre-installed modules only. I think the best way to do that is import ConfigParser as configparser near the top of the code!

Also remove the delimiter encoding, as it shouldn't be needed with unicode_literals and was causing me errors. Also move the __future__ import to the top of the file. These changes prevent the following errors:
SyntaxError: from __future__ imports must occur at the beginning of the file
TypeError: "delimiter" must be string, not bytes

Add python version check for csv writing

If running 2.x, import ConfigParser but rename it to configparser. I don't think this should cause any issues with the features we're using of the config parser.
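The rename described here can be sketched as a version check around the import, so that the rest of the script only ever refers to `configparser`. This is an illustration rather than the committed code, and the section name is made up.

```python
import sys

if sys.version_info[0] >= 3:
    import configparser
else:
    # Python 2 ships the same module under a capitalised name.
    import ConfigParser as configparser

# From here on the code can say "configparser" on either version.
config = configparser.RawConfigParser()
config.add_section("Example Bank")  # made-up section name
print(config.sections())
```

The alias only unifies the module name. Methods that exist on one version only still need their own handling.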

Add support for 2.x version of configparser

nocalla · 2017-10-22T09:38:18Z

This all works fine at my end, but I need @fiosmiles and @torbengb to test it out, because I know it won't work properly at your end! If we can pinpoint the errors, then we can track down what to focus on, so please paste your lovely exceptions.

I like double quotes

torbengb · 2017-10-22T13:10:25Z

I'll test your code on my Linux (tonight?) and reply then.

torbengb · 2017-10-22T19:16:17Z

Sorry to bring bad news, but it doesn't run against my normal CSV file (very similar to the uploaded test file).

With Python 2 I get this error:

Traceback (most recent call last):
  (...)
  File "bank2ynab.py", line 41, in get_configs
    config.read(conf_files, encoding = "utf-8")
TypeError: read() got an unexpected keyword argument 'encoding'

And with Python 3 I get this error:

Traceback (most recent call last):
  (...)
  File "bank2ynab.py", line 89, in clean_data
    transaction_data = list(transaction_reader)
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 541: invalid continuation byte

I then tried a search&replace for all the German umlaut characters in both lowercase&uppercase, but I still got the errors. I noted that the VISA CSV export has a LF line ending while the bank CSV export has CR+LF line endings. This is from the same bank, from the same online banking portal. I didn't try messing around with these line endings though.

I then used Notepad++ to convert the CSV file to ANSI file format. Note that I only converted the bank CSV and not the VISA CSV. After that, the error in Python 2 remains but the script ran successfully in Python 3!
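For reference, byte 0xe4 is how ISO-8859-1 (Latin-1) and Windows-1252 encode "ä", and on its own it is not valid UTF-8. The small sketch below shows the difference with a made-up payee name; it does not touch the real bank files.

```python
# "Bäckerei" as a Latin-1 / Windows-1252 byte string (made-up sample value).
raw = b"B\xe4ckerei"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as error:
    # Same exception type as in the traceback above.
    utf8_ok = False
    print(error)

# Decoding with the single-byte charset recovers the umlaut.
decoded = raw.decode("latin-1")
print(decoded)
```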

I can't make much out of this but I'm sure a detailed feedback might help you identify something I've missed. Mail me at torben@g-b.dk and I can email you the actual files back. I don't want to post them here.

Greetings from rainy Austria,
Ben

nocalla · 2017-10-23T21:39:29Z

After merging issue #29 into this, do you still get the same error here, @torbengb?

torbengb · 2017-10-24T13:33:00Z

@nocalla I will test on my home Linux machine tonight and respond here.

torbengb · 2017-10-24T15:43:08Z

Test result with Python3: success! No errors found.

Test result with Python2:

Traceback (most recent call last):
  File "bank2ynab.py", line 197, in <module>
    main()
  File "bank2ynab.py", line 170, in main
    all_configs = get_configs()
  File "bank2ynab.py", line 34, in get_configs
    config.read(conf_files, encoding = "utf-8")
TypeError: read() got an unexpected keyword argument 'encoding'
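The encoding keyword of read() only exists in Python 3's configparser. One way to keep UTF-8 config files working on both versions is to open the file with io.open and hand the file object to the parser. A sketch under assumptions: the helper name `read_config_utf8`, the file name and the section contents are all hypothetical.

```python
import io
import os
import sys
import tempfile

if sys.version_info[0] >= 3:
    import configparser
else:
    import ConfigParser as configparser  # Python 2 module name


def read_config_utf8(config, path):
    """Load one config file as UTF-8 without config.read(..., encoding=...)."""
    # io.open accepts encoding= on both Python 2 and Python 3.
    with io.open(path, "r", encoding="utf-8") as handle:
        if hasattr(config, "read_file"):
            config.read_file(handle)  # Python 3.2 and later
        else:
            config.readfp(handle)     # Python 2


# Demonstration with a throwaway file; section and option names are made up.
path = os.path.join(tempfile.mkdtemp(), "example.conf")
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(u"[Example Bank]\ndelimiter = ;\npayee = B\u00e4ckerei\n")

config = configparser.RawConfigParser()
read_config_utf8(config, path)
print(config.get("Example Bank", "payee"))
```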

nocalla · 2017-10-24T15:52:22Z

I wonder is that encoding parameter actually required now.

torbengb · 2017-10-24T16:10:43Z

Well, if I leave out the , encoding = "utf-8" from line 34 then I get a different error:

Traceback (most recent call last):
  File "bank2ynab.py", line 196, in <module>
    main()
  File "bank2ynab.py", line 177, in main
    g_config = fix_conf_params(all_configs[section])
AttributeError: ConfigParser instance has no attribute '__getitem__'

But it still works in Python3!

nocalla · 2017-10-24T17:56:46Z

The python 2 version might be different to 3. May need to compare how they handle things.

torbengb · 2017-10-24T19:08:06Z

Hang on, I downloaded new CSV files and also downloaded a fresh copy of the bank2ynab.py in this pull request.

Test result with Python3: success! No errors found.

Test result with Python2:

Traceback (most recent call last):
  File "bank2ynab.py", line 196, in <module>
    main()
  File "bank2ynab.py", line 170, in main
    all_configs = get_configs()
  File "bank2ynab.py", line 34, in get_configs
    config.read(conf_files, encoding = "utf-8")
TypeError: read() got an unexpected keyword argument 'encoding'

Then I removed the , encoding = "utf-8" from the file and ran it again with Python 2:

Traceback (most recent call last):
  File "bank2ynab.py", line 196, in <module>
    main()
  File "bank2ynab.py", line 177, in main
    g_config = fix_conf_params(all_configs[section])
AttributeError: ConfigParser instance has no attribute '__getitem__'

Does this help you?
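The second error comes from indexing the parser like a dictionary (all_configs[section]), which only Python 3's configparser supports. The items(section) method exists on both versions, so a version-neutral lookup could be sketched as follows. The helper name `section_as_dict` and the sample section are made up, and this is not necessarily how the py2 branch solves it.

```python
import sys

if sys.version_info[0] >= 3:
    import configparser
else:
    import ConfigParser as configparser  # Python 2 module name


def section_as_dict(config, section):
    """Return the options of one section as a plain dict."""
    # items(section) exists on both Python 2 and 3, unlike config[section].
    return dict(config.items(section))


# Demonstration with made-up section and option names.
config = configparser.RawConfigParser()
config.add_section("Example Bank")
config.set("Example Bank", "delimiter", ";")
config.set("Example Bank", "header_rows", "1")

settings = section_as_dict(config, "Example Bank")
print(settings["delimiter"])
```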

nocalla · 2017-10-24T19:11:00Z

Need to do some reading about ConfigParser! But thanks.

Rebase to master

toyg · 2017-10-25T08:58:56Z

@torbengb

AttributeError: ConfigParser instance has no attribute '__getitem__'

I handled this exact error in the py2 branch, the ConfigParser interface is different in 2 and 3, so that's done.

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4

Eh, this is bitchy. Basically the encoding for these files is incompatible with utf-8 somewhere in its space. This is unsolvable with precision unless we include something like chardet.

A possible strategy would be to do dummy csv runs through each file, surrounded by try/except, with the most popular charsets (utf-8, utf-16, iso-8859 variants etc) until one completes without error.

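import csv

def detect_encoding(filepath):
    # Candidates named above, tried in order; extend the list as needed.
    for enc in ["utf-8", "utf-16", "iso-8859-1"]:
        try:
            # Text mode: "rb" cannot be combined with encoding= (Python 3).
            with open(filepath, "r", encoding=enc, newline="") as transaction_file:
                for row in csv.reader(transaction_file):
                    continue
            return enc
        except ValueError:   # parent of UnicodeDecodeError and all that
            continue
    return None  # reached only if every candidate fails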

This has obvious performance implications, but it might be acceptable in most cases - considering people will use this once a month or so, waiting a few seconds more should not be a biggie.

torbengb · 2017-10-25T09:28:16Z

I like your idea of iterating through several charsets, it's actually an elegant solution. I hope it's worth the trouble!

As for performance I'm sure it doesn't matter at all, given how the script already runs in the blink of an eye. I'd be surprised if it takes two seconds to complete this loop.

nocalla · 2017-10-25T10:18:24Z

I quite like the try except way also. Interesting approach. I still feel it's a bit weird that I'm not getting that error on Windows though.

nocalla · 2017-10-25T10:26:57Z

Made a pull request to merge @toyg's branch into this so we can compare approaches. I think the py2 branch effectively quashes most errors, but this one has the most discussion attached! #54

nocalla · 2017-10-25T19:02:34Z

Merged into the py2 branch. #54

# source: survey response #26

code changes via PM from u/fio_smiles

d7d3838

"I don't know if that will work with Python 3+" Let's review and test it, see if we can build on it.

torbengb requested a review from nocalla October 20, 2017 10:21

torbengb assigned torbengb and nocalla Oct 20, 2017

nocalla reviewed Oct 20, 2017

torbengb removed their assignment Oct 20, 2017

changed from future to __future__

7da3468

nocalla and others added 4 commits October 22, 2017 10:24

Merge pull request #39 from nocalla/issue25

0c79c97

Add python version check for csv writing

Add support for 2.x version of configparser

c7da76e

If running 2.x, import ConfigParser but rename it to configparser. I don't think this should cause any issues with the features we're using of the config parser.

Merge pull request #40 from nocalla/issue25

9717c53

Add support for 2.x version of configparser

nocalla and others added 2 commits October 22, 2017 10:42

I like double quotes

49440d8

Merge pull request #41 from nocalla/issue-#25

9575f2f

I like double quotes

torbengb self-assigned this Oct 22, 2017

torbengb mentioned this pull request Oct 22, 2017

Script doesn't like Umlauts: üöäÜÖÄßæøåÆØÅ #12

Closed

Merge branch 'master' into issue-#25

0fc0c10

Merge pull request #52 from torbengb/master

1dea08a

Rebase to master

nocalla mentioned this pull request Oct 25, 2017

Python 2.x Support #25

Closed

3 tasks

nocalla mentioned this pull request Oct 25, 2017

Merge our python 2 compatibility branches. #54

Merged

nocalla closed this Oct 25, 2017

torbengb added a commit that referenced this pull request Mar 26, 2018

Added [CA TD Canada Trust, checking+Visa]

894a313
