Pre-processing of CSV #66

elChapoSing · 2020-05-06T12:38:48Z

Hello,

Great job on the library; it's incredibly useful.

Would it make sense to add a pre-processing step, in the same way that you have different mappings?

My bank produces badly formed CSV, so I need to pre-process it into a format that csv2ofx can understand. I assume this is something a lot of people have encountered.

The code would look something like this:

from __future__ import absolute_import, print_function

import itertools as it

from meza.io import read_csv, IterStringIO
from csv2ofx import utils
from csv2ofx.ofx import OFX
from csv2ofx.mappings.default import mapping

# proposed new import
from csv2ofx.processing.default import pre_process

ofx = OFX(mapping)

# the proposed pre-processing step would be used here
records = read_csv(pre_process('path/to/file.csv'))

groups = ofx.gen_groups(records)
trxns = ofx.gen_trxns(groups)
cleaned_trxns = ofx.clean_trxns(trxns)
data = utils.gen_data(cleaned_trxns)
content = it.chain([ofx.header(), ofx.gen_body(data), ofx.footer()])

for line in IterStringIO(content):
    print(line)

Here, pre_process would be a function that takes a path and returns a StringIO.
You could then ship pre-processing for the various counterparties that do not output proper CSV.
Does any of this make sense?
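To make the proposal concrete, here is a minimal sketch of what such a function might look like. It is not part of csv2ofx: the name `pre_process` comes from the snippet above, while the `skip_head`, `skip_tail`, and `encoding` parameters are assumptions added for illustration. It only drops a fixed number of leading and trailing lines; a real bank export may need other fixes.

```python
import io


def pre_process(path, skip_head=0, skip_tail=0, encoding="utf-8"):
    """Hypothetical helper: return a StringIO holding the cleaned CSV text.

    skip_head and skip_tail are illustrative parameters, not csv2ofx options.
    """
    # newline="" keeps the original line endings untouched.
    with open(path, encoding=encoding, newline="") as f:
        lines = f.readlines()

    # Drop the leading preamble and the trailing summary lines.
    end = max(skip_head, len(lines) - skip_tail)
    return io.StringIO("".join(lines[skip_head:end]))
```

The returned object would then be passed to read_csv in place of the file path, as in the snippet above.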


reubano · 2021-12-25T21:36:49Z

CR #83

reubano · 2022-06-08T20:15:07Z

I'd rather not complicate this script by doing additional pre-processing. There are various commands you can run your file through before piping it to csv2ofx. For example, to strip the last 2 lines of a CSV file: `head -n -2 transactions.csv | csv2ofx -x /path/to/mapping.py`
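Negative line counts for `head -n` are a GNU coreutils extension, so the command above may be rejected by the BSD/macOS `head`. As a portable alternative, the same "drop the last N lines" step can be sketched in Python. The function name `drop_last` is hypothetical and not part of csv2ofx.

```python
from collections import deque


def drop_last(lines, n):
    """Yield every item of `lines` except the final `n` (hypothetical helper)."""
    if n <= 0:
        yield from lines
        return

    # Hold back at most n lines; anything older cannot be in the tail.
    held = deque()
    for line in lines:
        held.append(line)
        if len(held) > n:
            yield held.popleft()


rows = ["Date,Amount\n", "2020-05-01,10\n", "Total,10\n", "End of report\n"]
trimmed = list(drop_last(rows, 2))
```

Because it only buffers `n` lines at a time, the same generator could be applied to `sys.stdin` in a small script placed before csv2ofx in the pipeline.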

Skipping trailing rows/cols to come soon

* features:
  * Bump to version 0.30.0
  * [NEW] Add option to set last row (fixes #83)
  * [NEW] Add transaction filters (fixes #83)
  * [NEW] Skip initial rows/columns (fixes #66, #67, and #83)
  * [DOC] Update/correct mapping documentation (closes #16)
  * [DOC] Update readme content
  * Fix chunksize default value spacing and update readme
  * [FIX] Make has_header optional (defaults to True)
  * Test foreign currency (closes #70)
  * [NEW] Add date parsing format option (closes #60)
  * [DOC] Clarify how date_fmt is used (close #60)
  * [ENH] Add `dayfirst` option/kwarg (closes #39)

elChapoSing mentioned this issue May 6, 2020

added DBS Bank Singapore mapping and added preprocess functionality #67

Closed

reubano added the enhancement label Dec 25, 2021

reubano closed this as completed Jun 8, 2022

reubano added a commit that referenced this issue Jun 9, 2022

[NEW] Skip initial rows/columns (fixes #66, #67, and #83)

973309c

