Filename issues when downloading documents with scan_postbox #32

Closed
megamorf opened this issue Oct 13, 2022 · 3 comments
Labels: bug (Something isn't working)

Comments


megamorf commented Oct 13, 2022

Hey, thanks for this very useful library. I used it for the very first time today and one of my primary use cases is to grab new documents from the postbox.

I encountered the following issues with the downloaded documents:

Encoding problems

Steuerbescheinigungen

 Length Name
 ------ ----
1436671 Ertr%c3%a4gnisaufstellung_2021_redacted.pdf

Vertragsinformationen

 Length Name
 ------ ----
 985260 %c3%84nderungsangebot_zum_01.01.2023_-_Vermieterpaket.pdf
4720661 %c3%84nderungsangebot_zum_01.01.2023.pdf
  12718 Mitteilung_%c3%bcber_steigende_Sollzinss%c3%a4tze_ab_01.10.2022.pdf

Wertpapierdokumente

Length Name
------ ----
 56873 allgemeine_Anschreiben_Kapitalma%c3%9fnahmen_vom_05.09.2022_zu_Depot_redacted_-_51567359221656A2QK20.pdf
 56804 allgemeine_Anschreiben_Kapitalma%c3%9fnahmen_vom_28.09.2022_zu_Depot_redacted_-_51572762221324A2QK20.pdf
 48144 Kauf_-_WKN_A2PKXG_-_Auftragsbest%c3%a4tigung_vom_17.06.2022_zu_Depot_redacted_-_Ordernr._6487281200.pdf
 48143 Kauf_-_WKN_A2PKXG_-_Auftragsbest%c3%a4tigung_vom_17.06.2022_zu_Depot_redacted_-_Ordernr._6487377900.pdf
 47735 Kauf_-_WKN_A2PKXG_-_Streichungsbest%c3%a4tigung_vom_31.08.2022_zu_Depot_redacted_-_Ordernr._6487281200.pdf
 47737 Kauf_-_WKN_A2PKXG_-_Streichungsbest%c3%a4tigung_vom_31.08.2022_zu_Depot_redacted_-_Ordernr._6487377900.pdf
 48301 Kauf_-_WKN_A2QK20_-_Auftragsbest%c3%a4tigung_vom_27.06.2022_zu_Depot_redacted_-_Ordernr._6585736100.pdf
 48443 Kauf_-_WKN_A2QK20_-_Auftragsbest%c3%a4tigung_vom_28.06.2022_zu_Depot_redacted_-_Ordernr._6585736100.pdf
 48754 Kauf_-_WKN_A2QK20_-_Ausf%c3%bchrungsanzeige_vom_29.06.2022_zu_Depot_redacted_-_Ordernr._6585736101.pdf
 58119 Kosteninformation_f%c3%bcr_das_Jahr_2022_zu_Depot_redacted.pdf
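The `%c3%a4`-style sequences in the names above are percent-encoded UTF-8 bytes. As a minimal sketch of how such names can be decoded with the standard library (the helper name `decode_name` is hypothetical and this is not necessarily how the library handles it):

```python
from urllib.parse import unquote


def decode_name(name: str) -> str:
    """Turn percent-encoded UTF-8 sequences back into readable characters."""
    # unquote() decodes %XX escapes as UTF-8 by default
    return unquote(name)


if __name__ == "__main__":
    # Names taken from the listings above
    for name in (
        "Ertr%c3%a4gnisaufstellung_2021_redacted.pdf",
        "%c3%84nderungsangebot_zum_01.01.2023.pdf",
        "Kosteninformation_f%c3%bcr_das_Jahr_2022_zu_Depot_redacted.pdf",
    ):
        print(decode_name(name))
```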

Zero length files / missing file extension

Length Name
------ ----
     0 Kosteninformation_zu_Wertpapier_ROMEO_POWER_INC._REG._SHARES_CL.A_DL_-_0001_vom_27.06.2022__21
     0 Kosteninformation_zu_Wertpapier_VANGUARD_FTSE_ALL-WORLD_U.ETF_REG._SHS_USD_ACC._ON_vom_17.06.2022__01

Edit: the problematic files have the following names in the postbox, and these are the corresponding download links:

Name: Kosteninformation zu Wertpapier ROMEO POWER INC. REG. SHARES CL.A DL -,0001 vom 27.06.2022, 21:23 zu Depot redacted
Link: https://www.dkb.de/DkbTransactionBanking/content/mailbox/MessageList.xhtml?$event=getMailboxAttachment&filename=Kosteninformation+zu+Wertpapier+ROMEO+POWER+INC.+REG.+SHARES+CL.A++DL+-%2C0001+vom+27.06.2022%2C+21%3A23+zu+Depot+redacted&row=15
Name: Kosteninformation zu Wertpapier VANGUARD FTSE ALL-WORLD U.ETF REG. SHS USD ACC. ON vom 17.06.2022, 01:06 zu Depot redacted
Link: https://www.dkb.de/DkbTransactionBanking/content/mailbox/MessageList.xhtml?$event=getMailboxAttachment&filename=Kosteninformation+zu+Wertpapier+VANGUARD+FTSE+ALL-WORLD+U.ETF+REG.+SHS+USD+ACC.+ON+vom+17.06.2022%2C+01%3A06+zu+Depot+redacted&row=19
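Both zero-length names in the listing stop right before the `:` of the timestamp (`__21`, `__01`). One possible explanation, not confirmed anywhere in this thread, is that the colon is not usable in a filename on my system. A rough sketch of deriving a safe name from the link follows. The function name `filename_from_link`, the set of replaced characters, and the `.pdf` default extension are all assumptions of mine, not the library's behavior:

```python
import re
from urllib.parse import parse_qs, urlsplit


def filename_from_link(link: str, default_ext: str = ".pdf") -> str:
    """Build a filesystem-safe name from the 'filename' query parameter."""
    # parse_qs decodes '+' to spaces and %XX escapes to characters
    raw = parse_qs(urlsplit(link).query)["filename"][0]
    # Replace whitespace, commas and characters that are commonly
    # problematic in filenames (assumed list) one by one with '_'
    safe = re.sub(r'[<>:"/\\|?*,\s]', "_", raw)
    # Assumption: postbox attachments are PDFs
    if not safe.lower().endswith(default_ext):
        safe += default_ext
    return safe


if __name__ == "__main__":
    link = (
        "https://www.dkb.de/DkbTransactionBanking/content/mailbox/"
        "MessageList.xhtml?$event=getMailboxAttachment"
        "&filename=Kosteninformation+zu+Wertpapier+VANGUARD+FTSE+ALL-WORLD"
        "+U.ETF+REG.+SHS+USD+ACC.+ON+vom+17.06.2022%2C+01%3A06"
        "+zu+Depot+redacted&row=19"
    )
    print(filename_from_link(link))
```

With this replacement the timestamp `17.06.2022, 01:06` becomes `17.06.2022__01_06`, so the rest of the name and the extension are kept.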

Let me know how I can help you get these two issues resolved :-)

grindsa added the bug label on Oct 13, 2022
grindsa (Owner) commented Oct 14, 2022

Hi, I pushed a fix to the master branch which should address the umlaut encoding. The "zero length" issue is a bit trickier.

  • Are these PDF documents?
  • Are you able to replicate the problem? If so, can you please download the latest devel branch, enable debugging, replicate the issue, and send me the logs?

megamorf (Author) commented

I reran the download based on master 7b49fa6 and it not only fixed the umlaut encoding problem but also seems to have fixed the zero length document problem.

The only difference this time is that I had to run dkb.scan_postbox(PATH, download_all=True) since all the documents from the previous run had been marked as read and there's no way to mark them as unread again.
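For anyone who wants to double-check a rerun like this, here is a small sketch that walks the download folder and flags the three symptoms from this issue: zero-length files, names without a `.pdf` extension, and names that still contain percent-encoding. The function name `find_suspect_downloads` is made up for illustration, and expecting every document to end in `.pdf` is an assumption:

```python
from pathlib import Path


def find_suspect_downloads(folder):
    """Return (path, reasons) pairs for files showing the symptoms above."""
    suspects = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        reasons = []
        if path.stat().st_size == 0:
            reasons.append("zero length")
        # Assumption: all postbox documents should be PDFs
        if path.suffix.lower() != ".pdf":
            reasons.append("no .pdf extension")
        if "%" in path.name:
            reasons.append("percent-encoded name")
        if reasons:
            suspects.append((path, reasons))
    return suspects


if __name__ == "__main__":
    import sys

    # Usage: python check_downloads.py <download folder>
    for path, reasons in find_suspect_downloads(sys.argv[1]):
        print(f"{path}: {', '.join(reasons)}")
```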

grindsa (Owner) commented Oct 15, 2022

Thank you for your response. Good that the zero-length file problem seems to be addressed as well (even if I do not fully understand why). The change made it into v.18, so I am closing this issue.

grindsa closed this as completed on Oct 15, 2022