Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSV files broken, html entities not decoded, newlines not stripped #1464

Closed
qubez opened this issue Sep 2, 2016 · 3 comments
Closed

CSV files broken, html entities not decoded, newlines not stripped #1464

qubez opened this issue Sep 2, 2016 · 3 comments

Comments

@qubez
Copy link
Contributor

qubez commented Sep 2, 2016

Thanks for all your work on TeamPass. You are faster at fixing bugs than I am at just figuring out what code does what!

Example bad CSV output:
print_out_csv_1472807483_1f2a1c47937d99f.csv.txt

Steps to reproduce

  1. Make some items with multi-line descriptions,
  2. Make some items with backslash, ampersand, Unicode 128-255, etc

Expected behaviour

CSV files should be importable Comma-Separated Values

Actual behaviour

  • files are semi-colon delimited
  • multi-line descriptions break output into multiple lines
  • HTML entity encoding in db is not decoded into correct UTF-8 display text
  • passwords are not quoted; passwords with commas or quotes could mess up the table or automated importing (how to fix in a CSV file? What does Keepass etc do?)

See attached export file of four items. ASCII pasted in first item description, backslashes in third item description, multi-line list of unicode characters as fourth item description.

Bug fix concerns

  • double-quotes currently are " - making them " could escape the quotes around descriptions;

Server configuration

Operating system:
Ubuntu 16.04.2 32 bit

Web server:
Apache/2.4.18 (Ubuntu) OpenSSL/1.0.2g-fips

Database:
MySQL 5.7.13-0ubuntu0.16.04.2 - (Ubuntu)

PHP version:
7.0.8-0ubuntu0.16.04.2

extensions:
mysqli mbstring

Teampass version:
Updated 2.1.26 (17) -> development (including a1f0749)

Updated from an older Teampass or fresh install:
Updated 2.1.26 (17) -> development w update script run
this bug was also present -8 commits behind, not tried before tag 17 where export was missing.

Client configuration

Browser:
Firefox 48 32 Bit
Operating system:
Windows 7 x64

Random SQL notes:

This bug is one more symptom of inconsistent encoding and decoding of database entries (like #1445 was)

I was working on my own database encoding:decoding library that could be used universally to make SQL-stored data consistent. Encoding new entries logically, but still properly parsing old items using the current encoding is difficult. Refactoring all code that uses SQL would be tons of work.

I noted these points:

  • Unicode does not need to have HTML entity encoding for SQL; utf8_general_ci can store all as text;
  • HTML entity encoding is also not required for many other ASCII elements;
  • Teampass applies HTML entity encoding to many characters, wasting DB space.

Several characters that are dangerous for SQL are not escaped.

  • Bad characters: - * / < > , = ( ) ; & ' "
  • Bad characters in combo: <> || <= >= ~= != ^= --

varchar() removes trailing spaces, they are not preserved properly but may be important

Ideally in TeamPass, after implementing a SQL encoder/decoder with a standardized UTF-8 conversion table:

  • A separate function should encode data into HTML entities when they are being rendered as HTML
  • A separate function should decode data for input form pre-filling
  • A separate function should sanitize/encode data when it is being passed to javascript or PHP
  • A separate function should encode URL codes when data is being used as a URL
@nilsteampassnet
Copy link
Owner

Yes you are right.
When started working on this project, didn't have much ideas of what the tool will do except storing my passwords. And changes after changes it becomes what it is today.
So all those encoding issues, I know them but due to existing database I couldn't make expected changes. And currently doing a total refactoring would take a lot of time.

But I really like your ideas to develop some specific functions to handle different cases of encoding/decoding permitting to handle all existing case and starting to re-encode correctly all new items and all changes performed ... so it will permit to clean up step by step.

@qubez
Copy link
Contributor Author

qubez commented Sep 6, 2016

similar, reference #1299

@nilsteampassnet nilsteampassnet added this to the 2.1.27 milestone Sep 10, 2016
nilsteampassnet pushed a commit that referenced this issue Jan 5, 2017
Fix for #1299, #1464
@nilsteampassnet
Copy link
Owner

Implemented in 2.1.27

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants