Create layer from delimited text (csv) does not work properly for quoted strings #14074

Closed
qgib opened this issue Jul 18, 2011 · 14 comments
Labels
Bug Either a bug report, or a bug fix. Let's hope for the latter! Plugins

Comments

@qgib
Contributor

qgib commented Jul 18, 2011

Author Name: springmeyer - (springmeyer -)
Original Redmine Issue: 4091
Affected QGIS version: master
Redmine category:c++_plugins


If you import a CSV whose values contain commas, using 'comma' as the delimiter, only unquoted commas should be used to split the columns.

Right now (QGIS 1.7.0) the result of a row like:

1, "John,Doe", "Mary, Jane"

is to split on the , between John and Doe, which is not the right behavior.
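
To make the expected behavior concrete, here is a minimal sketch using Python's standard `csv` module (not the QGIS code, only an illustration). It compares a naive split on every comma with a quote-aware parse of the row above. `skipinitialspace=True` is an assumption needed here because the sample row has a space after each comma, before the opening quote.

```python
import csv
import io

row = '1, "John,Doe", "Mary, Jane"\n'

# Naive approach: split on every comma, quoted or not.
naive = row.strip().split(",")

# Quote-aware approach: commas inside double quotes are kept as data,
# and the surrounding quotes are stripped from the values.
reader = csv.reader(
    io.StringIO(row),
    delimiter=",",
    quotechar='"',
    skipinitialspace=True,  # ignore the space that follows each delimiter
)
parsed = next(reader)

print(len(naive))  # 5 fields: the quoted values were cut in two
print(parsed)      # ['1', 'John,Doe', 'Mary, Jane']
```

The quote-aware result has three columns, which is what a user exporting from a spreadsheet would expect.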

Assigning to ccrook as I see he's done some recent work on the plugin and can hopefully give feedback on this.

The reason I think getting this behavior right is critical is that most csv export software (in my case I'm using LibreOffice) is going to default to quoting strings with commas and using commas as delimiters.

@qgib
Contributor Author

qgib commented Jul 18, 2011

Author Name: springmeyer - (springmeyer -)


I also meant to mention that when "a value" is imported the quotes are not stripped, as they should be. It is my understanding that quoted strings represent string literals, so keeping the quotes after import is wrong.

@qgib
Contributor Author

qgib commented Dec 9, 2011

Author Name: Paolo Cavallini (@pcav)


  • category_id was configured as C++ Plugins
  • pull_request_patch_supplied was configured as 0

@qgib
Contributor Author

qgib commented Dec 16, 2011

Author Name: Giovanni Manghi (@gioman)


  • fixed_version_id was configured as Version 1.7.4

@qgib
Contributor Author

qgib commented Jan 16, 2012

Author Name: Chris Crook (@ccrook)


Definitely an issue with CSV import! The workaround for the moment is to use the OGR CSV format (with a VRT file), which works just fine. Will have a look at fixing this in the delimited text plugin.
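
For readers unfamiliar with the VRT workaround, the following is a rough sketch of what such a file can look like, generated from Python so it can be adapted per dataset. The file name `points.csv` and the column names `lon` and `lat` are hypothetical, and `build_vrt` is an illustrative helper, not part of QGIS or GDAL. The XML elements follow the OGR VRT driver format as I understand it; check the GDAL documentation before relying on it.

```python
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr


def build_vrt(csv_path, x_field, y_field, srs="WGS84"):
    """Return OGR VRT text describing a point layer stored in a CSV file.

    The layer name is taken from the CSV file name without its extension,
    which is the layer name the OGR CSV driver is assumed to expose.
    """
    layer = Path(csv_path).stem
    return (
        "<OGRVRTDataSource>\n"
        f"  <OGRVRTLayer name={quoteattr(layer)}>\n"
        f"    <SrcDataSource>{escape(csv_path)}</SrcDataSource>\n"
        "    <GeometryType>wkbPoint</GeometryType>\n"
        f"    <LayerSRS>{escape(srs)}</LayerSRS>\n"
        f'    <GeometryField encoding="PointFromColumns" '
        f"x={quoteattr(x_field)} y={quoteattr(y_field)}/>\n"
        "  </OGRVRTLayer>\n"
        "</OGRVRTDataSource>\n"
    )


# Hypothetical input: points.csv with 'lon' and 'lat' columns.
vrt_text = build_vrt("points.csv", "lon", "lat")
print(vrt_text)
```

Saving the output as a `.vrt` file next to the CSV and opening it as a vector layer lets OGR handle the quoting, at the cost of the extra step mentioned in the replies.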


  • version was configured as master
  • crashes_corrupts_data was configured as 0

@qgib
Contributor Author

qgib commented Jan 16, 2012

Author Name: springmeyer - (springmeyer -)


Chris Crook wrote:

Definitely an issue with CSV import! The workaround for the moment is to use the OGR CSV format (with a VRT file), which works just fine. Will have a look at fixing this in the delimited text plugin.

Hey, thanks for commenting. I've used the VRT method and was looking for a one-step approach for novice users. I ended up solving things (for my purposes) in Mapnik by writing my own CSV plugin. So, +1 to improving this feature, but at least my original use case is no longer critical.

@qgib
Contributor Author

qgib commented Apr 16, 2012

Author Name: Paolo Cavallini (@pcav)


  • fixed_version_id was changed from Version 1.7.4 to Version 1.8.0

@qgib
Contributor Author

qgib commented Sep 4, 2012

Author Name: Paolo Cavallini (@pcav)


  • fixed_version_id was changed from Version 1.8.0 to Version 2.0.0

@qgib
Contributor Author

qgib commented Oct 29, 2012

Author Name: Giuseppe Sucameli (@brushtyler)


Fixed in changeset "230bbfb459f807a645fa3edbbc44b1012177bdfb".


  • status_id was changed from Open to Closed

@qgib
Contributor Author

qgib commented Oct 29, 2012

Author Name: Giuseppe Sucameli (@brushtyler)


If you choose only one delimiter from the "selected delimiters" list, it is internally converted to a "plain delimiter", so quoted strings now work as well (see #15401).

If more than one delimiter is chosen from the "selected delimiters" list, it still uses the "regexp delimiter" and does not parse quoted strings.

The newline problem (quoted strings spanning multiple lines are not parsed) is still there, whatever delimiter you're using.
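
As an illustration of the newline problem, here is a small sketch in Python (standard `csv` module, not the plugin's code). It shows how a reader that works line by line sees more lines than there are records when a quoted value contains a line break. The sample data is made up.

```python
import csv
import io

# Hypothetical data: the second record has a line break inside a quoted value.
data = 'id,note\n1,"first line\nsecond line"\n2,plain\n'

# A line-based reader sees four physical lines...
physical_lines = data.splitlines()

# ...while a quote-aware reader sees three logical records.
# newline='' keeps embedded line breaks untouched, as the csv docs recommend.
records = list(csv.reader(io.StringIO(data, newline="")))

print(len(physical_lines))  # 4
print(len(records))         # 3
print(records[1])           # ['1', 'first line\nsecond line']
```

Any parser that splits the file into lines before looking at quotes will hit this case, regardless of which delimiter is selected.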


  • done_ratio was changed from 0 to 50
  • status_id was changed from Closed to Reopened
  • assigned_to_id removed Chris Crook
  • priority_id was changed from Normal to Low

@qgib
Contributor Author

qgib commented Nov 2, 2012

Author Name: Chris Crook (@ccrook)


I have an update for the delimiter plugin which fixes the newline and comma issues, but it also requires an update to the plugin dialogue which I haven't had time to complete yet. Basically the approach I am considering is to use a couple of alternative parsers - one for regexp, one for plain whitespace, and one for fixed delimiters such as CSV. I'm thinking the dialog could then be a bit simpler (for the user), with an initial selection of parser type (which could include preset types, such as Excel CSV, tab delimited), and then options displayed according to the type of delimiter set.
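
The "alternative parsers" idea could be sketched roughly as follows. This is a Python illustration with hypothetical names (`parse_delimited`, `parse_whitespace`, `parse_regexp`, `PARSERS`, `parse_line`), not the actual plugin design: one function per parser type, selected by a key that a dialog could expose as the initial choice.

```python
import csv
import io
import re


def parse_delimited(line, delimiter=",", quote='"'):
    """Fixed-delimiter parser (CSV style) that honours quoted fields."""
    rows = list(csv.reader(io.StringIO(line), delimiter=delimiter,
                           quotechar=quote, skipinitialspace=True))
    return rows[0] if rows else []


def parse_whitespace(line):
    """Plain whitespace parser: any run of spaces or tabs separates fields."""
    return line.split()


def parse_regexp(line, pattern):
    """Regular-expression parser: no quote handling at all."""
    return re.split(pattern, line.rstrip("\r\n"))


# The parser type chosen first; the remaining options depend on it.
PARSERS = {
    "csv": parse_delimited,
    "whitespace": parse_whitespace,
    "regexp": parse_regexp,
}


def parse_line(line, parser_type, **options):
    return PARSERS[parser_type](line, **options)


print(parse_line('1, "John,Doe", "Mary, Jane"', "csv"))
print(parse_line("10   20\t30", "whitespace"))
print(parse_line("a;b|c", "regexp", pattern=r"[;|]"))
```

Presets such as "Excel CSV" or "tab delimited" would then just be a parser type plus a fixed set of options, for example `("csv", {"delimiter": "\t"})`.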

One development issue that makes this difficult is that both the data provider plugin and the options dialog need to access the same parsing code, but they are different compilation modules, so I haven't figured out where to put the common code, or whether to just replicate it.

@qgib
Contributor Author

qgib commented Apr 15, 2013

Author Name: Chris Crook (@ccrook)


Fixed for 2.0 at commit fab2c57


  • status_id was changed from Reopened to Closed

@qgib qgib added Bug Either a bug report, or a bug fix. Let's hope for the latter! Plugins labels May 24, 2019
@qgib qgib added this to the Version 2.0.0 milestone May 24, 2019
@qgib qgib closed this as completed May 24, 2019
@eduardosuela

Maybe the problem is back? The text quote character " is ignored.
Version 3.10.6 (Spanish language; I also tried English, same problem)

I am importing a CSV file via the Add Layer → Delimited Text menu.
My file is ;-separated and uses " as the text qualifier.
Under custom delimiters I select semicolon (or enter ; as a custom delimiter), and set both the quote and escape characters to the double quote (I tried typing them and copying them from the actual text file, in case they are unusual characters).
The preview at the bottom looks fine with any combination.
But after import, the attribute table shows that the quotes were ignored.

If you need to reproduce the problem, CSV is generated from xlsx at:
https://opendata.euskadi.eus/catalogo/-/centros-de-salud-publicos-en-euskadi/

Maybe this is not the right place to post this.

@gioman
Contributor

gioman commented Jun 4, 2020

If you need to reproduce the problem, CSV is generated from xlsx at:

@eduardosuela can you please attach the CSV here (as zip)? thanks.

@eduardosuela

eduardosuela commented Jun 4, 2020

I added a package and the files
centros-salud.xlsx
error quotes.zip

Here you are.
For instance, the last line causes trouble:

Nombre;descripción;Código del centro;Tipo de centro;Horario atención Ciudadana;Horario especial;Alias del centro;Hospital de referencia;Horario de urgencia;LATWGS84;LONWGS84;Dirección;Comarca;Municipio;Código postal;Provincia;Teléfono;Fax;Correo electrónico

Consultorio de Navaridas ;;consult_navaridas;Consultorio;"martes y jueves de 08:30 a 11:00; viernes de 12:30 a 14:30";;;Hospital Universitario de Araba;;42.543792;-2.624492;Fabulista Samaniego;OSI Rioja Alavesa;Navaridas;01309;Araba;;;

In the last line, Horario atención Ciudadana (opening times for citizens) contains:
"martes y jueves de 08:30 a 11:00; viernes de 12:30 a 14:30"
but it is loaded as
"martes y jueves de 08:30 a 11:00;
and the rest appears in another line:
viernes de 12:30 a 14:30"
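
For comparison, here is what a quote-aware parse of that last line should give. This is a small sketch with Python's standard `csv` module (not QGIS itself), using the data row exactly as quoted above with `;` as delimiter and `"` as quote character.

```python
import csv
import io

line = (
    'Consultorio de Navaridas ;;consult_navaridas;Consultorio;'
    '"martes y jueves de 08:30 a 11:00; viernes de 12:30 a 14:30";;;'
    'Hospital Universitario de Araba;;42.543792;-2.624492;'
    'Fabulista Samaniego;OSI Rioja Alavesa;Navaridas;01309;Araba;;;'
)

# Splitting on every semicolon cuts the quoted opening times in two.
naive = line.split(";")

# A quote-aware parse keeps the semicolon inside the quoted field.
fields = next(csv.reader(io.StringIO(line), delimiter=";", quotechar='"'))

print(len(naive))   # 20: one field too many for the 19-column header
print(len(fields))  # 19: matches the header
print(fields[4])    # martes y jueves de 08:30 a 11:00; viernes de 12:30 a 14:30
```

The expected result is 19 columns, with the full opening-times text in the fifth column and no surrounding quotes.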
