This repository has been archived by the owner on Apr 5, 2024. It is now read-only.

[Push] Cannot publish tabular data with values that have double quotes #66

Closed
2 of 3 tasks
anuveyatsu opened this issue Feb 1, 2018 · 16 comments

Comments

@anuveyatsu
Member

anuveyatsu commented Feb 1, 2018

@Mikanebu commented on Thu Feb 01 2018

Steps to reproduce

Output

Got the error "Invalid opening quote". This happens because no escape character is set up.

Expected behaviour

  • works: data cat https://raw.githubusercontent.com/frictionlessdata/test-data/master/files/csv/all-schema-types.csv
    • same thing works when data is local
  • works: data push all-schema-types.csv - generates a correct descriptor, e.g., it has a "dialect" property with escapeChar set to a double quote (")
@anuveyatsu anuveyatsu changed the title Cannot push tabular data with values that have double quotes [Push] Cannot push tabular data with values that have double quotes Feb 1, 2018
@zelima zelima self-assigned this Feb 2, 2018
@anuveyatsu anuveyatsu assigned anuveyatsu and unassigned zelima Feb 2, 2018
@anuveyatsu
Member Author

This is now INVALID: if you try to reproduce it, it no longer fails as before.
In general, if users want double quotes " in the values, e.g., when a value is a hash {"a":1}, then the value should be enclosed in single quotes.

@AcckiyGerman
Contributor

user@pc:~$ data-linux push https://github.com/frictionlessdata/test-data/blob/master/files/csv/all-schema-types.csv
> Error! Invalid opening quote at line 8
user@pc:~$ node work/datahq/data-cli/bin/data.js push https://github.com/frictionlessdata/test-data/blob/master/files/csv/all-schema-types.csv
> Error! Invalid opening quote at line 8
user@pc:~$ node work/datahq/data-cli/bin/data.js -v
0.6.7

data-cli is up to date with github master branch, npm i is also done.

@AcckiyGerman
Contributor

@anuveyatsu ^^^

@anuveyatsu
Member Author

@AcckiyGerman have you pulled the latest "test-data"?

@anuveyatsu
Member Author

Just realised that double quotes in values should be preceded by the escape character (and the default escape character should itself be a double quote), and the whole value should be wrapped in double quotes. So, e.g.:

{"a": 1}

should become:

"{""a"": 1}"
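For illustration only, here is a minimal sketch of that rule using Python's standard `csv` module rather than the parser data-cli actually uses. The helper name `csv_quote` is hypothetical. It relies on the module's default dialect, where the quote character is `"` and embedded quotes are doubled.

```python
import csv
import io


def csv_quote(value: str) -> str:
    """Serialize one value as a CSV field, doubling embedded double quotes."""
    buf = io.StringIO()
    # Default dialect: quotechar='"', doublequote=True, quoting=QUOTE_MINIMAL,
    # so a field containing a double quote is wrapped in quotes and escaped.
    csv.writer(buf, lineterminator="\n").writerow([value])
    return buf.getvalue().rstrip("\n")


quoted = csv_quote('{"a": 1}')
print(quoted)  # "{""a"": 1}"

# Reading the field back with the same defaults restores the original value.
parsed = next(csv.reader([quoted]))
print(parsed[0])  # {"a": 1}
```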

@AcckiyGerman
Contributor

That all becomes complicated. I suggest making a list of rules, like:

  • escape this and this with quotes
  • escape doublequotes in this way
  • etc

@AcckiyGerman
Contributor

AcckiyGerman commented Feb 7, 2018

Because otherwise it can become very complex. Have you ever seen a JSON file encoded in a URL string? 😄

@AcckiyGerman
Contributor

Or you can probably use json.dumps() for that.
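A rough sketch of how that could look, assuming the CSV is produced from Python: `json.dumps()` builds the JSON text and the standard `csv` writer takes care of quoting it. The function name `write_rows` and the column names `id` and `payload` are made up for this example and are not part of data-cli.

```python
import csv
import io
import json


def write_rows(rows):
    """Write (id, payload) pairs as CSV, storing each payload as JSON text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", "payload"])
    for row_id, payload in rows:
        # json.dumps gives the JSON string; csv.writer doubles its quotes.
        writer.writerow([row_id, json.dumps(payload)])
    return buf.getvalue()


text = write_rows([(1, {"a": 1})])
print(text)

# Round trip: parse the CSV, then decode the JSON cell.
records = list(csv.DictReader(io.StringIO(text)))
print(json.loads(records[0]["payload"]))
```

Note that `json.dumps()` alone does not make the value CSV-safe. The doubling of quotes still comes from the CSV writer.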

@anuveyatsu
Member Author

@AcckiyGerman I think it's common to have double quotes in values. By default, " is used as the escape character, e.g., in the library we're currently using (http://csv.adaltas.com/parse/#parser-options), and if you export data from Excel or Google Spreadsheets as CSV you get the same result. Considering these points, I think we should have " as the default escape character in the dialect for tabular resources.

@zelima
Collaborator

zelima commented Feb 7, 2018

Agreed. You can escape with \" or set a different escape character and push that way.
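To make the two options concrete, here is a small sketch with Python's standard `csv` module (an assumption for illustration; data-cli itself uses a JavaScript parser). It reads the same value once with doubled quotes and once with a backslash escape character. The sample lines are invented.

```python
import csv

# Same logical row, written with two different escaping conventions.
doubled = '1,"{""a"": 1}"'        # escape character is the double quote itself
backslash = '1,"{\\"a\\": 1}"'    # escape character is a backslash

# Default dialect: doublequote=True, so "" inside a quoted field means ".
row_doubled = next(csv.reader([doubled]))

# Alternative dialect: turn off doubling and declare the backslash as escapechar.
row_backslash = next(
    csv.reader([backslash], doublequote=False, escapechar="\\")
)

print(row_doubled)
print(row_backslash)
```

Both calls yield the same parsed row, as long as the reader's settings match how the file was written.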

@AcckiyGerman
Contributor

TESTED: FAILED
data push all-schema-types.csv causes:

@AcckiyGerman
Contributor

AcckiyGerman commented Feb 8, 2018

@anuveyatsu The push is OK, so we could close this issue, but first create an issue about PUBLISH FAIL

@zelima zelima changed the title [Push] Cannot push tabular data with values that have double quotes [Push] Cannot publish tabular data with values that have double quotes Feb 8, 2018
@zelima
Collaborator

zelima commented Feb 12, 2018

@AcckiyGerman Just a tip: posting links to the failed revisions does not really help, as we cannot see them unless logged in. Could we switch to posting screenshots instead in cases like this?

@AcckiyGerman
Contributor

@zelima Sure. I don't know why I was so sure that you could read the related logs from the backend. But even if you could, it would be easier to read the logs in the message than to dig for them on the backend.

@zelima
Collaborator

zelima commented Feb 14, 2018

@AcckiyGerman Although you won't be able to push this exact file due to #98, this one will be fixed in data 0.7.7.

I created a Gist and removed the yearmonth and geopoint types from it, leaving the double-quotes column as is. You can try:

data push https://gist.githubusercontent.com/zelima/d9a3d99b7ca41e632c8b3d7853d543df/raw/be2058d74248e219e430ba72ed5de0cbeb005aaf/types.csv

Or take a look at already published package https://datahub.io/zelima/schema/v/27

@AcckiyGerman
Contributor

TESTED & FIXED
