Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

fputcsv improvements to allow RFC 4180 compliance #197

Closed
wants to merge 9 commits into from

6 participants

@tml

Allows users to assert that something other than the backslash
should be considered an escape char; also follows the RFC 4180
recommendation that fields containing a " be enclosed.

@tml tml Expose fputcsv's escape_char to userland
Allows users to assert that something other than the backslash
should be considered an escape char; also follows the RFC 4180
recommendation that fields containing a " be enclosed.
49f1ef4
@lstrojny

We definitely need a test for that. Could you add one?

@tml

I'm actually all thumbs with .phpt syntax, but I'll see if I can outsource it. :)

@lstrojny lstrojny commented on the diff
ext/standard/file.c
@@ -1839,6 +1840,17 @@ static const char *php_fgetcsv_lookup_trailing_spaces(const char *ptr, size_t le
enclosure = *enclosure_str;
}
+ if (escape_str != NULL) {
+ if (escape_str_len < 1) {
+ php_error_docref(NULL TSRMLS_CC, E_WARNING, "escape must be a character");
+ RETURN_FALSE;
+ } else if (escape_str_len > 1) {
+ php_error_docref(NULL TSRMLS_CC, E_NOTICE, "escape must be a single character");

Should be "Escape string must be a single character"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@lstrojny lstrojny commented on the diff
ext/standard/file.c
@@ -1839,6 +1840,17 @@ static const char *php_fgetcsv_lookup_trailing_spaces(const char *ptr, size_t le
enclosure = *enclosure_str;
}
+ if (escape_str != NULL) {
+ if (escape_str_len < 1) {
+ php_error_docref(NULL TSRMLS_CC, E_WARNING, "escape must be a character");

Message should be "Escape string must be a character"

This and line 1848 are fine — the messages are consistent with fgetcsv() and the other warnings in the function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@lstrojny

Any news?

@tml
tml commented

No - if you know of anyone who would be willing to write a .phpt for it, feel free to pass it along to them.

@theoreticaLee

@lstrojny I have a test coming your way. branching and submitting a pull request

@theoreticaLee

@tml @lstrojny

TML, hey, It's error_code from irc. I just submitted a pull request https://github.com/tml/php-src/pull/1 on your repo like we talked about on monday.

Either hit me up on irc or here if you want me to work further on it.

thanks guys!

@lstrojny

@tml @theoreticaLee could you both coordinate and consolidate a single PR including the test?

@LawnGnome

@lstrojny It's OK, I'm about to make this even more annoying for everyone — it needs a rebase after the fix for https://bugs.php.net/bug.php?id=43225, and SplFileObject::fputcsv() also needs to have an escape argument added to keep it in line with fputcsv().

It turned out that @theoreticaLee and I were working on this simultaneously unbeknownst to each other — I'll consolidate the commits from @tml and @theoreticaLee together with my own changes for SplFileObject and the aforementioned rebase and commit it.

@LawnGnome

Actually, forget what I said about rebasing — the #43225 fix isn't right, so I've reverted it for now, since I can't spend any more time on it at present.

@tml @theoreticaLee As @lstrojny said, if you guys can consolidate this, that'd be grand.

My earlier note about SplFileObject::fputcsv() stands, though.

@tml

@LawnGnome Thank you for finding that - it was why I couldn't manage to get a solid .phpt out of this, but I couldn't find the bottom of the stack of changes. As always, your heroic efforts are greatly appreciated.

@lstrojny, the test case from @theoreticaLee has been merged into this pull request; as far as I'm concerned, @LawnGnome and @theoreticaLee need the credit for this patch; I just happened to code monkey it, they did the heavy lifting.

@tml tml commented on the diff
ext/standard/file.c
@@ -817,7 +817,7 @@ static void file_globals_dtor(php_file_globals *file_globals_p TSRMLS_DC)
if (p_len > 64) {
p[63] = '\0';
}
-
+
@tml
tml added a note

@theoreticaLee cleaned up stray tab

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@tml tml commented on the diff
ext/standard/file.c
@@ -1377,13 +1377,13 @@ PHPAPI int php_mkdir(char *dir, long mode TSRMLS_DC)
{
long arg1 = 0;
int oldumask;
-
+
@tml
tml added a note

@theoreticaLee cleaned up stray tab

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@tml tml commented on the diff
ext/standard/file.c
((6 lines not shown))
oldumask = umask(077);
if (BG(umask) == -1) {
BG(umask) = oldumask;
}
-
+
@tml
tml added a note

@theoreticaLee cleaned up stray tab

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@smalyshev
Owner

This patch seems to break fputcsv_variation6.phpt, fputcsv_error.phpt and fputcsv_variation9.phpt. Changes to fputcsv_error.phpt are trivial but for two others I'm not sure what's wrong. @tml, please look into it.

@smalyshev
Owner

@tml any news?

@tml

I personally feel these tests are horribly incorrect (not to mention bloated with c/p code that doesn't make much sense at all). They're testing what PHP has historically done, rather than what makes sense (most notably, if I explicitly request delimiters and enclosures, why is PHP dropping them!?).

The line that is causing these tests to fail was FPUTCSV_FLD_CHK('"'), added because RFC 4180 2.6 says: " Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes..." (emphasis mine). However, I have no interest in a BC fight, so I've retracted the fix; now we can just brush the problem under the rug for the next 10 years.

@tml

I cannot reproduce the failed builds at this point; the tests failing are:

PostgreSQL notice function [ext/pgsql/tests/09notice.phpt]
Bug #32223 (8.0+) (weird behaviour of pg_last_notice using define) [ext/pgsql/tests/80_bug32223b.phpt]

Neither of them appears to bear the slightest relationship to my fputcsv changes.

@smalyshev
Owner

Looks like doesn't fail for me anymore, so I guess it's ok.

@php-pulls
Collaborator

Comment on behalf of stas at php.net:

merged

@php-pulls php-pulls closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Sep 13, 2012
  1. @tml

    Expose fputcsv's escape_char to userland

    tml authored
    Allows users to assert that something other than the backslash
    should be considered an escape char; also follows the RFC 4180
    recommendation that fields containing a " be enclosed.
Commits on Jan 15, 2013
  1. @theoreticaLee
  2. @theoreticaLee @tml

    added test file from theoreticaLee

    theoreticaLee authored tml committed
  3. @tml
  4. @tml
  5. @tml

    Merge my fork with remote-tracking branch 'upstream/master' in prepar…

    tml authored
    …ation for new pull request
Commits on Aug 18, 2013
  1. @tml
Commits on Aug 19, 2013
  1. @tml
  2. @tml
This page is out of date. Refresh to see the latest.
View
34 ext/standard/file.c
@@ -818,7 +818,7 @@ PHP_FUNCTION(tempnam)
if (p_len > 64) {
p[63] = '\0';
}
-
+
@tml
tml added a note

@theoreticaLee cleaned up stray tab

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
RETVAL_FALSE;
if ((fd = php_open_temporary_fd_ex(dir, p, &opened_path, 1 TSRMLS_CC)) >= 0) {
@@ -1380,13 +1380,13 @@ PHP_FUNCTION(umask)
{
long arg1 = 0;
int oldumask;
-
+
@tml
tml added a note

@theoreticaLee cleaned up stray tab

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
oldumask = umask(077);
if (BG(umask) == -1) {
BG(umask) = oldumask;
}
-
+
@tml
tml added a note

@theoreticaLee cleaned up stray tab

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "|l", &arg1) == FAILURE) {
RETURN_FALSE;
}
@@ -1799,22 +1799,23 @@ static const char *php_fgetcsv_lookup_trailing_spaces(const char *ptr, size_t le
#define FPUTCSV_FLD_CHK(c) memchr(Z_STRVAL(field), c, Z_STRLEN(field))
-/* {{{ proto int fputcsv(resource fp, array fields [, string delimiter [, string enclosure]])
+/* {{{ proto int fputcsv(resource fp, array fields [, string delimiter [, string enclosure [, string escape_char]]])
Format line as CSV and write to file pointer */
PHP_FUNCTION(fputcsv)
{
- char delimiter = ','; /* allow this to be set as parameter */
- char enclosure = '"'; /* allow this to be set as parameter */
- const char escape_char = '\\';
+ char delimiter = ','; /* allow this to be set as parameter */
+ char enclosure = '"'; /* allow this to be set as parameter */
+ char escape_char = '\\'; /* allow this to be set as parameter */
php_stream *stream;
zval *fp = NULL, *fields = NULL;
int ret;
- char *delimiter_str = NULL, *enclosure_str = NULL;
- int delimiter_str_len = 0, enclosure_str_len = 0;
+ char *delimiter_str = NULL, *enclosure_str = NULL, *escape_str = NULL;
+ int delimiter_str_len = 0, enclosure_str_len = 0, escape_str_len = 0;
- if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ra|ss",
+ if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ra|sss",
&fp, &fields, &delimiter_str, &delimiter_str_len,
- &enclosure_str, &enclosure_str_len) == FAILURE) {
+ &enclosure_str, &enclosure_str_len,
+ &escape_str, &escape_str_len) == FAILURE) {
return;
}
@@ -1842,6 +1843,17 @@ PHP_FUNCTION(fputcsv)
enclosure = *enclosure_str;
}
+ if (escape_str != NULL) {
+ if (escape_str_len < 1) {
+ php_error_docref(NULL TSRMLS_CC, E_WARNING, "escape must be a character");

Message should be "Escape string must be a character"

This and line 1848 are fine — the messages are consistent with fgetcsv() and the other warnings in the function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+ RETURN_FALSE;
+ } else if (escape_str_len > 1) {
+ php_error_docref(NULL TSRMLS_CC, E_NOTICE, "escape must be a single character");

Should be "Escape string must be a single character"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+ }
+ /* use first character from string */
+ escape_char = *escape_str;
+ }
+
PHP_STREAM_TO_ZVAL(stream, &fp);
ret = php_fputcsv(stream, fields, delimiter, enclosure, escape_char TSRMLS_CC);
View
2  ext/standard/tests/file/fputcsv_error.phpt
@@ -48,7 +48,7 @@ Warning: fputcsv() expects at least 2 parameters, 0 given in %s on line %d
NULL
-- Testing fputcsv() with more than expected number of arguments --
-Warning: fputcsv() expects at most 4 parameters, 5 given in %s on line %d
+Warning: fputcsv() expects parameter 5 to be string, resource given in %s on line %d
NULL
-- Testing fputcsv() with invalid arguments --
-- Iteration 1 --
View
107 ext/standard/tests/file/fputcsv_variation15.phpt
@@ -0,0 +1,107 @@
+--TEST--
+various fputcsv() functionality tests
+--CREDITS--
+Lee Leathers <leeleathers@gmail.com>
+--FILE--
+<?php
+
+$list = array (
+ 0 => 'aaa,bbb',
+ 1 => 'aaa,"bbb"',
+ 2 => '"aaa","bbb"',
+ 3 => 'aaa,bbb',
+ 4 => '"aaa",bbb',
+ 5 => '"aaa", "bbb"',
+ 6 => ',',
+ 7 => 'aaa,',
+ 8 => ',"aaa"',
+ 9 => '"",""',
+ 10 => '"""""",',
+ 11 => '""""",aaa',
+ 12 => 'aaa,bbb ',
+ 13 => 'aaa,"bbb "',
+ 14 => 'aaa"aaa","bbb"bbb',
+ 15 => 'aaa"aaa""",bbb',
+ 16 => 'aaa,"/"bbb,ccc',
+ 17 => 'aaa"/"a","bbb"',
+ 18 => '"/"","aaa"',
+ 19 => '"/""",aaa',
+);
+
+$file = dirname(__FILE__) . 'fgetcsv.csv';
+@unlink($file);
+
+$fp = fopen($file, "w");
+foreach ($list as $v) {
+ fputcsv($fp, explode(',', $v), ',', '"', '/');
+}
+fclose($fp);
+
+$res = file($file);
+foreach($res as &$val)
+{
+ $val = substr($val, 0, -1);
+}
+echo '$list = ';var_export($res);echo ";\n";
+
+$fp = fopen($file, "r");
+$res = array();
+while($l=fgetcsv($fp, 0, ',', '"', '/'))
+{
+ $res[] = join(',',$l);
+}
+fclose($fp);
+
+echo '$list = ';var_export($res);echo ";\n";
+
+@unlink($file);
+
+?>
+===DONE===
+<?php exit(0); ?>
+--EXPECT--
+$list = array (
+ 0 => 'aaa,bbb',
+ 1 => 'aaa,"""bbb"""',
+ 2 => '"""aaa""","""bbb"""',
+ 3 => 'aaa,bbb',
+ 4 => '"""aaa""",bbb',
+ 5 => '"""aaa"""," ""bbb"""',
+ 6 => ',',
+ 7 => 'aaa,',
+ 8 => ',"""aaa"""',
+ 9 => '"""""",""""""',
+ 10 => '"""""""""""""",',
+ 11 => '"""""""""""",aaa',
+ 12 => 'aaa,"bbb "',
+ 13 => 'aaa,"""bbb """',
+ 14 => '"aaa""aaa""","""bbb""bbb"',
+ 15 => '"aaa""aaa""""""",bbb',
+ 16 => 'aaa,"""/"bbb",ccc',
+ 17 => '"aaa""/"a""","""bbb"""',
+ 18 => '"""/"""","""aaa"""',
+ 19 => '"""/"""""",aaa',
+);
+$list = array (
+ 0 => 'aaa,bbb',
+ 1 => 'aaa,"bbb"',
+ 2 => '"aaa","bbb"',
+ 3 => 'aaa,bbb',
+ 4 => '"aaa",bbb',
+ 5 => '"aaa", "bbb"',
+ 6 => ',',
+ 7 => 'aaa,',
+ 8 => ',"aaa"',
+ 9 => '"",""',
+ 10 => '"""""",',
+ 11 => '""""",aaa',
+ 12 => 'aaa,bbb ',
+ 13 => 'aaa,"bbb "',
+ 14 => 'aaa"aaa","bbb"bbb',
+ 15 => 'aaa"aaa""",bbb',
+ 16 => 'aaa,"/"bbb,ccc',
+ 17 => 'aaa"/"a","bbb"',
+ 18 => '"/"","aaa"',
+ 19 => '"/""",aaa',
+);
+===DONE===
Something went wrong with that request. Please try again.