IMPALA-11332: Fix trailing whitespace for CSV output
The CSV output formatter currently strips trailing
whitespace from the last line of its output. The
rstrip() call was intended to remove one extra newline,
but with no argument it removes all trailing whitespace.
This is a problem for a SQL query like:
select 'Trailing whitespace          ';

This changes the rstrip() to rstrip('\n') so that
only trailing newlines are removed and other
whitespace is preserved.
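
The difference between the two calls can be seen in isolation. This is a minimal sketch, not the shell's code; the variable names are hypothetical and the string stands in for the last line of CSV output.

```python
# Stand-in for the last line of CSV output: a value with ten trailing
# spaces followed by the writer's line terminator.
csv_output = "Trailing whitespace" + " " * 10 + "\n"

# With no argument, rstrip() removes every trailing whitespace character,
# so the ten spaces are lost along with the newline.
stripped_all = csv_output.rstrip()

# With an explicit character set, only trailing newlines are removed.
stripped_newline = csv_output.rstrip("\n")

print(repr(stripped_all))
print(repr(stripped_newline))
```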

Testing:
 - Current shell tests pass
 - Added a shell test that verifies trailing whitespace
   is not being stripped.

Change-Id: I69d032ca2f581587b0938d0878fdf402fee0d57e
Reviewed-on: http://gerrit.cloudera.org:8080/18580
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
joemcdonnell authored and stiga-huang committed Jun 2, 2022
1 parent ed0d934 commit c41e694
Showing 2 changed files with 18 additions and 1 deletion.
9 changes: 8 additions & 1 deletion shell/shell_output.py
@@ -90,7 +90,14 @@ def format(self, rows):
         row = [val.encode('utf-8', 'replace') if isinstance(val, unicode) else val
                for val in row]
       writer.writerow(row)
-    rows = temp_buffer.getvalue().rstrip()
+    # The CSV writer produces an extra newline. Strip that extra newline (and
+    # only that extra newline). csv wraps newlines for data values in quotes,
+    # so rstrip will be limited to the extra newline.
+    if sys.version_info.major == 2:
+      # Python 2 is in encoded Unicode bytes, so this needs to be a bytes \n.
+      rows = temp_buffer.getvalue().rstrip(b'\n')
+    else:
+      rows = temp_buffer.getvalue().rstrip('\n')
     temp_buffer.close()
     return rows
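
The comment in this hunk relies on the csv module quoting any value that contains a newline. The following Python 3 sketch illustrates that behavior under stated assumptions: `format_rows` is a hypothetical helper, not the shell's formatter, and the writer settings (`lineterminator="\n"`, `csv.QUOTE_MINIMAL`) are assumed for the example.

```python
import csv
import io


def format_rows(rows):
    """Hypothetical helper: write rows as CSV and drop the final newline."""
    buf = io.StringIO()
    # Assumed writer settings for this sketch.
    writer = csv.writer(buf, delimiter=",", lineterminator="\n",
                        quoting=csv.QUOTE_MINIMAL)
    for row in rows:
        writer.writerow(row)
    # A data value ending in a newline is wrapped in quotes, so the output
    # then ends with '"' before the terminator and rstrip("\n") stops there.
    out = buf.getvalue().rstrip("\n")
    buf.close()
    return out


# Last value ends with a newline: the quoting protects it.
quoted = format_rows([["Trailing whitespace   "], ["ends with newline\n"]])
# Last value ends with spaces: they survive because only "\n" is stripped.
spaces = format_rows([["Trailing whitespace   "]])

print(repr(quoted))
print(repr(spaces))
```

In Python 3, calling `rstrip(b'\n')` on a `str` raises `TypeError`, which is why the change branches on the interpreter version.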

10 changes: 10 additions & 0 deletions tests/shell/test_shell_commandline.py
@@ -1254,3 +1254,13 @@ def test_http_socket_timeout(self, vector):
     result = run_impala_shell_cmd(vector, args + ['--http_socket_timeout_s=None'])
     assert result.stderr == ""
     assert result.stdout == "0\n"
+
+  def test_trailing_whitespace(self, vector):
+    """Test CSV output with trailing whitespace"""
+
+    # Ten trailing spaces
+    query = "select 'Trailing Whitespace          '"
+    # Only one column, no need for output_delimiter
+    output = run_impala_shell_cmd(vector, ['-q', query, '-B'])
+    assert "Fetched 1 row(s)" in output.stderr
+    assert "Trailing Whitespace          \n" in output.stdout
