fix: update post_cell function to handle different newline characters in cell values #2972

Merged
merged 1 commit into from
Apr 24, 2025
2 changes: 1 addition & 1 deletion apps/common/handle/impl/xls_split_handle.py
@@ -14,7 +14,7 @@


 def post_cell(cell_value):
-    return cell_value.replace('\n', '<br>').replace('|', '&#124;')
+    return cell_value.replace('\r\n', '<br>').replace('\n', '<br>').replace('|', '&#124;')
 
 
 def row_to_md(row):
The provided code is mostly correct but does not handle tab characters (\t):

@@ -14,7 +14,7 @@
 
 
 def post_cell(cell_value):
-    return cell_value.replace('\n', '<br>').replace('|', '&#124;')
+    return cell_value.replace('\r\n', '<br>').replace('\n', '<br>').replace('\t', '&nbsp;').replace('|', '&#124;')  # also replace tabs with non-breaking spaces

Explanation:

  1. Replacement of \n: Both \n (Unix) and \r\n (Windows) denote line breaks. Replacing \r\n first and then \n converts both to <br> without leaving a stray \r in the cell.

  2. Handling Tabs: The suggested code replaces tabs (`\t`) with `&nbsp;` (non-breaking space). This may be necessary if tabular data needs to preserve its original spacing, especially when converting it to HTML or Markdown, where a raw tab may be collapsed or rendered inconsistently.

  3. Potential Issues:

    • If a cell contains tabs but no newlines, the merged post_cell leaves the tabs unchanged, because it only replaces \r\n, \n, and |.
    • Ensuring that all relevant whitespace characters (including tabs) are handled appropriately can make the function more robust for various inputs.
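To illustrate point 1, here is a minimal sketch comparing the function before and after this change. The names post_cell_before and post_cell_after are hypothetical, used only to show both versions side by side; the bodies are copied from the diff.

```python
def post_cell_before(cell_value):
    # Body of post_cell before this PR: only '\n' is converted.
    return cell_value.replace('\n', '<br>').replace('|', '&#124;')


def post_cell_after(cell_value):
    # Body of post_cell after this PR: '\r\n' is converted first, then '\n'.
    return cell_value.replace('\r\n', '<br>').replace('\n', '<br>').replace('|', '&#124;')


sample = 'a\r\nb\nc'
print(repr(post_cell_before(sample)))  # 'a\r<br>b<br>c' (stray carriage return remains)
print(repr(post_cell_after(sample)))   # 'a<br>b<br>c'
```

The order matters: if \n were replaced before \r\n, the \r\n pattern would no longer match and the carriage return would stay in the cell.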

By adding the replacement for \t, you handle more edge cases and ensure that the converted markup preserves consistent spacing and structure.
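A minimal sketch of what the tab-aware variant could look like, assuming the pipe escaping from the merged code should be kept so that cell contents cannot break the Markdown table. The name post_cell_with_tabs is hypothetical and is not part of the merged change.

```python
def post_cell_with_tabs(cell_value):
    # Hypothetical variant of post_cell, not the merged implementation.
    return (
        cell_value
        .replace('\r\n', '<br>')   # Windows line endings first
        .replace('\n', '<br>')     # then remaining Unix line endings
        .replace('\t', '&nbsp;')   # tabs become non-breaking spaces
        .replace('|', '&#124;')    # escape pipes so they do not split the table cell
    )


print(post_cell_with_tabs('a\tb|c\r\nd'))  # a&nbsp;b&#124;c<br>d
```

Whether a single `&nbsp;` is the right substitute for a tab depends on how the resulting Markdown is rendered; that choice is an assumption of this sketch.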