fix: Fix encoding, blank lines, and append position in edit operations #201

Merged
rdmueller merged 2 commits into docToolchain:main from raifdmueller:fix/edit-operations-bugs-193-194-197
Jan 26, 2026

@raifdmueller
Collaborator

Summary

Fixes three critical bugs in edit operations that affected content quality and document structure.

Problems & Solutions

Issue #193: Encoding Problem with Umlauts ⚠️ HIGH

Problem:

content.encode("utf-8").decode("unicode_escape")

This approach corrupted non-ASCII characters, making the tool unusable for German and other international documentation.
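The corruption can be reproduced in isolation. The snippet below is a minimal sketch: the function name `legacy_unescape` is made up for illustration and only wraps the expression quoted above. The `unicode_escape` codec decodes bytes as Latin-1, so every multi-byte UTF-8 character is split into two unrelated characters.

```python
def legacy_unescape(content: str) -> str:
    # Hypothetical wrapper around the expression from the problem description.
    # unicode_escape treats the UTF-8 bytes as Latin-1 while resolving escapes,
    # so "Ü" (bytes C3 9C) comes back as two separate characters.
    return content.encode("utf-8").decode("unicode_escape")


original = "Über uns"
corrupted = legacy_unescape(original)

# Escape sequences are resolved, but the umlaut no longer survives.
print(repr(legacy_unescape("a\\nb")))
print(repr(corrupted))
```

Pure ASCII input is unaffected, which is probably why the problem only surfaced with non-English documentation.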

Solution:

  • Added custom _process_escape_sequences() function
  • Handles \n, \t, \r, \\ without corrupting UTF-8 characters
  • Preserves umlauts: äöü ß remain intact
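One way such a helper could look is sketched below. This is not the code from `src/dacli/cli.py`; the function body, the regular expression, and the lookup table are assumptions chosen to match the behavior described above (resolve `\n`, `\t`, `\r`, `\\` and leave everything else alone).

```python
import re

# Assumed mapping of the four supported escape sequences to their characters.
_ESCAPES = {"\\n": "\n", "\\t": "\t", "\\r": "\r", "\\\\": "\\"}
# A backslash followed by n, t, r, or another backslash.
_ESCAPE_PATTERN = re.compile(r"\\[ntr\\]")


def process_escape_sequences(content: str) -> str:
    # A single left-to-right pass: an escaped backslash is consumed as a unit,
    # so backslash-backslash-n yields a literal backslash plus "n", not a newline.
    # The text never passes through bytes, so non-ASCII characters are untouched.
    return _ESCAPE_PATTERN.sub(lambda m: _ESCAPES[m.group(0)], content)


print(repr(process_escape_sequences("Über uns\\näöü ß")))
```

Unknown sequences such as `\x` are left unchanged in this sketch; whether the real helper does the same is not stated in the PR.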

Test coverage:

  • Umlauts preserved ✓
  • Escape sequences work ✓
  • Mixed content handled correctly ✓

Issue #194: Missing Blank Lines ⚠️ HIGH

Problem:
Sections ended with a single newline (\n), so the next heading followed the content directly:

== Section 1
Content
== Section 2  ← No blank line!

Solution:

  • Ensure content ends with a blank line (two newlines, \n\n)
  • Proper AsciiDoc/Markdown formatting maintained
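The normalization could be as small as the sketch below. The helper name `ensure_trailing_blank_line` is hypothetical, and collapsing surplus trailing newlines down to exactly two is an assumption; the change in `content_service.py` may handle that case differently.

```python
def ensure_trailing_blank_line(content: str) -> str:
    # Drop whatever newlines are already at the end, then add exactly two,
    # so the next section heading is always preceded by one blank line.
    return content.rstrip("\n") + "\n\n"


section_1 = ensure_trailing_blank_line("== Section 1\nContent\n")
section_2 = "== Section 2\n"
print(section_1 + section_2)
```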

Test coverage:

  • Update preserves blank lines ✓
  • Insert preserves blank lines ✓

Issue #197: Append Position Bug ⚠️ HIGH

Problem:
--position append inserted content at the beginning instead of the end. _get_section_append_line() searched only for the . separator, so it missed Level-1 children, which use the : separator.

Example paths:

  • doc:child (Level 1 - uses :)
  • doc:child.grandchild (Level 2+ - uses .)

Solution:

  • Updated descendant search to check for both : and . separators
  • Correctly finds all children regardless of nesting level
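The separator check can be illustrated with a small, self-contained model. Everything here is an assumption made for the example: the `Section` record, its `end_line` field, and both function names are hypothetical and do not come from dacli. Only the path convention (`:` for Level 1, `.` below that) is taken from the description above.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """Hypothetical section record; the real dacli data model may differ."""
    path: str
    end_line: int


def is_descendant(candidate: str, parent: str) -> bool:
    # Level-1 children are joined with ':', deeper levels with '.'.
    return candidate.startswith((parent + ":", parent + "."))


def section_append_line(sections: list[Section], parent_path: str) -> int:
    # Append after the last line of the parent or any of its descendants.
    parent = next(s for s in sections if s.path == parent_path)
    descendant_ends = [
        s.end_line for s in sections if is_descendant(s.path, parent_path)
    ]
    return max([parent.end_line, *descendant_ends])


sections = [
    Section("doc", end_line=2),
    Section("doc:child", end_line=6),
    Section("doc:child.grandchild", end_line=10),
]
print(section_append_line(sections, "doc"))
print(section_append_line(sections, "doc:child"))
```

With a check for `.` only, no path in this model starts with `doc.`, so the search would find no descendants of `doc` and fall back to the parent's own end line, which matches the reported symptom of content landing right after the title.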

Test coverage:

  • Append inserts at end ✓
  • Handles nested sections correctly ✓

Changes

Modified:

  • src/dacli/cli.py - Added _process_escape_sequences(), fixed _get_section_append_line()
  • src/dacli/services/content_service.py - Ensure blank lines between sections

Added:

  • tests/test_edit_operations_bugs.py - 12 comprehensive tests for all three bugs

Test Results

501 tests passing (12 new tests, no regressions)

New test coverage:

  1. Encoding (6 tests):

    • Preserves umlauts
    • Handles escape sequences (\n, \t, \\)
    • Mixed content
  2. Blank lines (2 tests):

    • Update preserves spacing
    • Insert preserves spacing
  3. Append position (2 tests):

    • Appends at end of document
    • Appends after all child sections
  4. Integration (2 tests):

    • Combined fixes work together
    • Real-world scenarios

Examples

Before (Broken):

# Encoding issue
dacli insert "doc:section" --content "Über uns äöü"
# Result: "Ãber uns Ã¤Ã¶Ã¼" ❌

# Blank line issue
== Section 1
Content
== Section 2  ← Runs together ❌

# Append issue
dacli insert "doc" --position append --content "Appendix"
# Result: Inserted after title, not at end ❌

After (Fixed):

# Encoding works
dacli insert "doc:section" --content "Über uns äöü"
# Result: "Über uns äöü" ✓

# Blank lines preserved
== Section 1

Content

== Section 2  ✓

# Append at end
dacli insert "doc" --position append --content "Appendix"
# Result: Inserted at end after all sections ✓

Impact

Severity: HIGH - These bugs affected core functionality:

  • Encoding: Made the tool unusable for non-English documentation
  • Blank lines: Broke document formatting by running sections together
  • Append: Inserted at the beginning instead of the end, the opposite of the expected behavior

Backwards compatibility: No breaking changes, only fixes

Fixes #193, #194, #197

🤖 Generated with Claude Code

raifdmueller and others added 2 commits January 26, 2026 23:07
This commit fixes three critical bugs in edit operations (update/insert):

Issue docToolchain#193: Encoding problem with umlauts
- Problem: .encode('utf-8').decode('unicode_escape') corrupted non-ASCII
  characters like umlauts (äöü ß)
- Solution: Replaced with custom _process_escape_sequences() function that
  handles \n, \t, \r, \\ without corrupting UTF-8 characters
- Files: src/dacli/cli.py

Issue docToolchain#194: Missing blank lines after edit operations
- Problem: Sections ended with single newline, causing sections to run together
- Solution: Ensure content ends with blank line (two newlines) to properly
  separate sections
- Files: src/dacli/services/content_service.py

Issue docToolchain#197: append inserts at beginning instead of end
- Problem: _get_section_append_line() only searched for '.' separator,
  missing Level-1 children that use ':' separator (e.g., 'doc:child')
- Solution: Check for both ':' and '.' separators when finding descendants
- Files: src/dacli/cli.py

Changes:
- Added _process_escape_sequences() helper function
- Updated _get_section_append_line() to handle both path separators
- Modified content_service.py to ensure blank lines between sections
- Added comprehensive tests (12 new tests, all passing)

Test coverage:
- Encoding: umlauts, escape sequences, mixed content
- Blank lines: preserved after update and insert
- Append position: inserts at end, handles nested sections
- Integration: combined bug fixes work together

All 501 tests passing (no regressions)

Fixes docToolchain#193, docToolchain#194, docToolchain#197

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix linter errors from CI.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@rdmueller rdmueller merged commit 8a457cc into docToolchain:main Jan 26, 2026
4 checks passed
@raifdmueller raifdmueller deleted the fix/edit-operations-bugs-193-194-197 branch January 26, 2026 22:16

Development

Successfully merging this pull request may close these issues.

🐛 Encoding problem with umlauts in edit operations (update/insert)