refactor: improve text sanitization and alias generation in ER diagram builder #1785

Merged
Artuomka merged 1 commit into main from bakcned_mermain_diagram_fixes
May 19, 2026
@Artuomka Artuomka commented May 19, 2026

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved sanitization of special characters in entity relationship diagrams to prevent rendering issues with complex database schemas.
    • Enhanced handling of reserved keywords in diagram labels and identifiers to ensure compatibility with diagram generation tools.
  • Refactor

    • Streamlined internal diagram generation logic for better maintainability and robustness.

Copilot AI review requested due to automatic review settings May 19, 2026 09:10
coderabbitai Bot commented May 19, 2026

Walkthrough

The buildMermaidErDiagram utility is refactored to improve text sanitization across diagram elements. Text escaping is updated from escapeQuotes to sanitizeQuotedText, which also normalizes whitespace. New helper functions toEntityAlias and toAttributeWord guard against Mermaid reserved keywords. These sanitization improvements are applied to table headers, relationship labels, column comments, and alias generation.

Changes

Mermaid Sanitization and Helper Refactoring

| Layer / File(s) | Summary |
|---|---|
| Sanitization helper implementations (`backend/src/entities/connection/utils/build-mermaid-er-diagram.util.ts`) | New `toEntityAlias`, `toAttributeWord`, and `sanitizeQuotedText` helper functions provide unified sanitization logic to guard against reserved Mermaid keywords and normalize whitespace in quoted text. |
| Alias generation and comment building with reserved keyword guards (`backend/src/entities/connection/utils/build-mermaid-er-diagram.util.ts`) | Mermaid reserved keyword sets are defined. `makeUniqueAlias` is refactored to use `toEntityAlias` for base generation. `buildColumnComment` is updated to use `sanitizeQuotedText` for its output. |
| Diagram element rendering updates (`backend/src/entities/connection/utils/build-mermaid-er-diagram.util.ts`) | Table headers, relationship labels, and column attribute rendering are updated to use `sanitizeQuotedText` for quoted text in Mermaid syntax. |
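
For illustration, here is a minimal sketch of how sanitized text might be placed into Mermaid ER syntax. Only `sanitizeQuotedText` mirrors the body shown in this PR's diff; `ColumnSketch`, `renderEntity`, and `renderRelationship` are hypothetical names invented for this example and are not taken from the utility itself.

```typescript
// Hypothetical column shape for illustration; the PR does not show the real types.
interface ColumnSketch {
  name: string;
  type: string;
  comment?: string;
}

// Same body as the sanitizeQuotedText shown in this PR's diff.
function sanitizeQuotedText(value: string): string {
  return value.replace(/"/g, "'").replace(/[\r\n\t]+/g, ' ');
}

// Renders one entity block: alias["label"] { type name "comment" }.
// Assumption: alias, type, and name are already Mermaid-safe words.
function renderEntity(alias: string, label: string, columns: ColumnSketch[]): string {
  const lines = [`  ${alias}["${sanitizeQuotedText(label)}"] {`];
  for (const column of columns) {
    const comment = column.comment ? ` "${sanitizeQuotedText(column.comment)}"` : '';
    lines.push(`    ${column.type} ${column.name}${comment}`);
  }
  lines.push('  }');
  return lines.join('\n');
}

// Renders a one-to-many relationship with a quoted label.
function renderRelationship(parent: string, child: string, label: string): string {
  return `  ${parent} ||--o{ ${child} : "${sanitizeQuotedText(label)}"`;
}

const diagram = [
  'erDiagram',
  renderEntity('user_orders', 'user "orders"', [
    { name: 'id', type: 'int' },
    { name: 'note', type: 'text', comment: 'free\ntext' },
  ]),
  renderRelationship('users', 'user_orders', 'user_id'),
].join('\n');

console.log(diagram);
```

Every quoted position (entity label, column comment, relationship label) goes through the same function, so a stray double quote or newline in a table name or comment cannot terminate the Mermaid string early.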

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A rabbit hops through quoted text,
Escaping quotes and whitespace woes,
Reserved keywords guard the jest,
New helpers bloom like Mermaid rose. 🌸

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main changes: refactoring text sanitization logic and improving alias generation in the ER diagram builder utility. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Security Check | ✅ Passed | Security improved. Enhanced sanitization of special chars (newlines/tabs) and added protection against Mermaid injection via reserved word validation. No regressions detected. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


@Artuomka Artuomka enabled auto-merge May 19, 2026 09:10

Copilot AI left a comment

Pull request overview

This PR refactors Mermaid ER diagram generation to improve identifier/text sanitization and to generate safer aliases/attribute words (including handling some Mermaid-reserved words and marker keyword collisions).

Changes:

  • Replaces quote-escaping with sanitizeQuotedText() and applies it to entity labels, FK labels, and column comments.
  • Introduces toEntityAlias() / toAttributeWord() to generate Mermaid-safe entity aliases and attribute words (prefixing when reserved/invalid).
  • Adds reserved-word sets for Mermaid entities and attribute marker keywords.

-function escapeQuotes(value: string): string {
-	return value.replace(/"/g, "'");
+function sanitizeQuotedText(value: string): string {
+	return value.replace(/"/g, "'").replace(/[\r\n\t]+/g, ' ');
@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (3)
backend/src/entities/connection/utils/build-mermaid-er-diagram.util.ts (3)

142-148: 💤 Low value

Consider adding a fallback for empty sanitized identifiers.

If the input contains only special characters, sanitizeIdentifier returns an empty string, resulting in t_ as the alias. While syntactically valid, this is not very descriptive.

Consider using a more meaningful fallback like t_table for empty cases:

 function toEntityAlias(value: string): string {
 	const sanitized = sanitizeIdentifier(value);
-	if (sanitized.length === 0 || /^[0-9]/.test(sanitized) || MERMAID_ENTITY_RESERVED_WORDS.has(sanitized)) {
+	const base = sanitized.length === 0 ? 'table' : sanitized;
+	if (sanitized.length === 0 || /^[0-9]/.test(base) || MERMAID_ENTITY_RESERVED_WORDS.has(base)) {
-		return `t_${sanitized}`;
+		return `t_${base}`;
 	}
-	return sanitized;
+	return base;
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/src/entities/connection/utils/build-mermaid-er-diagram.util.ts`
around lines 142 - 148, The toEntityAlias function returns a bare "t_" when
sanitizeIdentifier yields an empty string; update toEntityAlias (which calls
sanitizeIdentifier and checks MERMAID_ENTITY_RESERVED_WORDS) to provide a
meaningful fallback for empty sanitized identifiers (e.g., return "t_table" or
"t_<original-safe-name>") instead of "t_"; keep the existing numeric and
reserved-word checks and ensure the fallback is only used when sanitized.length
=== 0 so callers of toEntityAlias receive a descriptive alias.
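
A self-contained sketch of the suggested fallback may make the behaviour easier to check. The bodies of `sanitizeIdentifier` and `MERMAID_ENTITY_RESERVED_WORDS` are not shown in this thread, so the versions below are stand-ins: the character class and the single reserved-word entry are assumptions for illustration only.

```typescript
// Stand-in for the PR's sanitizeIdentifier, whose body is not shown in this thread.
// Assumption: it strips every character outside [A-Za-z0-9_].
function sanitizeIdentifier(value: string): string {
  return value.replace(/[^A-Za-z0-9_]/g, '');
}

// Placeholder entry only; the PR's actual reserved-word list is not shown here.
const MERMAID_ENTITY_RESERVED_WORDS = new Set<string>(['erDiagram']);

// toEntityAlias with the suggested 'table' fallback for empty identifiers.
function toEntityAlias(value: string): string {
  const sanitized = sanitizeIdentifier(value);
  const base = sanitized.length === 0 ? 'table' : sanitized;
  if (sanitized.length === 0 || /^[0-9]/.test(base) || MERMAID_ENTITY_RESERVED_WORDS.has(base)) {
    return `t_${base}`;
  }
  return base;
}

console.log(toEntityAlias('$$$'));       // t_table
console.log(toEntityAlias('2fa'));       // t_2fa
console.log(toEntityAlias('erDiagram')); // t_erDiagram
console.log(toEntityAlias('users'));     // users
```

Note that every name consisting only of special characters maps to the same `t_table` base under this sketch, so uniqueness would still depend on `makeUniqueAlias`, which the walkthrough says uses `toEntityAlias` for base generation.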

162-164: 💤 Low value

Consider collapsing consecutive spaces for cleaner output.

The current regex replaces each run of \r, \n, and \t characters with a single space, but a mixed run such as "\n " (a newline followed by a space) still produces a double space, because the existing space is left in place. Consider normalizing all whitespace runs:

 function sanitizeQuotedText(value: string): string {
-	return value.replace(/"/g, "'").replace(/[\r\n\t]+/g, ' ');
+	return value.replace(/"/g, "'").replace(/\s+/g, ' ').trim();
 }

This collapses all consecutive whitespace (including spaces) and trims leading/trailing spaces.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/src/entities/connection/utils/build-mermaid-er-diagram.util.ts`
around lines 162 - 164, sanitizeQuotedText currently only collapses CR/LF/TAB
but leaves mixed runs like "\n " producing double spaces; update the function
(sanitizeQuotedText) to first replace double quotes with single quotes, then
normalize all whitespace runs by replacing /\s+/ with a single space and trim
the result so leading/trailing spaces are removed, ensuring cleaner, collapsed
whitespace in the returned string.
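
The difference between the two variants can be seen side by side in a short sketch. The function bodies are copied from the diff above; the names `sanitizeCurrent` and `sanitizeProposed` are invented here purely to compare them.

```typescript
// Current behaviour, as shown on the '-' line of the suggestion.
function sanitizeCurrent(value: string): string {
  return value.replace(/"/g, "'").replace(/[\r\n\t]+/g, ' ');
}

// Proposed behaviour, as shown on the '+' line of the suggestion.
function sanitizeProposed(value: string): string {
  return value.replace(/"/g, "'").replace(/\s+/g, ' ').trim();
}

// A newline followed by a space, plus a trailing tab.
const input = 'first line\n second "line"\t';

console.log(JSON.stringify(sanitizeCurrent(input)));  // "first line  second 'line' "
console.log(JSON.stringify(sanitizeProposed(input))); // "first line second 'line'"
```

The proposed variant also collapses runs of ordinary spaces that were already in the input, which changes labels that deliberately contain repeated spaces; whether that matters depends on how the labels are used.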

150-156: 💤 Low value

Consider adding a fallback for empty sanitized identifiers.

Similar to toEntityAlias, if the input sanitizes to an empty string, this returns just _. Consider a more descriptive fallback:

 function toAttributeWord(value: string): string {
 	const sanitized = sanitizeIdentifier(value);
-	if (sanitized.length === 0 || /^[0-9]/.test(sanitized) || MERMAID_ATTRIBUTE_KEY_WORDS.has(sanitized)) {
+	const base = sanitized.length === 0 ? 'attr' : sanitized;
+	if (sanitized.length === 0 || /^[0-9]/.test(base) || MERMAID_ATTRIBUTE_KEY_WORDS.has(base)) {
-		return `_${sanitized}`;
+		return `_${base}`;
 	}
-	return sanitized;
+	return base;
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/src/entities/connection/utils/build-mermaid-er-diagram.util.ts`
around lines 150 - 156, toAttributeWord currently returns just "_" when
sanitizeIdentifier(value) yields an empty string; update the function so empty
sanitized identifiers get a descriptive fallback instead of a bare underscore.
In function toAttributeWord, change the empty-case branch to return a clear
fallback like `_attribute` (or better `_attribute_<shortHash>` computed from the
original value for uniqueness) rather than `_`; you can add a small helper
(e.g., hashString) to produce a short hex/hash suffix if needed to avoid
collisions while keeping the result a valid identifier.
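
One possible shape for the hash-suffixed fallback is sketched below. Everything except the overall structure of `toAttributeWord` is an assumption: the `sanitizeIdentifier` stand-in, the contents of `MERMAID_ATTRIBUTE_KEY_WORDS` (Mermaid's `PK`, `FK`, and `UK` attribute keys are used as placeholders), and the `hashString` helper, which here is a simple djb2-style hash rather than anything from the codebase.

```typescript
// Stand-in for the PR's sanitizeIdentifier, whose body is not shown in this thread.
// Assumption: it strips every character outside [A-Za-z0-9_].
function sanitizeIdentifier(value: string): string {
  return value.replace(/[^A-Za-z0-9_]/g, '');
}

// Placeholder entries; the PR's actual keyword set is not shown here.
const MERMAID_ATTRIBUTE_KEY_WORDS = new Set<string>(['PK', 'FK', 'UK']);

// Hypothetical helper: djb2-style hash rendered as 8 lowercase hex digits.
function hashString(value: string): string {
  let hash = 5381;
  for (let i = 0; i < value.length; i++) {
    hash = ((hash * 33) ^ value.charCodeAt(i)) >>> 0;
  }
  return ('0000000' + hash.toString(16)).slice(-8);
}

function toAttributeWord(value: string): string {
  const sanitized = sanitizeIdentifier(value);
  if (sanitized.length === 0) {
    // Derive the suffix from the original value so different inputs rarely collide.
    return `_attribute_${hashString(value)}`;
  }
  if (/^[0-9]/.test(sanitized) || MERMAID_ATTRIBUTE_KEY_WORDS.has(sanitized)) {
    return `_${sanitized}`;
  }
  return sanitized;
}

console.log(toAttributeWord('PK'));   // _PK
console.log(toAttributeWord('1st'));  // _1st
console.log(toAttributeWord('name')); // name
console.log(toAttributeWord('???'));  // _attribute_ followed by 8 hex digits
```

A 32-bit hash reduces collisions between all-special-character names but does not rule them out, so the simpler `_attr` fallback from the diff above may be sufficient if such column names are rare.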
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1760512f-08f6-456d-827c-0d45740fedf4

📥 Commits

Reviewing files that changed from the base of the PR and between d2d09a8 and 0dcdcc3.

📒 Files selected for processing (1)
  • backend/src/entities/connection/utils/build-mermaid-er-diagram.util.ts

@Artuomka Artuomka merged commit 9d2bdc8 into main May 19, 2026
20 checks passed
@Artuomka Artuomka deleted the bakcned_mermain_diagram_fixes branch May 19, 2026 09:18