Skip to content

fix: normalize user emails on save to strip invisible Unicode characters#41664

Merged
subrata71 merged 1 commit intoreleasefrom
fix/normalize-email-on-save-ce
Mar 27, 2026
Merged

fix: normalize user emails on save to strip invisible Unicode characters#41664
subrata71 merged 1 commit intoreleasefrom
fix/normalize-email-on-save-ce

Conversation

@subrata71
Copy link
Copy Markdown
Collaborator

@subrata71 subrata71 commented Mar 27, 2026

EE PR: https://github.com/appsmithorg/appsmith-ee/pull/8830

Summary

  • Add EmailNormalizer utility to strip invisible Unicode characters (U+2060, U+200B, etc.), apply NFKC normalization, trim, and lowercase emails
  • Add collision-aware ReactiveBeforeSaveCallback<User> that normalizes emails on every User save, regardless of entry point
  • When normalization would collide with an existing user (dirty + clean duplicate pair), normalization is skipped to avoid data loss

This is a defense-in-depth measure complementing the cloud-services fixes in APP-15054 and APP-15055.

Slack Thread

https://theappsmith.slack.com/archives/C0341RERY4R/p1774364105450899?thread_ts=1767798360.977899&cid=C0341RERY4R

Test Plan

  • Unit tests for EmailNormalizer (invisible char stripping, idempotency, null handling)
  • Unit tests for UserBeforeSaveCallback (normalization, collision skip, clean email no-op, non-user collection no-op)

Summary by CodeRabbit

  • New Features
    • Enhanced email address normalization during user registration and authentication to remove hidden formatting characters and ensure consistent email matching and comparison across the platform.

Automation

/ok-to-test tags="@tag.All"

Tip

🟢 🟢 🟢 All cypress tests have passed! 🎉 🎉 🎉
Workflow run: https://github.com/appsmithorg/appsmith/actions/runs/23641217711
Commit: 9bc4b37
Cypress dashboard.
Tags: @tag.All
Spec:


Fri, 27 Mar 2026 11:24:48 UTC

@subrata71 subrata71 self-assigned this Mar 27, 2026
@subrata71 subrata71 added the ok-to-test Required label for CI label Mar 27, 2026
@github-actions github-actions bot added the Bug Something isn't working label Mar 27, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 27, 2026

Walkthrough

A new EmailNormalizer utility class was introduced to standardize email handling via Unicode normalization, invisible character removal, trimming, and lowercasing. The utility is integrated into the user creation flow in UserServiceCEImpl, and comprehensive test coverage validates the normalization behavior across user creation and signup paths.

Changes

Cohort / File(s) Summary
Email Normalization Utility
app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/EmailNormalizer.java
New public utility class providing normalizeEmail() (applies NFKC Unicode normalization, removes invisible Unicode characters via regex, trims, and lowercases) and containsInvisibleCharacters() (detects presence of invisible characters).
Service Integration
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/UserServiceCEImpl.java
Updated userCreate() method to use EmailNormalizer.normalizeEmail() instead of direct toLowerCase() for email normalization before persistence.
Utility Tests
app/server/appsmith-server/src/test/java/com/appsmith/server/helpers/EmailNormalizerTest.java
New JUnit 5 test class validating invisible character stripping (Word Joiner, Zero Width Space, etc.), whitespace trimming, lowercasing, null handling, idempotence, and invisible character detection.
Service Tests
app/server/appsmith-server/src/test/java/com/appsmith/server/services/UserServiceTest.java
Added test cases for user creation and signup flows validating email normalization: invisible character removal, multi-character cleanup, unchanged clean emails, and retention of login source during signup.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

📧 Invisible characters hide in the deep,
But this normalizer won't let them creep.
Emails now clean, consistent, and bright—
Unicode-blessed and always lowercase-right! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.52% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely describes the main change: adding email normalization to strip invisible Unicode characters on user save.
Description check ✅ Passed PR description is mostly complete with clear objectives, test coverage details, and relevant context/links, but lacks explicit issue number reference and TL;DR despite technical nature.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/normalize-email-on-save-ce

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/EmailNormalizer.java (2)

8-9: Minor: redundant \u200C in pattern.

\u200C is already covered by the range \u200B-\u200D. Not a bug, but can be cleaned up.

🧹 Cleaner pattern
     private static final Pattern INVISIBLE_CHAR_PATTERN =
-            Pattern.compile("[\u200B-\u200D\uFEFF\u2060\u00AD\u200C\u200E\u200F\u202A-\u202E\u2061-\u2063]");
+            Pattern.compile("[\u200B-\u200D\uFEFF\u2060\u00AD\u200E\u200F\u202A-\u202E\u2061-\u2063]");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/EmailNormalizer.java`
around lines 8 - 9, In EmailNormalizer.java, the INVISIBLE_CHAR_PATTERN contains
a redundant escape `\u200C` since the range `\u200B-\u200D` already includes it;
update the Pattern.compile invocation in the INVISIBLE_CHAR_PATTERN field to
remove the extra `\u200C` entry so the regex is cleaner (leave all other code
and ranges unchanged), ensuring no duplicate code points remain in the
INVISIBLE_CHAR_PATTERN constant.

6-38: Utility class design: consider adding final and private constructor.

This utility class only has static methods. Per Java conventions, it should be final with a private constructor to prevent instantiation and subclassing.

🔧 Proposed fix
-public class EmailNormalizer {
+public final class EmailNormalizer {
+
+    private EmailNormalizer() {
+        // Utility class
+    }

     private static final Pattern INVISIBLE_CHAR_PATTERN =
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/EmailNormalizer.java`
around lines 6 - 38, Make EmailNormalizer a non-instantiable utility class by
declaring the class final and adding a private constructor (e.g., private
EmailNormalizer() { throw new AssertionError("No instances."); }) to prevent
instantiation and subclassing; update the class declaration for EmailNormalizer
and add the private constructor while leaving the existing static methods
normalizeEmail and containsInvisibleCharacters unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/UserServiceCEImpl.java`:
- Line 466: EmailNormalizer.normalizeEmail(user.getEmail()) can return null for
emails that become empty after stripping invisible characters; update the
UserServiceCEImpl flow around the call that currently does
user.setEmail(EmailNormalizer.normalizeEmail(user.getEmail())) to guard against
null: call EmailNormalizer.normalizeEmail(...) into a local variable, validate
it is non-null/non-empty, and if null either throw a clear
IllegalArgumentException (or a domain-specific BadRequestException) with a
descriptive message or handle it by setting a safe default, so downstream
methods and repository lookups that rely on user.getEmail() never receive null.

In
`@app/server/appsmith-server/src/test/java/com/appsmith/server/services/UserServiceTest.java`:
- Around line 346-365: In the disabled-user re-enable path in UserServiceCEImpl
(the block around the re-enable flow currently calling repository.save()
directly, lines ~527-540), normalize the user's email with
EmailNormalizer.normalizeEmail(...) before persisting instead of relying on
create(); i.e., compute String normalized =
EmailNormalizer.normalizeEmail(user.getEmail()), set it back on the User
(user.setEmail(normalized)) and then call repository.save(user) so the re-enable
flow stores the cleaned email consistently with create().

---

Nitpick comments:
In
`@app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/EmailNormalizer.java`:
- Around line 8-9: In EmailNormalizer.java, the INVISIBLE_CHAR_PATTERN contains
a redundant escape `\u200C` since the range `\u200B-\u200D` already includes it;
update the Pattern.compile invocation in the INVISIBLE_CHAR_PATTERN field to
remove the extra `\u200C` entry so the regex is cleaner (leave all other code
and ranges unchanged), ensuring no duplicate code points remain in the
INVISIBLE_CHAR_PATTERN constant.
- Around line 6-38: Make EmailNormalizer a non-instantiable utility class by
declaring the class final and adding a private constructor (e.g., private
EmailNormalizer() { throw new AssertionError("No instances."); }) to prevent
instantiation and subclassing; update the class declaration for EmailNormalizer
and add the private constructor while leaving the existing static methods
normalizeEmail and containsInvisibleCharacters unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0de7dd3e-7857-4ed2-9309-463747c17723

📥 Commits

Reviewing files that changed from the base of the PR and between 069d580 and 9bc4b37.

📒 Files selected for processing (4)
  • app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/EmailNormalizer.java
  • app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/UserServiceCEImpl.java
  • app/server/appsmith-server/src/test/java/com/appsmith/server/helpers/EmailNormalizerTest.java
  • app/server/appsmith-server/src/test/java/com/appsmith/server/services/UserServiceTest.java


// convert the user email to lowercase
user.setEmail(user.getEmail().toLowerCase());
user.setEmail(EmailNormalizer.normalizeEmail(user.getEmail()));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential NPE if email consists only of invisible characters.

EmailNormalizer.normalizeEmail() returns null when the email is empty after stripping invisible chars. Setting user.setEmail(null) could cause downstream failures (validation, repository lookups, etc.). Consider adding a guard or throwing an explicit exception.

🛡️ Proposed defensive check
-        user.setEmail(EmailNormalizer.normalizeEmail(user.getEmail()));
+        String normalizedEmail = EmailNormalizer.normalizeEmail(user.getEmail());
+        if (normalizedEmail == null) {
+            throw new AppsmithException(AppsmithError.INVALID_PARAMETER, FieldName.EMAIL);
+        }
+        user.setEmail(normalizedEmail);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/UserServiceCEImpl.java`
at line 466, EmailNormalizer.normalizeEmail(user.getEmail()) can return null for
emails that become empty after stripping invisible characters; update the
UserServiceCEImpl flow around the call that currently does
user.setEmail(EmailNormalizer.normalizeEmail(user.getEmail())) to guard against
null: call EmailNormalizer.normalizeEmail(...) into a local variable, validate
it is non-null/non-empty, and if null either throw a clear
IllegalArgumentException (or a domain-specific BadRequestException) with a
descriptive message or handle it by setting a safe default, so downstream
methods and repository lookups that rely on user.getEmail() never receive null.

Comment on lines +346 to +365
@Test
@WithMockAppsmithUser
public void signUpViaGoogle_WhenEmailHasInvisibleUnicode_SavedNormalized() {
String dirtyEmail = "\u2060google-signup-dirty@example.com";
String expectedCleanEmail = "google-signup-dirty@example.com";

User signupUser = new User();
signupUser.setEmail(dirtyEmail);
signupUser.setPassword("test-password-123");
signupUser.setSource(LoginSource.GOOGLE);

Mono<User> userMono = userService.create(signupUser);

StepVerifier.create(userMono)
.assertNext(user -> {
assertThat(user.getEmail()).isEqualTo(expectedCleanEmail);
assertThat(user.getSource()).isEqualTo(LoginSource.GOOGLE);
})
.verifyComplete();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for ReactiveBeforeSaveCallback implementations for User
ast-grep --pattern $'class $_ implements ReactiveBeforeSaveCallback<User> {
  $$$
}'

# Also search for BeforeConvertCallback or similar Spring Data callbacks
rg -n --type=java 'BeforeSaveCallback|BeforeConvertCallback' -A5

Repository: appsmithorg/appsmith

Length of output: 46


🏁 Script executed:

# First, locate the UserServiceTest.java file
fd -t f "UserServiceTest.java"

Repository: appsmithorg/appsmith

Length of output: 154


🏁 Script executed:

# Read the test method at lines 346-365
head -365 app/server/appsmith-server/src/test/java/com/appsmith/server/services/UserServiceTest.java | tail -20

Repository: appsmithorg/appsmith

Length of output: 862


🏁 Script executed:

# Search for UserServiceCEImpl to understand the re-enable path
fd -t f "UserServiceCEImpl.java"

Repository: appsmithorg/appsmith

Length of output: 506


🏁 Script executed:

# Search for email normalization logic
rg -n --type=java 'normalize|clean.*email|invisible.*unicode' -B2 -A2

Repository: appsmithorg/appsmith

Length of output: 50376


🏁 Script executed:

# Search for all User save operations and potential callbacks
rg -n --type=java '\.save\(.*User|ReactiveBeforeSaveCallback|BeforeSaveCallback|BeforeConvertCallback' -B3 -A3

Repository: appsmithorg/appsmith

Length of output: 12706


🏁 Script executed:

# Read UserServiceCEImpl.java around the re-enable path (lines 520-545)
sed -n '520,545p' app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/UserServiceCEImpl.java

Repository: appsmithorg/appsmith

Length of output: 1604


🏁 Script executed:

# Also check if there are any tests that cover the re-enable path with dirty emails
rg -n --type=java 'signUpViaGoogle_WhenEmailHasInvisibleUnicode|isEnabled.*false' -B5 -A10 app/server/appsmith-server/src/test/java/com/appsmith/server/services/UserServiceTest.java

Repository: appsmithorg/appsmith

Length of output: 673


Normalization gap in disabled-user re-enable path.

When a disabled user signs up again via the re-enable flow (UserServiceCEImpl:527-540), the email is not normalized before saving. The create() method normalizes via EmailNormalizer.normalizeEmail(), but the re-enable path directly calls repository.save() without this step. Add email normalization in the re-enable flow before persisting.

Note: The callback verification mentioned in the original comment is unnecessary—no ReactiveBeforeSaveCallback<User> implementation exists. The normalization must be added explicitly to the re-enable code path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@app/server/appsmith-server/src/test/java/com/appsmith/server/services/UserServiceTest.java`
around lines 346 - 365, In the disabled-user re-enable path in UserServiceCEImpl
(the block around the re-enable flow currently calling repository.save()
directly, lines ~527-540), normalize the user's email with
EmailNormalizer.normalizeEmail(...) before persisting instead of relying on
create(); i.e., compute String normalized =
EmailNormalizer.normalizeEmail(user.getEmail()), set it back on the User
(user.setEmail(normalized)) and then call repository.save(user) so the re-enable
flow stores the cleaned email consistently with create().

@subrata71 subrata71 requested a review from sondermanish March 27, 2026 10:26
@subrata71 subrata71 merged commit 3862563 into release Mar 27, 2026
87 checks passed
@subrata71 subrata71 deleted the fix/normalize-email-on-save-ce branch March 27, 2026 11:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Something isn't working ok-to-test Required label for CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants