fix: add support for additional languages/dialects (closes #9) #24

Merged
master-wayne7 merged 2 commits into master-wayne7:main from Deepak8858:fix/issue-9-new-languages
Apr 8, 2026

Conversation

Deepak8858 commented Apr 7, 2026

This PR adds support for Bengali (bn), Gujarati (gu), Punjabi (pa), Swahili (sw), and Urdu (ur). These languages are now part of the Language enum and have corresponding bad word lists in assets/data/.

Summary by CodeRabbit

  • New Features
    • Expanded language support: Bengali, Gujarati, Punjabi, Swahili, and Urdu — selectable in app settings for a localized experience.
  • Content
    • Added corresponding language data files (word lists) to improve localized handling and coverage across the newly supported languages.

coderabbitai bot commented Apr 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2cbf355b-78cc-4c53-bec8-e53f64f88b5e

📥 Commits

Reviewing files that changed from the base of the PR and between 6220865 and 692f24c.

📒 Files selected for processing (1)
  • assets/data/bn.txt
✅ Files skipped from review due to trivial changes (1)
  • assets/data/bn.txt

📝 Walkthrough

Adds five new languages (Bengali, Gujarati, Punjabi, Swahili, Urdu): new text asset files for each language and corresponding enum values and file-code mappings in lib/src/models/language.dart. No other code or public APIs changed.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Language Data Assets: `assets/data/bn.txt`, `assets/data/gu.txt`, `assets/data/pa.txt`, `assets/data/sw.txt`, `assets/data/ur.txt` | Added five plain-text data files containing newline-delimited token lists: Bengali (`bn.txt`, 11 lines), Gujarati (`gu.txt`, 11 lines), Punjabi (`pa.txt`, 16 lines), Swahili (`sw.txt`, 21 lines), Urdu (`ur.txt`, 19 lines). No code changes in these files. |
| Language Model: `lib/src/models/language.dart` | Added enum members `bengali`, `gujarati`, `punjabi`, `swahili`, `urdu`; extended `LanguageExtension.fromString(String)` to accept `bn`/`bengali`, `gu`/`gujarati`, `pa`/`punjabi`, `sw`/`swahili`, `ur`/`urdu`; updated the `fileCode` getter to return `bn`, `gu`, `pa`, `sw`, `ur` for the new enums. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

Suggested labels

enhancement

Poem

🐰 A hop, a nibble, five new tongues to say,
Bengali, Gujarati — bright as day,
Punjabi, Swahili, Urdu join the run,
Tokens tucked in files, one by one,
I nibble code and dance — hooray, well done!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding support for five new languages (Bengali, Gujarati, Punjabi, Swahili, Urdu) to the Language enum with corresponding data files. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
assets/data/bn.txt (1)

1-13: Remove duplicate tokens to reduce redundant matching.

khanki (Line 1, Line 11) and ghu (Line 5, Line 13) are duplicated.

Proposed cleanup
 khanki
 madarchod
 bal
 choda
 ghu
 khankir
 magi
 saala
 bhenchod
 shala
-khanki
 tor
-ghu
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@assets/data/bn.txt` around lines 1 - 13, Remove the duplicate offensive
tokens by keeping a single instance of "khanki" and a single instance of "ghu"
in the token list (remove the second "khanki" and the second "ghu"), ensuring
the file contains only unique tokens; preserve the original ordering of first
occurrences when deduplicating and save the cleaned list back to the same file.
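
The cleanup requested above (keep the first occurrence of each token, preserve the original order) can be sketched as a small helper. This is a minimal Python sketch and not part of the PR: the names `dedupe_tokens` and `dedupe_file` are hypothetical, and it assumes one token per line, with surrounding whitespace stripped and blank lines skipped.

```python
from typing import Iterable, List


def dedupe_tokens(lines: Iterable[str]) -> List[str]:
    """Return tokens in first-seen order, dropping later duplicates.

    Assumption: one token per line; whitespace is stripped and blank
    lines are skipped. Matching is exact (case-sensitive).
    """
    seen = set()
    unique = []
    for raw in lines:
        token = raw.strip()
        if not token or token in seen:
            continue
        seen.add(token)
        unique.append(token)
    return unique


def dedupe_file(path: str) -> int:
    """Rewrite a word-list file in place; return how many lines were dropped."""
    with open(path, encoding="utf-8") as fh:
        original = fh.read().splitlines()
    cleaned = dedupe_tokens(original)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(cleaned) + "\n")
    return len(original) - len(cleaned)
```

Calling `dedupe_file("assets/data/bn.txt")` on the 13-line list shown in the proposed cleanup would drop the two repeated entries and leave 11 tokens, which is consistent with the 11 lines the walkthrough reports for `bn.txt`.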
lib/src/models/language.dart (1)

318-332: Add tests for the new language mappings and data loading.

The new fromString/fileCode mappings are correct, but there’s no coverage for Bengali/Gujarati/Punjabi/Swahili/Urdu in test/language_data_test.dart (see test/language_data_test.dart:1-96). Please add round-trip mapping tests plus at least one bad-word detection check per new language.

Also applies to: 494-503

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/src/models/language.dart` around lines 318 - 332, Add unit tests to cover
the new Language mappings and data loading for Bengali, Gujarati, Punjabi,
Swahili, and Urdu: in the language data test file add round-trip mapping
assertions that Language.fromString(...) returns the expected enum (e.g.,
Language.fromString('bn') -> Language.bengali) and that the enum returns the
correct fileCode/string representation (via the existing fileCode/toString
helper) for each new language, and add at least one bad-word detection assertion
per language using the existing bad-word lookup helper used by other tests
(reuse the same pattern/assertions as existing tests for other languages so they
load language data and detect a known bad word). Ensure you reference and
exercise Language.fromString, the enum values (Language.bengali,
Language.gujarati, Language.punjabi, Language.swahili, Language.urdu), and the
fileCode accessor so the tests assert both mapping directions and bad-word
detection.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dfb5ec04-dd84-48ed-8e37-3ab26d635071

📥 Commits

Reviewing files that changed from the base of the PR and between 0a57026 and 6220865.

📒 Files selected for processing (6)
  • assets/data/bn.txt
  • assets/data/gu.txt
  • assets/data/pa.txt
  • assets/data/sw.txt
  • assets/data/ur.txt
  • lib/src/models/language.dart

@master-wayne7 master-wayne7 merged commit 8072187 into master-wayne7:main Apr 8, 2026
3 checks passed
@master-wayne7
Owner

Hi @Deepak8858, I accidentally merged and reverted this earlier. I’ve recreated the PR from my side, thanks again for the contribution 🙌

@coderabbitai coderabbitai bot mentioned this pull request Apr 8, 2026