Conversation
|
Thanks for opening this, but we'd appreciate a little more information. Could you update it with more details? |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughReplaces AWSSDK.S3 usage with a SigV4-signed HTTP S3 client, removes the Amazon S3 client factory, and registers ChangesS3 Storage Modernization
Text Preprocessing for TTS
Sequence DiagramsequenceDiagram
participant Client
participant AudioProcessingService
participant TextPreprocessor
participant eSpeak as "eSpeak NG"
participant ffmpeg
Client->>AudioProcessingService: GenerateNormalizedWavAsync(text, voice)
AudioProcessingService->>TextPreprocessor: Preprocess(text, voice)
TextPreprocessor->>TextPreprocessor: Expand abbreviations & shorthand
TextPreprocessor->>TextPreprocessor: Normalize units & small numbers
TextPreprocessor-->>AudioProcessingService: normalized text
AudioProcessingService->>eSpeak: Synthesize normalized text
eSpeak-->>AudioProcessingService: raw.wav
AudioProcessingService->>ffmpeg: Normalize audio
ffmpeg-->>AudioProcessingService: normalized.wav
AudioProcessingService-->>Client: byte[] normalized audio
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs (1)
238-248: ⚡ Quick winRegex instantiation in loop impacts performance.
Regex.Replacewith a string pattern creates a newRegexinstance on every iteration. ForAbbreviationMapwith ~30+ entries, this is inefficient. Consider pre-compiling the patterns or using a single combined pattern.♻️ Proposed optimization using compiled regex cache
+ private static readonly Dictionary<string, Regex> AbbreviationRegexCache = + AbbreviationMap.ToDictionary( + kvp => kvp.Key, + kvp => new Regex($@"\b{Regex.Escape(kvp.Key)}\b", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled)); + private static string ExpandAbbreviations(string text) { - // Sort keys longest-first so "ALSEMS" is matched before "ALS". - foreach (var kvp in AbbreviationMap.OrderByDescending(k => k.Key.Length)) + foreach (var kvp in AbbreviationMap.OrderByDescending(k => k.Key.Length)) { - var pattern = $@"\b{Regex.Escape(kvp.Key)}\b"; - text = Regex.Replace(text, pattern, kvp.Value, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant); + text = AbbreviationRegexCache[kvp.Key].Replace(text, kvp.Value); } return text; }The same pattern applies to
ExpandDispatchShorthandandExpandAddressAbbreviations.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs` around lines 238 - 248, The current ExpandAbbreviations method (and similarly ExpandDispatchShorthand and ExpandAddressAbbreviations) builds a fresh Regex for each AbbreviationMap entry which is inefficient; fix by precompiling and caching Regex objects (or a single combined regex) for the map keys outside the loop (e.g., a static readonly Dictionary<string, Regex> or a static compiled Regex built from alternation of escaped keys ordered longest-first) and then use Regex.Replace with those precompiled instances (preserve RegexOptions.IgnoreCase | RegexOptions.CultureInvariant and compiled option). Update the referenced methods to use the cached/compiled regexes rather than instantiating new Regex on every iteration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Web/Resgrid.Web.Tts/Services/S3StorageService.cs`:
- Line 355: BuildCanonicalUri currently always returns "/{bucket}/{objectKey}"
which breaks virtual-host style signing; update the BuildCanonicalUri method to
detect the S3 style using the options flag (e.g., _options.ForcePathStyle) and
return "/{objectKey}" when path-style is disabled and "/{bucket}/{objectKey}"
when it is enabled, preserving the leading slash and using _options.Bucket and
the provided objectKey to construct the correct canonical URI.
- Around line 261-299: The canonicalHeaders string in CalculateSignature must be
built to include exactly the headers listed in signedHeaders (not always host,
x-amz-content-sha256 and x-amz-date); change CalculateSignature so it splits
signedHeaders and conditionally appends each header (host, x-amz-content-sha256,
x-amz-date) only if that header is present in signedHeaders, and compute
payloadHash consistently (use "UNSIGNED-PAYLOAD" when content is null and
x-amz-content-sha256 is not signed, or when it is signed use
HexSha256(content)); ensure BuildCanonicalUri, DeriveSigningKey, HmacSha256
usages remain unchanged and that the constructed canonicalRequest and
stringToSign use the adjusted canonicalHeaders and payloadHash.
In `@Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs`:
- Around line 273-282: The current ExpandSlashNotation uses word-boundary
anchors which fail for keys containing non-word chars like "W/"; update the
regex building there to match keys that are not surrounded by word characters by
replacing the \b-based pattern with lookaround-based boundaries (e.g., use a
left negative lookbehind for \w and a right negative lookahead for \w) so keys
like "W/" in SlashNotationMap will match before/after punctuation or spaces;
update the pattern creation in ExpandSlashNotation and keep RegexOptions the
same.
---
Nitpick comments:
In `@Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs`:
- Around line 238-248: The current ExpandAbbreviations method (and similarly
ExpandDispatchShorthand and ExpandAddressAbbreviations) builds a fresh Regex for
each AbbreviationMap entry which is inefficient; fix by precompiling and caching
Regex objects (or a single combined regex) for the map keys outside the loop
(e.g., a static readonly Dictionary<string, Regex> or a static compiled Regex
built from alternation of escaped keys ordered longest-first) and then use
Regex.Replace with those precompiled instances (preserve RegexOptions.IgnoreCase
| RegexOptions.CultureInvariant and compiled option). Update the referenced
methods to use the cached/compiled regexes rather than instantiating new Regex
on every iteration.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b6cd7563-0bfe-439a-b0fa-accfddef5bda
📒 Files selected for processing (7)
Web/Resgrid.Web.Tts/Program.csWeb/Resgrid.Web.Tts/Resgrid.Web.Tts.csprojWeb/Resgrid.Web.Tts/Services/AudioProcessingService.csWeb/Resgrid.Web.Tts/Services/ITextPreprocessor.csWeb/Resgrid.Web.Tts/Services/S3StorageService.csWeb/Resgrid.Web.Tts/Services/TextPreprocessor.csWeb/Resgrid.Web.Tts/Services/TtsService.cs
💤 Files with no reviewable changes (1)
- Web/Resgrid.Web.Tts/Resgrid.Web.Tts.csproj
| { | ||
| _logger.LogDebug( | ||
| "TextPreprocessor normalised \"{OriginalText}\" to \"{NormalisedText}\"", | ||
| original, |
There was a problem hiding this comment.
Actionable comments posted: 6
♻️ Duplicate comments (1)
Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs (1)
223-228:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't log raw dispatch text here.
These values can include addresses, names, phone numbers, or medical details. Logging both copies at debug level creates an unnecessary PII/PHI leak surface.
💡 Safer logging alternative
_logger.LogDebug( - "TextPreprocessor normalised \"{OriginalText}\" to \"{NormalisedText}\"", - original, - result); + "TextPreprocessor normalised input for voice {Voice}. OriginalLength={OriginalLength}, NormalisedLength={NormalisedLength}", + voice, + original.Length, + result.Length);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs` around lines 223 - 228, In TextPreprocessor, remove the debug log that prints raw dispatch text (the _logger.LogDebug call that interpolates original and result) to avoid PII/PHI leaks; instead log a non-sensitive summary such as that normalization occurred and include safe metadata (e.g., a hash/fingerprint of original/result, their lengths, or which normalization rules matched) or the specific transformations applied so you can trace behavior without storing raw text.
🧹 Nitpick comments (1)
Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs (1)
360-374: ⚡ Quick winExtract
IsEnglishVoiceto one shared helper.
Web/Resgrid.Web.Tts/Services/AudioProcessingService.cs:102-116carries the same predicate. If the accepted voice set changes in one place, preprocessing and synthesis will silently diverge.As per coding guidelines, "Design for testability; avoid hidden dependencies inside methods and prefer explicit, pure functions".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs` around lines 360 - 374, The IsEnglishVoice logic is duplicated between TextPreprocessor.IsEnglishVoice and AudioProcessingService (lines ~102-116); extract this predicate into a single public static helper (e.g., VoiceUtils.IsEnglishVoice or TtsVoiceHelper.IsEnglishVoice) that implements the same pure logic (null/whitespace check, trim, '+' variant split, compare "en", "en-..." and "mb-us1" case-insensitively), replace both existing implementations to call that helper, make the helper internal/public static so both classes can reference it, and add/adjust unit tests to cover the single helper instead of two copies.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Web/Resgrid.Web.Tts/Services/S3StorageService.cs`:
- Around line 346-410: BuildObjectUrl and GetHost assume _options.Endpoint is a
valid URI and always call GetEndpointUri(), which throws when _options.Endpoint
is blank; mirror BuildDirectObjectUrl behavior by treating an
empty/null/whitespace Endpoint as “use AWS regional endpoint”: update
BuildObjectUrl and GetHost to check string.IsNullOrWhiteSpace(_options.Endpoint)
and construct the host/URL using the regional pattern
(https://{Bucket}.s3.{Region}.amazonaws.com and corresponding host/authority)
when Endpoint is empty, otherwise keep the existing GetEndpointUri() logic;
reference methods BuildObjectUrl, GetHost, GetEndpointUri and the
_options.Endpoint/_options.Region/_options.Bucket fields when making the change.
- Around line 149-155: The objectKey is interpolated raw into URLs and the SigV4
canonical URI (in GetObjectUrlAsync, BuildObjectUrl, BuildDirectObjectUrl,
BuildCanonicalUri) causing broken URLs/signature mismatches for characters like
spaces/#/?/%/+. Fix by percent-encoding the object key consistently before use:
encode each path segment (split on '/' and apply Uri.EscapeDataString then
rejoin with '/') and use that encodedKey for constructing the public URL
(_options.PublicBaseUrl), signed/base URLs (BuildObjectUrl,
BuildDirectObjectUrl) and the canonical URI generation (BuildCanonicalUri) so
the visible URL and the SigV4 canonical path match exactly.
- Around line 192-195: The code adds the x-amz-content-sha256 header twice (to
request.Headers and to request.Content.Headers), causing a duplicate header and
breaking S3 signature verification; remove the second addition to
request.Content.Headers so only request.Headers.Add("x-amz-content-sha256",
payloadHash) is used (or alternatively guard so the header is only added once if
present), locating this change around the payloadHash computation in
S3StorageService (the lines that call request.Headers.Add and
request.Content.Headers.Add).
- Around line 274-305: The code currently builds canonical headers by
enumerating signedHeadersSet (a HashSet) which loses the SignedHeaders order;
replace the HashSet-based enumeration with a stable ordered list parsed from
signedHeaders (e.g., var signedHeadersList = signedHeaders.Split(';',
StringSplitOptions.RemoveEmptyEntries).Select(h => h.Trim()).ToList()) and
iterate signedHeadersList when appending to canonicalHeadersBuilder so headers
appear in the exact SignedHeaders order; if you still need fast membership
checks (used later when computing sha256HeaderSigned), keep a separate
case-insensitive HashSet for Contains checks (or use
signedHeadersList.Contains("x-amz-content-sha256", StringComparer.Ordinal)) to
preserve order while retaining correct membership semantics.
In `@Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs`:
- Around line 195-197: In TextPreprocessor (in the method containing the if
(string.IsNullOrWhiteSpace(text)) branch) normalize blank input to a stable
empty string by returning string.Empty instead of returning text ??
string.Empty; update the branch so any null or whitespace-only input collapses
to string.Empty to prevent whitespace-only strings from flowing into synthesis.
- Around line 325-333: The regex in ExpandAddressAbbreviations only matches when
the abbreviation immediately follows a digit+space, so addresses like "123 Main
St" are missed; update the pattern used when replacing AddressAbbreviationMap
entries so it anchors to an address-like prefix (the house number) but allows
intermediate street-name tokens between the number and the suffix (e.g. require
a leading digit/word-boundary like \b\d+\b then allow one or more words/spaces
before matching the abbreviation), keep the word boundary on the abbreviation
(kvp.Key) and preserve RegexOptions.IgnoreCase | RegexOptions.CultureInvariant
and the OrderByDescending(k => k.Key.Length) handling in
ExpandAddressAbbreviations.
---
Duplicate comments:
In `@Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs`:
- Around line 223-228: In TextPreprocessor, remove the debug log that prints raw
dispatch text (the _logger.LogDebug call that interpolates original and result)
to avoid PII/PHI leaks; instead log a non-sensitive summary such as that
normalization occurred and include safe metadata (e.g., a hash/fingerprint of
original/result, their lengths, or which normalization rules matched) or the
specific transformations applied so you can trace behavior without storing raw
text.
---
Nitpick comments:
In `@Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs`:
- Around line 360-374: The IsEnglishVoice logic is duplicated between
TextPreprocessor.IsEnglishVoice and AudioProcessingService (lines ~102-116);
extract this predicate into a single public static helper (e.g.,
VoiceUtils.IsEnglishVoice or TtsVoiceHelper.IsEnglishVoice) that implements the
same pure logic (null/whitespace check, trim, '+' variant split, compare "en",
"en-..." and "mb-us1" case-insensitively), replace both existing implementations
to call that helper, make the helper internal/public static so both classes can
reference it, and add/adjust unit tests to cover the single helper instead of
two copies.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fe0ca918-2ad0-4134-b6bb-a61f98230483
📒 Files selected for processing (2)
Web/Resgrid.Web.Tts/Services/S3StorageService.csWeb/Resgrid.Web.Tts/Services/TextPreprocessor.cs
| if (string.IsNullOrWhiteSpace(text)) | ||
| { | ||
| return text ?? string.Empty; |
There was a problem hiding this comment.
Normalize blank input to string.Empty.
This branch preserves whitespace-only input unchanged, so " " still flows into synthesis instead of collapsing to a stable empty value.
💡 Suggested fix
if (string.IsNullOrWhiteSpace(text))
{
- return text ?? string.Empty;
+ return string.Empty;
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Web/Resgrid.Web.Tts/Services/TextPreprocessor.cs` around lines 195 - 197, In
TextPreprocessor (in the method containing the if
(string.IsNullOrWhiteSpace(text)) branch) normalize blank input to a stable
empty string by returning string.Empty instead of returning text ??
string.Empty; update the branch so any null or whitespace-only input collapses
to string.Empty to prevent whitespace-only strings from flowing into synthesis.
|
Approve |
Summary by CodeRabbit
New Features
Performance & Reliability
Tests
Documentation