@alisonhawk alisonhawk commented Oct 3, 2025

  • Added OpenAI client initialization with API key validation.
  • Created validation utility for OpenAI chat completions using Zod schema.
  • Developed complete post analysis functionality, integrating title generation, categorization, and sentiment analysis.
  • Implemented Redis post group management with title and sentiment summary generation.
  • Added Redis client initialization and connection handling with error management.
  • Created deduplication listener for posts using cosine similarity on embeddings.
  • Seeded Redis database with mock data and provided command-line interface for seeding operations.
  • Added embedding retrieval with retry logic for OpenAI API calls.

Summary by CodeRabbit

  • New Features
    • Added robust Redis client with retry/backoff and safer logging.
    • Introduced Redis seeding from sample data and utilities to check/test Redis.
    • Added background listener to auto-deduplicate highly similar posts.
  • Refactor
    • Reorganized modules and updated import paths across the app with no behavioral changes to analysis/classification flows.
  • Tests
    • Added/updated simple tests and scripts for embeddings and Redis connectivity.
  • Chores
    • Updated seed and Redis module paths to align with the new structure.


coderabbitai bot commented Oct 3, 2025

Walkthrough

Relocates several analysis and OpenAI modules and updates import paths accordingly. Introduces a robust Redis client with initialization, retries, and concurrency control. Adds Redis seeding, test, and check utilities. Moves the Redis deduplication listener into src/redis with keyspace subscription; removes older root-level equivalents.

Changes

| Cohort / File(s) | Summary of changes |
| --- | --- |
| CLI import re-targeting: run-classifier.ts | Updated internal imports to new module locations; no control-flow changes. |
| Analysis module relocation: src/analysis/analyzeSentiment.ts, src/analysis/generateTitle.ts, src/post/completePostAnalysis.ts, src/lib/constants.ts | Added relocation comments; updated imports to new analysis/openai paths; added bridging imports to .js counterparts; no logic changes. |
| OpenAI/util path updates: src/openai/classifyWithOpenAI.ts, src/openai/openaiValidationUtil.ts, src/test/embeddingTest.ts | Adjusted import paths to reflect new structure; added categorizePost import; updated utility import paths. |
| Seed and linking path updates: src/seed/seedData.ts, src/seed/seedDatabase.ts, src/redis/postGroup.ts | Updated imports to reference moved Redis and analysis modules; behavior unchanged. |
| Redis client (new): src/redis/redisClient.ts | Added env config, IPv4 URL handling, redacted logging, singleton init with retry/backoff, concurrent init coordination, error handling; exported initRedis and stricter getRedisClient. |
| Redis seeding and checks (add/remove): src/redis/redisSeed.ts, src/redis/redisCheck.ts, src/redisCheck.ts (removed) | Added seeding from optional JSON file with init/quit handling; added check script (duplicated block present); removed old root-level check script. |
| Redis dedupe listener move: src/redis/redisDedupeListener.ts, src/redisDedupeListener.ts (removed) | Implemented keyspace-based deduplication listener under src/redis; subscribes to posts set events, computes embeddings, prunes near-duplicates; includes locking and caching; file content duplicated; removed old root-level listener. |
| Redis connectivity test (new): src/redis/redisTest.ts | Added simple ping test using initRedis; contains redundant duplicate import lines. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Cron as Cron/External Writer
  participant R as Redis
  participant L as Dedupe Listener
  participant E as Embedding API

  Note over R,L: Keyspace notifications enabled for "posts"
  Cron->>R: SET "posts" "[...posts]"
  R-->>L: __keyspace@0__:posts = "set" event

  rect rgba(200,235,255,0.3)
    L->>R: SETNX "dedupe:lock:posts" 1 (TTL short)
    alt lock acquired
      L->>R: GET "posts"
      L->>L: Parse JSON, pick latest and prior
      opt missing embeddings
        L->>R: GET "emb:<id>"
        alt cache miss
          L->>E: getEmbeddingWithRetry(post)
          E-->>L: embedding
          L->>R: SETEX "emb:<id>" embedding
        end
      end
      L->>L: cosineSimilarity(latest vs prior)
      alt similarity > 0.95
        L->>R: SET "posts" "[...filtered]"
      else no duplicate
        Note over L: Do nothing
      end
    else lock not acquired
      Note over L: Another worker handling
    end
    L->>R: DEL "dedupe:lock:posts"
  end
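
The similarity step in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the listener's actual code: the `EmbeddedPost` shape and the `pruneLatestIfDuplicate` helper are hypothetical, and dropping the latest post (rather than the prior one) is an assumption, since the diagram only says the list is filtered. The 0.95 threshold is taken from the diagram.

```typescript
// Hypothetical shape: the PR's real post type is not shown in this review.
interface EmbeddedPost {
  id: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 when either vector has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumption: the newest post is the one removed when it nearly matches any earlier post.
function pruneLatestIfDuplicate(
  posts: EmbeddedPost[],
  threshold = 0.95
): EmbeddedPost[] {
  if (posts.length < 2) return posts;
  const latest = posts[posts.length - 1];
  const prior = posts.slice(0, -1);
  const isDuplicate = prior.some(
    (p) => cosineSimilarity(p.embedding, latest.embedding) > threshold
  );
  return isDuplicate ? prior : posts;
}
```

In the flow shown in the diagram, the result of such a filter would be written back with `SET "posts"` only when the list actually changed.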
sequenceDiagram
  autonumber
  participant App as Caller
  participant RC as redisClient.ts
  participant R as Redis

  App->>RC: initRedis()
  alt not initialized and not connecting
    RC->>R: connect (attempt 1..5 with backoff)
    alt connect ok
      RC-->>App: client
    else connect fails after retries
      RC-->>App: throw original error
    end
  else connecting in-flight
    App-->>RC: awaits connectingPromise
    RC-->>App: client or error
  end

  App->>RC: getRedisClient()
  alt client ready
    RC-->>App: client
  else not initialized/closed
    RC-->>App: throw "initialize first"
  end
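
The initialization flow in the diagram above (a single client, bounded retries with backoff, and concurrent callers awaiting one in-flight promise) could look something like the sketch below. It is not the code in redisClient.ts: `Connectable`, `createInitializer`, and the delay values are hypothetical, and this version rethrows the last error seen, which may differ from what the PR does.

```typescript
// Hypothetical stand-in for a Redis client; only connect() matters here.
interface Connectable {
  connect(): Promise<void>;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

function createInitializer<T extends Connectable>(
  factory: () => T,
  maxAttempts = 5,
  baseDelayMs = 100
): () => Promise<T> {
  let client: T | null = null;
  let connecting: Promise<T> | null = null;

  async function connectWithRetry(): Promise<T> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      const candidate = factory();
      try {
        await candidate.connect();
        return candidate;
      } catch (err) {
        lastError = err;
        // Exponential backoff between attempts; no wait after the final failure.
        if (attempt < maxAttempts) {
          await sleep(baseDelayMs * Math.pow(2, attempt - 1));
        }
      }
    }
    throw lastError;
  }

  return function init(): Promise<T> {
    if (client) return Promise.resolve(client);
    // Concurrent callers share the same in-flight promise.
    if (!connecting) {
      connecting = connectWithRetry().then(
        (c) => {
          client = c;
          connecting = null;
          return c;
        },
        (err) => {
          connecting = null; // allow a later call to try again
          throw err;
        }
      );
    }
    return connecting;
  };
}
```

Clearing `connecting` on failure is a design choice in this sketch: it lets a later caller start a fresh round of attempts instead of receiving a cached rejection.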

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • tasin2610

Poem

hop hop, I shuffle paths with care,
wires replugged, Redis hums in air.
Keys go ping, dupes go poof—so neat!
Backoff, lock, a tidy feat.
Title tulips bloom in code’s bright light—
a rabbit signs: deploy tonight. ✨🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title “feat: Implement OpenAI client and validation utilities” accurately describes the addition of API key handling and Zod-based validation for OpenAI chat completions but does not reflect the broader scope of this pull request, which also introduces post‐analysis workflows, Redis integration, deduplication, seeding scripts, and embedding retry logic. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
src/redis/redisTest.ts (1)

1-25: Critical: Remove duplicate code.

The file contains two identical copies of the testRedis function and its invocation (lines 1-12 and lines 13-24). This appears to be a merge artifact or copy-paste error. Additionally, the getRedisClient import on line 13 is unused.

Apply this diff to remove the duplicate code:

-import { initRedis } from './redisClient.js';
-
-async function testRedis() {
-  const client = await initRedis();
-
-  const pong = await client.ping();
-  console.log('Ping Response:', pong); // Output: "PONG" means connection success
-
-  await client.disconnect();
-}
-
-testRedis();
 import { initRedis, getRedisClient } from './redisClient.js';
 
 async function testRedis() {

Or if the second block was intended (with getRedisClient import), remove the first block instead.

src/redis/redisSeed.ts (1)

36-42: Avoid throw in finally block.

Throwing in a finally block (line 41) overwrites any error from the try block. If redisClient.set() fails and redisClient.quit() also fails, only the quit error will propagate, masking the original seeding failure.

Log the disconnect error instead of rethrowing:

     try {
       await redisClient.quit();
     } catch (err) {
       console.error('Failed to disconnect Redis client:', err);
-      // rethrow so callers see the original failure if needed
-      throw err;
     }
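
To illustrate why the rethrow matters, here is a small self-contained sketch. The function names and the `write`/`quit` callbacks are hypothetical stand-ins for the seeding and disconnect calls, not the code in redisSeed.ts. When both steps fail, the first variant surfaces only the cleanup error, while the second lets the original failure propagate.

```typescript
// Pattern flagged above: cleanup inside finally rethrows its own error.
async function seedWithRethrow(
  write: () => Promise<void>,
  quit: () => Promise<void>
): Promise<void> {
  try {
    await write();
  } finally {
    try {
      await quit();
    } catch (err) {
      throw err; // replaces any error already thrown by write()
    }
  }
}

// Suggested shape: log the cleanup failure so the write error still propagates.
async function seedWithLogging(
  write: () => Promise<void>,
  quit: () => Promise<void>,
  log: (...args: unknown[]) => void = console.error
): Promise<void> {
  try {
    await write();
  } finally {
    try {
      await quit();
    } catch (err) {
      log('Failed to disconnect Redis client:', err);
    }
  }
}
```

With a `write` that rejects with "seed failed" and a `quit` that rejects with "quit failed", `seedWithRethrow` rejects with the quit error and `seedWithLogging` rejects with the seed error.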
src/redis/redisClient.ts (2)

60-66: Unsafe type assertion during cleanup.

Line 65 uses (newClient as any).isOpen to check if the client is open before disconnecting. This type assertion bypasses TypeScript's type checking and could fail if the property doesn't exist or has a different structure in a failed connection state.

Consider a safer approach:

         try {
           // Attempt to cleanly disconnect if partially connected
-          // ignore errors from disconnect
-          // eslint-disable-next-line @typescript-eslint/no-explicit-any
-          if ((newClient as any).isOpen) await newClient.disconnect();
+          // Try to disconnect, ignoring any errors
+          await newClient.disconnect();
-        } catch (e) {}
+        } catch (e) {
+          // Ignore cleanup errors
+        }

Any error that disconnect() raises on a client that never connected is absorbed by the surrounding catch, so the explicit isOpen check may be unnecessary.
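
If the open-state check is worth keeping, one way to avoid the `any` cast is a narrow structural type. The sketch below assumes the client exposes an optional boolean `isOpen` and a promise-returning `disconnect()`; the `Disconnectable` interface and `safeDisconnect` helper are hypothetical names, not part of the PR.

```typescript
// Hypothetical minimal view of a client; a real Redis client exposes far more.
interface Disconnectable {
  isOpen?: boolean;
  disconnect(): Promise<void>;
}

// Best-effort cleanup: never throws. Resolves to true only if disconnect() succeeded.
async function safeDisconnect(client: Disconnectable): Promise<boolean> {
  if (client.isOpen === false) return false; // known to be closed, nothing to do
  try {
    await client.disconnect();
    return true;
  } catch {
    return false; // cleanup errors are deliberately ignored
  }
}
```

Because the parameter is typed structurally, any object with a compatible shape can be passed without a type assertion.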


9-20: Remove duplicate safeRedisUrlForLog definitions and unused utility.

safeRedisUrlForLog is defined at lines 9 and 100 but never used; remove the duplicates or extract them to a shared helper if needed.

🧹 Nitpick comments (3)
src/redis/redisSeed.ts (1)

20-28: Consider validating parsed JSON structure.

After parsing sample_posts.json, the code assumes the result is an array of valid InputPost objects. JSON that parses cleanly but is not an array, or whose elements have missing or incorrect fields, could cause downstream issues.

Add runtime validation:

     try {
       const raw = fs.readFileSync(dataPath, 'utf8');
-      inputPosts = JSON.parse(raw);
+      const parsed = JSON.parse(raw);
+      if (!Array.isArray(parsed)) {
+        throw new Error('Expected an array of posts');
+      }
+      inputPosts = parsed;
       console.log(`Loaded ${inputPosts.length} posts from data/sample_posts.json`);
     } catch (err) {

For stronger validation, consider using a schema validator like Zod (already a project dependency per learnings).
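
As a dependency-free illustration of element-level validation, the sketch below uses a hand-written type guard. The `InputPost` fields shown (`id`, `text`) are hypothetical, since the real type is not visible in this review, and `parseInputPosts` is an illustrative helper rather than code from the PR. A Zod schema would express the same check more compactly.

```typescript
// Hypothetical shape; the real InputPost fields are not shown in this review.
interface InputPost {
  id: string;
  text: string;
}

// Runtime check that a parsed value has the fields the rest of the code relies on.
function isInputPost(value: unknown): value is InputPost {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.id === "string" && typeof record.text === "string";
}

// Parses raw JSON and rejects anything that is not an array of valid posts.
function parseInputPosts(raw: string): InputPost[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected an array of posts");
  }
  const badIndex = parsed.findIndex((item) => !isInputPost(item));
  if (badIndex !== -1) {
    throw new Error(`Invalid post at index ${badIndex}`);
  }
  return parsed as InputPost[];
}
```

Reporting the index of the first invalid element makes a bad seed file easier to fix than a generic parse failure.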

src/post/completePostAnalysis.ts (2)

20-69: Simplify redundant error handling.

The error handling has some redundancy:

  1. Lines 34-36 throw an error when title is null; that error is immediately caught by the surrounding try-catch (lines 38-41) and added to the errors array.
  2. Line 39 (and the similar lines 50 and 61) converts the error to a string with a template literal, which discards the stack trace.

Consider simplifying by checking for null directly without throwing, and preserving error details:

 // Generate title
 try {
   const title = await generateTitleForPost(post, usedClient);
-  if (title === null) {
-    throw new Error(`Title generation failed for post: ${post}`);
-  }
-  result.title = title;
+  if (title === null) {
+    result.errors?.push('Title generation returned null');
+  } else {
+    result.title = title;
+  }
 } catch (e) {
-  result.errors?.push(`Title generation failed: ${e}`);
+  result.errors?.push(`Title generation failed: ${e instanceof Error ? e.message : String(e)}`);
   console.error('Title generation error:', e);
 }

Apply similar changes to the categorization and sentiment blocks.
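
Since the same pattern repeats for title, categorization, and sentiment, it could be factored into a shared helper. The sketch below is one possible shape under that assumption; `describeError` and `runStep` are hypothetical names that do not appear in the PR.

```typescript
// Extracts a readable message without assuming the thrown value is an Error.
function describeError(e: unknown): string {
  return e instanceof Error ? e.message : String(e);
}

// Runs one analysis step, recording a null result or a thrown error in `errors`.
// Resolves to undefined when the step produced nothing usable.
async function runStep<T>(
  label: string,
  step: () => Promise<T | null>,
  errors: string[]
): Promise<T | undefined> {
  try {
    const value = await step();
    if (value === null) {
      errors.push(`${label} returned null`);
      return undefined;
    }
    return value;
  } catch (e) {
    errors.push(`${label} failed: ${describeError(e)}`);
    console.error(`${label} error:`, e); // full error, including stack, goes to the log
    return undefined;
  }
}
```

A caller would assign the result only when it is defined, for example `const title = await runStep('Title generation', () => generateTitleForPost(post, usedClient), errors);`.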


72-95: Consider parallel processing for better performance.

The function processes posts sequentially, which could be slow for large batches. Consider using Promise.allSettled() for parallel processing with proper concurrency control:

 export async function analyzeMultipleCompletePosts(
   posts: string[],
   clientOverride?: OpenAI
 ): Promise<PostAnalysisResult[]> {
-  const results: PostAnalysisResult[] = [];
-
-  for (const post of posts) {
-    try {
-      const analysis = await analyzeCompletePost(post, clientOverride);
-      results.push(analysis);
-    } catch (e) {
-      console.error('Error analyzing post:', post, e);
-      results.push({
-        post,
-        title: '',
-        categorization: { categories: [], subcategories: [] },
-        sentiment: 'NEUTRAL',
-        errors: [`Complete analysis failed: ${e}`]
-      });
-    }
-  }
-
-  return results;
+  const results = await Promise.allSettled(
+    posts.map(post => analyzeCompletePost(post, clientOverride))
+  );
+
+  return results.map((result, index) => {
+    if (result.status === 'fulfilled') {
+      return result.value;
+    } else {
+      console.error('Error analyzing post:', posts[index], result.reason);
+      return {
+        post: posts[index],
+        title: '',
+        categorization: { categories: [], subcategories: [] },
+        sentiment: 'NEUTRAL',
+        errors: [`Complete analysis failed: ${result.reason}`]
+      };
+    }
+  });
 }

Note: Add concurrency limiting if processing many posts to avoid rate limits.
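
A concurrency limit can be added without extra dependencies by running a fixed number of workers over a shared index. The helper below is a minimal sketch of that idea; `mapSettledWithLimit` and the `Settled` type are hypothetical names, and the result shape mirrors `Promise.allSettled()` so the mapping code in the suggestion above could stay largely unchanged.

```typescript
type Settled<R> =
  | { status: "fulfilled"; value: R }
  | { status: "rejected"; reason: unknown };

// Runs `worker` over `items` with at most `limit` calls in flight.
// Results keep input order and never reject, like Promise.allSettled().
async function mapSettledWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<Settled<R>[]> {
  const results: Settled<R>[] = new Array(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until the list is exhausted.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        const value = await worker(items[index], index);
        results[index] = { status: "fulfilled", value };
      } catch (reason) {
        results[index] = { status: "rejected", reason };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

A suitable limit depends on the API's rate limits; a small value such as 3 to 5 is a cautious starting point rather than a recommendation from the PR.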

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ec8528 and a0dde88.

📒 Files selected for processing (18)
  • run-classifier.ts (1 hunks)
  • src/analysis/analyzeSentiment.ts (1 hunks)
  • src/analysis/generateTitle.ts (1 hunks)
  • src/lib/constants.ts (1 hunks)
  • src/openai/classifyWithOpenAI.ts (1 hunks)
  • src/openai/openaiValidationUtil.ts (1 hunks)
  • src/post/completePostAnalysis.ts (1 hunks)
  • src/redis/postGroup.ts (1 hunks)
  • src/redis/redisCheck.ts (1 hunks)
  • src/redis/redisClient.ts (1 hunks)
  • src/redis/redisDedupeListener.ts (1 hunks)
  • src/redis/redisSeed.ts (1 hunks)
  • src/redis/redisTest.ts (1 hunks)
  • src/redisCheck.ts (0 hunks)
  • src/redisDedupeListener.ts (0 hunks)
  • src/seed/seedData.ts (1 hunks)
  • src/seed/seedDatabase.ts (1 hunks)
  • src/test/embeddingTest.ts (1 hunks)
💤 Files with no reviewable changes (2)
  • src/redisCheck.ts
  • src/redisDedupeListener.ts
🧰 Additional context used
🧬 Code graph analysis (7)
src/redis/redisCheck.ts (1)
src/redis/redisClient.ts (2)
  • initRedis (25-83)
  • initRedis (116-174)
src/seed/seedData.ts (2)
src/seedDatabase.ts (5)
  • seedDatabase (9-38)
  • verifySeeding (63-87)
  • group (26-31)
  • post (28-30)
  • group (80-82)
src/postGroup.ts (2)
  • fetchPostGroupsFromRedis (38-50)
  • logTitlesForAllPostGroups (53-98)
src/openai/classifyWithOpenAI.ts (3)
src/completePostAnalysis.ts (3)
  • analyzeCompletePost (17-66)
  • result (115-137)
  • runCompleteAnalysisDemo (100-143)
src/generateTitle.ts (1)
  • generateSentimentSummariesForGroup (69-120)
src/classifyWithOpenAI.ts (2)
  • runCategorization (53-63)
  • categorizePost (26-50)
src/analysis/generateTitle.ts (3)
src/generateTitle.ts (1)
  • generateTitleForPost (32-66)
src/completePostAnalysis.ts (1)
  • analyzeCompletePost (17-66)
src/analyzeSentiment.ts (1)
  • runExample (53-75)
src/redis/redisDedupeListener.ts (2)
src/redis/redisClient.ts (2)
  • initRedis (25-83)
  • initRedis (116-174)
src/test/embeddingTest.ts (1)
  • getEmbeddingWithRetry (3-40)
src/redis/redisTest.ts (1)
src/redis/redisClient.ts (2)
  • initRedis (25-83)
  • initRedis (116-174)
src/redis/redisSeed.ts (1)
src/redis/redisClient.ts (2)
  • initRedis (25-83)
  • initRedis (116-174)
🪛 Biome (2.1.2)
src/redis/redisCheck.ts

[error] 16-16: Shouldn't redeclare 'initRedis'. Consider to delete it or rename it.

'initRedis' is defined here:

(lint/suspicious/noRedeclare)


[error] 18-18: Shouldn't redeclare 'checkRedisSeed'. Consider to delete it or rename it.

'checkRedisSeed' is defined here:

(lint/suspicious/noRedeclare)

src/redis/redisDedupeListener.ts

[error] 101-101: Shouldn't redeclare 'initRedis'. Consider to delete it or rename it.

'initRedis' is defined here:

(lint/suspicious/noRedeclare)


[error] 102-102: Shouldn't redeclare 'getEmbeddingWithRetry'. Consider to delete it or rename it.

'getEmbeddingWithRetry' is defined here:

(lint/suspicious/noRedeclare)


[error] 104-104: Shouldn't redeclare 'cosineSimilarity'. Consider to delete it or rename it.

'cosineSimilarity' is defined here:

(lint/suspicious/noRedeclare)


[error] 120-120: Shouldn't redeclare 'main'. Consider to delete it or rename it.

'main' is defined here:

(lint/suspicious/noRedeclare)

src/redis/redisClient.ts

[error] 92-92: Shouldn't redeclare 'dotenv'. Consider to delete it or rename it.

'dotenv' is defined here:

(lint/suspicious/noRedeclare)


[error] 93-93: Shouldn't redeclare 'createClient'. Consider to delete it or rename it.

'createClient' is defined here:

(lint/suspicious/noRedeclare)


[error] 93-93: Shouldn't redeclare 'RedisClientType'. Consider to delete it or rename it.

'RedisClientType' is defined here:

(lint/suspicious/noRedeclare)


[error] 98-98: Shouldn't redeclare 'REDIS_URL'. Consider to delete it or rename it.

'REDIS_URL' is defined here:

(lint/suspicious/noRedeclare)


[error] 100-100: Shouldn't redeclare 'safeRedisUrlForLog'. Consider to delete it or rename it.

'safeRedisUrlForLog' is defined here:

(lint/suspicious/noRedeclare)


[error] 113-113: Shouldn't redeclare 'client'. Consider to delete it or rename it.

'client' is defined here:

(lint/suspicious/noRedeclare)


[error] 114-114: Shouldn't redeclare 'connecting'. Consider to delete it or rename it.

'connecting' is defined here:

(lint/suspicious/noRedeclare)


[error] 116-116: Shouldn't redeclare 'initRedis'. Consider to delete it or rename it.

'initRedis' is defined here:

(lint/suspicious/noRedeclare)


[error] 176-176: Shouldn't redeclare 'getRedisClient'. Consider to delete it or rename it.

'getRedisClient' is defined here:

(lint/suspicious/noRedeclare)

src/lib/constants.ts

[error] 3-3: Expected an expression, or an assignment but instead found ']'.

Expected an expression, or an assignment here.

(parse)

src/redis/redisSeed.ts

[error] 87-87: Unsafe usage of 'throw'.

'throw' in 'finally' overwrites the control flow statements inside 'try' and 'catch'.

(lint/correctness/noUnsafeFinally)


[error] 47-47: Shouldn't redeclare 'initRedis'. Consider to delete it or rename it.

'initRedis' is defined here:

(lint/suspicious/noRedeclare)


[error] 48-48: Shouldn't redeclare 'fs'. Consider to delete it or rename it.

'fs' is defined here:

(lint/suspicious/noRedeclare)


[error] 49-49: Shouldn't redeclare 'path'. Consider to delete it or rename it.

'path' is defined here:

(lint/suspicious/noRedeclare)


[error] 51-51: Shouldn't redeclare 'InputPost'. Consider to delete it or rename it.

'InputPost' is defined here:

(lint/suspicious/noRedeclare)


[error] 59-59: Shouldn't redeclare 'seedRedis'. Consider to delete it or rename it.

'seedRedis' is defined here:

(lint/suspicious/noRedeclare)

🔇 Additional comments (9)
src/seed/seedData.ts (1)

1-1: LGTM: Import path correctly updated.

The import path for PostGroup has been correctly updated to reflect the relocation to the Redis module folder.

src/seed/seedDatabase.ts (1)

1-3: LGTM: Import paths correctly updated.

The import paths have been correctly updated to reflect the relocation of Redis-related modules to the src/redis folder.

src/openai/openaiValidationUtil.ts (1)

7-7: LGTM: Utils import path correctly updated.

The import path for utility functions has been correctly updated to reflect the parent directory relationship.

src/redis/postGroup.ts (1)

2-2: LGTM: Import path correctly updated.

The import path for generateTitle functions has been correctly updated to reflect the relocation to the analysis subfolder.

src/analysis/analyzeSentiment.ts (1)

5-6: LGTM: Import paths correctly updated.

The import paths for OpenAI client and validation utilities have been correctly updated to reflect the analysis subfolder location.

src/openai/classifyWithOpenAI.ts (1)

10-11: LGTM: Import paths correctly updated.

The import paths for generateTitleForPost and CATEGORIES have been correctly updated to reflect the new module locations.

src/redis/redisSeed.ts (1)

13-15: LGTM on Redis client initialization.

Correctly awaits initRedis() which handles retries and concurrent calls as shown in redisClient.ts.

src/post/completePostAnalysis.ts (1)

5-8: LGTM! Import paths updated correctly.

The import paths have been updated to reflect the module reorganization, moving OpenAI utilities to ../openai/ and analysis functions to ../analysis/. The paths are correct for the file's location.

src/redis/redisClient.ts (1)

85-91: LGTM! Clear initialization guard.

The function properly enforces that initRedis() must be called before accessing the client, with a helpful error message.

Comment on lines +1 to +3
// This file has been moved to the analysis subfolder.
// Please update your imports accordingly.
import { analyzeMultiplePosts } from './analyzeSentiment.js';

⚠️ Potential issue | 🔴 Critical

Critical: Remove incorrect self-referential import.

Lines 1-3 contain a relocation comment and a self-referential import that imports analyzeMultiplePosts from ./analyzeSentiment.js. This creates an incorrect circular dependency since the file is trying to import a function that it itself defines (the analyzeMultiplePosts function is defined in this same file at lines 22-54).

Apply this diff to remove the incorrect lines:

-// This file has been moved to the analysis subfolder.
-// Please update your imports accordingly.
-import { analyzeMultiplePosts } from './analyzeSentiment.js';
🤖 Prompt for AI Agents
In src/analysis/analyzeSentiment.ts lines 1-3, remove the self-referential
relocation comment and the import statement that imports analyzeMultiplePosts
from './analyzeSentiment.js' because analyzeMultiplePosts is defined in this
same file (lines 22-54) which creates a circular dependency; delete those lines
and ensure any external modules import this file from its new analysis/ path
instead of relying on this file to re-import itself.

Comment on lines +1 to +3
// This file has been moved to the analysis subfolder.
// Please update your imports accordingly.
export const CATEGORIES = [...]; // Keep the original content

⚠️ Potential issue | 🔴 Critical

Critical: Remove malformed placeholder export.

Lines 1-3 contain a relocation comment with a malformed export statement that has incomplete syntax (export const CATEGORIES = [...];). This causes a parse error as reported by static analysis tools. The actual CATEGORIES export is correctly defined on lines 6-25.

Additionally, the relocation comment states the file has been moved to the analysis subfolder, but the file path shows it's still in src/lib/constants.ts. This creates confusion about the actual file location.

Apply this diff to remove the malformed lines:

-// This file has been moved to the analysis subfolder.
-// Please update your imports accordingly.
-export const CATEGORIES = [...]; // Keep the original content
-// src/constants.ts
-

Based on static analysis hints.

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Biome (2.1.2)

[error] 3-3: Expected an expression, or an assignment but instead found ']'.

Expected an expression, or an assignment here.

(parse)

🤖 Prompt for AI Agents
In src/lib/constants.ts lines 1-3, remove the relocation comment and the
malformed placeholder export (the entire lines containing "// This file has been
moved..." and "export const CATEGORIES = [...];") so the file parses cleanly and
keeps the correct CATEGORIES export defined on lines 6-25; if the constants were
actually moved to an analysis subfolder, update any imports across the codebase
to point to the new path, otherwise delete the misleading relocation note only.

Comment on lines +1 to +3
// This file has been moved to the analysis subfolder.
// Please update your imports accordingly.
import { categorizePost } from './analysis/classifyWithOpenAI.js';

⚠️ Potential issue | 🔴 Critical

Critical: Remove incorrect self-referential import.

Lines 1-3 contain a relocation comment and a self-referential import that imports categorizePost from ./analysis/classifyWithOpenAI.js. This creates an incorrect circular dependency since:

  1. The file is located at src/openai/classifyWithOpenAI.ts, so the path ./analysis/classifyWithOpenAI.js would incorrectly point to src/openai/analysis/classifyWithOpenAI.js
  2. The categorizePost function is defined in this same file at line 29

The relocation comment suggests this file should be moved to the analysis subfolder, but it's still in src/openai/.

Apply this diff to remove the incorrect lines:

-// This file has been moved to the analysis subfolder.
-// Please update your imports accordingly.
-import { categorizePost } from './analysis/classifyWithOpenAI.js';
🤖 Prompt for AI Agents
In src/openai/classifyWithOpenAI.ts lines 1-3: remove the relocation comment and
the self-referential import of categorizePost from
./analysis/classifyWithOpenAI.js; keep the local categorizePost implementation
in this file (line ~29) as the source of truth, and update any external modules
that expect the function from the analysis subfolder to import from the correct
path (or move this file to src/analysis/ and update all imports accordingly) so
there is no circular or incorrect path reference.

Comment on lines +1 to +3
// This file has been moved to the analysis subfolder.
// Please update your imports accordingly.
import { callOpenAIWithValidation } from './analysis/openaiValidationUtil.js';

⚠️ Potential issue | 🔴 Critical

Critical: Remove incorrect self-referential import.

Lines 1-3 contain a relocation comment and a self-referential import that imports callOpenAIWithValidation from ./analysis/openaiValidationUtil.js. This creates an incorrect circular dependency since the file is trying to import a function from itself (or from a path that doesn't align with the current file location).

The relocation comment indicates this file should be moved to the analysis subfolder, but the file path shows it's still in src/openai/. Either:

  1. The file needs to be moved to src/analysis/openaiValidationUtil.ts and the self-import removed, or
  2. The relocation comment and self-import should be removed if the file is staying in src/openai/

Apply this diff to remove the incorrect lines:

-// This file has been moved to the analysis subfolder.
-// Please update your imports accordingly.
-import { callOpenAIWithValidation } from './analysis/openaiValidationUtil.js';
🤖 Prompt for AI Agents
In src/openai/openaiValidationUtil.ts lines 1-3, remove the relocation comment
and the self-referential import statement that imports callOpenAIWithValidation
from ./analysis/openaiValidationUtil.js; if the file was intended to be
relocated instead, move this file to src/analysis/openaiValidationUtil.ts and
update any external import paths to point to the new location (ensuring no
self-import remains) so there are no circular or incorrect imports.

Comment on lines +1 to +3
// This file has been moved to the analysis subfolder.
// Please update your imports accordingly.
import { analyzeCompletePost } from './analysis/completePostAnalysis.js';

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Potential circular import detected.

Line 3 imports analyzeCompletePost from './analysis/completePostAnalysis.js', but this file also defines and exports analyzeCompletePost at line 20. This creates a circular dependency risk if both files attempt to import from each other.

If this is a temporary forwarding stub during migration, ensure that './analysis/completePostAnalysis.js' is the canonical implementation and this file will be removed after all imports are updated. Otherwise, remove the import on line 3 to avoid the circular dependency.

Verify the module structure:


🏁 Script executed:

#!/bin/bash
# Check if ./analysis/completePostAnalysis.js exists and what it exports/imports
fd -t f 'completePostAnalysis' src/
rg -n "export.*analyzeCompletePost" src/
rg -n "from.*completePostAnalysis" src/

Length of output: 289


Fix missing target module and duplicate export. Line 3 imports from ./analysis/completePostAnalysis.js, which doesn’t exist, and line 20 redefines analyzeCompletePost. Move the implementation into src/post/analysis/completePostAnalysis.ts, update all imports, and remove the inline definition in this file.

🤖 Prompt for AI Agents
In src/post/completePostAnalysis.ts around lines 1 to 3, the file imports from
./analysis/completePostAnalysis.js which does not exist and also contains a
duplicate inline export of analyzeCompletePost (line ~20); move the existing
analyzeCompletePost implementation into a new file
src/post/analysis/completePostAnalysis.ts, export it there, update all project
imports to point to src/post/analysis/completePostAnalysis (use correct .ts path
or index barrel as per project convention), and remove the inline
analyzeCompletePost definition and duplicate export from
src/post/completePostAnalysis.ts so that the file only re-exports or imports the
implementation from the new module.

Comment on lines +1 to +30
import { initRedis } from './redis/redisClient.js';

async function checkRedisSeed() {
const client = await initRedis();
try {
const posts = JSON.parse((await client.get('posts')) || '[]');
console.log(JSON.stringify(posts));
} catch (err) {
console.error('Error checking Redis seed:', err);
} finally {
await client.disconnect();
}
}

checkRedisSeed();
import { initRedis } from './redisClient.js';

async function checkRedisSeed() {
const client = await initRedis();
try {
const posts = JSON.parse((await client.get('posts')) || '[]');
console.log(JSON.stringify(posts));
} catch (err) {
console.error('Error checking Redis seed:', err);
} finally {
await client.disconnect();
}
}

checkRedisSeed();

⚠️ Potential issue | 🔴 Critical

Eliminate duplicate block and fix Redis import path

The file currently includes two copies of checkRedisSeed. The first copy imports from './redis/redisClient.js', which does not exist relative to this file, and both copies redeclare the same symbols—TypeScript/Node will error and the script will fail to run. Drop the mistaken block and keep a single definition that imports from './redisClient.js'.

-import { initRedis } from './redis/redisClient.js';
-
-async function checkRedisSeed() {
-  const client = await initRedis();
-  try {
-    const posts = JSON.parse((await client.get('posts')) || '[]');
-    console.log(JSON.stringify(posts));
-  } catch (err) {
-    console.error('Error checking Redis seed:', err);
-  } finally {
-    await client.disconnect();
-  }
-}
-
-checkRedisSeed();
-import { initRedis } from './redisClient.js';
+import { initRedis } from './redisClient.js';
 
 async function checkRedisSeed() {
   const client = await initRedis();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import { initRedis } from './redisClient.js';

async function checkRedisSeed() {
  const client = await initRedis();
  try {
    const posts = JSON.parse((await client.get('posts')) || '[]');
    console.log(JSON.stringify(posts));
  } catch (err) {
    console.error('Error checking Redis seed:', err);
  } finally {
    await client.disconnect();
  }
}

checkRedisSeed();
🧰 Tools
🪛 Biome (2.1.2)

[error] 16-16: Shouldn't redeclare 'initRedis'. Consider to delete it or rename it.

'initRedis' is defined here:

(lint/suspicious/noRedeclare)


[error] 18-18: Shouldn't redeclare 'checkRedisSeed'. Consider to delete it or rename it.

'checkRedisSeed' is defined here:

(lint/suspicious/noRedeclare)

🤖 Prompt for AI Agents
In src/redis/redisCheck.ts lines 1-30 there are two duplicate definitions of
checkRedisSeed and an incorrect import path; remove the entire duplicated block
(keep only one function) and ensure the remaining import is corrected to import
{ initRedis } from './redisClient.js' (not './redis/redisClient.js'), leaving a
single checkRedisSeed implementation that calls initRedis, handles
try/catch/finally, and invokes checkRedisSeed() once.

Comment on lines +92 to +182
import dotenv from 'dotenv';
import { createClient, RedisClientType } from 'redis';

dotenv.config();

// Prefer IPv4 loopback by default to avoid ::1/IPv6 resolution issues on some systems
const REDIS_URL = (process.env.REDIS_URL || 'redis://127.0.0.1:6379').replace('localhost', '127.0.0.1');

function safeRedisUrlForLog(url: string) {
try {
const u = new URL(url);
// show protocol, host and port only; hide auth/userinfo
const host = u.hostname || '';
const port = u.port ? `:${u.port}` : '';
return `${u.protocol}//${host}${port}`;
} catch (e) {
// fallback: remove everything between // and @ if present
return url.replace(/\/\/.*@/, '//');
}
}

let client: RedisClientType | null = null;
let connecting: Promise<RedisClientType> | null = null;

export async function initRedis(): Promise<RedisClientType> {
if (client && client.isOpen) return client;

// If a connect is already in progress, wait for it and return the resulting client
if (connecting) {
try {
await connecting;
} catch (err) {
// if the previous connecting attempt failed, clear it and continue to try again
}
if (client && client.isOpen) return client;
}

// Start a single connecting promise that other callers can await
connecting = (async (): Promise<RedisClientType> => {
const newClient = createClient({ url: REDIS_URL });

newClient.on('error', (err: unknown) => {
console.error('Redis Client Error:', err);
});

const maxRetries = 5;
const baseDelayMs = 200; // exponential backoff base

for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
await newClient.connect();
// assign to module client (cast to satisfy TS) and return the concrete client
client = newClient as unknown as RedisClientType;
// Clear connecting before returning so subsequent callers don't wait
connecting = null;
return newClient as unknown as RedisClientType;
} catch (err) {
console.error(`Redis connect attempt ${attempt} failed:`, err);
// If last attempt, clean up and rethrow
if (attempt === maxRetries) {
try {
// Attempt to cleanly disconnect if partially connected
// ignore errors from disconnect
// eslint-disable-next-line @typescript-eslint/no-explicit-any
if ((newClient as any).isOpen) await newClient.disconnect();
} catch (e) {}
client = null;
connecting = null;
console.error('All Redis connection attempts failed.');
throw err;
}
const delay = baseDelayMs * 2 ** (attempt - 1);
await new Promise((res) => setTimeout(res, delay));
}
}

// unreachable, but satisfy TypeScript
connecting = null;
throw new Error('Redis connect failed');
})();

return connecting;
}

export function getRedisClient(): RedisClientType {
if (!client || !client.isOpen) {
throw new Error('Redis not initialized or client is closed. Call initRedis() first.');
}

return client;
}

⚠️ Potential issue | 🔴 Critical

Remove duplicate code block.

The entire file content (lines 1-91) has been duplicated at lines 92-182. This will cause compilation errors due to redeclaration of all identifiers, as correctly flagged by the static analysis tool.

Remove the duplicate lines:

-import dotenv from 'dotenv';
-import { createClient, RedisClientType } from 'redis';
-
-dotenv.config();
-
-// Prefer IPv4 loopback by default to avoid ::1/IPv6 resolution issues on some systems
-const REDIS_URL = (process.env.REDIS_URL || 'redis://127.0.0.1:6379').replace('localhost', '127.0.0.1');
-
-function safeRedisUrlForLog(url: string) {
-  try {
-    const u = new URL(url);
-    // show protocol, host and port only; hide auth/userinfo
-    const host = u.hostname || '';
-    const port = u.port ? `:${u.port}` : '';
-    return `${u.protocol}//${host}${port}`;
-  } catch (e) {
-    // fallback: remove everything between // and @ if present
-    return url.replace(/\/\/.*@/, '//');
-  }
-}
-
-let client: RedisClientType | null = null;
-let connecting: Promise<RedisClientType> | null = null;
-
-export async function initRedis(): Promise<RedisClientType> {
-  if (client && client.isOpen) return client;
-
-  // If a connect is already in progress, wait for it and return the resulting client
-  if (connecting) {
-    try {
-      await connecting;
-    } catch (err) {
-      // if the previous connecting attempt failed, clear it and continue to try again
-    }
-    if (client && client.isOpen) return client;
-  }
-
-  // Start a single connecting promise that other callers can await
-  connecting = (async (): Promise<RedisClientType> => {
-    const newClient = createClient({ url: REDIS_URL });
-
-    newClient.on('error', (err: unknown) => {
-      console.error('Redis Client Error:', err);
-    });
-
-    const maxRetries = 5;
-    const baseDelayMs = 200; // exponential backoff base
-
-    for (let attempt = 1; attempt <= maxRetries; attempt++) {
-      try {
-        await newClient.connect();
-        // assign to module client (cast to satisfy TS) and return the concrete client
-        client = newClient as unknown as RedisClientType;
-        // Clear connecting before returning so subsequent callers don't wait
-        connecting = null;
-        return newClient as unknown as RedisClientType;
-      } catch (err) {
-        console.error(`Redis connect attempt ${attempt} failed:`, err);
-        // If last attempt, clean up and rethrow
-        if (attempt === maxRetries) {
-          try {
-            // Attempt to cleanly disconnect if partially connected
-            // ignore errors from disconnect
-            // eslint-disable-next-line @typescript-eslint/no-explicit-any
-            if ((newClient as any).isOpen) await newClient.disconnect();
-          } catch (e) {}
-          client = null;
-          connecting = null;
-          console.error('All Redis connection attempts failed.');
-          throw err;
-        }
-        const delay = baseDelayMs * 2 ** (attempt - 1);
-        await new Promise((res) => setTimeout(res, delay));
-      }
-    }
-
-    // unreachable, but satisfy TypeScript
-    connecting = null;
-    throw new Error('Redis connect failed');
-  })();
-
-  return connecting;
-}
-
-export function getRedisClient(): RedisClientType {
-  if (!client || !client.isOpen) {
-    throw new Error('Redis not initialized or client is closed. Call initRedis() first.');
-  }
-
-  return client;
-}

The file should only contain lines 1-91.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
(Lines 92-182 deleted; no replacement content.)
🧰 Tools
🪛 Biome (2.1.2)

[error] 92-92: Shouldn't redeclare 'dotenv'. Consider to delete it or rename it.

'dotenv' is defined here:

(lint/suspicious/noRedeclare)


[error] 93-93: Shouldn't redeclare 'createClient'. Consider to delete it or rename it.

'createClient' is defined here:

(lint/suspicious/noRedeclare)


[error] 93-93: Shouldn't redeclare 'RedisClientType'. Consider to delete it or rename it.

'RedisClientType' is defined here:

(lint/suspicious/noRedeclare)


[error] 98-98: Shouldn't redeclare 'REDIS_URL'. Consider to delete it or rename it.

'REDIS_URL' is defined here:

(lint/suspicious/noRedeclare)


[error] 100-100: Shouldn't redeclare 'safeRedisUrlForLog'. Consider to delete it or rename it.

'safeRedisUrlForLog' is defined here:

(lint/suspicious/noRedeclare)


[error] 113-113: Shouldn't redeclare 'client'. Consider to delete it or rename it.

'client' is defined here:

(lint/suspicious/noRedeclare)


[error] 114-114: Shouldn't redeclare 'connecting'. Consider to delete it or rename it.

'connecting' is defined here:

(lint/suspicious/noRedeclare)


[error] 116-116: Shouldn't redeclare 'initRedis'. Consider to delete it or rename it.

'initRedis' is defined here:

(lint/suspicious/noRedeclare)


[error] 176-176: Shouldn't redeclare 'getRedisClient'. Consider to delete it or rename it.

'getRedisClient' is defined here:

(lint/suspicious/noRedeclare)

🤖 Prompt for AI Agents
In src/redis/redisClient.ts lines 92-182, the file contains a duplicate copy of
lines 1-91; remove the duplicated block so the file contains only the original
implementation (lines 1-91). After deleting the duplicate, ensure there is a
single declaration of imports, variables (client/connecting), and the exported
functions (initRedis and getRedisClient), then run the TypeScript build/linter
to verify no redeclarations remain.
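
For reference when checking the surviving copy, the retry loop quoted above waits `baseDelayMs * 2 ** (attempt - 1)` milliseconds after each failed attempt and rethrows on the last one. The sketch below only reproduces that arithmetic; `backoffDelayMs` and `backoffSchedule` are hypothetical helper names, not functions from the PR, and the defaults mirror the quoted `maxRetries = 5` and `baseDelayMs = 200`.

```typescript
// Hypothetical helper: delay (ms) slept after a failed attempt,
// mirroring `baseDelayMs * 2 ** (attempt - 1)` from the quoted loop.
function backoffDelayMs(attempt: number, baseDelayMs = 200): number {
  return baseDelayMs * 2 ** (attempt - 1);
}

// Hypothetical helper: the delays actually slept. No sleep follows the
// final attempt, because the quoted loop rethrows instead of waiting.
function backoffSchedule(maxRetries = 5, baseDelayMs = 200): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < maxRetries; attempt++) {
    delays.push(backoffDelayMs(attempt, baseDelayMs));
  }
  return delays;
}

const schedule = backoffSchedule();
const totalWaitMs = schedule.reduce((sum, d) => sum + d, 0);
console.log(schedule);    // [ 200, 400, 800, 1600 ]
console.log(totalWaitMs); // 3000
```

Under these assumptions a caller of initRedis() waits about three seconds in sleeps, plus the time each connect attempt takes, before the final error surfaces.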

Comment on lines +1 to +200
import { initRedis } from './redisClient.js';
import { getEmbeddingWithRetry } from '../test/embeddingTest.js';

function cosineSimilarity(a: number[], b: number[]): number {
if (!Array.isArray(a) || !Array.isArray(b) || a.length !== b.length) return 0;
let dot = 0,
na = 0,
nb = 0;
for (let i = 0; i < a.length; i++) {
const ai = a[i] ?? 0;
const bi = b[i] ?? 0;
dot += ai * bi;
na += ai * ai;
nb += bi * bi;
}
if (na === 0 || nb === 0) return 0;
return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function main() {
const redisClient = await initRedis();
const redisSubscriber = redisClient.duplicate();
await redisSubscriber.connect();

// First, let's verify keyspace notifications are enabled
const config = await redisClient.configGet('notify-keyspace-events');

{
const current = config['notify-keyspace-events'] ?? '';
const required = ['K', 'E', 'A'];
const merged = Array.from(new Set([...current, ...required])).join('');
if (merged !== current) {
console.log('Merging keyspace notification flags:', merged);
await redisClient.configSet('notify-keyspace-events', merged);
console.log('Keyspace notifications updated.');
}
}

// Also try keyspace pattern (different format)
await redisSubscriber.pSubscribe('__keyspace@0__:posts', async (message, pattern) => {
const lockKey = 'dedupe:lock:posts';
const gotLock = await redisClient.set(lockKey, '1', { NX: true, PX: 10000 });
if (!gotLock) return;
try {
if (message === 'set') {
console.log('Posts key was set via keyspace, processing update...');
const postsRaw = await redisClient.get('posts');
if (!postsRaw) return;
let posts: unknown;
try {
posts = JSON.parse(postsRaw);
} catch (e) {
console.error('Failed to parse posts JSON:', e);
return;
}
if (!Array.isArray(posts)) {
console.warn('Expected posts array; got:', typeof posts);
return;
}
// ...existing deduplication logic...
const latestPost = posts[posts.length - 1]; // Assume last is new
const latestCacheKey = `emb:${latestPost.id}`;
let latestEmbedding = await redisClient.get(latestCacheKey).then((s: any) => (s ? JSON.parse(s) : null));
if (!latestEmbedding) {
latestEmbedding = await getEmbeddingWithRetry(latestPost.content);
await redisClient.set(latestCacheKey, JSON.stringify(latestEmbedding), { EX: 86400 });
}
console.log(latestEmbedding);

for (let i = 0; i < posts.length - 1; i++) {
const otherPost = posts[i];
const otherCacheKey = `emb:${otherPost.id}`;
let otherEmbedding = await redisClient.get(otherCacheKey).then((s: any) => (s ? JSON.parse(s) : null));
if (!otherEmbedding) {
otherEmbedding = await getEmbeddingWithRetry(otherPost.content);
await redisClient.set(otherCacheKey, JSON.stringify(otherEmbedding), { EX: 86400 });
}
const similarity = cosineSimilarity(latestEmbedding, otherEmbedding);
if (similarity > 0.95) {
console.log(`Duplicate detected: ${latestPost.id} is similar to post ${otherPost.id}`);
// Remove the duplicate post (latestPost) from the array
const updatedPosts = posts.filter((p: any) => p.id !== latestPost.id);
await redisClient.set('posts', JSON.stringify(updatedPosts));
console.log(`Removed duplicate post ${latestPost.id} from Redis.`);
break;
}
console.log('similarity:', similarity.toFixed(4));
}
}
} finally {
try {
await redisClient.del(lockKey);
} catch {}
}
});

console.log('Listening for changes to posts array and deduping using embeddings...');
}

main().catch(console.error);
import { initRedis } from './redisClient.js';
import { getEmbeddingWithRetry } from '../test/embeddingTest.js';

function cosineSimilarity(a: number[], b: number[]): number {
if (!Array.isArray(a) || !Array.isArray(b) || a.length !== b.length) return 0;
let dot = 0,
na = 0,
nb = 0;
for (let i = 0; i < a.length; i++) {
const ai = a[i] ?? 0;
const bi = b[i] ?? 0;
dot += ai * bi;
na += ai * ai;
nb += bi * bi;
}
if (na === 0 || nb === 0) return 0;
return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function main() {
const redisClient = await initRedis();
const redisSubscriber = redisClient.duplicate();
await redisSubscriber.connect();

// First, let's verify keyspace notifications are enabled
const config = await redisClient.configGet('notify-keyspace-events');

{
const current = config['notify-keyspace-events'] ?? '';
const required = ['K', 'E', 'A'];
const merged = Array.from(new Set([...current, ...required])).join('');
if (merged !== current) {
console.log('Merging keyspace notification flags:', merged);
await redisClient.configSet('notify-keyspace-events', merged);
console.log('Keyspace notifications updated.');
}
}

// Also try keyspace pattern (different format)
await redisSubscriber.pSubscribe('__keyspace@0__:posts', async (message, pattern) => {
const lockKey = 'dedupe:lock:posts';
const gotLock = await redisClient.set(lockKey, '1', { NX: true, PX: 10000 });
if (!gotLock) return;
try {
if (message === 'set') {
console.log('Posts key was set via keyspace, processing update...');
const postsRaw = await redisClient.get('posts');
if (!postsRaw) return;
let posts: unknown;
try {
posts = JSON.parse(postsRaw);
} catch (e) {
console.error('Failed to parse posts JSON:', e);
return;
}
if (!Array.isArray(posts)) {
console.warn('Expected posts array; got:', typeof posts);
return;
}
// ...existing deduplication logic...
const latestPost = posts[posts.length - 1]; // Assume last is new
const latestCacheKey = `emb:${latestPost.id}`;
let latestEmbedding = await redisClient.get(latestCacheKey).then((s: any) => (s ? JSON.parse(s) : null));
if (!latestEmbedding) {
latestEmbedding = await getEmbeddingWithRetry(latestPost.content);
await redisClient.set(latestCacheKey, JSON.stringify(latestEmbedding), { EX: 86400 });
}
console.log(latestEmbedding);

for (let i = 0; i < posts.length - 1; i++) {
const otherPost = posts[i];
const otherCacheKey = `emb:${otherPost.id}`;
let otherEmbedding = await redisClient.get(otherCacheKey).then((s: any) => (s ? JSON.parse(s) : null));
if (!otherEmbedding) {
otherEmbedding = await getEmbeddingWithRetry(otherPost.content);
await redisClient.set(otherCacheKey, JSON.stringify(otherEmbedding), { EX: 86400 });
}
const similarity = cosineSimilarity(latestEmbedding, otherEmbedding);
if (similarity > 0.95) {
console.log(`Duplicate detected: ${latestPost.id} is similar to post ${otherPost.id}`);
// Remove the duplicate post (latestPost) from the array
const updatedPosts = posts.filter((p: any) => p.id !== latestPost.id);
await redisClient.set('posts', JSON.stringify(updatedPosts));
console.log(`Removed duplicate post ${latestPost.id} from Redis.`);
break;
}
console.log('similarity:', similarity.toFixed(4));
}
}
} finally {
try {
await redisClient.del(lockKey);
} catch {}
}
});

console.log('Listening for changes to posts array and deduping using embeddings...');
}

main().catch(console.error);

⚠️ Potential issue | 🔴 Critical

Remove the duplicated listener block

This file was accidentally concatenated twice: every import, helper, and the main function appear twice. The duplicate definitions will cause module-level redeclaration failures and prevent the listener from starting. Trim the second copy so the module contains exactly one set of definitions.

-// (delete lines 101-200: second copy of imports, cosineSimilarity, main, and invocation)

Committable suggestion skipped: line range outside the PR's diff.
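
A quick way to confirm this kind of accidental concatenation before trimming is to compare the two halves of the file line by line. The following is a minimal sketch, assuming the second copy is byte-for-byte identical to the first; `duplicatedHalfStart` is a hypothetical helper, not part of the PR or of any linter.

```typescript
// Hypothetical helper: returns the 1-based line number where an exact second
// copy of the file begins, or null if the file is not two identical halves.
function duplicatedHalfStart(source: string): number | null {
  const lines = source.split('\n');
  // Ignore a single trailing newline at end of file.
  if (lines.length > 0 && lines[lines.length - 1] === '') lines.pop();
  if (lines.length === 0 || lines.length % 2 !== 0) return null;

  const half = lines.length / 2;
  for (let i = 0; i < half; i++) {
    if (lines[i] !== lines[i + half]) return null;
  }
  return half + 1;
}

// Toy input standing in for a 200-line file whose lines 101-200 repeat 1-100.
const oneCopy = Array.from({ length: 100 }, (_, i) => `line ${i + 1}`).join('\n');
console.log(duplicatedHalfStart(`${oneCopy}\n${oneCopy}\n`)); // 101
console.log(duplicatedHalfStart(`${oneCopy}\n`));             // null
```

This check would not catch the redisCheck.ts case, where the two copies differ in their import path; the linter's noRedeclare errors remain the more general signal.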

🧰 Tools
🪛 Biome (2.1.2)

[error] 101-101: Shouldn't redeclare 'initRedis'. Consider to delete it or rename it.

'initRedis' is defined here:

(lint/suspicious/noRedeclare)


[error] 102-102: Shouldn't redeclare 'getEmbeddingWithRetry'. Consider to delete it or rename it.

'getEmbeddingWithRetry' is defined here:

(lint/suspicious/noRedeclare)


[error] 104-104: Shouldn't redeclare 'cosineSimilarity'. Consider to delete it or rename it.

'cosineSimilarity' is defined here:

(lint/suspicious/noRedeclare)


[error] 120-120: Shouldn't redeclare 'main'. Consider to delete it or rename it.

'main' is defined here:

(lint/suspicious/noRedeclare)

🤖 Prompt for AI Agents
In src/redis/redisDedupeListener.ts (lines 1-200) the entire module has been
accidentally duplicated—imports, cosineSimilarity, main and main().catch(...)
appear twice; remove the second copy so the file contains a single set of
imports, a single cosineSimilarity function, a single main function and a single
main().catch(...) invocation (ensure no leftover duplicated declarations or
stray imports remain and the file still ends by invoking
main().catch(console.error)).
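
To make the dedupe criterion in the quoted listener concrete, here is a small self-contained sketch of the same cosine-similarity check with the `0.95` threshold from the source. The function body follows the quoted code; `isDuplicate`, `DUPLICATE_THRESHOLD`, and the toy two-dimensional vectors are illustrative assumptions (real embeddings have far more dimensions).

```typescript
// Same computation as the quoted cosineSimilarity: dot product divided by the
// product of the vector norms, with 0 returned for mismatched or zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (!Array.isArray(a) || !Array.isArray(b) || a.length !== b.length) return 0;
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    const ai = a[i] ?? 0;
    const bi = b[i] ?? 0;
    dot += ai * bi;
    na += ai * ai;
    nb += bi * bi;
  }
  if (na === 0 || nb === 0) return 0;
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Threshold taken from the quoted listener; the helper name is hypothetical.
const DUPLICATE_THRESHOLD = 0.95;

function isDuplicate(a: number[], b: number[], threshold = DUPLICATE_THRESHOLD): boolean {
  return cosineSimilarity(a, b) > threshold;
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1 (identical direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
console.log(isDuplicate([3, 4], [4, 3]));      // true  (similarity is about 0.96)
console.log(isDuplicate([1, 0], [1, 1]));      // false (similarity is about 0.71)
```

Note that mismatched lengths return 0 rather than throwing, so a cached embedding produced by a different model would silently never match.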

Comment on lines +47 to +92
import { initRedis } from './redisClient.js';
import * as fs from 'fs';
import * as path from 'path';

type InputPost = {
id: string;
content: string;
url: string;
created_at: string;
author: { name: string; handle: string; pfpUrl: string } | null;
};

async function seedRedis() {
const redisClient = await initRedis();

// Try to read user-provided input from data/sample_posts.json (project root)
const dataPath = path.resolve(process.cwd(), 'data', 'sample_posts.json');
let inputPosts: InputPost[] = [];

if (fs.existsSync(dataPath)) {
try {
const raw = fs.readFileSync(dataPath, 'utf8');
inputPosts = JSON.parse(raw);
console.log(`Loaded ${inputPosts.length} posts from data/sample_posts.json`);
} catch (err) {
console.error('Failed to parse data/sample_posts.json, falling back to bundled sample:', err);
}
}

try {
await redisClient.set('posts', JSON.stringify(inputPosts));
console.log(`Redis seeding done! Seeded ${inputPosts.length} posts.`);
} finally {
// Always attempt to close the client. If disconnect fails, surface the error
// after attempting to close (don't swallow fatal errors silently).
try {
await redisClient.quit();
} catch (err) {
console.error('Failed to disconnect Redis client:', err);
// rethrow so callers see the original failure if needed
throw err;
}
}
}

seedRedis();

⚠️ Potential issue | 🔴 Critical

Remove the complete file duplication.

Lines 47-92 are an exact duplicate of lines 1-46. This duplication redeclares every import, type, and function, so the module fails to compile; if it did load, seedRedis() would run twice.

Delete lines 47-92 entirely:

-import { initRedis } from './redisClient.js';
-import * as fs from 'fs';
-import * as path from 'path';
-
-type InputPost = {
-  id: string;
-  content: string;
-  url: string;
-  created_at: string;
-  author: { name: string; handle: string; pfpUrl: string } | null;
-};
-
-async function seedRedis() {
-  const redisClient = await initRedis();
-
-  // Try to read user-provided input from data/sample_posts.json (project root)
-  const dataPath = path.resolve(process.cwd(), 'data', 'sample_posts.json');
-  let inputPosts: InputPost[] = [];
-
-  if (fs.existsSync(dataPath)) {
-    try {
-      const raw = fs.readFileSync(dataPath, 'utf8');
-      inputPosts = JSON.parse(raw);
-      console.log(`Loaded ${inputPosts.length} posts from data/sample_posts.json`);
-    } catch (err) {
-      console.error('Failed to parse data/sample_posts.json, falling back to bundled sample:', err);
-    }
-  }
-
-  try {
-    await redisClient.set('posts', JSON.stringify(inputPosts));
-    console.log(`Redis seeding done! Seeded ${inputPosts.length} posts.`);
-  } finally {
-    // Always attempt to close the client. If disconnect fails, surface the error
-    // after attempting to close (don't swallow fatal errors silently).
-    try {
-      await redisClient.quit();
-    } catch (err) {
-      console.error('Failed to disconnect Redis client:', err);
-      // rethrow so callers see the original failure if needed
-      throw err;
-    }
-  }
-}
-
-seedRedis();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
(Lines 47-92 deleted; no replacement content.)
🧰 Tools
🪛 Biome (2.1.2)

[error] 87-87: Unsafe usage of 'throw'.

'throw' in 'finally' overwrites the control flow statements inside 'try' and 'catch'.

(lint/correctness/noUnsafeFinally)


[error] 47-47: Shouldn't redeclare 'initRedis'. Consider to delete it or rename it.

'initRedis' is defined here:

(lint/suspicious/noRedeclare)


[error] 48-48: Shouldn't redeclare 'fs'. Consider to delete it or rename it.

'fs' is defined here:

(lint/suspicious/noRedeclare)


[error] 49-49: Shouldn't redeclare 'path'. Consider to delete it or rename it.

'path' is defined here:

(lint/suspicious/noRedeclare)


[error] 51-51: Shouldn't redeclare 'InputPost'. Consider to delete it or rename it.

'InputPost' is defined here:

(lint/suspicious/noRedeclare)


[error] 59-59: Shouldn't redeclare 'seedRedis'. Consider to delete it or rename it.

'seedRedis' is defined here:

(lint/suspicious/noRedeclare)

🤖 Prompt for AI Agents
In src/redis/redisSeed.ts lines 47 to 92 there's a complete duplicate of the
file (redeclaring types, imports, functions and calling seedRedis twice); remove
lines 47-92 entirely so each import/type/function is declared only once and
seedRedis() is invoked a single time.
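
The noUnsafeFinally error above is separate from the duplication and remains after the second copy is deleted: a `throw` inside `finally` replaces whatever the `try` block threw. The sketch below demonstrates the masking and one possible alternative. It is synchronous for brevity (the quoted code is async), and `throwInFinally`, `withCleanup`, `messageOf`, and `fail` are hypothetical names used only for illustration.

```typescript
// Demonstrates the problem: the error from `finally` wins and the
// original failure from `try` is lost.
function throwInFinally(): void {
  try {
    throw new Error('set failed');
  } finally {
    throw new Error('quit failed');
  }
}

// One possible alternative: run cleanup outside `finally`, log a cleanup
// failure, and only surface it when the main work itself succeeded.
function withCleanup(work: () => void, cleanup: () => void): void {
  let workFailed = false;
  let workError: unknown;
  try {
    work();
  } catch (err) {
    workFailed = true;
    workError = err;
  }
  try {
    cleanup();
  } catch (err) {
    if (!workFailed) throw err;
    console.error('Cleanup also failed:', err);
  }
  if (workFailed) throw workError;
}

// Small helpers for the demonstration.
function messageOf(fn: () => void): string {
  try {
    fn();
    return 'no error';
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}
const fail = (msg: string) => (): void => {
  throw new Error(msg);
};
const ok = (): void => {};

console.log(messageOf(throwInFinally));                                             // "quit failed"
console.log(messageOf(() => withCleanup(fail('set failed'), fail('quit failed')))); // "set failed"
console.log(messageOf(() => withCleanup(ok, fail('quit failed'))));                 // "quit failed"
```

With this shape the caller sees the seeding failure when there is one, and a failed quit() is still reported when seeding succeeded.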

@tasin2610 tasin2610 merged commit 8505c86 into main Oct 3, 2025
1 check passed
2 participants