Skip to content

Conversation

nickventon
Copy link
Collaborator

@nickventon nickventon commented Sep 21, 2025

Summary by CodeRabbit

  • Improvements

    • Enhanced CLI help and standardized console output.
    • More consistent title generation and sentiment guidance for predictable results.
    • More reliable full post analysis with clearer error reporting.
    • Improved duplicate post detection and handling with more informative logs.
  • Documentation

    • Added a concise project subtitle to the README.
  • Chores

    • Introduced a Prettier configuration and a format script; applied consistent formatting across the codebase.

Copy link

coderabbitai bot commented Sep 21, 2025

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 56.25% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title accurately reflects the primary change: adding a Prettier-based formatting workflow (a .prettierrc plus a package.json "format" script) and applying formatting across the codebase, which is the dominant intent of the changeset. It is concise, tied to the main change, and clear enough for a teammate scanning history to understand the primary purpose of this PR.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/prettier

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/redisSeed.ts (1)

30-43: Disconnect error can mask the original failure. Preserve the root cause.

Currently, an error in quit() overrides any earlier set/parse error.

-  try {
-    await redisClient.set('posts', JSON.stringify(inputPosts));
-    console.log(`Redis seeding done! Seeded ${inputPosts.length} posts.`);
-  } finally {
-    // Always attempt to close the client. If disconnect fails, surface the error
-    // after attempting to close (don't swallow fatal errors silently).
-    try {
-      await redisClient.quit();
-    } catch (err) {
-      console.error('Failed to disconnect Redis client:', err);
-      // rethrow so callers see the original failure if needed
-      throw err;
-    }
-  }
+  let originalErr: unknown = null;
+  try {
+    await redisClient.set('posts', JSON.stringify(inputPosts));
+    console.log(`Redis seeding done! Seeded ${inputPosts.length} posts.`);
+  } catch (e) {
+    originalErr = e;
+  } finally {
+    try {
+      await redisClient.quit();
+    } catch (quitErr) {
+      console.error('Failed to disconnect Redis client:', quitErr);
+      if (!originalErr) throw quitErr;
+    }
+    if (originalErr) throw originalErr;
+  }
src/completePostAnalysis.ts (1)

5-5: Broken import: generateTitle.ts doesn’t export TitleResult.

This will fail type‑checking/build.

-import { generateTitleForPost, type TitleResult } from './generateTitle.js';
+import { generateTitleForPost } from './generateTitle.js';
src/classifyWithOpenAI.ts (1)

37-39: Prompt example uses categories not in the allowed list; causes validation failures.

Match the example to CATEGORIES to reduce retries/invalid responses.

-    "categories": ["DeFi","Safety"],
+    "categories": ["Decentralized Finance (DeFi)","Security & Privacy"],
     "subcategories": ["Yield Farming","Lending Strategies","Risk Management"]
🧹 Nitpick comments (39)
.prettierrc (1)

1-10: Prettier baseline looks good; add EOL + ignore for consistency

  • Add endOfLine to avoid cross‑OS diffs.
  • Add a .prettierignore for generated/build artifacts.

Apply this minimal tweak:

 {
   "printWidth": 120,
   "trailingComma": "none",
   "singleQuote": true,
   "semi": true,
   "useTabs": false,
   "tabWidth": 2,
   "arrowParens": "always",
-  "jsxSingleQuote": true
+  "jsxSingleQuote": true,
+  "endOfLine": "lf"
 }

And create a .prettierignore:

# build artifacts
dist/
coverage/
package.json (1)

17-19: Prefer local bin over npx; add format:check for CI

  • Use the locally installed prettier binary for speed/repro.
  • Add a non‑mutating check script to gate CI.
   "scripts": {
@@
-    "seed:verify": "tsx src/seed.ts verify",
-    "format": "npx prettier --write ."
+    "seed:verify": "tsx src/seed.ts verify",
+    "format": "prettier --write .",
+    "format:check": "prettier --check ."
   },
@@
   "devDependencies": {
     "@types/node": "^24.3.1",
-    "prettier": "^3.6.2",
+    "prettier": "^3.6.2",
     "tsx": "^4.20.5",
     "typescript": "^5.9.2"
   }

Also applies to: 28-33

src/redisTest.ts (1)

3-10: Ensure disconnect on failure

Wrap with try/finally so the client disconnects even if ping throws.

 async function testRedis() {
-  const client = await initRedis();
-
-  const pong = await client.ping();
-  console.log('Ping Response:', pong); // Output: "PONG" means connection success
-
-  await client.disconnect();
+  const client = await initRedis();
+  try {
+    const pong = await client.ping();
+    console.log('Ping Response:', pong);
+  } finally {
+    await client.disconnect();
+  }
 }
src/redisClient.ts (2)

7-7: Make localhost normalization URL-safe

String replace may hit unintended substrings. Parse the URL and set hostname explicitly.

-const REDIS_URL = (process.env.REDIS_URL || 'redis://127.0.0.1:6379').replace('localhost', '127.0.0.1');
+const RAW_REDIS_URL = process.env.REDIS_URL || 'redis://127.0.0.1:6379';
+const REDIS_URL = (() => {
+  try {
+    const u = new URL(RAW_REDIS_URL);
+    if (u.hostname === 'localhost') u.hostname = '127.0.0.1';
+    return u.toString();
+  } catch {
+    return RAW_REDIS_URL;
+  }
+})();

9-20: Unused helper: safeRedisUrlForLog

Either remove it or use it in future connection logs to avoid leaking credentials in URLs.

src/embeddingTest.ts (2)

24-29: Log says “falling back to mock” but code rethrows. Align log with behavior.

Apply this diff to prevent misleading logs:

-          console.error('Max retries reached for OpenAI. Falling back to mock embedding.');
+          console.error('Max retries reached for OpenAI. No fallback available; rethrowing.');

30-33: Unify backoff with shared utils and add explicit return type.

Keeps retry behavior consistent across the codebase and clarifies API.

+import { sleep, jitteredBackoff } from './lib/utils.js';
 
-export async function getEmbeddingWithRetry(input: string, opts?: { model?: string; maxRetries?: number }) {
+export async function getEmbeddingWithRetry(
+  input: string,
+  opts?: { model?: string; maxRetries?: number }
+): Promise<number[]> {
@@
-        const delay = baseDelay * 2 ** (attempt - 1);
-        await new Promise((r) => setTimeout(r, delay));
+        const delay = jitteredBackoff(baseDelay, attempt);
+        await sleep(delay);

Also applies to: 39-40

src/lib/utils.ts (1)

6-7: Consider capping exponential growth to avoid runaway delays.

Add an optional maxBackoff to guard high attempt counts.

-export const jitteredBackoff = (baseMs: number, attempt: number, jitterMs: number = 100): number =>
-  baseMs * 2 ** (attempt - 1) + Math.floor(Math.random() * jitterMs);
+export const jitteredBackoff = (
+  baseMs: number,
+  attempt: number,
+  jitterMs: number = 100,
+  maxBackoffMs: number = 30_000
+): number => Math.min(baseMs * 2 ** (attempt - 1), maxBackoffMs) + Math.floor(Math.random() * jitterMs);
src/generateTitle.ts (2)

32-49: Title generation flow reads clean; consider passing maxTokens.

Optional: add a conservative maxTokens for consistency with other calls.

-      schema: TitleSchema,
-      retryCount: 3
+      schema: TitleSchema,
+      retryCount: 3,
+      maxTokens: 150

Also applies to: 49-66


123-139: Sequential processing may be slow with many posts.

Optional: process with limited concurrency (e.g., p-limit(3)).

src/openaiValidationUtil.ts (3)

6-9: Docstring contradicts behavior. It throws; it doesn’t return null.

Update comment to avoid confusion.

- * Returns the validated result or null if all attempts fail.
+ * Returns the validated result or throws after exhausting attempts.

37-43: Content extraction assumes string; safe for json_object, but add guard.

Minor: handle non-string content defensively.

-      let raw: string | undefined;
-      if ('choices' in response && Array.isArray(response.choices)) {
-        const content = response.choices[0]?.message?.content;
-        raw = typeof content === 'string' ? content : undefined;
-      }
+      let raw: string | undefined;
+      if (Array.isArray(response.choices)) {
+        const content = response.choices[0]?.message?.content;
+        raw = typeof content === 'string' ? content : undefined;
+      }
       if (!raw) throw new Error('No content returned from OpenAI');

Also applies to: 44-53


56-63: Align rate-limit detection with other callers.

Optionally include code === 'rate_limit' for symmetry with embeddingTest.

-      const isRateLimit = status === 429 || type === 'rate_limit';
+      const isRateLimit = status === 429 || type === 'rate_limit' || anyErr?.code === 'rate_limit';
src/postGroup.ts (3)

39-50: Parsing safety is fine; consider validating shape.

Optional: parse to PostGroup[] via Zod to avoid silently returning [] on partial corruption.


60-81: End-to-end loop is sequential. Consider bounded concurrency.

Speeds up multi-group processing without overloading OpenAI/Redis.

-  for (const group of postGroups) {
+  const concurrency = 3;
+  const queue = [...postGroups];
+  const workers: Promise<void>[] = [];
+  for (let i = 0; i < concurrency; i++) {
+    workers.push((async () => {
+      for (;;) {
+        const group = queue.shift();
+        if (!group) break;
         try {
           // Always generate and update title
           const title = await generateTitleForPostGroup(group);
           // Always generate and update sentiment summaries
           const summaries = await generateSentimentSummariesForPostGroup(group);
@@
           postGroupsWithOrderedKeys.push({
             id: group.id,
             title,
             bullishSummary: summaries.bullishSummary,
             bearishSummary: summaries.bearishSummary,
             neutralSummary: summaries.neutralSummary,
             posts: group.posts
           });
         } catch (e) {
           if (context === 'CRON') {
             console.error(`[CRON] Error generating title/summaries for PostGroup (id: ${group.id}):`, e);
           } else {
             console.error(`Error generating title/summaries for PostGroup (id: ${group.id}):`, e);
           }
         }
-    } catch (e) {
-      if (context === 'CRON') {
-        console.error(`[CRON] Error generating title/summaries for PostGroup (id: ${group.id}):`, e);
-      } else {
-        console.error(`Error generating title/summaries for PostGroup (id: ${group.id}):`, e);
-      }
-    }
-  }
+      }
+    })());
+  }
+  await Promise.all(workers);

90-97: Saving back to Redis is fine; consider an expiry or versioning key.

Optional: set a TTL (e.g., 6–12h) or write to a versioned key to avoid stale reads.

run-classifier.ts (3)

28-38: Optional: externalize sample posts.

Move sample posts to a small fixture (e.g., data/sample_posts.json) to keep CLI lean and enable reuse in tests.


40-51: Optional: externalize title demo inputs.

Same rationale as sentiment sample posts.


59-63: Defaulting to “complete” run may surprise users.

Consider printing help when no command is provided, or add a prompt/flag to confirm networked calls by default.

Apply this diff if you prefer help-on-empty:

-    default:
-      console.log('No command specified. Running complete analysis by default...');
-      await runCompleteAnalysisDemo();
-      break;
+    default:
+      console.log('No command specified.\n');
+      showHelp();
+      break;
src/redisDedupeListener.ts (6)

5-18: Cosine similarity: handle edge cases explicitly and clamp result.

Return 0 on length mismatch is fine but silent; also clamp to [-1, 1] for numeric stability.

Apply:

-  if (!Array.isArray(a) || !Array.isArray(b) || a.length !== b.length) return 0;
+  if (!Array.isArray(a) || !Array.isArray(b)) return 0;
+  if (a.length !== b.length) {
+    console.warn('cosineSimilarity: length mismatch', { a: a.length, b: b.length });
+    return 0;
+  }
@@
-  return dot / (Math.sqrt(na) * Math.sqrt(nb));
+  const denom = Math.sqrt(na) * Math.sqrt(nb);
+  const raw = denom ? dot / denom : 0;
+  return Math.max(-1, Math.min(1, raw));

21-24: Subscriber lifecycle: add basic error/reconnect handling.

If the duplicate/connect fails or the connection drops, the process will sit idle. Consider logging connection state and retrying subscription on errors.


40-45: pSubscribe callback arg name is misleading.

node-redis v4 listener params are (message, channel). Rename second arg to channel to avoid confusion when debugging.

Apply:

-  await redisSubscriber.pSubscribe('__keyspace@0__:posts', async (message, pattern) => {
+  await redisSubscriber.pSubscribe('__keyspace@0__:posts', async (message, channel) => {

61-68: Don’t log raw embeddings.

They’re large, noisy, and may leak PII if inputs ever contain sensitive data. Remove or gate behind a debug flag.

Apply:

-        console.log(latestEmbedding);
+        // console.debug('latest embedding computed/cached');

70-88: Race safety on write; coalesce updates.

Even with locking, consider writing with a short retry if SET fails due to external modifications, or wrap the read/modify/write in a Lua script for atomicity to avoid lost updates.

I can provide a small EVAL script that validates the current array still ends with latestPost.id before applying the filtered update.


97-100: Startup and fatal error handling: OK.

Consider logging the Redis URL host (not creds) once to aid ops.

src/seedDatabase.ts (2)

45-58: Clear path LGTM.

Deletes only the intended key. Consider clearing related keys (e.g., posts) in a separate admin script to avoid accidental interference with listeners.


64-87: Verify path: add parse guard (optional).

Parse errors are caught by the outer try/catch already; optionally surface a clearer message if JSON is malformed.

src/seed.ts (2)

47-52: Avoid hard-exiting the process; return and set exitCode instead.

process.exit can truncate logs/flushes. Prefer returning from main and setting process.exitCode on failure.

-    console.log('✨ Operation completed successfully!');
-    process.exit(0);
+    console.log('✨ Operation completed successfully!');
+    return;
   } catch (error) {
-    console.error('💥 Operation failed:', error);
-    process.exit(1);
+    console.error('💥 Operation failed:', error);
+    process.exitCode = 1;
+    return;
   }

81-84: Use an ESM‑safe entrypoint and await the Promise.

require.main isn’t available under ESM/“type”: "module"/tsx. Also ensure rejections are handled.

-if (require.main === module) {
-  main();
-}
+// ESM-safe: run only when invoked directly
+if (import.meta.url === (await import('url')).pathToFileURL(process.argv[1]).href) {
+  await main().catch((err) => {
+    console.error(err);
+    process.exitCode = 1;
+  });
+}

Add near the top (once):

import { pathToFileURL } from 'url';
src/analyzeSentiment.ts (3)

32-50: Optional: parallelize per‑post calls with proper error isolation.

This cuts wall‑clock latency for N posts while preserving partial successes.

-  for (const post of posts) {
-    try {
-      const validated = await callOpenAIWithValidation({
-        client: openai,
-        systemPrompt,
-        userPrompt: post,
-        schema: SentimentSchema,
-        retryCount: 3
-      });
-
-      if (validated) {
-        results.push({ post, sentiment: validated.sentiment });
-      }
-    } catch (e) {
-      console.error('Error analyzing post:', post, e);
-      continue;
-    }
-  }
+  const settled = await Promise.allSettled(
+    posts.map(async (post) => {
+      const validated = await callOpenAIWithValidation({
+        client: openai,
+        systemPrompt,
+        userPrompt: post,
+        schema: SentimentSchema,
+        retryCount: 3
+      });
+      return { post, sentiment: validated.sentiment };
+    })
+  );
+  for (const r of settled) {
+    if (r.status === 'fulfilled' && r.value) results.push(r.value);
+    if (r.status === 'rejected') console.error('Error analyzing post:', r.reason);
+  }

46-47: Don’t log full post content on errors (PII/noise); truncate.

-      console.error('Error analyzing post:', post, e);
+      console.error('Error analyzing post:', `${post.slice(0,80)}${post.length>80?'...':''}`, e);

78-80: ESM‑safe entrypoint guard.

-if (require.main === module) {
-  runExample();
-}
+if (import.meta.url === (await import('url')).pathToFileURL(process.argv[1]).href) {
+  await runExample();
+}
src/completePostAnalysis.ts (5)

31-33: Redundant null check; generateTitleForPost never returns null.

The function throws on failure. Remove the dead branch.

-    if (title === null) {
-      throw new Error(`Title generation failed for post: ${post}`);
-    }

36-38: Normalize error messages to strings.

Avoid interpolating raw Error objects.

-    result.errors?.push(`Title generation failed: ${e}`);
-    console.error('Title generation error:', e);
+    const msg = e instanceof Error ? e.message : String(e);
+    result.errors?.push(`Title generation failed: ${msg}`);
+    console.error('Title generation error:', msg);

46-49: Same normalization for categorization errors.

-    result.errors?.push(`Categorization failed: ${e}`);
-    console.error('Categorization error:', e);
+    const msg = e instanceof Error ? e.message : String(e);
+    result.errors?.push(`Categorization failed: ${msg}`);
+    console.error('Categorization error:', msg);

58-60: Same normalization for sentiment errors.

-    result.errors?.push(`Sentiment analysis failed: ${e}`);
-    console.error('Sentiment analysis error:', e);
+    const msg = e instanceof Error ? e.message : String(e);
+    result.errors?.push(`Sentiment analysis failed: ${msg}`);
+    console.error('Sentiment analysis error:', msg);

62-65: Consider returning partial results with errors instead of throwing.

Current throw discards partial success and forces outer flow to fabricate an empty result. Returning the populated result with errors preserves signal and reduces control‑flow churn.

src/classifyWithOpenAI.ts (2)

11-22: Optionally enforce strict schema (no extra keys).

Prevents models from adding fields outside the contract.

-const CategorizationSchema = z.object({
+const CategorizationSchema = z
+  .object({
     categories: z
       .array(z.string())
       .min(1, 'At least one category is required')
       .max(2, 'At most 2 categories are allowed')
       .refine((arr) => arr.every((c) => CATEGORIES.includes(c)), {
         message: 'All categories must be from the allowed list'
       }),
     subcategories: z
       .array(z.string().min(1, 'Subcategories cannot be empty strings'))
       .min(1, 'At least one subcategory is required')
-});
+  })
+  .strict();

66-68: ESM‑safe entrypoint guard.

-if (require.main === module) {
-  runCategorization();
-}
+if (import.meta.url === (await import('url')).pathToFileURL(process.argv[1]).href) {
+  await runCategorization();
+}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 887d961 and 0867ae8.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (25)
  • .coderabbit.yaml (1 hunks)
  • .prettierrc (1 hunks)
  • .vscode/launch.json (1 hunks)
  • README.md (1 hunks)
  • data/sample_posts.json (1 hunks)
  • package.json (2 hunks)
  • run-classifier.ts (2 hunks)
  • src/analyzeSentiment.ts (2 hunks)
  • src/classifyWithOpenAI.ts (3 hunks)
  • src/completePostAnalysis.ts (1 hunks)
  • src/constants.ts (1 hunks)
  • src/embeddingTest.ts (3 hunks)
  • src/generateTitle.ts (3 hunks)
  • src/lib/utils.ts (1 hunks)
  • src/openaiClient.ts (1 hunks)
  • src/openaiValidationUtil.ts (2 hunks)
  • src/postGroup.ts (1 hunks)
  • src/redisCheck.ts (1 hunks)
  • src/redisClient.ts (4 hunks)
  • src/redisDedupeListener.ts (1 hunks)
  • src/redisSeed.ts (2 hunks)
  • src/redisTest.ts (1 hunks)
  • src/seed.ts (3 hunks)
  • src/seedData.ts (1 hunks)
  • src/seedDatabase.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (12)
src/redisCheck.ts (1)
src/redisClient.ts (1)
  • initRedis (25-83)
src/seedDatabase.ts (3)
src/redisClient.ts (2)
  • initRedis (25-83)
  • getRedisClient (85-91)
src/seedData.ts (1)
  • seedData (3-42)
src/postGroup.ts (1)
  • PostGroup (17-24)
src/generateTitle.ts (1)
src/openaiValidationUtil.ts (1)
  • callOpenAIWithValidation (10-74)
run-classifier.ts (4)
src/completePostAnalysis.ts (1)
  • runCompleteAnalysisDemo (100-143)
src/classifyWithOpenAI.ts (1)
  • runCategorization (53-63)
src/analyzeSentiment.ts (1)
  • analyzeMultiplePosts (19-51)
src/generateTitle.ts (1)
  • generateTitlesForPosts (123-139)
src/postGroup.ts (2)
src/generateTitle.ts (3)
  • generateTitleForPost (32-66)
  • SentimentSummaries (18-22)
  • generateSentimentSummariesForGroup (69-120)
src/redisClient.ts (2)
  • initRedis (25-83)
  • getRedisClient (85-91)
src/seed.ts (1)
src/seedDatabase.ts (3)
  • clearDatabase (44-58)
  • verifySeeding (63-87)
  • seedDatabase (9-38)
src/completePostAnalysis.ts (3)
src/classifyWithOpenAI.ts (2)
  • Categorization (24-24)
  • categorizePost (26-50)
src/generateTitle.ts (1)
  • generateTitleForPost (32-66)
src/analyzeSentiment.ts (1)
  • analyzeMultiplePosts (19-51)
src/redisTest.ts (1)
src/redisClient.ts (1)
  • initRedis (25-83)
src/redisDedupeListener.ts (2)
src/redisClient.ts (1)
  • initRedis (25-83)
src/embeddingTest.ts (1)
  • getEmbeddingWithRetry (3-40)
src/openaiValidationUtil.ts (1)
src/lib/utils.ts (2)
  • jitteredBackoff (6-7)
  • sleep (3-3)
src/classifyWithOpenAI.ts (3)
src/constants.ts (1)
  • CATEGORIES (3-22)
src/openaiValidationUtil.ts (1)
  • callOpenAIWithValidation (10-74)
src/generateTitle.ts (1)
  • generateTitleForPost (32-66)
src/analyzeSentiment.ts (2)
src/openaiValidationUtil.ts (1)
  • callOpenAIWithValidation (10-74)
src/generateTitle.ts (1)
  • generateTitleForPost (32-66)
🔇 Additional comments (26)
.coderabbit.yaml (1)

7-20: LGTM: keep drafts disabled for auto reviews

No functional change; config remains valid and aligned with your workflow.

README.md (1)

2-3: Nice, concise subtitle

Clear value proposition without bloat.

.vscode/launch.json (1)

2-22: Formatting-only changes look good

No behavioral changes; debug target remains the same.

src/constants.ts (1)

4-21: Quote normalization only

No behavioral changes. Good.

src/embeddingTest.ts (1)

17-21: Quota handling message is accurate; behavior unchanged.

Logging and rethrow on insufficient quota is fine for fail-fast flows.

src/lib/utils.ts (1)

3-3: LGTM on compact sleep helper.

src/openaiClient.ts (2)

7-9: Fail-fast on missing key is OK; verify this won’t break tests/CLI tooling.

If modules import this file at startup, it will throw before mocks/env are set.

Would you prefer lazy initialization to improve test ergonomics?

-export default openai;
+export default openai; // Consider exporting a getClient() that throws on first use instead.

11-13: LGTM.

Constructor args and typing look correct.

src/generateTitle.ts (3)

8-15: Tighter title constraints are good.

Zod min/max aligns with prompt instruction.


68-103: Schema + prompt for summaries looks solid.

Validation and retry wiring match the shared util.


104-113: Return only present summaries—nice.

Keeps output minimal and typed.

src/redisCheck.ts (1)

6-12: LGTM.

Safe default to [] and clean disconnect.

src/postGroup.ts (2)

28-30: LGTM.

Aggregating post content with double-newlines is reasonable input for title gen.


101-103: OK to keep require.main — current build/runtime is CommonJS

tsconfig.json is set to "module": "NodeNext" but package.json does not set "type": "module", and the start script runs node dist/run-classifier.js — compiled runtime will be CommonJS so require.main will be defined. If you switch to ESM (add "type":"module" or run via tsx/ts-node in ESM mode), replace this check with import.meta.main or an explicit CLI flag.

run-classifier.ts (4)

12-20: Header/log copy edit looks good.

Message normalization improves CLI UX. No functional change.


22-27: Categorization path: no behavior change.

All good.


67-85: Help text: concise and clear.

Nice, consistent with commands above.


89-91: Good error surface.

Non-zero exit with contextual message.

data/sample_posts.json (1)

35-35: EOF newline added.

Good housekeeping; no data changes.

src/seedData.ts (1)

4-42: Seed data formatting LGTM.

Consistent single quotes and explicit timestamps; aligns with PostGroup type.

src/seedDatabase.ts (1)

10-37: Seeding path: straightforward and clear.

Nice progress logs and summary; uses central Redis client helpers.

src/redisDedupeListener.ts (1)

1-3: Import-extension check — .js imports are already used; verify package.json/module output

  • src/redisDedupeListener.ts (lines 1–3) and other src files import with ".js" — the original claim that this file omits ".js" is incorrect.
  • tsconfig.json: "module":"NodeNext" / "moduleResolution":"NodeNext" (ts-node.esm: true).
  • package.json has no "type" field and start uses node dist/run-classifier.js — either add "type":"module" if tsc emits ESM, or change tsc/module output to CommonJS (or adjust the start script) to ensure runtime module resolution is consistent.

Likely an incorrect or invalid review comment.

src/seed.ts (2)

5-13: LGTM — formatting/help text only.

No behavioral changes; usage/help reads clearer.


81-84: Reconcile module type before changing the entrypoint guard.

package.json.type === "commonjs" while tsconfig.module === "NodeNext" and ts-node.esm === true — runtime is ambiguous. Confirm how the app is executed (ts-node with ESM vs compiled + node) before replacing the CommonJS guard.

  • Check package.json scripts (start/dev/build) and the compiled output in dist/ to determine runtime.
  • If runtime is CommonJS, keep if (require.main === module) main();. If runtime is ESM, replace with an ESM-safe check (example):
    import { fileURLToPath } from 'url';
    if (process.argv[1] === fileURLToPath(import.meta.url)) main();

File: src/seed.ts lines 81–84.

src/analyzeSentiment.ts (1)

7-16: LGTM — type/schema quote normalization.

src/completePostAnalysis.ts (1)

1-6: Remove unused TitleResult import from src/completePostAnalysis.ts
TitleResult is defined and used only in src/generateTitle.ts (export; used in generateTitlesForPosts). The only other repo occurrence is the import at src/completePostAnalysis.ts:5 — safe to remove.

Comment on lines +54 to 98
const postGroups = await fetchPostGroupsFromRedis();
if (!postGroups.length) {
console.log('No PostGroups found in Redis.');
return;
}
const postGroupsWithOrderedKeys = [];
for (const group of postGroups) {
try {
// Always generate and update title
const title = await generateTitleForPostGroup(group);
// Always generate and update sentiment summaries
const summaries = await generateSentimentSummariesForPostGroup(group);

if (context === 'CRON') {
console.log(`[CRON] Generated Title for PostGroup (id: ${group.id}) at ${new Date().toISOString()}:`, title);
} else {
console.log(`Title for PostGroup (id: ${group.id}):`, title);
}
if (context === 'CRON') {
console.log(`[CRON] Generated Title for PostGroup (id: ${group.id}) at ${new Date().toISOString()}:`, title);
} else {
console.log(`Title for PostGroup (id: ${group.id}):`, title);
}

postGroupsWithOrderedKeys.push({
id: group.id,
title,
bullishSummary: summaries.bullishSummary,
bearishSummary: summaries.bearishSummary,
neutralSummary: summaries.neutralSummary,
posts: group.posts
});
} catch (e) {
if (context === 'CRON') {
console.error(`[CRON] Error generating title/summaries for PostGroup (id: ${group.id}):`, e);
} else {
console.error(`Error generating title/summaries for PostGroup (id: ${group.id}):`, e);
}
}
}
// Always save updated PostGroups with titles and summaries back to Redis
await initRedis();
const redis = getRedisClient();
await redis.set('post-groups', JSON.stringify(postGroupsWithOrderedKeys));
if (context === 'CRON') {
console.log('[CRON] Updated post-groups with titles and sentiment summaries saved to Redis.');
} else {
console.log('Updated post-groups with titles and sentiment summaries saved to Redis.');
postGroupsWithOrderedKeys.push({
id: group.id,
title,
bullishSummary: summaries.bullishSummary,
bearishSummary: summaries.bearishSummary,
neutralSummary: summaries.neutralSummary,
posts: group.posts
});
} catch (e) {
if (context === 'CRON') {
console.error(`[CRON] Error generating title/summaries for PostGroup (id: ${group.id}):`, e);
} else {
console.error(`Error generating title/summaries for PostGroup (id: ${group.id}):`, e);
}
}
}
// Always save updated PostGroups with titles and summaries back to Redis
await initRedis();
const redis = getRedisClient();
await redis.set('post-groups', JSON.stringify(postGroupsWithOrderedKeys));
if (context === 'CRON') {
console.log('[CRON] Updated post-groups with titles and sentiment summaries saved to Redis.');
} else {
console.log('Updated post-groups with titles and sentiment summaries saved to Redis.');
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Cron flow: add a try/catch inside the scheduled task to avoid unhandled rejections.

A Redis write failure would currently bubble up.

 cron.schedule('0 */6 * * *', async () => {
-  await logTitlesForAllPostGroups('CRON');
+  try {
+    await logTitlesForAllPostGroups('CRON');
+  } catch (e) {
+    console.error('[CRON] Unhandled error in scheduled task:', e);
+  }
 });

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/postGroup.ts around lines 54 to 98, the Redis write at the end
(initRedis/getRedisClient/redis.set) is not wrapped in a try/catch so a Redis
failure can bubble up in the CRON flow; wrap the Redis initialization and set
call in a try/catch that catches any error, logs a contextual error (including
whether context === 'CRON') and continues (do not rethrow), ensuring the
scheduled task never results in an unhandled rejection.

Comment on lines +25 to +37
// First, let's verify keyspace notifications are enabled
const config = await redisClient.configGet('notify-keyspace-events');

{
const current = config['notify-keyspace-events'] ?? '';
const required = ['K', 'E', 'A'];
const merged = Array.from(new Set([...current, ...required])).join('');
if (merged !== current) {
console.log("Merging keyspace notification flags:", merged);
await redisClient.configSet('notify-keyspace-events', merged);
console.log("Keyspace notifications updated.");
}
{
const current = config['notify-keyspace-events'] ?? '';
const required = ['K', 'E', 'A'];
const merged = Array.from(new Set([...current, ...required])).join('');
if (merged !== current) {
console.log('Merging keyspace notification flags:', merged);
await redisClient.configSet('notify-keyspace-events', merged);
console.log('Keyspace notifications updated.');
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

CONFIG SET may be blocked on managed Redis (e.g., Upstash). Guard it.

CONFIG SET often isn’t allowed; current code may throw and abort the listener.

Apply:

-  const config = await redisClient.configGet('notify-keyspace-events');
+  let config: Record<string, string> = {};
+  try {
+    config = await redisClient.configGet('notify-keyspace-events');
+  } catch (e) {
+    console.warn('CONFIG GET not permitted; proceeding without changing keyspace notifications.', e);
+    config = { 'notify-keyspace-events': '' };
+  }
@@
-    if (merged !== current) {
-      console.log('Merging keyspace notification flags:', merged);
-      await redisClient.configSet('notify-keyspace-events', merged);
-      console.log('Keyspace notifications updated.');
-    }
+    if (merged !== current) {
+      try {
+        console.log('Merging keyspace notification flags:', merged);
+        await redisClient.configSet('notify-keyspace-events', merged);
+        console.log('Keyspace notifications updated.');
+      } catch (e) {
+        console.warn('CONFIG SET not permitted; keyspace events may be disabled on this instance.', e);
+      }
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// First, let's verify keyspace notifications are enabled
const config = await redisClient.configGet('notify-keyspace-events');
{
const current = config['notify-keyspace-events'] ?? '';
const required = ['K', 'E', 'A'];
const merged = Array.from(new Set([...current, ...required])).join('');
if (merged !== current) {
console.log("Merging keyspace notification flags:", merged);
await redisClient.configSet('notify-keyspace-events', merged);
console.log("Keyspace notifications updated.");
}
{
const current = config['notify-keyspace-events'] ?? '';
const required = ['K', 'E', 'A'];
const merged = Array.from(new Set([...current, ...required])).join('');
if (merged !== current) {
console.log('Merging keyspace notification flags:', merged);
await redisClient.configSet('notify-keyspace-events', merged);
console.log('Keyspace notifications updated.');
}
}
// First, let's verify keyspace notifications are enabled
let config: Record<string, string> = {};
try {
config = await redisClient.configGet('notify-keyspace-events');
} catch (e) {
console.warn('CONFIG GET not permitted; proceeding without changing keyspace notifications.', e);
config = { 'notify-keyspace-events': '' };
}
{
const current = config['notify-keyspace-events'] ?? '';
const required = ['K', 'E', 'A'];
const merged = Array.from(new Set([...current, ...required])).join('');
if (merged !== current) {
try {
console.log('Merging keyspace notification flags:', merged);
await redisClient.configSet('notify-keyspace-events', merged);
console.log('Keyspace notifications updated.');
} catch (e) {
console.warn('CONFIG SET not permitted; keyspace events may be disabled on this instance.', e);
}
}
}
🤖 Prompt for AI Agents
In src/redisDedupeListener.ts around lines 25 to 37, the code unconditionally
calls redisClient.configSet to enable keyspace notifications which can be
blocked on managed Redis (e.g., Upstash) and will throw/abort the listener; wrap
the configSet call in a try/catch, attempt the configSet only if merged !==
current, and on error log a non-fatal warning that configuration could not be
changed (include the error message) and continue without rethrowing so the
listener remains running; ensure normal flow proceeds whether or not the CONFIG
SET succeeds.

Comment on lines +41 to +44
const lockKey = 'dedupe:lock:posts';
const gotLock = await redisClient.set(lockKey, '1', { NX: true, PX: 10000 });
if (!gotLock) return;
try {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Distributed lock is unsafe; use a token and atomic unlock. TTL may expire mid-work.

Current pattern can delete another worker’s lock (no token) and TTL=10s is short relative to embedding calls. Use a unique value and Lua compare‑and‑del; consider longer TTL or renewal.

Apply:

+import { randomUUID } from 'node:crypto';
@@
-    const lockKey = 'dedupe:lock:posts';
-    const gotLock = await redisClient.set(lockKey, '1', { NX: true, PX: 10000 });
+    const lockKey = 'dedupe:lock:posts';
+    const lockVal = randomUUID();
+    const gotLock = await redisClient.set(lockKey, lockVal, { NX: true, PX: 60000 });
@@
-    } finally {
-      try {
-        await redisClient.del(lockKey);
-      } catch {}
-    }
+    } finally {
+      try {
+        // atomic unlock: delete only if value matches
+        await redisClient.eval(
+          'if redis.call("get", KEYS[1]) == ARGV[1] then return redis.call("del", KEYS[1]) else return 0 end',
+          { keys: [lockKey], arguments: [lockVal] }
+        );
+      } catch {}
+    }

Optionally renew TTL during long runs:

// fire-and-forget TTL heartbeat every 20s
const heartbeat = setInterval(() => redisClient.pexpire(lockKey, 60000).catch(() => {}), 20000);
try { /* work */ } finally { clearInterval(heartbeat); /* unlock */ }

Also applies to: 91-94

Comment on lines +45 to +59
if (message === 'set') {
console.log('Posts key was set via keyspace, processing update...');
const postsRaw = await redisClient.get('posts');
if (!postsRaw) return;
let posts: unknown;
try {
if (message === "set") {
console.log("Posts key was set via keyspace, processing update...");
const postsRaw = await redisClient.get("posts");
if (!postsRaw) return;
let posts: unknown;
try {
posts = JSON.parse(postsRaw);
} catch (e) {
console.error("Failed to parse posts JSON:", e);
return;
}
if (!Array.isArray(posts)) {
console.warn("Expected posts array; got:", typeof posts);
return;
}
// ...existing deduplication logic...
const latestPost = posts[posts.length - 1]; // Assume last is new
const latestCacheKey = `emb:${latestPost.id}`;
let latestEmbedding = await redisClient.get(latestCacheKey).then((s: any) => s ? JSON.parse(s) : null);
if (!latestEmbedding) {
latestEmbedding = await getEmbeddingWithRetry(latestPost.content);
await redisClient.set(latestCacheKey, JSON.stringify(latestEmbedding), { EX: 86400 });
}
console.log(latestEmbedding);
posts = JSON.parse(postsRaw);
} catch (e) {
console.error('Failed to parse posts JSON:', e);
return;
}
if (!Array.isArray(posts)) {
console.warn('Expected posts array; got:', typeof posts);
return;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Validate posts array contents and handle empty arrays.

Possible runtime errors: posts.length - 1 on empty, missing id/content fields.

Apply:

-        let posts: unknown;
+        let posts: unknown;
@@
-        if (!Array.isArray(posts)) {
+        if (!Array.isArray(posts)) {
           console.warn('Expected posts array; got:', typeof posts);
           return;
         }
+        if (posts.length === 0) {
+          console.log('Posts array is empty; nothing to process.');
+          return;
+        }
+        // narrow element shape
+        type Post = { id: string; content: string };
+        if (!posts.every((p: any) => p && typeof p.id === 'string' && typeof p.content === 'string')) {
+          console.warn('Posts do not conform to expected shape {id, content}.');
+          return;
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (message === 'set') {
console.log('Posts key was set via keyspace, processing update...');
const postsRaw = await redisClient.get('posts');
if (!postsRaw) return;
let posts: unknown;
try {
if (message === "set") {
console.log("Posts key was set via keyspace, processing update...");
const postsRaw = await redisClient.get("posts");
if (!postsRaw) return;
let posts: unknown;
try {
posts = JSON.parse(postsRaw);
} catch (e) {
console.error("Failed to parse posts JSON:", e);
return;
}
if (!Array.isArray(posts)) {
console.warn("Expected posts array; got:", typeof posts);
return;
}
// ...existing deduplication logic...
const latestPost = posts[posts.length - 1]; // Assume last is new
const latestCacheKey = `emb:${latestPost.id}`;
let latestEmbedding = await redisClient.get(latestCacheKey).then((s: any) => s ? JSON.parse(s) : null);
if (!latestEmbedding) {
latestEmbedding = await getEmbeddingWithRetry(latestPost.content);
await redisClient.set(latestCacheKey, JSON.stringify(latestEmbedding), { EX: 86400 });
}
console.log(latestEmbedding);
posts = JSON.parse(postsRaw);
} catch (e) {
console.error('Failed to parse posts JSON:', e);
return;
}
if (!Array.isArray(posts)) {
console.warn('Expected posts array; got:', typeof posts);
return;
}
if (message === 'set') {
console.log('Posts key was set via keyspace, processing update...');
const postsRaw = await redisClient.get('posts');
if (!postsRaw) return;
let posts: unknown;
try {
posts = JSON.parse(postsRaw);
} catch (e) {
console.error('Failed to parse posts JSON:', e);
return;
}
if (!Array.isArray(posts)) {
console.warn('Expected posts array; got:', typeof posts);
return;
}
if (posts.length === 0) {
console.log('Posts array is empty; nothing to process.');
return;
}
// narrow element shape
type Post = { id: string; content: string };
if (!posts.every((p: any) => p && typeof p.id === 'string' && typeof p.content === 'string')) {
console.warn('Posts do not conform to expected shape {id, content}.');
return;
}

@tasin2610
Copy link
Contributor

Thanks @nickventon for the weekend pr. This will make the code more robust, just check how many lines were formatted 😬

@tasin2610 tasin2610 merged commit 4ec8528 into main Sep 21, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants