fix(xiaohongshu): parseLikes should handle 2.1w / 1.5万 / 1.2k shortforms#1504

Merged
jackwener merged 2 commits into
jackwener:mainfrom
John15Wil:fix/xiaohongshu-comments-parseLikes-shortform
May 13, 2026
Conversation

@John15Wil
Contributor

Description

Fix parseLikes in clis/xiaohongshu/comments.js so it correctly parses Xiaohongshu's shortform like-count rendering (2.1w, 1.5万, 1.2k, 3千). The previous implementation only matched bare digits (/^\d+$/) and silently returned 0 for any shortform string, which inverts the ranking: the most-liked top-level comments (typically 10k+) end up with likes: 0 while mid-tier comments with plain numeric counts (e.g. 7569) bubble up to the top.

This affects both top-level comments and 楼中楼 sub-replies (same helper).

Type of Change

  • 🐛 Bug fix
  • ✨ New feature
  • 🌐 New site adapter
  • 📝 Documentation
  • ♻️ Refactor
  • 🔧 CI / build / tooling

Repro

On any popular Xiaohongshu thread where the top comment has more than ~10 000 likes, run:

opencli xiaohongshu comments "<full signed URL>" --limit 50 --with-replies true --format json

Sort the resulting JSON by likes desc — the actual most-liked rows show "likes": 0 because their rendered count is 2.1w / 1.1万 / etc., which the old regex rejected.

After this patch the same output puts the real top comments at rank 1.
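
The inversion can be reproduced without a live thread. The following is a minimal sketch, assuming the old helper behaved exactly as described above (bare digits parse, everything else becomes 0); `oldParseLikes` and the sample strings are hypothetical stand-ins, not code from the repository.

```javascript
// Hypothetical reconstruction of the old behaviour described in this PR:
// only bare digits are parsed, every other shape silently becomes 0.
const oldParseLikes = (text) => (/^\d+$/.test(text) ? Number(text) : 0);

// Illustrative rendered like-counts, as they might appear on a popular thread.
const rendered = ['2.1w', '7569', '1.1万', '312'];

// Parse each count, then sort by likes descending, as in the repro step.
const ranked = rendered
  .map((text) => ({ text, likes: oldParseLikes(text) }))
  .sort((a, b) => b.likes - a.likes);

// The shortform rows parse to 0, so the plain-numeric row sorts first.
console.log(ranked);
```

With these inputs the row rendered as `7569` ranks first, while the rows rendered as `2.1w` and `1.1万` carry `likes: 0` and sink to the bottom.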

Patch summary

Keep the fast path for bare integers, then add a single regex covering the well-known shortform suffixes:

| Input | Parsed |
| --- | --- |
| 7569 | 7569 |
| 2.1w / 1.5万 | 21000 / 15000 |
| 1.2k / 3千 | 1200 / 3000 |
| 999+ | 999 (trailing + tolerated) |
| anything else | 0 (unchanged fallback) |
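
One way to implement the mapping above is sketched here. This is not the merged code: the exact regular expressions, the separate step for a trailing `+`, and the rounding are assumptions made for illustration, chosen only to reproduce the listed inputs and outputs.

```javascript
// Hypothetical sketch of a shortform-aware like-count parser.
// It reproduces the input/output mapping listed above; the merged
// implementation in clis/xiaohongshu/comments.js may differ.
function parseLikes(text) {
  const raw = String(text ?? '').trim();

  // Fast path: bare integers such as "7569".
  if (/^\d+$/.test(raw)) return Number(raw);

  // Plain integer with a trailing "+", such as "999+".
  const capped = raw.match(/^(\d+)\+$/);
  if (capped) return Number(capped[1]);

  // Shortform: number, optional decimal part, unit suffix, optional "+".
  const short = raw.match(/^(\d+(?:\.\d+)?)([wk万千])\+?$/i);
  if (!short) return 0; // unknown shapes fall back to 0

  const unit = short[2].toLowerCase();
  const multiplier = unit === 'w' || unit === '万' ? 10000 : 1000;

  // Round to guard against floating-point artifacts in the product.
  return Math.round(Number(short[1]) * multiplier);
}

console.log(parseLikes('2.1w'), parseLikes('3千'), parseLikes('999+'));
```

Keeping the bare-integer check first means the common case never touches the more permissive shortform pattern.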

Notes on testing

parseLikes lives inside the IIFE that gets serialized into a string and passed to page.evaluate(). The existing comments.test.js mocks page.evaluate's return value directly, so it does not exercise the IIFE-internal helpers — there is no current way to unit-test parseLikes in isolation without exporting it. Refactoring to make it directly importable would be a meaningful change in surface area, so I've intentionally kept this PR scoped to the bug fix.

I verified the fix manually against a thread with a 21k-liked top comment; ranking now matches the in-app order.
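
The testing gap can be illustrated with a small sketch. Everything in it is hypothetical (`script`, `getComments`, `mockPage`, and the row shape are invented for illustration and are not taken from comments.js or comments.test.js); it only shows why stubbing the return value of `evaluate` never runs a helper defined inside the serialized script.

```javascript
// Hypothetical stand-in for the adapter: the helper exists only inside
// the script string, so nothing outside that string can call it.
const script = `(() => {
  const parseLikes = (t) => (/^\\d+$/.test(t) ? Number(t) : 0);
  return [{ text: 'hi', likes: parseLikes('2.1w') }];
})()`;

// The adapter hands the string to the page and returns whatever comes back.
async function getComments(page) {
  return page.evaluate(script);
}

// A test double that stubs evaluate's return value. The script string is
// never executed, so the helper inside it is never exercised.
const mockPage = {
  evaluate: async () => [{ text: 'hi', likes: 21000 }],
};

getComments(mockPage).then((rows) => console.log(rows));
```

A test built on this kind of mock passes regardless of what the helper inside `script` does, which is why a parsing bug there can go unnoticed.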

Checklist

  • I ran the checks relevant to this PR (manual verification on a live thread; existing tests untouched and still mock at the boundary above this helper)
  • I updated tests or docs if needed (see "Notes on testing")
  • I included output or screenshots when useful (see "Repro" / table above)

John15Wil and others added 2 commits May 12, 2026 17:51
Xiaohongshu renders top-popular comment like-counts as shortened
strings like '2.1w' / '1.1万' / '1.2k' once they exceed ~10 000.
The previous parseLikes only matched bare digits via /^\d+$/ and
silently returned 0 for any shortform, which inverted the sort
order: the highest-liked comments (often 10k+) ranked last while
mid-tier comments with plain numeric counts (e.g. 7569) appeared
on top.

Repro on any popular xiaohongshu thread (>10 000 likes on a top
comment): with --format json the most-upvoted parent rows show
"likes": 0.

This patch keeps the original fast path for plain integers and
adds a single regex for the well-known shortform suffixes:

  - w / 万 -> *10000
  - k / 千 -> *1000
  - trailing '+' tolerated (e.g. '999+')
  - unknown shapes still fall back to 0 (no behavior change)

Note: parseLikes runs inside the IIFE injected via page.evaluate(),
so the existing comments.test.js mock harness (which stubs
evaluate's return value directly) does not exercise it. A future
refactor that exports parseLikes for direct testing would be a
separate change.

Affects both top-level comments and 楼中楼 sub-replies (same
helper).
@jackwener jackwener merged commit b1dca04 into jackwener:main May 13, 2026