fix(xiaohongshu): parseLikes should handle 2.1w / 1.5万 / 1.2k shortforms #1504
Merged
jackwener merged 2 commits on May 13, 2026
Xiaohongshu renders top-popular comment like-counts as shortened strings like '2.1w' / '1.1万' / '1.2k' once they exceed ~10 000. The previous parseLikes only matched bare digits via /^\d+$/ and silently returned 0 for any shortform, which inverted the sort order: the highest-liked comments (often 10k+) ranked last while mid-tier comments with plain numeric counts (e.g. 7569) appeared on top.

Repro on any popular xiaohongshu thread (>10 000 likes on a top comment): with --format json the most-upvoted parent rows show "likes": 0.

This patch keeps the original fast path for plain integers and adds a single regex for the well-known shortform suffixes:

- w / 万 -> *10000
- k / 千 -> *1000
- trailing '+' tolerated (e.g. '999+')
- unknown shapes still fall back to 0 (no behavior change)

Note: parseLikes runs inside the IIFE injected via page.evaluate(), so the existing comments.test.js mock harness (which stubs evaluate's return value directly) does not exercise it. A future refactor that exports parseLikes for direct testing would be a separate change.

Affects both top-level comments and 楼中楼 (nested) sub-replies (same helper).
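The suffix rules in the commit message above can be sketched roughly as follows. This is an illustrative reconstruction, not the merged code: the exact regex, the `multipliers` table, and the rounding step are assumptions made for the example.

```javascript
// Hypothetical sketch of the parsing rules described above; not the merged code.
function parseLikes(text) {
  const raw = String(text ?? '').trim();

  // Fast path: plain integers such as "7569".
  if (/^\d+$/.test(raw)) return parseInt(raw, 10);

  // Shortforms: digits, optional decimal part, optional suffix, optional '+'.
  const match = raw.match(/^(\d+(?:\.\d+)?)([w万k千])?\+?$/);
  if (!match) return 0; // unknown shapes keep the old fallback

  // Assumed lookup table for the suffixes named in the commit message.
  const multipliers = { w: 10000, '万': 10000, k: 1000, '千': 1000 };
  const factor = multipliers[match[2]] ?? 1;

  // Math.round guards against floating-point noise in the multiplication.
  return Math.round(parseFloat(match[1]) * factor);
}

console.log(parseLikes('2.1w'), parseLikes('999+'), parseLikes('n/a'));
// prints "21000 999 0"
```

Keeping the integer fast path first means the common case never touches the more permissive regex.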
Description
Fix `parseLikes` in `clis/xiaohongshu/comments.js` so it correctly parses Xiaohongshu's shortform like-count rendering (`2.1w`, `1.5万`, `1.2k`, `3千`). The previous implementation only matched bare digits (`/^\d+$/`) and silently returned `0` for any shortform string, which inverts the ranking: the most-liked top-level comments (typically 10k+) end up with `likes: 0` while mid-tier comments with plain numeric counts (e.g. 7569) bubble up to the top.

This affects both top-level comments and 楼中楼 (nested) sub-replies, which share the same helper.
Type of Change
Repro
On any popular Xiaohongshu thread where the top comment has more than ~10 000 likes, run:
Sort the resulting JSON by `likes` desc: the actual most-liked rows show `"likes": 0` because their rendered count is `2.1w` / `1.1万` / etc., which the old regex rejected.

After this patch the same output puts the real top comments at rank 1.
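To make the inversion concrete, here is a small sketch of the old behaviour applied to made-up rows. `oldParseLikes`, the sample data, and the field names `author` and `likesText` are hypothetical and only mirror the bare-digit check described in this PR.

```javascript
// Hypothetical reconstruction of the old behaviour: only bare digits match.
function oldParseLikes(text) {
  const raw = String(text ?? '').trim();
  return /^\d+$/.test(raw) ? parseInt(raw, 10) : 0;
}

// Made-up sample rows; the field names are illustrative only.
const rendered = [
  { author: 'a', likesText: '2.1w' },
  { author: 'b', likesText: '7569' },
  { author: 'c', likesText: '1.1万' },
];

// Parse each rendered count, then sort by likes descending.
const ranked = rendered
  .map((row) => ({ author: row.author, likes: oldParseLikes(row.likesText) }))
  .sort((x, y) => y.likes - x.likes);

console.log(ranked.map((row) => `${row.author}:${row.likes}`).join(' '));
// prints "b:7569 a:0 c:0"
```

The two rows with shortform counts collapse to 0 and sink below the mid-tier row, which is the ranking bug described above.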
Patch summary
Keep the fast path for bare integers, then add a single regex covering the well-known shortform suffixes:
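| Rendered string | Parsed value |
| --- | --- |
| `7569` | `7569` |
| `2.1w` / `1.5万` | `21000` / `15000` |
| `1.2k` / `3千` | `1200` / `3000` |
| `999+` | `999` (trailing `+` tolerated) |
| unknown shapes | `0` (unchanged fallback) |

Notes on testing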
`parseLikes` lives inside the IIFE that gets serialized into a string and passed to `page.evaluate()`. The existing `comments.test.js` mocks `page.evaluate`'s return value directly, so it does not exercise the IIFE-internal helpers; there is currently no way to unit-test `parseLikes` in isolation without exporting it. Refactoring to make it directly importable would be a meaningful change in surface area, so I've intentionally kept this PR scoped to the bug fix.

I verified the fix manually against a thread with a 21k-liked top comment; ranking now matches the in-app order.
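A minimal sketch of why a stubbed `page.evaluate` never reaches the helper. Everything here (`mockPage`, `injected`, the `parseCalls` counter, the canned rows) is a hypothetical stand-in and not taken from `comments.test.js`.

```javascript
// Counts how often the closure-local helper actually runs.
let parseCalls = 0;

// Stand-in for the injected IIFE: the helper exists only inside this closure.
const injected = () => {
  const parseLikes = (text) => {
    parseCalls += 1;
    return /^\d+$/.test(text) ? parseInt(text, 10) : 0;
  };
  return [{ likes: parseLikes('2.1w') }];
};

// Mock page: evaluate() ignores its argument and resolves to canned rows.
const mockPage = {
  evaluate: async () => [{ likes: 21000 }],
};

async function demo() {
  const rows = await mockPage.evaluate(injected);
  return { rows, parseCalls };
}

demo().then((result) => console.log(result.rows[0].likes, result.parseCalls));
// prints "21000 0"
```

Because the stub returns its canned value without running the argument, a test built on it passes whether or not the helper is correct.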
Checklist