UPDATE (February 4, 2024): The discussion about this project on HN is here. Please read in particular @dang's comment regarding the core assumption of this project: here. On a personal note, the number of Stories removed yesterday (Saturday, February 3, 2024) was the lowest ever recorded by the service, and that count includes 2 duplicate Stories. As a side note, always check whether a Story in the list is a duplicate: duplication is a perfectly legitimate reason for removal and unfortunately I have no way of detecting it automatically in the service!
The purpose of this project is to try to understand the type and scale of the moderation of the Hacker News Front Page.
NOTE: I love Hacker News. I try to read it every day. In the case of OnnxStream (here for example), 95% of the comments were helpful and intelligent. I also understand that moderating a site with huge traffic and where users are basically anonymous must be a very difficult task.
Returning to the purpose of this project, from what I have been able to see, the "public" (i.e. observable from the outside) moderation of the Front Page relies on two main tools: changing the title of a Story (which, intentionally or not, affects how its rank grows) and removing the Story outright.
Regarding the first type of moderation, an excellent site is already available that tracks changes to Story titles. Here instead I will focus on the second type.
For the reasons explained in the "Why?" section below, I have developed a small application that logs all the Stories that are removed from the Front Page, for personal use. I later discovered that there is no tool/website that provides this type of information and I decided to make it public here. It was a difficult decision but my rationale is: is it better to have more transparency or less transparency?
If you know of a tool/website similar to this, please let me know: I will archive this repo or set it to private.
A very positive outcome for this project would be a list similar to this one, available directly among the HN lists. Another would be notifying a user when their Story is penalized on the Front Page, perhaps indicating the number of flags and/or the reason.
Why? (Feel free to skip this part.)
A friend of mine posted two Stories on Hacker News related to OnnxStream (31 days apart), the first related to SDXL Turbo support and the second related to TinyLlama and Mistral 7B support.
In the case of the first, the Story was among the first on the Front Page, until its title was changed from "Stable Diffusion Turbo on a Raspberry Pi Zero 2 generates an image in 29 minutes" to "OnnxStream: Stable Diffusion XL 1.0 Base on a Raspberry Pi Zero 2". This effectively "killed" the Story. One user pointed out that the new title didn't reflect the spirit of the Story (thanks @practice9).
In the case of the second, the Story was in third place on the Front Page, less than an hour after the submission. In this case it was simply removed from the Front Page.
Perplexed by this discovery, I sent an email to the moderator. @dang, who was very kind and quick in his response, explained that the Story had been flagged by users even though it was not explicitly marked [flagged], and that he could therefore only guess at the reasons for the flags. His hypothesis was that (some?) users might be fed up with news related to LLMs.
While I have no reason to doubt Daniel's good faith, it's hard to believe that HN users would be tired of LLM-related news.
So I decided to develop a small console application to determine the frequency of this phenomenon (actually I was also motivated by the prospect of writing some C# code, after more than 2 years of complete abstinence). I subsequently discovered that there were no tools/websites that monitored this specific phenomenon and I therefore decided to make it public here.
Using the official HN API, the service fetches the first 90 Top Stories every minute and compares them with the first 30 Top Stories (i.e. the Front Page) fetched the previous minute. It logs all missing Stories here. The assumption is that a Story cannot drop from the top 30 to a position beyond 90 in a single minute unless it has been explicitly removed. If a Story reappears on the Front Page, it is removed from this log. All Stories present in the second-chance pool are excluded from the log. The title and URL are those from when the Story first appeared in the top 30. The number of points and comments and the rank are those from when the Story was removed from the Front Page. The ID points to the news.social-protocols.org page for that Story, which provides a graph of the Story's position on the Front Page over time.
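The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the service's actual code (which is a C# console application). The function names `fetch_top_ids`, `find_removed` and `update_log` are hypothetical. The second-chance pool is passed in as a plain set, because how the service identifies it is not described here. The rank stored is simply the Story's position in the previous snapshot.

```python
import json
from urllib.request import urlopen

# Official HN API endpoint listing the current Top Stories IDs.
TOP_STORIES_URL = "https://hacker-news.firebaseio.com/v0/topstories.json"
FRONT_PAGE_SIZE = 30   # the Front Page
WINDOW_SIZE = 90       # how deep each snapshot looks


def fetch_top_ids(limit=WINDOW_SIZE):
    """Return the first `limit` Story IDs (hypothetical helper, meant to run once a minute)."""
    with urlopen(TOP_STORIES_URL, timeout=10) as response:
        return json.load(response)[:limit]


def find_removed(previous_ids, current_ids, second_chance_ids=frozenset()):
    """Return (story_id, rank) pairs for Stories that were in the previous
    top 30 but are absent from the entire current top 90."""
    current = set(current_ids[:WINDOW_SIZE])
    removed = []
    for rank, story_id in enumerate(previous_ids[:FRONT_PAGE_SIZE], start=1):
        # Assumption: second-chance Stories are supplied by the caller.
        if story_id not in current and story_id not in second_chance_ids:
            removed.append((story_id, rank))
    return removed


def update_log(log, previous_ids, current_ids, second_chance_ids=frozenset()):
    """Update `log` (dict: story_id -> last observed rank) for one polling step."""
    # A Story that is back on the Front Page is dropped from the log.
    for story_id in current_ids[:FRONT_PAGE_SIZE]:
        log.pop(story_id, None)
    for story_id, rank in find_removed(previous_ids, current_ids, second_chance_ids):
        log[story_id] = rank
    return log
```

In a polling loop, one would call `fetch_top_ids()` once a minute, pass the previous and current snapshots to `update_log`, and then keep the current snapshot as the next "previous" one. Comparing against the top 90 rather than the top 30 is what separates a Story that merely slid down the ranking from one that disappeared.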
NOTE: always check whether a Story is a duplicate: duplication is a perfectly legitimate reason for removal and unfortunately I have no way of detecting it automatically in the service!
- 42334872 #3 58 points 7 comments -> Assassination Is a Leaky Abstraction
- 42339071 #29 4 points 0 comments -> Former zynga CEO turns into a pro-genocide activist
- 42340035 #2 13 points 6 comments -> Zero-based regulation made Idaho the least regulated state in the US
- 42330055 #13 381 points 109 comments -> 7 Databases in 7 Weeks for 2025
- 42341566 #13 21 points 20 comments -> Serverless VPN Self-hosted Be your own private on-demand VPN provider
- 42346084 #5 3 points 1 comments -> Times New Dumbass Font
- 42345570 #19 65 points 40 comments -> TikTok set to be banned in the US after losing appeal
- 42345500 #1 115 points 28 comments -> OpenWrt One router officially launched
- 42348664 #9 49 points 34 comments -> Firefox Is the Superior Browser
- 42349415 #16 28 points 41 comments -> Top internet sleuths say they won't help find the UnitedHealthcare CEO killer
- 42347885 #21 27 points 40 comments -> The Birmingham Blade: geographically tailored urban wind turbine designed by AI
- 42340000 #15 237 points 243 comments -> Rivian is opening its charging network to other EVs
- 42349797 #26 34 points 5 comments -> PSA: The Kagi search engine directly funds Yandex –- and refuses to stop
- 42350677 #16 10 points 3 comments -> Co-sleeping causes 3 more infant deaths in New York, officials say in warning
- 42350351 #19 58 points 42 comments -> Show HN: Scraper for job listings directly from company websites
- 42352291 #13 21 points 9 comments -> How One of the Richest Men Is Avoiding $8B in Taxes
- 42353338 #20 6 points 1 comments -> White British students not allowed to apply for security services internship
- 42352983 #9 41 points 16 comments -> US Food and Drug Administration moves to ban red food dye
- 42351698 #13 63 points 32 comments -> Google's AI weather prediction model is pretty darn good
- 42351490 #6 31 points 2 comments -> Windows on ARM Gets Major Gaming Boost with Prism Update
- 42352682 #13 27 points 4 comments -> Five of the best science fiction books of 2024
- 42350672 #30 54 points 53 comments -> The electric shock behind Europe's stuttering EV future
- 42286808 #17 6 points 1 comments -> 6 Lessons I learned working at an art gallery
- 42353983 #22 17 points 11 comments -> Legendary video game developer imagines a future where GPUs don't need PCs
- 42354201 #18 11 points 7 comments -> Lower-cost sodium-ion batteries are finally having their moment
- 42353540 #5 35 points 41 comments -> Economics and Homemakers
- 42353929 #17 12 points 3 comments -> FDIC's Redacted Pause Letters
- 42354056 #7 56 points 5 comments -> GrapheneOS on Pixels getting extended Android support
- 42355790 #28 4 points 0 comments -> The UC Berkeley Project That Is the AI Industry's Obsession
- 42355128 #25 42 points 36 comments -> What Arm's CEO makes of the Intel debacle
- 42292443 #12 4 points 0 comments -> 'Maya blue': The mystery dye recreated two centuries after it was lost
- 42353948 #9 62 points 47 comments -> Landlords Are Using AI to Raise Rents
- 42334383 #9 21 points 2 comments -> The Underground University
- 42311667 #24 27 points 40 comments -> A critical history of the FDA
- 42357663 #23 5 points 0 comments -> "Paycheck-to-paycheck" and five other popular myths
- 42356814 #25 6 points 1 comments -> Don't Block the Event Loop (Or the Worker Pool) in JavaScript
- 42359571 #7 10 points 1 comments -> I spent 2 years rebuilding my trading platform in Rust. I have no regrets
- 42360338 #4 10 points 8 comments -> In wake of CEO shooting, Amazon creates Executive Protection role
- 42360116 #18 13 points 4 comments -> Defusing AGPL-3 with Batch Processing
- 42360237 #11 19 points 17 comments -> Difference in Gastrointestinal Cancer Risk and Mortality by Dietary Patterns
- 42362291 #20 -> Records Seized by Israel Show Hamas Presence in U.N. Schools
- 42362970 #6 24 points 5 comments -> Lethal Dose of 55 Substances
- 42363087 #5 8 points 9 comments -> How to Create Intelligently Self-Modifying Software (Framework Release Soon)
- 42363592 #8 19 points 40 comments -> Skype Credit is no longer available
- 42364241 #6 39 points 8 comments -> Raspberry Pi 500 Review: The keyboard is the computer, again
- 42358358 #11 139 points 75 comments -> Replace Philips Hue Automation with Home Assistant's
- 42361503 #21 96 points 38 comments -> Broward Co. to vacate convictions for buying crack made by Sheriff's Office
- 42292956 #28 14 points 5 comments -> See how a lab-grown diamond is made
- 42361299 #15 152 points 49 comments -> Buffer Overflow Risk in Curl_inet_ntop and Inet_ntop4
- 42359905 #28 49 points 7 comments -> VictoriaLogs: A Grafana Dashboard for AWS VPC Flow Logs – Migrating from Grafan
- 42366505 #16 4 points 0 comments -> Flawless Replay, time-traveling debugger for Rust workflows
- 42366706 #30 10 points 1 comments -> The "Quiet Quitting" Myth Is Toxic for Tech
- 42366546 #17 27 points 5 comments -> Theory-building and why employee churn is lethal to software companies
- 42368697 #15 4 points 0 comments -> Howie Did It – 3D Printing a Printed Circuit Board [video]
- 42364630 #30 10 points 2 comments -> Practical GrapheneOS for the Paranoid • Ventral Digital
- 42368210 #8 -> Show HN: I used ChatGPT and Blender to combine 150 WW2 movies chronologically
- 42370370 #21 7 points 1 comments -> Mega-buildings are now slowing Earth's spin
- 42370688 #16 16 points 6 comments -> Luigi Mangione's arrest canary [video]
- 42370854 #29 4 points 0 comments -> The Dumbest Bike Lane Law Just Passed in Canada [video]
- 42347466 #26 16 points 9 comments -> Atoms for Peace: Learning to Love the Bomb
- 42371315 #7 6 points 2 comments -> How Can I Be an AI Engineer?
- 42371166 #15 -> Healthcare CEO killer studied computer science at UPenn, founded game dev club
- 42372508 #25 6 points 5 comments -> Giant Study Links Drinking Coffee with Almost 2 Extra Years of Life
- 42374469 #4 8 points 0 comments -> 15 Times to use AI, and 5 Not to
- 42375632 #6 12 points 3 comments -> ChatGPT's Sad Second Birthday
- 42316212 #17 7 points 1 comments -> Piskel – Free online editor for animated sprites and pixel art
- 42379435 #25 5 points 4 comments -> Scientists claim they've found the cause mystery colon cancers in young people
- 42386726 #5 44 points 38 comments -> Bankruptcy judge rejects The Onion's bid for Infowars
- 42386683 #25 37 points 0 comments -> The Onion's Purchase of Alex Jones' Infowars Stopped by US Judge
- 42387549 #2 121 points 94 comments -> The PayPal Mafia is taking over America's government
- 42388983 #13 67 points 33 comments -> Google are deliberately breaking YouTube when it detects you're running Firefox
- 42390630 #6 8 points 0 comments -> UnitedHealthcare's Leaked Talking Points
- 42391483 #14 29 points 13 comments -> One of our clients hasn't paid us $130k – or "Why Every Contract Clause Matters"
- 42386906 #28 5 points 1 comments -> A chatbot hinted a kid should kill his parentts over screen time limits: lawsui
- 42391486 #27 31 points 40 comments -> Show HN: Convert your LinkedIn profile to a resume
- 42350780 #29 9 points 2 comments -> WW1 dazzle camouflage was not as well understood as it might have been