crawl ptt articles from its website
Latest commit 792286d May 10, 2016




scraping a specific ptt board:

lsc <board-name>

All posts are downloaded into the data//post/ folder. A data//post-list.json file keeps track of your download history, so you can interrupt a download at any time and resume it later.
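The resume behaviour described above can be sketched as follows. This is a minimal illustration in Python, not the project's own code: the function names, the `fetch` callable, the `.html` file extension, and the `{"post-id": "done"}` layout of post-list.json are all assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def load_history(history_path):
    """Return the saved download history, or an empty one if none exists yet."""
    if history_path.exists():
        return json.loads(history_path.read_text(encoding="utf-8"))
    return {}


def save_history(history_path, history):
    """Write the history via a temp file so an interrupt cannot truncate it."""
    history_path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=history_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(history, tmp, ensure_ascii=False, indent=2)
    os.replace(tmp_name, history_path)


def download_posts(post_ids, fetch, data_dir):
    """Download every post not yet recorded as done; return the ids fetched now.

    `fetch` is a hypothetical callable mapping a post id to its text; the real
    crawler would perform an HTTP request here.
    """
    data_dir = Path(data_dir)
    post_dir = data_dir / "post"
    history_path = data_dir / "post-list.json"  # assumed schema: {id: "done"}
    post_dir.mkdir(parents=True, exist_ok=True)

    history = load_history(history_path)
    fetched = []
    for post_id in post_ids:
        if history.get(post_id) == "done":
            continue  # already downloaded in an earlier run
        body = fetch(post_id)
        (post_dir / f"{post_id}.html").write_text(body, encoding="utf-8")
        history[post_id] = "done"
        save_history(history_path, history)  # persist after each post
        fetched.append(post_id)
    return fetched
```

Saving the history after every post means that an interrupted run loses at most the post that was in flight; the next run skips everything already marked as done.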

categorize authors by title:

lsc <board-name>

example for fetching articles for article generation
example for categorizing the purpose of articles
analyze users' standpoints; output to data//id-stat.json
show user statistics; generate suspect.json
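One way to picture categorizing authors by title is a per-author tally of title categories. The sketch below is a hypothetical Python illustration, not the project's method: it assumes each post is a dict with `author` and `title` keys, and that the category is the first bracketed tag in the title (for example `[news]`). The real schema of id-stat.json is not documented here.

```python
import re
from collections import Counter, defaultdict

# Assumption: a title carries its category as a bracketed tag, e.g. "[news] ...".
TAG_PATTERN = re.compile(r"\[([^\]]+)\]")


def title_category(title):
    """Return the first bracketed tag in the title, or 'uncategorized'."""
    match = TAG_PATTERN.search(title)
    return match.group(1).strip() if match else "uncategorized"


def author_stats(posts):
    """Count, for each author, how many posts fall into each title category."""
    stats = defaultdict(Counter)
    for post in posts:
        stats[post["author"]][title_category(post["title"])] += 1
    return {author: dict(counts) for author, counts in stats.items()}
```

The resulting dictionary is plain JSON-serializable data, so it could be written out with `json.dump` if a file similar to id-stat.json is wanted.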


All sources are licensed under the MIT License. (I previously used the CC-BY-4.0 license, but the MIT License is better suited to code. Please refer to the license that applied at the time you forked this project.)
