Articles on 百度文库 (Baidu Wenku) normally require 下载券 (download vouchers) to download. That is a hassle, especially when you have no vouchers! So I made this simple automated tool that captures an article as images, which you can then process however you like, e.g. merging them or converting them to PDF.
You can run the code directly:
npm start
Or you can build the project and run the compiled files:
npm run build
# CLI
node ./dist/index.js
# Websocket server
node ./dist/server.js
You can also install it as a global command:
npm install -g .
# Then just run the command
# CLI
bdwkc
# Websocket server
bdwkc_ws
ATTENTION
The following environment variables are required when you run the program through the globally installed npm commands:
PUPPETEER_EXECUTABLE_PATH: standard Puppeteer environment variable that gives the path to the Chromium/Chrome executable
BDWKC_OUTPUT_DIR: output directory for the captured document images
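As an illustration of the requirement above, here is a minimal sketch of how the two variables could be checked before a capture starts. The helper `readCaptureConfig` and the `CaptureConfig` type are hypothetical names made up for this example; they are not taken from the project's source, and the real commands may validate their environment differently.

```typescript
// Hypothetical helper, not part of bdwkc itself: it only shows one way to
// check the two required variables up front.
interface CaptureConfig {
  executablePath: string; // value of PUPPETEER_EXECUTABLE_PATH
  outputDir: string; // value of BDWKC_OUTPUT_DIR
}

// Same shape as Node's process.env, declared here so the sketch needs no imports.
type Env = Record<string, string | undefined>;

function readCaptureConfig(env: Env): CaptureConfig {
  const required = ["PUPPETEER_EXECUTABLE_PATH", "BDWKC_OUTPUT_DIR"];

  // Treat unset, empty, and whitespace-only values as missing.
  const missing = required.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  return {
    executablePath: env.PUPPETEER_EXECUTABLE_PATH as string,
    outputDir: env.BDWKC_OUTPUT_DIR as string,
  };
}
```

In a Node script this could be called as `readCaptureConfig(process.env)`, so that a missing variable is reported by name instead of surfacing later as a browser launch or file write failure.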
- Support DOC
- Support PDF
- Support XLS
- Support TXT
- Support PPT
- Better CLI interface
- Websocket interface
- Improve the capture method
MIT