forked from dongweiming/commentbox, for self-study
assets
data
libs
spider
static
templates
views
.gitattributes
.gitignore
README.md
app.py
config.py
ext.py
local_settings.py.tmpl
models.py
requirements.txt
run.py

README.md

Commentbox

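A crawler that scrapes highlighted comments from NetEase Cloud Music (forked from dongweiming/commentbox).

The crawler implementation is described in Python之美 ("The Beauty of Python"). Visit 云音乐评论 ("Cloud Music Comments") to see it running live. (Both links are now dead.)

Preview, web:

Mobile: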

Tech Stack

  1. Backend: Flask + Mongoengine + Mako + requests + Redis + lxml + concurrent.futures

  2. Frontend: React + Mobx + Fetch + Material-UI + ES6 + Webpack + Babel

Getting Started

Create a virtual environment and install the application dependencies

❯ git clone https://github.com/dongweiming/commentbox
❯ cd commentbox
❯ virtualenv venv
❯ source venv/bin/activate
❯ pip install -r requirements.txt
❯ cp local_settings.py.tmpl local_settings.py  # then edit the settings in it (e.g. Redis, MongoDB)
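
As a rough illustration of what the edited local_settings.py might contain, here is a minimal sketch. The real keys are defined by local_settings.py.tmpl, which is not reproduced here, so every name below (`MONGODB_SETTINGS`, `REDIS_URL`, `PROXIES`) and every value is an assumption; check the template before copying anything.

```python
# local_settings.py (illustrative sketch; all names and values are assumptions,
# the authoritative keys are whatever local_settings.py.tmpl defines)

# MongoDB connection, written in the dict style that flask-mongoengine reads
# from the Flask config. The project may use a different key or layout.
MONGODB_SETTINGS = {
    "db": "commentbox",
    "host": "localhost",
    "port": 27017,
}

# Hypothetical Redis connection URL.
REDIS_URL = "redis://localhost:6379/0"

# Hypothetical list of HTTP proxy addresses for the crawler; empty means "no proxy".
PROXIES = []
```

Keeping these values in local_settings.py rather than config.py means machine-specific settings stay out of the shared configuration.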

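Crawler

  1. Before crawling, you can add some proxy addresses to local_settings.py; without them, crawl speed will suffer.
  2. Adjust max_workers in run.py; the number of CPU cores on the server is the recommended value. Then run python run.py to start crawling.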
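
The two steps above can be sketched roughly as follows. This is not the project's run.py: the `PROXIES` list, `pick_proxy`, `crawl`, and `run` are hypothetical names, the choice of `ThreadPoolExecutor` is an assumption (the project only says it uses concurrent.futures), and `crawl` is a stand-in that makes no network request.

```python
import os
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical proxy list; in the project this would be read from local_settings.py.
PROXIES = ["http://127.0.0.1:8001", "http://127.0.0.1:8002"]

# Use the CPU core count as the worker count, as recommended above.
MAX_WORKERS = os.cpu_count() or 1


def pick_proxy(proxies):
    """Return a requests-style proxies mapping, or None when no proxy is configured."""
    if not proxies:
        return None
    address = random.choice(proxies)
    return {"http": address, "https": address}


def crawl(item_id):
    """Stand-in for the real per-item crawl; it only reports which proxy would be used."""
    # A real crawler might call:
    #   requests.get(url, proxies=pick_proxy(PROXIES), timeout=10)
    return item_id, pick_proxy(PROXIES)


def run(item_ids, max_workers=MAX_WORKERS):
    """Crawl all items concurrently and return the results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as the inputs.
        return list(executor.map(crawl, item_ids))


if __name__ == "__main__":
    for item_id, proxy in run([1, 2, 3]):
        print(item_id, proxy)
```

Choosing a proxy at random for each request spreads the load across the configured addresses; when the list is empty, `pick_proxy` returns `None`, which requests treats as "no proxy".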

Frontend Development

Install the dependencies first:

❯ cd assets
❯ npm install  # cnpm is recommended; otherwise this is rather slow

Development:

During development, you can first change the host and port in server.js, then start the dev server:

❯ make dev

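By default, the backend currently uses port 8100 and development mode uses port 3000.

Deployment:

❯ make build

When the build finishes, new static/js/dist/index.bundle.js* files will have been generated.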

Enjoy it!
