Animu Crawling System
System's features
- Watches RSS feeds for new animu videos
- Automatically downloads new videos via torrent
- Serves media to the browser
System structure
.
├── README.md
├── dummy_files
├── pacman
├── logs
├── resources
├── scripts
├── server
└── media
pacman => RSS listener; the RSS list is configured in resources/feeds.json
logs => RSS listener's logs
resources => stores the RSS list
scripts => system scripts (such as the download script)
server => media server
media => stores videos
dummy_files => store *
How to run
Required packages
pacman
rss listener
This service runs on Python 2.7 and uses the following libraries:
- requests (install via pip)
- python-bs4 (install via a package manager, such as apt)
- logging (install via pip)
Installing Python libraries with an old version of pip may cause security problems. Please update pip to the newest version to avoid such issues.
server
media
This service runs on Node.js. Run npm install the first time you run this project to install its dependencies.
To use the npm start command, install the nodemon package globally on your host.
scripts
download videos via torrent
Install the aria2 package via your package manager, such as apt. More information here
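As an illustration of how a download step could hand a torrent to aria2, here is a minimal sketch in Python. It is not the project's download script: the helper names build_aria2_command and download, the default media directory, and the example URL are assumptions made for this example. The --dir and --seed-time options are standard aria2c options.

```python
import subprocess


def build_aria2_command(torrent_url, download_dir="media"):
    """Build an aria2c command line for one torrent (hypothetical helper)."""
    return [
        "aria2c",
        "--dir=%s" % download_dir,  # directory where downloaded files are stored
        "--seed-time=0",            # stop seeding once the download completes
        torrent_url,
    ]


def download(torrent_url, download_dir="media"):
    """Run aria2c and return its exit code; requires aria2 to be installed."""
    return subprocess.call(build_aria2_command(torrent_url, download_dir))


if __name__ == "__main__":
    # Only prints the command; call download() to actually start aria2c.
    print(" ".join(build_aria2_command("https://example.com/episode.torrent")))
```

Building the command as a list, rather than as one shell string, avoids quoting problems when a torrent URL contains characters such as & or ?.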
Add new rss
The RSS list is stored in resources/feeds.json.
Sample config:
{
  "title": "FEED LIST",
  "data": [
    {
      "team": "fuyu",
      "rss": "https://www.fuyufs.com/episode/feed",
      "anchor": {
        "tag": "a",
        "css_selector": {"data-key": "quality_720p_torrent"}
      }
    },
    {
      "team": "HorribleSubs",
      "rss": "https://nyaa.si/?page=rss&u=HorribleSubs",
      "anchor": false
    }
  ]
}
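Before adding a feed, it can help to check that the file still has the shape of the sample config. The sketch below is only an assumption about what such a check could look like. The function validate_feeds is hypothetical and not part of pacman, and the set of required keys is inferred from the sample alone.

```python
import json


def validate_feeds(config):
    """Return a list of problems found in a parsed feeds.json structure."""
    problems = []
    for i, feed in enumerate(config.get("data", [])):
        # Keys that every entry in the sample config carries.
        for key in ("team", "rss", "anchor"):
            if key not in feed:
                problems.append("feed %d: missing %s" % (i, key))
        anchor = feed.get("anchor", False)
        if anchor is False:
            continue  # no anchor configured for this feed
        if not isinstance(anchor, dict):
            problems.append("feed %d: anchor must be false or an object" % i)
            continue
        for key in ("tag", "css_selector"):
            if key not in anchor:
                problems.append("feed %d: anchor missing %s" % (i, key))
    return problems


if __name__ == "__main__":
    with open("resources/feeds.json") as handle:
        for problem in validate_feeds(json.load(handle)):
            print(problem)
```

An empty result means every entry has the team, rss and anchor keys seen in the sample.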
For each RSS feed, if the link field in the XML does not contain the URL of a .torrent file, you must specify the anchor field in the RSS config to point at the tag that holds the .torrent download link.
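The anchor rule can be sketched as follows. This is an illustration under stated assumptions, not pacman's actual code: the names AnchorFinder and resolve_torrent_url, the fetch_page callback and the example URLs are hypothetical. It also assumes that css_selector is matched as a set of attribute name/value pairs on the tag, and that the download link is the tag's href. It uses only the standard library's HTML parser to stay self-contained.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class AnchorFinder(HTMLParser):
    """Collect href values of tags that match an anchor config."""

    def __init__(self, tag, attrs):
        HTMLParser.__init__(self)
        self.tag = tag
        self.wanted = attrs
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        found = dict(attrs)
        matches = all(found.get(k) == v for k, v in self.wanted.items())
        if tag == self.tag and matches and found.get("href"):
            self.hrefs.append(found["href"])


def resolve_torrent_url(link, anchor, fetch_page):
    """Return the .torrent URL for one RSS item, or None if none is found."""
    if not anchor:
        return link  # the item's link already points at the .torrent file
    finder = AnchorFinder(anchor["tag"], anchor["css_selector"])
    finder.feed(fetch_page(link))  # fetch_page(url) returns the page HTML
    return finder.hrefs[0] if finder.hrefs else None


if __name__ == "__main__":
    page = ('<a data-key="quality_720p_torrent" '
            'href="https://example.com/ep1.torrent">720p</a>')
    anchor = {"tag": "a", "css_selector": {"data-key": "quality_720p_torrent"}}
    print(resolve_torrent_url("https://example.com/ep1", anchor, lambda url: page))
```

Passing fetch_page in as a callback keeps the lookup testable without network access. In a real run it could wrap an HTTP request made with requests.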
Run system
Run the three services one by one:
Pacman service - RSS listener
$ cd pacman
$ python main.py
Media service
If you have installed nodemon:
$ cd server
$ npm start
Otherwise:
$ cd server
$ node app.js
Monitoring service (optional)
$ cd scripts
$ watch -n 10 "./status.sh"
Endpoint list
Here xxx.yyy is your host:
=> Greeting endpoint
=> Cloned videos list
=> Crawling service log
=> Crawling system status (for checking hard disk capacity, system status, etc)
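The status endpoint is described as reporting hard disk capacity. As a rough illustration of that kind of check, and not a description of what status.sh or the endpoint actually does, the following sketch reports disk usage for a directory. The function name disk_status and the returned field names are hypothetical. It relies on shutil.disk_usage, which is only available on Python 3.3 and later.

```python
import shutil


def disk_status(path="."):
    """Return total and free space in GiB plus the used percentage for path."""
    usage = shutil.disk_usage(path)
    gib = float(1024 ** 3)
    used_percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "total_gb": round(usage.total / gib, 2),
        "free_gb": round(usage.free / gib, 2),
        "used_percent": round(used_percent, 1),
    }


if __name__ == "__main__":
    # Check the current directory; point this at the media directory in practice.
    print(disk_status("."))
```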