This repository is private.
All pages are served over SSL and all pushing and pulling is done over SSH.
No one may fork, clone, or view it unless they are added as a member.
Every repository with this icon (
) is private.
Every repository with this icon (
This repository is public.
Anyone may fork, clone, or view it.
Every repository with this icon (
) is public.
Every repository with this icon (
commit c680a9bbd8d1a54945135a50ebbc779d08306e6e
tree c594b225d868790eddbeafbaca03d296933d4ba0
parent 126f80c2c2a68f25486b759786fe1d9b375ce708 parent a487211a027c53b2493e1a709119f5e70efde3ee
tree c594b225d868790eddbeafbaca03d296933d4ba0
parent 126f80c2c2a68f25486b759786fe1d9b375ce708 parent a487211a027c53b2493e1a709119f5e70efde3ee
BagelBox /
| name | age | message | |
|---|---|---|---|
| |
.gitignore | ||
| |
LICENSE | ||
| |
README | ||
| |
Rakefile | ||
| |
TODO | ||
| |
app/ | ||
| |
config/ | ||
| |
db/ | ||
| |
doc/ | ||
| |
downloads/ | ||
| |
lib/ | ||
| |
log/ | ||
| |
public/ | ||
| |
script/ | ||
| |
spec/ | ||
| |
tmp/ | ||
| |
vendor/ |
README
== Basic setup Before we get started: to use BagelBox you need to install the 'wget' utility. It comes pre-installed on most linux distro's but it's not on osx by default. For osx download wget here: http://www.statusq.org/archives/2008/07/30/1954/ Make sure you have sqlite installed. OSX comes pre-installed, on ubuntu 9.04: 'sudo apt-get install sqlite3' 'sudo apt-get install libsqlite3-ruby' 'sudo gem install sqlite3-ruby' Now we migrate the database: 'rake db:migrate RAILS_ENV=production' Now start the BagelBox server using: 'script/server -e production' == Welcome to BagelBox I've tried to make scraping as easy as possible but there is still some stuff I need to explain. The goal of this application is to help you get the content you want without actually manually downloading it. You can set up several sources of content. A content source provides us with stuff to download. Examples could be Pirate Bay, Demonoid or someone's ftp site. Content can be 'scraped' (gathered) from different kinds of sources. At the moment RSS, FTP and Directory sources are supported. I took the liberty to set up some default sources for you to get started. == Filtering content You probably don't want to download everything from every source so we'll have to filter what you want and what you don't want. We are going to set up filters for this purpose. A filter looks at the offered content and determines if you might want it or not. I added a default filter for you that states you do *not* want TS or CAM files. Who watches those anyway? You can add very specific filters that say you want 'episode 1 of season 1 of the X-Files' or very generic filters like 'Every episode of desperate housewives'. In the latter case you might not want the episodes you already have, so you can add negative filters too. These filters are exactly the same except when they match the content is rejected. == Building filters automatically We both know you are lazy and don't want to set these filters up manually. Who can blame you! Besides content sources we also have filter sources. These sources generate filters. You can have positive filter sources like the RSS stream from myepisodes.com or negative sources like your local movies directory (You don't want to download those again!). == Setting up timed scraping Go to the settings page and press the install button to enable timed scraping. == Advanced customization Most of the setup can be customized. Check out file types to see how you can add more stuff to scrape!








