Join GitHub today
GitHub is home to over 20 million developers working together to host and review code, manage projects, and build software together.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Failed to load latest commit information.|
== Basic setup Before we get started: to use BagelBox you need to install the 'wget' utility. It comes pre-installed on most linux distro's but it's not on osx by default. For osx download wget here: http://www.statusq.org/archives/2008/07/30/1954/ Make sure you have sqlite installed. OSX comes pre-installed, on ubuntu 9.04: 'sudo apt-get install sqlite3' 'sudo apt-get install libsqlite3-ruby' 'sudo gem install sqlite3-ruby' Now we migrate the database: 'rake db:migrate RAILS_ENV=production' Now start the BagelBox server using: 'script/server -e production' == Welcome to BagelBox I've tried to make scraping as easy as possible but there is still some stuff I need to explain. The goal of this application is to help you get the content you want without actually manually downloading it. You can set up several sources of content. A content source provides us with stuff to download. Examples could be Pirate Bay, Demonoid or someone's ftp site. Content can be 'scraped' (gathered) from different kinds of sources. At the moment RSS, FTP and Directory sources are supported. I took the liberty to set up some default sources for you to get started. == Filtering content You probably don't want to download everything from every source so we'll have to filter what you want and what you don't want. We are going to set up filters for this purpose. A filter looks at the offered content and determines if you might want it or not. I added a default filter for you that states you do *not* want TS or CAM files. Who watches those anyway? You can add very specific filters that say you want 'episode 1 of season 1 of the X-Files' or very generic filters like 'Every episode of desperate housewives'. In the latter case you might not want the episodes you already have, so you can add negative filters too. These filters are exactly the same except when they match the content is rejected. == Building filters automatically We both know you are lazy and don't want to set these filters up manually. Who can blame you! Besides content sources we also have filter sources. These sources generate filters. You can have positive filter sources like the RSS stream from myepisodes.com or negative sources like your local movies directory (You don't want to download those again!). == Setting up timed scraping Go to the settings page and press the install button to enable timed scraping. == Advanced customization Most of the setup can be customized. Check out file types to see how you can add more stuff to scrape!