Add Readme and subdirectory for the downloads
eduardschaeli committed Nov 2, 2012
1 parent 46bc6b5 commit c7f83c3
Showing 3 changed files with 21 additions and 5 deletions.
15 changes: 15 additions & 0 deletions README.md
@@ -0,0 +1,15 @@
# Image scraper

This shell script recursively scrapes all images from the URLs listed
in the sites.txt file.

It will be used in an art project by Stefan Baltensperger.

Usage:
- Add URLs to the sites.txt file, one per line, like this:
http://www.tagi.ch
The http:// part is important!
- Make the scraper.sh shell script executable:
chmod +x scraper.sh
- Run it like this:
./scraper.sh
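
The `http://` prefix matters because scraper.sh removes it with `cut -c8-`, which drops the first seven characters whether or not they are actually `http://`. The sketch below reproduces that behaviour and, as an assumption of what a more tolerant version could look like, adds a prefix-based alternative. The function names `strip_fixed` and `strip_scheme` are hypothetical and do not appear in the commit.

```shell
#!/bin/sh
# strip_fixed: mirrors the script's approach, dropping the first 7 characters.
strip_fixed() {
    printf '%s\n' "$1" | cut -c8-
}

# strip_scheme: hypothetical alternative that removes a leading
# http:// or https:// only when it is actually present.
strip_scheme() {
    url=$1
    url=${url#http://}
    url=${url#https://}
    printf '%s\n' "$url"
}

# strip_fixed  http://www.tagi.ch    prints www.tagi.ch
# strip_fixed  www.tagi.ch           prints i.ch (the host name is mangled)
# strip_scheme www.tagi.ch           prints www.tagi.ch
```

With the fixed-width cut, a line without the scheme yields a mangled host name, which is why the README insists on the prefix.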
7 changes: 4 additions & 3 deletions scraper.sh
@@ -1,18 +1,19 @@
 #!/bin/sh
+mkdir downloads
 for line in `cat sites.txt`; do
 # replace http://
 stripped_url=`echo $line| cut -c8-`
-target_folder=$stripped_url
+target_folder="downloads/$stripped_url"
 
 echo $stripped_url
-mkdir $stripped_url
+mkdir $target_folder
 echo ""
 echo ""
 echo ""
 echo "Scraping $stripped_url"
 echo "-----------------------------------"
 echo "> creating folder.."
-mkdir $stripped_url
+mkdir $target_folder
 echo "> scraping $stripped_url"
 wget -e robots=off --recursive -p \
 -nd -nc -np --accept jpg,jpeg,png,gif -P $target_folder --wait 0.5 $stripped_url
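
As committed, the script calls `mkdir` twice per site and once for `downloads` without `-p`, so a second run prints "File exists" errors, and the unquoted variables would split on whitespace. A minimal sketch of a more defensive loop follows. It is not the author's code: the function names `target_folder_for` and `scrape_all` are hypothetical, and it passes the full URL to wget rather than the stripped one.

```shell
#!/bin/sh
# Map a URL from sites.txt to its folder under downloads/.
target_folder_for() {
    url=${1#http://}
    printf 'downloads/%s\n' "$url"
}

# Scrape every non-empty line of the sites file given as $1.
scrape_all() {
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue      # skip blank lines
        target_folder=$(target_folder_for "$line")
        mkdir -p "$target_folder"       # also creates downloads/; no error if it exists
        echo "> scraping $line"
        wget -e robots=off --recursive -p \
            -nd -nc -np --accept jpg,jpeg,png,gif \
            -P "$target_folder" --wait 0.5 "$line"
    done < "$1"
}

# Usage: scrape_all sites.txt
```

Reading the file with `while read` instead of `` for line in `cat sites.txt` `` keeps each line intact, and `mkdir -p` makes the script safe to run repeatedly.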
4 changes: 2 additions & 2 deletions sites.txt
@@ -1,2 +1,2 @@
-http://www.sysinf.ch
-http://lab.sysinf.ch
+http://www.YOURURL.ch
+http://lab.YOUR-SECOND-URL.ch
