A script for automated crawling

1 parent 28d4699 commit 17244a32c8bda25c99a0774efcd02deb23818a59 @pde committed Sep 29, 2012
Showing with 19 additions and 0 deletions.
  1. +19 −0 code/robocrawl
@@ -0,0 +1,19 @@
+#!/bin/bash
+#
+# Run and commit a crawl, merging the results into the local "data" branch
+#
+# Unfortunately this doesn't work in cron without a patch for
+# http://stackoverflow.com/questions/4399617/python-os-getlogin-problem
+#
+CRAWLER_CHECKOUT=~/tosback2_real_crawls/
+set -x
+cd "$CRAWLER_CHECKOUT" || exit 1
+git checkout --force data
+python code/crawl.py > crawl.log
+TAG=$(grep "Committing results to" crawl.log | cut -d" " -f 4)
+if [ "$TAG" != "" ] ; then
+ git checkout --force "$TAG" || exit 1
+ git merge -s ours data || exit 1
+ git checkout --force data || exit 1
+ git merge "$TAG"
+fi
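
The tag lookup in the script takes the fourth space-separated field of the log line containing "Committing results to", which is the word right after "to". Below is a minimal sketch of that step in isolation. The helper name `extract_tag`, the sample log contents, and the tag name `crawl-example-tag` are all hypothetical; the real output format of `code/crawl.py` is not shown in this commit. The `head -n 1` is an addition that guards against several matching lines and is not part of the committed script.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): print the fourth space-separated
# field of the first line in the given log containing "Committing results to".
extract_tag() {
    grep "Committing results to" "$1" | head -n 1 | cut -d" " -f 4
}

# Sample log. The surrounding lines and the tag name are assumptions made
# for illustration only.
sample_log=$(mktemp)
printf '%s\n' \
    "Crawling 3 sites" \
    "Committing results to crawl-example-tag" \
    "Done" > "$sample_log"

TAG=$(extract_tag "$sample_log")
if [ -n "$TAG" ]; then
    echo "tag: $TAG"
else
    echo "no tag found"
fi
rm -f "$sample_log"
```

If the log has no matching line, `TAG` stays empty and the merge steps are skipped, which matches the `if` guard in the script.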
