
missed checkin

1 parent caf7314 commit 9ea115ef75a2eaf13ffe9acf357f5004f67c9f2f Pete Skomoroch committed
Showing 1 changed file with 4 additions and 1 deletion.
  1. +4 −1 streaming/parse_tweets.sh
5 streaming/parse_tweets.sh
@@ -4,6 +4,9 @@
# bash /mnt/parse_tweets.sh 2010-02-1 parsed_tweets_feb
# hadoop distcp /user/root/parsed_tweets_feb/ s3n://where20/parsed_tweets_feb
+# optional -jobconf mapred.output.compress=true \
+# optional -jobconf mapred.reduce.tasks=0 \
+
DATELIMIT=$1
OUTPUT=$2
@@ -12,5 +15,5 @@ hadoop jar /home/hadoop/contrib/streaming/hadoop-streaming.jar \
 -output $OUTPUT \
 -mapper "parse_stream.py" \
 -file 'parse_stream.py' \
- -jobconf mapred.output.compress=true \
+ -jobconf mapred.reduce.tasks=0 \
 -jobconf mapred.job.name=parse_tweets_$DATELIMIT
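
The commit swaps one -jobconf setting for another and records both as "optional" in the header comments. As a rough sketch of how the two settings could be toggled without editing the command each time, the function below assembles only the arguments visible in this diff. MAP_ONLY, COMPRESS, and build_streaming_args are hypothetical names that do not appear in the commit, and the -input argument is left out because the diff does not show it.

```shell
#!/usr/bin/env bash
# Sketch only: MAP_ONLY, COMPRESS and build_streaming_args are hypothetical
# names introduced for illustration; they are not part of parse_tweets.sh.

build_streaming_args() {
  local datelimit=$1 output=$2
  local args=(
    -output "$output"
    -mapper "parse_stream.py"
    -file "parse_stream.py"
  )
  # Zero reduce tasks makes this a map-only job: the mapper output is
  # written straight to the output directory with no reduce phase.
  if [ "${MAP_ONLY:-1}" = "1" ]; then
    args+=(-jobconf mapred.reduce.tasks=0)
  fi
  # Optional: compress the job output.
  if [ "${COMPRESS:-0}" = "1" ]; then
    args+=(-jobconf mapred.output.compress=true)
  fi
  args+=(-jobconf "mapred.job.name=parse_tweets_${datelimit}")
  # Print one argument per line so the result is easy to inspect.
  printf '%s\n' "${args[@]}"
}
```

Running `MAP_ONLY=1 COMPRESS=0 build_streaming_args 2010-02-1 parsed_tweets_feb` prints the argument list this commit leaves in place, one argument per line. Later Hadoop releases deprecate -jobconf in favor of the generic -D option, so the flag spelling may need adjusting on a newer cluster.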
