Hadoop MapReduce Implementation: AvgTemp

A project to calculate the average temperature from a dataset using MapReduce in Hadoop.

Steps to Run:

Get the dataset from http://www.ncdc.noaa.gov/orders/qclcd/, either for a specific year or for the whole set.

Once you have everything in place, start the Hadoop daemons with start-all.sh. (That script is deprecated; use start-dfs.sh and start-yarn.sh instead to start the NameNode, DataNodes, and ResourceManager.)

Type jps in a terminal. You should get something like this:

    4181 DataNode
    5214 NameNode
    5753 org.eclipse.equinox.launcher_1.3.0.v20140415-2008.jar
    4643 NodeManager
    4357 SecondaryNameNode
    4514 ResourceManager
    14302 Jps
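
The jps check can also be scripted. Below is a minimal sketch in plain Java that scans jps output for the Hadoop daemons named in the listing. The class name `DaemonCheck` and the method `missingDaemons` are hypothetical and not part of this project; the sample string in `main` is a shortened listing used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DaemonCheck {
    // Daemon names taken from the jps listing in this README.
    static final List<String> REQUIRED = Arrays.asList(
        "NameNode", "DataNode", "SecondaryNameNode", "ResourceManager", "NodeManager");

    // Returns the required daemons that do not appear in the given jps output.
    static List<String> missingDaemons(String jpsOutput) {
        Set<String> running = new HashSet<>();
        for (String line : jpsOutput.split("\\R")) {
            // Each jps line is "<pid> <name>"; keep only the name.
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length == 2) {
                running.add(parts[1]);
            }
        }
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!running.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Shortened sample: two daemons are deliberately left out.
        String sample = "4181 DataNode\n5214 NameNode\n4643 NodeManager\n14302 Jps";
        System.out.println("Missing: " + missingDaemons(sample));
        // prints "Missing: [SecondaryNameNode, ResourceManager]"
    }
}
```

An empty result means all five daemons are up. To use it against a live cluster, the output of jps would have to be captured and passed in as a string.
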

Spin up the cluster on either a local machine or a virtual machine; I used my local machine running Ubuntu 14.04 LTS. With the daemons running, create a directory in your HDFS from a terminal:

    hadoop fs -mkdir -p hdfs://localhost:9000/data/big/weather

Now copy the data to your cluster. For example, if the dataset resides in /home/anurag/data/big/weather, type:

    hadoop fs -put /home/anurag/data/big/weather hdfs://localhost:9000/data/big/weather

Go to http://localhost:50070/explorer.html to check whether the files were copied.

I have already exported the jar for use with this dataset; you can export your own from your IDE. Copy the jar to your local filesystem. Mine resides in /home/anurag/workspace2/AvgTemp/bin/AvgTemp.jar.

To run the job on your cluster, type:

    hadoop jar /home/anurag/workspace2/AvgTemp/bin/AvgTemp.jar AvgTemp hdfs://localhost:9000/data/big/weather hdfs://localhost:9000/data/big/weather/output

Here AvgTemp is the main class, followed by the HDFS input path and the HDFS output path.

And voila!! You just ran your Map Reduce Job successfully.
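
For readers who want to see what kind of computation the job performs, here is a minimal sketch of the map and reduce steps in plain Java, without any Hadoop dependencies. It is not the code in src. The record layout (one `stationId,temperature` pair per line), the class name `AvgTempSketch`, and its method names are all assumptions made for illustration; the actual NOAA files need their own parsing.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class AvgTempSketch {
    // "Map" step: turn one input line into a (key, temperature) pair.
    // Assumed layout: "<stationId>,<temperature>". Returns null for malformed lines.
    static Map.Entry<String, Double> map(String line) {
        String[] fields = line.split(",");
        if (fields.length != 2) {
            return null;
        }
        try {
            return new SimpleEntry<>(fields[0].trim(), Double.parseDouble(fields[1].trim()));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // "Reduce" step: average all temperatures that share one key.
    static double reduce(List<Double> temps) {
        double sum = 0;
        for (double t : temps) {
            sum += t;
        }
        return sum / temps.size();
    }

    // Stands in for the shuffle: group mapped pairs by key, then reduce each group.
    static Map<String, Double> run(List<String> lines) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<String, Double> kv = map(line);
            if (kv != null) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Double> averages = new TreeMap<>();
        for (Map.Entry<String, List<Double>> e : grouped.entrySet()) {
            averages.put(e.getKey(), reduce(e.getValue()));
        }
        return averages;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A,10.0", "B,20.0", "A,30.0", "bad line");
        System.out.println(run(lines)); // prints "{A=20.0, B=20.0}"
    }
}
```

In a real Hadoop job the grouping is done by the framework, and the reducer receives the values for one key at a time. If a combiner is added, it should pass along sums and counts rather than partial averages, since an average of averages is generally not the overall average.
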
