Generate emulator input files

kylemarcus edited this page Sep 28, 2014 · 1 revision
  1. Create downsample files from Titan2d pileheight record output

The script that does this is downsample.py. I think the pileheight record path is hard coded in it, so that will have to change. Also look out for other hard-coded numbers like 2048; they might need to be changed.

Right now you give downsample.py a pileheight number and it outputs the corresponding downsample file that will be needed later.
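One way to get rid of the hard-coded values would be to read them from the command line. This is only a rough sketch and not the current downsample.py: the option names `--record-dir` and `--size` are made up for illustration, and what 2048 actually stands for in the script is an assumption that needs checking.

```python
import argparse


def parse_args(argv=None):
    """Parse options for a downsample run (option names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Downsample one Titan2d pileheight record.")
    # The pileheight number, which the script already takes today.
    parser.add_argument("pileheight", type=int,
                        help="number of the pileheight record to downsample")
    # Stands in for the hard-coded pileheight record path.
    parser.add_argument("--record-dir", required=True,
                        help="directory holding the pileheight record files")
    # Stands in for hard-coded constants such as 2048 (meaning assumed).
    parser.add_argument("--size", type=int, default=2048,
                        help="value currently hard coded as 2048")
    return parser.parse_args(argv)


# Example with an explicit argument list instead of sys.argv.
args = parse_args(["17", "--record-dir", "/path/to/records"])
print(args.pileheight, args.record_dir, args.size)  # 17 /path/to/records 2048
```

Keeping 2048 as the default means existing runs behave the same until someone passes a different value.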

  • uniq_coords.txt: This comes from running emulator_netezza.sh.

  • uniq_uniq.txt: Run map3.py on Hadoop and then merge all the output files.

  • uniq_phm.txt: https://github.com/kylemarcus/emulator/blob/4abf1d0969cc3ce81bad9dc6ee8736feb88a83c7/mapreduce/hadoop/rohitOriginal/start_hadoop

    bin/hadoop jar contrib/streaming/hadoop-streaming.jar \
      -file /user/shivaswa/my_hadoop/map1.sh -mapper map1.sh \
      -file /user/shivaswa/my_hadoop/reduce1.py -reducer reduce1.py \
      -file /user/shivaswa/my_hadoop/map1.py \
      -file /user/shivaswa/my_hadoop/neighbor.py \
      -file /user/shivaswa/my_hadoop/parser.py \
      -file /user/shivaswa/my_hadoop/build_data_set.py \
      -input /montserrat_take2_vol_dir_bed_int.phm -output /output_1
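For the uniq_uniq.txt item in the list above, the "merge all the output files" part could look something like the sketch below. It assumes the map3.py output has already been copied to a local directory and that the files follow Hadoop's usual `part-*` naming; the function name `merge_parts` is made up for illustration.

```python
import glob
import os
import shutil


def merge_parts(parts_dir, merged_path, pattern="part-*"):
    """Concatenate Hadoop output parts from parts_dir into merged_path."""
    # Sort so the merged file comes out in a stable, repeatable order.
    part_files = sorted(glob.glob(os.path.join(parts_dir, pattern)))
    if not part_files:
        raise FileNotFoundError(
            "no files matching %r in %r" % (pattern, parts_dir))
    with open(merged_path, "wb") as merged:
        for part in part_files:
            # Copy bytes as-is; no assumptions about the record format.
            with open(part, "rb") as src:
                shutil.copyfileobj(src, merged)
    return part_files
```

A call such as `merge_parts("map3_output", "uniq_uniq.txt")` would then write the merged file, where `map3_output` is a placeholder directory name. Hadoop's own `-getmerge` file system command may be a simpler route if the files are still on HDFS.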

  2. Next, copy the result file into the home/local directory. part-00000 is the name of the file that was generated.

    bin/hadoop dfs -copyToLocal /output_1/part-00000 /user/shivaswa/my_hadoop/

Also rename the file to "uniq_phm.txt", both on HDFS and in the local directory. On HDFS:

    bin/hadoop dfs -mv /output_1/part-00000 /output_1/uniq_phm.txt
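The copy and rename steps could be wrapped in a small script so they are not typed by hand each time. This is just a sketch: the function names are made up, it assumes it is run from the Hadoop install directory so that `bin/hadoop` resolves, and the local rename uses `os.rename` because no command for that step is given above.

```python
import os
import subprocess

HADOOP = "bin/hadoop"  # same relative path as in the commands above


def fetch_commands(hdfs_dir, local_dir,
                   part="part-00000", final="uniq_phm.txt"):
    """Return the copy and rename HDFS commands as argv lists."""
    src = hdfs_dir.rstrip("/") + "/" + part
    dst = hdfs_dir.rstrip("/") + "/" + final
    return [
        [HADOOP, "dfs", "-copyToLocal", src, local_dir],
        [HADOOP, "dfs", "-mv", src, dst],
    ]


def fetch_and_rename(hdfs_dir, local_dir, part="part-00000",
                     final="uniq_phm.txt", run=subprocess.check_call):
    """Copy the part file locally, rename it on HDFS, then rename locally."""
    for argv in fetch_commands(hdfs_dir, local_dir, part, final):
        run(argv)  # check_call raises if hadoop exits with an error
    # Local rename, so both copies end up with the same name.
    os.rename(os.path.join(local_dir, part),
              os.path.join(local_dir, final))


# Intended use (not executed here, since it needs a running Hadoop):
# fetch_and_rename("/output_1", "/user/shivaswa/my_hadoop/")
```

The `run` parameter is only there so the commands can be printed or faked for a dry run, for example `fetch_and_rename(..., run=print)` against a directory that already holds the part file.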
