Added test data

1 parent 7e78c6f commit be55d71714fea39f874cb5b1cf5b8570c3eb31f8 @sonalgoyal committed Sep 17, 2012
Showing 1,421 changed files with 115,780 additions and 18 deletions.
@@ -13,7 +13,7 @@ HBASE_HOME=<path to HBase folder>
# add Hbase libs to CLASSPATH
for f in $HBASE_HOME/lib/*.jar; do
CLASSPATH=${CLASSPATH}:$f;
-done
+done
Note: These lines add the HBase dependencies to your Hadoop classpath each time you run hadoop.
You can comment these lines out later when you don't require
@@ -29,42 +29,34 @@ export HBASE_HOME=<path to HBase folder>
$bin/hadoop jar $HBASE_HOME/hbase-0.90.3.jar import stockDataSimple /exportSimple
$bin/hadoop jar $HBASE_HOME/hbase-0.90.3.jar import stockDataComposite /exportComposite
-Note:- Above process will load demo data in HBase If you want to load more data it can be done by following instructions given below.
+Note: The above process loads demo data into HBase.
+If you want to load more data, follow the instructions given below.
Instructions to download and populate stock data.
--------------------------------------------------
1. Fetch BSE stock data from the BSE website for a particular date range by executing downloadBseData.py.
- For example:
- nube@nube-desktop:~$ cd <BseStock folder path in crux>
-
-nube@nube-desktop:~/crux/testData/BseStock$ ls
-createTableData.py downloadBseData.py PopulateBseData.java README.txt stockIdsList.txt
-
-nube@nube-desktop:~/crux/testData/BseStock$ ./downloadBseData.py
-
-Note:- if you execute above command default value will be considered for arguments specified. Else you can specify arguments like in example done below.
Arguments required are
a. filePath to a file which has the list of stockIds. Data for these stocks will be downloaded.
b. startDate in MM/dd/yyyy format
c. endDate in MM/dd/yyyy format
-d. outputPath where these files will be saved after download.
-
-Make a directory called downloadedFiles in BseStock folder.
+d. outputPath where these files will be saved after download. This directory must already exist.
nube@nube-desktop:~/crux/testData/BseStock$ ./downloadBseData.py ./stockIdsList.txt 01/06/2011 05/06/2011 downloadedFiles/
The above command fetches files for all stockIds listed in stockIdsList.txt, which sits next to the script in the BseStock folder.
-The files will be copied to 'downloadedFiles' named folder in same directory from where you run script, Each stock's data will be saved with the name of the stockIds.
+The files are saved to the 'downloadedFiles' folder in the directory from which you run the script.
+Each stock's data is saved in a file named after its stockId.
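The date arguments use MM/dd/yyyy, so 01/06/2011 in the command above means 6 January 2011, not 1 June. A minimal Python sketch of how such arguments could be checked before downloading follows. The helper name parse_range is hypothetical and is not taken from downloadBseData.py.

```python
from datetime import datetime, date


def parse_range(start_text, end_text):
    """Parse two MM/dd/yyyy strings and check that start is not after end."""
    start = datetime.strptime(start_text, "%m/%d/%Y").date()
    end = datetime.strptime(end_text, "%m/%d/%Y").date()
    if start > end:
        raise ValueError("startDate must not be after endDate")
    return start, end


if __name__ == "__main__":
    # Same dates as the example command in step 1.
    print(parse_range("01/06/2011", "05/06/2011"))
```

Validating the range up front avoids requesting data for an empty or reversed interval.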
-2.Then we run manipulation function on downloaded data so that it can concatenate stockId with date(yyyyMMdd), which can be used later as the rowkey while loading.
+2. Next, we process the downloaded data to concatenate each stockId with the date (yyyyMMdd). The result is used later as the rowkey while loading.
For this, make a directory named hbaseData in the BseStock folder and enter the following command:
nube@nube-desktop:~/crux/testData/BseStock$ ./createTableData.py downloadedFiles/ hbaseData/
where "downloadedFiles" denotes the input folder and "hbaseData" denotes the output folder.
+Make sure hbaseData already exists.
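The rowkey construction in step 2 can be sketched in Python as below. This is an illustration only, not the code of createTableData.py: the function name make_rowkey and the stockId "500010" are hypothetical, and plain concatenation with no separator is an assumption. The input date format matches the Date column of the downloaded files (for example 6-January-2011).

```python
from datetime import datetime


def make_rowkey(stock_id, bse_date):
    """Concatenate a stockId with the date rewritten as yyyyMMdd."""
    # The files spell dates like "6-January-2011". %B expects English month
    # names, which holds under Python's default C locale.
    day = datetime.strptime(bse_date, "%d-%B-%Y")
    # Assumption: no separator between stockId and date.
    return stock_id + day.strftime("%Y%m%d")


if __name__ == "__main__":
    print(make_rowkey("500010", "6-January-2011"))
```

Because yyyyMMdd strings sort lexicographically in chronological order, rows for one stockId end up stored in date order under such a key.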
3. We now want to save this data in HBase. Let us create an HBase table named stockDataComposite with the column families price, spread and stats
@@ -112,14 +104,16 @@ a.Create table in HBase. On the shell,
create 'stockDataComposite','price','spread','stats'
-b.Execute java program PopulateBseData.java, first compile by adding hbase-0.90.3.jar, hadoop-core-0.20.2.jar to claspath of java.
+b. Run the Java class PopulateBseData found in crux/target/classes, or compile it yourself after adding all the Hadoop lib jars and the HBase jar to the classpath.
nube@nube-desktop:~/crux/testData/BseStock$ export CLASSPATH=$CLASSPATH:~/hadoop-0.20.2/hadoop-0.20.2-core.jar:~/hbase-0.90.3/hbase-0.90.3.jar
nube@nube-desktop:~/crux/testData/BseStock$ javac PopulateBseData.java
Once compiled, to run this program you need to add a few more jars to the Java classpath: commons-logging-1.1.1.jar, zookeeper-3.3.2.jar and log4j-1.2.16.jar. The program takes one argument, inputPath, i.e. the outputPath of step 1 in this README.
Note: in this step all data is inserted into the table stockDataComposite as Float, except the rowkey, which is the composite key, and numShares and numTrades, which are stored as long.
-nube@nube-desktop:~/crux/testData/BseStock$ java PopulateBseData ./outputHBase
+nube@nube-desktop:~/crux/testData/BseStock$ java PopulateBseData ./outputHBase
+You can also add the HBase jar to the Hadoop classpath and run:
+nube@nube-desktop:~/crux/target/classes$ hadoop PopulateBseData ~/crux/testData/BseStock/hbaseData
inputPath is the path to the directory where the files generated by createTableData.py are kept.
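To illustrate what storing values as Float and long means at the byte level, here is a minimal Python sketch. It assumes the loader encodes values the way HBase's Bytes utility does (big-endian, 4 bytes for a float, 8 bytes for a long); whether PopulateBseData does exactly this is not confirmed by this README. The function names are hypothetical.

```python
import struct


def encode_float(value):
    """Encode a value as a 4-byte big-endian IEEE 754 float."""
    return struct.pack(">f", value)


def encode_long(value):
    """Encode a value as an 8-byte big-endian signed integer."""
    return struct.pack(">q", value)


if __name__ == "__main__":
    # An open price and a share count taken from the sample data.
    print(encode_float(800.0).hex())
    print(encode_long(18767).hex())
```

A reader of the table needs to decode each cell with the matching type, since a 4-byte float cell and an 8-byte long cell are not interchangeable.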
@@ -0,0 +1,83 @@
+Date,Open Price,High Price,Low Price,Close Price,WAP,No.of Shares,No. of Trades,Total Turnover (Rs.),Spread High-Low,Spread Close-Open
+6-January-2011,800.00,807.00,788.10,798.65,797.968508552245963659,18767,754,14975475.00,18.90,-1.35
+7-January-2011,799.00,802.00,759.30,776.35,782.767109523277883219,17914,958,14022490.00,42.70,-22.65
+10-January-2011,777.00,782.00,726.30,741.20,755.692271049184232463,33588,1062,25382192.00,55.70,-35.80
+11-January-2011,744.00,763.00,726.10,741.45,738.894139721604049395,45834,1584,33866474.00,36.90,-2.55
+12-January-2011,737.00,754.70,725.20,747.30,737.127926421404682274,31096,1293,22921730.00,29.50,10.30
+13-January-2011,748.10,768.00,746.35,752.85,759.476763803680981595,13040,739,9903577.00,21.65,4.75
+14-January-2011,753.00,765.00,736.75,749.15,750.855336444377540267,13286,466,9975864.00,28.25,-3.85
+17-January-2011,744.10,753.80,740.20,746.65,746.217363225389796106,10839,502,8088250.00,13.60,2.55
+18-January-2011,750.00,757.90,742.35,749.10,748.894489145286169728,6587,404,4932968.00,15.55,-0.90
+19-January-2011,750.70,751.90,730.00,737.45,738.966980459278276993,14022,714,10361795.00,21.90,-13.25
+20-January-2011,730.00,744.90,726.10,739.05,736.294646247310467029,7901,460,5817464.00,18.80,9.05
+21-January-2011,739.00,744.00,734.00,741.20,739.850319106501794974,10028,566,7419219.00,10.00,2.20
+24-January-2011,741.00,751.80,732.55,736.10,741.430719779589025370,8711,557,6458603.00,19.25,-4.90
+25-January-2011,744.00,755.00,738.30,742.50,745.130573814176585539,5629,382,4194340.00,16.70,-1.50
+27-January-2011,750.00,750.00,706.15,716.25,725.665521327014218009,25320,1096,18373851.00,43.85,-33.75
+28-January-2011,719.00,720.00,701.20,707.85,706.862339007184001035,15451,745,10921730.00,18.80,-11.15
+31-January-2011,699.00,749.85,688.00,736.70,723.656276018250241449,30027,1582,21729227.00,61.85,37.70
+1-February-2011,753.00,753.00,712.10,717.25,728.904521768364430996,23796,1193,17345012.00,40.90,-35.75
+2-February-2011,725.00,729.80,695.10,702.20,714.793795620437956204,18632,809,13318038.00,34.70,-22.80
+3-February-2011,703.00,714.00,698.00,708.15,707.757686849574266792,8456,614,5984799.00,16.00,5.15
+4-February-2011,709.00,717.95,690.00,692.80,703.846820608834338811,14421,826,10150175.00,27.95,-16.20
+7-February-2011,697.00,703.90,660.30,666.75,675.906541134716813723,20287,1210,13712116.00,43.60,-30.25
+8-February-2011,677.70,677.70,636.30,646.05,650.265716406829992576,33675,2055,21897698.00,41.40,-31.65
+9-February-2011,654.00,654.00,605.00,618.60,627.125298329355608591,30168,1488,18919116.00,49.00,-35.40
+10-February-2011,615.00,655.90,595.80,647.90,622.397123224267454694,28087,1679,17481268.00,60.10,32.90
+11-February-2011,647.90,654.55,626.35,642.40,637.996272134203168685,12876,864,8214840.00,28.20,-5.50
+14-February-2011,649.00,669.00,649.00,663.95,661.689762424723711203,15111,869,9998794.00,20.00,14.95
+15-February-2011,669.95,673.00,659.00,662.50,664.805073545086335536,23455,948,15593003.00,14.00,-7.45
+16-February-2011,663.00,665.00,659.05,661.25,661.655689998468371879,19587,493,12959850.00,5.95,-1.75
+17-February-2011,660.50,674.00,660.00,666.50,667.352025506376594148,15996,798,10674963.00,14.00,6.00
+18-February-2011,665.20,676.70,657.05,662.30,667.217666849229324778,12781,546,8527709.00,19.65,-2.90
+21-February-2011,668.35,668.90,657.05,662.60,662.231461371393112007,16115,377,10671860.00,11.85,-5.75
+22-February-2011,663.00,666.60,657.10,663.50,660.033418643567280244,20318,433,13410559.00,9.50,0.50
+23-February-2011,663.50,715.20,640.00,698.10,677.662195096848562195,485810,15475,329215071.00,75.20,34.60
+24-February-2011,700.00,700.00,659.00,668.35,670.673983196486538094,104740,3886,70246393.00,41.00,-31.65
+25-February-2011,670.00,686.95,664.00,678.75,675.141393531582374647,23035,1240,15551882.00,22.95,8.75
+28-February-2011,677.80,686.00,661.65,663.65,675.406046010520311806,15779,729,10657232.00,24.35,-14.15
+1-March-2011,670.00,685.00,666.20,682.85,678.868177921706829221,10499,595,7127437.00,18.80,12.85
+3-March-2011,687.00,710.60,675.00,688.60,690.276422764227642276,17466,1109,12056368.00,35.60,1.60
+4-March-2011,690.00,747.70,688.00,726.75,732.882113091840229641,183939,8530,134805603.00,59.70,36.75
+7-March-2011,727.00,729.00,703.40,713.95,714.094279771190847633,19230,1254,13732033.00,25.60,-13.05
+8-March-2011,710.00,729.80,708.00,713.25,718.070509069631363370,10254,649,7363095.00,21.80,3.25
+9-March-2011,716.35,729.00,714.80,722.65,720.835706729595573276,12289,650,8858350.00,14.20,6.30
+10-March-2011,721.00,743.75,714.35,740.45,735.087432337020475405,25494,1358,18740319.00,29.40,19.45
+11-March-2011,736.50,749.00,720.00,734.70,737.781985670419651995,37126,1755,27390894.00,29.00,-1.80
+14-March-2011,734.25,750.00,729.00,746.25,739.281111111111111111,24300,1224,17964531.00,21.00,12.00
+15-March-2011,735.00,741.00,720.00,735.15,732.141721854304635761,17365,969,12713641.00,21.00,0.15
+16-March-2011,741.90,757.95,730.20,738.85,748.194850585425525717,17167,965,12844261.00,27.75,-3.05
+17-March-2011,737.00,748.00,734.00,735.95,738.951590594744121715,9399,623,6945406.00,14.00,-1.05
+18-March-2011,735.95,755.00,735.00,749.40,746.912622789783889980,20360,993,15207141.00,20.00,13.45
+21-March-2011,750.00,768.85,740.00,742.10,755.058908501567839904,28383,1548,21430837.00,28.85,-7.90
+22-March-2011,744.00,749.85,743.10,745.35,746.006209777577057694,8857,420,6607377.00,6.75,1.35
+23-March-2011,742.60,753.55,740.50,750.20,748.828295474870327311,5591,388,4186699.00,13.05,7.60
+24-March-2011,768.00,790.00,760.00,772.95,770.257684771785565155,62266,2478,47960865.00,30.00,4.95
+25-March-2011,779.00,785.00,775.50,783.60,781.449893830364515748,33908,1354,26497403.00,9.50,4.60
+28-March-2011,789.00,803.00,780.05,790.75,789.876635695870019944,20557,984,16237494.00,22.95,1.75
+29-March-2011,817.40,825.00,784.00,811.65,801.781754846526655896,19808,1117,15881693.00,41.00,-5.75
+30-March-2011,815.00,829.70,806.00,812.20,821.224328998782909486,15611,844,12820133.00,23.70,-2.80
+31-March-2011,815.00,826.20,783.00,797.25,813.554294645284044577,14716,920,11972265.00,43.20,-17.75
+1-April-2011,797.25,800.00,775.00,783.15,785.381359051755871309,13583,709,10667835.00,25.00,-14.10
+4-April-2011,782.00,795.00,781.35,790.90,787.544000600690794413,13318,760,10488511.00,13.65,8.90
+5-April-2011,791.00,797.50,786.00,789.85,791.998687951016837961,9146,528,7243620.00,11.50,-1.15
+6-April-2011,794.00,817.00,792.00,801.40,805.080822095997918744,23063,1133,18567579.00,25.00,7.40
+7-April-2011,802.00,809.00,793.10,805.85,801.403110047846889952,23408,1053,18759244.00,15.90,3.85
+8-April-2011,807.00,812.45,790.00,804.05,802.872924986768031564,20783,970,16686108.00,22.45,-2.95
+11-April-2011,804.00,804.00,790.00,801.05,798.237533195632930067,13556,584,10820908.00,14.00,-2.95
+13-April-2011,790.10,808.80,788.40,805.90,801.728220451527224435,15060,862,12074027.00,20.40,15.80
+15-April-2011,805.25,822.40,796.20,814.15,811.709764340518816222,21896,1068,17773197.00,26.20,8.90
+18-April-2011,817.00,848.90,808.50,825.00,827.859636483398859636,48581,3440,40218249.00,40.40,8.00
+19-April-2011,820.95,824.90,814.20,817.85,819.429788140104333013,9393,626,7696904.00,10.70,-3.10
+20-April-2011,818.05,835.00,818.05,831.80,829.585165640574611550,13644,791,11318860.00,16.95,13.75
+21-April-2011,840.00,841.80,825.05,827.50,831.504339440694310511,7259,480,6035890.00,16.75,-12.50
+25-April-2011,785.20,838.70,745.00,827.40,829.650652671242093930,7431,500,6165134.00,93.70,42.20
+26-April-2011,827.40,840.00,824.50,835.55,833.252734548688305913,11245,732,9369927.00,15.50,8.15
+27-April-2011,836.00,855.00,833.10,853.05,847.486095879680401065,25532,1490,21638015.00,21.90,17.05
+28-April-2011,854.10,895.00,854.10,876.00,880.407079646017699115,69269,3487,60984918.00,40.90,21.90
+29-April-2011,876.00,885.00,850.00,857.90,868.271323677328505330,19884,1156,17264707.00,35.00,-18.10
+2-May-2011,858.00,863.00,844.60,857.30,852.564409475182054072,11947,702,10185587.00,18.40,-0.70
+3-May-2011,858.00,862.90,841.00,845.70,852.496740547588005215,9204,633,7846380.00,21.90,-12.30
+4-May-2011,838.10,876.80,834.00,866.40,853.910320562939796716,38370,2136,32764539.00,42.80,28.30
+5-May-2011,870.10,880.95,857.10,871.45,865.954437588989084005,29498,1267,25543924.00,23.85,1.35
+6-May-2011,872.05,895.00,862.00,890.60,878.874508618082854550,16535,990,14532190.00,33.00,18.55
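The two spread columns in this file appear to be derived from the price columns: Spread High-Low as High Price minus Low Price, and Spread Close-Open as Close Price minus Open Price. The Python sketch below checks that reading on the first data row, copied from the file with the leading diff marker removed. The relationship is inferred from the column names rather than documented in the commit, and the function names are hypothetical.

```python
import csv
import io
from decimal import Decimal

# Header and first data row from the added file, without the '+' diff marker.
SAMPLE = """Date,Open Price,High Price,Low Price,Close Price,WAP,No.of Shares,No. of Trades,Total Turnover (Rs.),Spread High-Low,Spread Close-Open
6-January-2011,800.00,807.00,788.10,798.65,797.968508552245963659,18767,754,14975475.00,18.90,-1.35
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text)))


def check_row(row):
    """Return True if both spread columns match the price columns."""
    # Decimal avoids binary floating-point rounding in the comparison.
    high_low = Decimal(row["High Price"]) - Decimal(row["Low Price"])
    close_open = Decimal(row["Close Price"]) - Decimal(row["Open Price"])
    return (high_low == Decimal(row["Spread High-Low"])
            and close_open == Decimal(row["Spread Close-Open"]))


if __name__ == "__main__":
    for row in load_rows(SAMPLE):
        print(row["Date"], check_row(row))
```

Note that the header names are not uniform ("No.of Shares" has no space, "No. of Trades" does), so lookups must use them exactly as written.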