data2sparse
Chih-Ming Chen edited this page May 25, 2018
Training rating data [InputFile].train
9::userA::movieA::5
4::userA::movieB::10
5::userB::movieB::3
4::userC::movieA::8
8::userC::movieC::11
3::userD::movieA::2
Testing rating data [InputFile].test
4::userB::movieC::8
7::userD::movieC::11
- -task 'data2sparse' : convert data to sparse data format
- -infile [InputFile].train,[InputFile].test : input file names, split by ','
- -outfile [OutputFile].train,[OutputFile].test : output file names, split by ','
- -target 0 : get column 0 as prediction target
- -cat 1,2 : categorical encoding on columns 1,2
- -num 3 : numerical encoding on column 3
- -sep '::' : split data by '::'
- -header 0 : no header
python DataEncode.py -task 'data2sparse' -infile [InputFile].train,[InputFile].test -outfile [OutputFile].train,[OutputFile].test -target 0 -cat 1,2 -num 3 -sep '::' -header 0
Encoded training data [OutputFile].train
9 1:1 5:1 8:5
4 1:1 6:1 8:10
5 2:1 6:1 8:3
4 3:1 5:1 8:8
8 3:1 7:1 8:11
3 4:1 5:1 8:2
Encoded testing data [OutputFile].test
4 2:1 7:1 8:8
7 4:1 7:1 8:11
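
The mapping from the input rows to the encoded rows above can be reproduced with a short sketch. This is not the code of DataEncode.py; the function name `encode_sparse` and its parameters are hypothetical. It assumes that feature indices start at 1, that categorical values are numbered column by column in order of first appearance across all input files, and that each numerical column receives the next free index after the categorical ones. Under those assumptions it yields the same lines as the example output.

```python
def encode_sparse(datasets, target=0, cat=(1, 2), num=(3,), sep="::"):
    """Hypothetical sketch of a data2sparse-style encoding (not DataEncode.py itself).

    datasets: list of datasets, each a list of raw text lines.
    Returns one list of encoded lines per dataset.
    """
    rows_per_set = [[line.split(sep) for line in lines] for lines in datasets]

    # Assumption: categorical values are indexed column by column,
    # in order of first appearance over all datasets, starting at 1.
    cat_index = {}
    next_id = 1
    for col in cat:
        for rows in rows_per_set:
            for row in rows:
                key = (col, row[col])
                if key not in cat_index:
                    cat_index[key] = next_id
                    next_id += 1

    # Assumption: each numerical column takes the next free index.
    num_index = {}
    for col in num:
        num_index[col] = next_id
        next_id += 1

    encoded = []
    for rows in rows_per_set:
        out = []
        for row in rows:
            # Categorical features are one-hot ("index:1").
            feats = [f"{cat_index[(c, row[c])]}:1" for c in cat]
            # Numerical features keep their raw value ("index:value").
            feats += [f"{num_index[c]}:{row[c]}" for c in num]
            out.append(" ".join([row[target]] + feats))
        encoded.append(out)
    return encoded


train = [
    "9::userA::movieA::5",
    "4::userA::movieB::10",
    "5::userB::movieB::3",
    "4::userC::movieA::8",
    "8::userC::movieC::11",
    "3::userD::movieA::2",
]
test = [
    "4::userB::movieC::8",
    "7::userD::movieC::11",
]

enc_train, enc_test = encode_sparse([train, test])
for line in enc_train + enc_test:
    print(line)
```

Building the index over the training and testing files together keeps the feature numbering identical in both outputs, which is why userB maps to index 2 and movieC to index 7 in the encoded testing data as well.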