Alternating least squares (ALS) solver for user/item factors. The current version only handles the case where one side is much larger than the other: either the number of users is much bigger than the number of items, or the number of items is much bigger than the number of users. For example, with a user size of 10000000 and an item size of 1000000, it can solve for the user factors using the ALS method.
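To make the idea concrete, here is a minimal sketch of the half-step that solves one user's factor while the item factors are held fixed. It is not Paracel's implementation: the function names (`solve_linear`, `solve_user_factor`), the dictionary layout, and the plain ridge penalty `lam * I` are assumptions chosen for illustration, and some ALS variants scale the penalty differently.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [a | b] so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def solve_user_factor(ratings, item_factors, lam):
    """Solve one user's factor with the item factors fixed.

    ratings: {item_id: rating} for a single user (hypothetical layout).
    item_factors: {item_id: list of k floats}.
    """
    k = len(next(iter(item_factors.values())))
    # Normal equations: (sum_i h_i h_i^T + lam * I) w = sum_i r_i h_i
    a = [[lam if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for item_id, r in ratings.items():
        h = item_factors[item_id]
        for i in range(k):
            b[i] += r * h[i]
            for j in range(k):
                a[i][j] += h[i] * h[j]
    return solve_linear(a, b)


# Toy data: two items with 2-dimensional factors, one user who rated both.
H = {"m1": [1.0, 0.0], "m2": [0.0, 1.0]}
w = solve_user_factor({"m1": 4.0, "m2": 2.0}, H, lam=0.1)
```

Because each user's system depends only on that user's ratings and the fixed item factors, the per-user solves are independent of one another, which is what makes this step easy to spread across many workers.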
- Enter Paracel's home directory
cd paracel;
- Generate dataset
python ./tool/datagen.py -m als -o ./data/
- Set up link library path:
export LD_LIBRARY_PATH=your_paracel_install_path/lib
- Create a JSON file named cfg.json; see the example in the Parameters section below.
- Run (100 workers, 20 servers, mesos mode in the following example)
./prun.py -w 100 -p 20 -c cfg.json -m mesos --ppn 10 --mem_limit 1000 your_paracel_install_path/bin/als
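If you launch the job from a script, the run command above can be assembled programmatically. The sketch below only reuses the flags that appear in that command; the helper name `build_prun_command` and its keyword arguments are hypothetical, and the mapping of `-w` to workers and `-p` to servers is inferred from the example's description.

```python
import shlex


def build_prun_command(install_path, workers=100, servers=20, cfg="cfg.json",
                       mode="mesos", ppn=10, mem_limit=1000):
    """Return the prun.py invocation as an argument list (hypothetical helper)."""
    return [
        "./prun.py",
        "-w", str(workers),        # number of workers
        "-p", str(servers),        # number of servers
        "-c", cfg,                 # JSON configuration file
        "-m", mode,                # run mode, e.g. mesos
        "--ppn", str(ppn),
        "--mem_limit", str(mem_limit),
        install_path + "/bin/als",
    ]


cmd = build_prun_command("your_paracel_install_path")
cmd_text = " ".join(shlex.quote(arg) for arg in cmd)
print(cmd_text)
```

The list form can be passed directly to `subprocess.run(cmd)` from Paracel's home directory, which avoids shell quoting problems if the install path contains spaces.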
Default parameters are set in a JSON-format file. For example, we create a cfg.json as below (modify your_paracel_install_path):
{
"rating_input" : "./data/als_rating.dat",
"factor_input" : "./data/als_H.dat",
"output" : "./als_result/",
"pattern" : "fmap",
"lambda" : 0.1
}
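The same configuration can be generated and sanity-checked from Python before a run. This is only a sketch: the `check_cfg` helper is hypothetical, and treating all five keys as required and `lambda` as a non-negative number are assumptions, not documented Paracel rules.

```python
import json

# Values copied from the example configuration above.
cfg = {
    "rating_input": "./data/als_rating.dat",
    "factor_input": "./data/als_H.dat",
    "output": "./als_result/",
    "pattern": "fmap",
    "lambda": 0.1,
}


def check_cfg(cfg):
    """Basic sanity checks (assumed rules, not Paracel's own validation)."""
    required = {"rating_input", "factor_input", "output", "pattern", "lambda"}
    missing = required - set(cfg)
    if missing:
        raise ValueError("missing keys: " + ", ".join(sorted(missing)))
    lam = cfg["lambda"]
    if isinstance(lam, bool) or not isinstance(lam, (int, float)) or lam < 0:
        raise ValueError("lambda should be a non-negative number")
    return cfg


cfg_text = json.dumps(check_cfg(cfg), indent=2)
print(cfg_text)
```

To save the result, write `cfg_text` to cfg.json, for example with `open("cfg.json", "w").write(cfg_text + "\n")`.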
als_rating.dat: small training dataset from Netflix movie rating data; each line is a triple of the form "user_id,movie_id,rating".
als_H.dat: movie factor input data
W_0: user factor output data
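For inspecting the rating file outside Paracel, the "user_id,movie_id,rating" lines can be read with a few lines of Python. The sketch below assumes plain comma-separated fields with no header row; the function names and the sample lines are made up for illustration, and the formats of als_H.dat and W_0 are not covered because they are not described here.

```python
def parse_rating_line(line):
    """Split one 'user_id,movie_id,rating' line into its three fields."""
    user_id, movie_id, rating = line.strip().split(",")
    return user_id, movie_id, float(rating)


def group_by_user(lines):
    """Collect ratings as {user_id: {movie_id: rating}}, skipping blank lines."""
    ratings = {}
    for line in lines:
        if not line.strip():
            continue
        user_id, movie_id, rating = parse_rating_line(line)
        ratings.setdefault(user_id, {})[movie_id] = rating
    return ratings


# Made-up sample lines in the same format as als_rating.dat.
sample = ["u1,m1,4.0", "u1,m2,2.0", "", "u2,m1,5"]
by_user = group_by_user(sample)
```

With a real file, pass the open file object instead of the sample list, for example `group_by_user(open("./data/als_rating.dat"))`.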