The official code for PNMTF-2D.
- dataset
- model
- code
- experiment.ini
- dataset.ini
Dataset dir: dataset_name/processed/, with the pre-processed files text.pkl and label.pkl.
- text.pkl: the pre-processed documents
- label.pkl: the corresponding label of each document (a flat list such as [2,3,4,1,1,...,2,3,4])
The files tfidf.pkl and vocab.pkl will be generated by our code.
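The file layout above can be sketched as follows. This is a minimal illustration, assuming text.pkl holds a Python list of document strings and label.pkl holds a list of integer labels of the same length; the exact object types expected by the code are not stated here, and the helper names (save_dataset, load_dataset, round_trip_demo) and the toy_dataset directory are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def save_dataset(processed_dir, documents, labels):
    """Write text.pkl and label.pkl into processed_dir (assumed list-based format)."""
    if len(documents) != len(labels):
        raise ValueError("each document needs exactly one label")
    processed_dir = Path(processed_dir)
    processed_dir.mkdir(parents=True, exist_ok=True)
    with open(processed_dir / "text.pkl", "wb") as f:
        pickle.dump(list(documents), f)
    with open(processed_dir / "label.pkl", "wb") as f:
        pickle.dump(list(labels), f)


def load_dataset(processed_dir):
    """Read text.pkl and label.pkl back from processed_dir."""
    processed_dir = Path(processed_dir)
    with open(processed_dir / "text.pkl", "rb") as f:
        documents = pickle.load(f)
    with open(processed_dir / "label.pkl", "rb") as f:
        labels = pickle.load(f)
    return documents, labels


def round_trip_demo():
    """Save a toy dataset under <tmp>/toy_dataset/processed/ and load it again."""
    documents = ["first toy document", "second toy document", "third toy document"]
    labels = [2, 3, 4]
    with tempfile.TemporaryDirectory() as root:
        processed = Path(root) / "toy_dataset" / "processed"
        save_dataset(processed, documents, labels)
        return load_dataset(processed)
```

The length check reflects the one-label-per-document correspondence described above; any further pre-processing of the documents is outside this sketch.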
- experiment.ini: the parameter settings
- dataset.ini: the required information for each dataset
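As a rough illustration of how an .ini file of this kind can be read, the sketch below uses Python's standard configparser. It assumes that the value passed as exp_ini names a section of experiment.ini; that mapping, the max_iter key, and the load_section helper are all hypothetical, since the real file contents are not shown here.

```python
import configparser

# Hypothetical file contents: the real keys in experiment.ini are not documented here.
EXAMPLE_INI = """
[super1-4_PNMTF-2D-V1_CLASSIC4]
max_iter = 100
"""


def load_section(ini_text, section):
    """Return one section of an .ini document as a plain dict of strings."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError(f"section {section!r} not found")
    return dict(parser[section])


settings = load_section(EXAMPLE_INI, "super1-4_PNMTF-2D-V1_CLASSIC4")
```

Note that configparser returns every value as a string, so numeric settings need an explicit conversion such as int(settings["max_iter"]).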
Requirement: mpi4py, installed with python -m pip install mpi4py.
Recommended Python version: 3.8.5. Unexpected problems may occur with newer versions.
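A quick way to check these two requirements before launching a run is sketched below. The check_environment helper is hypothetical and not part of the repository; it only reports whether the interpreter is newer than the recommended 3.8.5 and whether mpi4py can be found.

```python
import importlib.util
import sys

RECOMMENDED = (3, 8, 5)  # version recommended above


def check_environment(version_info=sys.version_info):
    """Return a list of human-readable warnings; an empty list means no issue found."""
    problems = []
    current = tuple(version_info[:3])
    if current > RECOMMENDED:
        problems.append(
            "Python %d.%d.%d is newer than the recommended 3.8.5" % current
        )
    if importlib.util.find_spec("mpi4py") is None:
        problems.append("mpi4py is not installed: python -m pip install mpi4py")
    return problems


if __name__ == "__main__":
    for message in check_environment():
        print("warning:", message)
```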
- running command:
mpiexec -n n_threads python PNMTF-2D-V1.py --data_name classic4 --exp_ini super1-4_PNMTF-2D-V1_CLASSIC4 --pr n_rows --pc n_cols
- main parameters:
  - data_name, exp_ini: specify the dataset and the parameter settings
  - pc, pr: the numbers of column and row threads, with pr * pc = p, where p is the total number of threads (n_threads)
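The constraint pr * pc = p describes a two-dimensional grid of threads. The sketch below shows one common way to map a flat rank onto such a grid and to split a dimension into contiguous blocks. It assumes a row-major rank layout and a balanced block split; the layout actually used by PNMTF-2D-V1.py may differ, and the function names are hypothetical.

```python
def grid_coords(rank, pr, pc):
    """Map a flat rank in [0, pr * pc) to (row, col), assuming row-major order."""
    if not 0 <= rank < pr * pc:
        raise ValueError("rank must lie in [0, pr * pc)")
    return divmod(rank, pc)


def block_bounds(n_items, n_parts, index):
    """Return the half-open [start, stop) range owned by block `index`.

    Items are split as evenly as possible; the first n_items % n_parts
    blocks receive one extra item.
    """
    base, extra = divmod(n_items, n_parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop
```

With mpi4py, the rank would typically come from MPI.COMM_WORLD.Get_rank(), and the number of launched processes must equal pr * pc for every grid cell to be covered.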
- command on Tianhe-2:
yhrun -N 8 -n 192 -p bigdata python3 -u ./PNMTF-2D-V1.py --data_name classic4 --exp_ini super1-4_PNMTF-2D-V1_CLASSIC4 --pr 32 --pc 6
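- main parameters:
  - data_name, exp_ini: specify the dataset and the parameter settings
  - pc, pr: the numbers of column and row threads, with pr * pc = p (here 32 * 6 = 192, the value passed to -n)
  - -N: number of nodes
  - -n: number of threads
  - -p: computation region (partition)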
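Because -n must agree with pr * pc, it can help to derive the command from the grid size rather than typing both by hand. The helper below is a hypothetical convenience, not part of the repository; it uses only the flags shown in the Tianhe-2 command above and reproduces that command for pr = 32, pc = 6.

```python
def yhrun_command(nodes, pr, pc, data_name, exp_ini,
                  partition="bigdata", script="./PNMTF-2D-V1.py"):
    """Build the yhrun argument list, deriving -n from the pr x pc grid."""
    n_threads = pr * pc  # total threads must match the grid size
    return [
        "yhrun", "-N", str(nodes), "-n", str(n_threads), "-p", partition,
        "python3", "-u", script,
        "--data_name", data_name,
        "--exp_ini", exp_ini,
        "--pr", str(pr), "--pc", str(pc),
    ]


command = " ".join(
    yhrun_command(8, 32, 6, "classic4", "super1-4_PNMTF-2D-V1_CLASSIC4")
)
```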