Preprocessing scripts #5

basma-b · 2018-10-23T15:49:13Z

Could you please share the scripts you used to preprocess the data and to train the word embeddings?

xyzhou-puck · 2018-10-24T07:00:45Z

We directly used the dataset released with sequential-matching-network (Wu et al., 2017), so there is no preprocessing phase in our experiments.

We pre-trained the word embeddings using a word2vec toolkit written in C++. I can't readily find the script we used for this, but the following command may help you:

./bin/word2vec -train $train_dat -output "$train_dat.w2v" -debug 2 -size 200 -window 10 -sample 1e-4 -negative 25 -hs 0 -binary 1 -cbow 1 -min-count 1
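
Because the command sets `-binary 1`, the output file is binary rather than text. For readers who want to turn that file into a word-to-index map and a list of vectors, here is a minimal sketch. It is not the script used in the experiments. It assumes the usual layout written by the original C word2vec tool (an ASCII header `<vocab_size> <dim>`, then for each word its bytes, a space, `dim` float32 values, and a newline) and little-endian floats. The function name `load_word2vec_binary` and the returned `word2id` / `vectors` structures are hypothetical, chosen for illustration only.

```python
import struct


def load_word2vec_binary(path):
    """Read a word2vec `-binary 1` file into (word2id, vectors).

    Assumed layout: header "<vocab_size> <dim>\n", then per word:
    word bytes, one space, dim little-endian float32 values, newline.
    """
    word2id = {}
    vectors = []
    with open(path, "rb") as f:
        # ASCII header holding the vocabulary size and the vector dimension.
        vocab_size, dim = (int(x) for x in f.readline().split())
        for _ in range(vocab_size):
            # Collect the word byte by byte until the separating space.
            chars = bytearray()
            while True:
                ch = f.read(1)
                if not ch:
                    raise EOFError("file ended inside a word")
                if ch == b" ":
                    break
                if ch != b"\n":  # skip the newline that ends the previous entry
                    chars.extend(ch)
            raw = f.read(4 * dim)
            if len(raw) != 4 * dim:
                raise EOFError("file ended inside a vector")
            # Assumption: floats were written on a little-endian machine.
            vec = struct.unpack("<%df" % dim, raw)
            word2id[chars.decode("utf-8", errors="replace")] = len(vectors)
            vectors.append(list(vec))
    return word2id, vectors
```

With the settings in the command above, `dim` would be 200 (from `-size 200`), and the file to load would be the one named by `-output`. Whether the repository's own embedding files use this word-to-index layout is not stated in this thread.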

basma-b closed this as completed Jan 8, 2019

yangliuy mentioned this issue Mar 5, 2019

How did you generate the input data files like data.pkl, word2id and word_embedding.pkl ? #29

Open
