nzw0301/pytorch_skipgram

pytorch_skipgram

Skip-gram implementation with PyTorch. This repo supports two loss functions: negative sampling and noise contrastive estimation.
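As a rough illustration of how the two losses differ, here is a dependency-free sketch for a single (center, context) pair with `k` sampled negatives. It uses the textbook formulations, not code from this repo; the function names and the `p_context` / `p_negatives` arguments (noise probabilities of each word) are hypothetical.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def negative_sampling_loss(center, context, negatives):
    """-log s(u_ctx . v_c) - sum_n log s(-u_n . v_c) for one training pair."""
    loss = -log_sigmoid(dot(center, context))
    for neg in negatives:
        loss -= log_sigmoid(-dot(center, neg))
    return loss


def nce_loss(center, context, negatives, p_context, p_negatives):
    """NCE variant: every score is shifted by log(k * p_noise(word))."""
    k = len(negatives)
    loss = -log_sigmoid(dot(center, context) - math.log(k * p_context))
    for neg, p in zip(negatives, p_negatives):
        loss -= log_sigmoid(-(dot(center, neg) - math.log(k * p)))
    return loss


# Hypothetical toy embeddings and noise probabilities.
center = [0.5, -0.2]
context = [0.4, 0.1]
negatives = [[-0.3, 0.2], [0.1, -0.6]]
print(negative_sampling_loss(center, context, negatives))
print(nce_loss(center, context, negatives, p_context=0.01, p_negatives=[0.02, 0.005]))
```

In this formulation the only difference is the `log(k * p_noise)` correction: when `k * p_noise` equals 1 for every word, the two losses coincide.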

Requirements

  • PyTorch >= 1.0
  • numpy
  • hydra

Parameters

See conf/config.yaml.

Default parameters are as follows:

$ python -m pytorch_skipgram.main --cfg job

dataset:
  input_path: ../../../data/text8
  outout_file_name: text8.vec
experiments:
  gpu_id: -1
  seed: 7
parameters:
  batch: 512
  dim: 100
  epochs: 7
  loss: neg
  lr: 0.025
  lr_update_rate: 1000
  min_count: 5
  negative: 5
  noise: 0.75
  samples: 0.001
  window: 5
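Several of these keys resemble word2vec conventions. The sketch below assumes that `noise: 0.75` is the exponent applied to unigram counts when building the negative-sampling distribution, and that `samples: 0.001` is the subsampling threshold for frequent words. Both readings are assumptions that the README does not confirm, and the function names are hypothetical.

```python
import math


def noise_distribution(counts, power=0.75):
    """Normalize count**power into a sampling distribution over words."""
    weights = {w: c ** power for w, c in counts.items()}
    total = sum(weights.values())
    return {w: x / total for w, x in weights.items()}


def keep_probability(counts, threshold=0.001):
    """Probability of keeping a token: min(1, sqrt(threshold / frequency))."""
    total = sum(counts.values())
    return {w: min(1.0, math.sqrt(threshold * total / c)) for w, c in counts.items()}


# Hypothetical toy counts, not taken from text8.
counts = {"the": 90_000, "king": 900, "canute": 10}
print(noise_distribution(counts))
print(keep_probability(counts))
```

Under these assumptions, an exponent below 1 flattens the distribution so that rare words are drawn as negatives more often than their raw frequency suggests, while subsampling drops most occurrences of very frequent words such as "the". The keep probability here follows the formula from the original word2vec paper; the reference C implementation uses a slightly different expression.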

Run

Download the two datasets, text8 and ptb (Penn Treebank), then train:

sh getdata.sh
python -m pytorch_skipgram.main # train on text8
python -m pytorch_skipgram.main dataset=ptb # train on Penn Treebank

Similarity task

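from gensim.models import KeyedVectors  # assumption: `model` is a gensim KeyedVectors
model = KeyedVectors.load_word2vec_format("text8.vec")  # assumption: word2vec text format; adjust the path
for w, s in model.most_similar(positive=["king"], topn=10):
    print(w, s)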

canute 0.7516068816184998
sweyn 0.7161520719528198
haakon 0.715397298336029
plantagenet 0.7071711421012878
kings 0.7037447094917297
valdemar 0.703365683555603
omri 0.699432373046875
capet 0.6928986310958862
conqueror 0.6921138763427734
eochaid 0.690447986125946
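The scores above look like cosine similarities between the query word's vector and every other word vector. A minimal dependency-free sketch of that ranking, using hypothetical toy vectors and a hypothetical `nearest` helper (not part of this repo):

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(vectors, query, topn=10):
    """Return (word, cosine) pairs for the words closest to `query`, excluding it."""
    q = vectors[query]
    scored = [(w, cosine(q, v)) for w, v in vectors.items() if w != query]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Hypothetical 3-dimensional toy vectors; trained vectors have `dim` entries.
toy = {
    "king": [0.9, 0.8, 0.1],
    "kings": [0.8, 0.9, 0.2],
    "apple": [0.1, -0.2, 0.9],
}
for w, s in nearest(toy, "king", topn=2):
    print(w, s)
```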

Analogical task

for w, s in model.most_similar(positive=["king", "woman"], negative=["man"], topn=10):
    print(w, s)

queen 0.649447500705719
daughter 0.6051150560379028
anjou 0.6023151874542236
consort 0.595568060874939
son 0.5846152305603027
marries 0.5731959342956543
aquitaine 0.5700898170471191
isabella 0.568467378616333
infanta 0.5641375780105591
princess 0.5628763437271118
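The analogy query can be read as vector arithmetic: add the unit-normalized `positive` vectors, subtract the `negative` ones, and rank the remaining words by cosine similarity to the result. The sketch below illustrates that idea with hypothetical toy vectors; it approximates what a gensim-style `most_similar` does and is not code from this repo.

```python
import math


def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def analogy(vectors, positive, negative, topn=10):
    """Rank words by cosine similarity to sum(positive) - sum(negative)."""
    dim = len(next(iter(vectors.values())))
    query = [0.0] * dim
    signed = [(w, 1.0) for w in positive] + [(w, -1.0) for w in negative]
    for word, sign in signed:
        for i, x in enumerate(unit(vectors[word])):
            query[i] += sign * x
    query = unit(query)
    skip = set(positive) | set(negative)  # query words are never returned
    scored = [
        (w, sum(a * b for a, b in zip(query, unit(v))))
        for w, v in vectors.items()
        if w not in skip
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Hypothetical 2-dimensional toy vectors: axis 0 ~ "royalty", axis 1 ~ "gender".
toy = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.1],
}
for w, s in analogy(toy, positive=["king", "woman"], negative=["man"], topn=2):
    print(w, s)
```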
