Temporary fixes to FinnWordNet 2.0
WNgrind-3.0-FiWN
data
Pipfile
Pipfile.lock
README.md
adjust-fiwn-offsets.py
fix-sense-index.sh
mk-cntlist.py
synset_map.tsv


FinnWordNet

This repository contains some changes/fixes to FinnWordNet.

The data directory contains the FiWN data files, and the WNgrind-3.0-FiWN directory contains the FiWN version of WNgrind.

Mapping/adjusting script

The adjust-fiwn-offsets.py script can either create a TSV that maps the false, English-based synset IDs to the true Finnish synset IDs, or apply that mapping to the TSV files in data. It requires pipenv.

Assuming you have put the original data in data, rather than the already mapped data included here, you can create the mapping TSV like so:

$ pipenv run python adjust-fiwn-offsets.py dump data synset_map.tsv

You can also rewrite the original data in place with the new offsets. This is the command that was run to bring the data in data to its current state:

$ pipenv run python adjust-fiwn-offsets.py fix data
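
To give a rough idea of what applying such a mapping involves, here is a minimal sketch. It assumes the map TSV has two tab-separated columns (the old English-based ID, then the Finnish ID). The real column layout of synset_map.tsv and the logic inside adjust-fiwn-offsets.py may well differ, and the function names and sample IDs below are hypothetical.

```python
import csv
import io


def load_synset_map(lines):
    """Read a two-column TSV (old_id <TAB> new_id) into a dict.

    The two-column layout is an assumption, not the documented format.
    """
    mapping = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        mapping[row[0]] = row[1]
    return mapping


def remap_column(rows, mapping, column=0):
    """Yield copies of rows with the ID in `column` replaced when mapped.

    IDs without an entry in the mapping are passed through unchanged.
    """
    for row in rows:
        row = list(row)
        row[column] = mapping.get(row[column], row[column])
        yield row


# Invented sample data, for illustration only.
sample_map = load_synset_map(io.StringIO("old-0001\tnew-0101\n"))
sample_rows = [["old-0001", "koira"], ["old-0002", "kissa"]]
remapped = list(remap_column(sample_rows, sample_map))
```

Passing unmapped IDs through unchanged is a design choice for the sketch; a stricter version could raise an error instead so that gaps in the mapping are noticed.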

Fake word count data script

You can create count data based on the counts in the English data like so:

$ pipenv run python mk-cntlist.py > data/dict/cntlist.rev
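
To inspect the generated file, a small reader along the following lines may help. It assumes the output follows the standard WordNet cntlist.rev layout of three whitespace-separated fields per line (sense key, sense number, tag count); whether mk-cntlist.py emits exactly this has not been checked here. The `CountEntry` type, the function name, and the sample line are hypothetical.

```python
from typing import NamedTuple


class CountEntry(NamedTuple):
    sense_key: str
    sense_number: int
    tag_count: int


def parse_cntlist_rev_line(line):
    """Parse one 'sense_key sense_number tag_cnt' line (assumed layout)."""
    sense_key, sense_number, tag_count = line.split()
    return CountEntry(sense_key, int(sense_number), int(tag_count))


# Invented sample line, for illustration only.
entry = parse_cntlist_rev_line("koira%1:05:00:: 1 3\n")
```

To read a whole file, the same function can be applied to each non-empty line of data/dict/cntlist.rev.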