Add entries from wordnet #7

Open
fcbond opened this issue May 13, 2024 · 10 comments

@fcbond
Collaborator

fcbond commented May 13, 2024

Format the WordNet entries as TSV, append them to the end of the spreadsheet, and add a WordNet field.
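A minimal sketch of the TSV formatting step, assuming each WordNet entry is held as a dict. The column names follow the sample rows posted later in this thread; the function name `entries_to_tsv` and the dict layout are hypothetical and are not taken from the project's script.

```python
import csv
import io

# Column order taken from the sample rows posted in this thread.
COLUMNS = ["Synset", "word", "POS", "def", "exe"]


def entries_to_tsv(entries):
    """Serialise a list of entry dicts to tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for entry in entries:
        # Missing keys are written as empty cells (DictWriter's default restval).
        writer.writerow(entry)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {
            "Synset": "80000987-n",
            "word": "attap",
            "POS": "N",
            "def": "Thatch made from palm fronds",
            "exe": "My grandfather's kampung house has an attap roof",
        }
    ]
    print(entries_to_tsv(sample), end="")
```

The resulting text can be pasted into a spreadsheet, which will split it into columns on the tabs.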

fcbond self-assigned this May 13, 2024
@fcbond
Collaborator Author

fcbond commented May 14, 2024

Hi,

I added an extra sheet with information from WordNet. There are no new words, but I think the definitions are pretty good and could be used as the final description (for now).

| Synset | word | POS | def | exe |
|---|---|---|---|---|
| 80000987-n | attap | N | Thatch made from palm fronds | My grandfather's kampung house has an attap roof |
| 80001021-n | attap chee | N | The edible immature fruit of the nipa palm | I like plenty of attap chee in my ice kachang |

My spreadsheet-fu is weak, so I didn't merge the sheets, but the entries should be on the same lines as in the lexicon. There are exactly 100 matches.

We also have 36 matches that are polysemous; I have not tried to match them.
AAA polysemous in WN 2 senses
action polysemous in WN 13 senses
ah polysemous in WN 5 senses
alphabet polysemous in WN 2 senses
amah polysemous in WN 2 senses
anyhow polysemous in WN 2 senses
arrow polysemous in WN 2 senses
auntie polysemous in WN 2 senses
banana polysemous in WN 2 senses
bang polysemous in WN 12 senses
barley polysemous in WN 2 senses
basket polysemous in WN 4 senses
bioscope polysemous in WN 2 senses
blank polysemous in WN 8 senses
blur polysemous in WN 7 senses
blur polysemous in WN 7 senses
bubble tea polysemous in WN 2 senses
bungalow polysemous in WN 2 senses
carry polysemous in WN 41 senses
cartoon polysemous in WN 3 senses
Caucasian polysemous in WN 4 senses
Chingay polysemous in WN 2 senses
chop polysemous in WN 11 senses
chop polysemous in WN 11 senses
chum polysemous in WN 3 senses
clown polysemous in WN 3 senses
cock polysemous in WN 8 senses
coconut polysemous in WN 3 senses
coffee shop polysemous in WN 2 senses
college polysemous in WN 3 senses
commando polysemous in WN 2 senses
composition polysemous in WN 9 senses
confine polysemous in WN 6 senses
confinement polysemous in WN 4 senses
cooling polysemous in WN 2 senses
cowboy polysemous in WN 3 senses
crab polysemous in WN 9 senses
crayfish polysemous in WN 4 senses
dirty polysemous in WN 13 senses
double polysemous in WN 21 senses
drop polysemous in WN 33 senses
dry polysemous in WN 19 senses
elephant polysemous in WN 2 senses
ex polysemous in WN 5 senses
exercise polysemous in WN 10 senses
face polysemous in WN 22 senses
fill up polysemous in WN 4 senses
fuck polysemous in WN 4 senses
G polysemous in WN 7 senses
ganja polysemous in WN 2 senses
gravy polysemous in WN 3 senses
green bean polysemous in WN 2 senses
hah polysemous in WN 2 senses
halal polysemous in WN 3 senses
hammer polysemous in WN 10 senses
Hari Raya Haji polysemous in WN 2 senses
Hari Raya Haji polysemous in WN 2 senses
hawker polysemous in WN 2 senses
helper polysemous in WN 2 senses
hex polysemous in WN 3 senses
Hokkien polysemous in WN 3 senses
horn polysemous in WN 12 senses
itchy polysemous in WN 2 senses
jut polysemous in WN 3 senses
kaki polysemous in WN 2 senses
kin polysemous in WN 3 senses
king polysemous in WN 7 senses
level polysemous in WN 19 senses
level polysemous in WN 19 senses
licence polysemous in WN 4 senses
lorry polysemous in WN 2 senses
market polysemous in WN 9 senses
mood polysemous in WN 3 senses
MRT polysemous in WN 2 senses
mug polysemous in WN 5 senses
mum polysemous in WN 4 senses
Nyonya polysemous in WN 2 senses
OB polysemous in WN 3 senses
off polysemous in WN 10 senses
on polysemous in WN 5 senses
one polysemous in WN 9 senses
one polysemous in WN 9 senses
orgy polysemous in WN 4 senses
pasang polysemous in WN 2 senses
pass up polysemous in WN 2 senses
pat polysemous in WN 7 senses
polytechnic in WN but not Singlish 0 senses
Pongal polysemous in WN 2 senses
power polysemous in WN 10 senses
promote polysemous in WN 5 senses
pulasan polysemous in WN 2 senses
pump polysemous in WN 11 senses
range polysemous in WN 17 senses
reach polysemous in WN 13 senses
runner polysemous in WN 10 senses
S'pore polysemous in WN 2 senses
scholar polysemous in WN 3 senses
screw polysemous in WN 10 senses
sergeant major polysemous in WN 2 senses
shake polysemous in WN 15 senses
shilling polysemous in WN 6 senses
Singapore polysemous in WN 3 senses
sinus polysemous in WN 3 senses
slang polysemous in WN 5 senses
slime polysemous in WN 2 senses
slipper polysemous in WN 2 senses
smoke polysemous in WN 10 senses
snake polysemous in WN 6 senses
solid polysemous in WN 18 senses
standard polysemous in WN 11 senses
steam polysemous in WN 7 senses
stone polysemous in WN 10 senses
sup polysemous in WN 3 senses
tackle polysemous in WN 8 senses
take polysemous in WN 46 senses
talc polysemous in WN 2 senses
tank polysemous in WN 8 senses
Taoism polysemous in WN 4 senses
Teochew polysemous in WN 3 senses
terror polysemous in WN 4 senses
truck polysemous in WN 3 senses
tuition polysemous in WN 2 senses
uncle polysemous in WN 3 senses
uncle polysemous in WN 3 senses
wanton polysemous in WN 12 senses
wash polysemous in WN 21 senses
wet polysemous in WN 9 senses
whack polysemous in WN 3 senses
what polysemous in WN 4 senses
wind polysemous in WN 15 senses
wonton polysemous in WN 2 senses
yam polysemous in WN 4 senses
zap polysemous in WN 5 senses
zap polysemous in WN 5 senses
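One way to produce a split like the list above, sketched under the assumption that the number of WordNet senses for each headword has already been looked up and stored in a dict. The thread does not show the script that generated the list, so the function name `classify_by_sense_count` and the three-way split are hypothetical.

```python
def classify_by_sense_count(words, sense_counts):
    """Split words into unambiguous matches, polysemous matches, and misses.

    sense_counts maps a word to the number of WordNet senses found for it;
    words missing from the mapping are treated as having zero senses.
    """
    matched, polysemous, missing = [], [], []
    for word in words:
        count = sense_counts.get(word, 0)
        if count == 1:
            matched.append(word)  # exactly one sense: safe to link directly
        elif count > 1:
            polysemous.append((word, count))  # needs manual sense selection
        else:
            missing.append(word)  # not found in WordNet
    return matched, polysemous, missing


if __name__ == "__main__":
    # "banana" and "carry" use the counts listed above; the count for
    # "attap" is an assumed value for illustration only.
    counts = {"attap": 1, "banana": 2, "carry": 41}
    print(classify_by_sense_count(["attap", "banana", "carry", "shiok"], counts))
```

Keeping the sense count alongside each polysemous word makes it easy to sort the manual work, for example by starting with the two-sense words.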

@fcbond
Collaborator Author

fcbond commented May 14, 2024

@siewyeng or @changukshin would you be able to merge them?

@siewyeng
Collaborator

I'll add the definitions you listed on the sheet to the existing descriptions in column B now, but I don't know how to add the polysemous information.

@fcbond
Collaborator Author

fcbond commented May 14, 2024 via email

@fcbond
Collaborator Author

fcbond commented May 14, 2024

Sorry, that was a mis-paste.

I am referring to the data on the sheet called 'WN Data'.

The two sheets should be aligned row for row. For example, the last entry, You tiao, is on line 1767 in both the WN Data sheet and the lexicon.
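The alignment described above can be checked before merging. This is a minimal sketch, assuming the headword column of each sheet has been exported as a list of strings in row order; the function name `find_misaligned_rows` and the case-insensitive comparison are assumptions, not part of the existing script.

```python
from itertools import zip_longest


def find_misaligned_rows(lexicon_words, wn_words, first_row=1):
    """Return (row_number, lexicon_word, wn_word) for rows whose headwords differ.

    Row numbers start at first_row so they can match spreadsheet line numbers.
    Rows with a blank WN cell are skipped, since not every lexicon entry
    has WordNet data.
    """
    problems = []
    # zip_longest pads the shorter list so a length mismatch is also reported.
    rows = zip_longest(lexicon_words, wn_words, fillvalue="")
    for row_number, (lex, wn) in enumerate(rows, start=first_row):
        lex, wn = lex.strip(), wn.strip()
        if not wn:
            continue  # no WordNet data on this row
        if lex.casefold() != wn.casefold():
            problems.append((row_number, lex, wn))
    return problems


if __name__ == "__main__":
    lexicon = ["attap", "attap chee", "You tiao"]
    wn_data = ["attap", "", "You tiao"]
    print(find_misaligned_rows(lexicon, wn_data))
```

An empty result means every row that carries WordNet data sits next to the same headword in the lexicon.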

@siewyeng
Collaborator

Okay, done :D @changukshin I renamed the original Lexicon sheet to Lexicon-original, and the new sheet, which draws from both Lexicon-original and WN Data, is named Lexicon. I hope this means the Python script does not have to be changed at all!

@fcbond
Collaborator Author

fcbond commented May 14, 2024 via email

@changukshin
Collaborator

@siewyeng Actually, I am wondering if we have any reason to maintain two sheets for that.

I think we could get rid of the 'Lexicon-original' sheet without any harm.

@siewyeng
Collaborator

@changukshin True! Yes, we can. But right now the cells are filled with formulas that take their value from either the Lexicon-original sheet or WN Data. If the content is copied over, then I think it'll all be okay.
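If it helps, the same merge could be done outside the spreadsheet so the result holds plain values rather than formulas. This is only a sketch: it assumes both sheets are exported as row-aligned lists of dicts, that the lexicon's description column is called `description` (a hypothetical name), and that a non-empty WordNet `def` should replace the original description. The actual formula in the sheet may follow a different rule.

```python
def merge_rows(original_rows, wn_rows):
    """Return new rows where a non-empty WordNet definition replaces the description.

    Both inputs must be row-aligned lists of dicts with the same length.
    """
    if len(original_rows) != len(wn_rows):
        raise ValueError("sheets are not the same length; check alignment first")
    merged = []
    for original, wn in zip(original_rows, wn_rows):
        row = dict(original)  # copy so the original sheet data is untouched
        wn_def = wn.get("def", "").strip()
        if wn_def:
            row["description"] = wn_def  # assumed rule: WordNet definition wins
        merged.append(row)
    return merged


if __name__ == "__main__":
    originals = [
        {"word": "attap", "description": "old description"},
        {"word": "blur", "description": "confused"},
    ]
    wn_data = [{"def": "Thatch made from palm fronds"}, {"def": ""}]
    for row in merge_rows(originals, wn_data):
        print(row)
```

Because the output contains only values, the Lexicon-original sheet would no longer be needed once the merged rows are written back.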

@changukshin
Collaborator

@siewyeng Oh this issue is partially related to #4. Would you please check that issue as well?
