This repository has been archived by the owner on Jan 3, 2024. It is now read-only.

Add Chinese pronunciations #85

Open
wants to merge 1 commit into
base: master

claw89 commented May 1, 2021

Fixes #81

The pronunciation_div_classes list in the function parse_pronunciations was missing the div class used on Chinese entries: "zhpron".
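
The effect of the missing class can be illustrated with a minimal, standard-library-only sketch. It is not the library's actual code: WiktionaryParser uses its own parsing logic, and the initial contents of the class list below, as well as the helper names DivTextCollector and collect, are assumptions made for illustration. Only "zhpron" comes from this pull request.

```python
from html.parser import HTMLParser

# Assumed starting list; the real pronunciation_div_classes may differ.
PRONUNCIATION_DIV_CLASSES = {"mw-collapsible", "vsSwitcher"}


class DivTextCollector(HTMLParser):
    """Collect the text of each outermost <div> whose class is allowed."""

    def __init__(self, allowed_classes):
        super().__init__()
        self.allowed = set(allowed_classes)
        self.depth = 0      # div nesting depth inside a matching div
        self.chunks = []    # text pieces of the div currently being read
        self.results = []   # one joined string per matching div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = set((dict(attrs).get("class") or "").split())
        if self.depth:
            self.depth += 1          # nested div inside a match
        elif classes & self.allowed:
            self.depth = 1           # entering a matching div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:      # matching div closed
                self.results.append(" ".join(self.chunks))
                self.chunks = []

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def collect(html, allowed_classes):
    """Hypothetical helper: return the text of all allowed divs in html."""
    parser = DivTextCollector(allowed_classes)
    parser.feed(html)
    return parser.results


html = '<div class="zhpron"><span>Jyutping: dak</span></div>'
before = collect(html, PRONUNCIATION_DIV_CLASSES)             # class not listed
after = collect(html, PRONUNCIATION_DIV_CLASSES | {"zhpron"})  # class added
print(before, after)
```

With the class absent from the list, the Chinese pronunciation div is skipped entirely; once "zhpron" is included, its text is collected.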

For Chinese entries, pronunciation div uses the class 'zhpron'
@kevinsung

I've tested that this works. @suyash458 can you consider merging this after @claw89 fixes the conflicts?

@kevinsung

Actually, upon further testing I have found an issue: not all pronunciations are fetched. As an example, consider the query word 得. It has three definitions, each with a different pronunciation. All three definitions are fetched, but only one of the pronunciations is. Ideally, all pronunciations would be fetched and each matched with its corresponding definition.

Example:

from wiktionaryparser import WiktionaryParser

parser = WiktionaryParser()
result = parser.fetch("得", "chinese")

for word in result:
    print("DEFINITIONS:")
    print()
    for definition in word["definitions"]:
        print(definition)
        print()
    print("PRONUNCIATIONS:")
    print()
    for key, val in word["pronunciations"].items():
        print(key, val)
        print()

Output:

DEFINITIONS:

{'partOfSpeech': '', 'text': ['得', 'to get; to obtain; to gain, to acquire', 'to contract (disease); to become ill with', 'to result in; to produce', 'to be ready; finished', 'to suit; to fit', 'fit; proper', 'satisfied; contented', '(formal, often used in the negative) can; may; to be permitted', '(Cantonese) to only have; to just have', 'interjective particle expressing approval or prohibition; see 得了', 'interjective particle expressing frustration or helplessness', '(Cantonese) OK; good', '(Cantonese, often sarcastic) remarkable'], 'relatedWords': [], 'examples': ['他得了個壞名聲。 / 他得了个坏名声。\xa0 ―\xa0 Tā dé le ge huài míngshēng.\xa0 ―\xa0 He gained a bad reputation.', 'From: Article 8, Constitution of the Republic of China ', 'Fēi yī fǎdìng chéngxù zhī dàibǔ, jūjìn, shěnwèn, chùfá, dé jùjué zhī. [Pinyin]', 'Any arrest, detention, trial, or punishment which is not in accordance with the procedure prescribed by law can be resisted.', '得佢想去公園咋。 [Cantonese, trad.]得佢想去公园咋。 [Cantonese, simp.]dak1 keoi5 soeng2 heoi3 gung1 jyun4-2 zaa3. [Jyutping]Only he wants to go to the park.', 'dak1 keoi5 soeng2 heoi3 gung1 jyun4-2 zaa3. [Jyutping]', 'Only he wants to go to the park.', "得返一分鐘。 / 得返一分钟。 [Cantonese]\xa0 ―\xa0 dak1 faan1 jat1 fan1 zung1. [Jyutping]\xa0 ―\xa0 There's just one minute left.", "得了,別再說了。 / 得了,别再说了。\xa0 ―\xa0 Dé le, bié zài shuō le.\xa0 ―\xa0 OK! OK! That's enough.", '得喇,知道喇。 [Cantonese]\xa0 ―\xa0 dak1 laa3, zi1 dou3 laa3. [Jyutping]\xa0 ―\xa0 OK! Got it.', 'From: 2000, 梁左 and 梁欢, 《闲人马大姐》, episode 7', 'Wǒ yǒu ge biǎozhínür, jīnnián èrshíliù, tán le hǎo jǐ ge duìxiàng le dōu méi tán chéng. Tā mā a, yīzhí tuō wǒ bāngmáng ne. Dé, béng biérén le, jiù shi tā le. [Pinyin]', "I've got a niece who's 26 years old this year, who's already dated several guys but none worked out. Her mum's been nagging me to find a guy for her. Alright, we'll set the two up then. 
There's no need to find someone else.", 'Synonym: 行 ', "你噉做真係唔得。 [Cantonese, trad.]你噉做真系唔得。 [Cantonese, simp.]nei5 gam2 zou6 zan1 hai6 m4 dak1. [Jyutping]You really mustn't do it this way.", 'nei5 gam2 zou6 zan1 hai6 m4 dak1. [Jyutping]', "You really mustn't do it this way.", 'Synonym: 行 ', '你哋真係得。 / 你哋真系得。 [Cantonese]\xa0 ―\xa0 nei5 dei6 zan1 hai6 dak1. [Jyutping]\xa0 ―\xa0 You guys are really something.']}

{'partOfSpeech': '', 'text': ['得', 'Used after a verb or an adjective and before a degree complement.', 'Used after a verb to express possibility or capability.'], 'relatedWords': [], 'examples': ['alt. forms: 的 historical or nonstandard', '好得很\xa0 ―\xa0 hǎo de hěn\xa0 ―\xa0 very good', "他痛得直哭。\xa0 ―\xa0 Tā tòng de zhí kū.\xa0 ―\xa0 He is in so much pain that he won't stop crying.", '他跑得快。\xa0 ―\xa0 Tā pǎo de kuài.\xa0 ―\xa0 He runs fast.', '他跑得像一陣風。 / 他跑得像一阵风。\xa0 ―\xa0 Tā pǎo de xiàng yī zhèn fēng.\xa0 ―\xa0 He runs like wind.', '他畫得好。 / 他画得好。\xa0 ―\xa0 Tā huà de hǎo.\xa0 ―\xa0 He paints well.', '呢啲嘢我哋見得多。 [Cantonese, trad.]呢啲嘢我哋见得多。 [Cantonese, simp.]ni1 di1 je5 ngo5 dei6 gin3 dak1 do1. [Jyutping]This sort of thing is nothing new .', 'ni1 di1 je5 ngo5 dei6 gin3 dak1 do1. [Jyutping]', 'This sort of thing is nothing new .', 'alt. forms: 的 historical or nonstandard', '吃得\xa0 ―\xa0 chī de\xa0 ―\xa0 eatable, edible', '看得見 / 看得见\xa0 ―\xa0 kàn de jiàn\xa0 ―\xa0 able to see', '做不得\xa0 ―\xa0 zuò bù de\xa0 ―\xa0 must not be done', '這雙鞋穿得。 / 这双鞋穿得。\xa0 ―\xa0 Zhè shuāng xié chuān de.\xa0 ―\xa0 These shoes fit well.', "這個人批評不得。 / 这个人批评不得。\xa0 ―\xa0 Zhège rén pīpíng bù de.\xa0 ―\xa0 He's not a man to criticize.", "From: 《氹氹轉,菊花園》, traditional children's song", 'maai6 dak1 gei2 do1 cin4-2? [Jyutping]', 'How much can it sell for?']}

{'partOfSpeech': '', 'text': ['得', '(colloquial) to need (something)', '(colloquial) must; to have to', '(colloquial) (almost certainly) will'], 'relatedWords': [], 'examples': ['這份表格得多少時間才能填完? [MSC, trad.]这份表格得多少时间才能填完? [MSC, simp.]Zhè fèn biǎogé děi duōshào shíjiān cái néng tián wán? [Pinyin]How much time will one need to fill this form?', 'Zhè fèn biǎogé děi duōshào shíjiān cái néng tián wán? [Pinyin]', 'How much time will one need to fill this form?', '我得走了。\xa0 ―\xa0 Wǒ děi zǒu le.\xa0 ―\xa0 I must go .', '……是種病,得治! [MSC, trad.]……是种病,得治! [MSC, simp.]...... shì zhǒng bìng, děi zhì! [Pinyin]... is actually an illness. It has to be treated! ', '...... shì zhǒng bìng, děi zhì! [Pinyin]', '... is actually an illness. It has to be treated! ', "再不回去,就得趕不上末班車了。 [MSC, trad.]再不回去,就得赶不上末班车了。 [MSC, simp.]Zài bù huíqù, jiù děi gǎnbùshàng mòbānchē le. [Pinyin]If we don't go back now, we won't be able to catch the last bus.", 'Zài bù huíqù, jiù děi gǎnbùshàng mòbānchē le. [Pinyin]', "If we don't go back now, we won't be able to catch the last bus."]}

PRONUNCIATIONS:

text ['Mandarin(Pinyin): děi (dei)(Zhuyin): ㄉㄟˇ', 'Cantonese (Jyutping): dak', 'Mandarin', '(Standard Chinese)', 'Pinyin: děi', 'Zhuyin: ㄉㄟˇ', 'Wade–Giles: tei', 'Gwoyeu Romatzyh: deei', 'Tongyong Pinyin: děi', 'Sinological IPA : /teɪ̯²¹⁴/', 'Cantonese', '(Standard Cantonese, Guangzhou)', 'Jyutping: dak', 'Yale: dāk', 'Cantonese Pinyin: dak', 'Guangdong Romanization: deg', 'Sinological IPA : /tɐk̚⁵/']

audio ['//upload.wikimedia.org/wikipedia/commons/e/e4/Zh-d%C3%A9.ogg', '//upload.wikimedia.org/wikipedia/commons/0/0d/Zh-de.ogg', '//upload.wikimedia.org/wikipedia/commons/0/07/Zh-d%C4%9Bi.ogg']
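
One way to see the mismatch in the output above: the audio list carries three files, while the text list only covers the reading děi. A small sketch, assuming the audio filenames follow the Zh-<pinyin>.ogg pattern visible here (the function name reading_from_audio_url is hypothetical and not part of WiktionaryParser), decodes the reading from each URL:

```python
import posixpath
from urllib.parse import unquote

# Audio URLs copied from the output above.
audio = [
    "//upload.wikimedia.org/wikipedia/commons/e/e4/Zh-d%C3%A9.ogg",
    "//upload.wikimedia.org/wikipedia/commons/0/0d/Zh-de.ogg",
    "//upload.wikimedia.org/wikipedia/commons/0/07/Zh-d%C4%9Bi.ogg",
]


def reading_from_audio_url(url):
    """Return the part of the filename after 'Zh-', without the extension.

    Assumes the 'Zh-<pinyin>.ogg' naming seen in this output; other
    entries may name their audio files differently.
    """
    filename = posixpath.basename(unquote(url))   # percent-decode, keep last path part
    stem, _ext = posixpath.splitext(filename)     # drop '.ogg'
    _prefix, _sep, reading = stem.partition("-")  # drop the 'Zh' prefix
    return reading


readings = [reading_from_audio_url(url) for url in audio]
print(readings)
```

The decoded readings are dé, de and děi, one per definition, which suggests the page does contain all three pronunciations and that the remaining work is in extracting and grouping the text for each.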

@kevinsung kevinsung mentioned this pull request Nov 30, 2021
@kevinsung

This issue probably can't be fixed without also addressing #9.

Successfully merging this pull request may close these issues.

Missing pronunciations in Chinese