Reduced eng_us_phonemic.phones to eng branch #336
Merged
* minor change to latin extraction function, rescraped Latin * potential fix to lat scraping issue * raw scrape of latin * postprocessing of new latin data * updated changelog, fixed line length error * rescrape of latin * postprocessing of updated latin data
* [pox] Scraped Polabian. Note: The ISO 639-3 code is `pox`, the older ISO 639-2 code is `sla`. * Updated CHANGELOG.
* [mnc] Scraped Manchu. * Updated CHANGELOG. Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
#184) * Merged Whitelist functionality with src/scrape.py. Now checks for presence of whitelist and writes separate tsv as {original file name}_filtered.tsv. Update generate_summary to reflect if file is filtered through a whitelist. CHANGELOG and README update accordingly. * Style tweaks and cleanup. * Updated generalized_split and postprocess to reflect automatic whitelist processing in scrape. Fixed dialect issue in generate_summary. * Previous edits didn't carry. * Cleanup typo mistakes. Added error handling to scrape.py. * Style clean-up. * Fixed style issues. Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
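The whitelist step described in the commit above can be pictured roughly as follows. This is only a sketch, not the code in `src/scrape.py`: the function names `filtered_path` and `filter_tsv` are hypothetical, and it assumes each TSV row is a word, a tab, and a space-separated pronunciation.

```python
import os
from typing import Set


def filtered_path(tsv_path: str) -> str:
    """Returns the sibling path with `_filtered` inserted before the extension."""
    root, ext = os.path.splitext(tsv_path)
    return f"{root}_filtered{ext}"


def filter_tsv(tsv_path: str, whitelist: Set[str]) -> str:
    """Writes only the rows whose phones all appear in the whitelist.

    Assumed row format (not confirmed by the PR): word<TAB>phone phone ...
    The original file is left untouched; the output path is returned.
    """
    out_path = filtered_path(tsv_path)
    with open(tsv_path, encoding="utf-8") as source, open(
        out_path, "w", encoding="utf-8"
    ) as sink:
        for line in source:
            word, pron = line.rstrip("\n").split("\t", 1)
            if all(phone in whitelist for phone in pron.split()):
                sink.write(f"{word}\t{pron}\n")
    return out_path
```

Writing to a separate `_filtered.tsv` file keeps the unfiltered scrape available for comparison.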
* [arc] Listing the correct scripts for Imperial Aramaic: 1. The original Aramaic script (`armi`). 2. The square script as in Biblical Aramaic (`hebr`). 3. Classical Syriac/Assyrian Neo-Aramaic (`syrc`) descended from (1). This correctly assigns the entries to their respective lexicons. Most pronunciations are available for (2), with a very small number of entries for (1) and (3). * [arc] Listing the correct scripts for Imperial Aramaic (continuing the previous commit which was partial): 1. The original Aramaic script (`armi`). 2. The square script as in Biblical Aramaic (`hebr`). 3. Classical Syriac/Assyrian Neo-Aramaic (`syrc`) descended from (1). This correctly assigns the entries to their respective lexicons. Most pronunciations are available for (2), with a very small number of entries for (1) and (3). * Updated CHANGELOG with #186.
* tentative solution for tone removal * updates changelog, ran white on test_config.py * remove print statement from test_config.py * partial replace of codepoints with chars, adds nfd/nfc conversion * reworks import statements * updates _TONES_REGEX * ran white on config.py * updates to conversions and adds comments * fixes to scrape.py comment length * converted test_config.py no_tone tests to nfd strings * modifies no_tone process not to skip removing superscript parentheses around non-tone superscript chars
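As a rough illustration of the tone-removal idea in the commit above: decompose to NFD, delete tone marks, then recompose to NFC. The pattern below is an assumption made for this sketch, not the project's `_TONES_REGEX`; it covers only the Chao tone letters (U+02E5 to U+02E9) and superscript digits, and it does not handle the superscript-parentheses case the commit mentions.

```python
import re
import unicodedata

# Hypothetical pattern: Chao tone letters and superscript digits 0-9.
_TONE_SKETCH = re.compile(
    "[\u02E5-\u02E9\u2070\u00B9\u00B2\u00B3\u2074-\u2079]+"
)


def strip_tones(pron: str) -> str:
    """Removes tone marks from a pronunciation string.

    The string is decomposed (NFD) before matching so that the pattern
    sees individual code points, then recomposed (NFC) afterwards.
    """
    decomposed = unicodedata.normalize("NFD", pron)
    return unicodedata.normalize("NFC", _TONE_SKETCH.sub("", decomposed))
```

Normalizing in both directions means callers get a consistent composed form back regardless of how the input was encoded.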
* Flattens directory structure for data. The non-wiki data is moved to the new `wikipron-extras` (https://github.com/kylebgorman/wikipron-extras) repository. Closes #193. * Add PR number to changelog. * "Imperial"
* [geo] Rescrape post-bot. Closes #138. * Add changelog * Update changelog * [geo] Add whitelist and re-scrape. * Renames for merge. * Add link to guidelines
* Enforces consistent style in logging using %r. * Updates CHANGELOG * Fixes a double-quoted logging var.
* [rum] Add whitelist and rescrape. * [eng] Adds English rescrape. * [dut] Adds Dutch rescrape. * [gre] Adds Greek rescrape. * [gre] Adds Greek rescrape. * Updates scrape path for phonetic filtering. Closes #195. * [rum] Adds Romanian rescrape. * [arm] Adds Armenian rescrape. * [gre] Adds Greek rescrape (second try). * [arm] Adds Armenian dialects + rescrapes. Closes #197. * Adds CHANGELOG changes. * [spa] Adds Spanish rescrape. * Postprocess and regenerate summaries.
* adds tuvan to languagecodes.py * updates changelog
* [aar, bdq, jje, lsi] discovers new languages and scrapes them. * Fall scrape.
Fills out the bibliography entry for the WikiPron paper.
* updated languages.json and json files for translating between Wiktionary code and iso code * updates codes.py and languagecodes.py * modifies test_languagecodes.py to reduce redundancy with codes.py * small formatting fixes * updates changelog * logging statement formatting
Fixes formatting issue in table. Not sure why this had to be done manually...
* Uses %r everywhere in `data/src`. * [nep] Adds Nepali data. Closes #209. * Update changelog
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones
* [izh] Scrape and add Ingrian. * Updated CHANGELOG.
* [ban] Splitting Balinese into Latin and Balinese scripts. * Updated CHANGELOG. Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
* [kir] Split Kyrgyz into Cyrillic and Arabic scripts. * Updated.
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist
* [khb] Adding customized extractor for Lü. * [khb] Re-scraping and updating the data and summaries. * Updated CHANGELOG. * Reordered imports. * [khb] Adding scrape smoke test. * Resorted.
Closes #301.
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog * Adds Burmese phones, updated Burmese data * Updates changelog Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
…295) * Support some Moksha pronunciations that reside under "p", rather than "li". * Scrape. * Attempt to fix the test. * Updated. * Split the PR into two items.
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog * Adds Burmese phones, updated Burmese data * Updates changelog * Adds Japanese phone list. Rescrapes Japanese data * Updates changelog * Removes data/tsv/jpn_hira_phonemic.tsv Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
* updates segments version and adds test for vietnamese tones * updates changelog
* Create German Phonelist * Updated CHANGELOG.md * incorporate updates in README.md, and added missing ger_phone* files
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog * Adds Burmese phones, updated Burmese data * Updates changelog * Adds Japanese phone list. Rescrapes Japanese data * Updates changelog * Removes data/tsv/jpn_hira_phonemic.tsv * Adds Azerbaijani phones, updated TSV data * Updates changelog Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog * Adds Burmese phones, updated Burmese data * Updates changelog * Adds Japanese phone list. 
Rescrapes Japanese data * Updates changelog * Removes data/tsv/jpn_hira_phonemic.tsv * Adds Azerbaijani phones, updated TSV data * Updates changelog * Adds Turkish phones, rescraped Turkish data * Updates changelog Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
* adds afr phone list and rescrapes * Updated CHANGELOG.md
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog * Adds Burmese phones, updated Burmese data * Updates changelog * Adds Japanese phone list. 
Rescrapes Japanese data * Updates changelog * Removes data/tsv/jpn_hira_phonemic.tsv * Adds Azerbaijani phones, updated TSV data * Updates changelog * Adds Turkish phones, rescraped Turkish data * Updates changelog * Adds Maltese phones, updated data * Updates changelog Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
* Frequency code tire-kick: 1. Increases typing. 2. No longer overwrites the .tsv files: adds `_freq.tsv` suffix instead. 3. Adds Khmer to JSON config. file. 4. Adds `shared_tasks` subdirectory for targeted config files. 5. Updates README.
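Point 2 of the commit above (writing to a `_freq.tsv` file instead of overwriting) might look something like this. It is a sketch under stated assumptions: `freq_path` and `write_with_freq` are hypothetical names, the `counts` mapping stands in for whatever frequency source the real code uses, and the frequency is assumed to be appended as a third column.

```python
from pathlib import Path
from typing import Dict


def freq_path(tsv_path: Path) -> Path:
    """Maps e.g. `khm_phonemic.tsv` to `khm_phonemic_freq.tsv`."""
    return tsv_path.with_name(f"{tsv_path.stem}_freq{tsv_path.suffix}")


def write_with_freq(tsv_path: Path, counts: Dict[str, int]) -> Path:
    """Copies each row to the `_freq.tsv` file with a count column appended.

    Words missing from `counts` get a frequency of 0 (an assumption of
    this sketch). The input file is never modified.
    """
    out_path = freq_path(tsv_path)
    with tsv_path.open(encoding="utf-8") as source, out_path.open(
        "w", encoding="utf-8"
    ) as sink:
        for line in source:
            word, pron = line.rstrip("\n").split("\t", 1)
            print(word, pron, counts.get(word, 0), sep="\t", file=sink)
    return out_path
```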
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog * Adds Burmese phones, updated Burmese data * Updates changelog * Adds Japanese phone list. 
Rescrapes Japanese data * Updates changelog * Removes data/tsv/jpn_hira_phonemic.tsv * Adds Azerbaijani phones, updated TSV data * Updates changelog * Adds Turkish phones, rescraped Turkish data * Updates changelog * Adds Maltese phones, updated data * Updates changelog * Adds Latvian phones, updated Latvian data Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog * Adds Burmese phones, updated Burmese data * Updates changelog * Adds Japanese phone list. 
Rescrapes Japanese data * Updates changelog * Removes data/tsv/jpn_hira_phonemic.tsv * Adds Azerbaijani phones, updated TSV data * Updates changelog * Adds Turkish phones, rescraped Turkish data * Updates changelog * Adds Maltese phones, updated data * Updates changelog * Adds Latvian phones, updated Latvian data * Updates changelog * Adds Khmer phones and updated TSV data * Updates changelog Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
Also undertook a light reorg.
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog * Adds Burmese phones, updated Burmese data * Updates changelog * Adds Japanese phone list. 
Rescrapes Japanese data * Updates changelog * Removes data/tsv/jpn_hira_phonemic.tsv * Adds Azerbaijani phones, updated TSV data * Updates changelog * Adds Turkish phones, rescraped Turkish data * Updates changelog * Adds Maltese phones, updated data * Updates changelog * Adds Latvian phones, updated Latvian data * Updates changelog * Adds Khmer phones and updated TSV data * Updates changelog * Adds Østnorsk (Bokmål) phones and updated TSV data * Updates changelog * Fixes typo Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
* scrape up to cantonese * raw partial scrape - excludes yue, rus, cmn * post-processing on partial scrape, src README fix * re-ran generate_summary.py after resolving conflicts * revert comment in scrape.py * updates changelog, resolves formatting error
* Added French phonemic phones list. Added filter French phonemic tsv. * Added French phonemic phones. * Updated Changelog. * Added phones * Added filtered phonemic wordlist * Added Serbo-Croatian phonemes and filtered TSV files. * Updated summaries for Serbo-Croatian phones. * Updated CHANGELOG. * Fixed formatting of Serbo-Croat phones file and CHANGELOG. * Updated fork to match upstream. * Updated fork to match upstream * Delete .DS_Store I don't know where this file came from... * Delete .DS_Store * Delete hbs_phonemic_phones.txt * Delete .DS_Store * [ita] Adds phoneme list, filtered phonemic TSV file * Updates CHANGELOG * Adds updated README and language summary * Updates CHANGELOG with issue number for Italian phone list * Adds Adyghe phones, filtered Adyghe data * Updated CHANGELOG * Adds Bulgarian phone list, filtered Bulgarian data * Postprocesses with filtered Bulgarian data * Updates changelog * Adds Icelandic phones, filtered TSV file * Updates changelog * Adds Slovenian phones, filtered Slovenian data * Updates changelog * Add normalization to list_phones.py * Updates changelog * Reformats list_phones.py * Adds Welsh phoneme lists, filtered Welsh TSV data * Updates changelog * Updates with instructions to re-scrape * Updates changelog * Updates * Updates data/phones/README.md * Adds Vietnamese phones, Vietnamese TSV files * Updates changelog * Adds Hindi file, new/updated TSV files * Updates changelog * Fixes Serbo-Croatian phones * Updates CHANGELOG * Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be. * Adds Portuguese .phones files, re-scraped TSV data * Rescrapes Portuguese data * Updates changelog * Adds Burmese phones, updated Burmese data * Updates changelog * Adds Japanese phone list. 
Rescrapes Japanese data * Updates changelog * Removes data/tsv/jpn_hira_phonemic.tsv * Adds Azerbaijani phones, updated TSV data * Updates changelog * Adds Turkish phones, rescraped Turkish data * Updates changelog * Adds Maltese phones, updated data * Updates changelog * Adds Latvian phones, updated Latvian data * Updates changelog * Adds Khmer phones and updated TSV data * Updates changelog * Adds Østnorsk (Bokmål) phones and updated TSV data * Updates changelog * Fixes typo * Update data/phones/README.md * Update changelog Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
* cleaned up armenian phones * cleaned up armenian phones (with more tidying up) * cleaning up armenian (fixed changelog) I had written the update on the wrong spot on the changelog + I added the issue number * uncommented accidental gaps * uncommented accidental gaps * added voiceless allophones * added missing geminate affricates
# Conflicts: # data/phones/eng_us_phonemic.phones
kylebgorman
approved these changes
Jan 26, 2021
LGTM.
This rather scary looking PR just brings the branch up to date and adds the new more restrictive |
Reduced phonemic inventory in eng_us_phonemic.phones