[slv] Fixes Slovenian normalization. Re-scrapes Slovenian. #356
Reverse pull
I don't know where this file came from...
Reverse pull request
… in Slovene phone lists. Re-scrapes Slovene.
Oh, I just remembered I forgot to do the postprocessing and summaries. I'll fix that right now.
Sorry, but lots of finicky complaints about src/normalize.py; I'm trying to preserve a standard style (which is roughly PEP-8) throughout.
Can you run |
I don't know why the black test is still failing. Every line is under 79 characters now.
Strange. It checks (and can change) things other than line length, so maybe something else is at play. Or maybe you need to add + commit again?
Ah, I hadn't seen your previous comment when I wrote my last comment. I ran the |
LGTM. I'll let you push the big green button...it's very satisfying.
* Added French phonemic phones list. Added filter French phonemic tsv.
* Added French phonemic phones.
* Updated Changelog.
* Added phones
* Added filtered phonemic wordlist
* Added Serbo-Croatian phonemes and filtered TSV files.
* Updated summaries for Serbo-Croatian phones.
* Updated CHANGELOG.
* Fixed formatting of Serbo-Croat phones file and CHANGELOG.
* Updated fork to match upstream.
* Updated fork to match upstream
* Delete .DS_Store I don't know where this file came from...
* Delete .DS_Store
* Delete hbs_phonemic_phones.txt
* Delete .DS_Store
* [ita] Adds phoneme list, filtered phonemic TSV file
* Updates CHANGELOG
* Adds updated README and language summary
* Updates CHANGELOG with issue number for Italian phone list
* Adds Adyghe phones, filtered Adyghe data
* Updated CHANGELOG
* Adds Bulgarian phone list, filtered Bulgarian data
* Postprocesses with filtered Bulgarian data
* Updates changelog
* Adds Icelandic phones, filtered TSV file
* Updates changelog
* Adds Slovenian phones, filtered Slovenian data
* Updates changelog
* Add normalization to list_phones.py
* Updates changelog
* Reformats list_phones.py
* Adds Welsh phoneme lists, filtered Welsh TSV data
* Updates changelog
* Updates with instructions to re-scrape
* Updates changelog
* Updates
* Updates data/phones/README.md
* Adds Vietnamese phones, Vietnamese TSV files
* Updates changelog
* Adds Hindi file, new/updated TSV files
* Updates changelog
* Fixes Serbo-Croatian phones
* Updates CHANGELOG
* Revert "Adds Hindi file, new/updated TSV files" This reverts commit 964c3be.
* Adds Portuguese .phones files, re-scraped TSV data
* Rescrapes Portuguese data
* Updates changelog
* Adds Burmese phones, updated Burmese data
* Updates changelog
* Adds Japanese phone list. Rescrapes Japanese data
* Updates changelog
* Removes data/tsv/jpn_hira_phonemic.tsv
* Adds Azerbaijani phones, updated TSV data
* Updates changelog
* Adds Turkish phones, rescraped Turkish data
* Updates changelog
* Adds Maltese phones, updated data
* Updates changelog
* Adds Latvian phones, updated Latvian data
* Updates changelog
* Adds Khmer phones and updated TSV data
* Updates changelog
* Adds Østnorsk (Bokmål) phones and updated TSV data
* Updates changelog
* Fixes typo
* Update data/phones/README.md
* Update changelog
* Re-scrapes Armenian data. Fixes error in West Armenian phone list
* Updates changelog
* Attempts to fix data/phones/README.md
* Fixes paths in data/phones/README.md
* Fixes links in data/phones/HOWTO.md
* Fixes paths in data/src/generate_phones_sumary.py
* Updates changelog
* Adds normalization instructions in data/phones/HOWTO.md
* Fixes equal signs in changelog
* Updates changelog
* Updates data/src/normalize.py to make it more efficient. Additionally, adds a shebang to make it executable
* Fixes spacing in data/src/normalize.py
* Updates changelog. Fixes path typo for #356
* Updates data/src/normalize.py doc

Co-authored-by: Kyle Gorman <kylebgorman@gmail.com>
Updated Unreleased in CHANGELOG.md to reflect the changes in code or data.
Also adds a script to change a file's Unicode normalization.
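For readers unfamiliar with what such a script involves, here is a minimal sketch of in-place Unicode normalization of a text file. It is not the contents of data/src/normalize.py; the function name `normalize_file` and the default form `NFC` are assumptions made for illustration.

```python
import unicodedata


def normalize_file(path: str, form: str = "NFC") -> bool:
    """Rewrites a UTF-8 file under the given normalization form.

    Returns True if the file's contents changed. The default form (NFC)
    is an assumption of this sketch, not something stated in the PR.
    """
    # newline="" keeps the file's existing line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as source:
        original = source.read()
    normalized = unicodedata.normalize(form, original)
    if normalized == original:
        return False
    with open(path, "w", encoding="utf-8", newline="") as sink:
        sink.write(normalized)
    return True
```

A phone list that spells a segment as a base letter plus a combining diacritic and scraped data that uses the precomposed character will not compare equal until both are brought to the same form, which is the kind of mismatch a normalization pass removes.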