Issue when aligning files #36

Closed
eddieantonio opened this issue Aug 5, 2020 · 1 comment


@eddieantonio
Contributor

eddieantonio commented Aug 5, 2020

I got this error when trying to align https://creeliteracy.org/wp-content/uploads/2020/07/Cover.m4a (from https://creeliteracy.org/2020/07/31/covid-safety-reminder-solomon-ratt-y-dialect/) in the studio.

The studio UI needs some work... it said the alignment completed successfully, then gave me a blank readalong widget :/

Here's the temp data:

readalongs-temp.zip

CompletedProcess(args=['readalongs', 'align', '--force-overwrite', '--save-temps', '--text-grid', '--text-input', '--language', 'crk', '/var/folders/s1/y4p2fc9d1c9bv3nfjhgpvwch0000gq/T/tmp1u_lhndi/text.txt', '/var/folders/s1/y4p2fc9d1c9bv3nfjhgpvwch0000gq/T/tmp1u_lhndi/sol.wav', '/var/folders/s1/y4p2fc9d1c9bv3nfjhgpvwch0000gq/T/tmp1u_lhndi/aligned1596638807'], returncode=1, stdout=b'', stderr=b'INFO - Server initialized for eventlet.\nINFO - Words (<w>) not present; tokenizing\nTraceback (most recent call last):\n File "/Users/santoseadmin/Work/Studio/venv/bin/readalongs", line 11, in <module>\n load_entry_point(\'readalongs\', \'console_scripts\', \'readalongs\')()\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 829, in __call__\n return self.main(*args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/flask/cli.py", line 557, in main\n return super(FlaskGroup, self).main(*args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 782, in main\n rv = self.invoke(ctx)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke\n return _process_result(sub_ctx.command.invoke(sub_ctx))\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 1066, in invoke\n return ctx.invoke(self.callback, **ctx.params)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke\n return callback(*args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func\n return f(get_current_context(), *args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/flask/cli.py", line 412, in decorator\n return __ctx.invoke(f, *args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 
610, in invoke\n return callback(*args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/readalongs/cli.py", line 217, in align\n results = align_audio(\n File "/Users/santoseadmin/Work/Studio/readalongs/align.py", line 123, in align_audio\n xml = convert_xml(xml)\n File "/Users/santoseadmin/Work/Studio/readalongs/text/convert_xml.py", line 208, in convert_xml\n convert_words(xml_copy, word_unit, output_orthography)\n File "/Users/santoseadmin/Work/Studio/readalongs/text/convert_xml.py", line 157, in convert_words\n all_indices = compose_tiers(indices)\n File "/Users/santoseadmin/Work/Studio/readalongs/text/util.py", line 290, in compose_tiers\n reduced_indices = compose_indices(tiers[0], tiers[1])\n File "/Users/santoseadmin/Work/Studio/readalongs/text/util.py", line 278, in compose_indices\n if i2_idx in i2_dict and i2_dict[i2_idx] > highest_i2_found:\nTypeError: \'>\' not supported between instances of \'NoneType\' and \'int\'\n')

Here's that traceback from readalongs align:

INFO - Server initialized for eventlet.
INFO - Words (<w>) not present; tokenizing
Traceback (most recent call last):
 File "/Users/santoseadmin/Work/Studio/venv/bin/readalongs", line 11, in <module>
 load_entry_point('readalongs', 'console_scripts', 'readalongs')()
 File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 829, in __call__
 return self.main(*args, **kwargs)
 File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/flask/cli.py", line 557, in main
 return super(FlaskGroup, self).main(*args, **kwargs)
 File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 782, in main
 rv = self.invoke(ctx)
 File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
 return _process_result(sub_ctx.command.invoke(sub_ctx))
 File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
 return ctx.invoke(self.callback, **ctx.params)
 File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
 return callback(*args, **kwargs)
 File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
 return f(get_current_context(), *args, **kwargs)
 File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/flask/cli.py", line 412, in decorator
 return __ctx.invoke(f, *args, **kwargs)
 File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
 return callback(*args, **kwargs)
 File "/Users/santoseadmin/Work/Studio/readalongs/cli.py", line 217, in align
 results = align_audio(
 File "/Users/santoseadmin/Work/Studio/readalongs/align.py", line 123, in align_audio
 xml = convert_xml(xml)
 File "/Users/santoseadmin/Work/Studio/readalongs/text/convert_xml.py", line 208, in convert_xml
 convert_words(xml_copy, word_unit, output_orthography)
 File "/Users/santoseadmin/Work/Studio/readalongs/text/convert_xml.py", line 157, in convert_words
 all_indices = compose_tiers(indices)
 File "/Users/santoseadmin/Work/Studio/readalongs/text/util.py", line 290, in compose_tiers
 reduced_indices = compose_indices(tiers[0], tiers[1])
 File "/Users/santoseadmin/Work/Studio/readalongs/text/util.py", line 278, in compose_indices
 if i2_idx in i2_dict and i2_dict[i2_idx] > highest_i2_found:
TypeError: '>' not supported between instances of 'NoneType' and 'int'
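
The last frame shows the left-hand operand of the comparison, i2_dict[i2_idx], evaluating to None. A minimal sketch of that failure mode follows. The dictionary contents and the highest_index helper are hypothetical and are not the readalongs code; skipping None only avoids the crash and does not explain why an index is missing in the first place.

```python
# Hypothetical index data; the None stands in for a missing output position.
i2_dict = {0: 0, 1: None, 2: 3}
highest_i2_found = 0

# Reproduce the comparison from the final traceback frame.
try:
    i2_dict[1] > highest_i2_found
    message = ""
except TypeError as err:
    message = str(err)

print(message)


def highest_index(index_map, keys):
    """Return the largest non-None value found under keys, or -1 if none."""
    highest = -1
    for key in keys:
        value = index_map.get(key)
        # Guard against None before comparing, since None > int raises.
        if value is not None and value > highest:
            highest = value
    return highest


print(highest_index(i2_dict, [0, 1, 2]))
```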
@roedoejet
Collaborator

roedoejet commented Aug 5, 2020

Hi @eddieantonio,

Thanks for posting this. This ended up being related to a few issues.

  1. As of Python 3.8, pickle supports a new protocol (5) that earlier Python versions cannot read. I ran into that problem, which should be unrelated to your issue, but it's fixed as of roedoejet/g2p@e84fae9.

  2. Your rules used some characters that are not standard IPA characters, such as ʦ in place of t͡s for an alveolar affricate. There are lots of gotchas related to this type of mismatch. Realistically, though, we should have a check that notifies anyone writing a mapping when they use a non-IPA character. I've made an issue for that: Log non-ipa characters for ipa mappings roedoejet/g2p#48

  3. Your rules also had some feeding relationships, such as "â -> aː" together with "a -> ʌ". Because the rules apply in order, this initially produced just ʌ. Even when re-ordered so that the long-vowel rule applies first, its output fed the second rule and produced ʌː. The fix was to order the rules and also enable a configuration option called prevent_feeding, which turns characters into intermediate values in the Unicode PUA (Private Use Area) block before turning them into their outputs at the end. By default, feeding is not prevented, because there are some cases where you genuinely do want feeding-type relationships. I've added an issue to write documentation on all of the configuration options here: Documentation on configuration options roedoejet/g2p#49

  4. Something is going wrong between NFC normalization and the indices. I'm not exactly sure what it is, but we have had issues with this in the past. The original ReadAlongs repo assumed NFD, and we've had some difficulty rooting out all of those assumptions. Obviously, we didn't get all of them. In the meantime, I've made your rules normalize to NFD, which solves the alignment problem, and I've also made an issue about this here: Studio doesn't work properly with NFC g2p mappings #37

  5. The "UI" is known to have a lot of bugs. It's on our roadmap to separate the UI out from this repo and make it much more user-friendly. This version was basically written in a few hours for demo purposes. It is far from production, and it will all be scrapped when we actually build the UI. That's not a great answer, and ideally there would be something better in the interim, but our focus is on getting the aligner up to snuff before we turn to planning the development of the UI.
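
To illustrate the feeding relationship from point 3, here is a minimal sketch of ordered rewrite rules with and without Private Use Area placeholders. The apply_rules function and its prevent_feeding parameter are hypothetical and written only for this example; the sketch mirrors the idea described above and is not the g2p library's implementation or API.

```python
# Hypothetical sketch of rule feeding; not the g2p library's implementation.
PUA_START = 0xE000  # first code point of the Unicode Private Use Area


def apply_rules(text, rules, prevent_feeding=False):
    """Apply ordered (source, target) rewrite rules to text.

    With prevent_feeding=True, each rule first writes a private-use
    placeholder so that later rules cannot match its output. The
    placeholders are swapped for the real outputs after all rules ran.
    """
    placeholders = {}
    for i, (source, target) in enumerate(rules):
        if prevent_feeding:
            marker = chr(PUA_START + i)
            placeholders[marker] = target
            text = text.replace(source, marker)
        else:
            text = text.replace(source, target)
    for marker, target in placeholders.items():
        text = text.replace(marker, target)
    return text


# Long-vowel rule ordered first; \u00e2 is precomposed "â".
rules = [("\u00e2", "a\u02d0"), ("a", "\u028c")]

fed = apply_rules("\u00e2a", rules)
unfed = apply_rules("\u00e2a", rules, prevent_feeding=True)
print(fed)    # the first rule's "a" feeds the second rule: ʌːʌ
print(unfed)  # placeholders block the feeding: aːʌ
```

Without placeholders, the "a" produced by the long-vowel rule is consumed by the second rule. With them, each input character is rewritten exactly once.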
