Add support for Italian (DEIT) #3

Merged
merged 2 commits into chickendude:master on Jul 10, 2018

jubalh commented Jul 9, 2018

Trying to add DEIT support.
This is what happens:

tree texts
texts
├── GLOSSIKA-DEIT-F1-EBK.pdf
├── GLOSSIKA-DEIT-F2-EBK.pdf
└── GLOSSIKA-DEIT-F3-EBK.pdf

python3 ./split_text.py
Checking 'texts/GLOSSIKA-DEIT-F2-EBK.pdf'
Traceback (most recent call last):
  File "./split_text.py", line 336, in <module>
    split()
  File "./split_text.py", line 332, in split
    split_text('', glob.glob(location), 'output', '')
  File "./split_text.py", line 327, in split_text
    extract_sentences(book, book_info, language_pair, series, callback)
  File "./split_text.py", line 278, in extract_sentences
    sentence.sentence = phrase
UnboundLocalError: local variable 'sentence' referenced before assignment

Any idea? :-)

@jubalh jubalh referenced this pull request Jul 9, 2018

Open

Adding my own courses #1

Owner

chickendude commented Jul 10, 2018

One thing I see is that your languages:
'languages': ['EN', 'CA'],
...are incorrect ;)

I also think that the ACCEPTED_PDFS variable isn't used anymore. I was using it for the web version, but I changed how I handle that, so it won't cause any problems either way. Could you check whether changing the languages to DE and IT works?
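
A quick sanity check along these lines could catch this kind of mismatch before parsing starts. This is only a sketch: it assumes the file names follow the GLOSSIKA-<pair>-F<n>-EBK.pdf pattern seen in the tree output above and that each pair is made of two two-letter codes. `pair_from_filename` and `languages_match` are hypothetical helpers, not part of split_text.py.

```python
import os
import re

# Assumed file-name pattern, based on names like GLOSSIKA-DEIT-F1-EBK.pdf.
PAIR_PATTERN = re.compile(r"^GLOSSIKA-([A-Z]{2})([A-Z]{2})-F\d+-EBK\.pdf$")


def pair_from_filename(path):
    """Return [source, target] codes parsed from the file name, or None."""
    match = PAIR_PATTERN.match(os.path.basename(path))
    if match is None:
        return None
    return [match.group(1), match.group(2)]


def languages_match(path, languages):
    """True if the configured languages agree with the file name."""
    return pair_from_filename(path) == list(languages)


print(languages_match("texts/GLOSSIKA-DEIT-F2-EBK.pdf", ["EN", "CA"]))  # False
print(languages_match("texts/GLOSSIKA-DEIT-F2-EBK.pdf", ["DE", "IT"]))  # True
```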

In the extract_sentences function, we have:

		if type == info['languages'][0]:
			sentence_num += 1
			sentences.append([])

		# if it's the first type, it's a new sentence
		if type in info['languages']:
			sentence = Sentence(index=sentence_num)
			sentences[-1].append(sentence)

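If the languages are incorrect, neither check ever matches, so the sentence variable is never created, and the later assignment to sentence.sentence raises the UnboundLocalError from your traceback.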
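
The failure mode can be reproduced in isolation. The sketch below is not the code from split_text.py: `Sentence`, `extract` and the `(type, phrase)` input format are simplified stand-ins, assumed only for illustration. With matching language codes the sentence object exists before it is used. With wrong codes the assignment is skipped and Python raises UnboundLocalError on first use.

```python
class Sentence:
    """Simplified stand-in for the project's Sentence class (assumed shape)."""

    def __init__(self, index):
        self.index = index
        self.sentence = None


def extract(lines, info):
    """Group (type, phrase) pairs into sentences, mirroring the logic above."""
    sentences = []
    sentence_num = 0
    for type_, phrase in lines:
        # the first language starts a new sentence group
        if type_ == info['languages'][0]:
            sentence_num += 1
            sentences.append([])
        # only known languages create a Sentence object
        if type_ in info['languages']:
            sentence = Sentence(index=sentence_num)
            sentences[-1].append(sentence)
        # fails here if no Sentence has been created yet
        sentence.sentence = phrase
    return sentences


lines = [('DE', 'Guten Tag.'), ('IT', 'Buongiorno.')]

ok = extract(lines, {'languages': ['DE', 'IT']})
print(len(ok), len(ok[0]))  # 1 2

try:
    extract(lines, {'languages': ['EN', 'CA']})
except UnboundLocalError as err:
    print('UnboundLocalError:', err)
```

Raising a descriptive error when a type is not in the configured languages might be a friendlier alternative, but that would be a design change rather than something this pull request does.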

Owner

chickendude commented Jul 10, 2018

Could you try that out? I pushed the latest master code and changed the 'languages' values.

Contributor

jubalh commented Jul 10, 2018

Aiii!
That was some vim fuckup :-)
At some point I had it set to DE IT, but hit another issue, and now this.

Anyways, it works now! Thanks! Ready to be merged :-)

@jubalh jubalh changed the title from Add support for Italian (DEIT) (WIP, do not merge yet) to Add support for Italian (DEIT) Jul 10, 2018

@chickendude chickendude merged commit 1378c8a into chickendude:master Jul 10, 2018
