This notebook contains some rough preliminary code for checking English -> Greek translations. As of right now, it can only identify individual Greek words which are obviously incorrect. Word lists are pulled from Pharr.

# Installations and Imports

Gradio [documentation](https://gradio.app/docs/)

greek-accentuation [documentation](https://github.com/jtauber/greek-accentuation/blob/master/docs.rst)

greek-normalization [documentation](https://github.com/jtauber/greek-normalisation/blob/master/tests.rst)

(I'm also using a couple files from [greek-inflexion](https://github.com/jtauber/greek-inflexion/blob/master/README.md))

In [1]:
!pip install typing-extensions --upgrade
!pip install gradio
!pip install greek-accentuation==1.2.0
!pip install greek-normalisation
import pandas as pd
import re
from greek_accentuation.syllabify import *
from greek_normalisation.utils import *

Requirement already up-to-date: typing-extensions in /opt/anaconda3/lib/python3.8/site-packages (4.2.0)


# I. Check answer

In [2]:
def check(Greek):
    feedback = []
    feedback.append(valid_words(Greek))
    
    # returns feedback as a string
    return '\n'.join(feedback)

   ## Check whether the input contains valid Greek words

### 1. Screen out words that couldn't possibly be right

- Read in `paradigms.tsv` and `verbs.tsv` (from [here](https://github.com/jtauber/greek-inflexion/tree/master/homer-data)) as dataframe

    TODO:
- [x] Check that breathing marks are correct
- [x] Check each word against our vocabulary list for Homer
- [ ] Strip all the non-alphanumeric characters ([unicode reference](https://www.degruyter.com/document/doi/10.1515/9783110599572-009/html))
- [ ] Check that accents are correct (for simplification, I'm just ignoring accents right now)

In [3]:
# paradigms.tsv contains all forms from Pharr
paradigms = "lib/paradigms.tsv"
# verbs.tsv contains verbs from Pharr
verbs = "lib/verbs.tsv"

# convert to dataframes
df = pd.read_csv(paradigms, sep=r' +	*', on_bad_lines='skip', header=0, names=['Lemma', 'Type', 'Inflected'])
df2 = pd.read_csv(verbs, sep='	', on_bad_lines='skip', header=0, names=['Lemma', 'Type', 'Inflected'])
# merge the dataframes
df = pd.concat([df, df2])

# get a list of all the inflected forms
inflected_forms = df.loc[:, 'Inflected'].tolist()

# get a list of all the inflected forms without accents
inflected_no_accents = [strip_accents(element) for element in inflected_forms]

  df = pd.read_csv(paradigms, sep=r' +	*', on_bad_lines='skip', header=0, names=['Lemma', 'Type', 'Inflected'])


In [4]:
def valid_words(Greek):
    
    feedback = []
    
    # use a regex to strip non-alphanumeric characters from the input
    print(Greek)
#     Greek = re.sub(u'[\u0370-\u03FF]+', Greek) 
#     print(re.search(r'[\u0370-\u1FFF]+', '', Greek))
                    
    
#     # paradigms.tsv contains all forms from Pharr
#     paradigms = "lib/paradigms.tsv"
#     # verbs.tsv contains verbs from Pharr
#     verbs = "lib/verbs.tsv"

    # convert to dataframes
#     df = pd.read_csv(paradigms, sep=r' +	*', on_bad_lines='skip', header=0, names=['Lemma', 'Type', 'Inflected'])
#     df2 = pd.read_csv(verbs, sep='	', on_bad_lines='skip', header=0, names=['Lemma', 'Type', 'Inflected'])
#     # merge the dataframes
#     df = pd.concat([df, df2])
    
#     # get a list of all the inflected forms
#     inflected_forms = df.loc[:, 'Inflected'].tolist()
    
#     # get a list of all the inflected forms without accents
#     inflected_no_accents = [strip_accents(element) for element in inflected_forms]
    
    # Loop through each word entered by the user
    for word in Greek.split(' '):
        # add proper breathing marks if necessary
        correct_breathing = add_necessary_breathing(word)
        
        # strip all the accents
        no_accents = strip_accents(word)
        
        # look up each word. If not found then alert the user 
        if no_accents not in inflected_no_accents:
            # if the word doesn't contain the proper breathing marks, alert the user
            if not word == correct_breathing:
                feedback.append(word + ' does not have the correct breathing marks')
            
            # if the word doesn't contain proper breathing marks AND 
            if correct_breathing not in inflected_no_accents:
                feedback.append(word + ' could not be found')
                
        else:
            feedback.append(word + ' is a valid word')
            
    
    # return any feedback
    return '\n'.join(feedback)

### 2. Check whether the tenses/numbers are right

    Something will go here...

# Get Input

    Enter your Greek translation in the 'Greek' box and hit submit. Feedback will be displayed in the 'output' box.
   
    NOTE: I seem to be getting a warning about how I'm reading in the tsv files. But since it doesn't seem to be affecting the program's ability to run, I'm just ignoring it for now.

In [5]:
import gradio as gr

def greet(Greek):
    feedback = check(Greek)
    return "%s" % (feedback)

demo = gr.Interface(fn=greet, 
                    inputs=[gr.Textbox(lines=2, placeholder="Enter Greek translation here...")],
                    outputs="text")

demo.launch()

Running on local URL:  http://127.0.0.1:7926/

To create a public link, set `share=True` in `launch()`.


(<gradio.routes.App at 0x7f889a4964f0>, 'http://127.0.0.1:7926/', None)

## II. Exercises

Crosby and Shaeffer [exercises](https://docs.google.com/spreadsheets/d/1Nh6TbZ3ZGgjbJ9-113OojUbR2QA3xRDHKE7dQhfzib4/edit#gid=25338767)

Homeric Greek [quizzes](https://github.com/gregorycrane/Homerica/tree/master/quizzes)

### Experimenting with fill-in-the-blank quizzes (with feedback)

Define the quiz to read from below:

In [6]:
quiz = 'lib/a_hs.txt'

Define the number of the exercise below:

In [7]:
exercise = 2

In [8]:
# Get the first Homeric Greek quiz
with open(quiz) as f:
    # create list for holding the exercises
    exercises = f.readlines()
    
# Get the correct answer to the question
sent = exercises[exercise]

c_ans_end = sent.find('\t')
answer = sent[0:c_ans_end]
# print(answer)

# Get the translation of the correct answer
c_trans_end = sent.find('[')
# print(sent[c_ans_end+1:c_trans_end])

# Get the greek sentence
# NOTE: This doesn't work if the english translation begins with a lowercase letter
print("\n")
greek_sent = sent[c_trans_end:sent.find(re.findall('[A-Z]', sent)[0])]
# print(greek_sent)







Exception in callback None(<Task finishe...> result=None>)
handle: <Handle>
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
TypeError: 'NoneType' object is not callable


In [11]:
def check_blank(Greek):
    
    # make sure that only one word is entered
    if (len(Greek.split(' ')) > 1):
        return 'Answer should only be one word'
    
    feedback = []
    
    # If the answer is correct
    if Greek == answer:
            feedback.append('Correct!')
    
    else:
        # strip all the accents
        if strip_accents(Greek) == strip_accents(answer):
            feedback.append('Almost. Check the accents.')
    

    # look up each word. If not found then alert the user 
    if no_accents not in inflected_no_accents:
        feedback.append(Greek + ' could not be found')

    # if the word is valid, then we next need to check that it matches the correct answer
    else:
        feedback.append(Greek + ' is a valid word')
        
        
    # return any feedback
    return '\n'.join(feedback)
#     return "Hello WOrld"

In [12]:
def exercise(Greek):
    feedback = check_blank(Greek)
#     feedback = "hello world"
    return "%s" % (feedback)

blank_demo = gr.Interface(fn=exercise, 
                    inputs=[gr.Textbox(lines=2, placeholder="Fill in the blank here...", label=greek_sent)],
                    outputs="text")


blank_demo.launch()

Running on local URL:  http://127.0.0.1:7928/

To create a public link, set `share=True` in `launch()`.


(<gradio.routes.App at 0x7f889a91f7f0>, 'http://127.0.0.1:7928/', None)

Exception in callback None(<Task finishe...> result=None>)
handle: <Handle>
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
TypeError: 'NoneType' object is not callable
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.8/site-packages/gradio/routes.py", line 255, in run_predict
    output = await app.blocks.process_api(
  File "/opt/anaconda3/lib/python3.8/site-packages/gradio/blocks.py", line 538, in process_api
    predictions, duration = await self.call_function(fn_index, processed_input)
  File "/opt/anaconda3/lib/python3.8/site-packages/gradio/blocks.py", line 452, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/opt/anaconda3/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/anaconda3/lib/python3.8/site-packages/anyio/_backends/_async