<a href="https://colab.research.google.com/github/Bubballoo3/Greek-Parser/blob/main/Final_Project_Greek_Parser.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI Ancient Greek Helper
A tool that references conjugation/declension tables and dictionary entries in order to parse and define Ancient Greek words, just as a classicist would.

At the moment, this is restricted to standard Attic forms. With more tables, this restriction could be relaxed in the future.

In [8]:
# Load auxiliary files from GitHub and change directories

!git clone https://github.com/Bubballoo3/Greek-Parser.git
import os
os.chdir("Greek-Parser")

Cloning into 'Greek-Parser'...
remote: Enumerating objects: 134, done.
remote: Counting objects: 100% (134/134), done.
remote: Compressing objects: 100% (56/56), done.
remote: Total 134 (delta 83), reused 115 (delta 75), pack-reused 0 (from 0)
Receiving objects: 100% (134/134), 1.47 MiB | 4.47 MiB/s, done.
Resolving deltas: 100% (83/83), done.


In [9]:
%%capture --no-stderr
# Install LangGraph (the %%capture cell magic must be the first line of the cell)
%pip install -U langgraph langsmith langchain_openai

The packages and setup are drawn from the LangGraph tutorial [here](https://langchain-ai.github.io/langgraph/tutorials/introduction/#setup).

In [10]:
# Import packages
import getpass
import os
from typing import Annotated

from typing_extensions import TypedDict
from langchain_core.messages import AIMessage
from langchain_core.tools import tool

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode
class State(TypedDict):
    # Messages have the type "list". The `add_messages` function
    # in the annotation defines how this state key should be updated
    # (in this case, it appends messages to the list, rather than overwriting them)
    messages: Annotated[list, add_messages]

# Get API key for OpenAI
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
from langchain_openai import ChatOpenAI


OpenAI API Key:··········
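
The comment on `messages` above describes a reducer: a function attached to the type annotation that says how an update is merged into the existing state. The sketch below illustrates that idea in plain Python and is not LangGraph's implementation. `append_messages`, `SketchState`, and `apply_update` are hypothetical names introduced only for this example, and the real `add_messages` does more than plain concatenation.

```python
from typing import Annotated, TypedDict, get_type_hints


def append_messages(existing: list, new: list) -> list:
    """Hypothetical stand-in for add_messages: append instead of overwrite."""
    return existing + new


class SketchState(TypedDict):
    # The second argument of Annotated carries the reducer for this key
    messages: Annotated[list, append_messages]


def apply_update(state: dict, update: dict) -> dict:
    """Merge an update into the state, using each key's annotated reducer."""
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints.get(key), "__metadata__", ())
        if metadata:
            # A reducer is attached: combine the old and new values
            merged[key] = metadata[0](state.get(key, []), value)
        else:
            # No reducer: the new value simply replaces the old one
            merged[key] = value
    return merged


state = {"messages": ["user: ελθουσα"]}
state = apply_update(state, {"messages": ["assistant: aorist participle"]})
print(state["messages"])
```

Each node in the graph returns only its new messages, and the reducer is what keeps the earlier conversation in the state.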


In [11]:
# Then we define our llm
llm=ChatOpenAI(model_name="gpt-4o-mini")

With the pieces in place for LangGraph, we begin to build the individual agents involved.

The first one will check for an entry in LSJ. Ensure that the `lsj.json` file is available at `./resources/lsj.json` in the current Colab session.

In [12]:
import json
@tool
def checkLSJ(string: str):
    """Search the Liddell and Scott Greek-English Lexicon for a specific word.

       It returns a (Python) dictionary with two keys, "definition" and "mention", each holding a list.

       The "definition" list holds up to five entries whose headword or listed forms match the searched word exactly.

       The "mention" list holds up to five entries for other words whose definitions mention the searched word."""
    results = {"definition": [], "mention": []}
    with open('./resources/lsj.json', 'r', encoding='utf-8') as file:
        data = json.load(file)
    for key, value in data.items():
        response_type = "none"
        if key == string:
            # Exact match on the headword
            response_type = "definition"
        else:
            for smkey, smvalue in value.items():
                if smkey == 'd':
                    # The word appears somewhere in another entry's definition text
                    if string in smvalue:
                        response_type = "mention"
                elif isinstance(smvalue, list):
                    # Exact match on one of the forms listed for this entry
                    for item in smvalue:
                        if string == item:
                            response_type = "definition"
        if response_type != "none":
            # Cap each list at five entries to keep the tool output short
            if len(results[response_type]) < 5:
                results[response_type].append(value['d'])
    # Both lists may be empty when nothing matches; the model then retries with another form
    return results
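
To make the matching rules concrete, here is a minimal sketch that applies the same per-entry classification to a toy lexicon held in memory. The entries, the `forms` key, and the helper name `classify_entry` are hypothetical placeholders, not real LSJ data or part of the tool; only the `d` key and the list-valued fields are taken from the code above.

```python
HEADWORD = "πηγή"   # accented headword, as a lexicon key would store it
VARIANT = "παγά"    # a variant form listed under the same entry

# Toy lexicon: placeholder text only, not quoted from LSJ
toy_lexicon = {
    HEADWORD: {"d": f"<b>{HEADWORD}</b>, (toy text) running water",
               "forms": [HEADWORD, VARIANT]},
    "κρήνη": {"d": f"<b>κρήνη</b>, (toy text) spring; compare {HEADWORD}"},
    "λόγος": {"d": "<b>λόγος</b>, (toy text) word"},
}


def classify_entry(key: str, value: dict, query: str) -> str:
    """Return "definition", "mention", or "none" for a single entry."""
    if key == query:
        return "definition"
    result = "none"
    for field, content in value.items():
        if field == "d":
            # Substring match inside the definition text
            if query in content:
                result = "mention"
        elif isinstance(content, list) and query in content:
            # Exact match against a listed form
            result = "definition"
    return result


for query in (HEADWORD, VARIANT, "πηγης"):
    labels = {key: classify_entry(key, value, query)
              for key, value in toy_lexicon.items()}
    print(query, labels)
```

Because the comparison is an exact string match, the unaccented query `πηγης` matches nothing in this toy data, which is why the model has to guess the accented dictionary form and search again.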

In [13]:
# Collect tools and create tooled LLM
tools = [checkLSJ]
llm_with_tools = llm.bind_tools(tools)

In [14]:
# Define models
def translator(state: State):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

We then define our graph. As a single-agent system, it is fairly simple. Roughly, it looks like this:
```
     START
       |
       |
      \|/            (If tool calls)
  Translator Agent  ------------------>    Tool Node
       |         /|\                           /
       |          \_________________________/
       |
       | (If no tool calls or errors occur with the tools)
       |
      \|/
      END                                            
```
As such, we create the translator and tool_node nodes first, and then the edges. The only complicated part is the pair of edges leaving the translator: a conditional edge decides which of the two to follow.

In [15]:
# Define nodes and graphs
graph_builder = StateGraph(State)
tool_node = ToolNode(tools)
graph_builder.add_node("translator",translator)
graph_builder.add_node("tool_node",tool_node)
graph_builder.add_edge(START, "translator")

# The route_tools function decides when to send an AI response to a tool, and when to finish
def route_tools(state: State):
    """Use in the conditional edge to route to the tool_node if there are tool calls."""
    if isinstance(state, list):
        messages = state
    else:
        messages = state.get("messages", [])
    if not messages:
        raise ValueError(f"No messages found in input state to tool_edge: {state}")
    ai_message = messages[-1]
    # Finish if the message before the AI response is a tool result that reported an error
    if len(messages) > 1:
        previous = messages[-2]
        if hasattr(previous, "status") and "Error:" in str(previous.content):
            return END
    # Otherwise, visit the tool node only when the model asked for a tool call
    if getattr(ai_message, "tool_calls", None):
        return "tools"
    return END

graph_builder.add_conditional_edges(
    "translator",
    route_tools,
    {"tools":"tool_node",END:END},
)
graph_builder.add_edge("tool_node", "translator")

graph=graph_builder.compile()
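
The loop that the compiled graph performs can be sketched in plain Python, without LangGraph or an API key. This is only an illustration of the control flow in the diagram, under simplifying assumptions: `SketchMessage`, `fake_translator`, `fake_tool_node`, and `run` are hypothetical stand-ins, the fake model asks for exactly one lookup, and the error-handling route is left out.

```python
from dataclasses import dataclass, field


@dataclass
class SketchMessage:
    content: str
    tool_calls: list = field(default_factory=list)


def fake_translator(messages: list) -> SketchMessage:
    """Hypothetical model: request one lookup, then give a final answer."""
    if not any(m.content.startswith("tool result") for m in messages):
        call = {"name": "checkLSJ", "args": {"string": "πηγή"}}
        return SketchMessage("", tool_calls=[call])
    return SketchMessage("final parse")


def fake_tool_node(messages: list) -> SketchMessage:
    """Hypothetical tool node: answer the last message's first tool call."""
    call = messages[-1].tool_calls[0]
    return SketchMessage(f"tool result for {call['args']['string']}")


def run(messages: list, max_steps: int = 10) -> list:
    """Alternate translator and tool node until no tool call is requested."""
    visited = []
    for _ in range(max_steps):
        visited.append("translator")
        messages.append(fake_translator(messages))
        if not messages[-1].tool_calls:
            break  # conditional edge: no tool calls, so go to END
        visited.append("tool_node")
        messages.append(fake_tool_node(messages))  # then back to the translator
    visited.append("END")
    return visited


trace = run([SketchMessage("πηγης")])
print(" -> ".join(trace))
```

In the real graph, the translator may request several lookups in a row, so the translator and tool node can alternate many times before the run reaches `END`.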

In [17]:
# We then access a stream of messages from a graph instance.
searchword = input("Enter a word to search for: ")
events=graph.stream({"messages": [("system","""
    You are a classicist skilled in Ancient Greek and interpreting entries and abbreviations from the LSJ Greek-English Lexicon.
    When given a word (the word will not have accents, be prepared for this), you first use the checkLSJ tool to check if that word exists as an entry. The following
    rules should help you parse these dictionary entries.
    1. Words in <b> brackets are the title of the entry. When you search, you may get an entry with your word in <i> brackets, in which case you should use the context to help you find the right word.
    2. If the entry contains "v." followed by a different word, it is a referral to a different word and you should search that word and show the result.

    After searching the dictionary, use the result to interpret and parse the word.
    If you can't find direct results, see if you know the root word. If not, guess what part of speech it is and what conjugation/declension it is. From there, try to approximate the dictionary form (nominative singular for nouns, first person singular present active for verbs). Continue to search the dictionary until you find the entry for the root word. The root word should always be in the dictionary, and you should access the definition before finishing.
    Note that the same word has many different forms in greek, so you will likely have to convert it either to the nominative singular (for nouns), the nominative
    masculine singular (for adjectives), or the first person present active indicative (for verbs). When you are done you should give a description of the process you used to get the form, then give your parse of the word, and quote the FULL dictionary definition EXACTLY QUOTED FROM LSJ of the root word, and add newlines when appropriate. Make sure you always provide your parsing and quote the final entry in its entirety. NEVER simulate an entry you did not look up.

    REMEMBER!! You must include the full parsing of the searched word and definition of the root word in every answer. Acknowledge when you do not think you are correct.""",),
                            ("user","ελθουσα"),
                            ("assistant","""
The word "ελθουσα" is a form of the verb "ἔρχομαι," which means "to come" or "to go." In this case, "ελθουσα" is a feminine participle, specifically the aorist active participle form. To understand its meaning and grammatical properties, I checked the root entry for "ἔρχομαι," as well as the aorist participle form "ἐλθών."

### Parsing Process:
1. **Identify the root**: The word "ελθουσα" derives from the verb "ἔρχομαι" (to come/go), specifically from its aorist stem.
2. **Determine the form**: The suffix "-ουσα" indicates that it is a feminine aorist active participle in the nominative singular form.
3. **Check for definitions**: Upon checking "ἔρχομαι," I found that it indeed relates to "ελθουσα," which corresponds to the aorist active participle form "ἐλθών" for masculine and "ελθουσα" for feminine.

### Parsed Form:
- **Word**: ελθουσα
- **Root**: ἔρχομαι
- **Form**: Aorist active participle, feminine singular nominative
- **Meaning**: "having come" or "having gone"

### Definition from LSJ:
The entry for "ἔρχομαι" describes it as follows:

> "<b>ἔρχομαι</b> Il.13.256, etc. (Act. <i>ἔρχω</i> as barbarism, Tim.Pers.167): impf. <i>ἠρχόμην</i> Hp.Epid.7.59, Arat.102, (<i>δι-</i>) Pi.O.9.93; freq. in later Prose, LXXGe.48.7, Ev.Marc.1.45, Luc.Jud.Voc.4, Paus.5.8.5, etc.; in Att. rare even in compds., <i>ἐπ-ηρχόμην</i> Th.4.120 (perh. fr. <i>ἐπάρχομαι</i>), <i>προσ-</i> ib.121 (perh. fr. <i>προσάρχομαι</i>), <i>περι-</i> Ar.Th.504 cod.: from <i>ἐλυθ-</i> (cf. <i>ἐλεύθω</i>) come fut. <i>ἐλεύσομαι</i>, Hom., Ion., Trag. (A.Pr.854, Supp.522, S.OC1206, Tr.595), in Att. Prose only in Lys.22.11, freq. later, D.H.3.15, etc.: aor., Ep. and Lyr. <i>ἤλῠθον</i> Il.1.152, Pi.P.3.99, etc., used by E. (not A. or ) in dialogue (Rh.660, El.598, Tr.374, cf. Neophr.1.1); but <i>ἦλθον</i> is more freq. even in Hom., and is the only form used in obl. moods, <i>ἐλθέ</i>, <i>ἔλθω</i>, <i>ἔλθοιμι</i>, <i>ἐλθεῖν</i>, <i>ἐλθών</i> [...]"
"""),
                            ("user",f"{searchword}")]})
for event in events:
    for value in event.values():
        message=value["messages"][-1].content
        if len(message) > 0:
            print("Model:", message)

Enter a word to search for: πηγης
Model: {"definition": [], "mention": []}
Model: {"definition": ["<b>πηγή</b>, Dor. <i>πᾱγά</i>, <i>ἡ</i>, running water, used by Hom. always in pl., streams, <i>πηγαὶ</i> <i>ποταμῶν</i> Il.20.9, cf. Hdt.1.189, A.Pr.89,434 (lyr.), Pers.311, E.HF1297, Rh.827 (lyr.); <i>κρουνὼ</i> <i>δ’</i> <i>ἵκανον</i> <i>καλλιρρόω</i>, <i>ἔνθα</i> <i>δὲ</i> <i>πηγαὶ</i> <i>δοιαὶ</i> <i>ἀναΐσσουσι</i> Il.22.147: sg., <i>καλλιρρόου</i> <i>ἔψαυσα</i> <i>π</i>. A.Pers.202, cf. 613. &nbsp;&nbsp;&nbsp;&nbsp;<b>2.</b> metaph., of tears, <i>πηγαὶ</i> <i>κλαυμάτων</i>, <i>δακρύων</i>, streams . . , Id.Ag.888, S.Ant.803, Tr.852 (lyr.): abs., <i>παρειὰν</i> <i>νοτίοις</i> <i>ἔτεγξα</i> <i>παγαῖς</i> A.Pr.402 (lyr.), cf. E.Alc.1068, etc. ; also <i>πηγαὶ</i> <i>γάλακτος</i>, <i>βοτρύων</i>, S.El.895, E.Cyc.496 (lyr.); <i>πόντου</i> <i>πηγαῖς</i> with sea<i>-</i> water, Id.IT1039; <i>πηγαὶ</i> <i>τροφῆς</i> <i>τῷ</i> <i>γεννωμένῳ</i>, of mother&#39;s milk, Pl.Mx.237e; <i>π</i>. <i>μ