# Query Construction: Text-to-Cypher

![text-to-cypher](../images/images-text-to-cypher.png)

**Text-to-Cypher** is a technique that translates natural language queries into Cypher, the query language used for interacting with graph databases like Neo4j. This allows users to retrieve and manipulate data stored in graph structures without needing to write complex Cypher queries themselves.

**How It Works:**

1. **User Input:**
   - The user poses a question or request in natural language.
   - *Example:*
     - "Find all colleagues of John Doe who have worked on project X."

2. **Natural Language Processing (NLP):**
   - The system analyzes the input to understand its intent and identify key components like entities, relationships, and conditions.
   - *Example Analysis:*
     - Entity: John Doe
     - Relationship: Colleagues
     - Project: Project X

3. **Cypher Query Generation:**
   - Based on the analysis, the system constructs a Cypher query that reflects the user's request.
   - *Generated Cypher Query:*
     - ```cypher
       MATCH (p:Person {name: 'John Doe'})-[:WORKED_ON]->(:Project {name: 'Project X'})<-[:WORKED_ON]-(colleague)
       RETURN colleague.name;
       ```

4. **Execution and Results:**
   - The Cypher query is executed against the graph database, and the results are returned to the user in an understandable format.
   - *Example Output:*
     - Jane Smith
     - Bob Johnson
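
The steps above can be sketched in a few lines of Python. This is a toy, rule-based stand-in for the NLP step: a single regular expression that only understands questions shaped like the example, not how a production system would parse language. The names `QUESTION_PATTERN`, `parse_question`, and `build_cypher` are hypothetical. The sketch emits a parameterized query (`$person`, `$project`) so that user-supplied names travel as parameters instead of being spliced into the query text; execution against a database is left out.

```python
import re

# Hypothetical pattern: only recognizes questions shaped like the example above.
QUESTION_PATTERN = re.compile(
    r"find all colleagues of (?P<person>.+?) "
    r"who have worked on project (?P<project>.+?)\.?$",
    re.IGNORECASE,
)


def parse_question(question: str) -> dict:
    """Step 2: extract the person and project names from the question."""
    match = QUESTION_PATTERN.match(question.strip())
    if match is None:
        raise ValueError(f"Unsupported question: {question!r}")
    return {"person": match["person"], "project": "Project " + match["project"]}


def build_cypher(entities: dict) -> tuple[str, dict]:
    """Step 3: return a parameterized Cypher query and its parameters."""
    query = (
        "MATCH (p:Person {name: $person})-[:WORKED_ON]->"
        "(:Project {name: $project})<-[:WORKED_ON]-(colleague) "
        "RETURN colleague.name"
    )
    return query, entities


question = "Find all colleagues of John Doe who have worked on project X."
query, params = build_cypher(parse_question(question))
print(query)
print(params)
```

Any question that does not fit the pattern raises `ValueError`, which makes the limits of a rule-based approach easy to see.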

**Benefits:**

- **Accessibility:** Enables users without knowledge of Cypher to interact with graph databases.
- **Efficiency:** Speeds up data retrieval by automating query formulation.
- **User-Friendly:** Allows for intuitive interaction with complex data structures.

**Advanced Applications:**

- **Knowledge Graphs:** Facilitates querying of interconnected data in knowledge graphs, enhancing data exploration and discovery.
- **Natural Language Interfaces:** Powers conversational agents and chatbots that provide insights from graph databases based on user queries.

**Challenges:**

- **Complex Queries:** Handling intricate queries involving multiple relationships and conditions requires sophisticated parsing and understanding.
- **Ambiguity:** Interpreting ambiguous natural language inputs accurately demands advanced NLP capabilities.
- **Schema Awareness:** The system must understand the database schema to generate correct and efficient Cypher queries.
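
One way to approach the schema-awareness challenge is to check a generated query against the known schema before running it. The following is a minimal sketch, assuming a hypothetical `SCHEMA` dictionary for the running example and a hypothetical helper `find_schema_violations`. It uses regular expressions as a heuristic rather than a real Cypher parser, so it does not handle cases such as multiple labels on one node, `|` alternatives in relationship types, or parentheses inside string literals.

```python
import re

# Hypothetical schema for the running example.
SCHEMA = {
    "labels": {"Person", "Project"},
    "relationship_types": {"WORKED_ON", "FRIEND"},
}


def find_schema_violations(query: str, schema: dict) -> list[str]:
    """Return labels or relationship types used in the query but absent from the schema."""
    # Node labels appear as (var:Label ...) or (:Label ...).
    labels = set(re.findall(r"\(\s*\w*\s*:\s*(\w+)", query))
    # Relationship types appear as [var:TYPE] or [:TYPE].
    rel_types = set(re.findall(r"\[\s*\w*\s*:\s*(\w+)", query))

    problems = [
        f"unknown label: {name}" for name in sorted(labels - schema["labels"])
    ]
    problems += [
        f"unknown relationship type: {name}"
        for name in sorted(rel_types - schema["relationship_types"])
    ]
    return problems


good = (
    "MATCH (p:Person {name: 'John Doe'})-[:WORKED_ON]->"
    "(:Project {name: 'Project X'})<-[:WORKED_ON]-(colleague) "
    "RETURN colleague.name"
)
bad = "MATCH (e:Employee)-[:COLLEAGUE_OF]-(c:Person) RETURN c.name"

print(find_schema_violations(good, SCHEMA))
print(find_schema_violations(bad, SCHEMA))
```

A query that fails the check could be rejected or sent back to the generator with the list of problems attached.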

**Recent Advances:**

Recent advancements have focused on improving the accuracy and efficiency of Text-to-Cypher systems:

- **Large Language Models (LLMs):** LLMs that are given the database schema, and optionally a few example question-query pairs, can generate Cypher directly from natural language input, reducing the need for hand-built parsing rules.

- **Synthetic Data Generation:** Frameworks like SynthCypher have been developed to create synthetic datasets for training Text-to-Cypher models, addressing the scarcity of annotated data. 
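
An LLM-based translator typically comes down to building a prompt that contains the schema and a few examples, then sending it to a model. The sketch below shows that shape only. `SCHEMA_TEXT`, `FEW_SHOT`, `build_prompt`, and `text_to_cypher` are hypothetical names, the prompt wording is an assumption, and `fake_llm` is a stand-in so the code runs offline; a real model client would be passed in its place.

```python
from typing import Callable

# Hypothetical schema description for the running example.
SCHEMA_TEXT = (
    "Node labels: Person(name), Project(name)\n"
    "Relationships: (:Person)-[:WORKED_ON]->(:Project), "
    "(:Person)-[:FRIEND]-(:Person)"
)

# Example question-query pairs shown to the model (few-shot prompting).
FEW_SHOT = [
    (
        "Who are the mutual friends of Alice and Bob?",
        "MATCH (a:Person {name: 'Alice'})-[:FRIEND]-(mutual)"
        "-[:FRIEND]-(b:Person {name: 'Bob'}) RETURN mutual.name",
    ),
]


def build_prompt(question: str) -> str:
    """Combine instructions, schema, examples, and the new question."""
    examples = "\n".join(f"Question: {q}\nCypher: {c}" for q, c in FEW_SHOT)
    return (
        "Translate the question into a Cypher query. "
        "Use only the schema below.\n\n"
        f"Schema:\n{SCHEMA_TEXT}\n\n"
        f"{examples}\n\n"
        f"Question: {question}\nCypher:"
    )


def text_to_cypher(question: str, llm: Callable[[str], str]) -> str:
    """Send the prompt to any text-completion callable and return its trimmed reply."""
    return llm(build_prompt(question)).strip()


def fake_llm(prompt: str) -> str:
    """Stand-in model that always returns the same query (assumption, offline only)."""
    return " MATCH (p:Person) RETURN p.name \n"


print(build_prompt("List every person."))
print(text_to_cypher("List every person.", fake_llm))
```

Keeping the model behind a plain callable makes it easy to swap providers or to test the prompt logic without network access.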

**Example in Practice:**

Imagine a researcher exploring a social network graph database who wants to find connections without writing Cypher queries. They can simply ask:

- "Who are the mutual friends of Alice and Bob?"

The system translates this into:

- ```cypher
  MATCH (a:Person {name: 'Alice'})-[:FRIEND]-(mutual)-[:FRIEND]-(b:Person {name: 'Bob'})
  RETURN mutual.name;
  ```

This query retrieves the names of individuals who are friends with both Alice and Bob, providing the researcher with the desired information without requiring them to write Cypher code.
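
To make the semantics of this query concrete without a running database, the same result can be computed in plain Python as a set intersection. This is an illustration only: the `FRIENDSHIPS` edge list and the helper names are hypothetical, and `FRIEND` is treated as undirected, matching the direction-less pattern in the query.

```python
# Hypothetical friendship edges; each pair is an undirected FRIEND relationship.
FRIENDSHIPS = [
    ("Alice", "Carol"),
    ("Alice", "Dave"),
    ("Bob", "Carol"),
    ("Bob", "Dave"),
    ("Alice", "Bob"),
    ("Bob", "Erin"),
]


def friends_of(person, edges):
    """Return everyone connected to `person` by a FRIEND edge, in either direction."""
    return {b if a == person else a for a, b in edges if person in (a, b)}


def mutual_friends(first, second, edges):
    """Return the sorted names of people who are friends with both inputs."""
    return sorted(friends_of(first, edges) & friends_of(second, edges))


print(mutual_friends("Alice", "Bob", FRIENDSHIPS))  # ['Carol', 'Dave']
```

Alice and Bob are direct friends in this data, yet neither appears in the result, because a mutual friend must be adjacent to both of them.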

By implementing Text-to-Cypher techniques, systems can make graph databases more accessible, allowing users to interact with complex data structures through simple natural language queries. 

## Setup

```python
%run "../Z - Common/setup.ipynb"

!pip install -qU langgraph
```

Text-to-Cypher is a natural language processing (NLP) task whose goal is to automatically generate graph queries from natural language text. As with text-to-SQL, the task involves converting the text input into a structured representation and then using that representation to generate a semantically correct Cypher query that can be executed on a graph database.

> **TODO:** A text-to-Cypher example requires a Neo4j installation. Figure out the simplest way to set this up.