Introduce Schema Inference to simplify graph definition #6

@metalshanked

Description

Is your feature request related to a problem? Please describe.

Currently, defining a graph schema with client.set_schema() is a manual and verbose process. The user must construct a large, deeply nested dictionary that explicitly maps every vertex, edge, attribute, and ID from the source tables.

The current process has several drawbacks:

  1. High Initial Friction: It requires developers to manually inspect their source data schemas (e.g., in Databricks or using another tool) and then meticulously transcribe every table name, column name, and data type into the PuppyGraph JSON format.
  2. Error-Prone: This manual transcription is highly susceptible to typos in field names (from_field), attribute names, or data types, which can lead to frustrating debugging sessions.
  3. Difficult to Maintain: If a column is added or renamed in the source Delta table, the developer must find and manually update the corresponding entry in the large schema dictionary. This brittleness can make schema evolution challenging.
  4. Cognitive Overhead: The current approach forces the user to focus on low-level mapping details rather than the high-level conceptual model of their graph (i.e., "this table is a node, this other table defines the relationship between them").

While querying itself is "Zero-ETL," the initial setup still feels like a manual data-mapping task.

Describe the solution you'd like

I propose the introduction of a schema inference mechanism. Since the client is already configured with connection details to a data catalog (e.g., a Unity Catalog), it should be able to use that connection to automatically inspect the schemas of the underlying tables.

This could be exposed through a more intuitive, high-level API, such as a SchemaBuilder class. This would allow users to define their graph conceptually, while the builder handles the low-level details of attribute mapping.
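To make the idea concrete, here is a minimal sketch of how attribute inference could work once the client can read table metadata from the catalog (e.g., via a `DESCRIBE TABLE` call or an `information_schema.columns` query). The `TYPE_MAP` contents, the `infer_attributes` helper, and the column-tuple input format are all illustrative assumptions, not existing client API:

```python
# Hypothetical sketch: derive PuppyGraph attribute mappings from column
# metadata fetched from the connected catalog. The type map and helper
# name are assumptions for illustration only.
TYPE_MAP = {
    "string": "String",
    "int": "Integer",
    "bigint": "Long",
    "double": "Double",
    "boolean": "Boolean",
}

def infer_attributes(columns, id_column):
    """columns: list of (name, sql_type) pairs, e.g. parsed from
    `DESCRIBE TABLE` output or information_schema.columns."""
    return [
        {
            "name": name,
            "from_field": name,
            "type": TYPE_MAP.get(sql_type.lower(), "String"),
        }
        for name, sql_type in columns
        if name != id_column  # the ID column is mapped separately
    ]

# Example: columns of the `movies` table as reported by the catalog.
attrs = infer_attributes(
    [("movie_id", "string"), ("title", "string"), ("release_year", "int")],
    id_column="movie_id",
)
```

This removes the transcription step entirely: the user names the ID column, and every other column becomes an attribute with its type translated automatically.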

Example of the Proposed API

Here's a comparison of the current approach versus how it could look with a SchemaBuilder.

Current Approach (Manual & Verbose):

```python
# User has to write this entire dictionary by hand
client.set_schema({
    "catalogs": [...],  # a lot of boilerplate
    "vertices": [
        {
            "table_source": {"catalog_name": "imdb_catalog", "schema_name": "public", "table_name": "movies"},
            "label": "Movie",
            "attributes": [
                {"name": "title", "from_field": "title", "type": "String"},
                {"name": "release_year", "from_field": "release_year", "type": "Integer"},
            ],
            "id": [{"name": "movie_id", "from_field": "movie_id", "type": "String"}]
        },
        # ... and so on for Actors ...
    ],
    "edges": [
        {
            "table_source": {"catalog_name": "imdb_catalog", "schema_name": "public", "table_name": "acted_in"},
            "label": "ACTED_IN",
            "from_label": "Actor",
            "to_label": "Movie",
            "from_id": [{"name": "actor_id", "from_field": "actor_id", "type": "String"}],
            "to_id": [{"name": "movie_id", "from_field": "movie_id", "type": "String"}],
            # ... etc ...
        }
    ]
})
```
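Proposed Approach (Conceptual & Concise):

By contrast, the SchemaBuilder API could look roughly like the sketch below. Every name here (SchemaBuilder, add_vertex, add_edge, build) is a proposal, not existing client API; the stub implementation only shows how conceptual calls could expand into the verbose dictionary above, with attribute inference left as a placeholder:

```python
# Hypothetical SchemaBuilder sketch. In a real implementation,
# add_vertex/add_edge would inspect the table in the connected
# catalog and infer attributes and types automatically.
class SchemaBuilder:
    def __init__(self, catalog_name, schema_name):
        self.catalog = catalog_name
        self.schema = schema_name
        self.vertices = []
        self.edges = []

    def _source(self, table):
        return {"catalog_name": self.catalog,
                "schema_name": self.schema,
                "table_name": table}

    def add_vertex(self, label, table, id_column):
        self.vertices.append({
            "table_source": self._source(table),
            "label": label,
            # attributes would be inferred from the table schema here
            "id": [{"name": id_column, "from_field": id_column,
                    "type": "String"}],
        })
        return self

    def add_edge(self, label, table, from_label, to_label, from_id, to_id):
        self.edges.append({
            "table_source": self._source(table),
            "label": label,
            "from_label": from_label,
            "to_label": to_label,
            "from_id": [{"name": from_id, "from_field": from_id,
                         "type": "String"}],
            "to_id": [{"name": to_id, "from_field": to_id,
                       "type": "String"}],
        })
        return self

    def build(self):
        return {"vertices": self.vertices, "edges": self.edges}


# The user states the conceptual model; the builder fills in the mapping.
schema = (
    SchemaBuilder("imdb_catalog", "public")
    .add_vertex("Movie", table="movies", id_column="movie_id")
    .add_vertex("Actor", table="actors", id_column="actor_id")
    .add_edge("ACTED_IN", table="acted_in",
              from_label="Actor", to_label="Movie",
              from_id="actor_id", to_id="movie_id")
    .build()
)
```

The result has the same shape as the hand-written dictionary, but the user only states labels, tables, and key columns; everything else (column lists, types, boilerplate) is the builder's job.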
