
feat: spark sql schema translator #54

Merged
gummiorri merged 4 commits into main from 37-create-translator-for-spark-ddl
Jan 30, 2026

Conversation

@giuseppegrieco
Member

No description provided.

@giuseppegrieco giuseppegrieco linked an issue Jan 29, 2026 that may be closed by this pull request
@giuseppegrieco giuseppegrieco changed the title from "feat: add spark sql schema translator" to "feat: spark sql schema translator" on Jan 29, 2026
@giuseppegrieco
Member Author

@cubic-dev-ai review this

@cubic-dev-ai
Contributor

cubic-dev-ai Bot commented Jan 30, 2026

@cubic-dev-ai review this

@giuseppegrieco I have started the AI code review. It will take a few minutes to complete.

Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 5 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="internal/translate/sparksql/translator.go">

<violation number="1" location="internal/translate/sparksql/translator.go:57">
P2: The `inlineStruct` function can cause infinite recursion if the schema contains circular type references. Consider adding a `visited` set to detect cycles and return an error instead of crashing.</violation>
</file>
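The fix the reviewer suggests can be sketched as a `visited` set threaded through the recursion. This is a minimal illustration, not the actual `translator.go` code: the `defs` shape, the `inlineStruct` signature, and the field representation are all assumptions; the same pattern also addresses the later `databrickssql` cycle finding, where detecting `visited[name]` should return an error rather than silently continuing.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// defs is a hypothetical stand-in for the schema's $defs section:
// definition name -> field name -> type string.
type defs map[string]map[string]string

// inlineStruct renders a definition as a Spark SQL STRUCT<...> string.
// The visited set breaks cycles: revisiting a name in the current path
// returns an error instead of recursing until the stack overflows.
func inlineStruct(d defs, name string, visited map[string]bool) (string, error) {
	if visited[name] {
		return "", fmt.Errorf("circular type reference involving %q", name)
	}
	visited[name] = true
	defer delete(visited, name) // the same def may appear on sibling branches

	fields := d[name]
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic field order

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		t := fields[k]
		if _, isDef := d[t]; isDef {
			inner, err := inlineStruct(d, t, visited)
			if err != nil {
				return "", err
			}
			t = inner
		}
		parts = append(parts, k+": "+t)
	}
	return "STRUCT<" + strings.Join(parts, ", ") + ">", nil
}

func main() {
	d := defs{
		"Person":  {"name": "STRING", "address": "Address"},
		"Address": {"city": "STRING"},
	}
	s, err := inlineStruct(d, "Person", map[string]bool{})
	fmt.Println(s, err)

	// A self-referential definition now errors instead of crashing.
	cyc := defs{"Node": {"next": "Node"}}
	_, err = inlineStruct(cyc, "Node", map[string]bool{})
	fmt.Println(err)
}
```

Clearing the name on unwind (`defer delete`) keeps diamond-shaped references legal while still rejecting true cycles.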

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread: internal/translate/sparksql/translator.go (Outdated)
Member

@gummiorri gummiorri left a comment


LGTM

@gummiorri gummiorri merged commit 8573956 into main Jan 30, 2026
9 checks passed
@gummiorri gummiorri deleted the 37-create-translator-for-spark-ddl branch January 30, 2026 15:45
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment


6 issues found across 41 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="internal/translate/databrickssql/databrickssql.go.tmpl">

<violation number="1">
P2: Databricks SQL identifiers are emitted without quoting/escaping. Schema property names (and port names) can include characters or reserved words that are invalid as bare identifiers, which will make the generated `CREATE TABLE` fail. Consider quoting identifiers with backticks and escaping any backticks in the name.</violation>
</file>
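The quoting fix the reviewer describes is small: wrap every identifier in backticks and double any embedded backtick, which is how Databricks SQL escapes them. A sketch of such a helper (the name `quoteIdent` is illustrative, not the project's API):

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent wraps a Databricks SQL identifier in backticks, doubling any
// embedded backticks, so reserved words ("order") and names with spaces
// or punctuation remain valid in generated DDL.
func quoteIdent(name string) string {
	return "`" + strings.ReplaceAll(name, "`", "``") + "`"
}

func main() {
	for _, n := range []string{"order", "my column", "weird`name"} {
		fmt.Printf("CREATE TABLE t (%s STRING);\n", quoteIdent(n))
	}
}
```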

<file name="cmd/daco/internal/app.go">

<violation number="1" location="cmd/daco/internal/app.go:35">
P2: The "spark-scala" registry entry is wired to the PySpark translator, so Spark Scala requests will be translated using PySpark logic. This likely produces incorrect output. Map it to the Scala translator instead.</violation>
</file>

<file name="internal/translate/gotypes/resolver.go">

<violation number="1">
P2: Non-acronym words are not properly lowercased, causing incorrect PascalCase conversion when input contains uppercase characters. For example, `"HTTP_SERVER"` produces `"HTTPSERVER"` instead of `"HTTPServer"`.</violation>
</file>
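The resolver bug the reviewer points at comes from title-casing words without lowercasing them first. A minimal sketch of the corrected conversion, assuming a snake/SCREAMING_SNAKE input split on underscores and a small acronym allowlist (the acronym set and function name are illustrative; the real `resolver.go` may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// acronyms kept fully uppercase in PascalCase output (illustrative set).
var acronyms = map[string]bool{"HTTP": true, "ID": true, "URL": true, "API": true}

// toPascalCase converts snake_case or SCREAMING_SNAKE to PascalCase.
// Each non-acronym word is lowercased before capitalizing its first
// letter, so "HTTP_SERVER" becomes "HTTPServer", not "HTTPSERVER".
// ASCII-only for brevity.
func toPascalCase(s string) string {
	var b strings.Builder
	for _, w := range strings.Split(s, "_") {
		if w == "" {
			continue
		}
		if upper := strings.ToUpper(w); acronyms[upper] {
			b.WriteString(upper)
			continue
		}
		lower := strings.ToLower(w)
		b.WriteString(strings.ToUpper(lower[:1]) + lower[1:])
	}
	return b.String()
}

func main() {
	fmt.Println(toPascalCase("HTTP_SERVER")) // HTTPServer
	fmt.Println(toPascalCase("user_id"))     // UserID
}
```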

<file name="internal/translate/databrickssql/translator.go">

<violation number="1">
P2: Circular type references silently produce invalid SQL output. When `visited[name]` is true (indicating a cycle), the code `continue`s without modifying the type or returning an error. This leaves the field's Type as the definition name (e.g., "Person"), which is not a valid Databricks SQL type. Consider returning an error when a circular reference is detected.</violation>
</file>

<file name="internal/translate/prepare.go">

<violation number="1">
P2: ToSnakeCase doesn’t handle camelCase boundaries, so strings like "myFieldName" become "myfieldname" instead of "my_field_name", contradicting the function’s documented snake_case contract.</violation>
</file>
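Handling camelCase boundaries means inserting an underscore wherever a lowercase-to-uppercase transition (or the start of a new word inside an uppercase run) occurs before lowercasing. A sketch of that rule, not the project's actual `ToSnakeCase`:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// toSnakeCase inserts underscores at word boundaries before lowercasing:
// a boundary is an uppercase rune preceded by a lowercase rune
// ("myField" -> "my_field") or starting the last word of an uppercase
// run ("HTTPServer" -> "http_server").
func toSnakeCase(s string) string {
	var b strings.Builder
	runes := []rune(s)
	for i, r := range runes {
		if unicode.IsUpper(r) && i > 0 &&
			(unicode.IsLower(runes[i-1]) ||
				(i+1 < len(runes) && unicode.IsLower(runes[i+1]))) {
			b.WriteRune('_')
		}
		b.WriteRune(unicode.ToLower(r))
	}
	return b.String()
}

func main() {
	fmt.Println(toSnakeCase("myFieldName")) // my_field_name
	fmt.Println(toSnakeCase("HTTPServer"))  // http_server
}
```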

<file name="internal/translate/databrickspyspark/databrickspyspark.go.tmpl">

<violation number="1">
P2: Definition names are emitted as Python identifiers without sanitizing `$defs` names, so schemas with hyphens/spaces/digit-leading names will produce invalid Python code. Consider normalizing `$defs` names (e.g., ToSnakeCase) in the resolver before rendering the template.</violation>
</file>
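Normalizing `$defs` names before rendering could look like the helper below: replace characters that are invalid in a Python identifier with underscores and guard against a leading digit. The function name and exact policy are assumptions for illustration; the reviewer's suggestion of reusing `ToSnakeCase` in the resolver would achieve the same end.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// sanitizePyIdent turns an arbitrary $defs name into a valid Python
// identifier: letters/digits are lowercased, every other character
// becomes an underscore, and a leading digit gets an underscore prefix.
func sanitizePyIdent(name string) string {
	var b strings.Builder
	for _, r := range name {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			b.WriteRune(unicode.ToLower(r))
		} else {
			b.WriteRune('_')
		}
	}
	out := b.String()
	if out == "" || unicode.IsDigit(rune(out[0])) {
		out = "_" + out
	}
	return out
}

func main() {
	fmt.Println(sanitizePyIdent("my-def name")) // my_def_name
	fmt.Println(sanitizePyIdent("1stSchema"))   // _1stschema
}
```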

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread: cmd/daco/internal/app.go
@@ -34,6 +35,7 @@ func registerTranslators() translate.Register {
translators["spark-scala"] = &pyspark.Translator{}
Contributor


P2: The "spark-scala" registry entry is wired to the PySpark translator, so Spark Scala requests will be translated using PySpark logic. This likely produces incorrect output. Map it to the Scala translator instead.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At cmd/daco/internal/app.go, line 35:

<comment>The "spark-scala" registry entry is wired to the PySpark translator, so Spark Scala requests will be translated using PySpark logic. This likely produces incorrect output. Map it to the Scala translator instead.</comment>

<file context>
@@ -9,13 +9,32 @@ import (
+	translators["python"] = &python.Translator{}
+	translators["databricks-sql"] = &databrickssql.Translator{}
+	translators["scala"] = &scala.Translator{}
+	translators["spark-scala"] = &pyspark.Translator{}
+	translators["databricks-scala"] = &databricksscala.Translator{}
+	translators["protobuf"] = &protobuf.Translator{}
</file context>
Suggested change
translators["spark-scala"] = &pyspark.Translator{}
translators["spark-scala"] = &scala.Translator{}



Development

Successfully merging this pull request may close these issues.

Create translator for spark-DDL

2 participants