
Add STRUCT/ARRAY/MAP nested column support to RAB (CSA-371)#2392

Merged
cmgrote merged 10 commits into main from feature/rab-complex-type-nested-columns
Apr 16, 2026

@bladata1990
Collaborator

Summary

  • Adds support for complex SQL types (STRUCT, ARRAY<STRUCT>, MAP<K,STRUCT>) in the Relational Assets Builder package
  • Child fields of complex-typed columns are emitted as separate Column assets linked via parentColumn hierarchy
  • Sub-columns are correctly excluded from the table's flat Columns list — they only appear inside the parent column's nested view

Key design decisions

Why sub-columns don't appear in the flat Columns list:
Atlan auto-creates the table_columns relationship from tableQualifiedName server-side. Sub-columns have all table/view reference fields cleared (tableQualifiedName, tableName, table, viewQualifiedName, viewName, view, materializedView) so they are NOT added to table_columns. Navigation is via the parentColumn chain.
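
As a rough illustration of the clearing step described above, the sketch below blanks the listed reference fields on a sub-column row. The field names are the ones listed in this description; representing a row as a `Map<String, String>` and the names `TABLE_REFERENCE_FIELDS` and `clearTableReferences` are assumptions made for the example, not code from ColumnXformer.kt.

```kotlin
// Hypothetical sketch: blank the table/view reference fields on a sub-column row
// so that nothing is left from which the table_columns relationship could be derived.
// Representing a row as Map<String, String> is an assumption for illustration.
val TABLE_REFERENCE_FIELDS = setOf(
    "tableQualifiedName", "tableName", "table",
    "viewQualifiedName", "viewName", "view",
    "materializedView",
)

// Returns a copy of the row with every table/view reference set to an empty string.
fun clearTableReferences(row: Map<String, String>): Map<String, String> =
    row.mapValues { (key, value) -> if (key in TABLE_REFERENCE_FIELDS) "" else value }
```

All other fields, including `parentColumnQualifiedName`, pass through unchanged, so the parentColumn chain remains the only route to the sub-column.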

QN format (matching Databricks connector):

  • STRUCT field: tableQN/parentCol/fieldName
  • ARRAY<STRUCT>: tableQN/parentCol/items/fieldName
  • MAP<K,STRUCT>: tableQN/parentCol/values/fieldName
  • Deeply nested: tableQN/parentCol/outerField/innerField
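
The qualified-name patterns above could be sketched as a small helper. Only the path shapes (including the synthetic `items` and `values` segments) come from this description; `NestedKind` and `childQualifiedName` are hypothetical names used for illustration.

```kotlin
// Hypothetical sketch of how a child column's qualifiedName could be derived.
// Only the path shapes come from the PR description; all names are illustrative.
enum class NestedKind(val syntheticSegment: String?) {
    STRUCT(null),    // fields hang directly off the parent column
    ARRAY("items"),  // ARRAY<STRUCT> fields sit under a synthetic "items" node
    MAP("values"),   // MAP<K,STRUCT> fields sit under a synthetic "values" node
}

// Joins parent QN, optional synthetic segment, and field name with "/".
fun childQualifiedName(parentQN: String, kind: NestedKind, fieldName: String): String =
    listOfNotNull(parentQN, kind.syntheticSegment, fieldName).joinToString("/")
```

Deeper nesting falls out of reusing a child's qualified name as the next parent: calling the helper with `tableQN/parentCol/outerField` as `parentQN` and `innerField` as `fieldName` gives the "deeply nested" form.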

Fields set on sub-columns:

  • parentColumnQualifiedName / parentColumn / parentColumnName
  • columnHierarchy: newline-delimited JSON ancestor entries (enables breadcrumb display)
  • columnDepthLevel: 1 for direct fields, 2+ for deeper nesting
  • nestedColumnOrder: ordinal within parent
  • subType=nested
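
A minimal sketch of how these fields might be computed from a sub-column's ancestor chain. The JSON keys inside each `columnHierarchy` entry (`qualifiedName`, `name`) are an assumption, since this description only says the entries are newline-delimited JSON; `Ancestor`, `NestedFields` and `nestedFieldsFor` are hypothetical names.

```kotlin
// Hypothetical sketch of the nested-column fields listed above.
// The JSON keys in each columnHierarchy entry are an assumption.
data class Ancestor(val qualifiedName: String, val name: String)

data class NestedFields(
    val parentColumnQualifiedName: String,
    val parentColumnName: String,
    val columnHierarchy: String,
    val columnDepthLevel: Int,
    val nestedColumnOrder: Int,
    val subType: String = "nested",
)

// Minimal escaping for the two characters that would break a JSON string here.
fun jsonEscape(s: String): String = s.replace("\\", "\\\\").replace("\"", "\\\"")

// ancestors: top-level column first, direct parent last (must be non-empty).
// order: ordinal of this field within its parent, passed through unchanged.
fun nestedFieldsFor(ancestors: List<Ancestor>, order: Int): NestedFields {
    require(ancestors.isNotEmpty()) { "a nested column needs at least one ancestor" }
    // One JSON object per ancestor, separated by newlines.
    val hierarchy = ancestors.joinToString("\n") { a ->
        """{"qualifiedName":"${jsonEscape(a.qualifiedName)}","name":"${jsonEscape(a.name)}"}"""
    }
    val parent = ancestors.last()
    return NestedFields(
        parentColumnQualifiedName = parent.qualifiedName,
        parentColumnName = parent.name,
        columnHierarchy = hierarchy,
        columnDepthLevel = ancestors.size, // 1 for direct fields, 2+ for deeper nesting
        nestedColumnOrder = order,
    )
}
```

In this sketch the depth is simply the number of ancestors, which matches the "1 for direct fields, 2+ for deeper nesting" rule above.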

Files changed

  • ComplexTypeParser.kt — NEW: bracket-aware recursive parser for STRUCT/ARRAY/MAP type strings
  • AssetXformer.kt — added nested column fields to BASE_OUTPUT_HEADERS
  • ColumnXformer.kt — overrides mapRow() to recursively emit child column rows for complex types
  • ComplexTypeParserTest.kt — NEW: unit tests for the parser
  • assets-complex.csv — NEW: test fixture with STRUCT/ARRAY/MAP types
  • build.gradle.kts (RAB + AIM) — minor build fixes
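
For readers unfamiliar with bracket-aware parsing, the following is a minimal sketch of the idea, not the code in ComplexTypeParser.kt. It assumes Hive/Databricks-style type strings such as `STRUCT<city:STRING,zip:STRING>`, and it assumes the synthetic `items` and `values` nodes do not add a depth level; `splitTopLevel`, `innerOf`, `Field` and `expand` are hypothetical names.

```kotlin
// Hypothetical sketch of bracket-aware recursive parsing of complex type strings.
// Assumes the form STRUCT<name:TYPE,...>; the real parser may accept other syntaxes.

// Splits on `delimiter` only where it is not enclosed in <...> or (...).
fun splitTopLevel(s: String, delimiter: Char): List<String> {
    val parts = mutableListOf<String>()
    val current = StringBuilder()
    var depth = 0
    for (c in s) {
        when {
            c == '<' || c == '(' -> { depth++; current.append(c) }
            c == '>' || c == ')' -> { depth--; current.append(c) }
            c == delimiter && depth == 0 -> { parts.add(current.toString().trim()); current.clear() }
            else -> current.append(c)
        }
    }
    if (current.isNotBlank()) parts.add(current.toString().trim())
    return parts
}

// Returns the text between the outermost angle brackets, or null if there are none.
fun innerOf(type: String): String? {
    val open = type.indexOf('<')
    val close = type.lastIndexOf('>')
    return if (open in 0 until close) type.substring(open + 1, close) else null
}

data class Field(val path: String, val type: String, val depth: Int)

// Recursively lists child fields, inserting the synthetic "items"/"values" path nodes.
// Assumption: synthetic nodes do not increase the depth level.
fun expand(parentPath: String, type: String, depth: Int = 1): List<Field> {
    val t = type.trim()
    val upper = t.uppercase()
    val inner = innerOf(t) ?: return emptyList() // scalar type: nothing to expand
    return when {
        upper.startsWith("STRUCT<") -> splitTopLevel(inner, ',').flatMap { f ->
            val idx = f.indexOf(':') // the first colon separates field name from type
            if (idx < 0) return@flatMap emptyList<Field>()
            val name = f.substring(0, idx).trim()
            val fieldType = f.substring(idx + 1).trim()
            val path = "$parentPath/$name"
            listOf(Field(path, fieldType, depth)) + expand(path, fieldType, depth + 1)
        }
        upper.startsWith("ARRAY<") -> expand("$parentPath/items", inner, depth)
        upper.startsWith("MAP<") ->
            // Only the value type (second top-level argument) is expanded.
            splitTopLevel(inner, ',').getOrNull(1)
                ?.let { expand("$parentPath/values", it, depth) } ?: emptyList()
        else -> emptyList()
    }
}
```

Tracking bracket depth is what keeps the comma inside `STRUCT<outer:STRUCT<inner:STRING,count:INT>,label:STRING>` from splitting the inner struct: only commas seen at depth 0 separate sibling fields.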

Test plan

  • Tested on fs3.atlan.com with Redshift connector (ci_noqn_test table)
  • plain_col, struct_col, nested_struct_col appear in table's flat Columns list (3 total)
  • city, zip appear only inside struct_col's nested view
  • outer, label appear only inside nested_struct_col's nested view
  • inner, count appear only inside outer's nested view (depth=2)
  • columnHierarchy breadcrumbs display correctly in the UI
  • Run unit tests: ./gradlew :samples:packages:relational-assets-builder:test -PpackageTests

🤖 Generated with Claude Code

Expands complex SQL types (STRUCT, ARRAY<STRUCT>, MAP<K,STRUCT>) into
child Column assets linked via parentColumn hierarchy. Sub-columns are
excluded from the table's flat Columns list by clearing all
tableQualifiedName/tableName/table/view refs — navigation is via
parentColumn chain only.

New fields on sub-columns:
- parentColumnQualifiedName / parentColumn / parentColumnName
- columnHierarchy: newline-delimited JSON ancestor entries (enables
  breadcrumb display in the UI, e.g. struct_col > city)
- columnDepthLevel: 1 for direct fields, 2+ for deeper nesting
- nestedColumnOrder: ordinal position within parent
- subType=nested

QN format (matching Databricks connector):
- STRUCT field:      tableQN/parentCol/fieldName
- ARRAY<STRUCT>:     tableQN/parentCol/items/fieldName
- MAP<K,STRUCT>:     tableQN/parentCol/values/fieldName
- Deeply nested:     tableQN/parentCol/outer/inner

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: bladata1990 <balakrishnan.r@atlan.com>
@bladata1990 bladata1990 requested a review from cmgrote as a code owner April 8, 2026 13:44
bladata1990 and others added 3 commits April 8, 2026 20:58
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: bladata1990 <balakrishnan.r@atlan.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: bladata1990 <balakrishnan.r@atlan.com>
The dependencies { include(project(":samples:packages:asset-import")) }
filter was accidentally removed from the shadowJar block, causing the fat
jar to bundle all transitive SDK dependencies instead of just asset-import.
This conflicted with the base container image jars and caused
ClassNotFoundException: com.atlan.pkg.rab.Importer at runtime.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: bladata1990 <balakrishnan.r@atlan.com>
@bladata1990
Collaborator Author

@cmgrote Gentle nudge on this: can you check and approve?

@bladata1990
Collaborator Author

Tested a sample asset in a customer environment: https://mastercard-pov.atlan.com/assets/e2b15ecb-62be-4304-b661-9cc27f5d1b18/overview

Collaborator

@cmgrote cmgrote left a comment

Need to drop the changes to the CI (.github/workflows/merge.yml in particular) for non-main branches. Want to avoid creating a permanent footprint for this that needs to be maintained.

Also, it looks like the logic should handle new headings in the CSV (for parent column qualifiedName, etc) — but the provided test file and tests don't seem to exercise this path at all. Please extend the test file and test scenario to test these additions, too.

@bladata1990
Collaborator Author

@cmgrote We have not added any new columns to the files, because the STRUCT itself is defined in the 'dataType' column. I have attached a sample file:
Struct_example_rab.csv

… tag logic

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: bladata1990 <balakrishnan.r@atlan.com>
@bladata1990 bladata1990 requested a review from cmgrote April 16, 2026 10:27
Adds ComplexTypeColumnsRABTest covering the full pipeline from CSV input
to nested child columns in Atlan, verifying parentColumnQualifiedName,
columnDepthLevel, synthetic QN nodes (/items/, /values/), and depth-2
recursion. Also updates assets-complex.csv with adminRoles/adminUsers so
the connection can be created during test setup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: bladata1990 <balakrishnan.r@atlan.com>
@bladata1990
Collaborator Author

@cmgrote I have updated the E2E tests now.

@bladata1990 bladata1990 requested a review from cmgrote April 16, 2026 11:59
cmgrote and others added 4 commits April 16, 2026 16:35
…fy CI comment

The compileJava task dependency on genCustomPkg is already covered by
sourcesJar; removing the duplicate from asset-import and RAB builds.
Also tightens the comment in custom-package-container.yml to accurately
describe the branch-tag logic (manual run, not all non-main branches).

Signed-off-by: Chris (He/Him) <cgrote@gmail.com>
Signed-off-by: Chris (He/Him) <cgrote@gmail.com>
Signed-off-by: Chris (He/Him) <cgrote@gmail.com>
@cmgrote cmgrote enabled auto-merge April 16, 2026 15:38
@cmgrote cmgrote merged commit ac9a292 into main Apr 16, 2026
7 checks passed
@cmgrote cmgrote added the packages Changes related to custom packages label Apr 18, 2026

Labels

packages Changes related to custom packages

2 participants