JdbcIO should make the 'COMMENT' metadata available as Description #20671

@damccorm

Description

Currently, a PCollection<Row> created with JdbcIO.readRows() does not expose:

 - a column's COMMENT or REMARKS metadata as the Description of the corresponding Beam field

 - a table's COMMENT or REMARKS metadata as an appropriately named Beam schema Option

Making this metadata available would strongly benefit IDEs that provide in-line help when building Beam pipelines, as well as semi-automated data labelling tools that use metadata to infer data properties.

Sketch of the proposed changes in JdbcIO: 


@FunctionalInterface
interface BeamFieldConverter extends Serializable {
    Schema.Field create(int index, ResultSetMetaData md, DatabaseMetaData dmd) throws SQLException;
}

private static String getComment(int index, ResultSetMetaData md, DatabaseMetaData dmd) throws SQLException {
    String comment = null;

    // DatabaseMetaDataUsingInfoSchema is specific to MySQL Connector/J.
    if (dmd instanceof DatabaseMetaDataUsingInfoSchema) {
        // try-with-resources closes the ResultSet once the lookup is done.
        try (ResultSet rs = dmd.getColumns(md.getCatalogName(index), md.getSchemaName(index),
                md.getTableName(index), md.getColumnName(index))) {
            // getColumns matches by pattern, so confirm the row is the exact table and column.
            if (rs.next()
                    && md.getTableName(index).equals(rs.getString("TABLE_NAME"))
                    && md.getColumnName(index).equals(rs.getString("COLUMN_NAME"))) {
                comment = rs.getString("REMARKS");
            }
        }
    }
    return comment;
}
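
The sketch above covers only column comments. For the table-level COMMENT or REMARKS mentioned earlier, a similar lookup could go through the standard JDBC `DatabaseMetaData.getTables` call, which also reports a `REMARKS` column. The following is a minimal sketch using only `java.sql`; the class name `TableRemarks` and the method `getTableComment` are hypothetical and not part of JdbcIO, and whether a driver actually fills in `REMARKS` varies by database and connection settings.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, not part of JdbcIO: reads a table's REMARKS through plain JDBC.
final class TableRemarks {
    private TableRemarks() {}

    /** Returns the REMARKS value for the given table, or null when none is reported. */
    static String getTableComment(
            DatabaseMetaData dmd, String catalog, String schema, String table)
            throws SQLException {
        // getTables treats schema and table as LIKE patterns, so the returned
        // TABLE_NAME is re-checked for an exact match before REMARKS is used.
        try (ResultSet rs = dmd.getTables(catalog, schema, table, null)) {
            while (rs.next()) {
                if (table.equals(rs.getString("TABLE_NAME"))) {
                    String remarks = rs.getString("REMARKS");
                    // Some drivers report an empty string instead of null.
                    return (remarks == null || remarks.isEmpty()) ? null : remarks;
                }
            }
        }
        return null;
    }
}
```

A caller would presumably pass `md.getCatalogName(index)`, `md.getSchemaName(index)` and `md.getTableName(index)` from the `ResultSetMetaData`, as the column sketch does, and store the result in a schema Option under whatever name is eventually agreed on.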

Imported from Jira BEAM-10946. Original Jira may contain additional context.
Reported by: ylandrin.
