AWS Glue Create Table Action

GitHub Action to create or update AWS Glue Data Catalog tables using JSON metadata.

Features

Create new Glue tables or update existing ones
Accept full table metadata as JSON (TableInput format)
Automatic table existence detection
Support for cross-account catalog access
Comprehensive error reporting

Usage

- name: Create Glue table
  uses: predictr-io/aws-glue-create-table@v0
  with:
    database-name: 'my_database'
    table-name: 'my_table'
    table-input: |
      {
        "Name": "my_table",
        "StorageDescriptor": {
          "Columns": [
            {"Name": "id", "Type": "bigint"},
            {"Name": "name", "Type": "string"},
            {"Name": "timestamp", "Type": "timestamp"}
          ],
          "Location": "s3://my-bucket/my-data/",
          "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
          "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
          "SerdeInfo": {
            "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
            "Parameters": {
              "field.delim": ","
            }
          }
        },
        "PartitionKeys": [
          {"Name": "year", "Type": "string"},
          {"Name": "month", "Type": "string"}
        ]
      }

Authentication

This action requires AWS credentials to be configured. Run the official `aws-actions/configure-aws-credentials` action in an earlier step of the same job:

- uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
    aws-region: us-east-1

- uses: predictr-io/aws-glue-create-table@v0
  with:
    database-name: 'my_database'
    table-name: 'my_table'
    table-input: '{"Name": "my_table", ...}'
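
When `role-to-assume` is used with GitHub's OIDC provider rather than stored access keys, the job also needs the `id-token: write` permission. The following is a sketch of a complete job under that assumption. The job name and role ARN are placeholders, and the Glue permissions named in the comment are an assumption based on the create/update behavior described above, not something this README states.

```yaml
jobs:
  create-table:                 # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      id-token: write           # lets the job request an OIDC token for role assumption
      contents: read            # only needed if the job also checks out the repository
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Placeholder role. Assumption: it trusts GitHub's OIDC provider and
          # allows Glue table calls such as glue:GetTable, glue:CreateTable
          # and glue:UpdateTable on the target database and table.
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
          aws-region: us-east-1

      - uses: predictr-io/aws-glue-create-table@v0
        with:
          database-name: 'my_database'
          table-name: 'my_table'
          table-input: |
            {
              "Name": "my_table",
              "StorageDescriptor": {
                "Columns": [{"Name": "id", "Type": "bigint"}],
                "Location": "s3://my-bucket/data/"
              }
            }
```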

Inputs

| Input | Required | Default | Description |
| --- | --- | --- | --- |
| `database-name` | Yes | - | Name of the Glue database |
| `table-name` | Yes | - | Name of the table to create/update |
| `table-input` | Yes | - | Table metadata as JSON (TableInput object) |
| `catalog-id` | No | current account | AWS account ID for cross-account access |
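
For cross-account access, `catalog-id` can be set alongside the required inputs. This is a minimal sketch: the account ID and database name are placeholders, and it assumes the credentials configured for the job are allowed to write to the other account's Data Catalog.

```yaml
- uses: predictr-io/aws-glue-create-table@v0
  with:
    database-name: 'shared_database'   # hypothetical database in the other account
    table-name: 'my_table'
    # Placeholder ID of the account that owns the catalog.
    # Quoted so YAML keeps it as a string, including any leading zeros.
    catalog-id: '210987654321'
    table-input: |
      {
        "Name": "my_table",
        "StorageDescriptor": {
          "Columns": [{"Name": "id", "Type": "bigint"}],
          "Location": "s3://my-bucket/data/"
        }
      }
```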

Outputs

| Output | Description |
| --- | --- |
| `table-name` | Name of the created/updated table |
| `database-name` | Name of the database containing the table |
| `table-arn` | ARN of the created/updated table |
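
Later steps can read these outputs through the `steps` context once the action step has an `id`. A short sketch, where the step id `glue_table` and the environment variable names are hypothetical:

```yaml
- name: Create Glue table
  id: glue_table                 # hypothetical id, used to reference the outputs below
  uses: predictr-io/aws-glue-create-table@v0
  with:
    database-name: 'my_database'
    table-name: 'my_table'
    table-input: |
      {
        "Name": "my_table",
        "StorageDescriptor": {
          "Columns": [{"Name": "id", "Type": "bigint"}],
          "Location": "s3://my-bucket/data/"
        }
      }

- name: Show result
  # Outputs are passed through env rather than interpolated into the script
  env:
    TABLE_ARN: ${{ steps.glue_table.outputs.table-arn }}
    DATABASE_NAME: ${{ steps.glue_table.outputs.database-name }}
  run: echo "Created or updated $TABLE_ARN in database $DATABASE_NAME"
```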

Table Input Format

The `table-input` value must be a valid JSON object matching the AWS Glue TableInput structure. See the AWS Glue TableInput documentation for full details.
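
It can help to check the metadata before the action runs. The step below is a minimal sketch, assuming the metadata is stored in a hypothetical `table.json` file and that `jq` is available on the runner (it is preinstalled on GitHub-hosted Ubuntu runners). It only checks that the file parses as JSON and has a string `Name`; it does not validate the full TableInput structure.

```yaml
- name: Validate table metadata
  # jq exits non-zero if table.json is not valid JSON,
  # or if the top-level "Name" field is missing or not a string
  run: jq -e '.Name | type == "string"' table.json
```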

Minimal Example (CSV in S3)

{
  "Name": "my_table",
  "StorageDescriptor": {
    "Columns": [
      {"Name": "col1", "Type": "string"},
      {"Name": "col2", "Type": "int"}
    ],
    "Location": "s3://my-bucket/data/",
    "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
    "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
    "SerdeInfo": {
      "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
      "Parameters": {"field.delim": ","}
    }
  }
}

Parquet Example

{
  "Name": "parquet_table",
  "StorageDescriptor": {
    "Columns": [
      {"Name": "id", "Type": "bigint"},
      {"Name": "value", "Type": "double"}
    ],
    "Location": "s3://my-bucket/parquet-data/",
    "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
    "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
    "SerdeInfo": {
      "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
    }
  }
}

Examples

Create partitioned table

- uses: predictr-io/aws-glue-create-table@v0
  with:
    database-name: 'analytics'
    table-name: 'events'
    table-input: |
      {
        "Name": "events",
        "StorageDescriptor": {
          "Columns": [
            {"Name": "event_id", "Type": "string"},
            {"Name": "user_id", "Type": "string"},
            {"Name": "event_time", "Type": "timestamp"}
          ],
          "Location": "s3://my-bucket/events/",
          "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
          "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
          "SerdeInfo": {
            "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
          }
        },
        "PartitionKeys": [
          {"Name": "date", "Type": "string"}
        ]
      }

Use with environment variable

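- name: Prepare table metadata
  id: prep
  run: |
    # Write the table metadata to a file
    cat > table.json <<'EOF'
    {
      "Name": "my_table",
      "StorageDescriptor": {
        "Columns": [{"Name": "id", "Type": "bigint"}],
        "Location": "s3://my-bucket/data/"
      }
    }
    EOF
    # Publish the file contents as the multi-line step output "table_json"
    {
      echo "table_json<<TABLE_JSON_EOF"
      cat table.json
      echo "TABLE_JSON_EOF"
    } >> "$GITHUB_OUTPUT"

- uses: predictr-io/aws-glue-create-table@v0
  with:
    database-name: 'mydb'
    table-name: 'my_table'
    table-input: ${{ steps.prep.outputs.table_json }}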
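
When the metadata is static, it can also be kept in a job-level environment variable and passed through the `env` context. A sketch under that assumption, with a hypothetical job name and variable name `TABLE_JSON`; the credentials step is omitted for brevity:

```yaml
jobs:
  create-table:                  # hypothetical job name
    runs-on: ubuntu-latest
    env:
      # Hypothetical variable name; it only has to match the reference below
      TABLE_JSON: |
        {
          "Name": "my_table",
          "StorageDescriptor": {
            "Columns": [{"Name": "id", "Type": "bigint"}],
            "Location": "s3://my-bucket/data/"
          }
        }
    steps:
      - uses: predictr-io/aws-glue-create-table@v0
        with:
          database-name: 'mydb'
          table-name: 'my_table'
          table-input: ${{ env.TABLE_JSON }}
```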

License

MIT
