shema

Derive macro that generates database schema code from a Rust struct.

All parameters are specified via the shema attribute.

Struct parameters

firehose_schema - Enables firehose schema generation
firehose_partition_code - Enables code generation to access partition information
firehose_parquet_schema - Enables parquet schema generation similar to the one AWS Glue produces
parquet_code - Generates parquet code to write the struct according to the schema. This requires the parquet and serde_json crates to be added as dependencies

Field parameters

json - Specifies that the field is to be encoded as a JSON object (automatically derived for std's collections)
enumeration - Specifies that the field is to be encoded as an enumeration (depending on the database, it will be encoded as a string or an object)
index - Specifies that the field is to be indexed by the underlying database engine (e.g. declared as a partition key in the AWS Glue schema)
firehose_date_index - Specifies that the field is to be used as the timestamp within the firehose schema, which will produce year, month and day fields. The field must be of a timestamp type, e.g. time::OffsetDateTime
rename - Uses a different name for the field. The argument MUST be a string, specified as rename = "new_name"

Firehose date index

If specified, the firehose output expects the field to be serialized as an RFC3339 encoded string.

You should configure the Hive JSON deserializer with the RFC3339 formats it may encounter.

Terraform Reference: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/kinesis_firehose_delivery_stream#timestamp_formats-1
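
To illustrate what producing year, month and day fields from an RFC3339 string amounts to, here is a minimal sketch that uses only the standard library. The helper date_partitions is hypothetical and is not part of shema; how shema or Firehose actually derive these values is not shown in this README.

```rust
/// Hypothetical helper (NOT part of shema): extracts (year, month, day)
/// from the leading `YYYY-MM-DD` portion of an RFC3339 timestamp such as
/// "2024-03-15T10:20:30Z". Returns None if that portion is malformed.
pub fn date_partitions(rfc3339: &str) -> Option<(u16, u8, u8)> {
    // RFC3339 timestamps start with a fixed-width date of 10 bytes.
    let date = rfc3339.get(..10)?;

    // Positions 4 and 7 must be '-', every other position an ASCII digit.
    let well_formed = date.bytes().enumerate().all(|(idx, byte)| {
        if idx == 4 || idx == 7 {
            byte == b'-'
        } else {
            byte.is_ascii_digit()
        }
    });
    if !well_formed {
        return None;
    }

    // All bytes are ASCII at this point, so byte-range slicing is safe.
    let year: u16 = date[0..4].parse().ok()?;
    let month: u8 = date[5..7].parse().ok()?;
    let day: u8 = date[8..10].parse().ok()?;

    // Coarse range check only; calendar validity (e.g. Feb 30) is not verified.
    if !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    Some((year, month, day))
}
```

In real code a date/time crate such as time would normally do the parsing; the manual slicing here only keeps the sketch dependency-free.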

Schema output

The following constants are declared for affected structs:

SHEMA_TABLE_NAME - table name in lower case
SHEMA_FIREHOSE_SCHEMA - Firehose Glue table schema, if enabled.
SHEMA_FIREHOSE_PARQUET_SCHEMA - Parquet schema compatible with the firehose data stream, if enabled.

The following methods are defined for affected structs:

shema_firehose_partition_keys_ref - Returns tuple with references to partition keys
shema_firehose_partition_keys - Returns tuple with owned values of partition keys
shema_firehose_s3_path_prefix - Returns a fmt::Display type that writes the full path prefix for the S3 destination object
shema_is_firehose_s3_path_prefix_valid - Returns true if shema_firehose_s3_path_prefix is valid (i.e. no partition key is an empty string)
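
The path prefix and validity methods above can be pictured with a hand-written stand-in. This is only a sketch: the type S3PathPrefix, its fields, the Hive-style key=value/ layout and the zero padding are all assumptions made for illustration, and the code shema actually generates may differ.

```rust
use std::fmt;

/// Hypothetical stand-in (NOT generated by shema) for a Display type
/// that writes an S3 path prefix from borrowed partition keys.
pub struct S3PathPrefix<'a> {
    pub year: u16,
    pub month: u8,
    pub day: u8,
    pub client_id: &'a str,
    pub session_id: &'a str,
}

impl S3PathPrefix<'_> {
    /// Mirrors the documented rule: the prefix is valid only when
    /// no string partition key is empty.
    pub fn is_valid(&self) -> bool {
        !self.client_id.is_empty() && !self.session_id.is_empty()
    }
}

impl fmt::Display for S3PathPrefix<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // ASSUMPTION: Hive-style `key=value/` segments with zero-padded
        // date parts; the real layout is not specified in this README.
        write!(
            f,
            "year={:04}/month={:02}/day={:02}/client_id={}/session_id={}/",
            self.year, self.month, self.day, self.client_id, self.session_id
        )
    }
}
```

Returning a Display type rather than a String lets the caller write the prefix straight into an existing buffer without an intermediate allocation.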

The following parquet crate traits are implemented:

RecordWriter - Enables writing via SerializedFileWriter

Firehose specifics

The firehose schema expects a flat structure, so any complex struct or array must be serialized as a string.

mod prost_wkt_types {
    pub struct Struct;
}

use std::fs;
use shema::Shema;

#[derive(Shema)]
#[shema(firehose_schema, firehose_parquet_schema, firehose_partition_code)]
pub(crate) struct Analytics<'a> {
    #[shema(index, firehose_date_index)]
    ///Special field that will be transformed in firehose as year,month,day
    r#client_time: time::OffsetDateTime,
    r#server_time: time::OffsetDateTime,
    r#user_id: Option<String>,
    #[shema(index)]
    ///Index key will go into firehose's partition_keys
    r#client_id: String,
    #[shema(index)]
    r#session_id: String,
    #[shema(json)]
    r#extras: Option<prost_wkt_types::Struct>,
    #[shema(json)]
    r#props: prost_wkt_types::Struct,
    r#name: String,

    byte: i8,
    short: i16,
    int: i32,
    long: i64,
    ptr: isize,

    float: f32,
    double: f64,
    boolean: bool,
    #[shema(rename = "stroka")]
    strka: &'a str,

    array: Vec<String>,
}

assert_eq!(Analytics::SHEMA_TABLE_NAME, "analytics");
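
As a rough illustration of the flat-structure requirement, the sketch below turns an array of strings into a single JSON-encoded string using only the standard library. The helper array_to_json_string is hypothetical and is not part of shema; a real project would more likely rely on serde_json for this.

```rust
/// Hypothetical helper (NOT part of shema): encodes a slice of strings
/// as a JSON array held in one String, so the value fits a flat column.
pub fn array_to_json_string(items: &[String]) -> String {
    let mut out = String::from("[");
    for (idx, item) in items.iter().enumerate() {
        if idx > 0 {
            out.push(',');
        }
        out.push('"');
        for ch in item.chars() {
            match ch {
                // Characters that JSON requires to be escaped.
                '"' => out.push_str("\\\""),
                '\\' => out.push_str("\\\\"),
                '\n' => out.push_str("\\n"),
                '\r' => out.push_str("\\r"),
                '\t' => out.push_str("\\t"),
                // Remaining control characters use the \uXXXX form.
                c if (c as u32) < 0x20 => {
                    out.push_str(&format!("\\u{:04x}", c as u32));
                }
                c => out.push(c),
            }
        }
        out.push('"');
    }
    out.push(']');
    out
}
```

With this, a field such as array: Vec<String> would be written as one string value like ["a","b"] instead of a nested array.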
