-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Server: keep Postgres static schema #371
Comments
Can you clarify what Postgres statics schema DDL is? |
Like the |
We don't have a plan to store SQL static files inside Iglu server - the maintenance burden would be enormous. The linked tickets explain our plan around being able to generate SQL "files" on Iglu server endpoints based on schemas in the server. |
Yes but the links in the tickets are not working. Can I ask for the working links so that I can review the plan details, please? |
Sorry @szareiangm, I afraid there's no working links for these tickets. And unfortunately we don't have any ETA on full Postgres support yet (though many open source users are interested in it). But PRs for schema-ddl can significantly speed-up this process. |
Thanks @chuwy . After spending some time, I understood that Postgres and Redshift have similarities, and the
I will do it in steps with separate pull requests with separate issues. |
Hey @szareiangm, Thanks for your interest in this and sorry for slightly delayed response. I have few points here: I like the plan overall, but little bit hazy on first step. Although abstracting over SQL DDLs sounds like a good idea in theory, I'm not 100% sure that this is a good idea in practice. Problem is that SQL DDLs usually differ in a most important for us nuance: data type definitions. As an example, you can take a look at Snowflake AST which is quite different. And we definitely want have data types as sealed hierarchy. Same for columns definition: column options can be very different. I'm not saying that I disagree with this step, but at least if we'll take this path we need be sure that we don't loose any preciseness on DDLs. E.g. we cannot have definitions like: case class Column(name: String, sqlType: SqlType, sqlOptions: SqlOptions) As its very lossy type, we won't be able to inspect it as pretty much anything will be able to extend case class Column[T <: SqlDdl](name: String, sql: SqlType[T], sqlOptions: SqlOptions[T]) In summary, this is a very hard trade-off between boilerplate (until now, I was leaning towards boilerplate) and possibly leaky abstractions. One more very important point is that if we'll implement your roadmap, we'll still be missing one very important building block in Postgres support. Namely, we don't have a Postgres support in RDB Shredder (snowplow/snowplow-rdb-loader#47). |
@chuwy Good morning from Toronto! About Postgres support in RBD Shredder, I gotta spend some time and understand the dynamics of it. I am not sure if that blocks this plan. |
Hey @szareiangm!
Sure, please feel free to ask questions. In short: enriched events is our canonical TSV, containing all common fields (e.g.
Sorry, I'm not entirely sure yet. Most specific suggestion is in previous comment. However, right now I'm totally fine with boilerplate solution. This is super-important for our use case to prevent constructing following non-sense objects, like: Column("foo", redshiftType, postgresOptions) And current single hierarchy does not prevent it. It can be not a big deal for object-construction, but will be a huge problem for parsers. |
Thanks for the kind explanation, invaluable and enlightening. I think that would be the next step in my plan. There is room for progress in every aspect of the codebase when we extending it to new data sinks. I will get back to you for shredding when I read its code more throughly, however for now, I have to have schema generated out of iglu-server as the first step.
I think since Redshift is based on Postgres 8 with some commercial additions, the difference in types and options that we are worried about are potentially minimal. However, we can decided on the solution for the potential issues as we move forward. Let's move on and and have a look at my PR for first step, and then I can submit the second step PR. I appreciate your positive and open attitude. |
Migrated to snowplow/iglu-server#13 |
Hello team,
We had this conversation in Gitter, but I want to make it official here. I was wondering if you folks agree to keep Postgres statics schema DDL in the Iglu server?
I saw #193 and #194 but they seem to be a bit far for now.
The text was updated successfully, but these errors were encountered: