Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to discover PostgreSQL enums #24

Merged
merged 32 commits into from
Nov 23, 2021
Merged

Ability to discover PostgreSQL enums #24

merged 32 commits into from
Nov 23, 2021

Conversation

charleschege
Copy link
Contributor

This PR attempts to close issue #22

Fix a few typos I found while digging through the source code
to allow the ` #[derive(sqlx::FromRow, Debug)]` to be used
to derive an sqlx row for handling PostgreSQL enums
@tyt2y3
Copy link
Member

tyt2y3 commented Nov 15, 2021

I don't really think it completely resolved the issue

@billy1624 can you point out the missing pieces?

@billy1624
Copy link
Member

billy1624 commented Nov 15, 2021

I think there are some misunderstanding. To make things clear I will explain it in steps

Tasks

  1. Discovery: query enum info from Postgres with query such as... Instead of parsing raw SQL enum CREATE TYPE name AS ENUM(field_one, field_two, ...).

    # Following SQL are for reference.
    # I just found it online for demo purpose.
    select col.table_schema,
        col.table_name,
        col.ordinal_position as column_id,
        col.column_name,
        col.udt_name,
        string_agg(enu.enumlabel, ', ' 
                    order by enu.enumsortorder) as enum_values
    from information_schema.columns col
    join information_schema.tables tab on tab.table_schema = col.table_schema
                                    and tab.table_name = col.table_name
                                    and tab.table_type = 'BASE TABLE'
    join pg_type typ on col.udt_name = typ.typname
    join pg_enum enu on typ.oid = enu.enumtypid
    where col.table_schema not in ('information_schema', 'pg_catalog')
        and typ.typtype = 'e'
    group by col.table_schema,
            col.table_name,
            col.ordinal_position,
            col.column_name,
            col.udt_name
    order by col.table_schema,
            col.table_name,
            col.ordinal_position;

    image

  2. Definition: parse the query result and store enum info (enum name, variants) in some struct. Just like what you did in src/postgres/def/types.rs.

  3. Writer: write enum info into sea_query::extension::postgres::TypeCreateStatement which will build SQL statement for creating corresponding enum on Postgres db.

@tyt2y3 correct me if something seems off

@billy1624
Copy link
Member

Hey @charleschege, you need any help?

@charleschege
Copy link
Contributor Author

Hey @charleschege, you need any help?

Sorry, yesterday I had some engagements. I am working on this today.

From what I understand:

  1. I need to discover enum fields in a database from a SelectStatement
  2. Parse the query results which are of type PgRow into an enum
  3. Write the enums to a TypeCreateStatement for enum creation in a PostgreSQL db

Correct me if I am wrong

@billy1624
Copy link
Member

billy1624 commented Nov 19, 2021

Hey @charleschege, I have tested some marginal cases where the enum variants contain special characters such as |. The enum and its variants are...

image

I run cargo run --manifest-path ./tests/writer/postgres_enum/Cargo.toml on my console and added some dbg!() for debugging.

This is what I get. So basically joining all variants using | by SQL then splitting it on Rust is not working on some extreme case.

[src/postgres/discovery/mod.rs:210] self.executor.get_enums(query).await = [
    EnumRow {
        name: "mpaa_rating",
        values: "new_value|HI|,.!@#$%^&*()_-+=[]{}\\|.<>/?  ``\\|new_value|HI|,.!@#$%^&*()_-+=[]{}\\|.<>/?  |new_value|HI||new_value||new_value|NC-17|R|PG-13|PG|G",
    },
]
[tests/writer/postgres_enum/src/main.rs:10] SchemaDiscovery::new(connection, "public").discover_enums().await = [
    Enum(
        EnumDef {
            values: [
                "new_value",
                "HI",
                ",.!@#$%^&*()_-+=[]{}\\",
                ".<>/?  ``\\",
                "new_value",
                "HI",
                ",.!@#$%^&*()_-+=[]{}\\",
                ".<>/?  ",
                "new_value",
                "HI",
                "",
                "new_value",
                "",
                "new_value",
                "NC-17",
                "R",
                "PG-13",
                "PG",
                "G",
            ],
            attr: StringAttr {
                length: Some(
                    11,
                ),
            },
            typename: "mpaa_rating",
        },
    ),
]

@charleschege
Copy link
Contributor Author

@billy1624 Yeah! I agree that is an extreme case. I will rewrite the select statement creation and parser

@billy1624
Copy link
Member

billy1624 commented Nov 19, 2021

Can you tick the box and give write access to maintainers? Just now I want to fix the CI errors but I don't have the permission to push commits. The checkbox should be around the right corner on this page.

https://user-images.githubusercontent.com/30400950/142591667-4514447a-4cda-4fe8-b3ca-448013d0801c.png

@charleschege
Copy link
Contributor Author

charleschege commented Nov 19, 2021

Can you tick the box and give write access to maintainers? Just now I want to fix the CI errors but I don't have the permission to push commits. The checkbox should be around the right corner on this page.

https://user-images.githubusercontent.com/30400950/142591667-4514447a-4cda-4fe8-b3ca-448013d0801c.png

Done

Copy link
Member

@billy1624 billy1624 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A proper test case would be like what tests/live/postgres/src/main.rs does. You can add your test cases to there directly.

Basically you write the TypeCreateStatement statement and run it on Postgres to create the enum. Then, you run the enum discover and writer you just wrote. Finally, you assert the discovered TypeCreateStatement match the originally TypeCreateStatement that you write by hand at the very beginning.

tests/writer/postgres_enum/src/main.rs Outdated Show resolved Hide resolved
@charleschege
Copy link
Contributor Author

dbb03b0 seems to solve all the requirements to discover enums even in cases where enums have special characters. @tyt2y3 @billy1624

Copy link
Contributor Author

@charleschege charleschege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests/writer/postgres_enum has been removed and tests moved to the tests/live/postgres directory

@charleschege
Copy link
Contributor Author

@tyt2y3 @billy1624 what else is needed to get this PR merged?

src/postgres/writer/column.rs Show resolved Hide resolved
tests/writer/postgres_enum/src/main.rs Outdated Show resolved Hide resolved
src/postgres/def/types.rs Outdated Show resolved Hide resolved
src/postgres/writer/column.rs Outdated Show resolved Hide resolved
Copy link
Contributor Author

@charleschege charleschege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even easier to read. Thanks!

@billy1624
Copy link
Member

Hey @tyt2y3, please the overall structure now

@billy1624
Copy link
Member

Ready for review @tyt2y3

@charleschege
Copy link
Contributor Author

@billy1624 Thanks for reviewing the code and making some changes. Through some improvements you made I have learnt more about the internals of this repository. Thanks again.

@billy1624
Copy link
Member

@billy1624 Thanks for reviewing the code and making some changes. Through some improvements you made I have learnt more about the internals of this repository. Thanks again.

No problems! Please correct me if I break your original code

@charleschege
Copy link
Contributor Author

@billy1624 Sometimes assertion fails on the enums. If they are re-arranged, assertion always panics. I had been able to achieve some correctness by always sorting the structure first. I suppose that was not important to begin with 😂. But what if I wrote some code as a user and needed to do assertion on the enum, even though the enums are the same, comparison would fail due to their ordering, how would this be done to ensure that assertion always succeeds?

@charleschege
Copy link
Contributor Author

@billy1624 Sometimes assertion fails on the enums. If they are re-arranged, assertion always panics. I had been able to achieve some correctness by always sorting the structure first. I suppose that was not important to begin with joy. But what if I wrote some code as a user and needed to do assertion on the enum, even though the enums are the same, comparison would fail due to their ordering, how would this be done to ensure that assertion always succeeds?

To provide an example output of this:

[tests/live/postgres/src/main.rs:83] &enum_create_statements = [
    "CREATE TYPE \"crazy_enum\" AS ENUM (E'Astro0%00%8987,.!@#$%^&*()_-+=[]{}\\\\|.<>/? ``', 'Biology', 'Chemistry', 'Math', 'Physics')",
]
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `"CREATE TYPE \"crazy_enum\" AS ENUM ('Chemistry', 'Math', 'Physics', E'Astro0%00%8987,.!@#$%^&*()_-+=[]{}\\\\|.<>/? ``', 'Biology')"`,
 right: `"CREATE TYPE \"crazy_enum\" AS ENUM (E'Astro0%00%8987,.!@#$%^&*()_-+=[]{}\\\\|.<>/? ``', 'Biology', 'Chemistry', 'Math', 'Physics')"`', tests/live/postgres/src/main.rs:85:5

Comment on lines +32 to +44
pub fn query_enums(&self) -> SelectStatement {
Query::select()
.column((PgType::Table, PgType::TypeName))
.column((PgEnum::Table, PgEnum::EnumLabel))
.from(PgType::Table)
.inner_join(
PgEnum::Table,
Expr::tbl(PgEnum::Table, PgEnum::EnumTypeId).equals(PgType::Table, PgType::Oid),
)
.order_by((PgType::Table, PgType::TypeName), Order::Asc)
.order_by((PgEnum::Table, PgEnum::EnumLabel), Order::Asc)
.take()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated Vec<EnumDef> will be in ascending order of typename and variants of it will also be in ascending order.

Comment on lines +16 to +25
let create_enum_stmt = Type::create()
.as_enum(Alias::new("crazy_enum"))
.values(vec![
Alias::new("Astro0%00%8987,.!@#$%^&*()_-+=[]{}\\|.<>/? ``"),
Alias::new("Biology"),
Alias::new("Chemistry"),
Alias::new("Math"),
Alias::new("Physics"),
])
.to_string(PostgresQueryBuilder);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expected result should be in the same order, i.e. ascending order

@billy1624
Copy link
Member

billy1624 commented Nov 23, 2021

@billy1624 Sometimes assertion fails on the enums. If they are re-arranged, assertion always panics. I had been able to achieve some correctness by always sorting the structure first. I suppose that was not important to begin with joy. But what if I wrote some code as a user and needed to do assertion on the enum, even though the enums are the same, comparison would fail due to their ordering, how would this be done to ensure that assertion always succeeds?

To provide an example output of this:

[tests/live/postgres/src/main.rs:83] &enum_create_statements = [
    "CREATE TYPE \"crazy_enum\" AS ENUM (E'Astro0%00%8987,.!@#$%^&*()_-+=[]{}\\\\|.<>/? ``', 'Biology', 'Chemistry', 'Math', 'Physics')",
]
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `"CREATE TYPE \"crazy_enum\" AS ENUM ('Chemistry', 'Math', 'Physics', E'Astro0%00%8987,.!@#$%^&*()_-+=[]{}\\\\|.<>/? ``', 'Biology')"`,
 right: `"CREATE TYPE \"crazy_enum\" AS ENUM (E'Astro0%00%8987,.!@#$%^&*()_-+=[]{}\\\\|.<>/? ``', 'Biology', 'Chemistry', 'Math', 'Physics')"`', tests/live/postgres/src/main.rs:85:5

Yes, I notice the issue and understand why you used the sorting method. I think it's fine as long as the generated result is deterministic, i.e. sorted in a particular order. Then, user can always rearrange the expected output while keeping the semantic to match the generated output.

@tyt2y3 tyt2y3 merged commit 3eed4c4 into SeaQL:master Nov 23, 2021
tyt2y3 pushed a commit that referenced this pull request Nov 23, 2021
* Correct type from Alias::new("ploygon") to Alias::new("polygon")

Add enum type for writer

* Add  type

* Add support for PostgreSQL types

Fix a few typos I found while digging through the source code

* Add tests for checking the correctness of PostgreSQL enum parsing and processing

* Gate `PgEnum` and its impl block under the `sqlx` feature
to allow the ` #[derive(sqlx::FromRow, Debug)]` to be used
to derive an sqlx row for handling PostgreSQL enums

* Add support for fetching enums

* Move types into types files under postgres::def::Type

Add documentation on types

Output the result by calling `discover_enum` on `SchemaDiscovery`

* Add method to parse a raw SQL create query for enum creation

Add missing documentation for the added types for enum creation

* Test case to discover all the enums in a given database schema

Write into SQL and then parse the raw SQL asserting that all
discovered enums are the same as their raw SQL queries

* Solve issues with github actions not detecting types

* Add feature flag for new enum type and associated functions

* Switch to using re-exported sqlx types within sea-schema

* Fix CI errors

* Support enums with crazy names that include special characters

* Add support for ordered Vecs

* Add tests for discovery of enums

Remove redundant module postgres_enum from the tests dir

* Ensure a custom enum type is dropped first from the database so that CI runs successfully

* Refactoring

* Ensures the writer derives the enum name from the EnumDef

* Refactoring

* Refactoring

* Refactoring

* Testing enum column...

* Refactoring

* Refactoring

* Refactoring

* Fix test

* Refactoring

* Refactoring

* Refactoring

* Cleanup

Co-authored-by: Billy Chan <ccw.billy.123@gmail.com>
@billy1624 billy1624 linked an issue Dec 7, 2021 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Discovering Postgres enum
3 participants