
Implement safe enums: literals + poly variants gen #152

Merged
jongleb merged 5 commits into ygrek:ahrefs from jongleb:type-safe-enums
Mar 10, 2025

Conversation

@jongleb
Collaborator

@jongleb jongleb commented Feb 7, 2025

Description

This PR introduces two main enum-related features:

1. String Literal Types and Unions

The primary feature is that any string literal now has a StringLiteral type. When used in expressions like:

ROW..VALUES
SELECT .. UNION
WHERE IN (...)
@param { ...} | { ... } | {...}

or any other cases where StringLiteral can accept multiple values, the type will be inferred as a Union.
There are two types of Unions:

Open Union (resulting from combining StringLiterals or other Unions)
Closed Union (literal/Enum)
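
To make the distinction concrete, here is a minimal sketch of how these types could be modelled. The names `ty`, `String_literal`, `Union`, `is_closed` and `combine` are illustrative assumptions, not the definitions used in this PR, and widening to `Text` when plain text is involved is also an assumption.

```ocaml
(* Hypothetical model of the types described above; not sqlgg's real code. *)
module S = Set.Make (String)

type ty =
  | Text
  | String_literal of string (* type of a single literal such as 'A' *)
  | Union of { ctors : S.t; is_closed : bool }

(* Combining literals or unions yields an OPEN union of all their values. *)
let combine a b =
  let values = function
    | String_literal s -> Some (S.singleton s)
    | Union { ctors; _ } -> Some ctors
    | Text -> None
  in
  match values a, values b with
  | Some x, Some y -> Union { ctors = S.union x y; is_closed = false }
  | _ -> Text (* assumption: mixing with plain text widens to Text *)

(* A closed union, as would come from an enum column definition. *)
let closed_enum values = Union { ctors = S.of_list values; is_closed = true }
```

For example, `combine (String_literal "A") (String_literal "B")` gives an open union over `A` and `B`, while `closed_enum ["A"; "B"; "C"]` stands for the column type.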

The relationships between these types can be seen here

Also check tests

This PR adds a new flag, type-safe-enums (the name is negotiable). When this flag is enabled, parameters bound to enum fields and enum literals (including the new syntax) are type-checked.

Type Inference and Subtyping Example

INSERT INTO `tblname` (`col1`, `col2`, `col3`, `col4`, `col5`)
SELECT @param3 { A { 'A' } | B { 'B' } | C { 'C' } }
-- rest of the query omitted for clarity

Enum Type Inference

In this case, the matching values are inferred as:

StringLiteral 'A'
StringLiteral 'B'
StringLiteral 'C'

These are then combined into an enum type A | B | C. The system verifies that this enum is either the same type as, or a subtype of, the closed enum A | B | C.

Subtyping Rules

For example, this would be valid:

@param3 { A { 'A' } | B { 'B' } }

This works because A | B is a valid subtype of A | B | C: you can safely write a value of type A | B where A | B | C is expected.
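
The subtyping rule amounts to a subset check on the sets of allowed values. A minimal sketch, assuming an enum is represented as a set of strings (`enum`, `enum_of_list` and `is_subtype` are hypothetical names, not taken from the PR):

```ocaml
module S = Set.Make (String)

(* Hypothetical representation: an enum is the set of its allowed values. *)
type enum = S.t

let enum_of_list : string list -> enum = S.of_list

(* [is_subtype a b] holds when every value of [a] is also a value of [b]. *)
let is_subtype (a : enum) (b : enum) = S.subset a b

let () =
  let column = enum_of_list [ "A"; "B"; "C" ] in
  (* A | B can be written where A | B | C is expected. *)
  assert (is_subtype (enum_of_list [ "A"; "B" ]) column);
  (* A type is a subtype of itself. *)
  assert (is_subtype column column);
  (* A | D is rejected because D is not a value of the column. *)
  assert (not (is_subtype (enum_of_list [ "A"; "D" ]) column))
```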

String Operations and Type Coercion

String operations work naturally:

SELECT CONCAT(col3, 'test') FROM tblname;

This is valid because Enum is a subtype of Text, so the enum value is automatically coerced to text. The reverse operation (text to enum) is not allowed.
Here's a complete example demonstrating insertion with string concatenation:

INSERT INTO tblname (col3) 
SELECT CONCAT(
  @param3 { A { 'A' } | B { 'B' } | C { 'C' } },
  '_suffix'
);

This will fail type checking because:

col3 expects an enum of type A | B | C
CONCAT() produces a Text type
You cannot assign a Text value to an enum field - even if the text matches one of the enum variants (like 'A', 'B', or 'C')

INSERT INTO tblname (col3) VALUES ('A');  -- Ok
INSERT INTO tblname (col3) VALUES (CONCAT('A', '')); -- Fails: Text to Enum
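
The one-way coercion between Enum and Text can be sketched as an assignability check. This is an illustrative model only: `ty`, `can_assign` and the treatment of the literal 'A' as a one-value enum are assumptions made for the example.

```ocaml
module S = Set.Make (String)

type ty =
  | Text
  | Enum of S.t

(* May a value of type [src] be used where [dst] is expected? *)
let can_assign ~src ~dst =
  match src, dst with
  | Text, Text -> true
  | Enum _, Text -> true (* an enum value coerces to text *)
  | Enum a, Enum b -> S.subset a b (* subtyping between enums *)
  | Text, Enum _ -> false (* text never narrows to an enum *)

let col3 = Enum (S.of_list [ "A"; "B"; "C" ])

(* The literal 'A' is modelled here as an enum with a single value. *)
let literal_a = Enum (S.singleton "A")

(* CONCAT(...) is modelled as producing plain Text. *)
let concat_result = Text

let () =
  assert (can_assign ~src:literal_a ~dst:col3); (* VALUES ('A') *)
  assert (not (can_assign ~src:concat_result ~dst:col3)); (* CONCAT('A', '') *)
  assert (can_assign ~src:col3 ~dst:Text) (* CONCAT(col3, 'test') *)
```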

2. Polymorphic variant generation (optional and flag-controlled). This adds the ability to generate polymorphic variants when a parameter is inferred as an enum.

Consider an example.

DDL:
CREATE TABLE `some_table` (
  `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `status` enum('pending','sending','sent','cancelled')  NOT NULL,
  `status_b` enum('a','b','c','d')  NOT NULL
);

Example (flag is disabled)

INSERT INTO `some_table` (
  `status`,
  `status_b`
) VALUES (
  'pending',
  @status_b
);

With the flag disabled, the generated code is unchanged:

  let insert_some_table_0 db ~status_b =
    let set_params stmt =
      let p = T.start_params stmt (1) in
      T.set_param_Text p status_b;
      T.finish_params p
    in
    T.execute db ("INSERT INTO `some_table` (\n\
  `status`,\n\
  `status_b`\n\
) VALUES (\n\
  'pending',\n\
  ?\n\
)") set_params

This is what is generated

Example (flag is enabled)

INSERT INTO `some_table` (
  `status`,
  `status_b`
) VALUES (
 'pending',
  @status_b
);

This is the result

    module Enum_0 = T.Make_enum(struct
      type t = [`A | `B | `C | `D]
      let inj = function | "a" -> `A | "b" -> `B | "c" -> `C | "d" -> `D | s -> failwith (Printf.sprintf "Invalid enum value: %s" s)
      let proj = function  | `A -> "a"| `B -> "b"| `C -> "c"| `D -> "d"
    end)

  let insert_some_table_0 db ~status_b =
    let set_params stmt =
      let p = T.start_params stmt (1) in
      Enum_0.set_param p status_b;
      T.finish_params p
    in
    T.execute db ("INSERT INTO `some_table` (\n\
  `status`,\n\
  `status_b`\n\
) VALUES (\n\
  'pending',\n\
  ?\n\
)") set_params
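
To show what the caller gains, here is a self-contained sketch with a simplified stand-in for the functor. The `Make_enum` below is a hypothetical reduction written for this example; the real `T.Make_enum` comes from the sqlgg traits and also binds statement parameters, which is not modelled here. `Status_b` and `to_literal` are illustrative names.

```ocaml
(* Shape of the struct passed to the functor in the generated code. *)
module type Enum = sig
  type t
  val inj : string -> t
  val proj : t -> string
end

(* Hypothetical stand-in for T.Make_enum; it only re-exports inj/proj. *)
module Make_enum (E : Enum) = struct
  include E

  (* The string that would be sent to the database for [v]. *)
  let to_literal (v : t) = proj v
end

module Status_b = Make_enum (struct
  type t = [ `A | `B | `C | `D ]

  let inj = function
    | "a" -> `A
    | "b" -> `B
    | "c" -> `C
    | "d" -> `D
    | s -> failwith (Printf.sprintf "Invalid enum value: %s" s)

  let proj = function `A -> "a" | `B -> "b" | `C -> "c" | `D -> "d"
end)

let () =
  (* A valid variant maps to its SQL string. *)
  assert (Status_b.to_literal `B = "b");
  (* Reading a known string back yields the variant. *)
  assert (Status_b.inj "d" = `D)
  (* Status_b.to_literal `E would be rejected at compile time. *)
```

With the flag enabled, a call such as `insert_some_table_0 db ~status_b:"e"` no longer type-checks, because the parameter is a polymorphic variant rather than a string.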

@jongleb jongleb requested a review from ygrek February 10, 2025 09:05
@jongleb jongleb self-assigned this Feb 10, 2025
@jongleb jongleb marked this pull request as ready for review February 10, 2025 09:05
@ygrek
Owner

ygrek commented Feb 10, 2025

Why need new syntax? we know from type that enum is expected.
Also I would consider dropping the flag, make this behaviour default and allow to opt-out on per-field basis.

@jongleb
Collaborator Author

jongleb commented Feb 10, 2025

Why need new syntax? we know from type that enum is expected.

Because type inference is based on the syntax itself. Much of the simplicity of, for example, the HM type system is that it allows types to be inferred automatically from syntax. Without special syntax, we have to make our Text type more cunning: it must carry the literal value so that the context of use can be checked at later stages, and only if that context is an enum is it treated as an enum value. I therefore decided to have a special syntax.

Also I would consider dropping the flag, make this behaviour default and allow to opt-out on per-field basis.

got it

@jongleb
Collaborator Author

jongleb commented Feb 10, 2025

take for example

SELECT .. FROM .. WHERE enum_field = 'A'

in this case I test the left side of the binary operator for the field type, which is a later stage than with

SELECT .. FROM .. WHERE enum_field = SPECIAL_SYNTAX_'A'

but if you insist I can do it this way checking the left part and the contents each time

@ygrek
Owner

ygrek commented Feb 11, 2025

special syntax is definitely not a requirement by HM
it may be a requirement imposed by hacky implementation of something HM-like in sqlgg :)
but i expect it should be able to figure it out through unification same as all other variables/parameters where we don't know type beforehand
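
As an illustration of the unification idea discussed here, the sketch below gives a plain string literal a provisional type and resolves it against the other side of a comparison. Everything in it (`ty`, `String_literal`, `unify`, the error messages, and typing an enum-to-text comparison as Text) is an assumption made for the example, not sqlgg's implementation.

```ocaml
module S = Set.Make (String)

type ty =
  | Text
  | Enum of S.t (* closed enum taken from the schema *)
  | String_literal of string (* provisional type of a plain 'A' literal *)

(* Unify both sides of [lhs = rhs]; return the agreed type or an error. *)
let unify lhs rhs =
  match lhs, rhs with
  | Enum e, String_literal s | String_literal s, Enum e ->
    (* The literal is accepted only if it is one of the enum's values. *)
    if S.mem s e then Ok (Enum e)
    else Error (Printf.sprintf "'%s' is not a value of this enum" s)
  | Enum a, Enum b when S.equal a b -> Ok (Enum a)
  | Enum _, Enum _ -> Error "incompatible enums"
  | (Text | String_literal _), (Text | String_literal _) -> Ok Text
  | Enum _, Text | Text, Enum _ -> Ok Text (* assumption: enum coerces to text *)

let () =
  let enum_field = Enum (S.of_list [ "A"; "B"; "C" ]) in
  (* WHERE enum_field = 'A' resolves the literal to the enum type. *)
  (match unify enum_field (String_literal "A") with
   | Ok (Enum _) -> ()
   | _ -> assert false);
  (* WHERE enum_field = 'Z' is rejected. *)
  match unify enum_field (String_literal "Z") with
  | Error _ -> ()
  | Ok _ -> assert false
```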

@jongleb jongleb marked this pull request as draft February 20, 2025 19:39
@jongleb jongleb marked this pull request as ready for review February 23, 2025 17:14
@jongleb
Collaborator Author

jongleb commented Feb 28, 2025

@ygrek I considered your comment: it no longer requires the user to write special syntax, just regular strings. I am going to merge it; I will come back and fix things if there are any other comments.

@jongleb jongleb merged commit c0603ca into ygrek:ahrefs Mar 10, 2025

2 participants