Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[looking for feedback] Reworking STRING & BINARY DataTypes #14259

Open
ephys opened this issue Mar 18, 2022 · 13 comments
Open

[looking for feedback] Reworking STRING & BINARY DataTypes #14259

ephys opened this issue Mar 18, 2022 · 13 comments
Assignees
Labels
breaking change For issues and PRs. Changes that break compatibility and require a major version increment. type: feature For issues and PRs. For new features. Never breaking changes.

Comments

@ephys
Copy link
Member

ephys commented Mar 18, 2022

Goals

Proposition

  • Rework DataTypes.STRING based on the table below. Add DataTypes.BINARY.
  • Deprecate DataTypes.CITEXT, DataTypes.TEXT, DataTypes.CHAR, DataTypes.BLOB

Latest design

STRING DataType

DataType postgres MySQL/MariaDB SQL Server SQLite DB2 Oracle Snowflake
STRING VARCHAR(255) VARCHAR(255) VARCHAR(255) TEXT1 VARCHAR(255) VARCHAR2(255 CHAR) VARCHAR(255)
STRING('max') VARCHAR VARCHAR(MAX) TEXT VARCHAR
STRING(256) VARCHAR(256) VARCHAR(256) VARCHAR(256) TEXT1 VARCHAR(256) VARCHAR(256 CHAR) VARCHAR(256)
STRING(256).FIXED CHAR(256) CHAR(256) CHAR(256) CHARACTER(256) CHAR(256 CHAR)
STRING.FIXED.LENGTH(256) CHAR(256) CHAR(256) CHAR(256) CHARACTER(256) CHAR(256 CHAR)
STRING.FIXED
STRING('max').FIXED
STRING(n).NATIONAL NVARCHAR(n) NVARCHAR(n) NVARCHAR(n) NVARCHAR(n)
STRING(n).FIXED.NATIONAL NCHAR(n) NCHAR(n) NCHAR(n) NCHAR(n)
STRING(n).INSENSITIVE CITEXT1 TEXT COLLATE NOCASE1
STRING('max').INSENSITIVE CITEXT TEXT COLLATE NOCASE
STRING(n).COLLATION('name') VARCHAR(n) COLLATE name VARCHAR(n) COLLATE name VARCHAR(n) COLLATE name TEXT COLLATE name VARCHAR(n) COLLATE name VARCHAR2(n CHAR) COLLATE name VARCHAR(n) COLLATE name

CLOB

These are available in dialects that support Character Large Objects (Sometimes called TEXT, sometimes called CLOB. Equivalent of a BLOB for characters).

DataType MySQL/MariaDB DB2 Oracle
STRING.CLOB('tiny') TINYTEXT
STRING.CLOB TEXT1 CLOB
STRING(n).CLOB TEXT1 CLOB(n) CLOB1
STRING(n).CLOB.NATIONAL NCLOB(n) NCLOB1
STRING.CLOB('medium') MEDIUMTEXT
STRING.CLOB('long') LONGTEXT

MySQL/MariaDB extensions

DataType MySQL/MariaDB
STRING(n).MYSQL_BIN_COLLATION VARCHAR(n) BINARY
STRING(n).FIXED.MYSQL_BIN_COLLATION CHAR(n) BINARY

About NATIONAL

Current mssql implementation uses NVARCHAR by default. In order to not require users to add .NATIONAL everywhere, we could add the following global option:

const sequelize = new Sequelize({
  string: { 
    // Use .NATIONAL by default when using DataTypes.STRING
    national: true,
  },
});

Users can then opt out of using NATIONAL by writing DataTypes.STRING.NATIONAL(false).

We should also add a LENGTH option to allow users to re-use the same STRING type with different lengths:

const NVARCHAR = DataTypes.STRING.NATIONAL;

const User = sequelize.define('User', {
  name: NVARCHAR.LENGTH(100),
});

BINARY DataType

DataType postgres MySQL/MariaDB SQL Server SQLite DB2 Oracle Snowflake
BINARY
BINARY('max') BYTEA VARBINARY(MAX) BLOB LONG RAW VARBINARY
BINARY(256) BYTEA2 VARBINARY(256) VARBINARY(256) BLOB2 VARBINARY(256) RAW(256) VARBINARY(256)
BINARY(256).FIXED BINARY(256) BINARY(256) BINARY(256)
BINARY.FIXED
BINARY.LENGTH(8000) BYTEA2 VARBINARY(8000) VARBINARY(8000) BLOB2 VARBINARY(8000) RAW(8000) VARBINARY(8000)

While STRING defaults to 255 characters because it's a commonly used type, BINARY requires an explicit length.,

BLOB

These are available in dialects that support Binary Large Objects (BLOB).

DataType MySQL/MariaDB DB2 Oracle
BINARY.BLOB('tiny') TINYBLOB
BINARY.BLOB BLOB BLOB
BINARY(256).BLOB BLOB2 BLOB(256) BLOB2
BINARY.BLOB('medium') MEDIUMBLOB
BINARY.BLOB('long') LONGBLOB
BINARY(256).BLOB('long') LONGBLOB2

Pending Questions

  • Is mssql's VARCHAR(MAX) more suited to STRING.CLOB or STRING('max')?
  • Do we want to keep STRING.INSENSITIVE? It's better to use a collation than CITEXT for this. Maybe rename to STRING.CITEXT for postgres? Or keep CITEXT a separate class?

Footnotes

  1. String length check constraint 2 3 4 5 6 7 8

  2. Byte Count check constraint 2 3 4 5 6 7

@ephys ephys added the type: feature For issues and PRs. For new features. Never breaking changes. label Mar 18, 2022
@ephys ephys self-assigned this Mar 18, 2022
@ephys ephys changed the title Consider deprecating & replacing DataTypes.STRING [RFC] Consider deprecating & replacing DataTypes.STRING Mar 30, 2022
@ephys ephys added the RFC Request for comments regarding breaking or large changes label Mar 30, 2022
@ephys ephys removed the type: feature For issues and PRs. For new features. Never breaking changes. label Mar 30, 2022
@ephys ephys mentioned this issue May 13, 2022
42 tasks
@ephys
Copy link
Member Author

ephys commented Sep 30, 2022

Still working on this proposal and after further work on #14505, here is an updated/alternative proposal:

  • Keep STRING, but merge CHAR & TEXT with it
  • STRING would receive new options to support the different types (FIXED for CHAR, UTF16 for NCHAR/NVARCHAR)
  • Add BINARY type, remove BLOB and STRING.BINARY

String DataType

DataType postgres MySQL/MariaDB SQL Server SQLite DB2 Snowflake
STRING TEXT TEXT VARCHAR(MAX) TEXT CLOB(2147483647) VARCHAR
STRING.UTF16 NVARCHAR(MAX)
STRING(256) VARCHAR(256) VARCHAR(256) VARCHAR(256) TEXT VARCHAR(256) VARCHAR(256)
STRING(256).FIXED CHAR(256) CHAR(256) CHAR(256) CHARACTER(256)
STRING(256).FIXED.UTF16 NCHAR(256)
STRING.FIXED
STRING('tiny') VARCHAR(256) TINYTEXT VARCHAR(256) TEXT VARCHAR(256) VARCHAR(256)
STRING('medium') VARCHAR(2^24) MEDIUMTEXT VARCHAR(MAX) TEXT CLOB(2^24) VARCHAR(2^24)
STRING('long') VARCHAR(2^32) LONGTEXT VARCHAR(MAX) TEXT CLOB(2^32) VARCHAR(2^32)
STRING('max') TEXT LONGTEXT VARCHAR(MAX) TEXT CLOB(2147483647) VARCHAR(16777216)
STRING(8001) VARCHAR(8000) VARCHAR(8000) VARCHAR(MAX) TEXT VARCHAR(8000) VARCHAR(8001)
STRING(32705) VARCHAR(32705) VARCHAR(32705) VARCHAR(MAX) TEXT CLOB(32705) VARCHAR(32705)
STRING(2^16) VARCHAR(65536) TEXT VARCHAR(MAX) TEXT CLOB(65536) VARCHAR(65536)
STRING(2^16+1) VARCHAR(2^16+1) MEDIUMTEXT VARCHAR(MAX) TEXT CLOB(2^16+1) VARCHAR(2^16+1)
STRING(2^24+1) VARCHAR(2^24+1) LONGTEXT VARCHAR(MAX) TEXT CLOB(2^32+1) VARCHAR(2^32+1)
STRING.COLLATION_BINARY TEXT BINARY TEXT COLLATION BINARY
STRING(n).COLLATION_BINARY VARCHAR(n) BINARY TEXT COLLATION BINARY
STRING(n).FIXED.COLLATION_BINARY CHAR(n) BINARY

Note: Dialects that do not support specifying a byte length on the text type could enforce it using a check constraint.

VARCHAR max value of length arg:

  • MSSQL: 8000
  • DB2: 32704
  • MySQL/MariaDB: 65535
  • Snowflake: 16777216
  • Postgres: ∞
  • SQLite: ∞

Binary DataType

DataType postgres MySQL/MariaDB SQL Server SQLite DB2 Snowflake
BINARY BYTEA BLOB VARBINARY(MAX) BLOB BLOB(2147483647) VARBINARY
BINARY(256) BYTEA VARBINARY(256) VARBINARY(256) BLOB VARBINARY(256) VARBINARY(256)
BINARY(256).FIXED BINARY(256) BINARY(256) BINARY(256)
BINARY.FIXED
BINARY('tiny') BYTEA TINYBLOB VARBINARY(256) BLOB VARBINARY(256) VARBINARY(256)
BINARY('medium') BYTEA MEDIUMBLOB VARBINARY(MAX) BLOB BLOB(2^24) VARBINARY(2^24)
BINARY('long') BYTEA LONGBLOB VARBINARY(MAX) BLOB BLOB(2^32) VARBINARY(2^32)
BINARY('max') BYTEA LONGBLOB VARBINARY(MAX) BLOB BLOB(2147483647) VARBINARY(16777216)
BINARY(8001) BYTEA VARBINARY(8000) VARBINARY(MAX) BLOB VARBINARY(8000) VARBINARY(8001)
BINARY(32705) BYTEA VARBINARY(32705) VARBINARY(MAX) BLOB BLOB(32705) VARBINARY(32705)
BINARY(2^16) BYTEA BLOB VARBINARY(MAX) BLOB BLOB(65536) VARBINARY(65536)
BINARY(2^16+1) BYTEA MEDIUMBLOB VARBINARY(MAX) BLOB BLOB(2^16+1) VARBINARY(2^16+1)
BINARY(2^24+1) BYTEA LONGBLOB VARBINARY(MAX) BLOB BLOB(2^32+1) VARBINARY(2^32+1)

Note: Dialects that do not support specifying a byte length on the text type could enforce it using a check constraint.

@WikiRik
Copy link
Member

WikiRik commented Sep 30, 2022

Sounds good! I guess we want to do this for v7 but in a future PR, right? Or do you want to tackle this in #14505 as well? (I'd prefer it in a future PR, but I can imagine that doing the same work twice is also not that efficient)

@ephys
Copy link
Member Author

ephys commented Sep 30, 2022

Oh no definitely its own PR

@ephys
Copy link
Member Author

ephys commented Sep 30, 2022

Another thing is that currently, STRING.BINARY is VARCHAR BINARY in MySQL/MariaDB. That is not the same thing as VARBINARY. From what I see VARCHAR BINARY is VARCHAR with a binary collation (utf8mb4_bin).

I propose to keep STRING.BINARY & CHAR.BINARY but only in MySQL/MariaDB & SQLite (as TEXT COLLATION BINARY). Other dialects should throw instead.

I don't like the name, it's ambiguous (as can be seen by the fact that all other dialects have implemented it incorrectly). I propose to deprecate it & rename it to STRING.BINARY_COLLATION, STRING.FIXED.BINARY_COLLATION

I've updated the table in my previous post

@ephys
Copy link
Member Author

ephys commented Sep 30, 2022

In DB2, there is also (VAR)CHAR FOR BIT DATA, which also doesn't behave like BINARY (it behaves like (VAR)CHAR. Can't find any documentation on whether it would be correct to assign it to STRING.COLLATION_BINARY

@ephys ephys changed the title [RFC] Consider deprecating & replacing DataTypes.STRING [RFC] Reworking DataTypes.STRING, DataTypes.CHAR, DataTypes.BLOB, DataTypes.TEXT Sep 30, 2022
@WikiRik WikiRik mentioned this issue Nov 21, 2022
5 tasks
@WikiRik
Copy link
Member

WikiRik commented Nov 24, 2022

@ephys as mentioned during the monthly, I'll try to look into this. Do you think the tables are near completion?

@ephys
Copy link
Member Author

ephys commented Nov 26, 2022

In my opinion it's ready, but I'd love a thorough review so we don't need to introduce another breaking change in the future

@ephys
Copy link
Member Author

ephys commented Dec 20, 2022

Another proposal because I can't make my mind up

I found the previous one to be too complicated. This one should provide more control over which type is used, while still providing a common group of options that can be used everywhere.

STRING DataType

DataType postgres MySQL/MariaDB SQL Server SQLite DB2 Snowflake
STRING TEXT LONGTEXT VARCHAR(MAX) TEXT CLOB(2147483647) VARCHAR
STRING(256) VARCHAR(256) VARCHAR(256) VARCHAR(256) TEXT1 VARCHAR(256) VARCHAR(256)
STRING(256).FIXED CHAR(256) CHAR(256) CHAR(256) CHARACTER(256)
STRING.FIXED
STRING(8001) VARCHAR(8000) VARCHAR(8000) VARCHAR(MAX) TEXT VARCHAR(8000) VARCHAR(8001)
STRING(32705) VARCHAR(32705) VARCHAR(32705) VARCHAR(MAX) TEXT CLOB(32705) VARCHAR(32705)
STRING.COLLATION_BINARY TEXT BINARY TEXT COLLATE BINARY
STRING(n).COLLATION_BINARY VARCHAR(n) BINARY TEXT COLLATE BINARY1
STRING.INSENSITIVE CITEXT VARCHAR(n) BINARY TEXT COLLATE NOCASE
STRING(n).INSENSITIVE CITEXT1 VARCHAR(n) BINARY TEXT COLLATE NOCASE1
STRING(n).FIXED.COLLATION_BINARY CHAR(n) BINARY

SQL Server extensions

DataType SQL Server
STRING.N NVARCHAR(MAX)
STRING(256).FIXED.N NCHAR(256)

MySQL / MariaDB extensions

MySQL / MariaDB use VARCHAR if you specify a width, or TEXT by default. This allows using more specific types.

DataType MySQL/MariaDB
STRING.TINYTEXT TINYTEXT
STRING.TEXT TEXT
STRING.MEDIUMTEXT MEDIUMTEXT
STRING.LONGTEXT LONGTEXT
STRING(n).LONGTEXT LONGTEXT1

DB2 extensions

DB2 automatically uses CLOB if the size is >= 32705, but this makes the DataType always use CLOB no matter the size

DataType DB2
STRING(n).CLOB CLOB(n)

DB2 extensions

DataType postgres MySQL/MariaDB SQL Server SQLite DB2 Snowflake
STRING.CLOB TINYTEXT VARCHAR(256) TEXT CLOB(32705) VARCHAR(256)
STRING(256).CLOB TINYTEXT VARCHAR(256) TEXT CLOB(256) VARCHAR(256)

BINARY DataType

DataType postgres MySQL/MariaDB SQL Server SQLite DB2 Snowflake
BINARY BYTEA LONGBLOB VARBINARY(MAX) BLOB BLOB(2147483647) VARBINARY
BINARY(256) BYTEA2 VARBINARY(256) VARBINARY(256) BLOB2 VARBINARY(256) VARBINARY(256)
BINARY(256).FIXED BINARY(256) BINARY(256) BINARY(256)
BINARY.FIXED
BINARY(8000) BYTEA2 VARBINARY(8000) VARBINARY(8000) BLOB2 VARBINARY(8000) VARBINARY(8000)
BINARY(8001) BYTEA2 VARBINARY(8001) BLOB2 VARBINARY(8001) VARBINARY(8001)
BINARY(32705) BYTEA2 VARBINARY(32705) BLOB2 BLOB(32705) VARBINARY(32705)

MySQL / MariaDB & DB2 extensions

MySQL / MariaDB use VARBINARY if you specify a width, or BLOB by default. This allows using more specific types.

DB2 automatically uses BLOB if the size is >= 32705, but this makes the DataType always use BLOB no matter the size

DataType MySQL/MariaDB DB2
BINARY.TINYBLOB TINYBLOB
BINARY.BLOB BLOB
BINARY(256).BLOB BLOB2 BLOB(256)
BINARY.MEDIUMBLOB MEDIUMBLOB
BINARY.LONGBLOB LONGBLOB
BINARY(256).LONGBLOB LONGBLOB2

Footnotes

  1. String length check constraint 2 3 4 5

  2. Byte Count check constraint 2 3 4 5 6 7 8 9 10

@ephys ephys pinned this issue Dec 20, 2022
@ephys ephys changed the title [RFC] Reworking DataTypes.STRING, DataTypes.CHAR, DataTypes.BLOB, DataTypes.TEXT [looking for feedback] Reworking DataTypes.STRING, DataTypes.CHAR, DataTypes.BLOB, DataTypes.TEXT Dec 20, 2022
@ephys ephys changed the title [looking for feedback] Reworking DataTypes.STRING, DataTypes.CHAR, DataTypes.BLOB, DataTypes.TEXT [looking for feedback] Reworking DataTypes.STRING, DataTypes.CHAR, DataTypes.BLOB, DataTypes.TEXT, DataTypes.CITEXT Dec 20, 2022
@ephys ephys changed the title [looking for feedback] Reworking DataTypes.STRING, DataTypes.CHAR, DataTypes.BLOB, DataTypes.TEXT, DataTypes.CITEXT [looking for feedback] Reworking STRING & BINARY DataTypes Dec 20, 2022
@ephys ephys added type: feature For issues and PRs. For new features. Never breaking changes. breaking change For issues and PRs. Changes that break compatibility and require a major version increment. and removed RFC Request for comments regarding breaking or large changes labels Dec 20, 2022
@bminer
Copy link

bminer commented Dec 20, 2022

My $0.02 is that VARCHAR(255) should remain the default for DataTypes.STRING for a few reasons:

  1. TEXT has a slightly larger storage requirement: 2 bytes to represent the string length rather than 1 with VARCHAR(255) (this is true in MySQL, not sure on other DBs).
  2. In MySQL, for example, VARCHAR fields can have a default value whereas TEXT fields cannot.
  3. Most string-like fields are going to be VARCHAR(255) -- People / company names, mailing addresses, phone numbers, emails, etc. Most of these fields are not going to exceed 255 characters.
  4. STRING.TEXT is easy enough to tell Sequelize to generate a TEXT field.

There is little justification for changing the current default behavior IMO.

@ephys
Copy link
Member Author

ephys commented Dec 21, 2022

Thank you very much for taking the time to give your feedback :)

You raise important issues that indicate that TEXT is not suitable as a default for MySQL:

  • Poor indexability
  • No default value

Note though that VARCHAR(255) is more often than not going to use a 2 byte length prefix too, due to characters often needing more than one byte

I personally dislike the default length of 255 as I think devs should think about which length suits their column. Still, with all the changes we already have, I'm willing to drop this one

Here is yet another design update:

  • STRING defaults to VARCHAR(255) once again
  • Added STRING(MAX) for a VARCHAR type with no explicit character limit
    • STRING(MAX) does not produce a CLOB
  • I've renamed .TEXT -> .CLOB. TEXT has very different meanings depending on the dialect: TEXT in postgres and sqlite is equivalent to VARCHAR while MySQL's is a CLOB. Also aligns nicely with BINARY.BLOB
  • Renamed STRING.N -> STRING.NATIONAL
  • BINARY now requires a length (instead of defaulting to max length)
  • Added BINARY(MAX)
  • Added Oracle Data Types
  • Renamed COLLATION_BINARY -> MYSQL_BIN_COLLATION, as it's a mysql only option
  • Added STRING.COLLATION options

Latest design

STRING DataType

DataType postgres MySQL/MariaDB SQL Server SQLite DB2 Oracle Snowflake
STRING VARCHAR(255) VARCHAR(255) VARCHAR(255) TEXT1 VARCHAR(255) VARCHAR2(255 CHAR) VARCHAR(255)
STRING('max') VARCHAR VARCHAR(MAX) TEXT VARCHAR
STRING(256) VARCHAR(256) VARCHAR(256) VARCHAR(256) TEXT1 VARCHAR(256) VARCHAR(256 CHAR) VARCHAR(256)
STRING(256).FIXED CHAR(256) CHAR(256) CHAR(256) CHARACTER(256) CHAR(256 CHAR)
STRING.FIXED.LENGTH(256) CHAR(256) CHAR(256) CHAR(256) CHARACTER(256) CHAR(256 CHAR)
STRING.FIXED
STRING('max').FIXED
STRING(n).NATIONAL NVARCHAR(n) NVARCHAR(n) NVARCHAR(n) NVARCHAR(n)
STRING(n).FIXED.NATIONAL NCHAR(n) NCHAR(n) NCHAR(n) NCHAR(n)
STRING(n).INSENSITIVE CITEXT1 TEXT COLLATE NOCASE1
STRING('max').INSENSITIVE CITEXT TEXT COLLATE NOCASE
STRING(n).COLLATION('name') VARCHAR(n) COLLATE name VARCHAR(n) COLLATE name VARCHAR(n) COLLATE name TEXT COLLATE name VARCHAR(n) COLLATE name VARCHAR2(n CHAR) COLLATE name VARCHAR(n) COLLATE name

CLOB

These are available in dialects that support Character Large Objects (Sometimes called TEXT, sometimes called CLOB. Equivalent of a BLOB for characters).

DataType MySQL/MariaDB DB2 Oracle
STRING.CLOB('tiny') TINYTEXT
STRING.CLOB TEXT1 CLOB
STRING(n).CLOB TEXT1 CLOB(n) CLOB1
STRING(n).CLOB.NATIONAL NCLOB(n) NCLOB1
STRING.CLOB('medium') MEDIUMTEXT
STRING.CLOB('long') LONGTEXT

MySQL/MariaDB extensions

DataType MySQL/MariaDB
STRING(n).MYSQL_BIN_COLLATION VARCHAR(n) BINARY
STRING(n).FIXED.MYSQL_BIN_COLLATION CHAR(n) BINARY

About NATIONAL

Current mssql implementation uses NVARCHAR by default. In order to not require users to add .NATIONAL everywhere, we could add the following global option:

const sequelize = new Sequelize({
  string: { 
    // Use .NATIONAL by default when using DataTypes.STRING
    national: true,
  },
});

Users can then opt out of using NATIONAL by writing DataTypes.STRING.NATIONAL(false).

We should also add a LENGTH option to allow users to re-use the same STRING type with different lengths:

const NVARCHAR = DataTypes.STRING.NATIONAL;

const User = sequelize.define('User', {
  name: NVARCHAR.LENGTH(100),
});

BINARY DataType

DataType postgres MySQL/MariaDB SQL Server SQLite DB2 Oracle Snowflake
BINARY
BINARY('max') BYTEA VARBINARY(MAX) BLOB LONG RAW VARBINARY
BINARY(256) BYTEA2 VARBINARY(256) VARBINARY(256) BLOB2 VARBINARY(256) RAW(256) VARBINARY(256)
BINARY(256).FIXED BINARY(256) BINARY(256) BINARY(256)
BINARY.FIXED
BINARY.LENGTH(8000) BYTEA2 VARBINARY(8000) VARBINARY(8000) BLOB2 VARBINARY(8000) RAW(8000) VARBINARY(8000)

While STRING defaults to 255 characters because it's a commonly used type, BINARY requires an explicit length.,

BLOB

These are available in dialects that support Binary Large Objects (BLOB).

DataType MySQL/MariaDB DB2 Oracle
BINARY.BLOB('tiny') TINYBLOB
BINARY.BLOB BLOB BLOB
BINARY(256).BLOB BLOB2 BLOB(256) BLOB2
BINARY.BLOB('medium') MEDIUMBLOB
BINARY.BLOB('long') LONGBLOB
BINARY(256).BLOB('long') LONGBLOB2

Pending Questions

  • Is mssql's VARCHAR(MAX) more suited to STRING.CLOB or STRING('max')?
  • Do we want to keep STRING.INSENSITIVE? It's better to use a collation than CITEXT for this. Maybe rename to STRING.CITEXT for postgres? Or keep CITEXT a separate class?

Footnotes

  1. String length check constraint 2 3 4 5 6 7 8

  2. Byte Count check constraint 2 3 4 5 6 7

@1valdis
Copy link

1valdis commented Jan 12, 2023

First I was annoyed that .STRING would be 255 by default, then I was comforted that it was changed to TEXT in pg (which I mainly use), but then it was again limited to 255, what's that emotional rollercoaster? 😁

I'm used to just use pg's TEXT everywhere and rely on validation on the client code. If you're required to increase max text length of something, why bother with migrations and constraints if with TEXT you can simply change client code validation and be done with it? Makes for less confusion between different text types, and also I like its name better than VARCHAR.

Given that, I don't see pg's TEXT at all in the latest proposal which makes me sad. Even though I know that in pg TEXT and VARCHAR without length specifier are the same thing, I got used to assuming that VARCHAR is something limited in length and TEXT is unlimited.

@ephys
Copy link
Member Author

ephys commented Jan 14, 2023

what's that emotional rollercoaster?

It's a very difficult balancing act between trying to have data types that mean (roughly) the same thing in all dialects, and have good defaults for each dialect. It's a headache.

The design isn't set in stone yet. We're very much still debating this proposal

I'm used to just use pg's TEXT everywhere and rely on validation on the client code.

Note that you can reuse a configured data type like this:

// datatypes.ts
export const STRING = DataTypes.STRING.MAX;

// user.model.ts
import { STRING } from './datatypes.js';

class User extends Model {
  @Attribute(STRING)
  declare username: string;
}

@jthoward64
Copy link

One option to balance between having a default length for VARCHAR and encouraging developers to choose a reasonable size would be to make the default a Sequelize constructor option, and if that option is not provided, simple disallow the default constructor. That way not only do you increase the visibility of the fact that STRING columns have a max length (which some devs miss initially), but also give a bit of extra flexibility. IIRC DataTypes have access to a context object assembled by the Sequelize instance; this value should fit in there.

@ephys ephys unpinned this issue Mar 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change For issues and PRs. Changes that break compatibility and require a major version increment. type: feature For issues and PRs. For new features. Never breaking changes.
Projects
Status: v8
Development

No branches or pull requests

5 participants