Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MaterializedMySQL: Support ENUM data type #18310

Closed
pesto93 opened this issue Dec 21, 2020 · 9 comments · Fixed by #25676
Closed

MaterializedMySQL: Support ENUM data type #18310

pesto93 opened this issue Dec 21, 2020 · 9 comments · Fixed by #25676
Labels
comp-mysql easy task Good for first contributors feature question Question?

Comments

@pesto93
Copy link

pesto93 commented Dec 21, 2020

I was checking out this new feature - on MYSQL I have a table customer

CREATE TABLE `addresses` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `customer_id` int(11) NOT NULL,
  `street` varchar(255) NOT NULL,
  `city` varchar(255) NOT NULL,
  `state` varchar(255) NOT NULL,
  `zip` varchar(255) NOT NULL,
  `type` enum('SHIPPING','BILLING','LIVING') NOT NULL,
  PRIMARY KEY (`id`),
  KEY `address_customer` (`customer_id`),
  CONSTRAINT `addresses_ibfk_1` FOREIGN KEY (`customer_id`) REFERENCES `customers` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=17 DEFAULT CHARSET=latin1

But replication is failing when using MaterializeMySQL

2020.12.21 11:56:08.925135 [ 48 ] {} <Debug> executeQuery: (internal) /*Materialize MySQL step 1: execute MySQL DDL for dump data*/ EXTERNAL DDL FROM MySQL(inventory, inventory) CREATE TABLE `addresses` ( `id` int(11) NOT NULL AUTO_INCREMENT, `customer_id` int(11) NOT NULL, `street` varchar(255) NOT NULL, `city` varchar(255) NOT NULL, `state` varchar(255) NOT NULL, `zip` varchar(255) NOT NULL, `type` enum('SHIPPING','BILLING','LIVING') NOT NULL, PRIMARY KEY (`id`), KEY `address_customer` (`customer_id`), CONSTRAINT `addresses_ibfk_1` FOREIGN KEY (`customer_id`) REFERENCES `customers` (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=17 DEFAULT CHARSET=latin1
2020.12.21 11:56:08.925526 [ 48 ] {} <Error> MaterializeMySQLSyncThread(inventory): Query EXTERNAL DDL FROM MySQL(inventory, inventory)  CREATE TABLE `addresses` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `customer_id` int(11) NOT NULL,
  `street` varchar(255) NOT NULL,
  `city` varchar(255) NOT NULL,
  `state` varchar(255) NOT NULL,
  `zip` varchar(255) NOT NULL,
  `type` enum('SHIPPING','BILLING','LIVING') NOT NULL,
  PRIMARY KEY (`id`),
  KEY `address_customer` (`customer_id`),
  CONSTRAINT `addresses_ibfk_1` FOREIGN KEY (`customer_id`) REFERENCES `customers` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=17 DEFAULT CHARSET=latin1 wasn't finished successfully: Code: 50, e.displayText() = DB::Exception: Unknown data type family: enum. Maybe you meant: ['Enum','Enum8'], Stack trace (when copying this message, always include the lines below):

0. DB::DataTypeFactory::findCreatorByName(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) const @ 0xd52a936 in /usr/bin/clickhouse
1. DB::DataTypeFactory::get(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::IAST> const&) const @ 0xd529bf8 in /usr/bin/clickhouse
2. DB::DataTypeFactory::get(std::__1::shared_ptr<DB::IAST> const&) const @ 0xd5299e0 in /usr/bin/clickhouse
3. ? @ 0xdcc8388 in /usr/bin/clickhouse
4. DB::MySQLInterpreter::InterpreterCreateImpl::getRewrittenQueries(DB::MySQLParser::ASTCreateQuery const&, DB::Context const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xdcc4907 in /usr/bin/clickhouse
5. DB::MySQLInterpreter::InterpreterMySQLDDLQuery<DB::MySQLInterpreter::InterpreterCreateImpl>::execute() @ 0xd922fac in /usr/bin/clickhouse
6. DB::InterpreterExternalDDLQuery::execute() @ 0xd922027 in /usr/bin/clickhouse
7. ? @ 0xdca2baa in /usr/bin/clickhouse
8. DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool) @ 0xdca177d in /usr/bin/clickhouse
9. ? @ 0xd8d13e4 in /usr/bin/clickhouse
10. ? @ 0xd8e9128 in /usr/bin/clickhouse
11. DB::commitMetadata(std::__1::function<void ()> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xd8eed2e in /usr/bin/clickhouse
12. DB::MaterializeMetadata::transaction(DB::MySQLReplication::Position const&, std::__1::function<void ()> const&) @ 0xd8ef221 in /usr/bin/clickhouse
13. DB::MaterializeMySQLSyncThread::prepareSynchronized(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xd8caff9 in /usr/bin/clickhouse
14. DB::MaterializeMySQLSyncThread::synchronization(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xd8ca67a in /usr/bin/clickhouse
15. ? @ 0xd8e87ca in /usr/bin/clickhouse
16. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x7b7293d in /usr/bin/clickhouse
17. ? @ 0x7b76463 in /usr/bin/clickhouse
18. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
19. clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
 (version 20.11.4.13 (official build))
2020.12.21 11:56:08.925748 [ 48 ] {} <Error> MaterializeMySQLSyncThread: Code: 50, e.displayText() = DB::Exception: Unknown data type family: enum. Maybe you meant: ['Enum','Enum8'], Stack trace (when copying this message, always include the lines below):

@pesto93 pesto93 added the question Question? label Dec 21, 2020
@filimonov
Copy link
Contributor

filimonov commented Dec 23, 2020

Hm. @BohuTANG i thought #3985 was addressed already in #11996 ?

@BohuTANG
Copy link
Contributor

hi,

#11996 is addressed MySQL engine enum type.
MaterializedMySQL enum is not releated to it.

@Lonsss
Copy link

Lonsss commented Dec 27, 2020

Maybe someone has an idea what caused this error?

SQL Error [50]: ClickHouse exception, code: 50, host: ***, port: 8123; Code: 50, e.displayText() = DB::Exception: Unknown data type family: enum. Maybe you meant: ['Enum','Enum8'] (version 20.13.1.5273 (official build))

@tavplubix
Copy link
Member

Maybe someone has an idea what caused this error?

Short answer: MaterializeMySQL does not support ENUM data type.

Long answer:
The error happens because data type names in ClickHouse are case-sensitive, and data type names in MySQL are case-insensitive, so ClickHouse does not have enum or ENUM data type, but it has Enum. We could add case-insensitive alias for Enum data type name (like we added BIGINT alias for Int64, for example), but there is another issue. ClickHouse requires Enum values to be declared as 'name' = value, but MySQL assignes values to names implicitly. We also need support of syntax like Enum('a', 'b', 'c') in ClickHouse.

@tavplubix tavplubix changed the title MaterializedMySQL Unknown data type family: enum MaterializedMySQL: Support ENUM data type Dec 29, 2020
@tavplubix tavplubix added easy task Good for first contributors feature labels Dec 29, 2020
@alexey-milovidov
Copy link
Member

It's a good idea to do both :)

@Lonsss
Copy link

Lonsss commented Jan 6, 2021

HI! when can we expect support for this functionality (enum)?)

@stigsb
Copy link
Contributor

stigsb commented Jan 11, 2021

It seems that there are two approaches that could be explored to implement this, either mapping to ClickHouse enums, or mapping to LowCardinality(String).

In both cases one has to account for MySQL's case insensitivity, which means either upper- or lower-casing the enum values when passed as strings.

Keeping the numerical values in sync seems doable (since MySQL won't let you remove an enum field that is present in a row), but one has to parse the enum type from information_schema.columns and implicitly assign numeric values from their order. This may cause funkiness down the line if there are materialized views that have outdated enum values, but that could be documented with a caveat.

The LowCardinality(String) approach seems more generic, but AFAIK it's going to consume more space, and you will lose the ability to do numerical comparisons.

@stigsb
Copy link
Contributor

stigsb commented Jun 3, 2021

@tavplubix what do you think about adding optional case-insensitivity for enum value lookups in CH, and enable this transparently in StorageMaterializeMySQL::read()? We have a summer intern who can implement this.

@stigsb
Copy link
Contributor

stigsb commented Jun 28, 2021

This issue is addressed by #25676

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp-mysql easy task Good for first contributors feature question Question?
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants