Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

importccl: IMPORT MYSQLDUMP fails on fulltext index definition #29625

Closed
rolandcrosby opened this issue Sep 6, 2018 · 6 comments
Closed

importccl: IMPORT MYSQLDUMP fails on fulltext index definition #29625

rolandcrosby opened this issue Sep 6, 2018 · 6 comments
Assignees
Labels
A-disaster-recovery C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. good first issue

Comments

@rolandcrosby
Copy link

I tried importing the data from the CorpWatch API's MySQL database dumps. Most worked fine, but two failed with the error message Error: pq: could not read definition for table "company_locations" (possible unsupported type?). The offending statement in both table definitions ended up being the FULLTEXT index definition at the end - when I removed those indexes, the tables imported happily.

Here's one of the two schemas that wouldn't import because of this issue - the other was company_names.sql. (A third table in this dataset, filings.sql, couldn't be imported because it contained zero dates; see #29298 for that issue).

CREATE TABLE `company_locations` (
  `location_id` int(11) NOT NULL AUTO_INCREMENT,
  `cw_id` int(11) DEFAULT NULL,
  `date` date DEFAULT NULL,
  `type` varchar(15) DEFAULT NULL,
  `raw_address` varchar(500) DEFAULT NULL,
  `street_1` varchar(300) DEFAULT NULL,
  `street_2` varchar(300) DEFAULT NULL,
  `city` varchar(100) DEFAULT NULL,
  `state` varchar(40) DEFAULT NULL,
  `postal_code` varchar(11) DEFAULT NULL,
  `country` varchar(100) DEFAULT NULL,
  `country_code` char(2) DEFAULT NULL,
  `subdiv_code` char(3) DEFAULT NULL,
  `min_year` int(11) DEFAULT NULL,
  `max_year` int(11) DEFAULT NULL,
  `most_recent` tinyint(1) NOT NULL DEFAULT '0',
  PRIMARY KEY (`location_id`),
  KEY `country_code` (`country_code`,`cw_id`) USING BTREE,
  KEY `subdiv_code` (`subdiv_code`,`cw_id`) USING BTREE,
  KEY `year` (`min_year`,`max_year`),
  KEY `cwindex` (`cw_id`,`min_year`,`max_year`) USING BTREE,
  KEY `most_recent` (`most_recent`),
  KEY `postal_code` (`postal_code`),
  FULLTEXT KEY `raw_address` (`raw_address`)
) ENGINE=MyISAM AUTO_INCREMENT=1789464 DEFAULT CHARSET=utf8 COMMENT='allows each company to have multiple locations associated wi';
@rolandcrosby rolandcrosby added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-disaster-recovery labels Sep 6, 2018
@dt dt removed their assignment Dec 6, 2018
@rolandcrosby
Copy link
Author

We should just ignore these fulltext indexes; I think this would be a pretty simple change.

@damienhollis
Copy link
Contributor

@rolandcrosby it seems like the vitess parsing code just doesn't return a TableSpec when there is a FULLTEXT index. Did you have an idea how this could be fixed without changing the vitess parser?

@rolandcrosby
Copy link
Author

@damienhollis I hadn't looked that deeply into the problem. Check with @dt, I think he has the most experience with the Vitess integration.

@dt
Copy link
Member

dt commented Jan 21, 2020

My memory is a little fuzzy here, but I believe that vitess' parser's usual behavior when it sees anything it doesn't know how to parse is to nil the whole statement (or maybe just the node?) it was parsing, so I think, even if we ultimately just want to ignore it, we'd still need to teach the parser how to parse FULLTEXT. In the past, we've just gone ahead and added the appropriate AST nodes / fields and parser rules to correctly capture such things in the AST, even if both we (and vitess) then ignore them, and indeed the vitess maintainers have welcomed such patches upstream too. In theory, I think we could also just teach the parser to parse and discard the extra qualifier rather than add it to the AST, but if we later decide to actually handle it we'd need to revisit that.

All that said, I'm still on the fence if we should do anything with fulltext other than error -- we don't support actual fulltext indexing, so if your schema includes that, claiming we successfully imported it if we don't seems like it has the potential to be confusing. I guess ideally I'd add an option like "elide_fulltext" to explicitly choose to ignore it.

@damienhollis
Copy link
Contributor

As a user that hit this issue, two things would be useful:

  1. the error message stating the actual problem
  2. the ability to ignore fulltext indexes

I tend to agree that just ignoring them by default could be surprising.

@dt
Copy link
Member

dt commented Jan 21, 2020

@damienhollis Yeah, completely agree on both.

cc @mwang1026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-disaster-recovery C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. good first issue
Projects
None yet
Development

No branches or pull requests

4 participants