Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to define index key length while creating schema #9293

Closed
eidng8 opened this issue Jun 17, 2015 · 38 comments
Closed

Ability to define index key length while creating schema #9293

eidng8 opened this issue Jun 17, 2015 · 38 comments
Labels

Comments

@eidng8
Copy link

eidng8 commented Jun 17, 2015

Sorry, that I have to open the issue again. Since the issue doesn't get cleared.

Following #9256, what @GrahamCampbell said is not the case. I've quoted here for convience:

I've just tried out v5.1, and got the infamous ERR1071 while doing artisan migrate, right after laravel new. I can't find out a solution to it, yet. Because table charset is utf8mb4, and the email column is (by default) varchar(255), creating a unique on it will definitely cause error. I can't find out a way to specify the key length while calling

$table->string('email')->unique();

from the 2014_10_12_000000_create_users_table.php file.

After a little digging, it seems that we'll have to modify several index related functions in the Illuminate\Database\Schema\Blueprint class, and eventually (maybe not), the Illuminate\Support\Fluent.

Any hint?

@GrahamCampbell answered

Yeh, it's possible. Pass a second param to string with the length.

$table->string('email', 128)->unique();

Continue the talk

Passing in the length to string() changes the length of the column, but not the KEY length. The email column does need to be long enough to hold the data, truncating it to 128 may break the thing.

From what I've dug up:

\Illuminate\Database\Schema\Blueprint::string is defined as:

public function string($column, $length = 255)
{
    return $this->addColumn('string', $column, compact('length'));
}

which just wraps up a call to \Illuminate\Database\Schema\Blueprint::addColumn

protected function addColumn($type, $name, array $parameters = [])
{
    $attributes = array_merge(compact('type', 'name'), $parameters);
    $this->columns[] = $column = new Fluent($attributes);
    return $column;
}

which in turn creates a Fluent instance.

I didn't dig deeper into it, except checking the generated SQL. I can't find any key length related process in the code.

A bit more to clarify, that the key length is specifically MySQL, which makes a partial index using only the start of the column, given length in bytes.

@GrahamCampbell
Copy link
Member

If 128 is too short for you, use something else!

@Malkaduhimi
Copy link

@GrahamCampbell, I think you are misunderstanding @eidng8. As he points out, he does not want to change the length of the column. He wants to change the length of the index prefix, see https://dev.mysql.com/doc/refman/5.0/en/column-indexes.html.

@williamjulianvicary
Copy link

@GrahamCampbell I also think you are misunderstanding @eidng8 and I have a similar requirement for this.

MySQL (and I'm sure other storage engines) support limiting an index not the column by a length. For example for performance purposes you may choose to only index the first 7 characters of varchar column, you may still want a 256 limit but the INDEX needs to be limited.

MySQL SQL code for this could be:
..... .... COLUMN message VARCHAR(300) , ADD UNIQUE INDEX index4 (message(100) ASC);

@TheWildHorse
Copy link

@GrahamCampbell Please take a look at this again, I think you misunderstood the ticket and closed a perfectly valid feature request. If you re-open the ticket I will implement this myself.

@Malkaduhimi
Copy link

@taylorotwell we can not get an answer from Graham, please look at it.

@reedmaniac
Copy link

Indeed, I just received a "Specified key was too long, max key length is 767 bytes" error during a migration. I suspect this is because we recently standardized on utf8mb4 going forward. Going to talk to DevOps about increasing the max but it would be nice to specify a key length.

@adamthehutt
Copy link
Contributor

+1 on this. It's a problem.

@salopot
Copy link

salopot commented Feb 20, 2017

+1

@NBZ4live
Copy link

NBZ4live commented Mar 21, 2017

+1

Especially critical for BLOB/TEXT columns.

@ozzylaundrapp
Copy link

+1, Ran into this problem trying to create an index on a TEXT column.

@williamjulianvicary
Copy link

I pinged @taylorotwell on Twitter to see if this could be re-opened - this would be very helpful for more complex data sets where performance is critical.

@taylorotwell taylorotwell reopened this Jun 9, 2017
@ivandokov
Copy link
Contributor

I recently faced this issue. A good use case is when you want to store an IPv4/6 address in database you can use inet_pton to make it binary and inet_ntop to revert it to readable format.
For column type you should use binary and if you want to index this column the length of the index is required.

PS: @taylorotwell it will be nice if you add this common case as native $casts type for IP addresses for Model and change the implementation for $table->ipAddress in migrations to use binary for Laravel 5.5

@karlhepler
Copy link

+1

@themsaid themsaid added the bug label Oct 26, 2017
@myquote-dev
Copy link

+1

@myquote-dev
Copy link

myquote-dev commented Nov 2, 2017

As a temporary workaround. It is possible to use the following raw syntax for defining indexes etc. $table->index([DB::raw('column(100)')])

@rbkkm
Copy link

rbkkm commented Feb 4, 2018

That whole migration hassle is so bad, I can't describe how bad :((

@jblotus
Copy link

jblotus commented Feb 15, 2018

This is an issue for one of my projects as well. I have a data that is 255 long and changing the default length to 191 as the laravel manual suggests isn't enough. Setting the length of the column to 255 is fine but then I cannot index it with without being able to specify and index length of 255. Will try @myquote-dev workaround

@mrtoorich
Copy link

$table->index([DB::raw('value(100), name')], 'parameters_value_name_index');

@ryangurn
Copy link

I think that this would also be helpful in this. It has been causing issues for me with the collation not matching up when migrating on systems with slightly different mysql configurations.

https://phplaravel.wordpress.com/2016/07/06/using-mysql-utf8mb4-character-set-and-collation-with-laravel/

@driesvints driesvints changed the title Ability to define index key length while creating schema, Re-open. Ability to define index key length while creating schema Apr 8, 2019
@trianity
Copy link

@myquote-dev Thanks for the suggestion. It works, and it can be used in only the TEXT/BLOB fields.

$table->index([DB::raw('column(100)')])

@driesvints
Copy link
Member

driesvints commented Jan 21, 2020

We're reconsidering #25200 as a fix for this in Laravel 7. If anyone's up to it you may send in a PR to the master branch. Support for all DB engines is probably wanted.

@eidng8
Copy link
Author

eidng8 commented Jan 22, 2020

Just noticed that this issue is still alive. And just in case this bothers you and you haven't stumbled upon a new option since early MySQL 5.6:

https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix

By enabling this, index prefix is enlarged to 3072, enough to fit most of your needs. So if you can change your MySQL server setting, you can go for it.

@driesvints
Copy link
Member

@eidng8 the problem is that it can't be set through the schema builder

@eidng8
Copy link
Author

eidng8 commented Jan 24, 2020

Yes, of course. That's why I opened this issue. I just want to leave a note in case it could help.

@eidng8
Copy link
Author

eidng8 commented Jan 25, 2020

I would like to pick up this. But I'm having a long trip to Moscow next week. So it would be a bit later on the task.

Before diving into code, I need some input from you. There are differences among supported databases:

MySQL

You've mostly have the idea. It's the prefix length of the index.

PostgreSQL

There is no such thing as prefix or index length. They use expression index instead. So if we should implement this, the index will be translated from something like (say, prefix length is 10)

-- MySQL
CREATE INDEX a_index ON a_table (a_column(10));

to something like:

-- PostgreSQL
CREATE INDEX ON a_table ((substring(a_column 1 for 10)));

However, people with proper PostgreSQL background may specifically do this instead of letting the framework to decide on it. So I'm not sure if this is the right way for us.

SQLite

SQLite doesn't seem to have limit specifically on index. There isn't even a clause or expression for this kind of thing. So the parameter shall be ignored by SQLiteGrammar.

SQL Server

This is the worst. It has a 900 or 1700 bytes limit on key length. Depending on the version of SQL Server and whether the index were clustered or not. However, there is nothing in T-SQL to work around this limitation. If it has to be an index for wide columns, many would use an extra hash column to hold the hash (e.g. SHA) value of the wide column, then create an index on the hash column. However, it doesn't seem proper for a framework to make this kind of decision. So I tend to prefer ignoring this parameter in SqlServerGrammar.

Important: I'm not a database professional. All of the above are just what I know. They may not be all true. But I do need your decisions about handling differences among those databases.

@driesvints
Copy link
Member

@staudenmeir is it true that index length is only supported on MySQL? In that case I guess your PR is good to go? We could override the methods on the different grammers to throw a descriptive exception when people attempt to set the length on different engines maybe?

@staudenmeir
Copy link
Contributor

staudenmeir commented Jan 27, 2020

@driesvints Only the MySQL implementation can easily be used in queries.

PostgreSQL and SQLite also support prefix indexes, but they can only be used in a way that requires raw expressions in Laravel:
https://stackoverflow.com/a/21824039/4848587
https://sqlite.org/expridx.html

Of course, we could support the index creation regardless of how complex it is to use them in queries.

@driesvints
Copy link
Member

@staudenmeir thanks! Let's get this only implemented for MySQL then and document it clearly that it's only available there.

@eidng8
Copy link
Author

eidng8 commented Feb 3, 2020

Should I do anything on the PR then? or is it good enough?

@driesvints
Copy link
Member

@eidng8 it looks good

@driesvints
Copy link
Member

Since this is a feature request I'm closing this. Anyone's free to attempt the PR of staudenmeir again.

@vesper8
Copy link

vesper8 commented Dec 9, 2020

I wanted to add an index on my text column and ran into this issue.

I was able to work around it by removing the ->index() from the $table->text('some_column') and then added $table->index([DB::raw('some_column(750)')])

I experimented with the number inside the parentheses.. my understanding is that you can only index the first n characters inside a text field. If you go a little higher than 750 you run into this error:

SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 3072 bytes

@myquote-dev
Copy link

I wanted to add an index on my text column and ran into this issue.

I was able to work around it by removing the ->index() from the $table->text('some_column') and then added $table->index([DB::raw('some_column(750)')])

I experimented with the number inside the parentheses.. my understanding is that you can only index the first n characters inside a text field. If you go a little higher than 750 you run into this error:

SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 3072 bytes

The most likely reason for this is that 'some_column' is using the utf8mb4 character set which is 4 bytes per character. Therefore the maximum number of characters you can use for your index is 3072 bytes / 4 bytes = 768 characters. utf8mb4 text indexes over 768 will exceed the 3072 byte limit.

@vesper8
Copy link

vesper8 commented Dec 9, 2020

@myquote-dev hrm interesting to know.. thanks for explaining

For a few years now I have just been defaulting to using utf8mb4 everywhere, even though in 90% of my projects I only deal with English.. do you think I should re-think my strategy and use regular utf8 instead? Which in this case would allow me to have a larger indexable text/varchar field?

@williamjulianvicary
Copy link

750 should be ample for an index, what you’re after is a low cardinality in the indexes that also doesn’t bloat them (my understanding is the optimiser built in to MySQL will first look to your indexes, but if those indexes are just as big as your data it won’t help quite so much).

If 750 characters isn’t enough length to identify a single record in your table or provide ordering (I expect it is but perhaps not!) then there are other issues with the architecture.

I think you may be stressing over micro optimisations that will probably have little difference on performance - it’s easy enough to test, but I’d expect 750 or 1500 length to provide identical performance. (And for that matter, 250 or less would probably be plenty)

@vesper8
Copy link

vesper8 commented Dec 9, 2020

@williamjulianvicary thanks for you input. Well perhaps you can give me your quick advice regarding my specific problem. I am dealing with a database table containing 7 million Chess games. The column I wanted to index in order to speed up queries is a column that contains the Chess line. Here are a few examples of what those values are:

d4d5c4c6Nc3Nf6e3e6Nf3Nbd7Bd3dxc4Bxc4b5Be2Bb7O-OBe7e4
d4e6c4Nf6Nf3Bb4+Nbd2b6a3Bxd2+Bxd2Bb7Bg5d6e3Nbd7Be2O-OO-Oh6Bh4Qe7b4c5Qb3Rab8Rfd1g5Bg3Ne4Nd2Nxg3hxg3d5cxd5Bxd5Bc4Bxc4Qxc4b5Qc3c4Ne4Rfd8a4f5Nc5Nxc5dxc5Rxd1+Rxd1Rd8Rd4e5Rd2a6g4fxg4axb5axb5Rxd8+Qxd8Qxe5g3fxg3
d4Nf6Nf3e6c4b6a3Bb7Nc3d5Bg5Be7Qa4+c6Bxf6Bxf6cxd5exd5g3O-OBg2c5Rd1c4O-Oa6Ne5b5Qc2Qd6e4Bxe5dxe5Qxe5f4Qd6Nxd5Nd7Qf2Rad8Rd2Bxd5Rxd5Qc7Rfd1Nb6R5d4Na4e5Rxd4Qxd4c3bxc3Nxc3Re1Rd8Qe3Nd1Qe2Qc5+Kf1Nc3Qe3Qc4+Kg1a5f5b4e6fxe6fxe6Re8e7Qc8axb4axb4Qd4b3Qb4Kh8Rf1h6Be4Ne2+Kg2g5Qd6

I perform queries such as, show me all lines that begin with d4d5c4c6Nc3Nf6

e.g. SELECT * FROM games WHERE line LIKE "d4d5c4c6Nc3Nf6%"

I know I could throw that up on an Elastic Search instance somewhere etc but right now I'm working with it on my local dev so CPU/RAM is limited and I just wanted to see if adding an index would speed things up. It looks like it did a little bit but it's still quite slow

@williamjulianvicary
Copy link

You could try a shorter index on that field (say 255 characters so you sit within the index length limits of UTF8 or shorter) - that may help ensure your index is able to be used.

If you're requesting X hundred k records with that query though, it's not going to be that quick.

@PavelMove
Copy link

PavelMove commented Jul 28, 2022

@driesvints, Just hiding the feature request doesn't make framework better. PR was rejected, but this feature is still required and would be required. How do people who want improve framework would know, that this is still needed?
You had to link PR to this issue - yes. But not closing it, just because PR exists - it's a worst decision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests