New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DROP COLUMN doesn't remove field from mapping #322
Comments
Yep, this is expected. Elasticsearch doesn't provide a way to remove a field from the index mapping, so there's not really anything ZDB can do. I suppose this could be better documented by ZDB, but that's about it. |
Also worth nothing that ES doesn't provide a way to rename a field in a mapping either. :( |
Is it possible to control the order of operations at the ES level? right now it looks like it drops the index, makes a new one and populates it. |
You mean during a REINDEX? The problem there is that ES doesn't provide a way to rename an index, either. They want you to do that using aliases, and ZDB uses aliases for other reasons. What's the actual problem we're trying to solve here? Do you have a production workflow where you drop/re-add columns with the same name but different types? |
Hmm, I guess maybe you could use ES' "reindex" api to do this. I haven't actually used that, but I think it would necessitate a full rewrite of the index. |
it is basically a user defined schema allowing companies to define a custom table to store data + search through it with full operator support for the appropriate data type. Indexing that is pretty difficult. operator support for date / numeric ranges in json columns is mostly non-existent. But ES makes this pretty easy. The problem is people tend to change their mind. Or someone tries to add a field that were previously deleted and it is of a different type.
a full index write under the hood is fine. more the problem is that nothing is really going to work while that is happening because the index is just gone. If zombo stood up a new index before removing the old one - I think we'd be in business. I can probably get around the type conflicts in general by using a type suffix like
yes during a reindex. our manual process in the past for reindexing ES / fixing mappings in the past has been
applications queries the alias obviously. our template mapping had a This is basically a |
Lets look at this example: create table foo (id text);
insert into foo (id) values (1), (2), (3);
create index idxfoo on foo using zombodb ((foo.*));
alter table foo alter column id type bigint using id::bigint; -- rewrites index
begin;
alter table foo alter column id type text using id::text; -- takes exclusive lock, blocks concurrent reads
commit; -- lock released This uses So maybe you can look at doing this pattern for your case of "user decides to change field type". You just need proper conversion functions/casts from type a to type b. That said, in the case of
The same, I suppose, could be done for If you're looking for some kind of "REINDEX CONCURRENTLY" feature, that ain't gunna happen from ZDB until it's an actual feature that PG supports first. Thoughts? |
Ya that is an idea. There isn't always a clean mapping like going for a number to a timestamp or a timestamp to a boolean for example. That would make sense to end users anyway After thinking about it, I think I could just use an internal representation for the column names for zombodb to index and use column aliases in the query. this would minimize the likely hood of conflicts.
that would be fantastic. |
Yeah, that's a good idea. Maybe you could use I'll think about the work to intercept |
ya, there is a table that tracks what is user facing ( type, name, id ) and a trigger on that table that manages a partition ( add / remove columns ) for that company with a zombo index on it. so I think this is pretty feasible. Thanks! |
And maybe another general benefit is that you can programmatically create sensical Postgres column names that are all lowercase, no spaces, no unicode, etc, but let users just name them whatever they want. How could that possibly cause a problem?! |
actually now that I understand what is going a little better and have a better approach to it I'm not sure I would want every drop column to always trigger a reindex. could it be a setting of some type? |
From ZDB's perspective, I like the idea of it leaving the index in a state that exactly matches the table schema, and doesn't require manual a I'll think on it more. I've got things to sort out like: BEGIN;
ALTER TABLE foo DROP COLUMN bar; -- ZDB rewrites the index and mapping
ALTER TABLE foo ADD COLUMN bar text; -- ZDB adds "bar" to index mapping
ABORT; -- now what? I've gotta be able to properly handle transaction rollbacks with this. |
Is this closed fixed or won't do? |
Won’t do. I just don’t see it as overly valuable relative to the implementation completely plus the operational overhead of reindexing the entire index. |
ZomboDB version: 10-1.0.0b9
Postgres version: 10.5
Elasticsearch version: 6
Problem Description:
When removing a column from a table, the related field is left in the index mapping in elastic search.
This means you can't drop a column and add it back with a different column type. The only thing that seems to get a round the problem is a full
REINDEX
.or add a suffix to the column names.
Is this expected?
The text was updated successfully, but these errors were encountered: