Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

failed to rename columns #8

Closed
melezhik opened this issue Oct 17, 2011 · 29 comments
Closed

failed to rename columns #8

melezhik opened this issue Oct 17, 2011 · 29 comments
Labels

Comments

@melezhik
Copy link

cat schema.yaml

    schema public:
      description: standard public schema
      table t1:
        columns:
        - c1:
            not_null: true
            type: integer
        - c2:
            type: smallint
        - c3:
            default: 'false'
            type: boolean
        - c4:
            type: text
        foreign_keys:
          t1_c2_fkey:
            columns:
            - c2
            references:
              columns:
              - c21
              schema: s1
              table: t2
        primary_key:
          t1_pkey:
            access_method: btree
            columns:
            - c1
    schema s1:
      table t2:
        columns:
        - c21:
            not_null: true
            type: integer
        - c22:
            type: character varying(16)
        primary_key:
          t2_pkey:
            access_method: btree
            columns:
            - c21

yamltodb -H melezhik.x --user test -1 test test.yml | psql --user=test -h melezhik.x test

BEGIN
CREATE SCHEMA
CREATE TABLE
CREATE TABLE
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "t2_pkey" for table "t2"
ALTER TABLE
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "t1_pkey" for table "t1"
ALTER TABLE 
ALTER TABLE
COMMIT

alter table t1 rename column c4 to cc4;
yamltodb -H melezhik.x --user test -1 test test.yml

BEGIN;
ALTER TABLE t1 DROP COLUMN cc4;
COMMIT;

which is wrong, because only drop column cc4, but also has to create column c4!

@jmafc
Copy link
Member

jmafc commented Oct 17, 2011

This is a limitation of any "schema diff" program, like Pyrseas, apgdiff and others. I blogged about this here: http://pyrseas.wordpress.com/2011/04/21/sql-database-version-control-and-renames/ . If you rename an object (column, table, etc.), there is nothing in the PostgreSQL catalogs that keeps the previous name.

The solution we suggest is to edit the YAML output and add an 'oldname: c4' under the 'cc4' column spec. Then yamltodb will generate the correct ALTER TABLE RENAME COLUMN statement. We think this is better than dropping the column and recreating it with the new name.

Having said, I will take a look at why it's generating ALTER TABLE DROP COLUMN cc4, instead of ALTER TABLE DROP COLUMN c4 followed by ALTER TABLE ADD COLUMN cc4.

@jmafc
Copy link
Member

jmafc commented Oct 17, 2011

@melezhik, in my testing, after the RENAME COLUMN, the yamltodb output is to DROP COLUMN c4, not cc4. Can you please confirm that is a typo on your part? Also, I think your last sentence should read "which is wrong, because only drop column c4, but also has to create column cc4!"

@melezhik
Copy link
Author

no, the sentence is correct. I see the limitation, but would not it be better to raise exception in this case? something like column c4 is missing ?

@jmafc
Copy link
Member

jmafc commented Oct 17, 2011

OK, I guess I misunderstood the intent of your example. You're running everything against the same database and the ALTER TABLE RENAME COLUMN is a statement that you'd want to back out. Is that correct? In other words, you want to reinstate the previous schema and undo the RENAME.

I don't think raising an exception is a useful default because that missing/renamed/dropped objects are rather common in version control. I can see three possible solutions:

  1. yamltodb ought to DROP the renamed column and ADD back the old column (this should be the default behavior)
  2. yamltodb, given some suitably named command-line option, will flag the missing column rather than DROP/ADD
  3. Add oldname: to the YAML in order to get a RENAME (already available)

@jmafc
Copy link
Member

jmafc commented Oct 18, 2011

For the record, behaviors 1 and 3 already work at the table level, e.g.,

$ dbtoyaml testdb
schema public:
  description: standard public schema
  table t1:
    columns:
    - c1:
        type: integer
    - c2:
        type: text
$ psql testdb
testdb=> ALTER TABLE t1 RENAME TO t2;
$ yamltodb testdb test.yaml
CREATE TABLE t1 (
    c1 integer,
    c2 text);
DROP TABLE t2;
$ # edit test.yaml, change 'table t1' to 'table 2' and add oldname: t1 under it
$ yamltodb testdb test.yaml
# no output because the table already has been renamed

The issue is when a column is renamed, e.g.,

$ psql testdb
testdb=> ALTER TABLE t1 RENAME c2 TO c3;
$ yamltodb testdb test.yaml
ALTER TABLE t1 DROP COLUMN c3;

I agree what is necessary is an ALTER TABLE t1 ADD COLUMN c2 text, in addition to the command issued above. Later we can talk about behavior 2, i.e., some exception or warning based on a user-specified option.

@melezhik
Copy link
Author

OK, I guess I misunderstood the intent of your example. You're running everything against the same database >and the ALTER TABLE RENAME COLUMN is a statement that you'd want to back out. Is that correct? In other >words, you want to reinstate the previous schema and undo the RENAME.

yes, correct!

  1. yamltodb ought to DROP the renamed column and ADD back the old column (this should be the default behavior)
  2. yamltodb, given some suitably named command-line option, will flag the missing column rather than DROP/ADD
  3. Add oldname: to the YAML in order to get a RENAME (already available)

yes, this would be good if these 3 cases would be available through usage of Pyrseas

jmafc added a commit that referenced this issue Oct 18, 2011
 * pyrseas/dbobject/column.py (ColumnDict.diff_map): Don't attempt to
   drop columns that have already been dropped in the catalogs (caused
   innocuous semicolons on output).
 * pyrseas/dbobject/table.py (Table.diff_map): Examine existing
   columns and generate ADD COLUMN statements if not present in the
   input map.
@jmafc
Copy link
Member

jmafc commented Oct 19, 2011

@melezhik I believe change 1a575b5 fixes your specific issue.

@melezhik
Copy link
Author

Hi, @jmafc, how can I see the changes? (Via git clone ... and install?)

@jmafc
Copy link
Member

jmafc commented Oct 21, 2011

Hi Alexey,

If you just want to see the changes, simply click on the 1a575b5 link. If you want to install the changes, you could do a git clone and then sudo python setup.py install (assuming you're on Linux), as described here. If you have privileges (and are adventurous), you could apply the change directly to pyrseas/dbobject/table.py. I'm also planning to do a release v0.4.1 soon, which will include this.

@melezhik
Copy link
Author

ok, after upgrading code still have the same behavior:

  1. psql --user=test test

         Table "public.test"
     Column |  Type   | Modifiers
    --------+---------+-----------
     d      | integer |
    
  2. cat test0.yml

schema public:
  description: standard public schema
  table test:
    columns:
  1. yamltodb -H melezhik.x --user test test test0.yml
    ALTER TABLE test DROP COLUMN d;

@jmafc
Copy link
Member

jmafc commented Oct 24, 2011

@melezhik hmm, I could've sworn ... Sorry, I guess I'll have to follow my own test-driven development guidelines and write a unit test case first, and then (re)write the fix.

jmafc added a commit that referenced this issue Oct 24, 2011
 * pyrseas/dbobject/table.py (Table.diff_map): Calculate number of
   columns present excluding dropped columns.  For columns out of
   order, add them if not in list of existing columns.
 * tests/dbobject/__init__.py: Invoke column tests.
 * tests/dbobject/test_column.py: New tests for dropping and adding
   columns, plus move other colum tests from test_table.py.
 * tests/dbobject/test_table.py: Move columns tests to new module.
@jmafc
Copy link
Member

jmafc commented Oct 24, 2011

@melezhik I'm pretty sure change 7fc6ace fixes your issue. I've separated column tests and added several variations.

@melezhik
Copy link
Author

OK, I checked it out, now it works as proposed, thank you! What about:

yamltodb, given some suitably named command-line option, will flag the missing column rather than DROP/ADD ?

-- the command line flag will be included in v0.4.1?

@jmafc
Copy link
Member

jmafc commented Oct 25, 2011

No, the command line option is a future enhancement. I'm still not sure if the option should be specific to columns, e.g., --fail-missing-columns, or more general --fail-missing-objects, i.e., to raise an exception instead of emitting DROP statements for missing tables, indexes, etc. The main use case for dbtoyaml/yamltodb was development, where dropping, renaming and other changes are commonplace.

@melezhik
Copy link
Author

Ok, I understand

@jmafc
Copy link
Member

jmafc commented May 30, 2012

I'm starting to implement this. I've settled on a -m/--fail-missing option to yamltodb. Just to make sure we're in the same wavelength, let's say we have:

schema public:
  description: standard public schema
  table t1:
    columns:
    - c1:
        type: integer
    - c2:
        type: text
    - c3:
        type: date

And someone has done ALTER TABLE t1 DROP c2, then yamltodb -m with the above YAML, you want to see something like KeyError: Column c2 missing from table public.t1(not a traceback, but a user-friendly error message) instead of ALTER TABLE t1 ADD COLUMN c2 text.

Is that correct? Is there some other case I should consider?

@melezhik
Copy link
Author

Is that correct? Is there some other case I should consider?

yeah, sounds good, I guess it'd better to say something: Error: current database schema missing object: table public.t1 column c2, rerun without -m flag to add it, but it's up to you, at least I am not a native english speaker ((:

@jmafc
Copy link
Member

jmafc commented May 31, 2012

Well, I'm not native English either (but I was exposed to it very early). Anyhow, the issue is we only detect that something is missing two layers deep (in pyrseas/dbobject/table.py ClassDict.diff_map, for example), so we can raise an error there, with specific schema/table/column info. The intermediate layer (pyrseas/database.py) doesn't need to take any special action. The top layer (yamltodb) can react to the error and perhaps add the "rerun without -m".

Do you also want to get an error for the converse situation, i.e., if table t1 is supposed to have two columns (say c1 and c3) and yamltodb finds three (c1, c2 and c3) and would have to issue ALTER TABLE t1 DROP c2?

I'm still not sure why you want this feature. By default, yamltodb does not update the database: it only shows the SQL needed to change it to match the input YAML spec. You have to use the -u/--update option (or pipe the SQL through to psql) to actually change the database.

If you want to control that a database hasn't changed from a master YAML spec, with the current Pyrseas tools, you could do two things:

  1. Run dbtoyaml dbname -o current.yaml and then diff master.yaml current.yaml will show you (roughly) what's different. You can then run yamltodb dbname master.yaml to see what needs to be changed.
  2. Run yamltodb dbname master.yaml -o changes.sql and if changes.sql is not empty, then you have differences that you'll have to investigate, or you can run either psql dbname -f changes.sql or yamltodb --update dbname master.yamlto apply the changes.

If we add the -m flag, the difference is that in option 2, you would run yamltodb --fail-missing dbname master.yaml and if something is different you'd get an error and then you'd investigate and then you'd continue with the steps in option 2. In other words, the --fail-missing option seems to be adding an extra step, unnecessarily.

@melezhik
Copy link
Author

I understand you, it's hard to show use cases to prove necessity of this flag, but roughly speaking it's sometimes quite usefull to get know, that database schema is changed towards given yaml schema. Also yamltodb command exit status <> 0 will be fine in automated deploy scripts ... Anyway, diff approach is also okay.

@jmafc
Copy link
Member

jmafc commented May 31, 2012

What would be very easy to add is to exit yamltodb with, say, exit status 2, if we wrote SQL statements, or status 0 if there were no schema diffs, i.e., no statements. If -u is used and the SQL fails to apply, a Python error is raised and we already exit with status 1.

@jmafc
Copy link
Member

jmafc commented May 31, 2012

I've committed change 38772f2 which adds exits status 2 to indicate SQL statements were generated. I've also updated the yamltodb to highlight the exit statutes and the possible use of status 0 vs. 2. Please let me know if this satisfactory for your purposes.

@melezhik
Copy link
Author

melezhik commented Jun 1, 2012

that's right, the only question why status 2 to indicate SQL statements were generated? As I know status <> 0 if for some failure cases

@jmafc
Copy link
Member

jmafc commented Jun 1, 2012

Maybe I misunderstood your earlier comment. You wrote "Also yamltodb command exit status <> 0 will be fine in automated deploy scripts". I thought you wanted an exit status <> 0 to indicate a diff was present. Yamltodb is somewhat like the Unix diff command, except instead of diff'ing two files, you diff a representation of the database (the YAML spec) to another (the current catalogs). When diff doesn't find a difference between two files, its exit status is 0. When it does find a diff, the status is 1. When some other error occurs, e.g., file(s) not found, the status is 2. I was actually going to implement yamltodb statuses that way (and I can still do that), but it was easier (oh laziness :-) to exit with 2 for a diff.

@melezhik
Copy link
Author

melezhik commented Jun 4, 2012

Let's I try some use case. Say, we want to apply schema to the current database, also we do it in automated process with chef recipe, if everything okay, the exit status of yamltodb command will be 0, if something goes wrong than it should return <> 0 exit status. The reasons why something goes wrong may be different, one of them is we miss some columns in database to be applied, or whantever, just to think about it in the way of automatic deploy ...

@jmafc
Copy link
Member

jmafc commented Jun 4, 2012

I've posted in the pyrseas-general ML: http://lists.pgfoundry.org/pipermail/pyrseas-general/2012-June/000053.html to see what others think. Feel free to subscribe and join the discussions.

@melezhik
Copy link
Author

melezhik commented Jun 5, 2012

Ok. I give up here ((: I understand diffish approach, but if to think in way of automatic deployment it's not that fine, maybe it means that yamltodb is not for automatic deployment, we need some wrapper around it, anyway, exit status is not that critical if one run yamltodb manually.

@jmafc
Copy link
Member

jmafc commented Jun 5, 2012

Please don't give up. I sort of convinced myself that "diffish approach", as you called it, would be OK for a tool named yamldbdiff, but not exactly for yamltodb which is supposed to generate SQL statements and even apply them to the database. yamldbdiff could be created as a simple wrapper around yamltodb if desired. So, I'll back out my exit status changes. The question then remains if there is anything else that you'd prefer to see done with this issue or whether I can close it.

@jmafc
Copy link
Member

jmafc commented Jun 5, 2012

Change f96448e backs out the "exit with 2" change.

@jmafc jmafc closed this as completed in 5fadc91 Jun 26, 2012
@jmafc
Copy link
Member

jmafc commented May 14, 2013

Got thinking about the general issue of renaming objects in the database and how to capture info about such an action. Ideally, PG would be modified so that when someone issues ALTER TABLE x RENAME TO y, PG would save 'x', 'y' and the oid of the table into some catalog that would track such history, then dbtoyaml would query the catalog to supply the oldname attribute for the renamed table. However, it is unlikely that PGDG would implement such functionality.

An alternative would be for Pyrseas to provide a utility or extension through which ALTER RENAME commands could be issued and at the same time the info would be captured in a pyrseas schema catalog, e.g., pg_history (similar to pg_description). Comprehensive coverage of all PG RENAME-able objects would be quite a task. A simpler interim solution would be a utility to insert the info into the pg_history catalog independent of the SQL ALTER command.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants