Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align structure.sql with generated output #4298

Open
gravitystorm opened this issue Oct 18, 2023 · 4 comments
Open

Align structure.sql with generated output #4298

gravitystorm opened this issue Oct 18, 2023 · 4 comments
Labels
dx Developer Experience

Comments

@gravitystorm
Copy link
Collaborator

There are a few small differences between our structure.sql file, and the output generated from rails schema dumping on my system. I'm not sure whether this is because of different rails versions or different postgresql versions used between the time that these bits of the file were generated, and today. But it's a bit annoying when trying to develop migrations, so if it's possible for us to resolve them it'll make our developers lives a little bit easier.

On my own system, I get four main differences:

  • The AS integer phrase for sequences isn't included on my system, but we have it in our structure.sql file
  • The ar_internal_metadata table has a slightly different definition, i.e.
    -    created_at timestamp(6) without time zone NOT NULL,
    -    updated_at timestamp(6) without time zone NOT NULL
    +    created_at timestamp without time zone NOT NULL,
    +    updated_at timestamp without time zone NOT NULL
  • There is a comment added on the btree_gist extension
  • The schema_migrations table has a wildly different order, but I'll ignore that for now

I'm interested to know, before I make a pull request, whether my experience is typical. You can test yourself by running bundle exec rails db:schema:dump followed by git diff to see what the differences are on your own system between the schema dump and what we have in structure.sql.

@gravitystorm gravitystorm added the dx Developer Experience label Oct 18, 2023
@tomhughes
Copy link
Member

I have loads of differences for things like that (which I think is probably down to postgres version) or field ordering (which is mostly down to me having applied and reverted migrations at various times and in various orders) and I generally figure that trying to "fix" them is as likely to break them for other people as it is to fix them for me... My current diff is:

diff --git a/db/structure.sql b/db/structure.sql
index 939799c0a..78eef205f 100644
--- a/db/structure.sql
+++ b/db/structure.sql
@@ -9,6 +9,13 @@ SET xmloption = content;
 SET client_min_messages = warning;
 SET row_security = off;
 
+--
+-- Name: public; Type: SCHEMA; Schema: -; Owner: -
+--
+
+-- *not* creating schema, since initdb creates it
+
+
 --
 -- Name: btree_gist; Type: EXTENSION; Schema: -; Owner: -
 --
@@ -16,6 +23,13 @@ SET row_security = off;
 CREATE EXTENSION IF NOT EXISTS btree_gist WITH SCHEMA public;
 
 
+--
+-- Name: EXTENSION btree_gist; Type: COMMENT; Schema: -; Owner: -
+--
+
+COMMENT ON EXTENSION btree_gist IS 'support for indexing common datatypes in GiST';
+
+
 --
 -- Name: format_enum; Type: TYPE; Schema: public; Owner: -
 --
@@ -107,6 +121,39 @@ CREATE TYPE public.user_status_enum AS ENUM (
     'deleted'
 );
 
+
+--
+-- Name: tile_for_point(integer, integer); Type: FUNCTION; Schema: public; Owner: -
+--
+
+CREATE FUNCTION public.tile_for_point(scaled_lat integer, scaled_lon integer) RETURNS bigint
+    LANGUAGE plpgsql IMMUTABLE
+    AS $$
+DECLARE
+  x int8; -- quantized x from lon,
+  y int8; -- quantized y from lat,
+BEGIN
+  x := round(((scaled_lon / 10000000.0) + 180.0) * 65535.0 / 360.0);
+  y := round(((scaled_lat / 10000000.0) +  90.0) * 65535.0 / 180.0);
+
+  -- these bit-masks are special numbers used in the bit interleaving algorithm.
+  -- see https://graphics.stanford.edu/~seander/bithacks.html#InterleaveBMN
+  -- for the original algorithm and more details.
+  x := (x | (x << 8)) &   16711935; -- 0x00FF00FF
+  x := (x | (x << 4)) &  252645135; -- 0x0F0F0F0F
+  x := (x | (x << 2)) &  858993459; -- 0x33333333
+  x := (x | (x << 1)) & 1431655765; -- 0x55555555
+
+  y := (y | (y << 8)) &   16711935; -- 0x00FF00FF
+  y := (y | (y << 4)) &  252645135; -- 0x0F0F0F0F
+  y := (y | (y << 2)) &  858993459; -- 0x33333333
+  y := (y | (y << 1)) & 1431655765; -- 0x55555555
+
+  RETURN (x << 1) | y;
+END;
+$$;
+
+
 SET default_tablespace = '';
 
 SET default_table_access_method = heap;
@@ -250,8 +297,8 @@ ALTER SEQUENCE public.active_storage_variant_records_id_seq OWNED BY public.acti
 CREATE TABLE public.ar_internal_metadata (
     key character varying NOT NULL,
     value character varying,
-    created_at timestamp(6) without time zone NOT NULL,
-    updated_at timestamp(6) without time zone NOT NULL
+    created_at timestamp without time zone NOT NULL,
+    updated_at timestamp without time zone NOT NULL
 );
 
 
@@ -274,7 +321,6 @@ CREATE TABLE public.changeset_comments (
 --
 
 CREATE SEQUENCE public.changeset_comments_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -376,7 +422,6 @@ CREATE TABLE public.client_applications (
 --
 
 CREATE SEQUENCE public.client_applications_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -802,7 +847,6 @@ CREATE TABLE public.issue_comments (
 --
 
 CREATE SEQUENCE public.issue_comments_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -842,7 +886,6 @@ CREATE TABLE public.issues (
 --
 
 CREATE SEQUENCE public.issues_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -955,7 +998,6 @@ CREATE TABLE public.note_comments (
 --
 
 CREATE SEQUENCE public.note_comments_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -991,7 +1033,6 @@ CREATE TABLE public.notes (
 --
 
 CREATE SEQUENCE public.notes_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -1137,7 +1178,6 @@ CREATE TABLE public.oauth_nonces (
 --
 
 CREATE SEQUENCE public.oauth_nonces_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -1216,7 +1256,6 @@ CREATE TABLE public.oauth_tokens (
 --
 
 CREATE SEQUENCE public.oauth_tokens_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -1251,7 +1290,6 @@ CREATE TABLE public.redactions (
 --
 
 CREATE SEQUENCE public.redactions_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -1326,7 +1364,6 @@ CREATE TABLE public.reports (
 --
 
 CREATE SEQUENCE public.reports_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -1373,7 +1410,6 @@ CREATE TABLE public.user_blocks (
 --
 
 CREATE SEQUENCE public.user_blocks_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -1406,9 +1442,9 @@ CREATE TABLE public.user_preferences (
 CREATE TABLE public.user_roles (
     id integer NOT NULL,
     user_id bigint NOT NULL,
-    role public.user_role_enum NOT NULL,
     created_at timestamp without time zone,
     updated_at timestamp without time zone,
+    role public.user_role_enum NOT NULL,
     granter_id bigint NOT NULL
 );
 
@@ -1418,7 +1454,6 @@ CREATE TABLE public.user_roles (
 --
 
 CREATE SEQUENCE public.user_roles_id_seq
-    AS integer
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
@@ -3512,3 +3547,5 @@ INSERT INTO "schema_migrations" (version) VALUES
 ('7'),
 ('8'),
 ('9');
+
+

gravitystorm added a commit to gravitystorm/openstreetmap-website that referenced this issue Nov 29, 2023
gravitystorm added a commit to gravitystorm/openstreetmap-website that referenced this issue Nov 29, 2023
gravitystorm added a commit to gravitystorm/openstreetmap-website that referenced this issue Nov 29, 2023
@gravitystorm
Copy link
Collaborator Author

For the timestamp(6) on ar_internal_metadata, the structure.sql should continue to use timestamp(6).

This is because for rails 6.0+, timestamp(6) is the default for timestamp columns. (default vs override for old versions) All of our migrations are correctly labelled with which AR version to use, so tables before that are defined with timestamp, tables after that with timestamp(6).

The edge case is that ar_internal_metadata table is not controlled, so gets the defaults at the moment it was created, which varies from system to system. On my laptop that was before rails 6, so I have timestamp. Any developer who first ran the code after November 2019 will have timestamp(6).

I've updated my development machine, by running:

alter table ar_internal_metadata alter column created_at set data type timestamp(6) without time zone;
alter table ar_internal_metadata alter column updated_at set data type timestamp(6) without time zone;

I would encourage any other veteran developers to run the same update, to avoid noise when working on database migrations.

gravitystorm added a commit to gravitystorm/openstreetmap-website that referenced this issue Nov 29, 2023
@gravitystorm
Copy link
Collaborator Author

  • The schema_migrations table has a wildly different order, but I'll ignore that for now

I found that my schema_migrations values for legacy migrations (before we started using timestamps) were stored with leading zeros, which was messing up the ordering. So to fix this I ran:

update schema_migrations set version = TRIM(LEADING '0' FROM version);

and that sorted it out. I haven't managed to uncover why my local machine has those leading zeros, there's nothing obvious in the rails history as to when that might have changed. 🤷‍♂️

@gravitystorm
Copy link
Collaborator Author

The AS integer bit is super interesting and surprisingly complicated!

  • All sequences created on postgres 9 were bigint.
  • Then postgres/postgres@2ea5b06 happened.
  • Sequences created on postgres 10+ are now either integer or bigint, depending on the type of the primary key.

We have 9 tables with integer id columns, where if you created the table on postgres 9 (veteran dev, production) the sequence will have the bigint type, but other people (new devs, CI) will have the sequence with integer type, which is also what we have in structure.sql. These 9 sequences are:

  • changeset_comments_id_seq
  • client_applications_id_seq
  • issue_comments_id_seq
  • issues_id_seq
  • oauth_tokens_id_seq
  • redactions_id_seq
  • reports_id_seq
  • user_blocks_id_seq
  • user_roles_id_seq

There are 3 further tables that have an additional complication:

  • Primary keys created on rails < 5.1 were integer by default, rails >= 5.1 primary keys are bigint by default. (Override for old versions).
  • We have three migrations to convert integer primary keys to bigint for three specific tables. When these were run on postgres 9 (veteran dev, production), we ended up with bigint sequences, since all sequences were bigint to start with. However, when these migrations are run on postgres 10+ (new dev, CI) we end up with a mismatch between primary key and sequence, since updating the primary key to bigint does not also change the sequence to bigint.

These three tables are:

  • oauth_nonces (db/migrate/20201214144017_expand_nonce_id.rb:4: change_column :oauth_nonces, :id, :bigint)
  • notes (db/migrate/053_add_map_bug_tables.rb:16: change_column :map_bugs, :id, :bigint)
  • note_comments (db/migrate/054_refactor_map_bug_tables.rb:15: change_column :map_bug_comment, :id, :bigint)

I need to have a think about what the best thing to do is.

For the first set of 9, I'm tempted to just run "ALTER SEQUENCE foo_seq AS integer;" locally, since then my machine matches CI and new devs, and although production will still be bigint I can't see any harm in that. Alternatively, we could instead upgrade the affected tables to have bigint ids, for consistency if nothing else.

For the second set of 3, it's potentially confusing to new developers that the sequences are only integer despite the primary keys being bigint, and they might worry that it's also bug in production (without being able to know that production has postgres-9-era bigint sequences). It's also a gotcha for anyone else who deploys the code, since they will have a "for real" mismatch lurking in their production database. For these tables, it might be worth either adding migrations to alter the sequences to bigint (which will make no difference in production, but will fix everything else).

gravitystorm added a commit to gravitystorm/openstreetmap-website that referenced this issue Dec 6, 2023
gravitystorm added a commit to gravitystorm/openstreetmap-website that referenced this issue Dec 6, 2023
These primary keys were converted to bigints in migrations, but the
sequences were left unmentioned. If the original migrations are run on
postgresql 10.0+, then this leads to a mismatch in column types vs sequence
types. This migration fixes these mismatches.

If the original migrations were run on postgresql < 10, all sequences were
bigints anyway, and this migration is a no-op.

If the sequence is a bigint, then postgresql doesn't output that fact in the
statement dump.

Refs openstreetmap#4298
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dx Developer Experience
Projects
None yet
Development

No branches or pull requests

2 participants