Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open pg:push/pull feedback #42

Closed
will opened this issue Sep 3, 2013 · 41 comments
Closed

Open pg:push/pull feedback #42

will opened this issue Sep 3, 2013 · 41 comments
Assignees

Comments

@will
Copy link
Contributor

will commented Sep 3, 2013

This issue is for collecting feedback on pg:push and pg:pull. These two commands will soon be promted from pg-extras into the Heroku Toolbelt proper. When this happens the existing db:push and db:pull commands will become deprecated, and later removed to a plugin. The documentation here is also serves as a draft for the eventual devcenter article.

background

For extraordinarily simple and tiny databases, the old db:* commands are alright, but they are unsuitable for any database that uses real postgres features—or for any database bigger than "tiny". With more and more people having trouble with those commands, we needed a modern replacement.

Unlike the db:* commands which read all your data into ruby and tries to massage data between sqlite and postgres if needed, the new pg:push/pull commands the same binary pg_dump and pg_restore commands that are standard for moving postgres data around. The downside is that this replacement does not work with sqlite. The upsides however are that it actually works, and is often very fast due to the native compression pg_dump does.

installing

If you are new to the pg-extras plugin, please see the installation instructionsin the readme. Please note that currently, pg-extras requires a Heroku Toolbelt version ≥2.40.0.

You must also have a working local postgres installation.

usage

pg:pull

pg:pull can be used to pull remote data from a Heroku Postgres database to a database on your local machine. The command looks like this:

$ heroku pg:pull DATABASE mylocaldb --app sushi

This command will create a new local database named "mylocaldb" and then pull data from database at DATABASE_URL from the app "sushi". In order to prevent accidental data overwrites and loss, the local database must not exist. You will be prompted to drop an already existing local database before proceeding.

Note: like all pg:* comamnds you can use the shorthand identifiers here, so to pull data from HEROKU_POSTGRESQL_RED on the app "sushi" you could do heroku pg:pull sushi::RED mylocaldb.

pg:push

Like pull but in reverse, pg:push will push data from a local database into a remote Heroku Postgres database. The command looks like this:

$ heroku pg:push mylocaldb DATABASE --app sushi

This command will take the local database "mylocaldb" and push it to the database at DATABASE_URL on the app "sushi". In order to prevent accidental data overwrites and loss, the remote database must be empty. You will be prompted to pg:reset an already a remote database that is not empty.

feedback

Please use this issue for any feedback on this upcoming feature or documentation.

@jfeust
Copy link

jfeust commented Sep 4, 2013

I assume the "-t TABLE_NAME" option won't be available to pull/push a single table since these commands only do full DB drop/creates.

One of my use cases is pulling down a table from production to dev, massaging the data, then pushing back to production. It's not a common use case, and probably not best practice, but I've found it useful a number of times. Also, I like pulling only certain tables down from PROD to DEV so, for example, i'm not pulling down a 200k+ user table to my local dev machine, but I can pull down an updated 1000 record feature/post/attribute table.

@hgmnz
Copy link
Contributor

hgmnz commented Sep 4, 2013

I assume the "-t TABLE_NAME" option won't be available to pull/push a single table since these commands only do full DB drop/creates.

We will eventually want to allow for that as well, perhaps even before promoting this to the toolbelt proper. So while we have thought about it, and punted on it for now, your feedback is considered a +1 on that feature. So thanks!

One of my use cases is pulling down a table from production to dev, massaging the data, then pushing back to production. It's not a common use case, and probably not best practice, but I've found it useful a number of times. Also, I like pulling only certain tables down from PROD to DEV so, for example, i'm not pulling down a 200k+ user table to my local dev machine, but I can pull down an updated 1000 record feature/post/attribute table.

In addition you could use dblink or, in postgres 9.3, the postgres foreign data wrapper, to SELECT and INSERT remotely directly. Just throwing out alternatives here.

@will
Copy link
Contributor Author

will commented Sep 4, 2013

I might add options like table name for pulling, but I'm somewhat uncomfortable adding partial pushing. I've had to answer so many support tickets over the years, that I'd rather pushing be completely safe.

That said, for your use case, have you considered using pg:psql and immediately starting a transaction? That way you can play with your data safely.

@hgmnz
Copy link
Contributor

hgmnz commented Sep 4, 2013

That said, for your use case, have you considered using pg:psql and immediately starting a transaction? That way you can play with your data safely.

a great point. It goes like this:

BEGIN
UPDATE your_table ....
-- inspect your data with SELECTs, counts and such
-- everything looks good?
COMMIT

(or ROLLBACK if you messed up)

@fedesoria
Copy link

Got the following error:

heroku pg:push local_db HEROKU_DB
pg_dump: server version: 9.2.4; pg_dump version: 9.1.5
pg_dump: aborting because of server version mismatch
pg_dump: *** aborted because of error
pg_restore: [archiver] input file is too short (read 0, expected 5)

@will
Copy link
Contributor Author

will commented Sep 4, 2013

Your local postgres instillation is 9.1 but you're trying to restore to a 9.2 database. You're going to have to install 9.2 locally.

Thanks for pasting the error. It would probably be good to catch that and give a hint of how to solve it.

The other—somewhat extreme—solution would be to build statically linked pg_dump and pg_restore for every platform, but I'm saving that as a last resort if this version mismatch is a widespread problem. I've had no troubles myself by having 9.2 locally, and with 9.2 or 9.1 remote.

@will
Copy link
Contributor Author

will commented Sep 4, 2013

Actually @fedesoria if you're doing a push, it should be dumping your local db. Your local server is 9.2 but your pg_dump binary is 9.1, there might be something weird with your $PATH.

What do you get for these commands?

~ ➤ pg_config --bindir --version
/usr/local/Cellar/postgresql/9.2.4/bin
PostgreSQL 9.2.4
~ ➤ pg_dump --version
pg_dump (PostgreSQL) 9.2.4
~ ➤ which pg_dump
/usr/local/bin/pg_dump
~ ➤ psql
Time: 9.579 ms
psql (9.3devel, server 9.2.4)
Type "help" for help.

will=# \q
~ ➤ 

@fedesoria
Copy link

⇒ pg_config --bindir --version
/usr/bin
PostgreSQL 9.1.5
⇒  pg_dump --version
pg_dump (PostgreSQL) 9.1.5
⇒ which pg_dump
/usr/bin/pg_dump
⇒ psql
psql (9.1.5, server 9.2.4)
WARNING: psql version 9.1, server version 9.2.
         Some psql features might not work.
Type "help" for help.

@fedesoria
Copy link

I'm using the latest Postgres.app

@cheukyu
Copy link

cheukyu commented Sep 5, 2013

So, this method requires a Heroku Toolbelt version ≥2.40.0, but I cannot update my Heroku Toolbelt.

$ heroku version
heroku-toolbelt/2.39.5 (x86_64-linux) ruby/1.9.3

$ heroku update
 !    To update this version of the Heroku client please use
 !    `apt-get install heroku-toolbelt'

$ sudo apt-get install heroku-toolbelt 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
heroku-toolbelt is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 20 not upgraded.

I use CrunchBang (Debian-based Linux).
Thank you

@neilmiddleton
Copy link

It would be real nice if pull could write to a pre-existing database, and support the traditional --confirm mechanism.

@harryglaser
Copy link

I got the following error:

 !    createdb: could not connect to database postgres: could not connect to server: No such file or directory
 !      Is the server running locally and accepting
 !      connections on Unix domain socket "/var/pgsql_socket/.s.PGSQL.5432"?
 !    
 !    Unable to create new local database. Ensure your local Postgres is working and try again.

My local Postgres server is running on localhost:5432. Not sure if there is a way to specify this?

@chrisvariety
Copy link

Seems to work great on both a super tiny db and a somewhat larger (400MB) db.

Much better than taps.

🌟 🌈

@fdr
Copy link

fdr commented Sep 6, 2013

@ipajoneskk do you have the repository installed? I use it and I have:

heroku-toolbelt/2.40.1 (x86_64-linux) ruby/1.9.3

@msakrejda
Copy link
Contributor

Yeah, same here:

maciek@gamera:~$ heroku --version
heroku-toolbelt/2.40.1 (x86_64-linux) ruby/1.9.3

The packages were stuck on 2.39.5 for a while, but the issue was fixed a few days ago.

@fdr
Copy link

fdr commented Sep 6, 2013

On Fri, Sep 6, 2013 at 3:55 PM, ipajoneskk notifications@github.com wrote:

@fdr https://github.com/fdr

I should have the repository installed. I have used "sudo apt-get install"
before and it was working.

Some distros (e.g. Ubuntu) like to disable third party repositories on
upgrades, so "should" may be a problem.

Also, run apt-get update; maybe you have old index files, although one
might think after the last couple of days they would have been refreshed at
least once.

@cheukyu
Copy link

cheukyu commented Sep 6, 2013

@fdr Thanks, it works now

$ heroku --version
heroku-toolbelt/2.40.1 (x86_64-linux) ruby/1.9.3

@catsby
Copy link
Contributor

catsby commented Sep 9, 2013

@harryglaser That's the default port for Postgres, and the default error message you get when you run createdb and Postgres isn't running. Are you sure it's running?

@harryglaser
Copy link

@catsby Yep, it is.

wit-3:secure-springs-5928 harry$ psql -h localhost -d PeriscopeSpine_development -c "select 1;"
 ?column? 
----------
        1
(1 row)

wit-3:secure-springs-5928 harry$ heroku pg:pull DATABASE_URL testdb -a secure-springs-5928
 !    createdb: could not connect to database postgres: could not connect to server: No such file or directory
 !      Is the server running locally and accepting
 !      connections on Unix domain socket "/var/pgsql_socket/.s.PGSQL.5432"?
 !    
 !    Unable to create new local database. Ensure your local Postgres is working and try again.
wit-3:secure-springs-5928 harry$ 

@fdr
Copy link

fdr commented Sep 9, 2013

Looks to me like a mismatch between the libpq and command line utilities in $PATH and the ones one is expecting to use (notably, this patch assumes the default connection settings will work).

One way to perhaps work around this is to use something like postgres://localhost:port/dbname explicitly rather than relying on libpq's default (the unix socket /var/pgsql_socket/.s.PGSQL.5432 in your case) but a better error message might be nice.

@harryglaser
Copy link

@fdr Personally I'd settle for an option to specify the connection settings.

@fdr
Copy link

fdr commented Sep 10, 2013

I see, it looks like the current implementation hard-codes use of the default database definition; maybe that is acceptable to change. Perhaps it can pull something similar to libpq's handling of matter. My last search suggests that it is libpq that handles whether the string is passed on one of three ways:

  • short form e.g. psql foo connecting to foo
  • connection string e.g. psql dbname=foo
  • URI e.g. psql postgres:///foo

Supporting these additional forms would result in no change for the current shorthand case but allow specification of other connection strings or URIs.

Annoyingly, if memory serves, pg_dump and pg_restore can't accept connection strings to just let libpq think about it, so instead it requires laborious handling of each option, so supporting each part of a connection string will have to happen blow by blow...

@fedesoria
Copy link

Hi,
Just wanted to share what was my fix.

  1. Deleted /usr/bin/pg_dump
  2. Ran sudo ln -s /Applications/Postgres.app/Contents/MacOS/bin/pg_dump /usr/bin/pg_dump

Thanks for the help guys.

@nathany
Copy link

nathany commented Sep 26, 2013

+1 for an option to use the last capture for pg:pull @neilmiddleton

just a shortcut for curl & pg_restore that doesn't hit the production DB for every developer that wants to do a pull

@will
Copy link
Contributor Author

will commented Sep 30, 2013

Thanks for the feedback everyone. pg:push and pull have been promoted to the toolbelt proper as of version 3.0.0 https://devcenter.heroku.com/changelog-items/333

@will will closed this as completed Sep 30, 2013
@denis
Copy link

denis commented Oct 5, 2013

When this happens the existing db:push and db:pull commands will become deprecated, and later removed to a plugin.

@will have you released this plugin already? Removing of db:push/db:pull has broken my workflow.

@will
Copy link
Contributor Author

will commented Oct 5, 2013

@denis Yes it's over at https://github.com/heroku/heroku-legacy-taps

@denis
Copy link

denis commented Oct 5, 2013

@will thanks for the link. Does it make sense to update changelog? I think it would be helpful for other users like me.

@GantMan
Copy link

GantMan commented Oct 14, 2013

I find the removal of db:push horrible. I'm fine with the moving to a plugin but deprovisioning the taps servers sounds like a huge regression for paying Heroku customers (like me).

I switch machines often and have grown fond of this feature. From cursory searches on the web, there are lots of people who use this feature. I imagine there's a lot of demand for it. Am I wrong?

@fdr
Copy link

fdr commented Oct 15, 2013

@GantMan Sorry, it's very likely not come back. Taps is not accurate. It is slow. It encourages an insidious practice of using a totally different database system (like SQLite) with a totally different type system, among other things, to model the production behavior. Consider all the tickets @will cross-referenced above. There's also a litany of support issues stemming from the same. Taps is popular (for lack of alternative, at the least) but not really okay.

It was not churned for no reason, even if the replacement is not superior to the old in every dimension for everyone.

If you would be so kind, it'd be nice to know why you cannot use the new toolchain.

@GantMan
Copy link

GantMan commented Oct 15, 2013

I appreciate your quick reply. I know my need doesn't apply to everyone, and I completely understand Heroku has always been clear on their stance against using SQLite in dev, and then PG in prod.

The reason you asked for:
I can no longer easily fix a Production bug on a dev machine that doesn't have PG.
(Which was the purpose of the feature for a lot of people, so I believe)

I understand making two different systems work is costly in dev. It's just that losing a powerful feature I've appreciated is hard.

@Arcolye
Copy link

Arcolye commented Oct 16, 2013

pg: commands assume your Heroku app is using a Postgres database. My Heroku app uses a ClearDB MySQL database, and I loved that the db: commands could deal with that. This change breaks so much good and turns it into bad.

@fdr
Copy link

fdr commented Oct 16, 2013

On Wed, Oct 16, 2013 at 2:29 AM, Stephen Hunter notifications@github.comwrote:

pg: commands assume your Heroku app is using a Postgres database. My
Heroku app uses a ClearDB MySQL database, and I loved that the db: commands
could deal with that. This change breaks so much good and turns it into bad.

That's a very interesting use case.

It's probably a bad idea for MySQL, too, though.

Taps is, again, slow and buggy, and nobody is going to great lengths to fix
it. Someone else is free to maintain a fork and make it thrive and maybe
then this idea can be revisited, but to date nobody at Heroku has wanted to
take up the sword to make Taps accurate and fast.

And so, here we are.

I am sorry you are among the disappointed.

@mbhnyc
Copy link

mbhnyc commented Oct 25, 2013

Hey guys - having what is probably a simple problem, but am unsure if this is an issue with pg:pull, or my own (rather vanilla) configuration. When I run:

heroku pg:pull ss-api-prod::RED ss_api_local

I keep running into createdb: could not connect to database template1: FATAL: password authentication failed

I'm using the same password (password) that rails uses to connect, successfully, etc.

Something dumb here i'm missing?

@hgmnz
Copy link
Contributor

hgmnz commented Oct 25, 2013

I keep running into createdb: could not connect to database template1: FATAL: password authentication failed

This means that psql can't connect to your local postgres instance with the information provided. How are you setting your local user/password? Try this instead:

PGUSER=some_user PGPASSWORD=some_password heroku pg:pull [...]

@mbhnyc
Copy link

mbhnyc commented Oct 25, 2013

Yes! that was the key - thanks, didn't realize it was pulling those env vars.

@fedesoria
Copy link

Hi guys,

Still getting some errors but they are being ignored, just a heads up:

pg_restore: creating EXTENSION plpgsql
pg_restore: creating COMMENT EXTENSION plpgsql
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2629; 0 0 COMMENT EXTENSION plpgsql 
pg_restore: [archiver (db)] could not execute query: ERROR:  must be owner of extension plpgsql
    Command was: COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


pg_restore: creating EXTENSION hstore
pg_restore: creating COMMENT EXTENSION hstore
pg_restore: [archiver (db)] Error from TOC entry 2630; 0 0 COMMENT EXTENSION hstore 
pg_restore: [archiver (db)] could not execute query: ERROR:  must be owner of extension hstore
    Command was: COMMENT ON EXTENSION hstore IS 'data type for storing sets of (key, value) pairs';

@brandonweiss
Copy link

Is the ability to exclude tables still planned? We have one table that is enormous (millions of rows) and it makes pulling down a copy of our db very time-consuming.

@wffurr
Copy link

wffurr commented Jun 26, 2014

pg:pull works like a charm on my Mac, but I can't get it to work on Ubuntu. I get a variety of errors about roles, databases not existing, etc. from pg_restore. I can't figure out how to get my local postgres to be configured the way pg:pull clearly wants it to be.

@fdr
Copy link

fdr commented Jun 26, 2014

pg:pull works like a charm on my Mac, but I can't get it to work on Ubuntu. I get a variety of errors about roles, databases not existing, etc. from pg_restore. I can't figure out how to get my local postgres to be configured the way pg:pull clearly wants it to be.

My guess is you cannot login by default on Ubuntu, e.g. just "psql" without credentials will log in. In a nutshell, in development it is helpful to create a Postgres user and database with the same name as your unix user to have easy superuser login, and I detail those steps below.

First, confirm this may be a problem:

$ psql
psql: FATAL:  role "fdr" does not exist

Here's what I do to ensure I can log in to a database by-default on a development machine:

$ sudo -u postgres createdb `whoami`
<no output>

# Makes a superuser account via "-s"
$ sudo -u postgres createuser -s `whoami`
<no output>

Then, try confirming that you can get a database prompt without any fuss:

$ psql
psql (9.3.4)
Type "help" for help.

fdr=# 

It also means you can use some of the command-line wrappers for CREATE DATABASE and such very easily:

$ createdb myapp
$ psql myapp
$ dropdb myapp

As for controlling the destination of pg:pull, one can use Postgres-standard environment variables like PGDATABASE (complete list).

@fdr fdr added duplicate and removed duplicate labels Jun 26, 2014
@heroku heroku locked and limited conversation to collaborators Jun 29, 2014
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests