Update readme for setting PGPASSWORD when unable to include password in UTA_DB_URL #635

Merged
merged 5 commits into biocommons:main
May 21, 2022

korikuzma
Contributor

We would like to be able to connect to our UTA DB that is hosted on AWS via a generated auth token. We ran into issues trying to do this when trying to use vrs-python's translate_to method in variation normalizer as seen in this issue.

@korikuzma
Contributor Author

Hi @reece , GitHub is not allowing me to add reviewers to this PR. Would you be able to take a look at this? TIA

setup.py Outdated
@@ -53,6 +53,7 @@
"parsley",
"psycopg2",
"six",
"boto3"
Contributor

maybe this should be an optional dependency?

Contributor Author

@korikuzma korikuzma May 9, 2022

@bmrobin you're right! I just pushed this change in my latest commit. I don't really like adding the import inside the conditional, but didn't know where else to put the new module that contains this code (didn't know if the extras or utils directory would be the best place for it). Let me know if you have any other suggestions!

@reece
Member

reece commented May 10, 2022

@korikuzma @bmrobin : I think it would be more conventional and cleaner to convey the password in an environment variable. PGPASSWORD (originally written here as PG_PASSWORD) would likely work. And, if it doesn't, we can introduce a new variable like UTA_PASSWORD.

Alternatively, UTA_DB_URL=postgresql://<user>:<password>@localhost:5432/uta/uta_20180821 is already available and should work out of the box. (See https://github.com/biocommons/hgvs/#configuration.)

Either way, my preference is to keep environment configuration fully the caller's responsibility.

Am I missing something?

@korikuzma
Contributor Author

Hi @reece , thank you for the suggestions to help clean up this PR! We are unable to set the UTA_DB_URL with the generated db auth token since it is parsed incorrectly. I went with the route of using PG_PASSWORD if the password is empty in the UTA_DB_URL. Let me know if there's anything else you'd like me to change.

@reece
Member

reece commented May 17, 2022

Hi @korikuzma .

I slightly misled you: I should have written PGPASSWORD (no underscore).

With that change, no code change is required at all. If you don't provide a password, libpq uses PGPASSWORD and then ~/.pgpass to search for a password.

In other words, I think this entire PR can be whittled down to a README comment about setting UTA_DB_URL and/or PGPASSWORD. You can use them together. In your case, I think something like this should work:

$ export UTA_DB_URL=postgresql://genomemeduser@host:5432/uta/uta_20190921 PGPASSWORD="<password>" 

If you agree that this works, please remove the code change.

Sorry for the underscore. :-/

@korikuzma
Contributor Author

Hi @reece , I tried your suggestion but I'm still getting psycopg2.OperationalError: connection to server failed: fe_sendauth: no password supplied.

@reece
Member

reece commented May 17, 2022

That's surprising. Here's a session that I think demonstrates that this works.

# 1) no passwords in ~/.pgpass that might confound testing
snafu$ ls ~/.pgpass
ls: cannot access '/home/reece/.pgpass': No such file or directory

# 2) Demonstrate that providing UTA_DB_URL works
# N.B. providing variables on the command line this way is exactly the same as exporting them *just for that command line*
snafu$ UTA_DB_URL=postgresql://anonymous:anonymous@uta.biocommons.org:5432/uta/uta_20210129  hgvs-shell
⋮
data provider url: postgresql://anonymous:anonymous@uta.biocommons.org:5432/uta/uta_20210129

# 3) Remove the password from UTA_DB_URL and show that it fails
snafu$ UTA_DB_URL=postgresql://anonymous@uta.biocommons.org:5432/uta/uta_20210129  hgvs-shell
Traceback (most recent call last):
⋮
psycopg2.OperationalError: connection to server at "uta.biocommons.org" (100.25.110.72), port 5432 failed: fe_sendauth: no password supplied

# 4) Supply the password via PGPASSWORD for the win
snafu$ PGPASSWORD=anonymous UTA_DB_URL=postgresql://anonymous@uta.biocommons.org:5432/uta/uta_20210129  hgvs-shell
⋮
data provider url: postgresql://anonymous@uta.biocommons.org:5432/uta/uta_20210129

If PGPASSWORD isn't working, it's either misspelled or it's being stripped from your environment (e.g., by docker or sudo, both of which provide mechanisms for conveying selected environment variables).

Does that help?
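
The two failure modes named above (a misspelled variable, or one stripped from the environment) can be checked from inside the Python process that opens the connection. The following is only a sketch: `check_pgpassword` is a hypothetical helper, not part of hgvs or psycopg2, and it inspects nothing but the environment mapping it is given.

```python
import os
from typing import Mapping, Optional


def check_pgpassword(env: Optional[Mapping[str, str]] = None) -> str:
    """Report whether PGPASSWORD is visible, misspelled, or missing.

    Hypothetical diagnostic helper; pass a mapping to test it, or omit
    the argument to inspect the current process environment.
    """
    env = os.environ if env is None else env
    if env.get("PGPASSWORD"):
        return "ok: PGPASSWORD is set"
    # Look for near-misses such as PG_PASSWORD or pgpassword.
    lookalikes = sorted(
        name
        for name in env
        if name != "PGPASSWORD" and name.upper().replace("_", "") == "PGPASSWORD"
    )
    if lookalikes:
        return "misspelled? found " + ", ".join(lookalikes)
    return "missing: PGPASSWORD is not in this environment"


print(check_pgpassword({"PG_PASSWORD": "anonymous"}))
# prints: misspelled? found PG_PASSWORD
```

Running `check_pgpassword()` with no argument inside a container or under sudo would show whether the variable survived into that process.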

@korikuzma
Contributor Author

Hi @reece ,

I double checked that PGPASSWORD is spelled correctly and can confirm that I'm still getting the no password supplied error. I added lines right before this (note that I removed my code changes) to check the values of UTA_DB_URL and PGPASSWORD and can confirm that they do exist and look correct. We have set these environment variables in our package that is used to connect to and query the RDS instance.

Are we unable to keep this line? (Changing PG_PASSWORD to PGPASSWORD)

 password=self.url.password if self.url.password else os.environ.get("PGPASSWORD"),

@reece
Member

reece commented May 19, 2022

Hi @korikuzma - I'd like to get to the bottom of this. The postgresql authentication mechanism is extremely well-designed. We should be using it, and adding a different mechanism is likely to create confusion.

Please try this:

$ PGPASSWORD=anonymous psql -d uta -h uta.biocommons.org -U anonymous -Xtc 'select version()'
 PostgreSQL 12.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

Then try against your own RDS instance. I am certain that this mechanism should work; if it doesn't, there's an underlying issue that should be fixed directly rather than by workaround in hgvs.

@korikuzma
Contributor Author

Hi @reece ,

So I am able to connect to our RDS instance using these commands:

$ export PGPASSWORD="$(aws rds generate-db-auth-token --hostname HOST --port PORT --region REGION --username USERNAME)"
$ psql -d $DBNAME -h $HOST -U $USERNAME -Xtc 'select version()'
 PostgreSQL 12.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit
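
The same arrangement can be set up from Python before hgvs connects: put the generated token in PGPASSWORD and leave the password out of UTA_DB_URL entirely (no colon after the user). This is a minimal sketch under stated assumptions: `export_uta_credentials` and `fetch_token` are hypothetical names, and `fetch_token` stands in for whatever produces the token, for example a wrapper around the `aws rds generate-db-auth-token` command above.

```python
import os
from typing import Callable, MutableMapping, Optional


def export_uta_credentials(
    user: str,
    host: str,
    port: int,
    db_path: str,
    fetch_token: Callable[[], str],
    env: Optional[MutableMapping[str, str]] = None,
) -> str:
    """Set PGPASSWORD and a password-free UTA_DB_URL in `env`.

    Hypothetical helper. `fetch_token` is any zero-argument callable that
    returns the password or auth token; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    # No ":" after the user, so the URL carries no password at all.
    url = f"postgresql://{user}@{host}:{port}/{db_path}"
    env["PGPASSWORD"] = fetch_token()
    env["UTA_DB_URL"] = url
    return url


# Demo against a plain dict so the real environment is left untouched.
demo_env = {}
print(export_uta_credentials(
    "anonymous", "uta.biocommons.org", 5432, "uta/uta_20210129",
    fetch_token=lambda: "anonymous", env=demo_env,
))
```

Because the token never enters the URL, any characters it contains do not need to survive URL parsing.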

@reece
Member

reece commented May 20, 2022

Thanks @korikuzma.

Okay, please try this in an environment with ipython and hgvs installed, and no ~/.pgpass:

snafu$ ipython
Python 3.10.4 (main, Apr  2 2022, 09:04:19) [GCC 11.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.3.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from hgvs.dataproviders.uta import connect

In [2]: import os

In [3]: del os.environ["PGPASSWORD"]   # just to make sure it's unset
...
KeyError: 'PGPASSWORD'

# No password provided anywhere
In [4]: hdp = connect(db_url="postgresql://anonymous@uta.biocommons.org:5432/uta/uta_20180821")
...
OperationalError: connection to server at "uta.biocommons.org" (100.25.110.72), port 5432 failed: fe_sendauth: no password supplied

In [5]: os.environ["PGPASSWORD"] = "anonymous"

# success with same URL when PGPASSWORD is set
# libpq detects that no password is provided and uses this variable
In [6]: hdp = connect(db_url="postgresql://anonymous@uta.biocommons.org:5432/uta/uta_20180821")

# add a colon after the username, which parses password as an empty string (!= None)
# I would expect this to fail because we're explicitly passing a password
In [7]: hdp = connect(db_url="postgresql://anonymous:@uta.biocommons.org:5432/uta/uta_20180821")
...
OperationalError: connection to server at "uta.biocommons.org" (100.25.110.72), port 5432 failed: fe_sendauth: no password supplied

In my opinion, this kinda proves that PGPASSWORD should be working. If it doesn't, there's something happening in your environment outside of hgvs.
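
The difference between steps 6 and 7 comes down to how the URL parses. The sketch below uses the standard library's `urllib.parse` to show it, on the assumption that hgvs's URL handling behaves the same way for the password field; `describe_password` is a hypothetical helper written for this illustration.

```python
from urllib.parse import urlparse


def describe_password(db_url: str) -> str:
    """Classify the password component of a database URL."""
    password = urlparse(db_url).password
    if password is None:
        return "absent"   # no colon after the username
    if password == "":
        return "empty"    # colon present, nothing after it
    return "present"


for url in (
    "postgresql://anonymous@uta.biocommons.org:5432/uta/uta_20180821",
    "postgresql://anonymous:@uta.biocommons.org:5432/uta/uta_20180821",
    "postgresql://anonymous:anonymous@uta.biocommons.org:5432/uta/uta_20180821",
):
    print(describe_password(url), url)
```

Only the "absent" form leaves room for PGPASSWORD in the session above; the "empty" form is passed along as an explicit, blank password.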

@korikuzma
Contributor Author

@reece oh my gosh. I'm so sorry. I was looking at your UTA_DB_URL and realized that there is no colon after the user (I was adding this). 🤦‍♀️ I'll remove my changes and just update the readme.

@korikuzma korikuzma changed the title Allow UTA db to connect via generated aws auth token Update readme for setting PGPASSWORD when unable to include password in UTA_DB_URL May 21, 2022
@reece
Member

reece commented May 21, 2022

No problem. I like puzzles so this was actually kinda fun for me!

@reece reece merged commit 9386562 into biocommons:main May 21, 2022

4 participants