Allow parameterizing via env vars #164

Closed
nsheff opened this issue Feb 22, 2024 · 2 comments

nsheff commented Feb 22, 2024

Here I have an example of some code that allows a database-backed Python object to be parameterized either via constructor, or via ENV vars:

https://github.com/refgenie/seqcolapi/blob/fa889f818e1858573cb261077fd737af905178af/seqcolapi/scconf.py#L79-L94

I have found this convenient, because to use it I just do something like:

source settings.env
python

Then, in Python:

db = RDBDict()

which is quite nice, and also cloud-friendly. I have found myself wanting to do this with pipestat. Would it be possible to make the constructor work like this so it can be configured via ENV vars?
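As a rough illustration of the pattern described above, here is a minimal sketch of a class that takes its settings from constructor arguments and falls back to ENV vars. The class name `EnvConfiguredStore` and the `EXAMPLE_DB_*` variable names are hypothetical, chosen for illustration; this is not the linked `RDBDict` code and not pipestat's API.

```python
import os


class EnvConfiguredStore:
    """Hypothetical stand-in for a database-backed object (not the real RDBDict)."""

    def __init__(self, db_name=None, db_host=None, db_port=None):
        # Explicit constructor arguments win; otherwise fall back to ENV vars.
        # The EXAMPLE_DB_* names are illustrative assumptions.
        self.db_name = db_name or os.environ.get("EXAMPLE_DB_NAME")
        self.db_host = db_host or os.environ.get("EXAMPLE_DB_HOST", "localhost")
        self.db_port = int(db_port or os.environ.get("EXAMPLE_DB_PORT", "5432"))
        if self.db_name is None:
            raise ValueError(
                "Set db_name or the EXAMPLE_DB_NAME environment variable"
            )
```

With the variables exported in the shell, `EnvConfiguredStore()` needs no arguments, while `EnvConfiguredStore(db_name="other")` still overrides the environment for a single instance.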

I know it's a little different since there's a pipestat config file, and there's nothing like that here.

I wonder if priority_get could help:

https://github.com/databio/yacman/blob/5db6323a6469347abfa16a590fc8b48d21b7b16f/yacman/yacman_future.py#L358-L389
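For readers who don't want to follow the link, the idea behind a priority lookup can be sketched as a standalone function. This is a simplified illustration, not yacman's implementation: the function name `priority_get_sketch` is hypothetical, and the precedence shown (explicit override, then config value, then ENV var, then default) is an assumption that should be checked against the linked source.

```python
import os


def priority_get_sketch(config, key, env_var=None, default=None, override=None):
    """Return the first available value: override, config, ENV var, default."""
    # 1. A value passed explicitly (e.g. a constructor argument) wins.
    if override is not None:
        return override
    # 2. Otherwise use the value from the config mapping, if present.
    if config.get(key) is not None:
        return config[key]
    # 3. Otherwise fall back to the named environment variable.
    if env_var is not None and os.environ.get(env_var) is not None:
        return os.environ[env_var]
    # 4. Finally, use the default (which may be None).
    return default
```

Under this assumed ordering, an empty config plus an exported variable resolves to the variable's value, so a constructor built on such a lookup can be driven entirely from the environment.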

nsheff added this to the v0.10.0 milestone Mar 12, 2024

@donaldcampbelljr

Is this actually already possible? It looks as though the constructor already uses priority_get, which can fall back to environment variables:

self.cfg[SCHEMA_PATH] = self.cfg[CONFIG_KEY].priority_get(
    "schema_path", env_var=ENV_VARS["schema"], override=schema_path
)
self.process_schema(schema_path)
self.cfg[RECORD_IDENTIFIER] = self.cfg[CONFIG_KEY].priority_get(
    "record_identifier", env_var=ENV_VARS["record_identifier"], override=record_identifier
)
self.cfg[PIPELINE_NAME] = (
    self.cfg[SCHEMA_KEY].pipeline_name
    if self.cfg[SCHEMA_KEY] is not None
    else pipeline_name
)
self.cfg[PROJECT_NAME] = self.cfg[CONFIG_KEY].priority_get(
    "project_name", env_var=ENV_VARS["project_name"], override=project_name
)
self.cfg[SAMPLE_NAME_ID_KEY] = self.cfg[CONFIG_KEY].priority_get(
    "record_identifier",
    env_var=ENV_VARS["sample_name"],
    override=record_identifier,
)

@donaldcampbelljr

I did confirm that this works fine. The only snag I ran into was sourcing my .env file. I found a package that assisted with this in Python.
https://github.com/theskumar/python-dotenv

Example code, assuming main.py and .env are in the same directory:

from pipestat import PipestatManager
from dotenv import load_dotenv

load_dotenv()
psm1 = PipestatManager()
psm1.report(values={"number_of_things": 3})
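
One likely cause of the sourcing snag: `source settings.env` only passes variables on to a child `python` process if they are exported (for example with `export KEY=VALUE` lines in the file). If adding a dependency is undesirable, a much-simplified stand-in for what python-dotenv does might look like the sketch below. The helper name `load_env_file` is hypothetical, and it handles only plain `KEY=VALUE` lines, with none of python-dotenv's interpolation or multiline support.

```python
import os


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a file into os.environ (simplified sketch)."""
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith("#"):
                continue
            # Tolerate shell-style "export KEY=VALUE" lines.
            if line.startswith("export "):
                line = line[len("export "):]
            key, sep, value = line.partition("=")
            if not sep:
                continue  # Not a KEY=VALUE pair; ignore it.
            # Keep variables that are already set, which is also
            # load_dotenv's default behavior (override=False).
            os.environ.setdefault(key.strip(), value.strip().strip("'\""))
```

Calling `load_env_file()` before constructing `PipestatManager()` would play the same role as `load_dotenv()` in the example above.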
