Allow defining credentials from variable environment #49

Closed
HassenIO opened this issue Jul 8, 2019 · 6 comments
Labels
Issue: Feature Request New feature or improvement to existing feature

Comments

@HassenIO

HassenIO commented Jul 8, 2019

Description

It is not clear how best to deploy Kedro with credentials for production. A good practice in traditional applications is to supply sensitive credentials through environment variables.

In Kedro, it looks like it relies only on credentials.yml, located somewhere except in conf/local/ folder.

(PS: If I'm missing something with this, the problem is probably bad or missing documentation since I'm not able to find this information in the doc)

Context

Using environment variables would help standardize the deployment of Kedro like any other app, thus reducing the learning curve for developers.

Possible Implementation

A possible change would be to look for environment variables first, before reading the content of credentials.yml. The changes would mainly be located in the ConfigLoader class.
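The lookup order proposed above could be sketched roughly as follows. This is only an illustration of "environment first, file second", not Kedro's ConfigLoader: the function name `lookup_credential` and the `KEDRO_CRED_` prefix are hypothetical, and `file_credentials` stands in for the parsed content of credentials.yml.

```python
import os
from typing import Mapping, Optional

# Hypothetical convention: credential "db_password" is overridden by the
# environment variable KEDRO_CRED_DB_PASSWORD.
ENV_PREFIX = "KEDRO_CRED_"


def lookup_credential(
    name: str,
    file_credentials: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the credential from the environment if set, else from the file."""
    env = os.environ if environ is None else environ
    env_key = ENV_PREFIX + name.upper()
    if env_key in env:
        return env[env_key]
    # Fall back to the value parsed from credentials.yml (KeyError if absent).
    return file_credentials[name]


if __name__ == "__main__":
    from_file = {"db_password": "value-from-file"}
    fake_env = {"KEDRO_CRED_DB_PASSWORD": "value-from-env"}
    print(lookup_credential("db_password", from_file, fake_env))
    print(lookup_credential("db_password", from_file, {}))
```

Passing the environment in as an argument keeps the lookup easy to test without touching the real process environment.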

Possible Alternatives

I created a credentials.yml file in the config/base/ folder, because this folder is ignored by Git but is still packaged in the Kedro Docker image. This is still not good, because the credentials then end up in the Docker image repository: anyone can pull that image and get the prod credentials!


If this is an interesting feature (which I think it is), I would be happy to contribute by implementing it. I first want to open the discussion about the best implementation given Kedro's direction, before making any pull request.

Regards,

@HassenIO HassenIO added the Issue: Feature Request New feature or improvement to existing feature label Jul 8, 2019
@DmitriiDeriabinQB
Contributor

@htaidirt, thank you for your contribution. Please find some comments below.

It is not clear how one can best deploy Kedro with credentials for production.

Currently Kedro supports deploying credentials for production via configuration environments. The default project template contains two of them: conf/base, intended for non-sensitive, shareable configuration, and conf/local, where you can put sensitive credentials. conf/local is in .gitignore by default. Please find more information on how the Kedro configuration module works in this section of the documentation.

I created a credentials.yml file in the config/base/ folder, because this folder is ignored by Git but is still packaged in the Kedro Docker image. This is still not good, because the credentials then end up in the Docker image repository: anyone can pull that image and get the prod credentials!

conf/base is indeed copied into the Docker image; however, as the documentation suggests, it is not intended to store any credentials. You should instead store your credentials in conf/local/credentials.yml, which is in .dockerignore by default.

Using environment variables would help standardize the deployment of Kedro like any other app, thus reducing the learning curve for developers.

Currently you can manually construct or enrich your credentials dictionary in src/<package_name>/run.py with any data, including data coming from environment variables.
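A minimal sketch of what "enriching the credentials dictionary" from environment variables might look like. The helper `enrich_credentials`, the `ENV_MAPPING` table, and the `dev_s3` entry are hypothetical names used for illustration only. How the resulting dictionary is handed to Kedro inside run.py depends on the Kedro version and is not shown here.

```python
import copy
import os
from typing import Any, Dict, Mapping, Optional

# Hypothetical mapping: credentials entry -> field -> environment variable.
ENV_MAPPING = {
    "dev_s3": {
        "aws_access_key_id": "AWS_ACCESS_KEY_ID",
        "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
    },
}


def enrich_credentials(
    credentials: Mapping[str, Any],
    env_mapping: Mapping[str, Mapping[str, str]] = ENV_MAPPING,
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Return a copy of `credentials` with fields overridden from the environment."""
    env = os.environ if environ is None else environ
    enriched = copy.deepcopy(dict(credentials))  # leave the input untouched
    for entry, fields in env_mapping.items():
        for field, variable in fields.items():
            if variable in env:  # unset variables keep the existing value
                enriched.setdefault(entry, {})[field] = env[variable]
    return enriched
```

Because the secrets stay in memory, nothing sensitive needs to be written to conf/ or baked into an image.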

In the long term, we are considering adding a templating capability for Kedro configs, which may handle environment variables; however, the exact specification hasn't been finalised yet.

@921kiyo
Contributor

921kiyo commented Aug 12, 2019

We have updated our docs regarding credentials in 5f4325f. So I am closing this for now, but feel free to re-open it if this is still an issue :)

@921kiyo 921kiyo closed this as completed Aug 12, 2019
@HugoPerrier

Hello,
I am trying to implement the following solution.

Currently you can manually construct/enrich your credentials dictionary in src/<package_name>/run.py with any data, including one coming from the environment variables.

When I clone the project, the file conf/local/credentials.yml is not present,
I want it to be created from env variables in the following situations:

  • Running kedro run from command line
  • Running context = load_context(MY_PATH) from a notebook (not a kedro notebook)

For this I modified run.py as follows:

import os
from typing import Dict

import yaml

# KedroContext, Pipeline and create_pipelines are assumed to be imported
# as in the project template's run.py.


class ProjectContext(KedroContext):
    project_name = "my_project"
    project_version = "0.16.1"
    package_name = "my_package"

    def __init__(self, project_path, **kwargs):
        super().__init__(project_path, **kwargs)
        self._set_credentials()

    def _set_credentials(self):
        # Write conf/local/credentials.yml from environment variables.
        kedro_project_dir = self.project_path
        credential_file = os.path.join(kedro_project_dir, "conf", "local", "credentials.yml")

        credentials = {
            # os.getenv returns None (dumped as null) if MY_ENV_VAR is unset.
            "my_credential": os.getenv("MY_ENV_VAR"),
        }

        with open(credential_file, "w") as file:
            yaml.dump(credentials, file)

    def _get_pipelines(self) -> Dict[str, Pipeline]:
        return create_pipelines()

This does the job for now, but is it future-proof?
(I guess the __init__ arguments of the superclass KedroContext might change in the future.)

Is there a better way to do this?

@DmitriiDeriabinQB
Contributor

DmitriiDeriabinQB commented Jun 3, 2020

@HugoPerrier Can you please open a new issue and cross-link to this thread in there? Otherwise we won't get much visibility on the updates in closed issues.

Regarding your question: There might be a better way of handling your credentials using TemplatedConfigLoader. You can create conf/local/credentials.yml manually and add templated credentials in there (similar to catalog.yml in the example above). That way the file won't contain any static secrets and can be committed to the repository. The actual secret values will come from the corresponding environment variables, which will be resolved by TemplatedConfigLoader at runtime.
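To illustrate the idea of templated credentials, here is a small self-contained sketch of `${NAME}` substitution. It is not Kedro's TemplatedConfigLoader, only a stand-in showing how a committed file can hold placeholders while the real values come from the environment at runtime. The function name `resolve_placeholders` and the `dev_s3` entry are hypothetical. As far as I know, the real loader takes its substitution values from a globals file or a `globals_dict` argument, so the environment would have to be passed in explicitly there as well.

```python
import os
import re
from typing import Any, Mapping

# Matches placeholders such as ${AWS_ACCESS_KEY_ID}.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def resolve_placeholders(config: Any, values: Mapping[str, str]) -> Any:
    """Recursively replace ${NAME} in strings with values[NAME].

    Raises KeyError if a placeholder has no corresponding value.
    """
    if isinstance(config, dict):
        return {key: resolve_placeholders(val, values) for key, val in config.items()}
    if isinstance(config, list):
        return [resolve_placeholders(item, values) for item in config]
    if isinstance(config, str):
        return PLACEHOLDER.sub(lambda match: values[match.group(1)], config)
    return config  # numbers, booleans, None are returned unchanged


# What a committed conf/local/credentials.yml might parse to (no secrets in it).
TEMPLATED_CREDENTIALS = {
    "dev_s3": {
        "aws_access_key_id": "${AWS_ACCESS_KEY_ID}",
        "aws_secret_access_key": "${AWS_SECRET_ACCESS_KEY}",
    }
}

if __name__ == "__main__":
    # Requires both variables to be set in the process environment.
    print(resolve_placeholders(TEMPLATED_CREDENTIALS, os.environ))
```

Failing loudly on a missing variable is a deliberate choice in this sketch: a silently empty secret tends to surface later as a confusing authentication error.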

@HugoPerrier

I posted my solution in "How do I fill the credentials from environment variables" #403.

Thanks for the help

@noklam
Contributor

noklam commented May 25, 2021

I am not sure how I can use the credentials in a CI environment; I cannot just put the YAML file in the local/ folder.
