I'm just starting with DVC and there may be better ways to do what I initially came up with, but since I couldn't find anything in the documentation or forums, this is what I did. Context: a git repo (hosted by Bitbucket) with DVC (tracked files in an S3 bucket under a project-specific directory).
we already have .env files for staging and production
our code uses variables defined in the .env files to run, build, deploy, log, etc
we use Bitbucket pipelines, but most of the work is done by our bash scripts
because we're using AWS, we have $HOME/.aws/config and $HOME/.aws/credentials on our developer machines
these credentials are also in the .env files, but because they are for deployments they have names like DEPLOYMENT_AWS_ACCESS_KEY_ID; the plain AWS_ACCESS_KEY_ID is for run time on EC2
we have multiple developers: some work on machine learning models (PyTorch), others (myself included) do processing code and devops, and some do both machine learning and processing
Given the above, I have written a script for deployment that assumes that the .env file has been parsed and all of the definitions are available in the current script environment. It takes the "developer centric" DVC configuration for our remote storage and converts it to use the script environment variables. I didn't see anything that explained a better way to do this and came up with this workaround to provide the credentials through environment variables:
set +u
if [[ -n "${DEPLOYMENT_AWS_ACCESS_KEY_ID}" ]]; then
    export AWS_ACCESS_KEY_ID="${DEPLOYMENT_AWS_ACCESS_KEY_ID}"
else
    echo "Warning: DEPLOYMENT_AWS_ACCESS_KEY_ID is not defined. Using AWS_ACCESS_KEY_ID" >&2
fi
if [[ -n "${DEPLOYMENT_AWS_SECRET_ACCESS_KEY}" ]]; then
    export AWS_SECRET_ACCESS_KEY="${DEPLOYMENT_AWS_SECRET_ACCESS_KEY}"
else
    echo "Warning: DEPLOYMENT_AWS_SECRET_ACCESS_KEY is not defined. Using AWS_SECRET_ACCESS_KEY" >&2
fi
if [[ -n "${DEPLOYMENT_AWS_DEFAULT_REGION}" ]]; then
    export AWS_DEFAULT_REGION="${DEPLOYMENT_AWS_DEFAULT_REGION}"
else
    echo "Warning: DEPLOYMENT_AWS_DEFAULT_REGION is not defined. Using AWS_DEFAULT_REGION" >&2
fi
set -u

REMOTE_STORAGE_PROFILE=""
REMOTE_STORAGE_CREDENTIALPATH=""

# remove the local version of remote.storage.credentialpath and use
# the environment variables; this is likely only on a development machine
set +e
REMOTE_STORAGE_PROFILE="$(dvc config --project remote.storage.profile)"
REMOTE_STORAGE_CREDENTIALPATH="$(dvc config --local remote.storage.credentialpath)"
dvc config --project --unset remote.storage.profile
dvc config --local --unset remote.storage.credentialpath
echo "REMOTE_STORAGE_PROFILE = ${REMOTE_STORAGE_PROFILE}"
echo "REMOTE_STORAGE_CREDENTIALPATH = ${REMOTE_STORAGE_CREDENTIALPATH}"
set -e

dvc pull --verbose

if [[ -n "${REMOTE_STORAGE_PROFILE}" ]]; then
    # restore the value for remote.storage.profile if it was set before
    dvc config --project remote.storage.profile "${REMOTE_STORAGE_PROFILE}"
fi
if [[ -n "${REMOTE_STORAGE_CREDENTIALPATH}" ]]; then
    # restore the value for remote.storage.credentialpath if it was set before
    dvc config --local remote.storage.credentialpath "${REMOTE_STORAGE_CREDENTIALPATH}"
fi
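The three near-identical `if` blocks at the top of the script could probably be folded into one small helper. The following is only a sketch of that idea, not part of the original script: the function name `export_if_set` is hypothetical, and it relies on the `${VAR:-}` form so that `set -u` does not have to be switched off around the checks.

```shell
# Hypothetical helper: export the variable named in $1 with the value in $2
# when that value is non-empty; otherwise warn and leave $1 as it already is.
export_if_set() {
    if [ -n "$2" ]; then
        export "$1=$2"
    else
        echo "Warning: DEPLOYMENT_$1 is not defined. Using $1" >&2
    fi
}

# "${VAR:-}" expands to an empty string when VAR is unset,
# so these calls do not trip `set -u`.
export_if_set AWS_ACCESS_KEY_ID     "${DEPLOYMENT_AWS_ACCESS_KEY_ID:-}"
export_if_set AWS_SECRET_ACCESS_KEY "${DEPLOYMENT_AWS_SECRET_ACCESS_KEY:-}"
export_if_set AWS_DEFAULT_REGION    "${DEPLOYMENT_AWS_DEFAULT_REGION:-}"
```

The behaviour is meant to match the original blocks: a defined `DEPLOYMENT_*` value overrides the plain `AWS_*` one, and a missing value only produces a warning on stderr.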
If I could have used $HOME in my .dvc/config, I could have used --project configuration everywhere. As it is, each developer needs to run dvc config --local remote.storage.credentialpath "$HOME/.aws/credentials" in their working copy of the repository. I could also have created a $HOME/.aws/credentials file with the correct content in the Bitbucket environment.
Instead I aimed for the middle of the road: I started out defining the DVC remote.storage.url, remote.storage.profile, and remote.storage.credentialpath in a cross-developer way, but then had to remove remote.storage.profile and remote.storage.credentialpath from the DVC configuration when building on Bitbucket.
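For comparison, the other option mentioned above (creating a $HOME/.aws/credentials file in the Bitbucket environment) might look roughly like the sketch below. It is not something the original script does. The function name `write_aws_credentials` is hypothetical, and the sketch assumes the pipeline exposes the same `DEPLOYMENT_AWS_*` variables and that the profile name passed in matches whatever remote.storage.profile is set to.

```shell
# Hypothetical helper: write an AWS shared credentials file from the
# DEPLOYMENT_AWS_* environment variables.
# $1 = destination file, $2 = profile name (optional, "default" if omitted)
write_aws_credentials() {
    credentials_file="$1"
    profile_name="${2:-default}"

    # Abort with a message if either secret is missing from the environment.
    : "${DEPLOYMENT_AWS_ACCESS_KEY_ID:?is not defined}"
    : "${DEPLOYMENT_AWS_SECRET_ACCESS_KEY:?is not defined}"

    mkdir -p "$(dirname "$credentials_file")"
    (
        umask 077   # a newly created file is readable by the current user only
        printf '[%s]\naws_access_key_id = %s\naws_secret_access_key = %s\n' \
            "$profile_name" \
            "$DEPLOYMENT_AWS_ACCESS_KEY_ID" \
            "$DEPLOYMENT_AWS_SECRET_ACCESS_KEY" > "$credentials_file"
    )
}
```

A pipeline step would call something like `write_aws_credentials "$HOME/.aws/credentials"` before `dvc pull`. The region is not stored in the credentials file, so it would still have to come from `AWS_DEFAULT_REGION` or `$HOME/.aws/config`.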
@drjasonharrison I moved your request to a new issue since it's a bit different from environment variables in pipelines. Supporting environment variable expansion in config files is a potentially easier problem to solve.
We are using an NFS file system and, as a workaround for slow md5 checksum calculations, have set the index and state dirs to be in /tmp/dvc/. The issue is that DVC creates this directory and makes it read-only for other users, so we cannot share repos.
A solution would be to have something like:
[index]
dir = /tmp/dvc_${USER}/index
[state]
dir = /tmp/dvc_${USER}/state
work in the config file. I have tried this and it creates two directories, one with the variable expanded and one without.
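Until the config file itself expands variables, one possible workaround is to let the shell do the expansion and store the result in each user's local config. The sketch below assumes that `index.dir` and `state.dir` are accepted by `dvc config` in the DVC version in use, as the `[index]` and `[state]` sections above suggest. The function name and the `DVC_CMD` override (handy for a dry run with `echo`) are hypothetical.

```shell
# Hypothetical helper: point DVC's index and state directories at a per-user path.
# DVC_CMD is an assumption added for dry runs; it defaults to the real `dvc`.
configure_per_user_dvc_dirs() {
    base_dir="/tmp/dvc_${USER:-$(id -un)}"
    # The shell expands the user name here, so the config stores a literal path.
    "${DVC_CMD:-dvc}" config --local index.dir "${base_dir}/index"
    "${DVC_CMD:-dvc}" config --local state.dir "${base_dir}/state"
}
```

Each user would run `configure_per_user_dvc_dirs` once per clone. Because `--local` writes to `.dvc/config.local`, which Git ignores, the per-user paths never reach the shared `.dvc/config`.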
I'd also benefit from this. We're using a mapped OneDrive folder for our DVC remote, but the local path to that mapped folder contains the $USER or $HOME path, which means you have to have a different "version" of the remote for each user. DVC being able to resolve an environment variable would be really helpful.
Originally posted by @drjasonharrison in #1416 (comment)