ValueError: Given configuration path either does not exist or is not a valid directory #33

Closed
eliorc opened this issue Nov 21, 2022 · 20 comments

Comments

@eliorc
Contributor

eliorc commented Nov 21, 2022

I am using kedro-azureml==0.3.1 and am facing an issue when executing a pipeline.
I have an environment set up in my workspace:
image

I am using my own custom Kedro starter, which uses Poetry for package management, so the Dockerfile for creating this environment looks like this:

ARG BASE_IMAGE=python:3.8-buster
FROM $BASE_IMAGE

ARG AZURE_STORAGE_ACCOUNT_NAME
ARG AZURE_STORAGE_ACCOUNT_KEY

# install project requirements
ENV PYTHONFAULTHANDLER=1 \
    PYTHONUNBUFFERED=1 \
    PYTHONHASHSEED=random \
    PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    PATH="${PATH}:/root/.local/bin/poetry:/home/kedro/src" \
    AZURE_STORAGE_ACCOUNT_NAME="$AZURE_STORAGE_ACCOUNT_NAME" \
    AZURE_STORAGE_ACCOUNT_KEY="$AZURE_STORAGE_ACCOUNT_KEY"

RUN curl -sSL https://install.python-poetry.org | python3 -
COPY poetry.lock pyproject.toml ./
RUN /root/.local/bin/poetry config virtualenvs.create false && \
    /root/.local/bin/poetry install --no-interaction --no-root --no-ansi && \
    rm poetry.lock pyproject.toml

If it is relevant: this Dockerfile, with the addition of user and workdir configuration, worked perfectly with previous kedro-azureml versions.

Now, when I execute using the Azure ML environment shown in the image, I get the following error:

ProjectMetadata(config_file=PosixPath('/mnt/azureml/cr/j/20935b5bfe9d4a9796036cb173eeffc0/exe/wd/pyproject.toml'), package_name='kedro_test', project_name='kedro-test', project_path=PosixPath('/mnt/azureml/cr/j/20935b5bfe9d4a9796036cb173eeffc0/exe/wd'), project_version='0.18.3', source_dir=PosixPath('/mnt/azureml/cr/j/20935b5bfe9d4a9796036cb173eeffc0/exe/wd/src'))
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/bin/kedro:8 in <module> │
│ │
│ /usr/local/lib/python3.8/site-packages/kedro/framework/cli/cli.py:211 in │
│ main │
│ │
│ 208 │ """ │
│ 209 │ _init_plugins() │
│ 210 │ cli_collection = KedroCLI(project_path=Path.cwd()) │
│ ❱ 211 │ cli_collection() │
│ 212 │
│ │
│ /usr/local/lib/python3.8/site-packages/click/core.py:1130 in __call__ │
│ │
│ /usr/local/lib/python3.8/site-packages/kedro/framework/cli/cli.py:139 in │
│ main │
│ │
│ 136 │ │ ) │
│ 137 │ │ │
│ 138 │ │ try: │
│ ❱ 139 │ │ │ super().main( │
│ 140 │ │ │ │ args=args, │
│ 141 │ │ │ │ prog_name=prog_name, │
│ 142 │ │ │ │ complete_var=complete_var, │
│ │
│ /usr/local/lib/python3.8/site-packages/click/core.py:1055 in main │
│ │
│ /usr/local/lib/python3.8/site-packages/click/core.py:1657 in invoke │
│ │
│ /usr/local/lib/python3.8/site-packages/click/core.py:1657 in invoke │
│ │
│ /usr/local/lib/python3.8/site-packages/click/core.py:1404 in invoke │
│ │
│ /usr/local/lib/python3.8/site-packages/click/core.py:760 in invoke │
│ │
│ /usr/local/lib/python3.8/site-packages/click/decorators.py:38 in new_func │
│ │
│ /usr/local/lib/python3.8/site-packages/kedro_azureml/cli.py:312 in execute │
│ │
│ 309 ): │
│ 310 │ # 1. Run kedro │
│ 311 │ parameters = parse_extra_params(params) │
│ ❱ 312 │ with KedroContextManager( │
│ 313 │ │ ctx.metadata.package_name, env=ctx.env, extra_params=parameter │
│ 314 │ ) as mgr: │
│ 315 │ │ runner = AzurePipelinesRunner() │
│ │
│ /usr/local/lib/python3.8/site-packages/kedro_azureml/utils.py:33 in │
│ __enter__ │
│ │
│ 30 │ │ return KedroAzureMLConfig.parse_obj(self.context.config_loader. │
│ 31 │ │
│ 32 │ def __enter__(self): │
│ ❱ 33 │ │ self.session = KedroSession.create( │
│ 34 │ │ │ self.package_name, env=self.env, extra_params=self.extra_pa │
│ 35 │ │ ) │
│ 36 │ │ self.context = self.session.load_context() │
│ │
│ /usr/local/lib/python3.8/site-packages/kedro/framework/session/session.py:18 │
│ 1 in create │
│ │
│ 178 │ │ session._store.update(session_data) │
│ 179 │ │ │
│ 180 │ │ # we need a ConfigLoader registered in order to be able to set │
│ ❱ 181 │ │ session._setup_logging() │
│ 182 │ │ return session │
│ 183 │ │
│ 184 │ def _get_logging_config(self) -> Dict[str, Any]: │
│ │
│ /usr/local/lib/python3.8/site-packages/kedro/framework/session/session.py:19 │
│ 8 in _setup_logging │
│ │
│ 195 │ def _setup_logging(self) -> None: │
│ 196 │ │ """Register logging specified in logging directory.""" │
│ 197 │ │ try: │
│ ❱ 198 │ │ │ logging_config = self._get_logging_config() │
│ 199 │ │ except MissingConfigException: │
│ 200 │ │ │ self._logger.debug( │
│ 201 │ │ │ │ "No project logging configuration loaded; " │
│ │
│ /usr/local/lib/python3.8/site-packages/kedro/framework/session/session.py:18 │
│ 5 in _get_logging_config │
│ │
│ 182 │ │ return session │
│ 183 │ │
│ 184 │ def _get_logging_config(self) -> Dict[str, Any]: │
│ ❱ 185 │ │ logging_config = self._get_config_loader().get( │
│ 186 │ │ │ "logging*", "logging*/", "/logging*" │
│ 187 │ │ ) │
│ 188 │ │ # turn relative paths in logging config into absolute path │
│ │
│ /usr/local/lib/python3.8/site-packages/kedro/config/config.py:101 in get │
│ │
│ 98 │ │ return _remove_duplicates(self._build_conf_paths()) │
│ 99 │ │
│ 100 │ def get(self, *patterns: str) -> Dict[str, Any]: │
│ ❱ 101 │ │ return _get_config_from_patterns( │
│ 102 │ │ │ conf_paths=self.conf_paths, patterns=list(patterns) │
│ 103 │ │ ) │
│ 104 │
│ │
│ /usr/local/lib/python3.8/site-packages/kedro/config/common.py:69 in │
│ _get_config_from_patterns │
│ │
│ 66 │ │
│ 67 │ for conf_path in conf_paths: │
│ 68 │ │ if not Path(conf_path).is_dir(): │
│ ❱ 69 │ │ │ raise ValueError( │
│ 70 │ │ │ │ f"Given configuration path either does not exist " │
│ 71 │ │ │ │ f"or is not a valid directory: {conf_path}" │
│ 72 │ │ │ ) │
╰──────────────────────────────────────────────────────────────────────────────╯
ValueError: Given configuration path either does not exist or is not a valid
directory: /mnt/azureml/cr/j/20935b5bfe9d4a9796036cb173eeffc0/exe/wd/conf/local

My .amlignore is empty, if that's relevant.
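
For anyone who wants to catch this before submitting a job: the last frame of the traceback shows that Kedro raises as soon as one of its configuration paths fails `Path(conf_path).is_dir()`. Below is a minimal sketch of the same check run locally or against a downloaded code snapshot. The function name `missing_conf_dirs` and its parameters are hypothetical, and the `conf/base` and `conf/local` defaults are an assumption based on Kedro's standard project layout.

```python
from pathlib import Path
from typing import List, Sequence


def missing_conf_dirs(
    project_path,
    envs: Sequence[str] = ("base", "local"),  # assumed default Kedro environments
    conf_source: str = "conf",                # assumed default config folder name
) -> List[Path]:
    """Return every <project>/<conf_source>/<env> path that is not a directory.

    This mirrors the `Path(conf_path).is_dir()` test visible in the traceback;
    it does not call into Kedro itself.
    """
    conf_root = Path(project_path) / conf_source
    return [conf_root / env for env in envs if not (conf_root / env).is_dir()]


if __name__ == "__main__":
    for path in missing_conf_dirs(Path.cwd()):
        print(f"Missing configuration directory: {path}")
```

An empty result means the directories the config loader expects are present in that copy of the project.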

@marrrcin
Contributor

.amlignore by default should be empty, that's OK.

It seems like some files didn't get uploaded. Is code_directory set to "." in your azureml.yml?

Were there any warnings when you ran the kedro azureml run command locally?

Please also verify that you can see all of the relevant files in the Code tab in Azure ML.

Screenshot 2022-11-21 at 09 06 07

@eliorc
Contributor Author

eliorc commented Nov 21, 2022

I can see all the code:
image

Surprisingly, conf/local/.gitkeep is not there.

The only warning is:

.amlignore file is empty, which means all of the files from /home/azureuser/workspace/kedro-test
will be uploaded to Azure ML. Make sure that you excluded sensitive files first!

@marrrcin

@marrrcin
Contributor

Can you please add

!conf/local

to .amlignore and see if it resolves the issue?

Usually, when you run the pipeline on Azure ML, you don't use the local environment but a separate one. By Kedro's convention, the local environment is for local use (on the developer's machine), and some other environment, e.g. cloud, is used to run in a remote location.

Running the pipeline on Azure with a different Kedro environment (not to be confused with an Azure ML Environment) is done with:

kedro azureml -e name_of_kedro_env run

@eliorc
Contributor Author

eliorc commented Nov 21, 2022

Trying now with the edited version of .amlignore.

I didn't understand, though, why you mentioned the Kedro env thing at the end?

@eliorc
Contributor Author

eliorc commented Nov 21, 2022

@marrrcin
I tried the edit to .amlignore you suggested, but conf/local still does not get uploaded. I notice that none of the folders that have empty .gitkeep files in them get uploaded either.

The problem persists.

image

@marrrcin
Contributor

What version of AZ CLI do you have (az --version)? Try upgrading to the latest one if it isn't the most recent.

For the purpose of further debugging the issue, you can try temporarily removing .gitignore to see whether it's causing the issue. From my experience, having .amlignore should be sufficient, as it takes precedence over .gitignore (at least per Azure's docs...).

@eliorc
Contributor Author

eliorc commented Nov 21, 2022

@marrrcin I don't think it has anything to do with the .gitignore. I just created a dummy text file with some contents and put it inside conf/local. This made the conf/local directory get uploaded, and then it worked.

@marrrcin
Contributor

marrrcin commented Nov 21, 2022

OK, maybe last thing - what if you add !.gitkeep to .amlignore?

I suspect it might have something to do with hidden files.

@eliorc
Contributor Author

eliorc commented Nov 21, 2022

OK, maybe last thing - what if you add !.gitkeep to .amlignore?

I suspect it might have something to do with hidden files.

I will check in a few minutes and report again

@eliorc
Contributor Author

eliorc commented Nov 21, 2022

@marrrcin

Didn't work...

It seems to me that the upload ignores files that are both hidden and empty, rather than all hidden files. I'm saying this because .dockerignore, which has contents, is not ignored.
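
To test that hypothesis, it may help to list every zero-byte file in the project and compare the list with what shows up in the Code tab. This is only a sketch: `empty_files` and its `skip_dirs` parameter are hypothetical names, and skipping `.git` is my own assumption to keep the output short.

```python
from pathlib import Path
from typing import List, Sequence


def empty_files(project_path, skip_dirs: Sequence[str] = (".git",)) -> List[Path]:
    """Return project-relative paths of all zero-byte files, hidden ones included."""
    root = Path(project_path)
    found = []
    for path in root.rglob("*"):  # pathlib's "*" also matches dotfiles
        rel = path.relative_to(root)
        if any(part in skip_dirs for part in rel.parts):
            continue  # ignore anything inside a skipped directory
        if path.is_file() and path.stat().st_size == 0:
            found.append(rel)
    return sorted(found)


if __name__ == "__main__":
    for rel in empty_files(Path.cwd()):
        print(rel)
```

If the hypothesis holds, the files printed here should be the ones missing from the uploaded snapshot.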

@marrrcin
Contributor

I guess it's something you can notify the Azure ML SDK team about 🤔

@eliorc
Contributor Author

eliorc commented Nov 21, 2022

I rather think this is a Kedro problem: I have no need for the conf/local directory, as nothing in my code or my catalog references it, yet Kedro still expects it to be there.

Anyway, this also affects kedro-azureml, or am I wrong? It is impossible to execute with kedro-azureml when conf/local contains only empty files.

@eliorc
Contributor Author

eliorc commented Nov 21, 2022

For anyone encountering this problem, just put some textual content inside conf/local/.gitkeep and AzureML will upload it as a part of the code.
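
With several such folders, the workaround above can be applied in one pass. The sketch below writes a short placeholder into every empty `.gitkeep` under the project; the function name `fill_empty_gitkeeps` and the placeholder text are hypothetical, and the script only touches `.gitkeep` files that are currently zero bytes.

```python
from pathlib import Path
from typing import List


def fill_empty_gitkeeps(
    project_path,
    text: str = "placeholder so this directory gets uploaded\n",  # arbitrary content
) -> List[Path]:
    """Write `text` into every zero-byte .gitkeep and return the files changed."""
    changed = []
    for path in Path(project_path).rglob(".gitkeep"):
        if path.is_file() and path.stat().st_size == 0:
            path.write_text(text)
            changed.append(path)
    return sorted(changed)


if __name__ == "__main__":
    for path in fill_empty_gitkeeps(Path.cwd()):
        print(f"Filled {path}")
```

Running it a second time changes nothing, since the files are no longer empty.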

@marrrcin
Contributor

Yes, it affects the plugin (I will think about some solution to that problem) but the underlying issue is the fact that the Azure SDK does not upload the files, even though they are present and explicitly not ignored.

@eliorc
Contributor Author

eliorc commented Dec 4, 2022

@marrrcin from the release notes, it seems like this is solved in 0.3.2?

@marrrcin
Contributor

marrrcin commented Dec 5, 2022

To some extent it is, but my thorough testing showed some other issues with ignore files (at least on macOS). I've tracked them down to this PR Azure/azure-sdk-for-python#27338 (comment) - it seems like they've resolved it, but it's not released yet. I will wait for the Azure team to release a new version of the azure-ai-ml and I will bump the plugin accordingly.

@marrrcin
Contributor

marrrcin commented Dec 8, 2022

I've released 0.3.3. Now everything should work as expected.

@marrrcin
Contributor

@eliorc have you tested out the fixes?

@eliorc
Contributor Author

eliorc commented Dec 29, 2022

No, I haven't yet, as the workaround of just filling the .gitkeep worked and I rolled it out across our different projects.

@marrrcin
Contributor

Seems like no further issues occurred. Closing.
