RASA X error on model train when volume mounted in docker image #8598

Closed
gioppoluca opened this issue May 4, 2021 · 8 comments
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-x/infrastructure 🚂 All things related to infrastructure or deployments type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@gioppoluca

Rasa version: 2.3.1

Rasa SDK version (if used & relevant): not relevant

Rasa X version (if used & relevant): 0.39.0

Python version: The one in the official docker image

Operating system (windows, osx, ...): CentOS 7 (Official Docker image)

Issue:
I'm using the official Docker images with a custom Docker Compose file, and I bind mount the app folder.
The container runs as user 1000:1000, which is needed because the host has no user or group 1001.
The container and Rasa X run correctly, BUT when I try to train the model it breaks: the "move" from the temp directory cannot be done because the file systems are "not the same", since the app folder is a volume of the container.

BTW, when the Rasa X image starts it only creates the user in passwd, BUT NOT the group, so the group is not known either. Could this also cause trouble?
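Whether the container actually knows the UID and GID it runs under can be checked from inside it. A minimal Python sketch, assuming a Unix-like container where the standard `pwd` and `grp` modules are available; `describe_ids` is a hypothetical helper name, not part of Rasa:

```python
import grp
import os
import pwd


def describe_ids(uid, gid):
    """Look up the user and group names for the given numeric IDs.

    A value of None means the ID has no entry in the passwd/group database.
    """
    try:
        user = pwd.getpwuid(uid).pw_name
    except KeyError:
        user = None
    try:
        group = grp.getgrgid(gid).gr_name
    except KeyError:
        group = None
    return {"uid": uid, "user": user, "gid": gid, "group": group}


if __name__ == "__main__":
    # Report the IDs of the current process.
    print(describe_ids(os.getuid(), os.getgid()))
```

A `None` for `group` would mean the GID has no entry in the group database, which is the situation described above.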

Error (including full traceback):

Traceback (most recent call last):
rasa-x_1           |   File "/usr/lib/python3.8/shutil.py", line 788, in move
rasa-x_1           |     os.rename(src, real_dst)
rasa-x_1           | OSError: [Errno 18] Invalid cross-device link: '/tmp/tmpnqrxy_21' -> '/app/models/20210428-152633.tar.gz'
rasa-x_1           | 

Command or request that led to error:
Probably the internal command that moves the model from the temp directory to the models folder; I have a host bind-mounted volume and Docker raises an error.

Content of configuration file (config.yml) (if relevant):
not relevant

Content of domain file (domain.yml) (if relevant):
not relevant

@gioppoluca gioppoluca added area:rasa-oss 🎡 Anything related to the open source Rasa framework type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors. labels May 4, 2021
@sara-tagger
Collaborator

Thanks for raising this issue, @melindaloubser1 will get back to you about it soon✨

Please also check out the docs and the forum in case your issue was raised there too 🤗

@HotThoughts
Contributor

Do you also get the error in logs?
PermissionError: [Errno 13] Permission denied: '/app/models/20190610-072003.tar.gz'

I am asking this because there are two similar issues #3661 and #4331 that address the Invalid cross-device link error.

@gioppoluca
Author

Yes, I also have that one, but I believe it is a consequence of the previous operation failing.

If I give the models folder the ownership 1001:0 it works, but this should not be necessary: I started the container with the user that owns the folder on the host FS.
Rasa is assigned the "id" correctly, BUT it keeps ROOT as the group; overriding that part as well would probably help.
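To see what the process is actually allowed to do with the models folder, its ownership and write access can be inspected directly. A small sketch, assuming a Unix-like system; `inspect_dir` is a hypothetical helper, not part of Rasa:

```python
import os


def inspect_dir(path):
    """Report who owns `path` and whether the current process may write to it."""
    st = os.stat(path)
    return {
        "owner_uid": st.st_uid,
        "owner_gid": st.st_gid,
        "mode": oct(st.st_mode & 0o777),  # permission bits only
        "process_uid": os.getuid(),
        "process_gid": os.getgid(),
        # Creating a file in a directory needs write and search permission.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }
```

Calling `inspect_dir("/app/models")` inside the container (the path comes from the traceback above) shows whether the IDs the process runs with match the owner of the bind-mounted folder.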

@gioppoluca
Author

This could be the reason:

rename(2): OverlayFS does not fully support the rename(2) system call. Your application needs to detect its failure and fall back to a “copy and unlink” strategy.

The container is using overlay and the operation issues a rename. Can you change it to a copy and unlink?
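The "copy and unlink" strategy quoted above can be sketched in Python as follows. This is only an illustration of the technique, not Rasa's code; `move_with_fallback` is a hypothetical name:

```python
import errno
import os
import shutil


def move_with_fallback(src, dst):
    """Move a file, falling back to copy + unlink when rename crosses devices."""
    try:
        os.rename(src, dst)
    except OSError as exc:
        # Errno 18 (EXDEV) is the "Invalid cross-device link" error;
        # anything else is a different problem and is re-raised.
        if exc.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)  # copy contents and metadata
        os.unlink(src)          # then remove the original
    return dst
```

The copy step still needs write permission on the destination directory, so a permission problem on the target would surface here as a separate error.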

@TyDunn TyDunn added the area:rasa-x/infrastructure 🚂 All things related to infrastructure or deployments label May 26, 2021
@tmbo
Member

tmbo commented Jun 7, 2021

@tczekajlo do you have an idea what this might be caused by?

@tczekajlo
Contributor

It's true that the rename function doesn't work with OverlayFS, but we use shutil.move, which should handle this situation:

If the destination is on the current filesystem, then os.rename() is used. Otherwise, src is copied to dst using copy_function and then removed. In case of symlinks, a new symlink pointing to the target of src will be created in or as dst and src will be removed.
https://docs.python.org/3/library/shutil.html#shutil.move

@gioppoluca You mentioned that you bind mount the app folder, but it looks like /tmp and /app are the same filesystem type.

You can check it by executing e.g. the df -T /tmp command within a Rasa X container.

Please make sure that the /app/models mount uses a different filesystem than /tmp.
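A similar comparison can be made from Python by comparing device IDs. A rough sketch, assuming it is run inside the Rasa X container; `on_same_device` is a hypothetical helper. Equal device IDs are only a hint: `rename(2)` can still fail with EXDEV across two mount points of the same filesystem, such as bind mounts.

```python
import os


def on_same_device(path_a, path_b):
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

For the paths in this issue, the call would be `on_same_device("/tmp", "/app/models")`.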

@tczekajlo
Contributor

I'm closing it because of the lack of further updates.

@menon92

menon92 commented Dec 26, 2023

Any updates regarding this?

7 participants