Log file conflicts when running colcon with parallel and COLCON_LOG_PATH variable set #129

Closed
murphm8 opened this issue Nov 2, 2018 · 5 comments
Labels
bug Something isn't working

Comments

@murphm8
Contributor

murphm8 commented Nov 2, 2018

I'm running multiple colcon build executions using the parallel tool. When I have COLCON_LOG_PATH set, I get the error:

Traceback (most recent call last):
  File "/usr/local/bin/colcon", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.5/dist-packages/colcon_core/command.py", line 127, in main
    create_log_path(args.verb_name)
  File "/usr/local/lib/python3.5/dist-packages/colcon_core/location.py", line 146, in create_log_path
    path, path.parent / 'latest_{verb_name}'.format_map(locals()))
  File "/usr/local/lib/python3.5/dist-packages/colcon_core/location.py", line 171, in _create_symlink
    os.symlink(str(src), str(dst))
FileExistsError: [Errno 17] File exists: 'build_2018-11-02_20-22-42' -> '/workspace/build/private/logs/latest_build'

This is because both instances of my colcon build command are trying to write to the same latest_build folder. Is this a supported use case?

@murphm8 murphm8 changed the title File conflicts when running colcon with parallel and COLCON_LOG_PATH variable set Log file conflicts when running colcon with parallel and COLCON_LOG_PATH variable set Nov 2, 2018
@dirk-thomas
Member

dirk-thomas commented Nov 2, 2018

> using the parallel tool

What kind of tool is that?

> This is because both instances of my colcon build command are trying to write to the same latest_build folder.

Each invocation will only write to the specific directory with the timestamp. So invoking the tool with the same verb at the same second is currently indeed a problem (both invocations using the same timestamp). I don't think that is your problem though.
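
To illustrate the same-second collision, here is a minimal sketch. The directory name pattern is inferred from the traceback above (`build_2018-11-02_20-22-42`); the helper `log_dir_name` is hypothetical and is not colcon's actual code.

```python
from datetime import datetime


def log_dir_name(verb_name, now):
    """Hypothetical helper: name a log directory with one-second resolution."""
    return '{}_{}'.format(verb_name, now.strftime('%Y-%m-%d_%H-%M-%S'))


# Two invocations started within the same second (different microseconds)
first = log_dir_name('build', datetime(2018, 11, 2, 20, 22, 42, 100000))
second = log_dir_name('build', datetime(2018, 11, 2, 20, 22, 42, 900000))

# Both invocations end up with the same directory name
print(first, first == second)
```

Any naming scheme with one-second resolution behaves this way, regardless of how fast the filesystem is.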

The latest* symlinks are only created as a convenience for the user. There could be a race condition if two invocations both check for the symlink and then both try to create it.
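
The check-then-create race can be sketched as follows. This is an illustration under assumptions, not the code in colcon_core/location.py: both function names are hypothetical, and the example assumes a POSIX system where `os.symlink` needs no special privileges.

```python
import os
import tempfile


def create_symlink_racy(src, dst):
    """Check-then-create: another process can create dst between the two calls."""
    if not os.path.lexists(dst):
        os.symlink(src, dst)  # may raise FileExistsError under a race


def create_symlink_tolerant(src, dst):
    """Attempt the creation and treat an existing link as a non-fatal outcome."""
    try:
        os.symlink(src, dst)
        return True
    except FileExistsError:
        # Another invocation won the race; the convenience link already exists
        return False


with tempfile.TemporaryDirectory() as log_base:
    link = os.path.join(log_base, 'latest_build')
    # Simulate two invocations that both try to create the same link
    created_first = create_symlink_tolerant('build_2018-11-02_20-22-42', link)
    created_second = create_symlink_tolerant('build_2018-11-02_20-22-42', link)
    target = os.readlink(link)
```

Because the link is only a convenience, losing the race can be ignored rather than aborting the whole invocation.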

Can you please provide a reproducible example?

@dirk-thomas dirk-thomas added the bug Something isn't working label Nov 2, 2018
@dirk-thomas
Member

Please try #130, which makes sure that each invocation gets a separate log directory and that all filesystem calls which might be subject to a race handle exceptions gracefully.
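
One way to give each invocation its own directory is sketched below. This is not the change made in #130, whose details are not shown in this thread; the helper `create_unique_log_dir` and its numeric suffix scheme are assumptions made purely for illustration.

```python
import os
import tempfile


def create_unique_log_dir(log_base, name, max_attempts=100):
    """Hypothetical helper: claim a directory that no other invocation owns."""
    os.makedirs(log_base, exist_ok=True)
    for attempt in range(max_attempts):
        candidate = name if attempt == 0 else '{}_{}'.format(name, attempt)
        path = os.path.join(log_base, candidate)
        try:
            # os.mkdir either creates the directory or raises FileExistsError,
            # so only one caller can claim a given name
            os.mkdir(path)
            return path
        except FileExistsError:
            continue
    raise RuntimeError('could not create a unique log directory')


with tempfile.TemporaryDirectory() as log_base:
    name = 'build_2018-11-02_20-22-42'
    first = os.path.basename(create_unique_log_dir(log_base, name))
    second = os.path.basename(create_unique_log_dir(log_base, name))
```

The key design choice is to let `os.mkdir` decide the winner instead of testing for existence first, which avoids the check-then-create window.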

@murphm8
Contributor Author

murphm8 commented Nov 5, 2018

I'm using https://www.gnu.org/software/parallel/

The command is:

parallel < commands.txt

The contents of commands.txt are:

colcon build --base-paths ./workspace1 --build-base ./build/workspace1/build-output --install-base ./build/workspace1/install-output
colcon build --base-paths ./workspace2 --build-base ./build/workspace2/build-output --install-base ./build/workspace2/install-output

The environment has COLCON_LOG_PATH set to /home/ec2-user/build/logs. My current working directory is /home/ec2-user. Maybe it should be possible to set the log directory via a command line argument?
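
Until a dedicated option is available, one possible workaround is to give each process its own COLCON_LOG_PATH. The sketch below assumes that a per-workspace subdirectory under the log root is acceptable. The function names are hypothetical; only the colcon arguments and the COLCON_LOG_PATH variable come from this thread.

```python
import os
import subprocess


def colcon_build_command(workspace):
    """Build the argument list used in commands.txt for one workspace."""
    return [
        'colcon', 'build',
        '--base-paths', './{}'.format(workspace),
        '--build-base', './build/{}/build-output'.format(workspace),
        '--install-base', './build/{}/install-output'.format(workspace),
    ]


def env_with_log_path(workspace, log_root):
    """Copy the environment and give this workspace its own COLCON_LOG_PATH."""
    env = dict(os.environ)
    env['COLCON_LOG_PATH'] = os.path.join(log_root, workspace)
    return env


def build_in_parallel(workspaces, log_root):
    """Start one colcon process per workspace and wait for all of them."""
    processes = [
        subprocess.Popen(colcon_build_command(ws),
                         env=env_with_log_path(ws, log_root))
        for ws in workspaces
    ]
    # Return the exit code of each build in the same order as workspaces
    return [process.wait() for process in processes]
```

Calling `build_in_parallel(['workspace1', 'workspace2'], '/home/ec2-user/build/logs')` from /home/ec2-user would then start both builds with separate log directories, so neither touches the other's latest_build link.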

@dirk-thomas
Member

I can't make it fail with similar commands (e.g. building a ROS 2 workspace twice in parallel). Maybe it is related to your filesystem? Mine is on an SSD, which is less likely to run into a race condition due to its speed.

Please try #130 since it clearly separates your two log directories. Please comment here with your results.

@murphm8
Contributor Author

murphm8 commented Nov 6, 2018

I successfully ran the build in parallel with your changes. Thanks for the solution!
