Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generated output is owned by ${USER} instead of root #344

Merged
merged 10 commits into from Jul 2, 2020

Conversation

erikr
Copy link

@erikr erikr commented Jun 24, 2020

  1. scripts/tf.sh runs as ${USER} instead of root by default. This is accomplished by setting up the correct user and group in bash in docker, then calling the Python script as ${USER}.

    The end result is that output from calling ml4cvd is now owned by ${USER} instead of root!

    To preserve prior behavior, the user can toggle between ${USER} and root using the -r flag, although I am unaware of situations where this is desired.

  2. The jupyter directory is now created by calling tf.sh with the -j flag, instead of created by default.

@erikr erikr added the enhancement New feature or request label Jun 24, 2020
@erikr erikr requested review from StevenSong and paolodi June 24, 2020 22:04
@erikr erikr self-assigned this Jun 24, 2020
@erikr erikr added this to In progress in Infrastructure via automation Jun 24, 2020
@erikr erikr changed the title tf.sh runs as $USER instead of root by default tf.sh runs as ${USER} instead of root by default Jun 24, 2020
@erikr erikr changed the title tf.sh runs as ${USER} instead of root by default tf.sh runs and generates output owned by ${USER} instead of root Jun 24, 2020
@erikr erikr changed the title tf.sh runs and generates output owned by ${USER} instead of root generated output is owned by ${USER} instead of root Jun 24, 2020
@erikr erikr linked an issue Jun 24, 2020 that may be closed by this pull request
@erikr erikr moved this from In progress to In review in Infrastructure Jun 24, 2020
@paolodi
Copy link
Contributor

paolodi commented Jun 24, 2020

Nice functionality, Erik, and nicer implementation than what I had in mind when we discussed this!

Just to clarify before reviewing: it seems that the functionality you added gets triggered by the -r flag and is not active by default. This contradicts a bit the wording of the description, or am I missing something?

@erikr
Copy link
Author

erikr commented Jun 24, 2020

@paolodi Yikes, good catch. I made a last-minute change to the implementation and got mixed up. Fixing now.

@erikr
Copy link
Author

erikr commented Jun 25, 2020

Fixed and ready for review!

Copy link
Collaborator

@StevenSong StevenSong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, just 1 q

scripts/tf.sh Outdated Show resolved Hide resolved
@StevenSong
Copy link
Collaborator

@erikr finding it might be useful to preserve all of a user's group permissions as well

if /data is part of lab group and my user ssong is part of lab group, I would like to both own the results as well as access data

@erikr
Copy link
Author

erikr commented Jun 25, 2020

Good idea. So you'd suggest:

  • get the group of the /data folder
  • check if ${USER} is in that group
  • if so, set that as the group

@StevenSong
Copy link
Collaborator

Good idea. So you'd suggest:

  • get the group of the /data folder
  • check if ${USER} is in that group
  • if so, set that as the group

I think it'd have more utility to add all of a user's group to the docker context, that way it's somewhat obvious to debug if a file cannot be accessed

let's say i get an error that's like Permission denied for /foo/bar, my next step in debugging when I see that error is to see if I can access it myself ls /foo/bar, if that's true then I know its really a permission issue

@erikr
Copy link
Author

erikr commented Jun 28, 2020

User who calls tf.sh now has all of their groups added to their user within bash within docker. I learned more about bash, environment variables, and arrays!

https://github.com/broadinstitute/ml/blob/cff1bc88a0035b5c3ad3a0de8d28c6a588ef8111/scripts/tf.sh#L20-L50

Example output:

$ ./scripts/tf.sh -c -t \    $HOME/ml/ml4cvd/recipes.py \                                                                                                                                                                                                                                                                                    ml -> er_output_not_root
    --logging_level INFO \
    --mode cross_reference \
    --tensors_source ~/dropbox/ecg/explore/mgh/tensors_all_union.csv \
    --tensors_name ecg \
    --join_tensors partners_ecg_patientid_clean \
    --time_tensor partners_ecg_datetime \
    --reference_tensors ~/dropbox/ecg_immunotherapy/data/casecontrol2020624_events_controls.csv \
    --reference_name ici \
    --reference_join_tensors mgh_mrn \
    --number_per_window           1 \
    \
    --reference_start_time_tensor ici_start_for_calc -365 \
    --reference_end_time_tensor   ici_start_for_calc \
    --window_name                 pre-ici \
    \
    --output_folder ~ \
    --id fix-docker-permissions

Error response from daemon: unauthorized: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication
Could not pull the image gcr.io/broad-ml4cvd/deeplearning:tf2-latest-cpu. Will try anyway...
Found /data folder will try to mount it.
Found /mnt folder will try to mount it.
Attempting to run Docker with
    docker run -it          --rm     --ipc=host     -v /home/erik/ml/:/home/erik/ml/     -v /home/erik/:/home/erik/      -v /data/:/data/ -v /mnt/:/mnt/     gcr.io/broad-ml4cvd/deeplearning:tf2-latest-cpu /bin/bash -c "pip install /home/erik/ml;
        eval  python   /home/erik/ml/ml4cvd/recipes.py --logging_level INFO --mode cross_reference --tensors_source /home/erik/dropbox/ecg/explore/mgh/tensors_all_union.csv --tensors_name ecg --join_tensors partners_ecg_patientid_clean --time_tensor partners_ecg_datetime --reference_tensors /home/erik/dropbox/ecg_immunotherapy/data/casecontrol2020624_events_controls.csv --reference_name ici --reference_join_tensors mgh_mrn --number_per_window 1 --reference_start_time_tensor ici_start_for_calc -365 --reference_end_time_tensor ici_start_for_calc --window_name pre-ici --output_folder /home/erik --id fix-docker-permissions"
Processing /home/erik/ml
Building wheels for collected packages: ml4cvd
  Building wheel for ml4cvd (setup.py) ... done
  Created wheel for ml4cvd: filename=ml4cvd-0.0.1-py3-none-any.whl size=434468 sha256=f9ee5785ccffc8ae8b50cac6c1f3ea9d3665d1919c01fe28a6976d984de0ea15
  Stored in directory: /tmp/pip-ephem-wheel-cache-9efqz1kf/wheels/98/36/ad/478662d8860921e0aa5ad4db1af72b8e825c13212ca8822e10
Successfully built ml4cvd
Installing collected packages: ml4cvd
Successfully installed ml4cvd-0.0.1
WARNING: You are using pip version 20.1; however, version 20.1.1 is available.
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  sudo
0 upgraded, 1 newly installed, 0 to remove and 63 not upgraded.
Need to get 427 kB of archives.
After this operation, 1765 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 sudo amd64 1.8.21p2-3ubuntu1.2 [427 kB]
Fetched 427 kB in 1s (784 kB/s)
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package sudo.
(Reading database ... 22621 files and directories currently installed.)
Preparing to unpack .../sudo_1.8.21p2-3ubuntu1.2_amd64.deb ...
Unpacking sudo (1.8.21p2-3ubuntu1.2) ...
Setting up sudo (1.8.21p2-3ubuntu1.2) ...
Creating group erik with gid 1000
Adding user erik to group erik
Creating group sudo with gid 27
Adding user erik to group sudo
Creating group docker with gid 998
Adding user erik to group docker
Creating group aguirrelab with gid 1006
Adding user erik to group aguirrelab
2020-06-28 17:41:13.312941: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-06-28 17:41:13.313001: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-06-28 17:41:13.313005: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-06-28 17:41:15.561128: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-06-28 17:41:15.561147: E tensorflow/stream_executor/cuda/cuda_driver.cc:351] failed call to cuInit: UNKNOWN ERROR (303)
2020-06-28 17:41:15.561163: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
2020-06-28 17:41:15.561369: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-06-28 17:41:15.586646: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3099995000 Hz
2020-06-28 17:41:15.588788: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4981790 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-28 17:41:15.588816: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-06-28 17:41:15,916 - logger:25 - INFO - Logging configuration was loaded. Log messages can be found at /home/erik/fix-docker-permissions/log_2020-06-28_17-41_0.log.
2020-06-28 17:41:15,916 - arguments:411 - INFO - Command Line was:
./scripts/tf.sh /home/erik/ml/ml4cvd/recipes.py --logging_level INFO --mode cross_reference --tensors_source /home/erik/dropbox/ecg/explore/mgh/tensors_all_union.csv --tensors_name ecg --join_tensors partners_ecg_patientid_clean --time_tensor partners_ecg_datetime --reference_tensors /home/erik/dropbox/ecg_immunotherapy/data/casecontrol2020624_events_controls.csv --reference_name ici --reference_join_tensors mgh_mrn --number_per_window 1 --reference_start_time_tensor ici_start_for_calc -365 --reference_end_time_tensor ici_start_for_calc --window_name pre-ici --output_folder /home/erik --id fix-docker-permissions

2020-06-28 17:41:15,917 - arguments:412 - INFO - Total TensorMaps: 539 Arguments are Namespace(activation='relu', aligned_dimension=16, alpha=0.5, anneal_max=2.0, anneal_rate=0.0, anneal_shift=0.0, app_csv=None, b_slice_force=None, balance_csvs=[], batch_size=16, bigquery_credentials_file='/mnt/ml4cvd/projects/jamesp/bigquery/bigquery-viewer-credentials.json', bigquery_dataset='broad-ml4cvd.ukbb7089_r10data', block_size=3, bottleneck_type=<BottleneckType.FlattenRestructure: 1>, cache_size=125000000.0, categorical_field_ids=[], continuous_field_ids=[], continuous_file=None, continuous_file_column=None, continuous_file_discretization_bounds=[], continuous_file_normalize=False, conv_dilate=False, conv_dropout=0.0, conv_layers=[32], conv_normalize=None, conv_regularize=None, conv_type='conv', conv_x=3, conv_y=3, conv_z=2, debug=False, dense_blocks=[32, 24, 16], dense_layers=[16, 64], dicom_series='cine_segmented_sax_b6', dicoms='./dicoms/', dropout=0.0, eager=False, embed_visualization=None, epochs=12, explore_export_errors=False, freeze_model_layers=False, hidden_layer='embed', id='fix-docker-permissions', imputation_method_for_continuous_fields='random', include_array=False, include_instance=False, include_missing_continuous_channel=False, input_tensors=[], inspect_model=False, inspect_show_labels=True, join_tensors=['partners_ecg_patientid_clean'], label_weights=None, language_layer='ecg_rest_text', language_prefix='ukb_ecg_rest', learning_rate=0.0002, learning_rate_schedule=None, logging_level='INFO', match_any_window=False, max_models=16, max_parameters=9000000, max_patients=999999, max_pools=[], max_sample_id=7000000, max_samples=None, max_slices=999999, min_sample_id=0, min_samples=3, min_values=10, mixup_alpha=0, mlp_concat=False, mode='cross_reference', model_file=None, model_files=[], model_layers=None, mri_field_ids=['20208', '20209'], num_workers=28, number_per_window=1, optimizer='radam', order_in_window=None, output_folder='/home/erik', output_tensors=[], padding='same', patience=8, phecode_definitions='/mnt/ml4cvd/projects/jamesp/data/phecode_definitions1.2.csv', phenos_folder='gs://ml4cvd/phenotypes/', plot_hist=True, plot_mode='clinical', pool_type='max', pool_x=2, pool_y=2, pool_z=1, random_seed=12878, reference_end_time_tensor=[['ici_start_for_calc']], reference_join_tensors=['mgh_mrn'], reference_labels=None, reference_name='ici', reference_start_time_tensor=[['ici_start_for_calc', '-365']], reference_tensors='/home/erik/dropbox/ecg_immunotherapy/data/casecontrol2020624_events_controls.csv', sample_csv=None, sample_weight=None, t=48, tensor_maps_in=[], tensor_maps_out=[], tensors=None, tensors_name='ecg', tensors_source='/home/erik/dropbox/ecg/explore/mgh/tensors_all_union.csv', test_csv=None, test_ratio=0.1, test_steps=32, time_tensor='partners_ecg_datetime', train_csv=None, training_steps=400, tsv_style='standard', u_connect=defaultdict(<class 'set'>, {}), valid_csv=None, valid_ratio=0.2, validation_steps=40, window_name=['pre-ici'], write_pngs=False, x=256, xml_field_ids=['20205', '6025'], xml_folder='/mnt/disks/ecg-rest-xml/', y=256, z=48, zip_folder='/mnt/disks/sax-mri-zip/', zoom_height=96, zoom_width=96, zoom_x=50, zoom_y=35)
2020-06-28 17:41:34,887 - explorations:1058 - INFO - Loaded ecg into dataframe
2020-06-28 17:41:34,890 - explorations:1060 - INFO - Loaded ici into dataframe
2020-06-28 17:41:36,514 - explorations:1084 - INFO - Cleaned data columns and removed rows that could not be parsed
2020-06-28 17:41:37,543 - explorations:1089 - INFO - Removed duplicates from dataframes, based on join, time, and label
2020-06-28 17:41:38,664 - explorations:1099 - INFO - Cross referenced based on join tensors
2020-06-28 17:41:38,839 - explorations:1201 - INFO - Cross referenced so unique event occurs 1+ times in all time windows
2020-06-28 17:41:38,857 - explorations:1130 - INFO - Saved cross reference to /home/erik/fix-docker-permissions/list_1+_in_all_windows.csv
2020-06-28 17:41:38,863 - explorations:1248 - INFO - Saved cohort counts to /home/erik/fix-docker-permissions/summary_cohort_counts.csv
2020-06-28 17:41:38,864 - recipes:120 - INFO - Executed the 'cross_reference' operation in 22.95 seconds

Checking permissions of output:

$ ls -al
drwxr-xr-x  2 erik erik 4.0K Jun 28 17:41 fix-docker-permissions/

and

$ ls -al ~/fix-docker-permissions
total 200
drwxr-xr-x  2 erik erik   4096 Jun 28 17:41 .
drwxr-xr-x 25 erik erik   4096 Jun 28 17:43 ..
-rw-r--r--  1 erik erik   3761 Jun 28 17:41 arguments_2020-06-28_17-41.txt
-rw-r--r--  1 erik erik 177717 Jun 28 17:41 list_1+_in_all_windows.csv
-rw-r--r--  1 erik erik   5054 Jun 28 17:41 log_2020-06-28_17-41_0.log
-rw-r--r--  1 erik erik    731 Jun 28 17:41 summary_cohort_counts.csv

@erikr
Copy link
Author

erikr commented Jun 29, 2020

We should test this in a more rigorous way, e.g. set the output to a directory in which user only has r/w access via their group.

@StevenSong
Copy link
Collaborator

We should test this in a more rigorous way, e.g. set the output to a directory in which user only has r/w access via their group.

works:
image

  • current directory is owned by stultzlab and part of group stultzlab
  • user sn69 is part of group stultzlab
  • ml4cvd output folder voltage-stats is owned by sn69

@erikr erikr removed the request for review from paolodi July 1, 2020 19:11
@erikr erikr requested a review from paolodi July 1, 2020 19:12
@erikr
Copy link
Author

erikr commented Jul 1, 2020

Tests passed calling ./scripts/tf.sh -T /path/to/tests, although tests are all downstream of tf.sh:

============================= test session starts ==============================
platform linux -- Python 3.6.9, pytest-5.4.2, py-1.8.1, pluggy-0.13.1
rootdir: /home/erik/broad-ml
plugins: typeguard-2.7.1, cov-2.8.1
collected 736 items

../home/erik/broad-ml/tests/test_arguments.py .....                      [  0%]
../home/erik/broad-ml/tests/test_models.py ............................. [  4%]
........................................................................ [ 14%]
........................................................................ [ 24%]
........................................................................ [ 33%]
........................................................................ [ 43%]
........................................................................ [ 53%]
........................................................................ [ 63%]
........................................................................ [ 73%]
........................................................................ [ 82%]
........................................................................ [ 92%]
......................s                                                  [ 95%]
../home/erik/broad-ml/tests/test_recipes.py .........                    [ 97%]
../home/erik/broad-ml/tests/test_tensor_generators.py .................. [ 99%]
....                                                                     [100%]

=============================== warnings summary ===============================
/home/erik/broad-ml/tests/test_models.py:91
  /home/erik/broad-ml/tests/test_models.py:91: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:103
  /home/erik/broad-ml/tests/test_models.py:103: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:115
  /home/erik/broad-ml/tests/test_models.py:115: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:139
  /home/erik/broad-ml/tests/test_models.py:139: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:162
  /home/erik/broad-ml/tests/test_models.py:162: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:322
  /home/erik/broad-ml/tests/test_models.py:322: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

tests/test_recipes.py::TestRecipes::test_find_learning_rate
  /usr/local/lib/python3.6/dist-packages/ml4cvd/optimizers.py:94: RuntimeWarning: Mean of empty slice.
    delta = smoothed_loss[i - mean_width - 1: i - 1].mean() - smoothed_loss[i - mean_width: i].mean()  # positive is good, means loss decreased a lot

tests/test_recipes.py::TestRecipes::test_find_learning_rate
  /usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
    ret = ret.dtype.type(ret / rcount)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
=========== 735 passed, 1 skipped, 8 warnings in 2063.72s (0:34:23) ============

Copy link
Contributor

@paolodi paolodi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great functionality, Erik, it works great!

ml4cvd/tensor_generators.py Outdated Show resolved Hide resolved
@erikr erikr merged commit a0b3d07 into master Jul 2, 2020
Infrastructure automation moved this from In review to Done Jul 2, 2020
@erikr erikr deleted the er_output_not_root branch July 2, 2020 03:32
lucidtronix pushed a commit that referenced this pull request Jan 13, 2023
* user/group of ml4cvd output is no longer root

* tf.sh options: run as root, set up jupyter. Closes #334

* root is no longer default

* disable silent reporting for -j in tf.sh

* many echo statements to check array vals

* user added to all user's groups in docker -> bash

* fix indentation

* adds --env to printed tf.sh call
lucidtronix pushed a commit that referenced this pull request Jan 13, 2023
* user/group of ml4cvd output is no longer root

* tf.sh options: run as root, set up jupyter. Closes #334

* root is no longer default

* disable silent reporting for -j in tf.sh

* many echo statements to check array vals

* user added to all user's groups in docker -> bash

* fix indentation

* adds --env to printed tf.sh call
lucidtronix pushed a commit that referenced this pull request Jan 13, 2023
* user/group of ml4cvd output is no longer root

* tf.sh options: run as root, set up jupyter. Closes #334

* root is no longer default

* disable silent reporting for -j in tf.sh

* many echo statements to check array vals

* user added to all user's groups in docker -> bash

* fix indentation

* adds --env to printed tf.sh call
lucidtronix pushed a commit that referenced this pull request Jan 13, 2023
* user/group of ml4cvd output is no longer root

* tf.sh options: run as root, set up jupyter. Closes #334

* root is no longer default

* disable silent reporting for -j in tf.sh

* many echo statements to check array vals

* user added to all user's groups in docker -> bash

* fix indentation

* adds --env to printed tf.sh call
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging this pull request may close these issues.

make it easier to run ml4cvd for users without root access
3 participants