generated output is owned by ${USER} instead of root #344

erikr · 2020-06-24T22:04:02Z

scripts/tf.sh runs as ${USER} instead of root by default. This is accomplished by setting up the correct user and group in bash in docker, then calling the Python script as ${USER}.

The end result is that output from calling ml4cvd is now owned by ${USER} instead of root!

To preserve prior behavior, the user can toggle between ${USER} and root using the -r flag, although I am unaware of situations where this is desired.
The jupyter directory is now created by calling tf.sh with the -j flag, instead of created by default.

paolodi · 2020-06-24T23:51:16Z

Nice functionality, Erik, and nicer implementation than what I had in mind when we discussed this!

Just to clarify before reviewing: it seems that the functionality you added gets triggered by the -r flag and is not active by default. This contradicts a bit the wording of the description, or am I missing something?

erikr · 2020-06-24T23:57:56Z

@paolodi Yikes, good catch. I made a last-minute change to the implementation and got mixed up. Fixing now.

erikr · 2020-06-25T00:19:45Z

Fixed and ready for review!

StevenSong

awesome, just 1 q

scripts/tf.sh

StevenSong · 2020-06-25T21:07:49Z

@erikr finding it might be useful to preserve all of a user's group permissions as well

if /data is part of lab group and my user ssong is part of lab group, I would like to both own the results as well as access data

erikr · 2020-06-25T21:22:44Z

Good idea. So you'd suggest:

get the group of the /data folder
check if ${USER} is in that group
if so, set that as the group

StevenSong · 2020-06-25T21:25:49Z

Good idea. So you'd suggest:

get the group of the /data folder

check if ${USER} is in that group

if so, set that as the group

I think it'd have more utility to add all of a user's group to the docker context, that way it's somewhat obvious to debug if a file cannot be accessed

let's say i get an error that's like Permission denied for /foo/bar, my next step in debugging when I see that error is to see if I can access it myself ls /foo/bar, if that's true then I know its really a permission issue

erikr · 2020-06-28T21:40:13Z

User who calls tf.sh now has all of their groups added to their user within bash within docker. I learned more about bash, environment variables, and arrays!

https://github.com/broadinstitute/ml/blob/cff1bc88a0035b5c3ad3a0de8d28c6a588ef8111/scripts/tf.sh#L20-L50

Example output:

$ ./scripts/tf.sh -c -t \    $HOME/ml/ml4cvd/recipes.py \                                                                                                                                                                                                                                                                                    ml -> er_output_not_root
    --logging_level INFO \
    --mode cross_reference \
    --tensors_source ~/dropbox/ecg/explore/mgh/tensors_all_union.csv \
    --tensors_name ecg \
    --join_tensors partners_ecg_patientid_clean \
    --time_tensor partners_ecg_datetime \
    --reference_tensors ~/dropbox/ecg_immunotherapy/data/casecontrol2020624_events_controls.csv \
    --reference_name ici \
    --reference_join_tensors mgh_mrn \
    --number_per_window           1 \
    \
    --reference_start_time_tensor ici_start_for_calc -365 \
    --reference_end_time_tensor   ici_start_for_calc \
    --window_name                 pre-ici \
    \
    --output_folder ~ \
    --id fix-docker-permissions

Error response from daemon: unauthorized: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication
Could not pull the image gcr.io/broad-ml4cvd/deeplearning:tf2-latest-cpu. Will try anyway...
Found /data folder will try to mount it.
Found /mnt folder will try to mount it.
Attempting to run Docker with
    docker run -it          --rm     --ipc=host     -v /home/erik/ml/:/home/erik/ml/     -v /home/erik/:/home/erik/      -v /data/:/data/ -v /mnt/:/mnt/     gcr.io/broad-ml4cvd/deeplearning:tf2-latest-cpu /bin/bash -c "pip install /home/erik/ml;
        eval  python   /home/erik/ml/ml4cvd/recipes.py --logging_level INFO --mode cross_reference --tensors_source /home/erik/dropbox/ecg/explore/mgh/tensors_all_union.csv --tensors_name ecg --join_tensors partners_ecg_patientid_clean --time_tensor partners_ecg_datetime --reference_tensors /home/erik/dropbox/ecg_immunotherapy/data/casecontrol2020624_events_controls.csv --reference_name ici --reference_join_tensors mgh_mrn --number_per_window 1 --reference_start_time_tensor ici_start_for_calc -365 --reference_end_time_tensor ici_start_for_calc --window_name pre-ici --output_folder /home/erik --id fix-docker-permissions"
Processing /home/erik/ml
Building wheels for collected packages: ml4cvd
  Building wheel for ml4cvd (setup.py) ... done
  Created wheel for ml4cvd: filename=ml4cvd-0.0.1-py3-none-any.whl size=434468 sha256=f9ee5785ccffc8ae8b50cac6c1f3ea9d3665d1919c01fe28a6976d984de0ea15
  Stored in directory: /tmp/pip-ephem-wheel-cache-9efqz1kf/wheels/98/36/ad/478662d8860921e0aa5ad4db1af72b8e825c13212ca8822e10
Successfully built ml4cvd
Installing collected packages: ml4cvd
Successfully installed ml4cvd-0.0.1
WARNING: You are using pip version 20.1; however, version 20.1.1 is available.
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  sudo
0 upgraded, 1 newly installed, 0 to remove and 63 not upgraded.
Need to get 427 kB of archives.
After this operation, 1765 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 sudo amd64 1.8.21p2-3ubuntu1.2 [427 kB]
Fetched 427 kB in 1s (784 kB/s)
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package sudo.
(Reading database ... 22621 files and directories currently installed.)
Preparing to unpack .../sudo_1.8.21p2-3ubuntu1.2_amd64.deb ...
Unpacking sudo (1.8.21p2-3ubuntu1.2) ...
Setting up sudo (1.8.21p2-3ubuntu1.2) ...
Creating group erik with gid 1000
Adding user erik to group erik
Creating group sudo with gid 27
Adding user erik to group sudo
Creating group docker with gid 998
Adding user erik to group docker
Creating group aguirrelab with gid 1006
Adding user erik to group aguirrelab
2020-06-28 17:41:13.312941: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-06-28 17:41:13.313001: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-06-28 17:41:13.313005: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-06-28 17:41:15.561128: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-06-28 17:41:15.561147: E tensorflow/stream_executor/cuda/cuda_driver.cc:351] failed call to cuInit: UNKNOWN ERROR (303)
2020-06-28 17:41:15.561163: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
2020-06-28 17:41:15.561369: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-06-28 17:41:15.586646: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3099995000 Hz
2020-06-28 17:41:15.588788: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4981790 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-28 17:41:15.588816: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-06-28 17:41:15,916 - logger:25 - INFO - Logging configuration was loaded. Log messages can be found at /home/erik/fix-docker-permissions/log_2020-06-28_17-41_0.log.
2020-06-28 17:41:15,916 - arguments:411 - INFO - Command Line was:
./scripts/tf.sh /home/erik/ml/ml4cvd/recipes.py --logging_level INFO --mode cross_reference --tensors_source /home/erik/dropbox/ecg/explore/mgh/tensors_all_union.csv --tensors_name ecg --join_tensors partners_ecg_patientid_clean --time_tensor partners_ecg_datetime --reference_tensors /home/erik/dropbox/ecg_immunotherapy/data/casecontrol2020624_events_controls.csv --reference_name ici --reference_join_tensors mgh_mrn --number_per_window 1 --reference_start_time_tensor ici_start_for_calc -365 --reference_end_time_tensor ici_start_for_calc --window_name pre-ici --output_folder /home/erik --id fix-docker-permissions

2020-06-28 17:41:15,917 - arguments:412 - INFO - Total TensorMaps: 539 Arguments are Namespace(activation='relu', aligned_dimension=16, alpha=0.5, anneal_max=2.0, anneal_rate=0.0, anneal_shift=0.0, app_csv=None, b_slice_force=None, balance_csvs=[], batch_size=16, bigquery_credentials_file='/mnt/ml4cvd/projects/jamesp/bigquery/bigquery-viewer-credentials.json', bigquery_dataset='broad-ml4cvd.ukbb7089_r10data', block_size=3, bottleneck_type=<BottleneckType.FlattenRestructure: 1>, cache_size=125000000.0, categorical_field_ids=[], continuous_field_ids=[], continuous_file=None, continuous_file_column=None, continuous_file_discretization_bounds=[], continuous_file_normalize=False, conv_dilate=False, conv_dropout=0.0, conv_layers=[32], conv_normalize=None, conv_regularize=None, conv_type='conv', conv_x=3, conv_y=3, conv_z=2, debug=False, dense_blocks=[32, 24, 16], dense_layers=[16, 64], dicom_series='cine_segmented_sax_b6', dicoms='./dicoms/', dropout=0.0, eager=False, embed_visualization=None, epochs=12, explore_export_errors=False, freeze_model_layers=False, hidden_layer='embed', id='fix-docker-permissions', imputation_method_for_continuous_fields='random', include_array=False, include_instance=False, include_missing_continuous_channel=False, input_tensors=[], inspect_model=False, inspect_show_labels=True, join_tensors=['partners_ecg_patientid_clean'], label_weights=None, language_layer='ecg_rest_text', language_prefix='ukb_ecg_rest', learning_rate=0.0002, learning_rate_schedule=None, logging_level='INFO', match_any_window=False, max_models=16, max_parameters=9000000, max_patients=999999, max_pools=[], max_sample_id=7000000, max_samples=None, max_slices=999999, min_sample_id=0, min_samples=3, min_values=10, mixup_alpha=0, mlp_concat=False, mode='cross_reference', model_file=None, model_files=[], model_layers=None, mri_field_ids=['20208', '20209'], num_workers=28, number_per_window=1, optimizer='radam', order_in_window=None, output_folder='/home/erik', output_tensors=[], padding='same', patience=8, phecode_definitions='/mnt/ml4cvd/projects/jamesp/data/phecode_definitions1.2.csv', phenos_folder='gs://ml4cvd/phenotypes/', plot_hist=True, plot_mode='clinical', pool_type='max', pool_x=2, pool_y=2, pool_z=1, random_seed=12878, reference_end_time_tensor=[['ici_start_for_calc']], reference_join_tensors=['mgh_mrn'], reference_labels=None, reference_name='ici', reference_start_time_tensor=[['ici_start_for_calc', '-365']], reference_tensors='/home/erik/dropbox/ecg_immunotherapy/data/casecontrol2020624_events_controls.csv', sample_csv=None, sample_weight=None, t=48, tensor_maps_in=[], tensor_maps_out=[], tensors=None, tensors_name='ecg', tensors_source='/home/erik/dropbox/ecg/explore/mgh/tensors_all_union.csv', test_csv=None, test_ratio=0.1, test_steps=32, time_tensor='partners_ecg_datetime', train_csv=None, training_steps=400, tsv_style='standard', u_connect=defaultdict(<class 'set'>, {}), valid_csv=None, valid_ratio=0.2, validation_steps=40, window_name=['pre-ici'], write_pngs=False, x=256, xml_field_ids=['20205', '6025'], xml_folder='/mnt/disks/ecg-rest-xml/', y=256, z=48, zip_folder='/mnt/disks/sax-mri-zip/', zoom_height=96, zoom_width=96, zoom_x=50, zoom_y=35)
2020-06-28 17:41:34,887 - explorations:1058 - INFO - Loaded ecg into dataframe
2020-06-28 17:41:34,890 - explorations:1060 - INFO - Loaded ici into dataframe
2020-06-28 17:41:36,514 - explorations:1084 - INFO - Cleaned data columns and removed rows that could not be parsed
2020-06-28 17:41:37,543 - explorations:1089 - INFO - Removed duplicates from dataframes, based on join, time, and label
2020-06-28 17:41:38,664 - explorations:1099 - INFO - Cross referenced based on join tensors
2020-06-28 17:41:38,839 - explorations:1201 - INFO - Cross referenced so unique event occurs 1+ times in all time windows
2020-06-28 17:41:38,857 - explorations:1130 - INFO - Saved cross reference to /home/erik/fix-docker-permissions/list_1+_in_all_windows.csv
2020-06-28 17:41:38,863 - explorations:1248 - INFO - Saved cohort counts to /home/erik/fix-docker-permissions/summary_cohort_counts.csv
2020-06-28 17:41:38,864 - recipes:120 - INFO - Executed the 'cross_reference' operation in 22.95 seconds

Checking permissions of output:

$ ls -al
drwxr-xr-x  2 erik erik 4.0K Jun 28 17:41 fix-docker-permissions/

and

$ ls -al ~/fix-docker-permissions
total 200
drwxr-xr-x  2 erik erik   4096 Jun 28 17:41 .
drwxr-xr-x 25 erik erik   4096 Jun 28 17:43 ..
-rw-r--r--  1 erik erik   3761 Jun 28 17:41 arguments_2020-06-28_17-41.txt
-rw-r--r--  1 erik erik 177717 Jun 28 17:41 list_1+_in_all_windows.csv
-rw-r--r--  1 erik erik   5054 Jun 28 17:41 log_2020-06-28_17-41_0.log
-rw-r--r--  1 erik erik    731 Jun 28 17:41 summary_cohort_counts.csv

erikr · 2020-06-29T16:20:04Z

We should test this in a more rigorous way, e.g. set the output to a directory in which user only has r/w access via their group.

StevenSong · 2020-06-29T18:34:55Z

We should test this in a more rigorous way, e.g. set the output to a directory in which user only has r/w access via their group.

works:

current directory is owned by stultzlab and part of group stultzlab
user sn69 is part of group stultzlab
ml4cvd output folder voltage-stats is owned by sn69

erikr · 2020-07-01T20:29:05Z

Tests passed calling ./scripts/tf.sh -T /path/to/tests, although tests are all downstream of tf.sh:

============================= test session starts ==============================
platform linux -- Python 3.6.9, pytest-5.4.2, py-1.8.1, pluggy-0.13.1
rootdir: /home/erik/broad-ml
plugins: typeguard-2.7.1, cov-2.8.1
collected 736 items

../home/erik/broad-ml/tests/test_arguments.py .....                      [  0%]
../home/erik/broad-ml/tests/test_models.py ............................. [  4%]
........................................................................ [ 14%]
........................................................................ [ 24%]
........................................................................ [ 33%]
........................................................................ [ 43%]
........................................................................ [ 53%]
........................................................................ [ 63%]
........................................................................ [ 73%]
........................................................................ [ 82%]
........................................................................ [ 92%]
......................s                                                  [ 95%]
../home/erik/broad-ml/tests/test_recipes.py .........                    [ 97%]
../home/erik/broad-ml/tests/test_tensor_generators.py .................. [ 99%]
....                                                                     [100%]

=============================== warnings summary ===============================
/home/erik/broad-ml/tests/test_models.py:91
  /home/erik/broad-ml/tests/test_models.py:91: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:103
  /home/erik/broad-ml/tests/test_models.py:103: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:115
  /home/erik/broad-ml/tests/test_models.py:115: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:139
  /home/erik/broad-ml/tests/test_models.py:139: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:162
  /home/erik/broad-ml/tests/test_models.py:162: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

/home/erik/broad-ml/tests/test_models.py:322
  /home/erik/broad-ml/tests/test_models.py:322: PytestUnknownMarkWarning: Unknown pytest.mark.slow - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.slow

tests/test_recipes.py::TestRecipes::test_find_learning_rate
  /usr/local/lib/python3.6/dist-packages/ml4cvd/optimizers.py:94: RuntimeWarning: Mean of empty slice.
    delta = smoothed_loss[i - mean_width - 1: i - 1].mean() - smoothed_loss[i - mean_width: i].mean()  # positive is good, means loss decreased a lot

tests/test_recipes.py::TestRecipes::test_find_learning_rate
  /usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
    ret = ret.dtype.type(ret / rcount)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
=========== 735 passed, 1 skipped, 8 warnings in 2063.72s (0:34:23) ============

paolodi

Great functionality, Erik, it works great!

ml4cvd/tensor_generators.py

* user/group of ml4cvd output is no longer root * tf.sh options: run as root, set up jupyter. Closes #334 * root is no longer default * disable silent reporting for -j in tf.sh * many echo statements to check array vals * user added to all user's groups in docker -> bash * fix indentation * adds --env to printed tf.sh call

Erik Reinertsen added 2 commits June 24, 2020 16:20

user/group of ml4cvd output is no longer root

49ab23d

tf.sh options: run as root, set up jupyter. Closes #334

574bb41

erikr added the enhancement New feature or request label Jun 24, 2020

erikr requested review from StevenSong and paolodi June 24, 2020 22:04

erikr self-assigned this Jun 24, 2020

erikr added this to In progress in Infrastructure via automation Jun 24, 2020

erikr changed the title ~~tf.sh runs as $USER instead of root by default~~ tf.sh runs as ${USER} instead of root by default Jun 24, 2020

erikr changed the title ~~tf.sh runs as ${USER} instead of root by default~~ tf.sh runs and generates output owned by ${USER} instead of root Jun 24, 2020

erikr changed the title ~~tf.sh runs and generates output owned by ${USER} instead of root~~ generated output is owned by ${USER} instead of root Jun 24, 2020

erikr linked an issue Jun 24, 2020 that may be closed by this pull request

make it easier to run ml4cvd for users without root access #334

Closed

erikr moved this from In progress to In review in Infrastructure Jun 24, 2020

root is no longer default

01fdff9

StevenSong reviewed Jun 25, 2020

View reviewed changes

scripts/tf.sh Outdated Show resolved Hide resolved

disable silent reporting for -j in tf.sh

8c474cd

Erik Reinertsen added 2 commits June 28, 2020 17:22

many echo statements to check array vals

fc130a8

user added to all user's groups in docker -> bash

cff1bc8

This was referenced Jun 30, 2020

infer header of sample_csv #350

Closed

infer mrn column header. Closes #331 #351

Merged

Merge branch 'master' into er_output_not_root

0f0098a

erikr removed the request for review from paolodi July 1, 2020 19:11

erikr requested a review from paolodi July 1, 2020 19:12

Erik Reinertsen added 2 commits July 1, 2020 15:20

fix indentation

a6ec505

adds --env to printed tf.sh call

5e266d5

paolodi approved these changes Jul 2, 2020

View reviewed changes

ml4cvd/tensor_generators.py Outdated Show resolved Hide resolved

removes extra tab

6496cb0

erikr merged commit a0b3d07 into master Jul 2, 2020

Infrastructure automation moved this from In review to Done Jul 2, 2020

erikr deleted the er_output_not_root branch July 2, 2020 03:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

generated output is owned by ${USER} instead of root #344

generated output is owned by ${USER} instead of root #344

erikr commented Jun 24, 2020 •

edited

paolodi commented Jun 24, 2020

erikr commented Jun 24, 2020

erikr commented Jun 25, 2020

StevenSong left a comment

StevenSong commented Jun 25, 2020

erikr commented Jun 25, 2020 •

edited

StevenSong commented Jun 25, 2020

erikr commented Jun 28, 2020 •

edited

erikr commented Jun 29, 2020

StevenSong commented Jun 29, 2020

erikr commented Jul 1, 2020 •

edited

paolodi left a comment

generated output is owned by ${USER} instead of root #344

generated output is owned by ${USER} instead of root #344

Conversation

erikr commented Jun 24, 2020 • edited

paolodi commented Jun 24, 2020

erikr commented Jun 24, 2020

erikr commented Jun 25, 2020

StevenSong left a comment

Choose a reason for hiding this comment

StevenSong commented Jun 25, 2020

erikr commented Jun 25, 2020 • edited

StevenSong commented Jun 25, 2020

erikr commented Jun 28, 2020 • edited

erikr commented Jun 29, 2020

StevenSong commented Jun 29, 2020

erikr commented Jul 1, 2020 • edited

paolodi left a comment

Choose a reason for hiding this comment

erikr commented Jun 24, 2020 •

edited

erikr commented Jun 25, 2020 •

edited

erikr commented Jun 28, 2020 •

edited

erikr commented Jul 1, 2020 •

edited