Bugfix: Multiple bugs from first round of feedback... #27

rilango · 2021-02-13T22:20:30Z

Docker COPY Command Does Not Transfer Necessary Files Into Container
Now git clone is used to copy everything from master branch. TODO: Further
changes are required to align development and production deployment.
Incorrectly Set GPU Flag Issue Still Exists
Now while creating list of available GPUs casting exception are managed to
avoid this issue.
UMAP Bug with CPU Pipeline in UI
This was a ripple effect of changes to Dockerfile and conda env. file.
Changing the Number of Clusters Does not Seem to Impact Plot
Additional changes to event handling callbacks are make to pick the value
from the textbox is working. Now, the changed 'Number of Clusters' is picked
on Recluster or Reload event
Reclustering on Two Clusters Produces an Error
Reclustering with no 'Number of cluster' was the underlying reason for this
issue. Now, 'Number of cluster' is not set to null or ''. In cases were
'Number of cluster is invalid, the default value (7) is used.
Cluster Color Palette Seems to Change
Now the colors are generated only when 'Number of Clusters' is changed.

- Docker COPY Command Does Not Transfer Necessary Files Into Container Now git clone is used to copy everything from master branch. TODO: Further changes are required to align development and production deployment. - Incorrectly Set GPU Flag Issue Still Exists Now while creating list of available GPUs casting exception are managed to avoid this issue. - UMAP Bug with CPU Pipeline in UI This was a ripple effect of changes to Dockerfile and conda env. file. - Changing the Number of Clusters Does not Seem to Impact Plot Additional changes to event handling callbacks are make to pick the value from the textbox is working. Now, the changed 'Number of Clusters' is picked on Recluster or Reload event - Reclustering on Two Clusters Produces an Error Reclustering with no 'Number of cluster' was the underlying reason for this issue. Now, 'Number of cluster' is not set to null or ''. In cases were 'Number of cluster is invalid, the default value (7) is used. - Cluster Color Palette Seems to Change Now the colors are generated only when 'Number of Clusters' is changed.

mlgill

Have noted non-blocking issues with Dockerfile and unfortunately some bugs that block testing. Happy to finish testing when the blockers are fixed.

Dockerfile

mlgill · 2021-02-15T16:35:06Z

@rilango Have started testing the new commits unfortunately there are still some bugs and they are blockers for testing, so I can't continue without more fixes. Please see requested change on the PR, in addition fo the following:

One of the imports in gpukmeansumap.py is incorrect -- I've corrected it in in my version of your branch here.

Unfortunately there is also a bug after fixing the import, and I couldn't figure out how to fix it via Google:

Traceback (most recent call last):
  File "startdash.py", line 274, in <module>
    main()
  File "startdash.py", line 270, in main
    Launcher()
  File "startdash.py", line 92, in __init__
    getattr(self, args.command)()
  File "startdash.py", line 210, in analyze
    n_gpu=args.n_gpu)
  File "/workspace/nvidia/cheminformatics/utils/dask.py", line 46, in initialize_cluster
    enable_infiniband=enable_infiniband)
  File "/opt/conda/envs/cuchem/lib/python3.7/site-packages/dask_cuda/local_cuda_cluster.py", line 160, in __init__
    memory_limit, threads_per_worker, n_workers
  File "/opt/conda/envs/cuchem/lib/python3.7/site-packages/distributed/worker.py", line 3270, in parse_memory_limit
    memory_limit = int(system.MEMORY_LIMIT * min(1, nthreads / total_cores))
ZeroDivisionError: division by zero
Exception ignored in: <function Cluster.__del__ at 0x7ff37a4b1560>
Traceback (most recent call last):
  File "/opt/conda/envs/cuchem/lib/python3.7/site-packages/distributed/deploy/cluster.py", line 113, in __del__
    if self.status != Status.closed:
AttributeError: 'LocalCUDACluster' object has no attribute 'status'

Happy to finish testing when this CUDACluster bug is fixed. I should be available later this evening, if needed.

mlgill · 2021-02-15T16:41:47Z

Dockerfile

-COPY launch.sh /opt/nvidia/cheminfomatics/
-COPY *.py /opt/nvidia/cheminfomatics/
-COPY nbs/*.ipynb /opt/nvidia/cheminfomatics/
+RUN mkdir -p /opt/nvidia/ \


Perhaps this was known, but the still seems to be missing most of the library files:

# ls /opt/nvidia/cheminfomatics/ Dockerfile LICENSE README.md chemvisualize.py demo.ipynb icla.pdf launch.sh screenshot.jpg startdash.py

Added some of the changes recommended during the initial feedback in Dockerfile. Fix the import stmt issus in GPU workflow class.

mlgill · 2021-02-16T22:35:13Z

@rilango I've verified the following are fixed and have approved changes so you can merge when the rest (below) are fixed:

UMAP Bug with CPU Pipeline in UI
Changing the Number of Clusters Does not Seem to Impact Plot
Reclustering on Two Clusters Produces an Error
Cluster Color Palette Seems to Change

The ones that remain are:

Docker COPY Command Does Not Transfer Necessary Files Into Container -- from your comments, I think it's known that the codebase in /opt/nvidia/cheminformatics is not complete enough to be functional on it's own.
Incorrectly Set GPU Flag Issue Still Exists -- leaving this in since my setup still has issues, per our Slack call.
New ChEMBL IDs aren't handled correctly -- as you mentioned
KMeans will crash with a small number of molecules -- as you mentioned
I also noticed that reclustering with the lasso selector using the CPU pipeline may not work more than once. I did ensure it was a small number of molecules, but not less than 100. Might be worth checking.

- Dockerfile Issue: Copy Dockerfile changes from feedback PDF. - New ChEMBLE issue: This is now fixed. A filtering issue was causing empty dataframe passed for kmeans. Root cause was ChEMBLE ID to molregno conversion error - Potential fix for start error when just one GPU is available - Ablitity to remove all MoI and remove the marker from Figure.

Added new workflow RandomProjection to the Dropdown. This uses single GPU. Other changes include: - Introduce error Dialog. At the moment error during reclustering is rendered - Move some code the base cluster class

Changes to benchmark unittest to use classes directly instead of mocking thru. commandline args. Other changes includes, further refactoring to make unittest development easier.

mlgill

@rilango Have finished testing this code. Issues and suggestions for how to handle are below.

Requirements to Fix for Release

• BUG: Remove hardcoded GPU device number in nvidia/cheminformatics/utils/dask.py This is noted in the code review.
• BUG: UI seems to be pulling 5k molecules, but it's not clear where this is set at. The default in startdash.py is 10k. I can't figure out how to update / change this.

Fix in Future Release
I suggest the following be fixed in a future release so that we can get this code merged.

• BUG: test_workflow.py hangs for at least an hour on my system sometimes. I haven't yet figured out the exact pattern to reproduce the issue. If there are issues with the CI, we can set this test to not run by default using available purest flags.
• BUG: there is still a bug associated with the benchmark.sh script. I'm happy to work on this after the release.
• IMPROVEMENT: UI responsiveness is still slow with small number (5k) molecules after switch to multiple GPUs. I haven't tested large number of molecules to see if this is improved there since I can't figure out how to set the number of molecules (see issue #2 above)

mlgill · 2021-02-18T14:52:05Z

nvidia/cheminformatics/utils/dask.py

@@ -34,6 +34,7 @@ def initialize_cluster(use_gpu=True, n_cpu=None, n_gpu=-1):
                CUDA_VISIBLE_DEVICES.append(int(device))
            except ValueError as vex:
                logger.warn(vex)
+        CUDA_VISIBLE_DEVICES = [0, 1]


Please remove hardcoded CUDA devices. Thanks. :D

…sampling WIP: MegaMolBART single molecule sampling and linear interpolation

Bugfix: Multiple bugs from first round of feedback...

…sampling WIP: MegaMolBART single molecule sampling and linear interpolation

rilango requested a review from mlgill February 13, 2021 22:20

mlgill suggested changes Feb 15, 2021

View reviewed changes

Dockerfile Show resolved Hide resolved

mlgill reviewed Feb 15, 2021

View reviewed changes

Bugfix: Incorrect import stmt and Dockerfile issues...

47478e8

Added some of the changes recommended during the initial feedback in Dockerfile. Fix the import stmt issus in GPU workflow class.

mlgill approved these changes Feb 16, 2021

View reviewed changes

rilango added 4 commits February 16, 2021 16:43

Bugfix: Remove hardcoded GPU lists

4081c7a

New clustering workflow: RandomProjection...

4c88af4

Added new workflow RandomProjection to the Dropdown. This uses single GPU. Other changes include: - Introduce error Dialog. At the moment error during reclustering is rendered - Move some code the base cluster class

Bugfix: Fix ripple effect of refactor on unit tests...

6f9e13a

Changes to benchmark unittest to use classes directly instead of mocking thru. commandline args. Other changes includes, further refactoring to make unittest development easier.

mlgill suggested changes Feb 18, 2021

View reviewed changes

rilango merged commit 8c1e3bb into NVIDIA:dev Feb 18, 2021

rilango added a commit to rilango/cheminformatics that referenced this pull request Jul 23, 2021

Merge pull request NVIDIA#27 from clara-parabricks/michelle/megatron_…

76cae7e

…sampling WIP: MegaMolBART single molecule sampling and linear interpolation

rilango added a commit to rilango/cheminformatics that referenced this pull request Jul 18, 2022

Merge pull request NVIDIA#27 from rilango/rilango/prep-release

35f6f95

Bugfix: Multiple bugs from first round of feedback...

rilango added a commit to rilango/cheminformatics that referenced this pull request Jul 18, 2022

Merge pull request NVIDIA#27 from clara-parabricks/michelle/megatron_…

2b12898

…sampling WIP: MegaMolBART single molecule sampling and linear interpolation

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bugfix: Multiple bugs from first round of feedback... #27

Bugfix: Multiple bugs from first round of feedback... #27

rilango commented Feb 13, 2021

mlgill left a comment

mlgill commented Feb 15, 2021

mlgill Feb 15, 2021

mlgill commented Feb 16, 2021

mlgill left a comment

mlgill Feb 18, 2021

Bugfix: Multiple bugs from first round of feedback... #27

Bugfix: Multiple bugs from first round of feedback... #27

Conversation

rilango commented Feb 13, 2021

mlgill left a comment

Choose a reason for hiding this comment

mlgill commented Feb 15, 2021

mlgill Feb 15, 2021

Choose a reason for hiding this comment

mlgill commented Feb 16, 2021

mlgill left a comment

Choose a reason for hiding this comment

mlgill Feb 18, 2021

Choose a reason for hiding this comment