[Draft] Refactor the library - Merge same as PR #10 -> Extensibility + Customability #9
Cleanup Create the class::MolProcessor Update utils.py
Preparation Add credits Cleanup Create the class::MolProcessor Update utils.py
@pstjohn I have completed the first step of preparing the module.
Cleanup `drawing.py` Update `model.py` (P1)
- Refactor + Cleanup
- Correct documentation
- Move RDKit logging into `__init__.py` to disable by default.
@pstjohn Can you review the code for me?
@pstjohn I have fixed the failing test. Can you review it again?
Just so I'm following, is the computational speed improvement here mainly de-duplicating inside the fragment code, rather than after all the fragments have been generated? That might be more simply implemented as a line or two here:
https://github.com/NREL/alfabet/blob/master/alfabet/fragment.py#L11-L12
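A minimal sketch of what de-duplicating inside the fragment loop could look like, rather than filtering after every fragment has been generated. This is not the code at the link above: the record layout (`bond_index`, `fragment1`, `fragment2`) and the function name `iter_unique_fragments` are hypothetical and only illustrate the idea.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple


def iter_unique_fragments(
    records: Iterable[Dict[str, object]],
) -> Iterator[Dict[str, object]]:
    """Yield only the first record for each unordered fragment pair.

    The keys "fragment1" and "fragment2" are assumed names for this
    sketch, not the library's actual schema.
    """
    seen: Set[Tuple[str, str]] = set()
    for record in records:
        # Sort the pair so (A, B) and (B, A) count as the same break.
        frag_a, frag_b = sorted((str(record["fragment1"]), str(record["fragment2"])))
        key = (frag_a, frag_b)
        if key in seen:
            continue  # duplicate: skip it before any further work is done
        seen.add(key)
        yield record


# Toy input: breaking either C-H bond on the same carbon gives the same pair.
records = [
    {"bond_index": 0, "fragment1": "[CH3]", "fragment2": "[OH]"},
    {"bond_index": 1, "fragment1": "[H]", "fragment2": "[CH2]O"},
    {"bond_index": 2, "fragment1": "[CH2]O", "fragment2": "[H]"},
]
unique = list(iter_unique_fragments(records))
```

Because the check happens as each record is produced, duplicates never reach later processing steps.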
A couple style notes:
It's standard to use `snake_case` for variable and function names, and `TitleCase` for class definitions. Internal class functions usually just start with a preceding underscore (and don't have a trailing one), i.e., `_my_internal_function`.
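As a short illustration of these conventions, here is a sketch. `MolProcessor` is the class name used in this PR, but the attribute and method names below are made up for the example and are not taken from the PR's code.

```python
class MolProcessor:  # TitleCase for the class definition
    def __init__(self, input_smiles: str) -> None:
        # snake_case for variables and attributes
        self.input_smiles = input_smiles

    def _strip_whitespace(self) -> str:
        # Internal helper: one leading underscore, no trailing underscore.
        return self.input_smiles.strip()

    def cleaned_smiles(self) -> str:
        # Public method: snake_case, no underscore prefix.
        return self._strip_whitespace()


processor = MolProcessor("  CCO ")
cleaned = processor.cleaned_smiles()
```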
I will correct the functions to match your style. Yes, you are right, but it does not stop there. De-duplication is an important optimization, but what I want is to centralize all molecule processing in a single class, to avoid memory leaks, allow possible parallelization, and reduce duplicated operations. For example, converting molecules from SMILES and calling the AddHs function each return a new molecule. Even though they are written in C++, the time and temporary memory they use are non-trivial, with less time fluctuation. The web-based interface is the best example of where this helps, and de-duplicating fragments benefits large dataset processing.

Moreover, I tried to reduce the ALFABET namespace by importing only the necessary functions, which is definitely faster with narrower imports. I will fix it within this week, as it is midnight here. Furthermore, we want to gain more control over molecule processing, which you can see in the method.

All in all, thank you for your review.
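A rough sketch of the centralization idea described above, assuming a single object owns the SMILES-to-molecule conversion and caches the result so each distinct SMILES is converted only once. The class name `CachedMolStore` and the injected `parse` callable are hypothetical. In practice `parse` might wrap RDKit's `Chem.MolFromSmiles` followed by `Chem.AddHs`, while the toy parser here only keeps the example runnable without RDKit.

```python
from typing import Callable, Dict, Generic, TypeVar

MolT = TypeVar("MolT")


class CachedMolStore(Generic[MolT]):
    """Central place for converting SMILES, with one conversion per string."""

    def __init__(self, parse: Callable[[str], MolT]) -> None:
        self._parse = parse
        self._cache: Dict[str, MolT] = {}
        self.parse_calls = 0  # counts real conversions, for illustration only

    def get(self, smiles: str) -> MolT:
        # Reuse the stored object instead of building a new molecule each time.
        if smiles not in self._cache:
            self._cache[smiles] = self._parse(smiles)
            self.parse_calls += 1
        return self._cache[smiles]


def toy_parse(smiles: str) -> dict:
    # Stand-in for an expensive conversion such as SMILES -> molecule with Hs.
    return {"smiles": smiles, "n_chars": len(smiles)}


store = CachedMolStore(toy_parse)
first = store.get("CCO")
second = store.get("CCO")  # served from the cache, no second conversion
other = store.get("CC")
```

The trade-off is that cached molecules are shared objects, so callers that mutate a molecule would need to copy it first.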
Are you able to run the test suite locally? Looks like some of those tests are still failing.
The installation worked fine and call Speed the model start-up by setting Linkedin: https://www.linkedin.com/in/minh-pham-hoang-0b0626172/ |
- Resolving conflict
- Correct style on `utils.py`
- Update document
- Move from `utils.py` to `fragment.py` back
- Remove redundant whitespace
This PR attempted to refactor the source code by:
The tests worked well. Do you have any comments? @pstjohn |
@pstjohn Please review this for me once I have finished the changes. Would you mind if the functions were converted into a class object, so that the state of the molecule is better maintained?
Related: #2