Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conformor prevent clash #23

Merged
merged 12 commits into from Dec 2, 2019
Merged

Conformor prevent clash #23

merged 12 commits into from Dec 2, 2019

Conversation

AlaaShamandy
Copy link
Member

@AlaaShamandy AlaaShamandy commented Nov 19, 2019

Removing clashes implementation is done.

Reduced the four for loops and 2 if statements to just 2 for loops by vectorizing the calculations.
These could be reduced to 0 for loops eventually if we vectorize the whole program.

Reasonable assumptions made while developing:

  • Any individual loop from the loop database does not have clashes within, therefore no checks are made on individual loops
  • loops are not smaller than 2 residues

EDITED (by @joaomcteixeira):
This PR addresses #26.

@AlaaShamandy
Copy link
Member Author

Wrote two functions:

  1. calculates the clashes by manually iterating over the two chains and
  2. vectorizes these calculations.

Next steps: will continue to write tests for these functions

@joaomcteixeira
Copy link
Member

Thanks @AlaaShamandy for your implementations. I will review them carefully.

@joaomcteixeira joaomcteixeira added the enhancement New feature or request label Nov 25, 2019
@joaomcteixeira joaomcteixeira added this to In progress in Conformer Construction via automation Nov 25, 2019
@joaomcteixeira joaomcteixeira added this to the First release milestone Nov 25, 2019
Copy link
Member

@joaomcteixeira joaomcteixeira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very neat implementation, congratulations and thanks @AlaaShamandy !
I made some comments, let's discuss a bit further before merging so that we define a strategy to then pass to a library arch.

@joaomcteixeira
Copy link
Member

@AlaaShamandy I wait for you to tell me when the PR is completed before I review it for merging. Added the label not to merge.

"""

# concatenate every residue containing four atoms into one numpy array
last_chain_vector = np.concatenate([np.array(list(atom.values())) for atom in last_chain[:-2]]) #ORIGINAL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlaaShamandy We should not slice the conformer.coords items inside the validators. That should be made before sending the chains to the function/class call. I will do the same in the Numba implementations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?
Normally the function should be handling any slicing etc + for someone who doesn't know the program very well, it will reiterate their understanding when they see the slicing inside the function. The stateholder keeps track of conformers built on each other. So current_conformer = last_conformer + new loop. This is the logic outside the validators.

We should just stay consistent. There is no advantage to either method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normally we say we'll make standalone classes to work on their own but these validators are very specific to our program. It depends on the format of the conformers we have specified, it depends on only finding the atoms N, O, CA and C for now and it depends on the order we have specified. It's ok to deviate from making it standalone since it doesn't make much sense on its own anyway

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in this PR you are working on alphas it is fine to have the slicing there, but I don't agree with that. Let's see:

  • you have two logics in the same place: 1) deciding what to compare 2) how to compare
  • if we have different strategies for comparison we need to repeat the logic in 1) in the different strategies
  • if there are edge cases in the future that need different method we will need to develop logic that goes around the logic 1) implemented in the core of the VDW validators.

So for this reasons, it should be defined somewhere else how the slicing of the fragments is done before sending two fragments to the VDW validator. The logic of the VDW validator should be "validate A against B according to this criteria", and not "validate A against B but to A lets remove one index or two depending..." that part of the logic should be somewhere else.

Yes, though these validators are only for our program lets keep logic and abstractions as separated as possible, it will also facilitate unit testing.

@AlaaShamandy
Copy link
Member Author

@joaomcteixeira I tried to think of the most efficient way to utilize kdtrees, please review.

This pull request is ready to be merged.

Next steps:

  1. I have a couple of tentative tests written, still need to write more vigorous testing.
  2. comparing kdtrees with vectorization to find out which one is faster/more efficient

new_chain_kdtree = KDTree(new_chain_vector)

clashes = last_chain_kdtree.sparse_distance_matrix(new_chain_kdtree, 3.4) #maximum value to consider
for indices, distance in zip(clashes.keys(),clashes.values()):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick note: this can be written as clashes.items().

Copy link
Member

@joaomcteixeira joaomcteixeira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clean and neat implementation of the vectorized and KDTree implementation. The architecture looks nice at this stage, though I am not evaluating it extensively for now as I am sure it will change or re-adapt when this code is implemented in the actual program libraries.

The the conformer_builder.py you have corrected some of the bugs we have identified in meetings and others you have found. We will have those updates on top of your desk when implementing for #27.

Thanks @AlaaShamandy , I will merge now.

@joaomcteixeira joaomcteixeira merged commit 6a954e0 into julie-forman-kay-lab:develop Dec 2, 2019
@joaomcteixeira joaomcteixeira moved this from In progress to Done in Conformer Construction Dec 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging this pull request may close these issues.

None yet

2 participants