Major performance improvement by exchanging vectorized NumPy code with Numba #40
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Something I'm curious about: every single file carries a copy of the licence. What's the point? You already maintain a copy of the licence under conkit/LICENCE.txt, and surely having a copy everywhere means that once a year you need to go through and change it everywhere.
It's just in case: once the repo is installed via PyPI, the license is not copied across, so this way it's explicit on both a repo and a per-file basis.
conkit/core/ext/_contactmap.py
    for i in range(X.shape[0]):
-       for j in numba.prange(i + 1, X.shape[0]):
+       for j in range(i + 1, X.shape[0]):
Is numba.prange not an advantage here?
numba.prange assumes that the values in each iteration can be computed independently of the previous ones. Since I'm skipping all throwables[j] that are already flagged for removal, parallelizing this loop might trigger computations where they are not needed.
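The sequential loop described in the reply above might look roughly like the sketch below. It is an illustration, not the code from conkit/core/ext/_contactmap.py: the function name flag_redundant, the max_identity parameter, and the identity measure are all assumptions. Only the names X and throwables and the loop bounds come from the diff and the comment. The import fallback is there so the sketch also runs where numba is not installed.

```python
import numpy as np

try:
    import numba
    njit = numba.njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without numba.
    def njit(func):
        return func


@njit
def flag_redundant(X, max_identity):
    """Hypothetical helper: flag rows of X that are too similar to an earlier row."""
    n_rows, n_cols = X.shape
    throwables = np.zeros(n_rows, dtype=np.bool_)
    for i in range(n_rows):
        # Plain range: rows flagged in earlier passes are skipped below,
        # so no comparison is spent on entries already marked for removal.
        for j in range(i + 1, n_rows):
            if throwables[j]:
                continue
            same = 0
            for k in range(n_cols):
                if X[i, k] == X[j, k]:
                    same += 1
            if same / n_cols > max_identity:
                throwables[j] = True
    return throwables


if __name__ == "__main__":
    demo = np.array([[1, 2, 3, 4], [1, 2, 3, 4], [4, 3, 2, 1]])
    print(flag_redundant(demo, 0.7))
```

The skip on throwables[j] is what saves work here: each flagged row removes one full row comparison from every later pass of the outer loop.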
Added
- numba added as dependency

Changed
- SequenceFile.calculate_freq backend changed from numpy to numba for faster computation
- SequenceFile.calculate_weights backend changed from numpy to numba for faster computation
- SequenceFile.filter backend changed from numpy to numba for faster computation
- SequenceFile.filter_gapped backend changed from numpy to numba for faster computation
- SequenceFile.calculate_weights renamed to SequenceFile.get_weights
- SequenceFile.compute_freq renamed to SequenceFile.get_frequency
- ContactMap.singletons backend changed from numpy to numba for faster computation
- Bandwidth backend changed from numpy to numba for faster computation
- ContactMap.short_range_contacts renamed to ContactMap.short_range
- ContactMap.medium_range_contacts renamed to ContactMap.medium_range
- ContactMap.long_range_contacts renamed to ContactMap.long_range
- ContactMap.calculate_scalar_score renamed to ContactMap.set_scalar_score
- ContactMap.calculate_contact_density renamed to ContactMap.get_contact_density
- ContactMap.calculate_jaccard_index renamed to ContactMap.get_jaccard_index

Fixed
- SequenceFile.filter to remove Sequence entries reliably
- ContactMapMatrixFigure when gap variable was less than 1
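
To illustrate what a "backend changed from numpy to numba" entry can mean in practice, here is a minimal sketch that computes the same per-column gap frequency twice: once as a vectorized NumPy expression and once as an explicit loop that numba can compile. Everything in it is hypothetical (the function names, the GAP code, and the choice of statistic); it is not ConKit's implementation. As a general point, a vectorized expression like this one allocates a temporary boolean array, while the explicit loop does not.

```python
import numpy as np

try:
    import numba
    njit = numba.njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without numba.
    def njit(func):
        return func

GAP = 21  # hypothetical integer code for the gap symbol


def gap_freq_numpy(X):
    """Vectorized version: builds a temporary boolean array, then reduces it."""
    return (X == GAP).sum(axis=0) / X.shape[0]


@njit
def gap_freq_loop(X):
    """Loop version: counts gaps per column without any temporary array."""
    n_rows, n_cols = X.shape
    freq = np.zeros(n_cols, dtype=np.float64)
    for j in range(n_cols):
        count = 0
        for i in range(n_rows):
            if X[i, j] == GAP:
                count += 1
        freq[j] = count / n_rows
    return freq


if __name__ == "__main__":
    demo = np.array([[1, 21, 3], [21, 21, 4]])
    print(gap_freq_numpy(demo), gap_freq_loop(demo))
```

Both functions should return the same values for the same input, so a check such as np.allclose between the two outputs is a cheap regression test when swapping backends.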