Make TabularToolDataTable._deduplicate_data() faster #5383

Merged
merged 1 commit into from Jan 25, 2018

@nsoranzo
Member

nsoranzo commented Jan 25, 2018

Another improvement for #4783.
Reduces the time to deduplicate the `snpeffv_databases.loc` data table (42762 lines) from seconds to milliseconds.
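
For readers without the diff at hand: the changed body of `_deduplicate_data()` is not reproduced in this thread, so the following is only a sketch of one common way to get a seconds-to-milliseconds speedup on a table of this size, not necessarily what the commit does. It assumes rows are lists of strings and replaces a repeated scan of a list with a constant-time set lookup. The name `deduplicate_rows` is hypothetical and is not part of Galaxy.

```python
def deduplicate_rows(rows):
    """Return rows without later duplicates, keeping first-seen order.

    Hypothetical helper for illustration; not Galaxy's implementation.
    """
    seen = set()
    unique = []
    for row in rows:
        # Lists are unhashable, so use a tuple of the fields as the set key.
        key = tuple(row)
        if key not in seen:  # average O(1) lookup instead of an O(n) list scan
            seen.add(key)
            unique.append(row)
    return unique


rows = [["hg19", "Human"], ["mm10", "Mouse"], ["hg19", "Human"]]
print(deduplicate_rows(rows))
```

Checking membership in a list of already-seen rows costs time proportional to its length, which makes the whole pass quadratic. With a set the pass is roughly linear in the number of rows.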

-        if entries:
-            for entry in entries:
-                self.add_entry(entry, allow_duplicates=allow_duplicates, persist=persist, persist_on_error=persist_on_error, entry_source=entry_source, **kwd)
+        for entry in entries:


@mvdbeek

mvdbeek Jan 25, 2018

Member

Is `entries` always going to be an iterable?


@mvdbeek

mvdbeek Jan 25, 2018

Member

Found it, the answer should be yes.


@nsoranzo

nsoranzo Jan 25, 2018

Member

add_entries() is only called once, with other_table.data as parameter. And TabularToolDataTable.data is initialised to [], so yes.
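
To illustrate the point: looping over an empty list simply does nothing, so a truthiness guard around the loop is redundant as long as the argument is always a list. The sketch below uses a hypothetical `Table` class as a stand-in; it is not Galaxy's `TabularToolDataTable` and omits the real method's keyword arguments.

```python
class Table:
    """Hypothetical stand-in for a data table; not Galaxy's class."""

    def __init__(self):
        self.data = []  # always a list, never None

    def add_entry(self, entry):
        self.data.append(entry)

    def add_entries(self, entries):
        # No `if entries:` guard: an empty list just yields zero iterations.
        for entry in entries:
            self.add_entry(entry)


source = Table()
target = Table()

target.add_entries(source.data)  # source.data is [], so nothing is added
assert target.data == []

source.add_entry(["hg19", "Human"])
target.add_entries(source.data)
assert target.data == [["hg19", "Human"]]
```

The guard would only matter if a caller could pass `None`, since iterating over `None` raises a `TypeError`.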

@mvdbeek

Nice!

@mvdbeek mvdbeek merged commit e110488 into galaxyproject:dev Jan 25, 2018

6 checks passed

api test: Build finished. 343 tests run, 4 skipped, 0 failed.
continuous-integration/travis-ci/pr: The Travis CI build passed
framework test: Build finished. 171 tests run, 0 skipped, 0 failed.
integration test: Build finished. 79 tests run, 4 skipped, 0 failed.
selenium test: Build finished. 118 tests run, 2 skipped, 0 failed.
toolshed test: Build finished. 577 tests run, 0 skipped, 0 failed.

@nsoranzo nsoranzo deleted the nsoranzo:deduplicate_speedup branch Jan 25, 2018
