HPCC-16965 Performance warnings from Cassandra workunit code #9542
Fix "Aggregation query used without partition key" warning when counting total number of workunits. Execute counts independently and asynchronously on all partitions instead.
Signed-off-by: Richard Chapman <rchapman@hpccsystems.com>
https://track.hpccsystems.com/browse/HPCC-16965
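To illustrate the idea in the description above (one count per partition, run concurrently, then summed), here is a minimal self-contained sketch. It is not the code from this PR: `Partition`, `countPartition` and `countAllPartitions` are hypothetical names, and an in-memory vector stands in for a per-partition Cassandra query.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for one partition of the workunit table.
// A real implementation would issue a COUNT query restricted to one partition key.
using Partition = std::vector<int>;

// Count the rows of a single partition (the query that names the partition key).
std::size_t countPartition(const Partition &partition)
{
    return partition.size();
}

// Start one asynchronous count per partition, then sum the results.
std::size_t countAllPartitions(const std::vector<Partition> &partitions)
{
    std::vector<std::future<std::size_t>> pending;
    pending.reserve(partitions.size());
    for (const Partition &p : partitions)
        pending.push_back(std::async(std::launch::async, [&p] { return countPartition(p); }));

    std::size_t total = 0;
    for (auto &result : pending)
        total += result.get();   // blocks until that partition's count is ready
    return total;
}
```

All counts are started before any result is awaited, so the total latency is roughly that of the slowest partition rather than the sum of all of them.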
@ghalliday Please review
plugins/cassandra/cassandrawu.cpp
@@ -3670,39 +3712,31 @@ class CCasssandraWorkUnitFactory : public CWorkUnitFactory, implements ICassandr
    unsigned validateRepository(bool fix)
    {
        unsigned errCount = 0;
        // MORE - if the batch gets too big you may need to flush it occasionally
        CassandraBatch batch(fix ? cass_batch_new(CASS_BATCH_TYPE_LOGGED) : NULL);
        CIArrayOf<CassandraStatement> secondaryBatch;
minor: secondaryBatch is unused, I think
@richardkchapman all looks good to me. One comment about an unused variable.
Change partitioning so that all tables use the same partitioning column as far as possible. Updates to the parent and all children can thus be done efficiently in a single unlogged batch. Changes to secondary tables used for searching are moved into a separate batch, implemented as independent async Cassandra calls for best performance. Change validateRepository code to use much smaller and more appropriate batches.
Signed-off-by: Richard Chapman <rchapman@hpccsystems.com>
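The "smaller batches" point, together with the `MORE` comment in the diff about flushing a batch that grows too big, can be sketched as a size-bounded batch. This is an illustration only, not the implementation in this PR: `BoundedBatch` and its members are hypothetical, statements are plain strings, and the executor callback stands in for whatever actually sends the batch to Cassandra.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects statements and hands them to an executor in
// groups of at most `limit`, instead of building one unbounded batch.
class BoundedBatch
{
public:
    using Executor = std::function<void(const std::vector<std::string> &)>;

    BoundedBatch(std::size_t limit, Executor fn)
        : maxSize(limit), execute(std::move(fn)) {}

    // Queue a statement; flush automatically once the limit is reached.
    void add(std::string statement)
    {
        pending.push_back(std::move(statement));
        if (pending.size() >= maxSize)
            flush();
    }

    // Send whatever is queued (if anything) and start a fresh batch.
    void flush()
    {
        if (pending.empty())
            return;
        execute(pending);
        pending.clear();
    }

private:
    std::size_t maxSize;
    Executor execute;
    std::vector<std::string> pending;
};
```

The caller still needs one final `flush()` after the last `add()`, since a partially filled batch is only sent on request.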
Deleting and then adding some exceptions within a single workunit commit would end up losing the newly added ones. This is because of how Cassandra batches work when rows are both deleted and updated within the same batch. I doubt this would cause any issues in practice outside of test suite code. Also, removing certain information from a workunit (specifically the file associations, but there may be other instances) would result in the removal not being properly committed. We need to bind NULL to any columns that we want cleared in an update.
Signed-off-by: Richard Chapman <rchapman@hpccsystems.com>
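Both effects described in the commit message above can be shown with a toy model. As I understand Cassandra's write semantics, statements in one batch share a write timestamp unless one is set explicitly, and a delete beats a live value when timestamps tie. A column that an update does not mention keeps its old value, while a column bound to NULL is cleared. The sketch below is a simplified simulation under those assumptions, not driver code. `Cell`, `Row`, `applyWrite` and `readColumn` are hypothetical names.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// One stored cell: its write timestamp and its value (nullopt means deleted/cleared).
struct Cell
{
    std::int64_t timestamp;
    std::optional<std::string> value;
};

using Row = std::map<std::string, Cell>;

// Apply one write to a column. Newer timestamps win; on a tie, an existing
// delete beats a live value. (Simplification: a tie between two live values
// just takes the later call, which is not exactly what Cassandra does.)
void applyWrite(Row &row, const std::string &column, std::int64_t timestamp,
                std::optional<std::string> value)
{
    auto it = row.find(column);
    if (it != row.end())
    {
        const Cell &existing = it->second;
        if (timestamp < existing.timestamp)
            return;                                   // stale write, ignored
        if (timestamp == existing.timestamp && !existing.value.has_value())
            return;                                   // delete wins the tie
    }
    row.insert_or_assign(column, Cell{timestamp, std::move(value)});
}

// Read a column; a missing or cleared column reads as nullopt.
std::optional<std::string> readColumn(const Row &row, const std::string &column)
{
    auto it = row.find(column);
    if (it == row.end())
        return std::nullopt;
    return it->second.value;
}
```

In this model, a delete and an insert applied with the same timestamp leave the column empty, which mirrors the lost exceptions. Writing `std::nullopt` with a newer timestamp is the analogue of binding NULL: it clears the column, whereas skipping the column leaves the old value in place.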
Force-pushed from afc336a to dab7202
@ghalliday I have removed the unused variable and repushed.
Automated Smoketest
Install hpccsystems-platform-community_6.3.0-trunk0.el7.x86_64.rpm
Unittest result:
HPCC Stop: OK