Elasticsearch refactor #1122
Conversation
elog("Error sending data to database");
//return false;
Either remove the commented-out return statement, or add curly braces around these two lines.
if(!my->update_account_histories(b))
{
   edump(("Error populating ES database, exiting to avoid history gaps."));
   throw;
Need to throw graphene::chain::plugin_exception if you want to re-apply the block.
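To illustrate the distinction: a bare `throw;` with no exception in flight terminates the process, and any other exception type aborts block application, whereas throwing the dedicated plugin exception type signals that the block may be re-applied. The sketch below is a simplified, std-only stand-in for that control flow — the real graphene::chain::plugin_exception is declared via fc's exception macros, and the retry loop here is an illustration of the intent, not the chain's actual block-application code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for graphene::chain::plugin_exception: the chain treats this
// specific type as "plugin failed, the block may be re-applied".
struct plugin_exception : std::runtime_error {
    explicit plugin_exception(const std::string& msg) : std::runtime_error(msg) {}
};

// Hypothetical plugin hook that fails `failures` times before succeeding,
// as when ES is briefly unreachable during a replay.
struct flaky_plugin {
    int failures;
    void update_account_histories() {
        if (failures > 0) {
            --failures;
            throw plugin_exception("Error populating ES database");
        }
    }
};

// Simplified block-application loop: catching plugin_exception lets the
// node retry the same block instead of leaving a history gap.
int apply_block_with_retries(flaky_plugin& plugin) {
    int attempts = 0;
    while (true) {
        ++attempts;
        try {
            plugin.update_account_histories();
            return attempts;  // block applied, histories populated
        } catch (const plugin_exception&) {
            continue;         // re-apply the block
        }
    }
}
```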
}
void elasticsearch_plugin_impl::prepareBulk(const account_transaction_history_id_type& ath_id,
                                            const std::string& bulk_line )
bulk_line is a member of the plugin_impl, no need to pass it in as a parameter
Can't test this because I don't have an ES cluster ready. Error handling looks good, except for the throw - see my comment. Since this is probably hard to add to the unit test suite, I'd suggest you run a replay, then shut down your ES while the node is still replaying, then start ES up again and see if the replay goes through.
Generally, I think the code still leaves a lot of room for optimization, but it looks much cleaner now. Good job.
My impression of the es_objects plugin is that it contains a lot of boilerplate code that can probably be written much more tersely using templates. It seems the map<object_id_type,*_struct> maps are always added to but never cleared - this doesn't look like a good idea IMO.
@@ -152,9 +127,65 @@ std::vector<std::string> createBulk(std::string index_name, std::string data, st

bulk.push_back("{ \"index\" : { \"_index\" : \""+index_name+"\", \"_type\" : \"data\" "+create_string+" } }");
Better not to create JSON strings like this. Create a variant_object instead and let fc::json serialize it.
You are copying a lot of stuff around; it might be worthwhile to avoid that wherever possible.
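The hazard with hand-built JSON is escaping: the concatenated string is only valid as long as no value contains a quote or backslash. The fc calls suggested above (fc::mutable_variant_object serialized by fc::json) take care of that; the sketch below is a self-contained illustration using a hand-rolled escaper, not graphene's actual code.

```cpp
#include <cassert>
#include <string>

// Minimal JSON string escaper -- a stand-in for what a real serializer
// such as fc::json does automatically. Raw concatenation like
//   "\"_index\" : \"" + index_name + "\""
// silently produces invalid JSON when index_name contains '"' or '\'.
std::string json_escape(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            default:   out += c;      break;
        }
    }
    return out;
}

// Bulk-header line in the same shape as createBulk's, with the value escaped.
std::string make_bulk_header(const std::string& index_name) {
    return "{ \"index\" : { \"_index\" : \"" + json_escape(index_name) +
           "\", \"_type\" : \"data\" } }";
}
```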
Great suggestion. I had been doing some tests with this and have now changed it. The good thing about this is that when doing a replay and ES goes down, the plugin will keep trying until it is up again. In some testing on the first index created by the plugin (bitshares-2015-10): when running all the way through, the index ends up with 161,004 records. Stopping the ES service and restarting it several times, the index ends up with the exact same number of ops. The only difference is that it seems to send some duplicates, but these get ignored by the database since the indexing is done by operation ID. Will keep working on the other suggestions, very appreciated.
std::string trx_id = "";
if(!b.transactions.empty() && oho->trx_in_block < b.transactions.size()) {
if(!b.transactions.empty() && oho->trx_in_block < b.transactions.size())
!b.transactions.empty() && is not needed: if transactions is empty, size() is 0 and the trx_in_block comparison is already false.
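For the record, the reason the guard is redundant: trx_in_block is unsigned, and when the container is empty, size() is 0, so the range check alone already rejects every index. A tiny std-only check (a vector of int standing in for the transaction list):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// When `transactions` is empty, size() == 0 and the (unsigned) comparison
// below is false for every trx_in_block, so a separate
// `!transactions.empty() &&` guard adds nothing.
bool trx_index_valid(const std::vector<int>& transactions, uint32_t trx_in_block) {
    return trx_in_block < transactions.size();
}
```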
const account_statistics_object& elasticsearch_plugin_impl::getStatsObject(const account_id_type account_id)
{
   graphene::chain::database& db = database();
   const auto &stats_obj = account_id(db).statistics(db);
Rebase to latest develop branch, and use database::get_account_stats_by_owner() for better performance.
fc::mutable_variant_object bulk_header;
bulk_header["_index"] = index_name;
bulk_header["_type"] = "data";
bulk_header["_id"] = fc::to_string(ath_id.space_id) + "." + fc::to_string(ath_id.type_id) + "." + ath_id.instance;
Why not use something like fc::to_string(ath), or even directly use ath?
Yes, I don't like this line either. If I do fc::to_string(ath.id) it will add double quotes to the final string. If I use ath_id directly, the compiler complains that it can't convert it. Looking for alternatives.
Makes sense. Thanks. Current code is fine.
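For reference, the string under discussion is the "space.type.instance" rendering of a graphene object id (e.g. "2.9.12345" for an account transaction history object). The sketch below uses an assumed, simplified id struct — the real type is graphene's object_id template — and shows the manual composition, which avoids the surrounding double quotes that full JSON serialization of the id would add.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Simplified stand-in for a graphene object id; the real one is the
// object_id<SpaceID, TypeID, ObjectType> template. History ids print as
// "space.type.instance", e.g. "2.9.12345".
struct object_id_parts {
    uint8_t  space_id;
    uint8_t  type_id;
    uint64_t instance;
};

// Manual composition, as prepareBulk does, keeps the value unquoted;
// JSON-serializing the id object instead would wrap it in double quotes.
std::string id_to_string(const object_id_parts& id) {
    return std::to_string(id.space_id) + "." +
           std::to_string(id.type_id) + "." +
           std::to_string(id.instance);
}
```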
This pull request is work in progress for #1103.
The plugin was refactored and cleaned up, making it easier for others to understand and easier to maintain. Elasticsearch basic auth was added, a new field operation_id_num was added for easy filtering, and the node is now stopped on any insert failure to avoid operation gaps. Logs were removed as they take too much space; the proper way to test is to take snapshots and compare them against the inserted operations.
So far, tests were made up to block 10 million, getting exactly the same results in Elasticsearch as in RAM. A test at the current block number will be done; the bitshares-core team is in the process of renting a machine for this, as a full traditional node is needed.
More commits will be added to this pull request before any merge, especially on the es_objects plugin side, which is not cleaned up yet. New commits are also expected to address all the issues in #1103.