Store object arrays in ES as nested type instead of object #2568

Open
1 of 17 tasks
abitmore opened this issue Dec 25, 2021 · 1 comment

Comments

@abitmore
Member

User Story

Object arrays stored in ES are flattened by default (see https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html).

For example, for account_auths in the account history object with ID 2.9.671969, the original input was

"account_auths": [
    [ "1.2.121", 30 ],
    [ "1.2.2204", 15 ],
    [ "1.2.3284", 10 ]
] 

After being processed by our code in the ES plugin (#2565), it becomes

"account_auths_object": [
    { "key_string": "1.2.121", "data_int": 30 },
    { "key_string": "1.2.2204", "data_int": 15 },
    { "key_string": "1.2.3284", "data_int": 10 }
]

But in ES it is flattened to

"account_auths_object.key_string": [ "1.2.121", "1.2.2204", "1.2.3284" ],
"account_auths_object.data_int": [ 30, 15, 10 ]

Screenshot: [image]

If we query with "key_string" : "1.2.121" and "data_int" : 15, this record will be returned, even though no single element of the array contains both values. This behavior is not desired.
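The false match can be reproduced without a running cluster. The following is a minimal sketch in plain Python that only imitates the matching semantics; the helper names (`flatten`, `matches_flattened`, `matches_nested`) are made up for illustration and are not part of ES or the plugin.

```python
def flatten(objects):
    """Imitate the default `object` type: each leaf field becomes one
    multi-valued field, so the pairing between values is lost."""
    flat = {}
    for obj in objects:
        for field, value in obj.items():
            flat.setdefault(field, []).append(value)
    return flat


def matches_flattened(flat, **criteria):
    """Each criterion only needs to match some value of its field."""
    return all(value in flat.get(field, []) for field, value in criteria.items())


def matches_nested(objects, **criteria):
    """Nested semantics: one array element must satisfy all criteria."""
    return any(
        all(obj.get(field) == value for field, value in criteria.items())
        for obj in objects
    )


account_auths_object = [
    {"key_string": "1.2.121", "data_int": 30},
    {"key_string": "1.2.2204", "data_int": 15},
    {"key_string": "1.2.3284", "data_int": 10},
]

flat = flatten(account_auths_object)
print(flat)
# Flattened storage: the mismatched pair still matches.
print(matches_flattened(flat, key_string="1.2.121", data_int=15))  # True
# Per-element matching: the mismatched pair is rejected.
print(matches_nested(account_auths_object, key_string="1.2.121", data_int=15))  # False
print(matches_nested(account_auths_object, key_string="1.2.121", data_int=30))  # True
```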

To fix this, we need to store account_auths_object as the nested type rather than letting the default dynamic mapping rules store it as object. This means we need to specify our own explicit mappings.
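As a rough sketch of what such an explicit mapping and the matching query could look like, the request bodies are built here as Python dictionaries and printed as JSON. Several things are assumptions made for illustration: the field is shown at the top level of the document (in the real index it may sit under a longer path), the sub-field types `keyword` and `long` are guesses, and the typeless mapping layout assumes ES 7 or later.

```python
import json

# Assumption: field shown at the document root for brevity.
explicit_mapping = {
    "mappings": {
        "properties": {
            "account_auths_object": {
                "type": "nested",  # instead of the default "object"
                "properties": {
                    "key_string": {"type": "keyword"},  # assumed type
                    "data_int": {"type": "long"},       # assumed type
                },
            }
        }
    }
}


def nested_auth_query(key, weight):
    """Build a nested query: both terms must match within one array element."""
    return {
        "query": {
            "nested": {
                "path": "account_auths_object",
                "query": {
                    "bool": {
                        "must": [
                            {"term": {"account_auths_object.key_string": key}},
                            {"term": {"account_auths_object.data_int": weight}},
                        ]
                    }
                },
            }
        }
    }


print(json.dumps(explicit_mapping, indent=2))
print(json.dumps(nested_auth_query("1.2.121", 15), indent=2))
```

Note that once a field is mapped as nested, it has to be queried through a nested query like the one above; plain term queries on account_auths_object.key_string would no longer match.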

There are more fields like this. The most complex case is multi-level nested proposals, although most of them turned out to be unexpectedly malformed.

The challenges are:

  • Specify explicit mapping rules when creating new indexes (because we create a new index every month).
    • When replaying, we don't need to create or update mappings, but we do need to check whether an index already exists.
  • Perhaps use dynamic templates to handle multi-level proposals (I think we can use them to handle normal fields too).
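The points in this list could be combined roughly as follows. This is only a sketch under stated assumptions: the `graphene-` prefix and the `YYYY-MM` suffix are placeholders for the real index naming, the `*_object` pattern is inferred from the field name in the example above, and `index_exists` / `create_index` are hypothetical callables standing in for whatever HTTP calls the plugin makes.

```python
from datetime import date

INDEX_PREFIX = "graphene-"  # hypothetical prefix


def monthly_index_name(day, prefix=INDEX_PREFIX):
    """One index per month, e.g. graphene-2021-12 (assumed naming scheme)."""
    return f"{prefix}{day:%Y-%m}"


# Dynamic template: any new object-valued field whose name ends in
# "_object" is mapped as nested. `match` tests the field name only,
# so it also applies to fields inside other nested fields.
INDEX_BODY = {
    "mappings": {
        "dynamic_templates": [
            {
                "object_arrays_as_nested": {
                    "match_mapping_type": "object",
                    "match": "*_object",  # assumed naming convention
                    "mapping": {"type": "nested"},
                }
            }
        ]
    }
}


def ensure_index(name, index_exists, create_index, body=INDEX_BODY):
    """Create the index with its mappings only if it is missing.

    Returns True if the index was created, False if it already existed
    (for example during a replay)."""
    if index_exists(name):
        return False
    create_index(name, body)
    return True


# Usage with an in-memory dict standing in for the cluster.
indexes = {}
name = monthly_index_name(date(2021, 12, 25))
print(name)  # graphene-2021-12
print(ensure_index(name, indexes.__contains__, indexes.__setitem__))  # True
print(ensure_index(name, indexes.__contains__, indexes.__setitem__))  # False
```

One caveat with this pattern: it would also turn single (non-array) objects whose names end in `_object` into nested fields, so the match rule would likely need to be narrowed before real use.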

Impacts
Describe which portion(s) of BitShares Core may be impacted by your request. Please tick at least one box.

  • API (the application programming interface)
  • Build (the build process or something prior to compiled code)
  • CLI (the command line wallet)
  • Deployment (the deployment process after building such as Docker, Travis, etc.)
  • DEX (the Decentralized EXchange, market engine, etc.)
  • P2P (the peer-to-peer network for transaction/block propagation)
  • Performance (system or user efficiency, etc.)
  • Protocol (the blockchain logic, consensus, validation, etc.)
  • Security (the security of system or user data, etc.)
  • UX (the User Experience)
  • Other (please add below)

CORE TEAM TASK LIST

  • Evaluate / Prioritize Feature Request
  • Refine User Stories / Requirements
  • Define Test Cases
  • Design / Develop Solution
  • Perform QA/Testing
  • Update Documentation
@abitmore
Member Author

abitmore commented Jan 2, 2022

There was a template mentioned in another issue: #681 (comment)

I'm using a template to pre-define the settings:

$ curl -XPUT 'http://localhost:9200/_template/graphene' -d '{
  "index_patterns" : ["graphene-*"],
  "settings": { "number_of_shards": 2,
    "index": {
      "translog": {
        "retention": {
          "size": "512mb", "age": "300s"
        }
      }
    }
  }
}' -H 'Content-Type: application/json'
