-
Notifications
You must be signed in to change notification settings - Fork 569
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include bpmnProcessId as part of exported variable records #8181
Comments
Just to make sure I understand, does this mean you correlate variables to their processes via the |
Exactly. We store all instance data for a given definition across all versions (so by the One of the reasons for this is that in Optimize, it is quite typical for a report to evaluate instances for a given process from across all versions of that process, so it makes sense for these to exist in one index. Another is that splitting this up by version would/could result in too many indices. We have runtime engine customers that have thousands of definitions, with even hundreds of versions for each definition. |
Thanks Josh, we try to solve this during the quarter, but it doesn't have the highest priority. Please let us know if this is an blocking issue so that we have to increase the priority. |
Thanks @menski. So far we only know of it affecting one customer - if it becomes more frequent we will let you know that it needs bumping 👍 |
Hi @menski, do you think this is likely to be picked up in Q1? We have a focus on import performance this quarter and resolving this would be a nice improvement for Optimize. Thanks 🙂 |
I can make sure it's there for alpha2 at the latest. Of course we're also happy to review a PR from someone on your team 😉 |
8803: Include bpmnProcessId as part of exported variable records r=npepinpe a=npepinpe ## Description This PR adds a new `bpmnProcessId` property to all `VariableRecordValue` instances. This property is set during processing before the record is written in all cases. However, it is not stored in the state, but only persisted in the log, much like the `VariableRecordValue#getProcessInstanceKey` property. Doing this allows consumers of the exported record stream to index variables over a process' ID. This use case is mostly for users who want to aggregate variables over a specific process without having to care about its version (and thus the unique keys for each process/version pair). See the linked issue for more. NOTE: a small thing jumped out, which is that variables now have a few properties which are really only used for the consumers of the exported stream. I don't expect we add more of these, but if we do, at some point we might want to revise our approach instead of just tacking on the property over each record. NOTE: the `bpmnProcessId` has an empty string as default value. While we should ensure it's always set in newer versions, I assumed for backwards compatibility this is necessary. Let me know if that's not the case, maybe I forgot how the message pack decoding works. ## Related issues closes #8181 Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
8803: Include bpmnProcessId as part of exported variable records r=npepinpe a=npepinpe ## Description This PR adds a new `bpmnProcessId` property to all `VariableRecordValue` instances. This property is set during processing before the record is written in all cases. However, it is not stored in the state, but only persisted in the log, much like the `VariableRecordValue#getProcessInstanceKey` property. Doing this allows consumers of the exported record stream to index variables over a process' ID. This use case is mostly for users who want to aggregate variables over a specific process without having to care about its version (and thus the unique keys for each process/version pair). See the linked issue for more. NOTE: a small thing jumped out, which is that variables now have a few properties which are really only used for the consumers of the exported stream. I don't expect we add more of these, but if we do, at some point we might want to revise our approach instead of just tacking on the property over each record. NOTE: the `bpmnProcessId` has an empty string as default value. While we should ensure it's always set in newer versions, I assumed for backwards compatibility this is necessary. Let me know if that's not the case, maybe I forgot how the message pack decoding works. ## Related issues closes #8181 Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
Is your feature request related to a problem? Please describe.
When importing variable records and converting them to Optimize variables, we have to fetch the definition from ES to extract the
bpmnProcessId
in order to route it to the correct Optimize index. This can be expensive, and results in retries if the process hasn't been been exported by zeebe/imported by Optimize. Furthermore, if the process has been exported but the record is no longer available (cleaned up), then the import gets stuck, as seen in a recent support issue.Describe the solution you'd like
If the
bpmnProcessId
(like how it is included on the incident records for example) was included as part of the exported variable record, we could improve the performance of import by removing the fetch, and also guard against the possibility of the import getting stuck. If we have the key, Optimize is resilient enough to continue the import even if that process never can be imported.Describe alternatives you've considered
The current implementation sort of is our alternative, but it is not ideal. The performance hit is somewhat acceptable, but the import blocking issue is more awkward for us to work around. With some refactoring, skipping this data import after x retries could potentially avoid the skip but also exposes the risk of missing data that the user would expect in Optimize
Additional context
Raised as a result of investigation from https://jira.camunda.com/browse/SUPPORT-12111
The text was updated successfully, but these errors were encountered: