
Add migration script for migrating old execution field data to the new format #5255

Merged
merged 35 commits into master from db_field_type_migration_scripts on May 13, 2021

Conversation

Kami
Member

@Kami Kami commented May 7, 2021

This pull request adds a simple migration script which migrates data for executions that utilize the old, slow and inefficient field type to the new type.

I tested it locally and it works fine (it's also a bit of a PITA to test manually; the easiest approach is to create a MongoDB snapshot with an old version of st2 and re-use it with st2 master).
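At its core, the migration is conceptually just "load each execution through the ORM and save it back", since the new field type takes care of serializing the value in the new format on save. A minimal sketch of that idea, assuming the st2common persistence helpers (the import paths, setup helpers and flags below are assumptions, not the exact contents of the script in this PR):

```python
# Minimal sketch, not the actual migration script. Re-saving each execution
# through the ORM lets the new field type re-serialize the "result" value in
# the new, faster on-disk format. Config parsing, batching and "already
# migrated" checks are omitted for brevity.
from st2common.service_setup import db_setup, db_teardown  # assumed helpers
from st2common.persistence.execution import ActionExecution
from st2common.persistence.liveaction import LiveAction


def migrate_executions():
    db_setup()  # establish the MongoDB connection
    try:
        for execution in ActionExecution.query():
            # add_or_update() re-saves the document; publish=False avoids
            # emitting change notifications for a pure data migration.
            ActionExecution.add_or_update(execution, publish=False)
        for liveaction in LiveAction.query():
            LiveAction.add_or_update(liveaction, publish=False)
    finally:
        db_teardown()


if __name__ == "__main__":
    migrate_executions()
```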

Implementation Details

#4846 introduced new field types which are up to 10-30x faster when saving and retrieving objects with larger values.

In that PR I wasn't a massive fan of a migration script. Yes, I could add one, but I had already spent many, many days on those performance optimization changes, and I think a migration script adds little value while the effort is not totally trivial. The code change itself mostly is, but it's not just a code change - it also includes test cases, documentation, upgrade notes, etc., and that takes quite some effort.

The migration script doesn't provide tons of value because it only affects old executions, and most users only care about new and recent executions, so if viewing data for some old execution takes a bit longer, it's not the end of the world.

I would be all for a migration if it actually provided more value - e.g. reduced the maintenance burden / amount of code we need to maintain - but that is not the case, and it likely won't change any time soon. We still utilize those two field types in many other places, the migration script step is optional and manual, and we can't really force users to run it and assume that by the next release everyone has done so, which would let us remove that code.

For now, I only implemented it for execution-related objects. Technically we could also do it for workflow, trigger instance and rule related objects, but that would require even more work and some more "workarounds" to determine whether a specific object in the database utilizes the old type or not - it's possible to detect that, but it's not totally trivial, and just re-writing all objects seems wasteful.
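To give a rough idea of what such a check could look like (this is an assumption about the storage formats, not code from this PR): with the new field type the value ends up stored as serialized binary data, while the old field type stores a plain sub-document, so a raw pymongo lookup can in principle distinguish the two.

```python
# Hypothetical detection helper, not part of this PR: decide whether a raw
# MongoDB document still stores the given field in the old (plain sub-document)
# format rather than the new (binary / serialized) format.
def uses_old_field_format(raw_document: dict, field_name: str = "result") -> bool:
    value = raw_document.get(field_name)
    if value is None:
        return False  # nothing to migrate
    # Assumption: the new field type stores bytes, the old one stores a dict.
    return not isinstance(value, bytes)
```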

Complete contributions (with tests) for other models are also welcome - especially in this case, where the change itself will likely save many tens of thousands of dollars (and likely much more) per month in CPU utilization across all the large StackStorm installations, so contributing such a small change seems like a worthwhile trade-off for such a big improvement :).

TODO

… the old

and very slow field type to the new field type.

This is really an optional script since all the new objects will already
utilize the new field type, and most users only care about new / recent
executions (old executions are not retrieved that often, so them taking a
bit longer is not the end of the world).
@Kami Kami added this to the 3.5.0 milestone May 7, 2021
@pull-request-size pull-request-size bot added the size/L label (PR that changes 100-499 lines; requires some effort to review) May 7, 2021
@Kami Kami force-pushed the db_field_type_migration_scripts branch from 339b2fb to b9fc5e0 on May 7, 2021 19:47
@Kami Kami changed the title from "[WIP] Add migration script for migrating old execution field data to the new format" to "Add migration script for migrating old execution field data to the new format" May 8, 2021
Kami added 5 commits May 8, 2021 22:16
Previously the setup-virtualenv step would fail when the Python version got
upgraded, since we used a cache per partial Python version and not a
complete one.

I believe this change should fix that issue.
@Kami
Member Author

Kami commented May 9, 2021

I also pushed a change so we specify the complete Python version for the setup-python action and the virtualenv cache.

Previously we didn't use the complete version for the virtualenv directory cache, which means the build started to fail once GitHub Actions upgraded Python 3.8 from 3.8.9 to 3.8.10 (which looks like it happened some time in the last couple of days).

Example builds:

Technically, we could still specify only the major + minor version, but we would still need a cache per major + minor + patch version, which means we would need another GitHub Actions step to capture the complete version after the setup-python action, and that just adds additional complexity.

/cc @cognifloyd

directly affect us since we don't utilize websockets functionality).
@Kami
Member Author

Kami commented May 9, 2021

I also included an eventlet bump in this PR - that security issue doesn't affect us directly since we don't utilize the websocket functionality, but it's still a good idea to upgrade.

@Kami Kami force-pushed the db_field_type_migration_scripts branch 2 times, most recently from 24a3d74 to 5338a9c Compare May 9, 2021 18:26
@Kami
Member Author

Kami commented May 12, 2021

@guzzijones Does that execution have a result_size attribute in the collection document (you can use the mongo shell to find out)?

That's what the ?max_result_size query parameter logic utilizes. As described in the st2web PR, if that attribute is not present, we can't perform that filtering and the full result will be returned (same as before).

EDIT: It could be that the migration script doesn't set result_size, but I would need to double-check. If that's the case, I will need to make a small change to the migration script.

temporary workaround for AJ which will be removed after he runs this
script and confirms it works.
@Kami
Member Author

Kami commented May 12, 2021

@guzzijones This should do the trick for you - 2a0969f.

As you can see from the commit message and code, it includes a "one off" code change for your situation - once you confirm it works, I will remove it since it won't be needed for other users.

I confirmed that change myself - I ran a mongo query to unset the result size (db.action_execution_d_b.update({}, {$unset: {result_size: 1}}, false, true)) and then ran the migration script.
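For reference, roughly the same thing with pymongo (a sketch; the connection URI is a placeholder) looks like this:

```python
# Roughly equivalent to the mongo shell command above: strip result_size from
# every document in the action execution collection so the "missing
# result_size" code path in the migration script can be exercised.
from pymongo import MongoClient

client = MongoClient("mongodb://127.0.0.1:27017")  # placeholder connection URI
db = client["st2"]
result = db["action_execution_d_b"].update_many({}, {"$unset": {"result_size": 1}})
print("Modified %d documents" % result.modified_count)
```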

@guzzijones
Contributor

This is a new execution I just ran with the latest build of master.

https://stackstorm-vf-dev.xxx.com/api/v1/executions/609bfafc35774f26bfb9f31c?max_result_size=102400

It brought back a 3 MB JSON response, and Chrome is still struggling to render the component that shows the results.

@guzzijones
Contributor

st2-3.5dev-65.x86_64.rpm

@Kami
Member Author

Kami commented May 12, 2021

What happens if you manually run a cURL command to retrieve the execution details with the ?max_result_size query param? And as per my comment above - inspect the collection using the mongo shell and make sure that execution contains the result_size attribute.

I can't reproduce it locally - with the latest master it works fine for all the new (and migrated) executions.
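For example, a quick check along these lines (hypothetical; the host and token are placeholders, the execution ID is the one from the earlier comment) shows how large the response actually is with the parameter applied:

```python
# Hypothetical sanity check: fetch a single execution with ?max_result_size
# and print the response size. Host, execution ID and auth token are
# placeholders for your environment.
import requests

BASE_URL = "https://stackstorm.example.com/api/v1"
EXECUTION_ID = "609bfafc35774f26bfb9f31c"

resp = requests.get(
    "%s/executions/%s" % (BASE_URL, EXECUTION_ID),
    params={"max_result_size": 102400},
    headers={"X-Auth-Token": "<st2-auth-token>"},
    verify=False,  # only needed with a self-signed certificate
)
print(resp.status_code, len(resp.content), "bytes")
```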

@guzzijones
Contributor

I will need to look up how to query mongo directly. Let me see if I can get the latest RPM installed, run a new workflow, and check the st2web result.

@Kami
Member Author

Kami commented May 12, 2021

Something like this should work in the mongo shell:

use st2;
db.action_execution_d_b.find({"_id": ObjectId("ID")}).pretty();

@guzzijones
Contributor

{ "_id" : ObjectId("609bfafc35774f26bfb9f31c"), "result_size" : 232 }

It says the result size is 232? This JSON is 3 MB.

@guzzijones
Contributor

FWIW, this is a workflow, not an action.

@Kami
Member Author

Kami commented May 12, 2021

@guzzijones Ah, that does change everything - for workflow executions, I believe the actual execution result field will just contain whatever the workflow explicitly publishes as output (but I need to double-check and confirm).

How does the workflow work - does it publish the whole result or similar? A workflow which reproduces this issue would be ideal.

But it could also be that it works as expected and it's more an issue / feature request for st2web.

@guzzijones
Contributor

The problem is that the input to the workflow is huge, and that is displayed via st2web.

@Kami
Member Author

Kami commented May 12, 2021

@guzzijones If it's indeed related to the input (a workflow action parameter?), then it has nothing to do with this functionality and the migration script - this one only solves the problem for execution results (same for the related st2web change).

Having said that, please still post all the context which is needed to reproduce it (workflow + action definitions + data / parameters which reproduce it).

But if it's indeed not related to the result attribute itself, then you will need to open a new issue (feature request) for it, likely against the st2web repo.

@guzzijones
Contributor

Yes, it is a feature request against st2web. Whatever you did for the result we also need to do for the input parameters.

@Kami
Member Author

Kami commented May 12, 2021

That will probably be quite an involved change, and it will likely also need "support" on the server side if we want to implement it efficiently (i.e. adding a parameters_size field or similar...). Even more so because I don't think we have any "active" st2web contributors.

So, sadly, likely nothing that will happen any time soon.

@Kami
Member Author

Kami commented May 12, 2021

@guzzijones Also, have you re-run the migration script yet with my temporary change (migrating executions without a result_size field)?

Please let me know when you finish running it so I can remove that temporary change.

@guzzijones
Contributor

go ahead and revert that change.

@Kami
Member Author

Kami commented May 12, 2021

@guzzijones Thanks for letting me know - will revert it.

@Kami
Member Author

Kami commented May 12, 2021

@winem I believe all your feedback has been addressed and this is now ready for a final look.

# We retrieve and process up to this amount of object IDs in a single MongoDB query. This is done
# to avoid potentially retrieving many 10s of millions of object ids in a single query
# TODO: Implement batching --since-ts, --model=action,trigger,workflow flag
BATCH_SIZE = 1000
Contributor


Looks like batching is not yet implemented, and I guess this variable definition is obsolete? I'll approve the PR to unblock you and all those looking forward to the fix for the apt cache issue, as this is just a minor issue.

Member Author


Yep, this is not needed anymore - I won't implement it, since --start-dt and --end-dt were implemented and can be used to achieve a similar end result.

Will remove it.
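To make the replacement concrete: instead of a fixed BATCH_SIZE, the script can simply be run over smaller time windows; conceptually that boils down to something like this (field and helper names are assumptions, mirroring the sketch in the PR description):

```python
# Hypothetical time-window variant of the migration loop: only executions whose
# start_timestamp falls inside [start_dt, end_dt) are re-saved, so very large
# databases can be migrated in several smaller runs instead of one big batch.
from datetime import datetime

from st2common.persistence.execution import ActionExecution  # assumed import


def migrate_window(start_dt: datetime, end_dt: datetime) -> None:
    query = {"start_timestamp__gte": start_dt, "start_timestamp__lt": end_dt}
    for execution in ActionExecution.query(**query):
        ActionExecution.add_or_update(execution, publish=False)
```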

@Kami Kami merged commit 9402197 into master May 13, 2021
@Kami Kami deleted the db_field_type_migration_scripts branch May 13, 2021 08:05
Labels
external dependency migrations mongodb performance size/L PR that changes 100-499 lines. Requires some effort to review.

3 participants