This issue is about research into improving the development workflow when investigating performance bottlenecks. While we could simply create a copy of the live system for local experimentation (the same as we do with the test system every so often), it might contain personal information which we, as developers, would rather not be able to access at all.
Devise a method of generating data that is reasonably authentic in comparison with the live system, i.e. a similar number of regions with similar content (a rough sketch of one possible approach follows below).
Verify that a dev environment with that data behaves similarly to the live environment in terms of performance (keeping in mind the available resources, which might vary greatly between development machines; e.g. the Redis cache should probably at least be pointed out to the developer as a potentially deciding factor if it is not active).
This should be separate from the existing test_data.json fixture, since the small dataset is highly preferable during quick iterations on a feature, except when performance with large data is the focus. The developer should be able to switch between the two with relative ease.
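One possible direction (only a sketch, not a decided approach): a dedicated management command that creates a configurable number of regions filled with Faker-generated pages. The command name, option names and the placeholder comments below are hypothetical and would have to be mapped onto the actual integreat-cms models.

```python
# Hypothetical sketch only: a management command that creates a configurable
# number of regions with Faker-generated content. How regions and pages are
# actually created is left as a placeholder, since the real model names and
# fields of integreat-cms are not spelled out in this issue.
from django.core.management.base import BaseCommand
from faker import Faker


class Command(BaseCommand):
    help = "Generate a large synthetic dataset for performance testing"

    def add_arguments(self, parser):
        parser.add_argument("--regions", type=int, default=50)
        parser.add_argument("--pages-per-region", type=int, default=1000)

    def handle(self, *args, **options):
        faker = Faker("de_DE")
        for _ in range(options["regions"]):
            region_name = faker.unique.city()
            self.stdout.write(
                f"Creating region {region_name} with "
                f"{options['pages_per_region']} pages ..."
            )
            # A real implementation would create the region and its pages
            # here, e.g. via the model managers or factory_boy factories,
            # filling translations with faker.paragraphs() so that the
            # content volume is comparable to the live system.
```

The generated data could then be exported with Django's dumpdata into a second fixture (e.g. a hypothetical large_test_data.json), so that switching between it and test_data.json would just be a matter of which file is passed to loaddata.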
There is already the duplicate_pages tool, which can be used to generate a lot of pages; however, it does not cover specific edge cases which are not reflected in the original test data. So one solution could be to create more diverse baseline test data, which would hopefully result in a more realistic dataset once the duplication algorithm has been executed a few times (~1k pages for large regions is realistic).
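Assuming duplicate_pages is exposed as a Django management command (the issue only calls it a "tool", so this is an assumption), the repeated duplication could be scripted roughly like this; the command signature and the page-count lookup are illustrative only:

```python
# Hedged sketch: repeatedly invoke the duplication step until a region has
# roughly the desired number of pages. Assumes duplicate_pages is a Django
# management command that takes a region slug, and that the region exposes
# its pages via `region.pages`; both are assumptions to be adapted.
from django.core.management import call_command

TARGET_PAGES = 1000  # ~1k pages is realistic for large regions


def grow_region(region, rounds_limit=10):
    """Duplicate the existing pages of `region` until TARGET_PAGES is reached."""
    for _ in range(rounds_limit):
        if region.pages.count() >= TARGET_PAGES:
            break
        # Each round roughly doubles the page count, so only a few
        # iterations should be needed to reach ~1k pages.
        call_command("duplicate_pages", region.slug)
```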
timobrembeck changed the title from "META: Large synthetic dataset for performance evaluation" to "Meta: Large synthetic dataset for performance evaluation" on Oct 31, 2023
timobrembeck changed the title from "Meta: Large synthetic dataset for performance evaluation" to "Meta: 🔡 Large synthetic dataset for performance evaluation" on Nov 4, 2023
> however, it does not cover specific edge cases which are not reflected in the original test data
One example of such an edge case would be #2530, where performance testing requires lots of different links, which cannot be created using the duplicate_pages tool.
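For that kind of edge case, the baseline generation itself would need to produce varied link targets, since duplicated pages simply repeat the same links. A minimal sketch of generating content with many distinct links via Faker (the simple HTML format is an assumption about how page content is stored):

```python
# Hedged sketch: build page content in which every link target is unique,
# which plain duplication of existing pages cannot produce.
from faker import Faker

faker = Faker()


def content_with_links(paragraphs=5, links_per_paragraph=3):
    """Return HTML content where every link points to a distinct URL."""
    parts = []
    for _ in range(paragraphs):
        links = " ".join(
            f'<a href="{faker.unique.url()}">{faker.word()}</a>'
            for _ in range(links_per_paragraph)
        )
        parts.append(f"<p>{faker.paragraph()} {links}</p>")
    return "\n".join(parts)
```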