Result Pagination and REST improvements #39

Merged
merged 49 commits into from
Jan 31, 2024

meksor
Contributor

@meksor meksor commented Jan 23, 2024

In this PR:

Removed GET list endpoints

I've removed all the GET list endpoints to reduce redundancy. All query endpoints now use the filter framework (the timeseries endpoint didn't before).

Pagination

Each query endpoint now returns all results wrapped in a standardized object:

{
  "pagination": {"limit": 1000, "offset": 0},
  "total": <total number of objects in the db for the query>,
  "results": [...]
}

Where results can be a list of objects or a serialized datatable.
The query parameters limit and offset can be used to control pagination. Default and maximum values for limit are yet to be determined.
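
A minimal sketch of how a client might walk through the response object described above, assuming results is a plain list. The `fetch_page` callable and `fake_endpoint` are hypothetical stand-ins for a real HTTP request; this is not the ixmp4 client code.

```python
from typing import Any, Callable

Page = dict[str, Any]


def fetch_all(fetch_page: Callable[[int, int], Page], limit: int = 1000) -> list[Any]:
    """Collect every result by requesting pages until `total` is reached."""
    results: list[Any] = []
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        results.extend(page["results"])
        # Advance by the limit the server reports, in case it adjusted ours.
        offset += page["pagination"]["limit"]
        if offset >= page["total"] or not page["results"]:
            break
    return results


# Hypothetical in-memory endpoint that mimics the response envelope.
ROWS = list(range(2500))


def fake_endpoint(limit: int, offset: int) -> Page:
    return {
        "pagination": {"limit": limit, "offset": offset},
        "total": len(ROWS),
        "results": ROWS[offset : offset + limit],
    }


all_rows = fetch_all(fake_endpoint, limit=1000)
```

With 2500 rows and a limit of 1000, the loop issues three requests and stops once the offset passes `total`.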

@pmussak
The client side code for unpacking the pagination can be found here:
https://github.com/iiasa/ixmp4/pull/39/files#diff-6b8ca4d71cf682a8211f6d8d527b684fc42f7034d03c955505f9c0a327b93ff3R148-R203
(it should also be backwards compatible)

Bug fixes and code improvements

There were quite a few bugs and clearly unintentional typos that I found while going through the REST server.

Filter arguments in the REST layer endpoints SHOULD NO LONGER set any defaults. Otherwise, changing a default means changing it in both the filter class and the REST layer, which is confusing. I already found a bug caused by this: the default run filter included is_default=False, which is almost certainly not right. Please watch out @glatterf42.

As of now the run filter should set default_only=True by default (this wasn't the case everywhere before). This affects the REST endpoints and the backend layer. Is this correct @danielhuppmann?
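
A rough sketch of the idea of keeping defaults in one place, assuming a simplified filter class. `RunFilter` and `query_runs_endpoint` are hypothetical names for illustration and do not mirror the actual ixmp4 filter framework.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class RunFilter:
    """Hypothetical filter class: the only place where defaults are defined."""

    default_only: bool = True
    model: Optional[str] = None


def query_runs_endpoint(**params: Any) -> RunFilter:
    """Hypothetical REST-layer handler that declares no defaults of its own."""
    known = {f.name for f in fields(RunFilter)}
    # Forward only what the caller explicitly sent; the filter fills in the rest.
    explicit = {k: v for k, v in params.items() if k in known and v is not None}
    return RunFilter(**explicit)


print(query_runs_endpoint())                    # filter default applies
print(query_runs_endpoint(default_only=False))  # explicit value wins
```

Because the handler never names a default, changing `default_only` in the filter class is enough to change the behaviour everywhere.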

Test Data Generation

ixmp4 platforms generate <name>

will generate some test data and fill the given platform with it.
There is also a class for programmatic use in ixmp4.data.generator.

IamcData/Repo

I've refactored IamcData and IamcRepository into RunIamcData and PlatformIamcData as this seems like the better approach. (fyi @danielhuppmann)

Type Hints

All type hints concerning tabulation and listing have become a more concrete list[T] or List[T] instead of the current Iterable[T].
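
A small sketch of why a concrete list[T] can be more convenient for callers than Iterable[T]: a list supports len(), indexing and repeated iteration, while a bare iterable such as a generator can be consumed only once. The function names are hypothetical.

```python
from typing import Iterable, Iterator


def list_names_iterable() -> Iterable[str]:
    """Old-style hint: callers may receive a one-shot generator."""

    def gen() -> Iterator[str]:
        yield from ("model-a", "model-b")

    return gen()


def list_names() -> list[str]:
    """New-style hint: callers can rely on len(), indexing and re-iteration."""
    return ["model-a", "model-b"]


names = list_names()
first, count = names[0], len(names)

lazy = list_names_iterable()
first_pass = list(lazy)
second_pass = list(lazy)  # empty: the generator is already exhausted
```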

Upload Chunking/Pagination (to solve the "timeout" problem when uploading large datasets via rest)

... is not included in this PR and will come in the next one.

Member

@danielhuppmann danielhuppmann left a comment

I went through this PR to the best of my abilities. The changes look good to me, so I suggest merging and then checking whether any issues come up in the ixmp4-ts or pyam integration.


if self.enumeration_method == "GET":
params.update(kwargs)
def _enumeration_request(
Contributor

I had to think twice before getting that this method actually executes the request and does not just prepare it.
At least for me _request_enumeration is easier to understand.

ixmp4/data/api/base.py Outdated
@pmussak
Contributor

pmussak commented Jan 24, 2024

@meksor thanks for pointing out the client side code for unpacking the pagination!

I had a look at this section of the code and left two comments.

Since I just reviewed scse-component/chart today: @deepak-shah-np I am fairly sure this change will break how the chart component fetches data from ixmp4. So I guess a release of these changes should be coordinated with the ssp-extensions app.
Maybe you are aware anyways, just wanted to make sure the ssp-extensions app does not break.

@meksor
Contributor Author

meksor commented Jan 24, 2024

That was not on my radar, so thanks.
I will be deploying it on the dev instance for Artelys anyway, so I guess we can adjust ssp-extensions while Artelys is adjusting their client!

@pmussak
Contributor

pmussak commented Jan 25, 2024

That was not on my radar, so thanks. I will be deploying it on the dev instance for Artelys anyway, so I guess we can adjust ssp-extensions while Artelys is adjusting their client!

Sounds like a good way forward. That will also give me the opportunity to implement paging in ixmp4-ts once you have released these changes as the Docker image ene-docker.iiasa.ac.at/ixmp4-server:dev. That will be the one, right?

It is also nice to have ixmp4-ts for the future: if the scse-apps depend only on it (and not on the ixmp4 REST API directly), we can roll out changes in the ixmp4 REST API swiftly across all scse-apps.

@meksor
Contributor Author

meksor commented Jan 31, 2024

I just did some tabulation benchmarks with different default page sizes:

    # @ 1000:
    # Name (time in ms)                                                                  Min                   Max                  Mean              StdDev                Median                 IQR            Outliers      OPS            Rounds  Iterations
    # test_filter_datapoints_benchmark[filters1-test_api_sqlite_mp_generated]     1,584.7705 (167.24)   1,798.6337 (162.72)   1,692.8689 (166.44)   70.2911 (119.56)   1,700.0772 (165.54)   128.0212 (159.83)        5;0   0.5907 (0.01)         10           1
    # test_filter_datapoints_benchmark[filters1-test_api_pgsql_mp_generated]      1,654.1651 (174.56)   1,856.6805 (167.97)   1,733.0510 (170.39)   76.6519 (130.38)   1,711.3418 (166.64)   105.7047 (131.97)        3;0   0.5770 (0.01)         10           1
    # test_filter_datapoints_benchmark[filters1-test_sqlite_mp_generated]         1,663.0025 (175.49)   1,864.9028 (168.71)   1,770.1756 (174.04)   68.6123 (116.71)   1,763.3645 (171.70)   133.9527 (167.23)        5;0   0.5649 (0.01)         10           1
    # test_filter_datapoints_benchmark[filters1-test_pgsql_mp_generated]          1,692.6607 (178.62)   1,863.0845 (168.55)   1,809.3409 (177.89)   61.5830 (104.75)   1,836.3951 (178.81)    49.3257 (61.58)         2;2   0.5527 (0.01)         10           1
    # @ 5000:
    # Name (time in ms)                                                                  Min                   Max                  Mean              StdDev                Median                 IQR            Outliers      OPS            Rounds  Iterations
    # test_filter_datapoints_benchmark[filters1-test_api_pgsql_mp_generated]      1,480.1531 (147.95)   1,778.7158 (159.82)   1,592.0231 (148.94)   105.4753 (320.37)   1,597.7156 (149.21)   173.1558 (405.18)        3;0   0.6281 (0.01)         10           1
    # test_filter_datapoints_benchmark[filters1-test_api_sqlite_mp_generated]     1,656.0123 (165.53)   1,863.8410 (167.47)   1,755.9859 (164.28)    71.1137 (216.00)   1,755.2059 (163.92)   139.8603 (327.27)        4;0   0.5695 (0.01)         10           1
    # test_filter_datapoints_benchmark[filters1-test_pgsql_mp_generated]          1,879.7496 (187.89)   2,087.3115 (187.55)   1,997.9998 (186.92)    72.0661 (218.90)   2,005.7560 (187.32)   134.7412 (315.29)        4;0   0.5005 (0.01)         10           1
    # test_filter_datapoints_benchmark[filters1-test_sqlite_mp_generated]         1,899.9429 (189.91)   2,186.0449 (196.42)   2,035.5030 (190.43)    94.7828 (287.90)   2,058.0543 (192.20)   137.7293 (322.28)        4;0   0.4913 (0.01)         10           1
    # @ 10_000:
    # Name (time in ms)                                                                  Min                   Max                  Mean              StdDev                Median                 IQR            Outliers      OPS            Rounds  Iterations
    # test_filter_datapoints_benchmark[filters1-test_sqlite_mp_generated]         1,664.2066 (176.64)   1,861.3096 (172.89)   1,754.7241 (173.31)   69.9147 (135.18)   1,743.5030 (171.19)   136.4173 (181.76)        4;0   0.5699 (0.01)         10           1
    # test_filter_datapoints_benchmark[filters1-test_api_sqlite_mp_generated]     1,821.1961 (193.30)   1,928.2479 (179.11)   1,862.4421 (183.95)   34.6745 (67.04)    1,851.6245 (181.81)    46.5347 (62.00)         3;0   0.5369 (0.01)         10           1
    # test_filter_datapoints_benchmark[filters1-test_api_pgsql_mp_generated]      1,830.7979 (194.32)   1,941.0381 (180.30)   1,867.1940 (184.42)   34.5950 (66.89)    1,864.1469 (183.04)    52.7134 (70.24)         3;0   0.5356 (0.01)         10           1
    # test_filter_datapoints_benchmark[filters1-test_pgsql_mp_generated]          2,218.4879 (235.47)   2,364.2887 (219.61)   2,282.1817 (225.41)   37.6066 (72.71)    2,279.4920 (223.82)    14.0702 (18.75)         2;3   0.4382 (0.00)         10           1

It seems that larger page sizes also mean slower requests, although the difference is small, and the TCP round-trip time in a real production setup will probably offset some of the effect.

I've now set the default page size to 5_000 and the maximum to 10_000. I've also added these numbers as parameters to the settings object so they can be set via the env variables IXMP4_MAX_PAGE_SIZE and IXMP4_DEFAULT_PAGE_SIZE. This way we can change them if we need to without pushing an update.
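
A minimal sketch of how the two environment variables could drive the page size, assuming plain `os.environ` lookups. The helper names are hypothetical, and capping an oversized limit (rather than rejecting it) is an assumption; the actual ixmp4 settings object may behave differently.

```python
import os
from typing import Mapping, Optional


def read_page_settings(env: Mapping[str, str] = os.environ) -> tuple[int, int]:
    """Return (default, maximum) page sizes, falling back to 5_000 and 10_000."""
    default = int(env.get("IXMP4_DEFAULT_PAGE_SIZE", 5_000))
    maximum = int(env.get("IXMP4_MAX_PAGE_SIZE", 10_000))
    return default, maximum


def effective_limit(requested: Optional[int], env: Mapping[str, str] = os.environ) -> int:
    """Hypothetical helper: use the default when no limit is given, cap at the maximum."""
    default, maximum = read_page_settings(env)
    limit = default if requested is None else requested
    return min(limit, maximum)


no_limit_given = effective_limit(None, env={})
too_large = effective_limit(50_000, env={})
raised_cap = effective_limit(50_000, env={"IXMP4_MAX_PAGE_SIZE": "20000"})
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.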

@meksor meksor merged commit 5d06552 into main Jan 31, 2024
4 checks passed
@meksor meksor deleted the feature/pagination branch January 31, 2024 13:05
4 participants