Show extended run-"index" from `Platform.meta.tabulate()` #33

danielhuppmann · 2023-11-23T14:20:26Z

While working further on the pyam-ixmp4 integration, I noticed an inconsistency:

Platform.runs.tabulate() returns a dataframe with the columns run-id, model, scenario, version
Platform.iamc.tabulate() returns a dataframe with (only) the "user-readable index" model, scenario, version, ...
Platform.meta.tabulate() returns a dataframe with (only) the run-id, ...

In actual work (like the pyam integration), the model-scenario-version index is more relevant than the run-id.

So this PR extends Platform.meta.tabulate() to return the "extended index" of model name, scenario name and version number.

As a utility, the PR also adds a map() endpoint to the model and scenario repositories for easy translation of ids to names.
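To illustrate the idea of such a map() helper, here is a minimal sketch. The class `InMemoryNameRepository` and the column names `id` and `name` are hypothetical stand-ins, not the actual ixmp4 repository classes; the sketch only assumes that a repository can tabulate its entries as a dataframe.

```python
import pandas as pd


class InMemoryNameRepository:
    """Hypothetical stand-in for a model or scenario repository."""

    def __init__(self, names: list[str]) -> None:
        self._names = list(names)

    def tabulate(self) -> pd.DataFrame:
        # Assumed layout: one row per entry with an integer id and a name.
        ids = list(range(1, len(self._names) + 1))
        return pd.DataFrame({"id": ids, "name": self._names})

    def map(self) -> dict[int, str]:
        # Translate ids to names by zipping the two columns of the table.
        df = self.tabulate()
        return dict(zip(df["id"].tolist(), df["name"].tolist()))


models = InMemoryNameRepository(["model-a", "model-b"])
model_names = models.map()
```

With this, `model_names[2]` looks up the name that belongs to id 2 without a further query.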

If there is a need for having (only) the run-id at the facade/core-layer, I'd be happy to implement that as a kwarg to both iamc/meta tabulate methods (and if someone has a good suggestion for the name of that argument...)

danielhuppmann · 2023-11-23T14:23:26Z

The implementation follows the approach in Platform.runs.tabulate(), where only the ids are queried from the backend and expanded in the core layer. I guess that joining the tables in the ORM layer would be a viable alternative, but I'm not sure whether there is a performance difference...
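A rough sketch of expanding the ids in the core layer with a pandas merge is shown here. The column names (`run__id`, `key`, `value`) and the helper `extend_index` are assumptions made for illustration and may not match the actual tables.

```python
import pandas as pd

# Assumed shape of the runs table (ids already expanded to names).
runs = pd.DataFrame(
    {
        "id": [1, 2],
        "model": ["model-a", "model-b"],
        "scenario": ["scen-x", "scen-y"],
        "version": [1, 1],
    }
)

# Assumed shape of the meta table as returned by the backend (run ids only).
meta = pd.DataFrame(
    {
        "run__id": [1, 1, 2],
        "key": ["category", "year", "category"],
        "value": ["C1", "2050", "C3"],
    }
)


def extend_index(meta: pd.DataFrame, runs: pd.DataFrame) -> pd.DataFrame:
    """Replace the run id with model, scenario and version (hypothetical helper)."""
    merged = meta.merge(runs, how="left", left_on="run__id", right_on="id")
    return merged[["model", "scenario", "version", "key", "value"]]


extended = extend_index(meta, runs)
```

The left merge keeps one row per meta indicator, so the result has as many rows as the original meta table.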

meksor · 2023-11-28T13:09:17Z

I think there is a strong argument to be made to do this (optionally) in the database layer. Performance-wise there is definitely a difference: for big tables I expect faster execution and less memory usage in the database.

There is already similar functionality in the DataPointRepository when supplying join_parameters:

ixmp4/ixmp4/data/db/iamc/datapoint/repository.py

Lines 159 to 173 in 6eead51

    
    def select(
        self,
        *,
        join_parameters: bool | None = False,
        join_runs: bool = False,
        _filter: DataPointFilter | None = None,
        **kwargs: Any
    ) -> db.sql.Select:
        exc = (
            self.select_joined_parameters(join_runs)
            if join_parameters
            else select(self.bundle)
        )
        return super().select(_exc=exc, _filter=_filter, **kwargs)

More detail in select_joined_parameters.
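For comparison, an optional join in the database layer could look roughly like the following sketch. It uses the standard-library sqlite3 module instead of the SQLAlchemy setup quoted above; the table layout, the column names and the `join_run_index` flag are all assumptions made for illustration, not the ixmp4 schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scenario (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE run (
        id INTEGER PRIMARY KEY, model_id INTEGER, scenario_id INTEGER, version INTEGER
    );
    CREATE TABLE runmeta (run_id INTEGER, meta_key TEXT, meta_value TEXT);
    INSERT INTO model VALUES (1, 'model-a'), (2, 'model-b');
    INSERT INTO scenario VALUES (1, 'scen-x');
    INSERT INTO run VALUES (1, 1, 1, 1), (2, 2, 1, 1);
    INSERT INTO runmeta VALUES (1, 'category', 'C1'), (2, 'category', 'C3');
    """
)


def select_meta(conn: sqlite3.Connection, join_run_index: bool = False) -> list[tuple]:
    """Return meta rows, optionally joined with the run index (hypothetical flag)."""
    if join_run_index:
        # The database resolves run ids to model, scenario and version.
        sql = """
            SELECT model.name, scenario.name, run.version,
                   runmeta.meta_key, runmeta.meta_value
            FROM runmeta
            JOIN run ON run.id = runmeta.run_id
            JOIN model ON model.id = run.model_id
            JOIN scenario ON scenario.id = run.scenario_id
            ORDER BY runmeta.rowid
        """
    else:
        # Default: return only the run id, as before.
        sql = "SELECT run_id, meta_key, meta_value FROM runmeta ORDER BY rowid"
    return conn.execute(sql).fetchall()


joined = select_meta(conn, join_run_index=True)
plain = select_meta(conn)
```

Keeping the join behind a flag mirrors the join_parameters switch above: callers that only need the run-id do not pay for the join.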

Side note:
Currently there doesn't seem to be a need to implement the map method in both the api and db layers. You can just leave it in the abstract base class, since it's the same code...

danielhuppmann · 2023-11-28T13:17:58Z

> I think there is a strong argument to be made to do this (optionally) in the database layer. Performance-wise there is definitely a difference: for big tables I expect faster execution and less memory usage in the database.

Thanks for the feedback. Just for my understanding, does the faster database execution outweigh the downside of sending a table twice the size via the REST API?

meksor · 2023-11-28T13:24:36Z

That's a good question. My guess: yes, since gzip will hopefully take care of compressing away any repeating patterns, and sending a separate request for the list of all runs also presents some overhead.
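The gzip part of this guess can be checked with a small experiment like the one below. This is only a sketch with made-up data and a JSON payload; it is not a benchmark of the actual REST API, and the row layout is an assumption.

```python
import gzip
import json

# Made-up runs: 100 runs spread over a few repeating model and scenario names.
runs = {i: (f"model-{i % 5}", f"scenario-{i % 20}", 1) for i in range(1, 101)}

# Ten meta rows per run, once with the run id only ...
ids_only = [
    {"run_id": run_id, "key": "category", "value": "C1"}
    for run_id in runs
    for _ in range(10)
]
# ... and once with the extended index instead of the id.
extended = [
    {"model": m, "scenario": s, "version": v, "key": "category", "value": "C1"}
    for (m, s, v) in runs.values()
    for _ in range(10)
]


def sizes(rows: list[dict]) -> tuple[int, int]:
    """Return the raw and the gzip-compressed payload size in bytes."""
    raw = json.dumps(rows).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


ids_raw, ids_gz = sizes(ids_only)
ext_raw, ext_gz = sizes(extended)
print(f"ids only: {ids_raw} bytes raw, {ids_gz} bytes gzipped")
print(f"extended: {ext_raw} bytes raw, {ext_gz} bytes gzipped")
```

Comparing the two printed lines shows how much of the extra size of the extended table survives compression for this particular made-up dataset; real data may behave differently.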

danielhuppmann · 2024-01-04T14:42:54Z

closing in favor of #37

danielhuppmann added 2 commits November 23, 2023 14:18

Add map method to model- and scenario backend

e8b2d49

Show extended run-"index" from Platform.meta.tabulate()

f1aef5c

danielhuppmann requested review from meksor, phackstock and glatterf42 November 23, 2023 14:20

danielhuppmann self-assigned this Nov 23, 2023

danielhuppmann added 4 commits November 23, 2023 15:28

Add map methods to core

6bad2c9

Allow kwargs-filters in RunRepository list/tabulate

1ae15dd

Use direct merge with runs-table

da143a0

Fix style errors

ad72706

danielhuppmann marked this pull request as ready for review November 27, 2023 09:57

danielhuppmann mentioned this pull request Dec 11, 2023

Implement map() method for model and scenario repositories #34

Merged

danielhuppmann mentioned this pull request Jan 4, 2024

Show extended run-"index" from Platform.meta.tabulate() #37

Merged

danielhuppmann closed this Jan 4, 2024
