Include optimization parameter #94

glatterf42 · 2024-06-28T09:25:57Z

This is the final PR bringing all changes regarding optimization.Parameter together. Originally, the goal was to have this PR be very quick because all precursors had been reviewed, but this didn't happen. After merging #82, I had to rework all these branches because for some reason, the propagation of changes upon rebasing did not happen as I expected. During that reworking phase, some errors were introduced and I didn't want to fix these errors individually on all precursor PRs, which I why I collapsed all of them to this one PR, that is now quite large (again), sorry for that.

To spare you the trip to the old PRs, let me summarize some discussions here:

There is a DeprecationWarning arising from the httpx code, it seems. When I discovered it, I thought it should be fixed by starlette 0.37.2, which was allowed by fastapi 0.111.0, but this doesn't seem to be the case. I'm not entirely sure why or how urgent it is to remove this warning.
ixmp.__init__ gains some imports, which @danielhuppmann did not like too much. We can potentially remove all except for Platform and Datapoint, but this is probably going into subsequent removal-PRs (see also Clean up top-level namespace #84).
Table and Parameter are very similar, as will be Variable and Equation. For every single layer (but @danielhuppmann asked specifically for the Core layer first), these could inherit from each other (or be composed elegantly) to avoid repetition of very similar or identical code. I agree that this is a very useful step and I already have an idea for the Core layer. I would like to implement this pretty much right away, but in a new PR, to keep it manageable.

Working on the transport tutorial, I noticed that I had completely forgotten about adding parameters = db.relationship() in the Run DB-table. I have even run migrations for these other PRs, but no error came up. I'm wondering if we even need these relationship() declarations at all.

glatterf42 · 2024-08-30T11:32:21Z

Let me quickly explain why I raised the minimum pandas version here: since we're using pandas for the upsert-functionality of parameter.add_data(), the workflow is like this:

        # Gather all indexsets/columns
        index_list = [column.name for column in parameter.columns]
        # Convert existing data to DataFrame
        existing_data = pd.DataFrame(parameter.data)
        # Set the index to be constructed as each "key", i.e. row of indexset-values
        # This allows quick comparison of keys for existence and only updates what we actually want to update, i.e. values and units.
        # Can't do that for empty dataframes, though (i.e. first addition of data to a parameter)
        if not existing_data.empty:
            existing_data.set_index(index_list, inplace=True)
        # Upsert the new data
        parameter.data = (
            data.set_index(index_list).combine_first(existing_data).reset_index()
        ).to_dict(orient="list")

I don't see any issue with that code, but pandas < 2.1.0 does: if existing_data is empty, we can't set an index on it. But if the indices of the two dataframes in combine_first() don't at least overlap, pandas doesn't know how to combine them.
In our case, this should not be an issue, though: existing data will always have the same multiindex as new data, we validate all data to ensure precisely this. The one exception, of course, is if existing_data is empty, but in that case, pandas should just take all values from the new data and don't bother about upserting -- which it does, but only as of 2.1.0.
This version was released a year ago, so I'm fine with bumping our minimum version. If you have a better idea of how to resolve this, I'm happy to hear it, too :)

(Going to make exactly the same commit on #97 and #98, too.)

meksor

Alright so this lgtm from a code point of view. I noticed Im starting to have trouble discerning all the boilerplate code from the important stuff.
I hope I can get to some bigger refactoring soon, the tests now take very long to run and its probably smart to encapsulate some stuff/move it to scse-toolkit instead of testing it for each data type (im thinking filters and docs maybe).

glatterf42 · 2024-09-25T10:16:24Z

Thanks for the review :)
I agree that a lot of code seem to be boilerplate, which is one of the reasons I opened #96. So I'm happy to help with some refactoring, if I can :)

I would likely merge this PR then, if you change your review to "approve", as @danielhuppmann doesn't seem to have enough time to review it. No one will immediately start using it, anyway, so I think the most time-efficient version for @danielhuppmann is to review #101 and/or #108, which showcase most of the added syntax. If this is not to our liking, we can still introduce clean-up PRs before merging the tutorials to main.
Besides, I've tried to make the syntax consistent with what we already have, so I think it's unlikely that there are major problems hiding in there.

Finally, @meksor, may I recommend reviewing #97 and #98 soon, too, if time permits? Both PRs are similar to this one and even more so to one another, so doing it soon would keep you from having to chew through all the boilerplate again as you probably still remember the parts that are important.

* Covers: * run__id, data, name, uniqueness of name together with run__id * Adapts tests since default order of columns changes

* Make Column generic enough for multiple parents * Introduce optimization.Parameter * Add tests for add_data * Enable remaining parameter tests (#86) * Enable remaining parameter tests * Include optimization parameter api layer (#89) * Bump several dependency versions * Let api/column handle both tables and parameters * Make api-layer tests pass * Include optimization parameter core layer (#90) * Enable parameter core layer and test it * Fix things after rebase * Ensure all intended changes survive the rebase * Adapt data validation function for parameters * Allow tests to pass again

glatterf42 added the enhancement New feature or request label Jun 28, 2024

glatterf42 requested review from meksor and danielhuppmann June 28, 2024 09:25

glatterf42 self-assigned this Jun 28, 2024

glatterf42 mentioned this pull request Jul 1, 2024

DRY optimization core layer #96

Open

glatterf42 force-pushed the include/optimization-parameter branch from 77d9fc8 to 0debc9f Compare August 22, 2024 13:47

meksor reviewed Sep 19, 2024

View reviewed changes

glatterf42 and others added 19 commits September 30, 2024 10:40

Make creation information a mixin

5499b87

Make name column a mixin

a2a63ff

Make optimization columns mixins

90b7105

* Covers: * run__id, data, name, uniqueness of name together with run__id * Adapts tests since default order of columns changes

Inherit mixin requirements directly

7b8e389

Fix references to DB filters in docs

02e3f43

Streamline naming in tests

33d186e

Fix and test parameter list and tabulate for specific runs

6910d9e

Include Run-side of relationship

247b06c

Make indexset-creation a test utility

2d02e4f

Make new tests more efficient

87dd645

Incorporate changes from #110

94b337e

Use pandas for updated add_data behaviour

8e737c6

Remove superfluous session.add() for parameter

52a07ed

Use core layer in core test

5b488ab

Raise minimum pandas version to enable add_data upsert

ce5ff14

Generalize UsageError for more optimization items

b41eb28

Use generalized UsageError for Table

f160e9c

Use own errors for Parameter

a9e5ac8

glatterf42 force-pushed the include/optimization-parameter branch from acc783f to a9e5ac8 Compare September 30, 2024 09:17

meksor approved these changes Oct 2, 2024

View reviewed changes

glatterf42 merged commit 769a198 into main Oct 3, 2024
10 checks passed

glatterf42 deleted the include/optimization-parameter branch October 3, 2024 08:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Include optimization parameter #94

Include optimization parameter #94

glatterf42 commented Jun 28, 2024 •

edited

Loading

glatterf42 commented Aug 30, 2024

meksor left a comment

glatterf42 commented Sep 25, 2024

Include optimization parameter #94

Include optimization parameter #94

Conversation

glatterf42 commented Jun 28, 2024 • edited Loading

glatterf42 commented Aug 30, 2024

meksor left a comment

Choose a reason for hiding this comment

glatterf42 commented Sep 25, 2024

glatterf42 commented Jun 28, 2024 •

edited

Loading