Add aggregated read functionality #87

znicholls · 2019-02-01T06:51:26Z

Pull request

Please confirm that this pull request has done the following:

Tests added
Discussed tests added with co-authors
Documentation added (where applicable)
Example added (either to an existing notebook or as a new notebook, where applicable) (N/A as low level)
Description in CHANGELOG.rst added
make black

Adding to CHANGELOG.rst

Please add a single line in the changelog notes similar to one of the following:

- (`#XX <http://link-to-pr.com>`_) Added feature which does something
- (`#XX <http://link-to-pr.com>`_) Fixed bug identified in (`#XX <http://link-to-issue.com>`_)

znicholls · 2019-02-05T04:24:49Z

@swillner close #92 and #86 before looking at this (builds on both those PRs)

openscm/parameter_views.py

znicholls · 2019-02-09T00:10:28Z

@swillner are you happy with the tests that have been added?

swillner · 2019-02-09T00:16:30Z

> @swillner are you happy with the tests that have been added?

Very! ;)

openscm/parameter_views.py

Co-Authored-By: znicholls <zebedee.nicholls@climate-energy-college.org>

znicholls · 2019-02-09T00:45:06Z

@swillner this is good to go from my end

swillner

I am afraid there are some issues with this approach:

* Why do we need to catch NameError?
* The unit converter converts between this parameter's unit and the requested one, so the unit of a child parameter might be different (same for the timeframe)
* Once we attempt an aggregated view, shall we allow added children?
* Minor: Return type is missing
* Minor: We should probably make this a protected ("_"-prefixed) method
* Major: Rather than summing and then caching the sum in `self._parameter._data`, I would collect views to the children's values and sum them on `get`. This comes with the penalty of always doing the sum for each `get`, even if there have been no changes, but the conversion is done and changes in the child parameters are reflected on a new `get`, as is the idea of the parameter views.
* The recursion can be much simplified, similar to:

    def _sum_child_data(self) -> float:
        """
        Sum child data.

        Returns
        -------
        float
            Sum of all child data
        """
        if not self._parameter._children:
            # Leaf parameter: return its own data
            return self._parameter._data

        data = 0
        for cp in self._parameter._children.values():
            data += self._unit_converter.convert_from(cp._sum_child_data()) # TODO fix unit conversion
        return data

(similar for timeseries).
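
The "collect views and sum on `get`" idea from the major point above could look roughly like the following. This is only a sketch under stated assumptions: `Parameter`, `collect_leaves` and `AggregatingView` are hypothetical stand-ins, not the classes in `openscm/parameter_views.py`, and unit and timeframe conversion are left out entirely.

```python
from typing import Dict, List


class Parameter:
    """Hypothetical stand-in for a node in the parameter tree."""

    def __init__(self, name: str, data: float = 0.0) -> None:
        self.name = name
        self.data = data
        self.children: Dict[str, "Parameter"] = {}

    def add_child(self, name: str, data: float = 0.0) -> "Parameter":
        child = Parameter(name, data)
        self.children[name] = child
        return child


def collect_leaves(parameter: Parameter) -> List[Parameter]:
    """Recursively collect the leaf parameters below ``parameter``."""
    if not parameter.children:
        return [parameter]
    leaves: List[Parameter] = []
    for child in parameter.children.values():
        leaves.extend(collect_leaves(child))
    return leaves


class AggregatingView:
    """Read-only view that sums the leaf values on every ``get``."""

    def __init__(self, parameter: Parameter) -> None:
        # The leaves are collected once; their values are read lazily in get().
        self._leaves = collect_leaves(parameter)

    def get(self) -> float:
        # Nothing is cached, so changes in the children show up on the next get.
        return sum(leaf.data for leaf in self._leaves)


world = Parameter("World")
north = world.add_child("North", 1.0)
south = world.add_child("South", 2.0)
south.add_child("East", 3.0)  # South has a child, so its own data is ignored

view = AggregatingView(world)
first = view.get()   # sums North and South/East
north.data = 10.0    # simulate a new run writing to a child
second = view.get()  # reflects the updated child value
```

The trade-off is the one named in the comment: every `get` repeats the sum, but the view never returns stale data.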

znicholls · 2019-02-09T06:03:02Z

* Why do we need to catch NameError?

We don't; it's just another way to write these loops, where you initialise in the first iteration and then add in later ones. These are all equivalent, I think:

from timeit import default_timer as timer

def sum_iterable(nums):
    data = 0  # you have to know the type in advance
    for j, a in enumerate(nums):
        data += a

    return data

start = timer()
numbers = [1, 3, 5]
for i in range(10000):
    res = sum_iterable(numbers)

print("result = {}".format(res))
end = timer()
print("{:.2f}ms".format((end - start) * 1000))

~7ms

from timeit import default_timer as timer

def sum_iterable(nums):
    for j, a in enumerate(nums):
        data = a if j == 0 else data + a

    return data

start = timer()
numbers = [1, 3, 5]
for i in range(10000):
    res = sum_iterable(numbers)

print("result = {}".format(res))
end = timer()
print("{:.2f}ms".format((end - start) * 1000))

~8ms

from timeit import default_timer as timer

def sum_iterable(nums):
    for a in nums:
        try:
            data += a
        except NameError:
            data = a

    return data

start = timer()
numbers = [1, 3, 5]
for i in range(10000):
    res = sum_iterable(numbers)

print("result = {}".format(res))
end = timer()
print("{:.2f}ms".format((end - start) * 1000))

~10.23ms
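
For a slightly more robust comparison, the same three variants can be run through `timeit.repeat`, which keeps the `print` call out of the timed region and takes the best of several repeats. This is a sketch only: the function names are made up here, and the numbers will differ by machine.

```python
import timeit


def sum_init_zero(nums):
    data = 0  # initialise before the loop
    for a in nums:
        data += a
    return data


def sum_first_iteration(nums):
    for j, a in enumerate(nums):
        data = a if j == 0 else data + a  # initialise on the first iteration
    return data


def sum_catch_name_error(nums):
    for a in nums:
        try:
            data += a
        except NameError:  # UnboundLocalError is a subclass of NameError
            data = a
    return data


numbers = [1, 3, 5]
variants = (sum_init_zero, sum_first_iteration, sum_catch_name_error)
results = {f.__name__: f(numbers) for f in variants}

for f in variants:
    # Best of 5 repeats of 10000 calls each, reported in milliseconds
    best = min(timeit.repeat(lambda: f(numbers), number=10000, repeat=5))
    print("{}: {:.2f}ms".format(f.__name__, best * 1000))
```

Note that only the first variant handles an empty iterable; the other two raise because `data` is never bound.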

Looking at the above, I'll move to initialising with zero: numpy arrays can be added to scalars without issue, so initialising with zero shouldn't be a problem.
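
A quick check of that claim, as a sketch (the helper `sum_values` is hypothetical and only illustrates numpy broadcasting, not the code in this PR):

```python
import numpy as np


def sum_values(values):
    """Sum scalars or equally shaped arrays, starting from a scalar zero."""
    total = 0
    for value in values:
        # 0 + array broadcasts to an array, so the accumulator
        # does not need to know the data type in advance
        total = total + value
    return total


scalar_total = sum_values([1.0, 3.0, 5.0])
array_total = sum_values([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
```

The same accumulator therefore works for scalar parameters and for timeseries stored as arrays.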

znicholls · 2019-02-09T06:05:10Z

* The unit converter converts between this parameter's unit and the _requested one_, so the one of the child parameter might be different (same for the timeframe)

good pick up, will change

* Once we attempt an aggregated view, shall we allow added children?

Thinking about it more, not sure. Yes means more flexibility but you can get confused as the same view will give different answers depending on when it's called.

* Minor: Return type is missing

Yep

* Minor: We should probably make this a protected ("_"-prefixed) method

Yep

* Major: Rather than summing and then caching the sum in `self._parameter._data` I would collect views to the children's values and sum them on `get` (comes with the penalty of always doing the sum for each get, even if there have been no changes, but the conversion is done and changes in the child parameters reflect on a new `get` as is the idea of the parameters views)

nice
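
To illustrate the unit point above (each child may be stored in a unit that differs from both the parent's and the requested one), here is a minimal sketch. The conversion table and the `convert` and `aggregate` helpers are assumptions made for this example; they are not OpenSCM's unit converter.

```python
# Hypothetical conversion factors to a common base unit (kg)
TO_BASE = {"kg": 1.0, "g": 1e-3, "t": 1e3}


def convert(value: float, source: str, target: str) -> float:
    """Convert ``value`` from ``source`` unit to ``target`` unit."""
    return value * TO_BASE[source] / TO_BASE[target]


def aggregate(children, target_unit: str) -> float:
    """Convert each child from its own unit before summing."""
    return sum(convert(value, unit, target_unit) for value, unit in children)


# Each child carries its own unit
children = [(500.0, "g"), (2.0, "kg"), (0.001, "t")]

total_kg = aggregate(children, "kg")
total_g = aggregate(children, "g")
```

Using a single converter built from the parent's unit would silently treat every child value as if it were in that unit, which is the problem raised in the review.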

znicholls · 2019-02-09T09:19:02Z

Alright let's have another go. I think the only thing I didn't implement from your suggestions is this.

> Once we attempt an aggregated view, shall we allow added children?

Thinking about it more, not sure. Yes means more flexibility but you can get confused as the same view will give different answers depending on when it's called. Maybe that's the behaviour we want, how much do we want to 'lock the low level'/do views normally imply the data doesn't change?

swillner · 2019-02-09T09:44:45Z

still having a lot of comments. mind if i make a pr to this one later?

znicholls · 2019-02-09T09:47:14Z

> still having a lot of comments. mind if i make a pr to this one later?

yep, no rush

swillner · 2019-02-10T17:51:17Z

> Thinking about it more, not sure. Yes means more flexibility but you can get confused as the same view will give different answers depending on when it's called. Maybe that's the behaviour we want, how much do we want to 'lock the low level'/do views normally imply the data doesn't change?

Well, the point of the views is that they yield the most up-to-date data for each read via `get`, so the values should be able to change, especially when new data is written to them (e.g. for a new run). But adding a new child is something different...

swillner · 2019-02-10T18:23:36Z

Just saw that I already disallow adding child parameters once a parameter has been read from, which makes sense for non-aggregating reads, so we can just apply the same rationale here.
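
The rationale can be sketched as follows. All names here (`Parameter`, `ParameterLockedError`, `has_been_read`, `read_aggregated`) are hypothetical and chosen for illustration; the actual attribute and exception names in the code base may differ.

```python
class ParameterLockedError(Exception):
    """Hypothetical error raised when a child is added after a read."""


class Parameter:
    """Hypothetical parameter node that locks its children once read."""

    def __init__(self, name: str, data: float = 0.0) -> None:
        self.name = name
        self.data = data
        self.children = {}
        self.has_been_read = False

    def add_child(self, name: str, data: float = 0.0) -> "Parameter":
        if self.has_been_read:
            raise ParameterLockedError(
                "cannot add child '{}' to '{}' after it has been read".format(
                    name, self.name
                )
            )
        child = Parameter(name, data)
        self.children[name] = child
        return child

    def read_aggregated(self) -> float:
        # Any read, aggregated or not, locks the set of children
        self.has_been_read = True
        if not self.children:
            return self.data
        return sum(child.read_aggregated() for child in self.children.values())


world = Parameter("World")
world.add_child("North", 1.0)
world.add_child("South", 2.0)
total = world.read_aggregated()

try:
    world.add_child("East", 3.0)
    locked = False
except ParameterLockedError:
    locked = True
```

Values can still change between reads; only the structure of the tree is frozen, so the same view cannot start aggregating over a different set of children.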

znicholls · 2019-02-12T21:34:19Z

@swillner should be good to go now

swillner · 2019-02-13T13:04:44Z

nice!

swillner added the wip Work in progress (for PRs) label Feb 1, 2019

swillner changed the title ~~WIP: Add failing test~~ Add failing test Feb 1, 2019

znicholls changed the title ~~Add failing test~~ Simplify parameterset reading Feb 1, 2019

znicholls changed the title ~~Simplify parameterset reading~~ Add aggregated read functionality Feb 3, 2019

swillner self-assigned this Feb 4, 2019

znicholls force-pushed the simplify-parameterset-reading branch from 6912a63 to 57350fa Compare February 4, 2019 23:45

znicholls force-pushed the add-aggregated-read-functionality branch from 3a5e69b to 809a20f Compare February 5, 2019 04:24

znicholls mentioned this pull request Feb 5, 2019

Add scenario module #89

Closed

4 tasks

znicholls force-pushed the simplify-parameterset-reading branch from 57350fa to 7ba5b20 Compare February 5, 2019 11:21

znicholls commented Feb 5, 2019

openscm/parameter_views.py (outdated)

swillner mentioned this pull request Feb 8, 2019

Parameter views: Aggregation over subregions #69

Open

znicholls force-pushed the simplify-parameterset-reading branch from 7ba5b20 to 390f48c Compare February 8, 2019 23:01

znicholls changed the base branch from simplify-parameterset-reading to master February 9, 2019 00:03

znicholls added 5 commits February 9, 2019 11:08

Update CHANGELOG

ae91767

Add failing test

844fadb

Pass aggregated read basic test

a368091

Update CHANGELOG

81d8bfe

Expand test coverage

3685de1

znicholls force-pushed the add-aggregated-read-functionality branch from 809a20f to 3685de1 Compare February 9, 2019 00:09

swillner reviewed Feb 9, 2019

openscm/parameter_views.py (outdated)

znicholls and others added 3 commits February 9, 2019 11:21

Add failing test of scalar view aggregation

1f34adb

Update openscm/parameter_views.py

69d8279

Co-Authored-By: znicholls <zebedee.nicholls@climate-energy-college.org>

Pass test of scalar view aggregation

4c389cc

znicholls force-pushed the add-aggregated-read-functionality branch from aaf9569 to 4c389cc Compare February 9, 2019 00:37

znicholls added 2 commits February 9, 2019 11:44

Refactor to reduce duplication

3797d84

Make black

818bb13

Remove spurious comment

a722ab2

swillner requested changes Feb 9, 2019

znicholls added 2 commits February 9, 2019 17:19

Add failing test of unit source and target storage

f6792f6

Pass test of unit storage

8d2dba8

Update after @swillner's comments

403f7f2

znicholls force-pushed the add-aggregated-read-functionality branch from b5710c8 to 403f7f2 Compare February 9, 2019 09:20

swillner and others added 3 commits February 13, 2019 08:27

Rewrite aggregation to collect list of children views

9ed8bc2

Rename _get_child_data_views and make local

4acb682

Add one more test and make black

861ba69

znicholls removed the wip Work in progress (for PRs) label Feb 12, 2019

Tweak one docstring

3d1b991

znicholls added the enhancement New feature or request label Feb 13, 2019

swillner approved these changes Feb 13, 2019

swillner merged commit 3783280 into master Feb 13, 2019

swillner deleted the add-aggregated-read-functionality branch February 13, 2019 13:11
