
Feature/nd arrays #67

Merged: 26 commits from tomalrussell:feature/nd-arrays into nismod:master, May 22, 2017
Conversation

tomalrussell (Member):

[Delivers #143885653]

This rewrites SpaceTimeConvertor and its tests, though the tests should next be rewritten to mock the RegionRegister/IntervalRegister, which do the work of converting; they currently fail because they still expect SpaceTimeValues.

Will need to handle more than one dimension. May also revisit the handling of coefficients (a sparse matrix multiplication approach?) instead of the current iteration through regions.

SpaceTimeConvertor assumes it gets passed a numpy.ndarray with shape num_regions x num_intervals. There is room for a more numpy-idiomatic implementation.

TODO: test assertions that all values must be known - or NaN - in a timestep x region x interval data array.
@willu47 (Member) left a comment:

Great work @tomalrussell. I've made a bunch of comments there. A few niggles, some as placeholders for future work, some optimisations and some ideas for refactoring/harmonising the interval/area conversion. If you can fix the niggles, we can discuss the other bits next week.

        return converted_data
    num_regions = len(self.regions.get_regions_in_set(to_spatial))
    num_intervals = data.shape[1]
    converted = np.empty((num_regions, num_intervals))
willu47 (Member):

Ooh, this is a good one! Beware of np.empty. I have been burnt many times before. np.empty doesn't fill the array with zeros. Use np.zeros for safety, as tracking down the bugs that this causes can be tricky!
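The difference can be seen in a minimal sketch: np.empty only allocates memory, so leftover values can masquerade as real data if any cell is missed during filling, while np.zeros gives a clean, predictable starting point.

```python
import numpy as np

# np.empty allocates without initialising: contents are whatever
# happened to be in memory, so a partially-filled array can silently
# carry garbage values.
uninitialised = np.empty((3, 2))

# np.zeros guarantees every cell starts at 0.0, so a cell that was
# never written fails loudly (as a zero) rather than subtly.
converted = np.zeros((3, 2))
converted[:, 0] = [1.0, 2.0, 3.0]  # fill only the first column

# The unfilled column is provably all zeros
assert (converted[:, 1] == 0.0).all()
```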

tomalrussell (Member, Author):

Sounds safer - see 7aec9ca shortly

    # transpose data and iterate through 2nd dimension
    for idx, region_slice in enumerate(data.transpose()):
        converted[:, idx] = self.regions.convert(region_slice, from_spatial, to_spatial)
    return converted

@willu47 (Member) commented May 17, 2017:

One optimisation here using a numpy function:

    def _convert_regions(self, data, from_spatial, to_spatial):
        """Slice, convert and compose regions
        """
        converted = np.apply_along_axis(self.regions.convert, 0, data,
                                        from_spatial, to_spatial)
        return converted
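As a quick illustration of how np.apply_along_axis behaves here, with a stand-in for regions.convert (the real method takes a data slice plus the from/to set names; this toy version just halves each value):

```python
import numpy as np

def convert(column, from_spatial, to_spatial):
    """Stand-in for regions.convert: halves each value in a 1-d slice."""
    return column / 2

data = np.arange(6, dtype=float).reshape(3, 2)  # 3 regions x 2 intervals

# Apply convert along axis 0, i.e. once per column (one column per
# interval); extra positional arguments are forwarded to the function.
converted = np.apply_along_axis(convert, 0, data, 'from_set', 'to_set')

print(converted)  # every value halved, shape preserved
```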

tomalrussell (Author):

Great! b20aeb2 deals with this and the below

    Returns
    -------
    bool

    def _convert_intervals(self, data, from_temporal, to_temporal):
@willu47 (Member) commented May 17, 2017:

And the contents of this can be replaced with:

    def _convert_intervals(self, data, from_temporal, to_temporal):
        """Slice, convert and compose intervals
        """
        converted = np.apply_along_axis(self.intervals.convert, 1, data,
                                        from_temporal, to_temporal)
        return converted

tomalrussell (Author):

See above

    for from_region_name, from_value in zip(from_set_names, data):
        for to_region_name, coef in coefficents[from_region_name]:
            to_region_idx = to_set_names.index(to_region_name)
            converted[to_region_idx] += coef*from_value
willu47 (Member):

This is a sum product operation over coefficients (which include the mapping of the regions implicit in from_values and converted) and values. E.g. if values has length 10 (from_set_name has 10 regions) and to_set_name has 5 regions, coefficients is a sparse 10x5 matrix and you want a sum product giving a result of length 5.

e.g.

    converted = np.dot(from_value, coef)
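A toy version of that sum product, assuming the coefficients are held as a dense from-by-to matrix (10 source regions mapping onto 5 target regions, two sources nesting in each target):

```python
import numpy as np

from_values = np.ones(10)  # data for 10 source regions

# coef[i, j] is the share of source region i falling inside target
# region j; each row sums to 1 (here: regions 2k and 2k+1 -> target k)
coef = np.zeros((10, 5))
for i in range(10):
    coef[i, i // 2] = 1.0

# One matrix operation replaces the nested Python loops
converted = np.dot(from_values, coef)  # length-5 result
print(converted)  # → [2. 2. 2. 2. 2.]
```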

    @@ -63,12 +64,12 @@ class RegionRegister(object):
         between data values relating to compatible sets of regions.
         """
         def __init__(self):
    -        self._register = {}
    +        self._register = OrderedDict()
             self._conversions = defaultdict(dict)
willu47 (Member):

It may be more efficient to store a dict of ndarrays (generated by _conversion_coefficients)

tomalrussell (Author):

Yes, would be more space-efficient not to hold onto the regions - though this is a bigger change.

Currently we hold on to all the region sets while building the conversion register so that adding a region triggers the generation of the conversions to/from the new region and each previously-registered region.

Could change the API so we have to register each conversion with knowledge of the pair of region sets? Or could introduce a builder-type step that holds onto all regions while setting up, and outputs the more parsimonious representation that the converter needs.
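One hypothetical shape for that builder-type step (class and method names invented for illustration; the real register's coefficient generation is not reproduced here, only the output structure: a dict of ndarrays keyed by set pairs):

```python
import numpy as np
from collections import OrderedDict

class ConversionRegisterBuilder:
    """Hypothetical builder: holds the full region sets while they are
    being registered, then emits only the coefficient matrices that
    the converter actually needs."""

    def __init__(self):
        self._sets = OrderedDict()  # set name -> list of regions

    def register(self, name, regions):
        self._sets[name] = regions

    def build(self):
        # The real register would compute conversion coefficients here;
        # this sketch only shows the parsimonious output: a dict keyed
        # by (from_set, to_set) holding a from-by-to ndarray.
        conversions = {}
        for from_name, from_regions in self._sets.items():
            for to_name, to_regions in self._sets.items():
                if from_name != to_name:
                    conversions[(from_name, to_name)] = np.zeros(
                        (len(from_regions), len(to_regions)))
        return conversions

builder = ConversionRegisterBuilder()
builder.register('rect', ['zero'])
builder.register('half_squares', ['a', 'b'])
coefficients = builder.build()
print(coefficients[('rect', 'half_squares')].shape)  # (1, 2)
```

Once build() has run, the region geometries themselves can be discarded.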

    ]

    -    if any([len(results) < 2 for results in model_set_results]):
    +    if any([len(results) < 2 for results in self.iterated_results.values()]):
willu47 (Member):

timestep argument isn't used

tomalrussell (Author):

Sorted in 0076375


    -    data[year][param] = SpaceTimeValue(region, interval, value, units)
    +    data[param] = np.empty((num_timesteps, num_intervals, num_regions))
willu47 (Member):

Safer to use np.zeros (really), particularly when filling of arrays is deferred...

tomalrussell (Author):

Fixed, along with below, in e68d60b

    if len(timestep_names) == 0:
        self.logger.error("No timesteps found when loading %s", param)

    data = np.empty((
willu47 (Member):

np.zeros

    -    expected = {'a': 0.5, 'b': 0.5}
    -    assert converted == expected
    +    expected = np.ones(2) / 2
    +    assert all(converted == expected)
willu47 (Member):

numpy.testing contains helper methods `assert_equal` and `assert_allclose` which handle array comparison
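A minimal sketch of those numpy.testing helpers applied to arrays like the ones in this test:

```python
import numpy as np
from numpy.testing import assert_equal, assert_allclose

converted = np.ones(2) / 2
expected = np.array([0.5, 0.5])

# Fails with a readable message showing the mismatching elements,
# instead of relying on the truthiness of an element-wise comparison
assert_equal(converted, expected)

# assert_allclose tolerates floating-point rounding differences,
# which exact equality would flag as failures
assert_allclose(converted, expected + 1e-12, rtol=1e-7)
```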

tomalrussell (Author):

That's clearer - changed in b60e16c, and below


    def test_convert_from_half(self, regions_rect, regions_half_squares):
        rreg = RegionRegister()
        rreg.register(regions_rect)
        rreg.register(regions_half_squares)

    -    data = {'a': 0.5, 'b': 0.5}
    +    data = np.ones(2) / 2
willu47 (Member):

np.array([0.5, 0.5]) would be clearer!

    @staticmethod
    def _regionalise_data(data):
        """
    def convert(self, data, from_spatial, to_spatial, from_temporal, to_temporal):
willu47 (Member):

(read after comments below) Thinking through optimisations/harmonisations in this module, the conversions in area and intervals could both be brought to this module and performed as a 2-dim matrix operation. This requires coefficient matrices for region and intervals to be generated either together or independently (probably independently and then pre-combined at run-time, as not every combination of regional and temporal conversion will ever be required). Probably a candidate for memoization.
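A sketch of that combined 2-dim matrix operation, assuming dense coefficient matrices R (source regions by target regions) and T (source intervals by target intervals) have already been generated independently:

```python
import numpy as np

# data: 4 source regions x 6 source intervals, all ones for clarity
data = np.ones((4, 6))

# Region coefficients: 4 source regions aggregate pairwise to 2 targets
R = np.repeat(np.eye(2), 2, axis=0)  # shape (4, 2)
# Interval coefficients: 6 source intervals aggregate pairwise to 3 targets
T = np.repeat(np.eye(3), 2, axis=0)  # shape (6, 3)

# Both conversions in one pass: project regions (R.T on the left)
# and intervals (T on the right) simultaneously
converted = R.T @ data @ T  # shape (2, 3)
print(converted)  # each cell sums 2 regions x 2 intervals -> 4.0
```

Since the same (R, T) pair recurs for every model run with the same set combination, the pre-combined product is indeed a natural candidate for memoization.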

@willu47 (Member) commented May 22, 2017:

Tests currently failing due to a problem with v5.2 of pyomo and <= v4.6 of GLPK

@coveralls (Coverage Status):

Coverage decreased (-0.5%) to 95.692% when pulling c343020 on tomalrussell:feature/nd-arrays into 1a88693 on nismod:master.

@coveralls (Coverage Status):

Coverage decreased (-0.1%) to 96.062% when pulling d2d57be on tomalrussell:feature/nd-arrays into 1a88693 on nismod:master.

Commit note: instead of materialising the list with a comprehension - see https://www.quantifiedcode.com/app/issue_class/53lnAzfW for suggestion and justification.
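That commit note refers to dropping the square brackets inside any(): a generator expression lets any() short-circuit at the first True without building an intermediate list.

```python
# Example data standing in for iterated results keyed by model set
iterated_results = {'water': [1], 'energy': [1, 2]}

# List comprehension: the whole list is built before any() inspects it
not_converged = any([len(r) < 2 for r in iterated_results.values()])

# Generator expression: any() stops at the first True and no
# intermediate list is materialised
not_converged_gen = any(len(r) < 2 for r in iterated_results.values())

assert not_converged == not_converged_gen  # same answer, less work
```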
@tomalrussell tomalrussell merged commit 7cfc7a8 into nismod:master May 22, 2017
@tomalrussell tomalrussell deleted the feature/nd-arrays branch May 23, 2017 09:18
3 participants