Fix runtime warnings #94

mfitz · 2021-09-14T23:08:22Z

Closes #93

Fixes a few hundred warnings; a few are still hanging around.

I also had to bump the pytest-related versions to get the parallel test stuff working on my local machine, and changed the number of processes so that it automatically matches the number of available processors reported by the machine.

I had a brief look at the xfail test, but couldn't see a quick fix for it.

Before

After

mfitz · 2021-09-14T23:09:45Z

genet/utils/graph_operations.py

@@ -270,6 +272,18 @@ def build_attribute_dataframe(iterator, keys: Union[list, str], index_name: str
    return df


+def get_pandas_dtype(dict):


As discussed - this feels pretty hacky, but it gets the job done and only broke a single test in the end. I'm still hoping for a more elegant solution, but at least we know this does work.

KasiaKoz · 2021-09-15T11:07:41Z

I think it's fine. I wonder where we should put this method. There is another one here: https://github.com/arup-group/genet/blob/master/genet/utils/dict_support.py#L115 which is also about working with pandas data, and dict_support is not really the best place for it. Should we maybe start a new module in utils?

mfitz · 2021-09-15

Yes, I also wondered if this was the best place for it. Perhaps a new pandas-utils module in utils is the thing?
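For readers following the thread, here is a rough sketch of the kind of Python-to-pandas type mapping being discussed. It is not the `get_pandas_dtype` implementation from this PR, whose body is not shown in the diff above. The name `guess_pandas_dtype` and the mapping table are assumptions made purely for illustration.

```python
import pandas as pd

# Hypothetical mapping from Python types to pandas dtype names (an assumption,
# not taken from the PR).
_PYTHON_TO_PANDAS = {bool: "bool", int: "int64", float: "float64", str: "object"}


def guess_pandas_dtype(values: dict) -> str:
    """Pick an explicit dtype for a Series built from the values of `values`."""
    types = {type(v) for v in values.values()}
    if len(types) == 1:
        # All values share one type: look it up, falling back to object.
        return _PYTHON_TO_PANDAS.get(types.pop(), "object")
    if types and types <= {int, float}:
        # A mix of ints and floats can be held as floats.
        return "float64"
    # Empty or mixed data: object is the safe default.
    return "object"


row = {"a": 1, "b": 2}
series = pd.Series(row, dtype=guess_pandas_dtype(row))

# Passing a dtype explicitly also covers the empty case, which is where
# pandas 1.x warned about Series created without one.
empty = pd.Series({}, dtype=guess_pandas_dtype({}))
```

Falling back to `object` for empty or mixed data keeps the helper conservative, at the cost of not using a tighter dtype where one might exist.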

mfitz · 2021-09-14T23:10:15Z

requirements.txt

@@ -40,7 +40,7 @@ numpy>=1.18.4
 osmnx==0.15.0
 osmread @ git+https://github.com/dezhin/osmread.git@d8d3fe5edd15fdab9526ea7a100ee6c796315663#egg=osmread-0.2.dev0
 packaging==20.4
-pandas>=1.0.3
+pandas==1.3.3


Upgrading pandas scared me slightly, but it didn't break any unit tests or notebooks.

KasiaKoz · 2021-09-15T11:09:19Z

I changed it to >= quite recently because I wanted it to play nicely with MC (but I don't need that in the end, because SCOP went in a different direction). I think I prefer the fixed versions.

mfitz · 2021-09-14T23:10:38Z

requirements.txt

@@ -54,8 +54,8 @@ py>=1.8.1
 Pygments==2.7.4
 pyparsing==2.4.7
 pyproj>=3.1.0
-pytest>=5.4.2
-pytest-cov>=2.8.1
+pytest==6.2.5


Had to upgrade these to make the parallel test stuff work on my machine.
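A minimal sketch of the worker-count matching mentioned in the PR description, assuming the parallel test plugin is pytest-xdist (the thread only says "parallel test stuff"). pytest-xdist accepts `-n auto` to start one worker per available CPU. The helper name `xdist_args` is hypothetical and is not part of this repository.

```python
import os
from typing import List, Optional


def xdist_args(workers: Optional[int] = None) -> List[str]:
    """Build pytest-xdist command-line arguments for the worker count.

    With no explicit count, defer to xdist's own CPU detection via "auto".
    """
    if workers is None:
        return ["-n", "auto"]
    # Clamp an explicit request to the range [1, CPUs reported by the machine].
    available = os.cpu_count() or 1
    return ["-n", str(max(1, min(workers, available)))]


default_args = xdist_args()
single_worker_args = xdist_args(1)
```

With pytest and pytest-xdist installed, the resulting list could be passed along as something like `pytest.main(xdist_args() + ["tests"])`.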

mfitz · 2021-09-14T23:12:17Z

tests/test_core_schedule.py

-         'agency_id': {0: float('nan')}, 'route_desc': {0: float('nan')}, 'route_url': {0: float('nan')},
+    actual_stops = gtfs['stops'].to_dict()
+    assert_semantically_equal(expected_stops, actual_stops)
+    expected_routes = {'route_id': {0: 'service'}, 'route_short_name': {0: 'name_2'}, 'route_long_name': {0: ''},


Strangely, I only had to replace NaN with None in expected routes, and not the other GTFS-related dictionaries. Not sure why this is the case, and did not investigate. Can you think of a reason, and does this seem like a problem?

KasiaKoz · 2021-09-15T11:15:32Z

Huh, weird! I would have thought that None would appear when you have a mixture of empty and non-empty values in a column, but that's not the case here. I don't think it should make a difference though: pandas treats both None and NaN as missing, so fillna will fill both.
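A small illustration of the point about None and NaN, using made-up data rather than the GTFS tables from the test. It only shows that pandas treats both as missing in an object column. It does not explain why only the routes table changed.

```python
import pandas as pd

# Two object columns that differ only in how the missing value is spelled.
with_nan = pd.Series([float("nan"), "x"], dtype=object)
with_none = pd.Series([None, "x"], dtype=object)

# isna() flags both spellings as missing.
nan_mask = with_nan.isna().tolist()
none_mask = with_none.isna().tolist()

# fillna() replaces both spellings in the same way.
filled_nan = with_nan.fillna("").tolist()
filled_none = with_none.fillna("").tolist()

# to_dict() hands back the stored objects unchanged, so a literal dictionary
# comparison can still tell None and NaN apart.
none_as_dict = with_none.to_dict()
```

This would fit with the expected dictionary in the test needing an update even though downstream pandas operations should behave the same way.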

mfitz · 2021-09-14T23:25:47Z

tests/test_data/matsim/network.xml

@@ -2,7 +2,7 @@
 <!DOCTYPE network SYSTEM "http://www.matsim.org/files/dtd/network_v2.dtd">
 <network>
 	<attributes>
-		<attribute name="crs" class="java.lang.String">{'init': 'epsg:27700'}</attribute>
+		<attribute name="crs" class="java.lang.String">epsg:27700</attribute>


Wasn't sure of the implications of having to make this change to a test network - could this mean existing networks will be broken now? I used this branch of genet for the network simplification step in a Matesto smoke test pipeline, and nothing fell over, but that doesn't feel like comprehensive reassurance. Presumably any network written by an older version of GeNet (or Puma?) will include the CRS attribute in the deprecated style?

KasiaKoz · 2021-09-15T11:19:51Z

I don't think this will break anything actually, lol.
GeNet just writes the projection to XML and doesn't use it (the user always has to specify the projection for the network). It does use the simplified tag though:

genet/genet/inputs_handler/read.py

Line 47 in 909e034

if 'simplified' not in n.graph.graph:

which gets read from xml here:

genet/genet/inputs_handler/matsim_reader.py

Line 161 in 909e034

elif elem.attrib['name'] == 'simplified':

I think it would be cool to try to read the projection from XML, but we've been doing fine with people specifying it each time, so I don't feel a strong pull towards it.

mfitz · 2021-09-15

Ah, okay. Yes, reading the projection from XML seems like a nice-to-have at best.
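If reading the projection from XML were ever picked up, one possible approach would be to accept both the old dictionary-style value and the plain string. The sketch below is an assumption for illustration only. GeNet does not do this today according to the thread, and `normalise_crs` and `read_crs` are hypothetical names.

```python
import ast
import xml.etree.ElementTree as ET
from typing import Optional

# Sample network fragment using the older, dictionary-style CRS attribute.
SAMPLE = """<network>
    <attributes>
        <attribute name="crs" class="java.lang.String">{'init': 'epsg:27700'}</attribute>
    </attributes>
</network>"""


def normalise_crs(text: str) -> str:
    """Return a plain 'epsg:NNNN' string for either attribute style."""
    text = text.strip()
    if text.startswith("{"):
        # Older style: a Python dict literal such as {'init': 'epsg:27700'}.
        legacy = ast.literal_eval(text)
        return str(legacy["init"])
    return text


def read_crs(xml_text: str) -> Optional[str]:
    """Find the 'crs' network attribute, if present, and normalise it."""
    root = ET.fromstring(xml_text)
    for elem in root.iter("attribute"):
        if elem.attrib.get("name") == "crs":
            return normalise_crs(elem.text or "")
    return None


legacy_crs = read_crs(SAMPLE)
```

Using `ast.literal_eval` rather than `eval` means only Python literals are accepted from the file.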


KasiaKoz

❤️ the new warning numbers
And loving that we no longer have this dictionary 'init' BS all around, thanks @mfitz !

KasiaKoz · 2021-09-15T11:03:24Z

genet/core.py

-        gdf_geometries.crs = self.epsg
+        gdf_geometries = gpd.GeoDataFrame(self.link_attribute_data_under_keys(['geometry']), crs=self.epsg)


KasiaKoz · 2021-09-15T11:04:45Z

genet/utils/graph_operations.py

@@ -213,7 +214,7 @@ def get_attribute_data_under_key(iterator: Iterable, key: Union[str, dict]):
    :param iterator: list or iterator yielding (index, attribute_dictionary)
    :param key: either a string e.g. 'modes', or if accessing nested information, a dictionary
        e.g. {'attributes': {'osm:way:name': 'text'}}
-    :return: dictionary where keys are indicies and values are data stored under the key
+    :return: dictionary where keys are indices and values are data stored under the key


I always make that typo, always.


mfitz requested a review from KasiaKoz September 14, 2021 23:08

mfitz added 6 commits September 15, 2021 00:28

Fix some deprecation warnings

f1bac53

Bump pytest plugin versions and make pytest-dist automatically discover the number of CPUs available

579f96f

Fix pyproj deprecation warnings

62f3617

Fix pandas warning about creating Series objects without an explicit dtype

aef7250

Add unit tests for python to pandas data type mapping function

1cda48b

Fix a unit test name

0a22663

mfitz force-pushed the fix_runtime_warnings branch from 6913c27 to 0a22663 Compare September 14, 2021 23:29

KasiaKoz approved these changes Sep 15, 2021

mfitz and others added 3 commits September 15, 2021 17:13

Refactor pandas helper functions into a dedicated module

5cf46a7

Fix linting errors

d407bd2

Merge branch 'master' into fix_runtime_warnings

fa25255

mfitz merged commit cd0286f into master Sep 15, 2021

mfitz deleted the fix_runtime_warnings branch September 15, 2021 16:44
