Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix runtime warnings #94

Merged
merged 9 commits into from Sep 15, 2021
Merged

Fix runtime warnings #94

merged 9 commits into from Sep 15, 2021

Conversation

mfitz
Copy link
Contributor

@mfitz mfitz commented Sep 14, 2021

Closes #93

Fixes a few hundred warnings - still a few hanging around.

I also had to bump pytest-related versions to make the parallel test stuff work on my local machine, and modified the number of processes so that it is automatically matched to the number of available processors reported by the machine.

I had a brief look at the xfail test, but couldn't see a quick fix for it.

Before

Screen Shot 2021-09-14 at 23 59 43

After

Screen Shot 2021-09-15 at 00 00 15

@mfitz mfitz requested a review from KasiaKoz September 14, 2021 23:08
@@ -270,6 +272,18 @@ def build_attribute_dataframe(iterator, keys: Union[list, str], index_name: str
return df


def get_pandas_dtype(dict):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed - this feels pretty hacky, but it gets the job done and only broke a single test in the end. I'm still hoping for a more elegant solution, but at least we know this does work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine. I wonder where we should put this method, there is another one here: https://github.com/arup-group/genet/blob/master/genet/utils/dict_support.py#L115 which is also in theme of working with pandas data, and dict_support is not really the best place for it... should we maybe start a new module in utils.. ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also wondered if this was the best place for it. Perhaps a new pandas-utils module in utils is the thing?

@@ -40,7 +40,7 @@ numpy>=1.18.4
osmnx==0.15.0
osmread @ git+https://github.com/dezhin/osmread.git@d8d3fe5edd15fdab9526ea7a100ee6c796315663#egg=osmread-0.2.dev0
packaging==20.4
pandas>=1.0.3
pandas==1.3.3
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upgrading pandas scared me slightly, but it didn't break any unit tests or notebooks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to >= quite recently, I wanted it to play nicely with MC (but don't need to in the end because SCOP went a different direction). I prefer the fixed versions I think

@@ -54,8 +54,8 @@ py>=1.8.1
Pygments==2.7.4
pyparsing==2.4.7
pyproj>=3.1.0
pytest>=5.4.2
pytest-cov>=2.8.1
pytest==6.2.5
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to upgrade these to make the parallel test stuff work on my machine.

'agency_id': {0: float('nan')}, 'route_desc': {0: float('nan')}, 'route_url': {0: float('nan')},
actual_stops = gtfs['stops'].to_dict()
assert_semantically_equal(expected_stops, actual_stops)
expected_routes = {'route_id': {0: 'service'}, 'route_short_name': {0: 'name_2'}, 'route_long_name': {0: ''},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strangely, I only had to replace NaN with None in expected routes, and not the other GTFS-related dictionaries. Not sure why this is the case, and did not investigate. Can you think of a reason, and does this seem like a problem?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, weird! I would have thought that None would happen when you have a mixture of empty and non empty values in a column, but that's not the case here. I don't think it should make a difference though, pandas considers None and NaN empty, so if you do fillna both None and NaN values will be considered

@@ -2,7 +2,7 @@
<!DOCTYPE network SYSTEM "http://www.matsim.org/files/dtd/network_v2.dtd">
<network>
<attributes>
<attribute name="crs" class="java.lang.String">{'init': 'epsg:27700'}</attribute>
<attribute name="crs" class="java.lang.String">epsg:27700</attribute>
Copy link
Contributor Author

@mfitz mfitz Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't sure of the implications of having to make this change to a test network - could this mean existing networks will be broken now? I used this branch of genet for the network simplification step in a Matesto smoke test pipeline, and nothing fell over, but that doesn't feel like comprehensive reassurance. Presumably any network written by an older version of GeNet (or Puma?) will include the CRS attribute in the deprecated style?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will break anything actually, lol
GeNet just writes the projection to xml, doesnt use it (user has to specify the projection for the network always), it does use simplified tag though

if 'simplified' not in n.graph.graph:
which gets read from xml here:
elif elem.attrib['name'] == 'simplified':

I think it would be cool to try read the projection from xml, but we've been doing fine with people specifying each time so I don't feel a strong pull towards it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay. Yes, reading the projection from XML seems like a nice-to-have at best.

Copy link
Collaborator

@KasiaKoz KasiaKoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ the new warning numbers
And loving that we no longer have this dictionary 'init' BS all around, thanks @mfitz !

Comment on lines -148 to +147
gdf_geometries.crs = self.epsg
gdf_geometries = gpd.GeoDataFrame(self.link_attribute_data_under_keys(['geometry']), crs=self.epsg)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

@@ -213,7 +214,7 @@ def get_attribute_data_under_key(iterator: Iterable, key: Union[str, dict]):
:param iterator: list or iterator yielding (index, attribute_dictionary)
:param key: either a string e.g. 'modes', or if accessing nested information, a dictionary
e.g. {'attributes': {'osm:way:name': 'text'}}
:return: dictionary where keys are indicies and values are data stored under the key
:return: dictionary where keys are indices and values are data stored under the key
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always do that typo, always

@@ -270,6 +272,18 @@ def build_attribute_dataframe(iterator, keys: Union[list, str], index_name: str
return df


def get_pandas_dtype(dict):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine. I wonder where we should put this method, there is another one here: https://github.com/arup-group/genet/blob/master/genet/utils/dict_support.py#L115 which is also in theme of working with pandas data, and dict_support is not really the best place for it... should we maybe start a new module in utils.. ?

@@ -40,7 +40,7 @@ numpy>=1.18.4
osmnx==0.15.0
osmread @ git+https://github.com/dezhin/osmread.git@d8d3fe5edd15fdab9526ea7a100ee6c796315663#egg=osmread-0.2.dev0
packaging==20.4
pandas>=1.0.3
pandas==1.3.3
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to >= quite recently, I wanted it to play nicely with MC (but don't need to in the end because SCOP went a different direction). I prefer the fixed versions I think

'agency_id': {0: float('nan')}, 'route_desc': {0: float('nan')}, 'route_url': {0: float('nan')},
actual_stops = gtfs['stops'].to_dict()
assert_semantically_equal(expected_stops, actual_stops)
expected_routes = {'route_id': {0: 'service'}, 'route_short_name': {0: 'name_2'}, 'route_long_name': {0: ''},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, weird! I would have thought that None would happen when you have a mixture of empty and non empty values in a column, but that's not the case here. I don't think it should make a difference though, pandas considers None and NaN empty, so if you do fillna both None and NaN values will be considered

@@ -2,7 +2,7 @@
<!DOCTYPE network SYSTEM "http://www.matsim.org/files/dtd/network_v2.dtd">
<network>
<attributes>
<attribute name="crs" class="java.lang.String">{'init': 'epsg:27700'}</attribute>
<attribute name="crs" class="java.lang.String">epsg:27700</attribute>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will break anything actually, lol
GeNet just writes the projection to xml, doesnt use it (user has to specify the projection for the network always), it does use simplified tag though

if 'simplified' not in n.graph.graph:
which gets read from xml here:
elif elem.attrib['name'] == 'simplified':

I think it would be cool to try read the projection from xml, but we've been doing fine with people specifying each time so I don't feel a strong pull towards it

@mfitz mfitz merged commit cd0286f into master Sep 15, 2021
@mfitz mfitz deleted the fix_runtime_warnings branch September 15, 2021 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GeNet produces many warnings
2 participants