Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial draft for review of NaPTAN point locations #57

Merged
merged 10 commits into from
Jul 3, 2022
Merged

Initial draft for review of NaPTAN point locations #57

merged 10 commits into from
Jul 3, 2022

Conversation

anisotropi4
Copy link
Contributor

The proposed code downloads National Passenger Transport Access Node (NaPTAN) data from the current DfT Universal Resource Identifier, filters on heavy and light rail, and ferry locations, adds the National Rail CRS (Computer Reservation System) codes for heavy rail stations and outputs the as GeoJSON and GeoPackage in EPSG:32630 projection, as well as the National Rail CRScode lookup data in a slightly odd CSV file format.

Any thoughts and comments welcome

@anisotropi4
Copy link
Contributor Author

The current build appears to fail but I'm not sure whether this is due to this push or something else

Copy link
Contributor

@Robinlovelace Robinlovelace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At around 1 MB the stations.geojson and stations.gpkg objects are a bit big. We want to keep the repo as small as possible (currently around 2-3MB I think) to make it easy to download. I suggest we either only release these datasets in the releases section where we have saved other dataset https://github.com/geocompr/py/releases or release the data there and add minimal (e.g. 100 KB max) in the commit history. Apologies but would you mind creating a new PR from a different branch so this big files do not go in the commit history? Overall big 👍 from me though and we can write some nice text around this outstanding worked example.

code/chapters/09-mapping.py Show resolved Hide resolved
code/chapters/09-mapping.py Outdated Show resolved Hide resolved
code/chapters/09-mapping.py Show resolved Hide resolved
code/scratch/index.html Show resolved Hide resolved
@Robinlovelace
Copy link
Contributor

Regarding the repo size, you may want to re-clone this repo and create a create a new branch e.g. along the lines of

gh repo clone geocompr/py
cd py
git checkout -b naptan
git remote add a1 https://github.com/anisotropi4/py.git
# make + commit changes
git push naptan a1

Not sure if that will actually work, hope so!

@Robinlovelace Robinlovelace mentioned this pull request Jun 26, 2022
@anisotropi4
Copy link
Contributor Author

Thanks for the feedback and clear steer on data.

I'll wait for any more comments and then drop the branch and resubmit this with cut-down size. This will probably involve a reduced geographic scope and sorting out some of the daft levels of precision. The GeoJSON is at something like 11 significant figures.

@anisotropi4
Copy link
Contributor Author

As I won't get to look at this until the weekend I will leave it open for comments until Friday. I will then fix the PR based on Yorkshire, and look to host the full data set somewhere else

@Robinlovelace
Copy link
Contributor

Any feedback on this @michaeldorman, @anitagraser, @Nowosad ? Feedback on the great worked example in the Python script here https://github.com/geocompr/py/pull/57/files is what I think @anisotropi4 is after, will be glad to get your thoughts on it if you get a chance. Many thanks again Will for this valuable contribution.

@michaeldorman
Copy link
Collaborator

The code works for me, and looks great! Thank you very much @anisotropi4 for the contribution, and @Robinlovelace !

One suggestion is that in the final mapping part might be better to focus on pure Python methods, because it's the book focus and because it would be beyond the scope to explain the HTML/JavaScript code in the text. Perhaps there can be few examples of .plot to show the data in static maps, and then .explore to create an interactive map?

Screenshot from 2022-06-29 11-30-36

@anisotropi4
Copy link
Contributor Author

The code works for me, and looks great! Thank you very much @anisotropi4 for the contribution, and @Robinlovelace !

One suggestion is that in the final mapping part might be better to focus on pure Python methods, because it's the book focus and because it would be beyond the scope to explain the HTML/JavaScript code in the text. Perhaps there can be few examples of .plot to show the data in static maps, and then .explore to create an interactive map?

There may be another way round this. Although it isn't something I have used it would appear that folium is a way of generating Leaflet.js interactive html files. I just need to work out how to replicate the existing interactive index.html using the library.

@michaeldorman
Copy link
Collaborator

The code works for me, and looks great! Thank you very much @anisotropi4 for the contribution, and @Robinlovelace !
One suggestion is that in the final mapping part might be better to focus on pure Python methods, because it's the book focus and because it would be beyond the scope to explain the HTML/JavaScript code in the text. Perhaps there can be few examples of .plot to show the data in static maps, and then .explore to create an interactive map?

There may be another way round this. Although it isn't something I have used it would appear that folium is a way of generating Leaflet.js interactive html files. I just need to work out how to replicate the existing interactive index.html using the library.

Sure, excellent idea!

I suggest perhaps to also demonstrate some of the extra capabilities of folium, to justify using it instead of .explore. I haven't used folium before so don't have a specific recommendation at the moment, but according to the docs seems like tile layer selection using radio buttons is something that .explore doesn't provide, so perhaps that could be an option.

@anisotropi4
Copy link
Contributor Author

One suggestion is that in the final mapping part might be better to focus on pure Python methods, because it's the book focus and because it would be beyond the scope to explain the HTML/JavaScript code in the text. Perhaps there can be few examples of .plot to show the data in static maps, and then .explore to create an interactive map?

There may be another way round this. Although it isn't something I have used it would appear that folium is a way of generating Leaflet.js interactive html files. I just need to work out how to replicate the existing interactive index.html using the library.

Sure, excellent idea!

I suggest perhaps to also demonstrate some of the extra capabilities of folium, to justify using it instead of .explore. I haven't used folium before so don't have a specific recommendation at the moment, but according to the docs seems like tile layer selection using radio buttons is something that .explore doesn't provide, so perhaps that could be an option.

My issue is I don't know what .explore does, I haven't ever used it myself and a quick search of "python .explore" didn't really help

FWIW my dev environment is typically linux command line witth emacs, a python -m http.server and chromium to hack python and JavaScript/html. Using folium to generate Leaflet.js and html is closest to what I am used to. I'd not advocate this as a reasonable or sane method of working however...

@michaeldorman
Copy link
Collaborator

One suggestion is that in the final mapping part might be better to focus on pure Python methods, because it's the book focus and because it would be beyond the scope to explain the HTML/JavaScript code in the text. Perhaps there can be few examples of .plot to show the data in static maps, and then .explore to create an interactive map?

There may be another way round this. Although it isn't something I have used it would appear that folium is a way of generating Leaflet.js interactive html files. I just need to work out how to replicate the existing interactive index.html using the library.

Sure, excellent idea!
I suggest perhaps to also demonstrate some of the extra capabilities of folium, to justify using it instead of .explore. I haven't used folium before so don't have a specific recommendation at the moment, but according to the docs seems like tile layer selection using radio buttons is something that .explore doesn't provide, so perhaps that could be an option.

My issue is I don't know what .explore does, I haven't ever used it myself and a quick search of "python .explore" didn't really help

FWIW my dev environment is typically linux command line witth emacs, a python -m http.server and chromium to hack python and JavaScript/html. Using folium to generate Leaflet.js and html is closest to what I am used to. I'd not advocate this as a reasonable or sane method of working however...

.explore is a wrapper around folium, intended for GeoDataFrame input:
https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.explore.html

@anisotropi4
Copy link
Contributor Author

Following on from the discussion this week, please find an update which is based on Yorkshire rather than GB with precision rounded to sensible values, which significantly reduces the data size. With an interactive map based on folium. This is still beta as it still needs description and a narrative placing round the content, but thoughts and comments welcome.

If acceptable I will look to provide an interactive rail-line map patch for the same geography based on OSMnx and then an interactive population density vectortile view as part of this chapter.

@Robinlovelace
Copy link
Contributor

One question: how did you remove the big files from the commit history @anisotropi4 ? I assumed they would be there for eternity! In any case as shown above I've approved it. Would like to get feedback from Michael, Anita and Jakub before merging this. One thing to note: code/chapters/09-mapping.py is an ephemeral file and will get overwritten each time we run convert.sh either locally on on Actions. Worth moving your example to somewhere more like code/transport-example.py?

@anisotropi4
Copy link
Contributor Author

One question: how did you remove the big files from the commit history @anisotropi4 ? I assumed they would be there for eternity!
The big files are dynamically generated by the script so if you make a "big file" smaller the automation just seems to clobber it. In this case I did what I said I would:

  1. Restrict the mapping area to Yorkshire. There is now a very small boundary GeoJSON file "data/yorkshire.json". Anything that falls outside this is dropped as follows:
YORKSHIRE = gpd.read_file('data/yorkshire.json').iloc[0, 0]
IDX = STATIONS.within(YORKSHIRE)
STATIONS = STATIONS[IDX]
  1. Rounded to three d.p. for WGS84 coordinates and to whole numbers for the EPSG:32630 projection in $m$ using some coding slight-of-hand:

2a. Define a function constructor:

def _set_precision(precision=0):
  """returns function that rounds a geometry to a given precision""" 
  from functools import partial
  from shapely.ops import transform

  def _precision(x, y, z=None):
    return tuple([round(i, precision) for i in [x, y, z] if i])
  return partial(transform, _precision)

2b. Create a function that rounds to the required precision:

_precision = _set_precision(0)

2c. Then .apply it to the appropriate geometry column

OUTPUT = STATIONS.copy() 
CRS = 'EPSG:32630'
OUTPUT['geometry'] = OUTPUT['geometry'].to_crs(CRS).apply(_precision)

One thing to note: code/chapters/09-mapping.py is an ephemeral file and will get overwritten each time we run convert.sh either locally on on Actions. Worth moving your example to somewhere more like code/transport-example.py?

I've created the stations-example.py file and will create a track-example.py working as a further example for LineString geometries and OSMnx once this PR is cleared.

Census population density and Polygon geometry example will appear later if required.

@michaeldorman
Copy link
Collaborator

Look great! I suggest converting to .ipynb to be in agreement with the other content, and also it will be easier to add text, execute the code, and view the output.

@Robinlovelace
Copy link
Contributor

Agree re. converting to .ipynb but suggest we can do that post merge. Also I suggest converting to .ipynb via .qmd for consistency. Ultimately at least some of this code deserves to be in the final book so we can incorporate it, into a later section of the visualisation chapter, currently chapter 9, is my current thinking.

@michaeldorman
Copy link
Collaborator

Agree re. converting to .ipynb but suggest we can do that post merge. Also I suggest converting to .ipynb via .qmd for consistency. Ultimately at least some of this code deserves to be in the final book so we can incorporate it, into a later section of the visualisation chapter, currently chapter 9, is my current thinking.

Agree, sounds good!

@anisotropi4
Copy link
Contributor Author

Bump.

As I understand that this is waiting on formal approval from @michaeldorman, @Nowosad and @anitagraser I would ask that you either reject or approve this PR.

I ask as I believe I unable to raise another PR against the repository and unless I code branch and then enter branch dependency hell, I am stuck until this is approved.

Thoughts or advice welcome.

@Nowosad Nowosad removed their request for review July 3, 2022 16:14
@Robinlovelace
Copy link
Contributor

Thanks for the bump and agree we should move on this. I think Michael has already signaled 👍 on this with this comment:

Agree, sounds good!

Will await comment from Jakub and Anita, happy for this to be merged? Please signal either way with a Review from the files tab or just a 👍 in the comment. I'm confident it's good to go but as a community project don't want to make unilateral decisions. Many thanks for the contribution, it's looking really good to me and sure it will massively benefit the book.

Copy link
Collaborator

@anitagraser anitagraser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍 but I don't have time for an in-depth review this weekend

@Nowosad
Copy link
Member

Nowosad commented Jul 3, 2022

You have my 👍🏻

@Robinlovelace
Copy link
Contributor

🎉 thanks for the quickfire responses guys

@Robinlovelace Robinlovelace merged commit ddf108b into geocompx:main Jul 3, 2022
github-actions bot pushed a commit that referenced this pull request Jul 3, 2022
Initial draft for review of NaPTAN point locations ddf108b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants