Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processing] Handle case of folder of shapefiles in build vrt alg #10083

Closed
wants to merge 2 commits into from

Conversation

@volaya
Copy link
Contributor

commented May 23, 2019

#fixes 21519

Description

If a shapefile (i.e /my/data/points.shp) was loaded by loading a directory of shapefiles, its source wont be the filepath, but instead a uri-like thing such as /my/data|layername=points.

Although that file can be read by the external tool that creates the virtual vector, that source string won't work. I added a patch to try to resolve the uri-like source to a regular filepath.

It's just a patch for this tool, but this might affect other tools as well (basically, all the ones with external apps, like SAGA, etc), which wont work with shpafiles loaded as part of a folder.

Ideally, there should be a method to resolve the source to a valid filepath if possible, but I couldnt find it. I am making the PR in any case, to get some discussion about this.

I will be happy to improve this if someone points me in the direction of such method of workflow, in case it exists.

Checklist

Reviewing is a process done by project maintainers, mostly on a volunteer basis. We try to keep the overhead as small as possible and appreciate if you help us to do so by completing the following items. Feel free to ask in a comment if you have troubles with any of them.

  • Commit messages are descriptive and explain the rationale for changes
  • Commits which fix bugs include fixes #11111 in the commit message next to the description
  • Commits which add new features are tagged with [FEATURE] in the commit message
  • Commits which change the UI or existing user workflows are tagged with [needs-docs] in the commit message and contain sufficient information in the commit message to be documented
  • I have read the QGIS Coding Standards and this PR complies with them
  • This PR passes all existing unit tests (test results will be reported by travis-ci after opening this PR)
  • New unit tests have been added for core changes
  • I have run the scripts/prepare-commit.sh script before each commit

@volaya volaya requested a review from nyalldawson May 23, 2019

#might be a shapefile loaded as a directory of shapefiles. We try to resolve the uri to a filepath
sourceParts = QgsProviderRegistry.instance().decodeUri('ogr', inFile)
if sourceParts.get("layerName"):
inFile = os.path.join(sourceParts.get("path"), sourceParts.get("layerName") + ".shp")

This comment has been minimized.

Copy link
@nyalldawson

nyalldawson May 23, 2019

Contributor

Won't this break if a geopackage layer (or similar) was loaded? In that case the layerName parameter will be present, but the file name generated here will be broken.

This is really quite tricky to handle -- I think we'd need a check in here to first see if sourceParts['path'] is actually a directory (and if not, there's nothing required). Then, if it IS a directory, we'd need to check inside that path for files which start with the layername and have known vector data format extensions (because the same situation may arise for say KML files), and use that to determine the actual name of the file. Or alternatively, maybe we could use OGR api to try to load the file using just the path and layername, and let it run whatever internal logic it has to resolve the layername to a file, and then retrieve this actual file name from the OGR layer...

This comment has been minimized.

Copy link
@volaya

volaya May 24, 2019

Author Contributor

It wont actually break anything, since gpkgs dont work now. Any layer with a source that is not an actual filepath cannot be used with that algorithm at the moment. However, I agree that the trick i added is not very general and only fixes the case of shapefiles.

I improved it a bit and put it in a separate function in the processing.tools.vector module. From there, it can be moved to C++ if someone takes care of that.

Still, some cases are not handled. It works now for gpkgs, but as long as the layer is the only one in the gpkg file.

In short, it's an improvement over the previous status of the algorithm, but it wont work with all input layers, like other Processing algorithms do. Other algorithms might be affected of this issue, so it's worth checking and maybe adding a similar mechanism.

@nyalldawson

This comment has been minimized.

Copy link
Contributor

commented May 23, 2019

Ideally, there should be a method to resolve the source to a valid filepath if possible, but I couldnt find it. I am making the PR in any case, to get some discussion about this.

Agreed - we need a central method for handling this. Maybe in QgsOgrUtils.cpp?

@volaya volaya force-pushed the volaya:vrt_folder_shapefiles branch from 3d0b8cc to 8ac0f90 May 24, 2019

@volaya volaya force-pushed the volaya:vrt_folder_shapefiles branch from 8ac0f90 to b41598f May 24, 2019

@stale

This comment has been minimized.

Copy link

commented Jun 7, 2019

The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular

    • link to any issues which this pull request fixes

    • add a description of workflows which this pull request fixes

    • add screenshots if applicable

  • that you have written unit tests where possible
    In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.
    If there is no further activity on this pull request, it will be closed in a week.

@stale stale bot added the stale label Jun 7, 2019

@nyalldawson nyalldawson removed the stale label Jun 11, 2019

@nyalldawson

This comment has been minimized.

Copy link
Contributor

commented Jun 11, 2019

@volaya I've asked here https://lists.osgeo.org/pipermail/gdal-dev/2019-June/050380.html to see if there's a cleaner way we can approach this

@volaya

This comment has been minimized.

Copy link
Contributor Author

commented Jun 12, 2019

cool. Good idea. Let's wait to see what they say

@m-kuhn

This comment has been minimized.

Copy link
Member

commented Jun 22, 2019

Feedback @rouault

My question is -- GetName() here returns just the file base name (i.e.
"lines" for "/path/lines.shp"). Is there anyway to retrieve the actual
layer file path for these?

No, for most formats, there is no direct mapping from a layer to file(s)
containing layer data, so there is nothing in the OGR API for that. So you
have to use the knowledge that this is opened by the shapefile or mapinfo +
that the datasource name is a directory, and then you can form a filename from
that and the layer name.

if os.path.exists(folder):
if os.path.isdir(folder):
for ext in QgsVectorFileWriter.supportedFormatExtensions():
f = os.path.join(folder, "%s.%s" % (layerName, ext))

This comment has been minimized.

Copy link
@m-kuhn

m-kuhn Jun 22, 2019

Member
Suggested change
f = os.path.join(folder, "%s.%s" % (layerName, ext))
f = os.path.join(folder, "{basename}.{extension}".format(basename=layerName, extension=ext))

Just because ;)

@stale

This comment has been minimized.

Copy link

commented Jul 6, 2019

The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular

    • link to any issues which this pull request fixes

    • add a description of workflows which this pull request fixes

    • add screenshots if applicable

  • that you have written unit tests where possible
    In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.
    If there is no further activity on this pull request, it will be closed in a week.

@stale stale bot added the stale label Jul 6, 2019

@stale

This comment has been minimized.

Copy link

commented Jul 13, 2019

While we hate to see this happen, this PR has been automatically closed because it has not had any activity in the last 21 days. If this pull request should be reconsidered, please follow the guidelines in the previous comment and reopen this pull request. Or, if you have any further questions, just ask! We love to help, and if there's anything the QGIS project can do to help push this PR forward please let us know how we can assist.

@stale stale bot closed this Jul 13, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.