Clarifications on the use of `base` on the datapackage root and its descriptors #232

Closed
vitorbaptista opened this Issue Nov 26, 2015 · 3 comments

@vitorbaptista
Contributor

Taking this file as an example:

{
    "base": "http://someplace.com/foo",
    "resources": [
        {
            "path": "file1.json"
        },
        {
            "path": "file2.json",
            "base": "http://otherplace.com/foo"
        },
        {
            "path": "file3.json",
            "base": "/bar"
        }
    ]
}

Say that this datapackage was loaded from the local folder /home/vitor/datasets/foo. From what I understood from the specification, the paths of the resources to be tried will be:

file1.json

  1. /home/vitor/datasets/foo/file1.json
  2. http://someplace.com/foo/file1.json

file2.json

  1. /home/vitor/datasets/foo/file2.json
  2. http://otherplace.com/foo/file2.json

file3.json

  1. /home/vitor/datasets/foo/file3.json
  2. /bar/file3.json

Is that correct? Should file2.json also be looked up at http://someplace.com/foo/file2.json? Can base be a relative local path, or MUST it be a URL?

The spec says:

Of course, the path attribute may still be used for Data Packages located online (in this case it determines the relative URL) in combination with the optional base property if it is defined.

What if it's not defined (like with file1.json)? What happens?

All in all, what I'm trying to understand is which paths I should look in for a file, depending on which attributes it has defined.
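To make the question concrete, the reading described above can be sketched in Python. This is only an illustration of that interpretation, not behaviour confirmed by the specification: the function names `candidate_locations` and `join` are hypothetical, and the fallback from a resource's own `base` to the root `base` is an assumption taken from the example.

```python
def join(base, path):
    """Join a base (local folder or URL) and a relative path with one slash."""
    return base.rstrip("/") + "/" + path


def candidate_locations(descriptor, loaded_from):
    """Return, per resource path, the locations to try in order.

    Assumed reading of the spec (not confirmed by it):
      1. the folder the datapackage was loaded from, then
      2. the resource's own `base`, falling back to the root `base`.
    """
    candidates = {}
    for resource in descriptor.get("resources", []):
        path = resource["path"]
        # A resource-level base is assumed to override the root-level one.
        base = resource.get("base", descriptor.get("base"))
        locations = [join(loaded_from, path)]
        if base is not None:
            locations.append(join(base, path))
        candidates[path] = locations
    return candidates


descriptor = {
    "base": "http://someplace.com/foo",
    "resources": [
        {"path": "file1.json"},
        {"path": "file2.json", "base": "http://otherplace.com/foo"},
        {"path": "file3.json", "base": "/bar"},
    ],
}

for path, locations in candidate_locations(
    descriptor, "/home/vitor/datasets/foo"
).items():
    print(path, locations)
```

Under this reading file2.json is never looked up under http://someplace.com/foo, which is exactly the point the question asks to have clarified.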

@vitorbaptista vitorbaptista referenced this issue in frictionlessdata/datapackage-py Nov 30, 2015
Closed

Load remote datapackages #11

@rufuspollock
Contributor

I'm wondering if we should just remove base and force explicit URLs. base resolution seems an additional complexity for consumers and is rarely used by providers (??)

@vitorbaptista @pwalsh @paulfitz @morty @danfowler any thoughts?

@pwalsh
Member
pwalsh commented Dec 18, 2015

I'm a +1 on removal on basis of unnecessary complexity.

@vitorbaptista
Contributor

👍

@pwalsh pwalsh referenced this issue in frictionlessdata/datapackage-py Dec 30, 2015
Closed

Take into account base path for local resources #26

@danfowler danfowler added a commit to frictionlessdata/schemas that referenced this issue Jul 28, 2016
@danfowler danfowler Remove "base" property on Data Package fcd4f06
@rufuspollock rufuspollock added a commit to rufuspollock/fd-specs that referenced this issue Nov 28, 2016
@rufuspollock rufuspollock [dp,!][m]: merge resource property url into path - fixes #250.
Major, breaking change. Major justification is simplification.

In addition to basic change have also addressed a security concern by
introducing limitations on path (no / or ../).

Simplicity.

Logic for this spelled out in detail in the github issue thread. Summary:

At the moment we have path and url. I originally had this to make it super easy
for tool implementors (no lists of web protocols to match against: `http://`,
`https://`, `ftp://`, etc).

At the same time it adds cognitive complexity to the spec and for publishers
and confusion about whether one could use both e.g. #223 #232.

Whilst this change increases demand on consumers to parse out urls from simple
paths, this is relatively straightforward, and consumers could not rely on url
vs path being used correctly anyway.
2aab215
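
The consumer-side work the commit message describes (telling URLs apart from simple paths, and refusing paths that use / or ../) could look roughly like the sketch below. It is an illustration only: the function name `classify_path`, the scheme list, and the reading of "no / or ../" as "no leading slash and no parent-directory segment" are assumptions, not wording from the spec.

```python
from urllib.parse import urlparse

# Assumed list of web protocols; the commit message only names
# http://, https://, ftp:// and "etc".
URL_SCHEMES = ("http", "https", "ftp")


def classify_path(path):
    """Return "url" or "local" for a resource `path` value.

    Raises ValueError for local paths that are absolute or that climb
    out of the package folder (assumed reading of "no / or ../").
    """
    scheme = urlparse(path).scheme.lower()
    if scheme in URL_SCHEMES:
        return "url"
    if path.startswith("/") or ".." in path.split("/"):
        raise ValueError("unsafe local path: %r" % path)
    return "local"


print(classify_path("http://someplace.com/foo/file1.json"))
print(classify_path("data/file1.json"))
```

A consumer would then fetch "url" values directly and resolve "local" values against the folder or URL the datapackage itself was loaded from.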