Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#41 table relative paths #48

Merged
merged 18 commits into from
Jan 24, 2017

Conversation

dumyan
Copy link
Contributor

@dumyan dumyan commented Jan 16, 2017


Resolves #41 by adding basePath for remote datapackages and after that using url.resolve in Resources class to point to the relative resource.

@dumyan dumyan self-assigned this Jan 16, 2017
@dumyan dumyan requested a review from roll January 16, 2017 16:48
@dumyan dumyan added the review label Jan 16, 2017
})

it('URL resource', async () => {
const descriptor = 'https://dev.keitaro.info/dpkjs/datapackage.json'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is temporary, we will change it to github once this branch is merged.

Copy link
Member

@roll roll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍 I've added some comments.

path: resourcePath,
}, basePath)

assert(resource.source === path.normalize(`${basePath}/${resourcePath}`))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you change to resource.source === PATH_STRING - we need to be sure it returns exactly the same result we expect. If there is an error in "path.normalize(${basePath}/${resourcePath})" than we just replicate it in tests. But we need to test correctness of this code also.

path: resourcePath,
}, baseURL)

assert(resource.source === url.resolve(baseURL, resourcePath))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above I suppose better will be write resource.source === EXPECTED_URL.

return url.resolve(this._basePath, source)
}

return path.normalize(`${this._basePath}/${source}`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we better use something like path join instead of string interpolation to make it cross-platform.

@@ -71,12 +72,19 @@ export default class Resource {
* @returns {String|Array|Object}
*/
get source() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function logic seems a little bit complicated - may be it's possible to make it more straightforward using comments and groups? We also should analyze it for vulnerabilities (accessing e.g. /etc/passwd using relative paths) so this code should be pretty clear to understand.

For example:

// Inline data
if (this.type === 'inline') return source;

// Local data
...

// Remote data
...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for relative paths - it should be addresed on the spec level first. It already has a restriction of usage of dot-paths, but it is not enough (forbids usage only in the start of that path)

Copy link
Member

@roll roll Jan 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fak3
One of the spec WG meetings has decided to ban any traverses outside of datapackage.json basedir for local paths (no ^/ and ..). So we consider it as a rule just waiting official finalization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fak3 @roll

both comments have been addressed. Path checking searches for ^/, and occurences of .. or .\. etc (backslash is being ignored when travesting dirs, so it's first taken out of the path).

@roll
Copy link
Member

roll commented Jan 18, 2017

@dumyan
One other thing - I suppose we need much more tests for path resolutions. Could you write a few additional tests (for resource and datapackage if needed) with comprehensive list of combinations (basePath/path) to cover all edge cases? It could look like this for example:

it('resource source/type correctly works with base path', async () => {
  const paths = [
    {base: 'folder1', path: 'data/data.csv', source: 'folder1/data/data.csv', type: 'local'},
    // ... like 10 more)
  ]
  paths.forEach({base, path, source, type} => {
    const resource = new Resource({path}, base)
    assert(resource.source === source)
    assert(resource.type === type)
  })
})

it('resource source/type correctly works without base path', async () => {

The same technique could be used for other tests to ensure Resource.source/type works correctly on all possible input data.

Copy link
Member

@roll roll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dumyan
I've reviewed latest changes.

self._basePath = basePath || self._getBasePath(descriptor)

// Check if basePath is valid and throw error if needed
const basePathErrors = Utils.checkPath(basePath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to always raise on bad base path. valid/errors/raiseInvalid is only for descriptor validation.

@@ -73,7 +73,7 @@ export default class Profiles {

const validation = tv4.validateMultiple(data, schema)
if (validation.valid) {
return true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return True is correct. See validate - frictionlessdata/software-legacy#2

@@ -46,15 +45,14 @@ export default class Resource {
*
* @returns {String}
*/
get type() {
get typeOfResourcePath() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow frictionlessdata/software-legacy#2 (Resource.type for now)


return this.descriptor[this._sourceKey]
} else if (this._sourceKey === 'path' && this._basePath === '') {
// Local path, no basePath provided
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this.__sourceKey === 'path' means that a local path:

  get _sourceKey() {
    const inlineData = this.descriptor.data

    if (inlineData) {
      return 'data'
    }

    return 'path'
  }

Based on this code remote path also satisfies this condition.


// we need to check the validity of the paths here because one can use
// only the Resource class to read file contents with `table`
if (this._validPaths) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See basePath comment in constructor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roll

Error is always thrown here if the paths are invalid.

Copy link
Member

@roll roll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems everything work correct 👍

@roll roll merged commit 58fa700 into frictionlessdata:master Jan 24, 2017
@roll roll removed the review label Jan 24, 2017
@dumyan dumyan deleted the #41-table-relative-paths branch January 25, 2017 02:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Buggy resource path resolution Relative paths in remote datapackages don't work
3 participants