Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash support #31

Merged
merged 8 commits into from
Jun 20, 2017
Merged

Hash support #31

merged 8 commits into from
Jun 20, 2017

Conversation

kashav
Copy link
Owner

@kashav kashav commented Jun 18, 2017

For #9 (started in #30).

Adds SHA1 hash support to SELECT and WHERE clause. Should be very easy to extend this to other hashing algorithms (we'll just need to add them to transform.FindHash).

Some basic examples (note that there's no support for hash length in conditions):

> SELECT name, hash FROM .
> SELECT name, SHA1(hash) FROM .
> SELECT name, SHA1(hash, 20) FROM .
> SELECT all FROM . WHERE hash = e950e1594893642a9629a0a0d9d920fec8e8e73f
> SELECT all FROM . WHERE SHA1(hash) = ./fsql.go

Also introduces an evaluate package which is in charge of all relevant comparison/evaluation work, updated structure:

. (fsql)
├── cmd
│   └── fsql (main)
├── evaluate
├── parser
├── prompt
├── query
├── tokenizer
├── transform

beeceej and others added 5 commits June 11, 2017 23:26
- base for hashes as an attribute, and SHA-1 as a format for Hash
- return _DIR_ for directories, instead of nil.
- add default hash impl
- change order of output
- use truncate to decide length of hash
- refactor from string ->  interface{}
- add comments and refactor pulling out arg as int
- abstract away argAsInt function
- refactor hash truncation
- add asInt comment
- refactor Default behavior
- fix spacing
* master:
  gofmt -s
  Update directives in Makefile and Travis build steps
  Clean-up parser methods
  Fatal with err.Error
  Implement lazy eval for AND conditions
  Clean-up test cases
  Bump version
  Vendor dependencies with Godeps
  Update README
  Add support for globs in FROM clause.
  Dynamically retrieve file mod. time & size.
  Add basic tests for prompt functions
  Add unit tests for high-level usage
  Create test fixtures
- transform.computeHash takes hash.Hash now (instead of crypto.SignerOpts), for
  easier extensibility in the future.
- Add `hash` attr to package fsql tests.
- Introduces support for using the `hash` attribute in WHERE clauses:

  - SELECT all FROM . WHERE SHA1(hash) = /path/to/some/file
  - SELECT all FROM . WHERE hash = <some_hash>

- Moves core evaluate/compare methods to individal package.
@kashav
Copy link
Owner Author

kashav commented Jun 18, 2017

@jonesbrianc26 It'd be great if you could give this a look and/or test it out!

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything huge that jumps out to me. Looks good 👍 I'll pull this guy down and test it out a little bit.


hashFunc := transform.FindHash(hashType)
if hashFunc == nil {
return false, fmt.Errorf("unexpeceted hash algorithm %s", hashType)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small typo here should be unexpected

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch 😅

fsql.go Outdated
// output prints the result value for each SELECTed attribute. Order is based
// on the order the attributes appear in attrs.
// All valid attributes. Not that order of output is based on the order that
// attributes appear in this array.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is kind of out of place given it's sort of in reference to my misguided comment. Probably just "All valid attributes" works.

if err != nil {
return nil, err
}
defer f.Close()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may be worth refactoring this so we don't have to open up the file here, not sure if we actually want to pass around an open file though

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too sure of what would be better here either. We could simplify this with ioutil.ReadFile, but that also opens the file under the hood.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using ioutil.ReadFile would probably help with unexpected memory consumption when enountering large files here, so I vote for the simplification 👍

@ghost
Copy link

ghost commented Jun 19, 2017

the info.IsDir() check is giving false when the mode of a file is L (symlink), then ioutil.ReadFile fails because it can't open a symlink.

This works with select * from /opt

	if info.IsDir() || (info.Mode()&os.ModeSymlink) == os.ModeSymlink {
		return strings.Repeat("-", h.Size()*2), nil
	}

We should also check against Sockets

@kashav
Copy link
Owner Author

kashav commented Jun 19, 2017

Nice catch!

There is the possibility that the link points to a file instead of a directory. So your approach works, but we also have the option of following the link:

if info.Mode()&os.ModeSymlink == os.ModeSymlink {
	var err error
	if path, err = filepath.EvalSymlinks(path); err != nil {
		return nil, err
	}
	if info, err = os.Stat(path); err != nil {
		return nil, err
	}
}
if info.IsDir() {
	return strings.Repeat("-", h.Size()*2), nil
}

What do you think about this approach? I'm not sure in what condition filepath.EvalSymlinks fails, but when it does (i.e. err != nil), should we return the error or the fallback hash (---)?

@ghost
Copy link

ghost commented Jun 19, 2017

This is a rabbit hole, because then it get's into whether or not the user wants to follow symlinks.

I suppose the implementation should by default at least try to follow a symlink. to determine if it could directly be hashed.

I think we should return the --- I suspect the user will want to see the rest of the output after it hits the symlink

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@kashav kashav merged commit 5069fa2 into master Jun 20, 2017
@kashav kashav deleted the feature/hash-support branch June 20, 2017 17:15
@kashav
Copy link
Owner Author

kashav commented Jun 20, 2017

Thanks for the review @jonesbrianc26!

This was referenced Jun 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants