add option to [grep]/[match] to select by line #109
Comments
When using option "-o" in combination with --line, you also get the cut notation: preston match [pattern]
and preston match -o [pattern]
and preston match --no-line [pattern]
PS: compare with grep -o
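For comparison, the difference between reporting whole matching lines and reporting only the matched text (what `grep -o` does) can be sketched in a few lines of Python. This is only an illustration of the two reporting modes, not preston's implementation; the function names and the sample text are made up for the example.

```python
import re


def matching_lines(pattern, text):
    """Return every full line that contains a match (like plain grep)."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]


def only_matching(pattern, text):
    """Return only the matched substrings, line by line (like grep -o)."""
    regex = re.compile(pattern)
    return [m.group(0) for line in text.splitlines() for m in regex.finditer(line)]


# Hypothetical tab-separated sample, not taken from any real dataset.
sample = "id\tname\n1\tApis mellifera\n2\tBombus terrestris\n"

print(matching_lines(r"Apis \w+", sample))
print(only_matching(r"Apis \w+", sample))
```

The first call reports the whole record, the second only the text that matched the pattern.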
|
…ns to revert original behavior #109
Using the preston-amazon dataset
Default to reporting full lines that contain matches:
With
With
|
@mielliott I was able to reproduce your newly added
and with
and with
|
I am pretty excited about your new feature, and about having a way to point to specific lines in an archive / file. I was wondering about two things:
and found preston complaining about the following:
as: select the characters in range 1063-1137 on line 1 of But then I noticed that the --no-line had the same byte range, which makes sense considering that it is the first line. However, when running:
the same byte range was produced using
, which seems a bit counterintuitive because I was expecting the byte offset with the line selection to be counted from the start of the selected line. Curious to hear your comments on the above! |
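To make the two conventions concrete, here is a minimal Python sketch that converts a byte range counted from the start of the content into one counted from the start of the line it falls on. It assumes zero-based, end-exclusive offsets and `\n` line endings; whether preston uses the same conventions is not stated in this thread, and the function names are hypothetical.

```python
import bisect


def line_start_offsets(content: bytes):
    """Byte offset at which each line starts; entry 0 belongs to line 1."""
    offsets = [0]
    for i, byte in enumerate(content):
        # A newline starts a new line only if more content follows it.
        if byte == 0x0A and i + 1 < len(content):
            offsets.append(i + 1)
    return offsets


def to_line_relative(content: bytes, start: int, end: int):
    """Map a content-relative byte range to (line number, start, end) within that line."""
    offsets = line_start_offsets(content)
    index = bisect.bisect_right(offsets, start) - 1
    line_start = offsets[index]
    return index + 1, start - line_start, end - line_start


content = b"first line\nsecond line\n"
print(to_line_relative(content, 0, 5))    # range on line 1
print(to_line_relative(content, 11, 17))  # range on line 2
```

On line 1 both conventions give the same numbers, which is consistent with the identical byte ranges observed above for a match on the first line; they only diverge from line 2 onward.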
Aw rats, that’s some awful stuff! Thanks for trying it out though - I’ll have a look at it later tonight |
The feature is pretty awesome . . . my notes are just details I am curious to hear your thoughts on. |
This should now work:
Notice that the file actually is just one big line. No line breaks. So the good news is, the line number and byte ranges actually are working. Looking at another result from matching on the amazon dataset:
Then cat it back (fixed in 89ee74a):
Voila! Edit: oops, the new example I dug up was also using line 1. Hold on a sec |
Attempt number 2:
|
Wow! I was just able to do:
Also, I was able to reproduce:
with inverse lookup:
also, without lines:
where the byte range is counted from the start of the content. Very cool way to express coordinates in a predictable biodiversity data universe! Because, no matter where you are or what you do, the following always holds:
|
Thanks for making this happen @mielliott ! |
fyi @zedomel |
@mielliott Just installed preston 0.3.0 and found that it works like a charm (see attached screenshot). Note that the hash is the (huge) eBird dataset. I had the urge to use a line range, e.g., L1-2. Is that something you had in mind too? |
Sweet! Opening the URL is surprisingly speedy too!
Definitely; I didn't expect |
Currently, you can select parts of content that match a specific pattern:
e.g.,
$ preston ls | preston match [some pattern]
...
cut:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/measurementorfact.txt!/b39887-40069
We'd like to add an option to match only the line number on which the pattern was found:
e.g.,
$ preston ls | preston match --line [some pattern]
...
line:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/measurementorfact.txt!/L23
where the example above expresses that a match was found on line 23.
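The two notations shown above can be taken apart with a small Python sketch. It only handles the two forms that appear in this issue (a `cut:` identifier ending in `b<start>-<end>` and a `line:` identifier ending in `L<number>`); the `Selector` type and `parse_selector` function are hypothetical names for this example and are not part of preston.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Selector(NamedTuple):
    kind: str                              # "cut" or "line"
    content: str                           # what the selector applies to
    line: Optional[int]                    # line number for "line:" selectors
    byte_range: Optional[Tuple[int, int]]  # (start, end) for "cut:" selectors


# Scheme, then the content identifier, then a final "!/" followed by L<n> or b<start>-<end>.
_SELECTOR = re.compile(r"^(cut|line):(.+)!/(?:L(\d+)|b(\d+)-(\d+))$")


def parse_selector(iri: str) -> Selector:
    """Split a cut:/line: identifier into its content part and its selection."""
    match = _SELECTOR.match(iri)
    if match is None:
        raise ValueError(f"unsupported selector: {iri}")
    kind, content, line, start, end = match.groups()
    if kind == "line" and line is not None:
        return Selector(kind, content, int(line), None)
    if kind == "cut" and start is not None:
        return Selector(kind, content, None, (int(start), int(end)))
    raise ValueError(f"selection does not fit scheme '{kind}': {iri}")


base = "zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/measurementorfact.txt"
print(parse_selector(f"line:{base}!/L23"))
print(parse_selector(f"cut:{base}!/b39887-40069"))
```

Both identifiers resolve to the same content part; only the trailing selection differs.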