Skip to content
master
Switch branches/tags
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

pyline

GitHub | PyPi | Warehouse | ReadTheDocs | Travis-CI

https://badge.fury.io/py/pyline.png https://travis-ci.org/westurner/pyline.png?branch=master

Pyline is a grep-like, sed-like, awk-like command-line tool for line-based text processing in Python.

Features

Why

Somewhat unsurprisingly, I found the original pyline recipe while searching for "python grep sed" (see AUTHORS.rst and LICENSE.psf).

I added an option for setting p = Path(line) in the eval/compile command context and added it to my dotfiles ; where it grew tests and an optparse.OptionParser; and is now promoted to a GitHub project with ReadTheDocs documentation, tests with tox and Travis-CI, and a setup.py for PyPi.

What

Pyline is an ordered MapReduce tool:

Input Readers:
  • stdin (default)
  • file (codecs.open(file, 'r', encoding='utf-8'))
Map Functions:
  • Python module imports (-m os)

  • Python regex pattern (-r '\(.*\)')

  • path library (p from --pathpy OR --pathlib)

  • Python codeobj eval output transform:

    ls | pyline -m os 'line and os.path.abspath(line.strip())'
    ls | pyline -r '\(.*\)' 'rgx and (rgx.group(0), rgx.group(1)) or line'
    ls | pyline -p 'p and p.abspath() or ("# ".format(line))'
    
    # With an extra outer loop to bind variables in
    # (because (_p = p.abspath(); <codeobj>) does not work)
    find $PWD | pyline --pathpy -m os -m collections --input-delim='/' \
        'p and [collections.OrderedDict((
                ("p", p),
                ("_p", _p),
                ("_p.split()", str(_p).split(os.path.sep)),
                ("line.rstrip().split()", line.rstrip().split(os.path.sep)),
                ("l.split()", l.split(os.path.sep)),
                ("words", words),
                ("w", w)))
            for _p in [p.abspath()]][0]' \
           -O json
Partition Function:
None
Compare Function:
Result(collections.namedtuple).__cmp__
Reduce Functions:
bool(), sorted()
Output Writers:

ResultWriter classes

pyline -O csv
pyline -O tsv
pyline -O json

Installing

Install from PyPi:

pip install pyline

Install from GitHub as editable (add a pyline.pth in site-packages):

pip install -e git+https://github.com/westurner/pyline#egg=pyline

Usage

Print help:

pyline --help

Process:

# Print every line (null transform)
cat ~/.bashrc | pyline line
cat ~/.bashrc | pyline l

# Number every line
cat ~/.bashrc | pyline -n l

# Print every word (str.split(input-delim=None))
cat ~/.bashrc | pyline words
cat ~/.bashrc | pyline w

# Split into words and print (default: tab separated)
cat ~/.bashrc | pyline 'len(w) >= 2 and w[1] or "?"'

# Select the last word, dropping lines with no words
pyline -f ~/.bashrc 'w[-1:]'

# Regex matching with groups
cat ~/.bashrc | pyline -n -r '^#(.*)' 'rgx and rgx.group()'
cat ~/.bashrc | pyline -n -r '^#(.*)'

## Original Examples
# Print out the first 20 characters of every line
tail access_log | pyline "line[:20]"

# Print just the URLs in the access log (seventh "word" in the line)
tail access_log | pyline "words[6]"

Work with paths and files:

# List current directory files larger than 1 Kb
ls | pyline -m os \
  "os.path.isfile(line) and os.stat(line).st_size > 1024 and line"

# List current directory files larger than 1 Kb
#pip install path.py
ls | pyline -p 'p and p.size > 1024 and line'

Documentation

https://pyline.readthedocs.org/en/latest/

License

Python Software License

About

Pyline is a grep-like, sed-like, awk-like command-line tool for line-based text processing in Python. https://pypi.python.org/pypi/pyline

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages