Added command line interface. #42

Closed
wants to merge 1 commit into from

@rmax rmax commented Jun 23, 2016

$ python -m parsel --help
usage: python -m parsel [-h] [--xpath] [--re PATTERN] [--encoding ENCODING]
                        [--repr] [-v]
                        EXPRESSION [FILE]

Parsel command line interface.

positional arguments:
  EXPRESSION           A CSS expression, or a XPath expression if --xpath is
                       given.
  FILE                 If missing, it reads the HTML content from the standard
                       input.

optional arguments:
  -h, --help           show this help message and exit
  --xpath              Given expression is a XPath expression.
  --re PATTERN         Apply given regular expression.
  --encoding ENCODING  Input encoding. Default: utf-8.
  --repr               Output result object representation instead of as text.
  -v, --version        show program's version number and exit

$ curl http://scrapy.org/ | python -m parsel --xpath '//a/@href'  | sort | uniq -c
   1 http://scrapy.org
   1 ../community/
   2 ../companies/
   1 ../doc/
   1 ../download/
   1 ../support/
   1 http://codecov.io/github/scrapy/scrapy?branch=master
   1 http://doc.scrapy.org/en/1.1/intro/overview.html
   1 http://doc.scrapy.org/en/latest/faq.html
   1 http://doc.scrapy.org/en/latest/topics/ubuntu.html
   1 http://github.com/scrapy/scrapy/graphs/contributors
   1 http://pypi.python.org/pypi/Scrapy
   1 http://scrapinghub.com
   1 http://scrapinghub.com/scrapy-cloud/
   1 http://stackoverflow.com/tags/scrapy/info
   1 https://anaconda.org/scrapinghub/scrapy
   2 https://github.com/scrapy/scrapy
   1 https://github.com/scrapy/scrapy/archive/1.1.zip
   1 https://github.com/scrapy/scrapy/wiki/Python-3-Porting
   1 https://github.com/scrapy/scrapyd
   1 https://groups.google.com/forum/?fromgroups#!aboutgroup/scrapy-users
   3 https://pypi.python.org/pypi/Scrapy
   2 https://twitter.com/ScrapyProject
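
For readers who want to see how the options in the help text fit together, here is a minimal sketch of a parser with the same flags. It is not the code from this pull request: the function names (`build_parser`, `select`, `main`), the `dest` names, and the version string are assumptions made for illustration. Only `parsel.Selector` with its `css()`, `xpath()`, `re()` and `getall()` methods comes from the library itself.

```python
import argparse
import sys


def build_parser():
    """Build an argparse parser whose options mirror the help text above."""
    parser = argparse.ArgumentParser(
        prog="python -m parsel",
        description="Parsel command line interface.",
    )
    parser.add_argument("expression", metavar="EXPRESSION",
                        help="A CSS expression, or an XPath expression "
                             "if --xpath is given.")
    parser.add_argument("file", metavar="FILE", nargs="?",
                        help="If missing, HTML is read from standard input.")
    parser.add_argument("--xpath", action="store_true",
                        help="Given expression is an XPath expression.")
    # dest is renamed so the attribute does not shadow the `re` module name.
    parser.add_argument("--re", metavar="PATTERN", dest="pattern",
                        help="Apply given regular expression.")
    parser.add_argument("--encoding", default="utf-8",
                        help="Input encoding. Default: utf-8.")
    parser.add_argument("--repr", action="store_true", dest="use_repr",
                        help="Output result object representation.")
    # Placeholder version string; the real value is not known here.
    parser.add_argument("-v", "--version", action="version",
                        version="%(prog)s (version unknown)")
    return parser


def select(html, args):
    """Run the expression against the HTML and return a list of strings."""
    # Imported lazily so the parser itself works without parsel installed.
    from parsel import Selector

    selector = Selector(text=html)
    if args.xpath:
        result = selector.xpath(args.expression)
    else:
        result = selector.css(args.expression)
    if args.pattern:
        return result.re(args.pattern)
    if args.use_repr:
        return [repr(node) for node in result]
    return result.getall()


def main(argv=None):
    """Read HTML from FILE or standard input and print one match per line."""
    args = build_parser().parse_args(argv)
    if args.file:
        with open(args.file, encoding=args.encoding) as handle:
            html = handle.read()
    else:
        html = sys.stdin.buffer.read().decode(args.encoding)
    for item in select(html, args):
        print(item)
```

Calling `main(["--xpath", "//a/@href", "page.html"])` would print every `href` found in a local `page.html` (a hypothetical file name), one per line, which is the shape the `sort | uniq -c` pipeline above expects.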

@codecov-io

Current coverage is 98.29%

No coverage report found for master at fde9087.

Powered by Codecov. Last updated by fde9087...9043836

@rmax
Author

rmax commented Jun 23, 2016

If #41 gets merged, I have a patch to add a --base-url option to return absolute links. This could lead to a nicer CLI interaction:

$ URL=https://sites.google.com/site/automl2016/accepted-papers; curl -q -s $URL | python -m parsel --base-url $URL 'a[href*=pdf]::attr(href)' | xargs -n1 wget
...
2016-06-23 15:23:39 (300 KB/s) - 'AutoML challenge_ system description of Lisheng Sun.pdf?attredirects=0&d=1' saved [212855/212855]
...
2016-06-23 15:23:41 (527 KB/s) - 'Bayesian optimization for automated model selection.pdf?attredirects=0&d=1' saved [454152/454152]
...
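
The `--base-url` patch itself is not shown in this thread, so the following is only a sketch of the general idea, using the standard library's `urljoin`. The helper name `absolutize` and the sample hrefs are hypothetical.

```python
from urllib.parse import urljoin


def absolutize(hrefs, base_url):
    """Resolve each href against base_url.

    Relative hrefs are joined to the base; hrefs that are already
    absolute are returned unchanged.
    """
    return [urljoin(base_url, href) for href in hrefs]


base = "https://sites.google.com/site/automl2016/accepted-papers"
links = absolutize(["paper.pdf", "https://example.com/other.pdf"], base)
print(links)
# ['https://sites.google.com/site/automl2016/paper.pdf',
#  'https://example.com/other.pdf']
```

Returning absolute URLs is what lets the output be piped straight into a tool such as `wget`, as in the command above.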

@eliasdorneles
Member

Wouldn't this be better in a separate project?

It would be easier to maintain and could have more dependencies (click, tabulate, colorama, etc).

@rmax
Author

rmax commented Jun 23, 2016

@eliasdorneles Indeed. I aimed for a simple CLI, but a separate project would leave room for more useful features. Is parsel-cli a good name?

@eliasdorneles
Member

Yeah, it looks pretty cool; I think it deserves its own project. :)

parsel-cli is a decent name, +1
(nothing you can't change later, also)

@rmax
Author

rmax commented Jun 23, 2016

Moving to its own project 🎉

@rmax rmax closed this Jun 23, 2016