simonw/strip-hidden-form-values

strip-hidden-form-values

CLI tool for stripping hidden form values from an HTML document

Why would you need this? Imagine you're running a Git scraper against a website that includes hidden form fields (such as ASP.NET's __VIEWSTATE field) whose values change on every request. You can pipe the HTML through this tool to blank those hidden values, so that a change is only recorded when the rest of the page is modified in some way.

scrape-ca-wildlife-rules is an example of a repository that uses this tool for this purpose. See the scrape.yml workflow there for details.

Installation

Install this tool using pip:

$ pip install strip-hidden-form-values

Usage

You can pipe HTML into this tool:

curl http://... | strip-hidden-form-values > output.html

Or pass it a filename:

strip-hidden-form-values input.html > output.html

The tool will replace the value= attribute of any hidden form fields with a blank string, so the following:

<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="p8nVm4PgVPA" />

Will be replaced with:

<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="" />

All other HTML will remain unchanged.
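
To illustrate the behaviour described above, here is a minimal Python sketch of the same idea. It is not this tool's actual implementation: the function name `blank_hidden_values` and the regular expressions are assumptions made for illustration, and a regex-based approach will miss edge cases (for example unquoted values, or a `>` character inside an attribute) that a real HTML parser would handle.

```python
import re

# Hypothetical helper patterns, for illustration only.
# Matches a whole <input ...> tag (assumes no ">" inside attribute values).
INPUT_TAG = re.compile(r"<input\b[^>]*>", re.IGNORECASE)
# Matches type="hidden", type='hidden' or type=hidden, but not data-type=...
HIDDEN_TYPE = re.compile(r"""(?<![\w-])type\s*=\s*["']?hidden\b""", re.IGNORECASE)
# Matches a quoted value attribute, but not data-value=...
VALUE_ATTR = re.compile(
    r"""(?<![\w-])(value\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE | re.DOTALL
)


def blank_hidden_values(html: str) -> str:
    """Return html with the value of every hidden <input> set to an empty string."""

    def fix_tag(match: re.Match) -> str:
        tag = match.group(0)
        if not HIDDEN_TYPE.search(tag):
            # Not a hidden field: leave the tag exactly as it was.
            return tag
        # Keep 'value=' and the original quote style, drop the contents.
        return VALUE_ATTR.sub(lambda m: m.group(1) + m.group(2) + m.group(2), tag)

    return INPUT_TAG.sub(fix_tag, html)


sample = '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="p8nVm4PgVPA" />'
print(blank_hidden_values(sample))
# prints: <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="" />
```

Because only the contents of the value attribute inside hidden input tags are rewritten, everything else in the document passes through byte-for-byte, which is what keeps the resulting diffs small in a Git scraper.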

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd strip-hidden-form-values
python -m venv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
