Installation for general usage:
Note: pydocparser has only been tested on Python 3 and is not guaranteed to work on Python 2.
pip install pydocparser
or, if you have python3:
pip3 install pydocparser
OR
You can download the release of your choice from here
Unzip the file
Change directory to the unzipped folder
Run python setup.py install
(or python3 setup.py install)
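To confirm that the install worked, a quick check along these lines can help. It is only a sketch: `is_installed` is a hypothetical helper name, and it uses the standard library's `importlib.util.find_spec` rather than anything pydocparser itself provides.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be found on the import path."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Prints whether pydocparser is importable in the current environment.
    print(is_installed("pydocparser"))
```

If this prints False, pip may have installed into a different Python than the one you are running.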
Installation for development:
git clone https://github.com/tman540/pydocparser
cd pydocparser
pip install -r requirements.txt
To use pydocparser, you must create an instance of the Parser class from the pydocparser module:
import pydocparser
parser = pydocparser.Parser()
Next, obtain your secret API key (which you can get from here).
pydocparser needs this key to access your account. Pass it to login as a string:
parser.login("YOUR_API_KEY_HERE")
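Rather than hard-coding the secret key in a script, you may prefer to read it from the environment. The following is a minimal sketch of that idea; the helper name `load_api_key` and the variable name `DOCPARSER_API_KEY` are assumptions made for illustration, not part of pydocparser.

```python
import os


def load_api_key(env_var="DOCPARSER_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Docparser API key"
        )
    return key


# Usage, assuming pydocparser is installed and the variable is set:
#   import pydocparser
#   parser = pydocparser.Parser()
#   parser.login(load_api_key())
```

This keeps the key out of version control, which matters if you plan to share your script.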
The docparser API has a function for testing the connection to the API:
result = parser.ping()
print(result)
# pong
If parser.ping() returns 'pong', then you have a successful connection to the docparser API. If you instead get output like "Invalid API key. Use Parser.login(api_key)" even though you entered your API key, check that the key is correct.
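In a script, the ping check can be wrapped so that later steps only run when the connection works. This is a sketch under the assumption that ping() returns the string shown above; `check_connection` is a hypothetical helper, not a pydocparser function.

```python
def check_connection(parser):
    """Return True if the parser's ping() call answers 'pong'."""
    # Any other reply (for example an invalid-key message) counts as a failure.
    return parser.ping() == "pong"


# Usage, assuming `parser` is a logged-in pydocparser.Parser instance:
#   if not check_connection(parser):
#       raise SystemExit("Could not reach docparser; check your API key")
```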
You can get a list of current parsers like this:
parsers = parser.list_parsers()
This will return a list of the names of all available parsers.
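Because uploads refer to a parser by name, it can be useful to verify the name against this list first. The sketch below assumes list_parsers() returns a list of name strings, as described above; `ensure_parser_exists` is a hypothetical helper introduced for illustration.

```python
def ensure_parser_exists(parser, name):
    """Raise ValueError if `name` is not among the account's parsers."""
    available = parser.list_parsers()
    if name not in available:
        raise ValueError(
            f"No parser named {name!r}; available parsers: {sorted(available)}"
        )
    return name


# Usage, assuming `parser` is a logged-in pydocparser.Parser instance:
#   ensure_parser_exists(parser, "PDF Parser")
```

Failing early with the list of valid names makes a typo in the parser name easier to spot.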
To upload a file to docparser, you can use the upload_file_by_path function:
id = parser.upload_file_by_path("fileone.pdf", "PDF Parser")  # args: the file to upload, the name of the parser
# Note that "fileone.pdf" is in the current working directory
The function returns the document ID of the file that was just uploaded. To retrieve the parsed data, call the get_one_result function:
data = parser.get_one_result("PDF Parser", id)  # id is the document ID returned by `parser.upload_file_by_path()`
get_one_result returns all the parsed data from the file you selected.
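The upload and retrieval steps can be combined into one helper. This is a minimal sketch, not part of pydocparser: `upload_and_fetch` is a hypothetical name, and it only chains the two calls shown above. It does not wait or retry, so it assumes the parsed result is already available when get_one_result is called.

```python
from pathlib import Path


def upload_and_fetch(parser, file_path, parser_name):
    """Upload one file and return whatever get_one_result gives back for it."""
    path = Path(file_path)
    # Fail locally with a clear error before making any API call.
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    doc_id = parser.upload_file_by_path(str(path), parser_name)
    return parser.get_one_result(parser_name, doc_id)


# Usage, assuming `parser` is a logged-in pydocparser.Parser instance:
#   data = upload_and_fetch(parser, "fileone.pdf", "PDF Parser")
```

Using `doc_id` instead of `id` for the variable avoids shadowing Python's built-in `id` function.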
This project started from the need to use docparser through Python at work. I noticed that there was no API library for Python, so I decided to make one myself. I am a one-man operation, so I am glad to accept any help I can get. You can contribute by making your changes and submitting a pull request with a detailed description of what you added. I will review your changes, and if I decide to include them in the next release, I will credit you accordingly. You can also contribute by submitting bug reports or feature requests through GitHub issues.
This library is available as open source under the MIT License.
V1.0 (7/11/19) Initial release
V1.1 (7/12/19) Bug Fixes + New Functions
- Change function names to more closely resemble those in the PHP/Node/AJAX clients
- Update setup.py to include install requirements
- Fix README.md to work better on PyPI