Skip to content

AlexScotland/DocuScan-Python

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 

Repository files navigation

DocuScan Package

DocuScan is a lightweight document scanner.

DocuScan allows users to open up document types docx,doc,pdf and return the information inside as strings.

DocuScan also allows for manipulation of this information via regular expressions.

Check out my other projects!

Requirements:

  1. zipfile

  2. io

  3. re

  4. XML

Installation:

  1. run pip install DocuScan

  2. import DocuScan

Usage:

  1. class DocuScan('fileName') to a variable.

###It is worth noting that the fileName must be in the directory.

########### example :DocuScan("C:\Users\Person\Desktop\folder1\test.pdf")

  1. use print(variable.returnFileText())

  2. use print(variable.executeRegex('regex here'))

  3. use print(executeHeaderRegex('regex here'))

  4. use print(executeFooterRegex('regex here'))

Functionality:

  1. returnFileText() - Returns the text of a file.

  2. executeRegex(regexExpression) - creates a list of all matching cases of regexExpression

  3. executeHeaderRegex(regularExpression) - creates a list of all matching cases of regexExpression in the header XML.

  4. executeFooterRegex(regularExpression) - creates a list of all matching cases of regexExpression in the Footer XML.

About

DocuScan scans documents and returns strings (pdf, docx, doc)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages