added sample to load word files in ZincSearch
prabhatsharma committed May 11, 2022
1 parent 01ee7ea commit 23d3da6
Showing 6 changed files with 34 additions and 0 deletions.
5 changes: 5 additions & 0 deletions examples/word-files/README.md
@@ -0,0 +1,5 @@
# Load word files

You can load the contents of Word files into ZincSearch using the load-word-files.py script.

Edit the name of the folder in the load-word-files.py file. All the Word files in that folder will be parsed and pushed to the ZincSearch server.
29 changes: 29 additions & 0 deletions examples/word-files/load-word-files.py
@@ -0,0 +1,29 @@
import textract
import httpx
from os import walk

folder = "sample-files"

def main():
    zinc_server = "https://playground.dev.zincsearch.com/api/books/document"
    zinc_uid = "admin"
    zinc_pwd = "Complexpass#123"

    f = []
    for (dirpath, dirnames, filenames) in walk(folder):
        f.extend(filenames)
        break

    for file in f:
        print(file)
        text = textract.process(folder + "/" + file)
        text = text.decode("utf-8")
        data = {
            "file": file,
            "text": text
        }

        httpx.put(zinc_server, json=data, auth=(zinc_uid, zinc_pwd))

if __name__ == "__main__":
    main()
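The script above sends every file in the folder to textract, including any non-Word files that happen to be there. A minimal sketch of one way to tighten this, using only the standard library, is shown below. The helper names `list_word_files` and `build_document` are hypothetical and not part of this commit, and decoding with `errors="replace"` is an assumption (the script decodes strictly).

```python
import os


def list_word_files(folder, extensions=(".docx", ".doc")):
    """Return the sorted names of Word files directly inside folder.

    Like the script, this looks at the top level only (no recursion).
    The extension filter is an assumption added for illustration.
    """
    names = []
    for entry in os.scandir(folder):
        if entry.is_file() and entry.name.lower().endswith(extensions):
            names.append(entry.name)
    return sorted(names)


def build_document(file_name, raw_text):
    """Build the same {"file", "text"} payload the script sends.

    raw_text is the bytes value returned by the text extractor.
    Undecodable bytes are replaced rather than raising (an assumption).
    """
    return {
        "file": file_name,
        "text": raw_text.decode("utf-8", errors="replace"),
    }
```

In the script's loop, `list_word_files(folder)` could stand in for the `walk` block, and `build_document(file, textract.process(...))` for the inline `data` dictionary. Keeping the payload construction in a pure function also makes it possible to test without a ZincSearch server.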
Binary file added examples/word-files/sample-files/bible.docx
Binary file not shown.
Binary file added examples/word-files/sample-files/golden-age.docx
Binary file not shown.
Binary file not shown.
Binary file added examples/word-files/sample-files/rigveda.docx
Binary file not shown.
