obaskly/Docai

Docai

Docai is a GPT-3 based question answering system that answers questions based on the content of PDF, DOCX, and TXT files.

Key Features • How To Use • Requirements • Copyright

Key Features

  • File handling
    • The script supports PDF, DOCX, and TXT files
    • The content is read using pdfplumber (PDF), python-docx (DOCX), and the built-in open() function (TXT)
  • GPT-3 integration
    • The script uses the OpenAI GPT-3 model, specifically the text-davinci-003 engine, to generate answers to questions.
  • Confidence scoring
    • The script calculates confidence scores for the generated answers using log probabilities returned by the GPT-3 API.
  • Concurrency
    • The script uses concurrent.futures.ThreadPoolExecutor to process questions concurrently, which can reduce the total time spent waiting on API responses.
  • Text preprocessing
    • The script splits the input document into chunks to fit within GPT-3's token limit, and post-processes the answers to remove duplicate sentences.
  • Saving conversation history
    • The script allows users to save the conversation history to a text file.
  • Caching
    • The script uses the functools.lru_cache decorator to cache the answers generated by GPT-3. If a user asks the same question again, the cached answer is returned instead of making another API call.
  • GUI
    • The script provides a user-friendly graphical interface, built with the tkinter library and ttkthemes, that lets users select a file, enter a question, view the answer, and save the conversation history.
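
The file handling described above can be sketched as a single dispatch on the file extension. This is a minimal illustration, not the script's actual code: the function name `read_document` is hypothetical, and the choice to join PDF pages and DOCX paragraphs with newlines is an assumption.

```python
from pathlib import Path


def read_document(path: str) -> str:
    """Return the plain text of a PDF, DOCX, or TXT file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        import pdfplumber  # third-party; imported lazily so TXT files work without it

        with pdfplumber.open(path) as pdf:
            # extract_text() can return None for pages without a text layer
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    if suffix == ".docx":
        import docx  # provided by the python-docx package

        return "\n".join(p.text for p in docx.Document(path).paragraphs)
    if suffix == ".txt":
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    raise ValueError(f"Unsupported file type: {suffix}")
```

Importing pdfplumber and docx inside their branches means a missing optional package only affects the file types that need it.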
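
The text preprocessing and confidence scoring features could look roughly like the sketch below. All three function names are hypothetical, and the details are assumptions: chunks are sized by word count as a rough stand-in for tokens (`max_words=1500` is an arbitrary value), and confidence is taken as the geometric mean of the token probabilities, which is one common way to turn log probabilities into a single score. The script itself may use a different formula.

```python
import math
import re


def split_into_chunks(text: str, max_words: int = 1500) -> list[str]:
    """Split text into pieces of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def remove_duplicate_sentences(answer: str) -> str:
    """Keep the first occurrence of each sentence, preserving the original order."""
    sentences = re.split(r"(?<=[.!?])\s+", answer.strip())
    seen = set()
    kept = []
    for sentence in sentences:
        key = sentence.strip().lower()  # compare case-insensitively
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence.strip())
    return " ".join(kept)


def confidence_from_logprobs(token_logprobs: list[float]) -> float:
    """Return exp(mean(log p)), or 0.0 when no log probabilities are given."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))
```

Averaging the log probabilities before exponentiating keeps the score comparable between short and long answers, whereas multiplying raw probabilities would penalize longer answers.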
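
To illustrate how the concurrency and caching features can work together, here is a minimal sketch using only the standard library. `answer_chunk` is a hypothetical stand-in for a real GPT-3 request (it makes no network call), and `max_workers=4` and `maxsize=128` are assumed values.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=128)
def answer_chunk(chunk: str, question: str) -> str:
    """Hypothetical stand-in for one GPT-3 request; arguments must be hashable."""
    return f"answer to {question!r} from {len(chunk)} characters"


def answer_all_chunks(chunks: tuple[str, ...], question: str) -> list[str]:
    """Query every chunk in a thread pool; results keep the order of the chunks."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda chunk: answer_chunk(chunk, question), chunks))
```

Threads suit this workload because each request spends most of its time waiting on the network. Calling `answer_all_chunks` a second time with the same chunks and question is served from the cache, which `answer_chunk.cache_info()` can confirm.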

How To Use

  • Put your OpenAI API key on line 45 of the script
  • Run the script
  • Select your file
  • Enter your question and click submit

It's as simple as that.

Note: We will provide an executable version soon.

Requirements

pip install openai pdfplumber python-docx ttkthemes

Copyright

All rights reserved to Bropocalypse Team.
