A Streamlit application that uses generative AI to create custom documents from templates and a range of knowledge sources.
- Multiple Knowledge Sources: Upload documents (PDF, DOCX, TXT), search the web, or specify a URL to gather information
- Template Management: Use predefined templates, search for templates, or upload custom templates
- Jinja2 Template Support: Use Jinja2 templating for flexible document creation
- RAG (Retrieval-Augmented Generation): Improve document relevance by grounding generation in passages retrieved from your knowledge source
- Multi-format Export: Download your generated documents in PDF, DOCX, or Markdown format
- Template Library: Save and reuse templates for future document generation
- Python 3.8+
- Required Python packages (see requirements.txt)
- Google API key for Gemini AI (set as environment variable)
- SERP API key for web searches (set as environment variable)
- Clone the repository:
  git clone https://github.com/rahulg202/Document-Generation-Bot.git
- Install dependencies:
  pip install -r requirements.txt
- Create a .env file with your API keys:
  GOOGLE_API_KEY=your_google_api_key
  SERP_API_KEY=your_serp_api_key
- Run the application:
  streamlit run main.py
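The setup above relies on two API keys being available as environment variables. As a minimal sketch (not the project's own startup code), a check like the following could report which keys are missing before the app starts. The helper name `missing_api_keys` is hypothetical, and the sketch assumes the values from the .env file have already been loaded into the environment, for example by a loader such as python-dotenv.

```python
import os

# Key names taken from the .env step above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "SERP_API_KEY")


def missing_api_keys(env=None):
    """Return the names of required keys that are unset or blank.

    `env` defaults to os.environ; passing a plain dict makes the
    check easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing keys: " + ", ".join(missing))
    else:
        print("All API keys are set.")
```

Running the check before `streamlit run main.py` gives a clear message up front, rather than a failure at the first Gemini or web search request.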
- Input Information:
  - Enter your document requirements
  - Select a knowledge source (upload a document, search the web, or specify a URL)
  - Choose a template (predefined, search for one, or upload custom)
- Verification:
  - Review your selections
  - Choose between standard AI generation or RAG (Retrieval-Augmented Generation)
  - Perform web searches if needed
- Results:
  - View your generated document
  - Download in your preferred format (PDF, DOCX, or Markdown)
  - Provide feedback for improvement
The application uses Jinja2 for templating, which allows for powerful and flexible document creation:
- Variables are defined using {{ variable_name }} syntax
- Templates can be saved for future use
- Custom templates can be uploaded in various formats (TXT, J2, JINJA, HTML, MD)
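To illustrate the `{{ variable_name }}` syntax, here is a small sketch using the Jinja2 library directly. The template text and the helper names `template_variables` and `render` are made up for this example and are not taken from the project's template_manager.py.

```python
from jinja2 import Environment, meta

# Illustrative template; real templates would come from the templates/ folder
# or from an uploaded file.
TEMPLATE_TEXT = "Dear {{ recipient }},\n\n{{ body }}\n\nRegards,\n{{ sender }}"

env = Environment()


def template_variables(source):
    """List the variable names a template expects, in sorted order."""
    ast = env.parse(source)
    return sorted(meta.find_undeclared_variables(ast))


def render(source, values):
    """Fill the template's {{ ... }} placeholders with the given values."""
    return env.from_string(source).render(**values)


if __name__ == "__main__":
    print(template_variables(TEMPLATE_TEXT))
    print(render(TEMPLATE_TEXT, {
        "recipient": "Sam",
        "body": "The report is attached.",
        "sender": "Ana",
    }))
```

Listing the expected variables first is one way an application could decide which values it needs to generate before rendering.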
The system leverages Google's Gemini AI to:
- Generate document content based on templates
- Extract relevant information from knowledge sources
- Create new templates based on user requirements
- Implement RAG for improved relevance
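The tasks listed above all come down to sending Gemini a prompt that combines the user's requirements, a template, and source material. The sketch below shows one possible way to assemble such a prompt. The function name `build_prompt` and the prompt wording are assumptions, and the actual Gemini call is left out because the project's client code is not shown here.

```python
def build_prompt(requirements, template_text, context_chunks):
    """Assemble one prompt string from requirements, template, and context.

    All three inputs are plain strings (or a list of strings for the
    context); nothing here calls an AI service.
    """
    # Number each context passage so the model can tell them apart.
    context = "\n\n".join(
        f"[Source {i}]\n{chunk.strip()}"
        for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "You are drafting a document.\n\n"
        f"Requirements:\n{requirements.strip()}\n\n"
        f"Template to fill in:\n{template_text.strip()}\n\n"
        f"Reference material:\n{context or '(none provided)'}\n"
    )


if __name__ == "__main__":
    print(build_prompt(
        "Write a short payment reminder.",
        "Subject: {{ subject }}\n\n{{ body }}",
        ["Invoices are due within 30 days of receipt."],
    ))
```

Keeping prompt assembly in a pure function like this makes it easy to inspect what is sent to the model, whichever generation mode is selected.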
genai-document-generator/
│
├── main.py # Main application entry point
├── components/ # UI components
│ ├── input_page.py # Document requirements input
│ ├── verify_page.py # Verification and generation
│ └── results_page.py # Results display and export
│
├── utils/ # Utility functions
│ ├── ai_tools.py # AI integration tools
│ ├── document_processing.py # Document handling
│ ├── pdf_tools.py # PDF generation utilities
│ ├── rag_tools.py # RAG implementation
│ ├── template_manager.py # Template management
│ └── web_tools.py # Web searching and scraping
│
├── templates/ # Template storage
│ ├── index.json # Template index
│ └── *.txt # Template files
│
└── requirements.txt # Python dependencies
The application implements Retrieval-Augmented Generation (RAG) as follows:
- Text is preprocessed and chunked for efficient retrieval
- Each chunk and the user query are converted to TF-IDF vectors
- Cosine similarity determines the most relevant chunks
- Relevant information is fed to Gemini AI with the user query
- The generated content is formatted and rendered using the chosen template
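The retrieval part of these steps (chunking, TF-IDF vectors, cosine similarity) can be sketched in plain Python. This is an illustration under stated assumptions, not the code in rag_tools.py: the function names, the word-window chunking, the chunk sizes, and the smoothed IDF formula are all choices made for this example, and the project may well use a library vectorizer instead.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping windows of `size` words (size > overlap)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fit_idf(docs_tokens):
    """Compute a smoothed inverse document frequency for each term."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def vectorize(tokens, idf):
    """Build a sparse TF-IDF vector; terms unseen in the chunks are dropped."""
    return {t: tf * idf[t] for t, tf in Counter(tokens).items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(query, chunks, k=2):
    """Return up to k chunks most similar to the query, best first."""
    docs_tokens = [tokenize(c) for c in chunks]
    idf = fit_idf(docs_tokens)
    query_vec = vectorize(tokenize(query), idf)
    scored = [(cosine(query_vec, vectorize(t, idf)), i)
              for i, t in enumerate(docs_tokens)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Chunks with no overlap with the query score 0 and are left out.
    return [chunks[i] for score, i in scored[:k] if score > 0]


if __name__ == "__main__":
    passages = [
        "Invoices are due within 30 days of receipt.",
        "The office is closed on public holidays.",
        "Late invoices incur a 2 percent fee.",
    ]
    for passage in top_chunks("When are invoices due?", passages):
        print(passage)
```

The selected chunks would then be passed to Gemini together with the user query, and the response rendered through the chosen template.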
Planned enhancements:
- Vector database integration for more efficient RAG
- Template categories and tags for better organization
- Collaborative document editing
- API access for programmatic document generation
- Additional output formats and styling options