Multi-file AI assistant powered by Gemini 3 Flash with 1M token context
GemDesk is a desktop application that enables deep analysis across multiple files simultaneously using Google's Gemini 3 Flash model. Upload up to 50 files, ask questions, generate reports, find contradictions, and visualize data - all with intelligent context caching and adaptive thinking modes.
- 1M token context window - Analyze massive amounts of data simultaneously
- 50 file limit - Upload documents, spreadsheets, images, videos, and more
- 100+ file formats supported - PDFs, Office docs, images, videos, code files
- Smart conversion - DOCX → PDF with images, XLSX → CSV, and more
- Context caching - Files cached separately for faster responses
- Adaptive thinking modes - Adjust reasoning depth (minimal/low/medium/high)
- Specialized analysis presets via slash commands:
  - /report - Executive summaries and cohesive reports
  - /synthesize - Pattern identification and novel insights
  - /error-check - Contradiction and inconsistency detection
- Automatic chart generation - Just ask! "Plot sales over time" generates visualizations
- Function calling - Gemini intelligently uses tools when appropriate
- Smart charting - Bar, line, pie, and scatter plots generated on demand
- Export conversations - Save chats as formatted PDFs
- Export charts - Save visualizations as high-quality PNGs
- Markdown rendering - Code highlighting, tables, and formatting
- Dark/light themes - Toggle between modes
- Organized file shelf - Auto-categorized by type with collapsible folders
- Real-time token metering - Track context usage
- URL scraping - Add web pages and direct file downloads
- Thumbnail previews - Visual preview for images, PDFs, and videos
- Python 3.8 or higher
- Gemini API key (Get one free)
git clone https://github.com/openconstruct/gemdesk.git
cd gemdesk
pip install -r requirements.txt
Create a .env file in the project root:
GEMINI_API_KEY=your_api_key_here
python gem.py
The app will open in your default browser!
- Upload files - Click "Add Files" or paste URLs
- Ask questions - Type naturally or use slash commands
- Get insights - Receive analysis with citations
- Visualize data - Ask for charts when needed
- Export results - Save conversations and charts
Analysis Presets:
- /report - Generate executive summary with key findings
- /synthesize - Identify patterns and generate novel insights
- /error-check - Find contradictions across sources
Example:
/report focus on Q4 financial performance
Help:
- /help - Show all available commands
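As an illustration of how a slash command such as the example above could be separated from its free-text argument, here is a minimal sketch. The command names come from this README; `parse_message`, `ParsedMessage`, and the rule that unknown commands fall through as plain text are assumptions made for the example, not a description of what gem.py does.

```python
from typing import NamedTuple, Optional

# Command names are taken from the README; the parsing rules are an assumption.
PRESETS = {"report", "synthesize", "error-check", "help"}


class ParsedMessage(NamedTuple):
    command: Optional[str]  # preset name without the leading slash, or None
    text: str               # remaining user text


def parse_message(message: str) -> ParsedMessage:
    """Split a chat message into an optional slash command and its argument text."""
    stripped = message.strip()
    if not stripped.startswith("/"):
        return ParsedMessage(None, stripped)
    head, _, rest = stripped[1:].partition(" ")
    name = head.lower()
    if name not in PRESETS:
        # Unknown commands are treated as ordinary text in this sketch.
        return ParsedMessage(None, stripped)
    return ParsedMessage(name, rest.strip())
```

With this sketch, `parse_message("/report focus on Q4 financial performance")` yields the command `report` and the text `focus on Q4 financial performance`.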
No buttons needed! Just ask naturally:
- "Plot the sales data over time"
- "Show me customer acquisition vs revenue"
- "Create a pie chart of market share"
Gemini will automatically analyze your files and generate appropriate visualizations.
Adjust reasoning depth via the dropdown:
- Minimal - Fast, basic reasoning
- Low - Light thinking
- Medium - Balanced (default for reports)
- High - Deep reasoning (default for synthesis/error-checking)
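The defaults listed above can be pictured as a small lookup table. The sketch below is hypothetical: the level names and preset defaults come from this README, while `resolve_thinking_level` and the choice to let a preset default take priority over the dropdown value are assumptions.

```python
from typing import Optional

THINKING_LEVELS = ("minimal", "low", "medium", "high")

# Defaults stated in the README: medium for reports,
# high for synthesis and error checking.
PRESET_DEFAULTS = {
    "report": "medium",
    "synthesize": "high",
    "error-check": "high",
}


def resolve_thinking_level(dropdown_choice: str, preset: Optional[str] = None) -> str:
    """Return the preset default if one applies, otherwise the dropdown value."""
    if dropdown_choice not in THINKING_LEVELS:
        raise ValueError(f"unknown thinking level: {dropdown_choice!r}")
    # Assumption: a preset default overrides the dropdown selection.
    return PRESET_DEFAULTS.get(preset, dropdown_choice)
```

The real application may resolve the two differently, for example by letting an explicit dropdown selection win.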
- PDF - Native support
- DOCX - Converted to PDF (preserves inline images)
- TXT, MD, RTF, HTML - Plain text formats
- ODT - OpenDocument Text
- XLSX - Converted to CSV (all sheets)
- ODS - OpenDocument Spreadsheet
- CSV - Native support
- PPTX - Native support (Gemini 3)
- ODP - OpenDocument Presentation (converted to text)
- Images - JPG, PNG, GIF, WEBP, HEIC, SVG
- Videos - MP4, MOV, AVI, WEBM, FLV
- Audio - MP3, WAV, FLAC, AAC, OGG
80+ programming languages supported including:
- Python, JavaScript, TypeScript, Java, C/C++, Go, Rust
- HTML, CSS, PHP, Ruby, Swift, Kotlin, Scala
- And many more...
Files are cached separately from conversation history for optimal performance:
- All uploaded files → Single cache (1 hour TTL)
- Conversation history → Sent as text only
- Cache updates automatically when files added/removed
Benefits:
- Faster responses (files not resent with every message)
- Lower token costs
- Efficient multi-turn conversations
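To make the caching rules concrete, the following sketch tracks when a single shared file cache would need rebuilding: after the 1 hour TTL elapses, or when the set of files changes. It only models the bookkeeping and makes no Gemini API calls. `FileCacheTracker` and `file_set_key` are hypothetical names, not identifiers from gem.py.

```python
import hashlib
import time
from typing import Iterable, Optional

CACHE_TTL_SECONDS = 3600  # 1 hour TTL, as stated above


def file_set_key(file_ids: Iterable[str]) -> str:
    """Order-independent fingerprint of the current file set."""
    joined = "\n".join(sorted(file_ids))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class FileCacheTracker:
    """Tracks whether the shared file cache is still usable (hypothetical helper)."""

    def __init__(self) -> None:
        self._key: Optional[str] = None
        self._created_at = 0.0

    def needs_rebuild(self, file_ids: Iterable[str], now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        expired = now - self._created_at >= CACHE_TTL_SECONDS
        changed = file_set_key(file_ids) != self._key
        return expired or changed

    def mark_rebuilt(self, file_ids: Iterable[str], now: Optional[float] = None) -> None:
        self._key = file_set_key(file_ids)
        self._created_at = time.time() if now is None else now
```

Hashing the sorted file identifiers means that reordering files on the shelf does not trigger a rebuild, while adding or removing a file does.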
User Upload
↓
Type Detection
↓
Conversion (if needed)
- DOCX → PDF (with images)
- XLSX → CSV (all sheets)
- ODT/ODP → Text
↓
Upload to Gemini
↓
Token Counting
↓
Add to Cache
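The pipeline above can be sketched as a function that plans the steps for one file based on its extension. The conversion table mirrors the conversions listed in this README; the step labels and the `plan_upload` helper are illustrative assumptions, and no conversion or upload is actually performed.

```python
from pathlib import Path
from typing import List

# Conversions listed in the README; the step names are labels for this sketch only.
CONVERSIONS = {
    ".docx": "pdf",
    ".xlsx": "csv",
    ".odt": "text",
    ".odp": "text",
}


def plan_upload(filename: str) -> List[str]:
    """Return the ordered pipeline steps for one file."""
    suffix = Path(filename).suffix.lower()
    steps = ["detect_type"]
    target = CONVERSIONS.get(suffix)
    if target is not None:
        # Only formats that need conversion get this extra step.
        steps.append(f"convert_to_{target}")
    steps += ["upload_to_gemini", "count_tokens", "add_to_cache"]
    return steps
```

A file such as `notes.pdf` skips the conversion step entirely, while `Report.DOCX` gains a `convert_to_pdf` step before upload.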
Gemini uses function calling to generate charts:
- User requests visualization
- Gemini analyzes data in files
- Calls generate_chart tool with parameters
- Chart rendered with matplotlib
- Displayed in popup dialog
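A tool like this is usually described to the model with a JSON-schema-style declaration, and its arguments are checked before rendering. The sketch below shows one possible shape. Only the tool name `generate_chart` and the four chart types come from this README; the parameter names (`chart_type`, `title`, `labels`, `values`) and the `validate_chart_call` helper are assumptions, and the matplotlib rendering step is left out.

```python
CHART_TYPES = ("bar", "line", "pie", "scatter")  # chart types listed in the README

# Hypothetical declaration; only the tool name generate_chart comes from the README.
GENERATE_CHART_TOOL = {
    "name": "generate_chart",
    "description": "Render a chart from data found in the uploaded files.",
    "parameters": {
        "type": "object",
        "properties": {
            "chart_type": {"type": "string", "enum": list(CHART_TYPES)},
            "title": {"type": "string"},
            "labels": {"type": "array", "items": {"type": "string"}},
            "values": {"type": "array", "items": {"type": "number"}},
        },
        "required": ["chart_type", "labels", "values"],
    },
}


def validate_chart_call(args: dict) -> dict:
    """Check the arguments of a generate_chart call before handing them to a renderer."""
    for name in GENERATE_CHART_TOOL["parameters"]["required"]:
        if name not in args:
            raise ValueError(f"missing parameter: {name}")
    if args["chart_type"] not in CHART_TYPES:
        raise ValueError(f"unsupported chart type: {args['chart_type']}")
    if len(args["labels"]) != len(args["values"]):
        raise ValueError("labels and values must have the same length")
    # Fill in an empty title when the model did not supply one.
    return {"title": "", **args}
```

Validating before rendering keeps malformed tool calls from reaching the plotting code, where errors are harder to report back to the user.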
GEMINI_API_KEY - Your Gemini API key (required)
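For reference, here is a minimal, standard-library-only sketch of how the key could be resolved: first from the process environment, then from a .env file. The application itself may use a dedicated library for this. `load_api_key` is a hypothetical helper, and its error message is modeled on the "GEMINI_API_KEY not found" message mentioned in this README.

```python
import os
from pathlib import Path


def load_api_key(env_path: str = ".env") -> str:
    """Return GEMINI_API_KEY from the environment, falling back to a .env file."""
    key = os.environ.get("GEMINI_API_KEY", "").strip()
    path = Path(env_path)
    if not key and path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            name, sep, value = line.partition("=")
            if sep and name.strip() == "GEMINI_API_KEY":
                # Tolerate optional quotes around the value.
                key = value.strip().strip('"').strip("'")
                break
    if not key:
        raise RuntimeError("GEMINI_API_KEY not found")
    return key
```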
Edit gem.py to customize:
MODEL_ID = "gemini-3-flash-preview" # Model to use
MAX_CONTEXT_TOKENS = 1000000 # Context window size
MAX_FILES = 50 # File upload limit
- 1M token context - Analyze extensive documents
- Context meter - Real-time usage tracking
- Smart caching - Reduce redundant processing
- Max files: 50 (configurable)
- Max file size: Limited by Gemini API
- Max context: 1M tokens total
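The limits above lend themselves to a simple usage summary of the kind a context meter might display. This sketch assumes per-file token counts are already known. `context_usage` and the fields of its result are hypothetical, while the two constants repeat the configured values from this README.

```python
MAX_CONTEXT_TOKENS = 1_000_000  # values from the configuration section
MAX_FILES = 50


def context_usage(file_token_counts: dict, history_tokens: int = 0) -> dict:
    """Summarize context usage for a set of files plus conversation history."""
    used = sum(file_token_counts.values()) + history_tokens
    return {
        "used": used,
        "remaining": max(MAX_CONTEXT_TOKENS - used, 0),
        "percent": round(100 * used / MAX_CONTEXT_TOKENS, 1),
        "over_token_limit": used > MAX_CONTEXT_TOKENS,
        "over_file_limit": len(file_token_counts) > MAX_FILES,
    }
```

For example, two files of 250,000 and 50,000 tokens plus 200,000 tokens of history use 500,000 tokens, which is 50.0 percent of the window.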
Contributions welcome! Areas for improvement:
- Additional file format support
- More chart types
- Enhanced error handling
- Performance optimizations
- UI/UX improvements
# Clone repository
git clone https://github.com/openconstruct/gemdesk.git
cd gemdesk
# Install dev dependencies
pip install -r requirements.txt
# Run with debug logging
python gem.py
"GEMINI_API_KEY not found"
- Create a .env file with your API key
- Ensure the file is in the project root directory
"Module not found" errors
- Run: pip install --upgrade -r requirements.txt
File upload fails
- Check file size (must be under API limits)
- Verify file format is supported
- Check internet connection
Charts not generating
- Ensure matplotlib is installed: pip install matplotlib
- Verify chart request is clear (e.g., "plot X over time")
Context cache errors
- Files are automatically re-uploaded if cache expires
- Try removing and re-adding problematic files
MIT License - See LICENSE for details
- Google Gemini - Powerful multimodal AI capabilities
- Flet - Beautiful Python UI framework
- matplotlib - Chart generation
- python-docx & reportlab - Document processing
- Web deployment support
- Drag & drop file upload
- Google Search grounding integration
- Persistent chat history
- Multi-tab conversations
- Voice input support
- Collaborative features
Built for the Gemini 3 Hackathon 2026
Showcasing advanced multimodal analysis with intelligent context management and adaptive reasoning