NeoGrammy is a unified auxiliary script system for India. It aims to preserve the phonetic fidelity and cultural heritage of 50+ Indian languages while using modern AI models for transliteration and analysis.
- Advanced Script Converter: AI-powered transliteration between NeoGrammy and 50+ Indian languages with 95%+ phonetic accuracy
- Corpus Explorer: Interactive database of ancient and modern texts with parallel views
- AI Assistant: Suggestions for transliteration and morphological analysis
- Community Feedback: Collaborative system for continuous improvement
- Unicode-Ready: PUA block mapping (U+16000–160FF) for NeoGrammy characters
- Speech Integration: Text-to-speech and speech-to-text capabilities
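The code point range given for NeoGrammy characters (U+16000–160FF) holds 256 slots. As a minimal sketch, assuming glyphs are numbered sequentially from the start of that range, an index can be mapped to a character and back as follows. The numbering scheme and the function names are hypothetical and are not taken from the project.

```python
# Block boundaries as stated in the feature list (U+16000 to U+160FF).
BLOCK_START = 0x16000
BLOCK_END = 0x160FF


def glyph_to_char(index: int) -> str:
    """Return the character for a zero-based glyph index (hypothetical numbering)."""
    code_point = BLOCK_START + index
    if not BLOCK_START <= code_point <= BLOCK_END:
        raise ValueError(f"glyph index {index} is outside the block")
    return chr(code_point)


def char_to_glyph(char: str) -> int:
    """Inverse of glyph_to_char; rejects characters outside the block."""
    code_point = ord(char)
    if not BLOCK_START <= code_point <= BLOCK_END:
        raise ValueError(f"U+{code_point:04X} is outside the block")
    return code_point - BLOCK_START


def is_neogrammy_char(char: str) -> bool:
    """True if the single character falls inside the block."""
    return BLOCK_START <= ord(char) <= BLOCK_END
```

Because the range sits above U+FFFF, each character takes two UTF-16 code units, which affects string length and indexing in JavaScript on the frontend.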
# Install dependencies
npm install
# Start development server
npm run dev
# Build for production
npm run build
# Deploy to GitHub Pages
npm run deploy:github-pages
# Preview production build locally
npm run preview
GitHub Pages (Recommended)
npm run deploy:github-pages
Your site will be available at:
https://goyalayush-tech.github.io/script-bridge-forge/
Manual Deployment
npm run build
# Upload the 'dist/' folder to your web server
Automated CI/CD
- GitHub Actions workflow automatically deploys on main branch pushes
- Staging environment available via develop branch
- Production deployment via manual approval
- Build optimization complete
- Static asset optimization
- SEO and meta tags configured
- Error boundaries implemented
- Performance monitoring ready
- Analytics integration prepared
NeoGrammy leverages a comprehensive ecosystem of Indian NLP datasets. See our detailed research documentation:
- Research & Datasets Guide - Complete overview of datasets powering NeoGrammy
- Text Corpora: IndicCorp, Samanantar, BPCC, ILCI
- Speech Datasets: Shrutilipi, IndicVoices, Kathbath
- Task-Specific: IndicXlit, Dakshina, Naamapadam
- React + Vite: Modern, fast development experience
- TailwindCSS + shadcn/ui: Beautiful, accessible UI components
- Framer Motion: Smooth animations and transitions
- FastAPI (Python): High-performance REST & WebSocket APIs
- PostgreSQL/Neo4j: Relational and graph storage for linguistic data
- AI Models: IndicXlit, morphological analyzers, semantic bridges
- IndicXlit: 11M-parameter Transformer for transliteration
- Three-Tier Pipeline: Phonological → Morphological → Semantic processing
- Unicode Integration: Custom font with ligatures for conjuncts
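The three-tier ordering (phonological, then morphological, then semantic) can be sketched as a chain of functions that each annotate a shared record. This is only an illustration of the staging: the `Analysis` type, the stage functions, and their placeholder logic are assumptions made for the example, not the project's actual models.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Analysis:
    """Carries the input text and the annotations each tier adds."""
    text: str
    phonemes: List[str] = field(default_factory=list)
    morphemes: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)


Stage = Callable[[Analysis], Analysis]


def phonological(a: Analysis) -> Analysis:
    # Placeholder: treat every non-space character as one sound unit.
    a.phonemes = [ch for ch in a.text if not ch.isspace()]
    return a


def morphological(a: Analysis) -> Analysis:
    # Placeholder: treat whitespace-separated tokens as morphemes.
    a.morphemes = a.text.split()
    return a


def semantic(a: Analysis) -> Analysis:
    # Placeholder: record a summary instead of real semantic analysis.
    a.notes.append(f"{len(a.morphemes)} tokens, {len(a.phonemes)} sound units")
    return a


# Order matters: later tiers read what earlier tiers wrote.
PIPELINE: List[Stage] = [phonological, morphological, semantic]


def run_pipeline(text: str) -> Analysis:
    """Apply the tiers in order: phonological, morphological, semantic."""
    analysis = Analysis(text=text)
    for stage in PIPELINE:
        analysis = stage(analysis)
    return analysis
```

Keeping the stages in a list makes it straightforward to swap a placeholder for a real model, or to test one tier in isolation.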
- Indo-Aryan: Hindi, Bengali, Marathi, Gujarati, Punjabi, Urdu, Assamese, Odia, Kashmiri, Konkani, Maithili, Nepali, Sindhi, Dogri
- Dravidian: Tamil, Telugu, Kannada, Malayalam
- Austro-Asiatic: Santali
- Tibeto-Burman: Manipuri, Bodo
- Sanskrit, Vedic Sanskrit, Pali, Prakrit, Apabhraṃśa, Gandhari, Classical Tamil
- Bhojpuri, Magahi, Awadhi, Tulu, Gondi, Ho, Mundari, Khasi, Sora, and more
- Phonetic Preservation: ≥95% accuracy
- Semantic Preservation: ≥90% accuracy
- Morphological Accuracy: ≥85%
- Accent/Prosody Fidelity: ≥80%
- Model Size: ≤500MB
- Inference Latency: ≤100ms per sentence
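Assuming the figures above are thresholds that an evaluation run has to meet, a check could look like the sketch below. The metric key names and the `failed_targets` helper are hypothetical; only the numeric thresholds come from the list.

```python
# Lower bounds from the list above, expressed as fractions.
MIN_TARGETS = {
    "phonetic_preservation": 0.95,
    "semantic_preservation": 0.90,
    "morphological_accuracy": 0.85,
    "prosody_fidelity": 0.80,
}

# Upper bounds from the list above.
MAX_TARGETS = {
    "model_size_mb": 500.0,
    "latency_ms_per_sentence": 100.0,
}


def failed_targets(measured: dict) -> list:
    """Return the names of metrics that are missing or miss their threshold."""
    failures = []
    for name, floor in MIN_TARGETS.items():
        # A missing metric counts as a failure.
        if measured.get(name, float("-inf")) < floor:
            failures.append(name)
    for name, ceiling in MAX_TARGETS.items():
        if measured.get(name, float("inf")) > ceiling:
            failures.append(name)
    return failures
```

An empty result means every listed threshold is met; a value exactly at a threshold passes, matching the ≥ and ≤ signs in the list.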
We welcome contributions from linguists, developers, and researchers! See our Contributing Guide for details.
- Dataset expansion and annotation
- Model training and fine-tuning
- UI/UX improvements
- Documentation and localization
- Research partnerships
This project is licensed under the MIT License - see the LICENSE file for details.
- AI4Bharat & Bhashini for foundational datasets and models
- Mozilla Common Voice for speech data contributions
- Academic Partners for historical corpus digitization
- Open Source Community for collaborative development
- Project Lead: NeoGrammy Project Team
- Email: contact@neogrammy.ai
- GitHub: github.com/goyalayush-tech/script-bridge-forge
- Twitter: @neogrammy_ai
Bridging 5,000 years of linguistic heritage with cutting-edge AI technology ✨