CogniHuman is a research foundation registered in India, dedicated to advancing open and ethical Voice AI and Large Language Models for public benefit. We believe that artificial intelligence should be a common good—transparent, accessible, and culturally inclusive.
Our work focuses on bridging the digital divide, preserving linguistic heritage, and promoting ethical standards in the age of automation.
We pursue seven core objectives:
| # | Objective | What We Do |
|---|---|---|
| 1 | Advancement of Scientific Research | Conduct and publish open research at the intersection of human cognition and AI |
| 2 | Preservation of Linguistic Heritage | Build open-source Voice AI and LLMs for regional languages and underrepresented dialects |
| 3 | Promotion of Ethical Standards | Create benchmarks and explainable frameworks that protect human rights and privacy |
| 4 | Relief of the Digital Divide | Provide free/subsidized AI educational tools to marginalized communities |
| 5 | Education & Vocational Training | Organize workshops and digital skills training for youth and the general public |
| 6 | Dissemination of Public Knowledge | Maintain open repositories of research findings and datasets as a common good |
| 7 | Global Social Collaboration | Foster an inclusive ecosystem for researchers and social workers to solve humanitarian challenges |
We focus on real research contributions, not API wrappers. Our current and planned projects include:
A systematic evaluation framework for Voice AI systems on Indian languages: standardized test sets, evaluation metrics (word error rate (WER), mean opinion score (MOS), and latency), and baseline results, all open-sourced.
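
To make the WER metric concrete, here is a minimal sketch of how it could be computed. It assumes whitespace tokenization and Unicode NFC normalization, which may not suit every Indic script or transcription convention; the function name `word_error_rate` is hypothetical and is not taken from any CogniHuman repository.

```python
import unicodedata


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: words are separated by whitespace after NFC normalization.
    """
    ref = unicodedata.normalize("NFC", reference).split()
    hyp = unicodedata.normalize("NFC", hypothesis).split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over 6 reference words.
print(f"{word_error_rate('the cat sat on the mat', 'the cat sit on mat'):.3f}")  # prints 0.333
```

WER can exceed 1.0 when the hypothesis contains many inserted words, so a benchmark report would usually state the tokenization and normalization rules alongside the score.
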
High-quality, ethically sourced speech datasets for underrepresented regional languages and dialects, with professional documentation and consent protocols.
- Fine-tuning existing speech models (Moshi, Mimi, Whisper) for low-resource languages
- Optimization research for CPU-only Voice AI pipelines
- Parameter-efficient adaptation techniques (LoRA, QLoRA) for Indic languages
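
As a rough illustration of why low-rank adaptation suits low-resource settings, the sketch below counts the trainable parameters LoRA adds to a single linear layer. LoRA freezes the original weight matrix and trains two small factors, `A` (rank x d_in) and `B` (d_out x rank). The layer size and rank used here are illustrative assumptions, not values from Moshi, Mimi, Whisper, or any CogniHuman experiment, and the function names are hypothetical.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the LoRA factors A (rank x d_in) and B (d_out x rank)."""
    if rank <= 0 or rank > min(d_in, d_out):
        raise ValueError("rank must be between 1 and min(d_in, d_out)")
    return rank * (d_in + d_out)


# Illustrative values only: a 1024 x 1024 projection adapted with rank 8.
d_in, d_out, rank = 1024, 1024, 8
full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
print(f"full fine-tuning: {full} parameters")
print(f"LoRA (rank {rank}): {lora} parameters ({lora / full:.4%} of full)")
```

For this example layer, the adapter trains roughly 1.6% of the parameters that full fine-tuning would update. QLoRA follows the same idea while also keeping the frozen base weights in a quantized format to reduce memory use.
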
We are not a wrapper project.
We distinguish ourselves by:
- Publishing reproducible methodologies — not just demos
- Creating novel datasets — with ethical consent and annotation standards
- Open-sourcing weights and code — not just calling APIs
- Rigorous benchmarking — not "it works on my test"
If it doesn't advance open science, we don't build it.
We welcome volunteers, researchers, linguists, and developers who share our mission.
- Data Collection — Help record or transcribe speech for underrepresented dialects
- Model Evaluation — Run benchmarks and document results
- Fine-Tuning — Adapt existing models for new languages
- Documentation — Improve our methodology papers and guides
- Outreach — Connect us with communities, donors, or CSR partners
- Read our Code of Conduct
- Check open issues on our repositories
- Join our community discussions (coming soon)
- Reach out at inquiries@cognihuman.org
| Asset | License |
|---|---|
| Code | Apache 2.0 |
| Model Weights | Apache 2.0 or MIT |
| Datasets | CC BY-SA 4.0 |
| Documentation | CC BY 4.0 |
All our work remains open and freely usable for academic and social progress.
- Incorporation: Section 8 Company (Not-for-Profit), Government of India
- Tax Exemptions: 12A & 80G registered
- Government Recognition: NITI Aayog (Darpan) registered
- CSR Eligibility: CSR-1 registered
- Commencement: INC-20A filed
| Purpose | Email |
|---|---|
| General Inquiries | inquiries@cognihuman.org |
| Research Collaborations | research@cognihuman.org |
| CSR / Partnerships | partners@cognihuman.org |
CogniHuman exists because of the open research community. We stand on the shoulders of Kyutai, Hugging Face, EleutherAI, and every researcher who believes AI should be a common good.
Built with purpose. Open for collaboration.
Last updated: April 2026