The parallel treebank of speeches by Narendra Modi, Prime Minister of India

unipv-larl/IndiaPM-Treebank

The parallel treebank of speeches by Narendra Modi, PM of India

This is the website of the project "A parallel treebank of speeches by Narendra Modi, Prime Minister of India".

On this website, you can find the latest updates on the project and everything you need to know to become part of it.

The Project

Overview

The project aims to construct a parallel treebank of speeches delivered by Narendra Modi, the Prime Minister of India, in several languages of India. The project is in its initial phase, which covers speeches in Hindi, Marathi, Telugu, and English; additional languages will be introduced in subsequent phases. The motivation is to create a linguistic resource that spans multiple languages spoken in India. By providing syntactic and morphological annotations, the treebank facilitates the analysis of cross-lingual similarities and differences in morphology, syntax, and lexicon. It thus aims to support linguistic research, offering insights into language variation and usage in the context of Prime Minister Modi's speeches.

Why Modi's Speeches?

The uniqueness of Modi's speeches lies in their availability on the Indian government website, where hundreds of speeches are transcribed and translated into multiple official languages of India. This project addresses the scarcity of freely accessible parallel treebanks, especially for Indian languages. While other parallel sources exist, Modi's speeches offer distinct advantages:

  • Contemporary Nature: Unlike traditional parallel sources such as the Bible, Modi's speeches are contemporary.
  • Natural Language: In contrast to legal texts, the speeches exhibit a more natural language, enabling the study of trends in contemporary standard Hindi across various linguistic levels.

Workflow

The project involves downloading transcriptions of Prime Minister Modi's speeches, which are publicly accessible on the Indian government website (PM India). The transcriptions are available in all official languages of India. A team of annotators adds syntactic and morphological annotation to the collected data, and the annotated texts are then aligned to ensure comparability across languages.
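As a rough illustration of the annotate-then-align idea, the following Python sketch stores annotated sentences and groups them across languages. It is a minimal sketch, not the project's actual pipeline or data format: the `Token` and `Sentence` classes, their field names, the `speech-001` identifier, and the assumption that sentences align one-to-one by position within a speech are all hypothetical. The example tokens are hand-written and not taken from the treebank.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One annotated token. Field names are illustrative, not the project's schema."""
    form: str                                    # surface form as it appears in the speech
    lemma: str                                   # dictionary form
    upos: str                                    # part-of-speech tag
    feats: dict = field(default_factory=dict)    # morphological features
    head: int = 0                                # 1-based index of the syntactic head, 0 = root
    deprel: str = "root"                         # dependency relation to the head


@dataclass
class Sentence:
    speech_id: str    # hypothetical identifier of the speech
    sent_index: int   # position of the sentence within the speech
    lang: str         # language code, e.g. "hi", "mr", "te", "en"
    tokens: list


def align_by_position(sentences):
    """Group sentences that share (speech_id, sent_index) across languages.

    Assumes a one-to-one positional correspondence between translations,
    which real parallel texts often violate.
    """
    aligned = {}
    for sentence in sentences:
        key = (sentence.speech_id, sentence.sent_index)
        aligned.setdefault(key, {})[sentence.lang] = sentence
    return aligned


# Hand-written illustrative data, not drawn from the treebank.
en = Sentence("speech-001", 0, "en",
              [Token("Friends", "friend", "NOUN", {"Number": "Plur"})])
hi = Sentence("speech-001", 0, "hi",
              [Token("साथियों", "साथी", "NOUN", {"Number": "Plur"})])

pairs = align_by_position([en, hi])
print(sorted(pairs[("speech-001", 0)]))  # ['en', 'hi']
```

Keying the alignment on a shared sentence identifier keeps each language's annotation independent, so further languages can be added without touching existing entries. A real alignment step would likely need to handle sentences that are merged or split in translation.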

Join the Project

If you are an expert in or a native speaker of any of the official languages of India and are interested in contributing to the project, please let us know by filling out this form.

Stay tuned for updates as we progress through the various phases of this exciting multilingual treebank initiative.

Contacts

For any inquiries, please contact:
