Munazarat 1.0 Corpus

Welcome to the Munazarat 1.0 Corpus repository! It contains a collection of competitive Arabic debates, transcribed and organized for research and analysis. The sections below explain how to access and use the corpus.

About Munazarat 1.0 Corpus

The Munazarat 1.0 Corpus is a resource for researchers working on Arabic competitive debating, Arabic linguistics, argumentation studies, education, and Arabic Natural Language Processing (NLP). It consists of approximately 50 hours of transcribed competitive debates hosted by QatarDebate, covering university-level and school-level debates held between 2013 and 2023.

Accessing the Corpus

You can download the Munazarat 1.0 Corpus as a ZIP file containing 73 debate files in TXT format from the following link: Download Corpus
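Once the ZIP file is downloaded, the transcripts can be read directly from the archive without unpacking it. The following is a minimal sketch using only the Python standard library. The function name `read_transcripts`, the archive path you pass in, and the UTF-8 encoding are assumptions, not details confirmed by this repository; adjust them if the files are stored differently.

```python
import zipfile
from pathlib import PurePosixPath


def read_transcripts(zip_path, encoding="utf-8"):
    """Return {file name without extension: text} for every .txt member.

    Assumes the transcripts are UTF-8 encoded; pass another encoding if not.
    """
    transcripts = {}
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Skip folders and any non-TXT members of the archive.
            if member.lower().endswith(".txt"):
                stem = PurePosixPath(member).stem
                transcripts[stem] = archive.read(member).decode(encoding)
    return transcripts
```

Calling `read_transcripts` with the path of the downloaded archive returns a dictionary keyed by the descriptive file names explained in the next section.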

File Naming Convention

Each TXT file in the corpus is named descriptively to provide essential information about the debate, including:

Serial number
Tournament name
Year
Gender of speakers
Whether the speakers are native or non-native Arabic speakers

For example, a file named 028-IUDC-2017-MFMFMF-AA represents the debate with serial number 028, from the 2017 International Universities Debating Championship (IUDC), featuring three male speakers on the proposition team and three female speakers on the opposition team, all of whom are Arabic speakers.
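The naming convention can be split into its fields programmatically. This is a sketch, assuming every file name has exactly five hyphen-separated fields in the order listed above and that tournament names contain no hyphens. The names `DebateFile` and `parse_debate_filename` are illustrative and not part of the corpus.

```python
from typing import NamedTuple


class DebateFile(NamedTuple):
    serial: str        # e.g. "028" (kept as text to preserve leading zeros)
    tournament: str    # e.g. "IUDC"
    year: int          # e.g. 2017
    genders: str       # one letter per speaker, e.g. "MFMFMF"
    speaker_code: str  # native / non-native code, e.g. "AA"


def parse_debate_filename(name):
    """Split a corpus file name into its five fields."""
    stem = name[:-4] if name.lower().endswith(".txt") else name
    parts = stem.split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 hyphen-separated fields, got {len(parts)}: {name!r}")
    serial, tournament, year, genders, speaker_code = parts
    return DebateFile(serial, tournament, int(year), genders, speaker_code)


info = parse_debate_filename("028-IUDC-2017-MFMFMF-AA.txt")
print(info.tournament, info.year, info.genders.count("F"))  # IUDC 2017 3
```

The parser raises `ValueError` for names that do not fit the five-field pattern, so unexpected files are flagged rather than silently misread.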

Metadata Information

Along with the debate transcript files, we provide a detailed Excel sheet with metadata for each debate. The metadata includes the tournament, university or school level, debate motion, proposition and opposition teams, the number of male and female debaters, word count, YouTube link, the winning team, and the debate topic genre (e.g., Politics, Economy, Human Rights, Law). Researchers can use this metadata for analysis and to filter debates based on specific criteria.
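Filtering on the metadata could look like the sketch below, which assumes the Excel sheet has been exported to CSV. The column names (`Serial`, `Level`, `Genre`) and the three rows are placeholders made up for illustration; they are not taken from the actual sheet, so replace them with the real column headers.

```python
import csv
import io


def filter_debates(rows, **criteria):
    """Keep rows whose columns equal every given value (case-insensitive)."""
    def matches(row):
        return all(
            str(row.get(column, "")).strip().lower() == str(value).strip().lower()
            for column, value in criteria.items()
        )
    return [row for row in rows if matches(row)]


# Placeholder rows standing in for a CSV export of the sheet; not real corpus data.
sample_csv = io.StringIO(
    "Serial,Level,Genre\n"
    "001,University,Politics\n"
    "002,School,Economy\n"
    "003,University,Economy\n"
)
rows = list(csv.DictReader(sample_csv))
selected = filter_debates(rows, Level="University", Genre="Economy")
print([row["Serial"] for row in selected])  # ['003']
```

With the real export, the same function can be applied to rows read from a file opened with `open(path, newline="", encoding="utf-8")`.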

Potential Research Applications

The Munazarat 1.0 Corpus can be utilized for various research applications, including:

Annotation of argument schemes in speeches using tools like UBIAI.
Sentiment analysis on the corpus using tools such as Repustate.
Linguistic analysis through tools like AntConc.
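For a quick first look before turning to dedicated tools, a simple word-frequency list can be built in a few lines. This is a rough sketch and not a replacement for the tools named above: it assumes that stripping diacritics and tatweel and then matching runs of basic Arabic letters is an acceptable tokenization, and it performs no normalization or stemming. The names `word_frequencies`, `DIACRITICS`, and `ARABIC_WORD` are illustrative.

```python
import re
from collections import Counter

# Short vowel marks (U+064B to U+0652) and tatweel (U+0640) are removed first.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# A "word" here is a run of basic Arabic letters (U+0621 to U+064A).
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


def word_frequencies(text):
    """Count Arabic word forms in a transcript (no stemming or normalization)."""
    cleaned = DIACRITICS.sub("", text)
    return Counter(ARABIC_WORD.findall(cleaned))
```

Calling `word_frequencies(text).most_common(20)` on a transcript gives its twenty most frequent word forms.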

Citation

If you use the Munazarat 1.0 Corpus in your research, please cite it using the following format:

