moh72y/Munazarat1.0

Munazarat 1.0 Corpus

Welcome to the Munazarat 1.0 Corpus repository! It contains a collection of competitive Arabic debates, transcribed and organized for research and analysis. The sections below explain how to access and use the corpus.

About Munazarat 1.0 Corpus

The Munazarat 1.0 Corpus is a unique resource for researchers interested in various aspects of Arabic competitive debating, Arabic linguistics studies, argumentation studies, education, and Arabic Natural Language Processing (NLP). It consists of approximately 50 hours of transcribed competitive debates, hosted by QatarDebate, covering university and school-level debates held between 2013 and 2023.

Accessing the Corpus

You can download the Munazarat 1.0 Corpus as a ZIP file containing 73 debate files in TXT format from the following link: Download Corpus
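Once the ZIP file is downloaded, a quick sanity check is to list its TXT members and compare the count with the 73 files mentioned above. The following is a minimal sketch using only the Python standard library; the function name `list_transcripts` and the archive name `Munazarat1.0.zip` are illustrative assumptions, not names defined by this repository.

```python
import zipfile


def list_transcripts(archive):
    """Return the sorted names of all .txt members in a ZIP archive.

    `archive` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name for name in zf.namelist() if name.lower().endswith(".txt")
        )


if __name__ == "__main__":
    # Hypothetical local file name; adjust to wherever the ZIP was saved.
    names = list_transcripts("Munazarat1.0.zip")
    print(f"{len(names)} transcript files found")
```

If the printed count differs from 73, the download may be incomplete or the archive layout may differ from what this sketch assumes.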

File Naming Convention

Each TXT file in the corpus is named descriptively to provide essential information about the debate, including:

  • Serial number
  • Tournament name
  • Year
  • Gender of speakers
  • Whether the speakers are native or non-native Arabic speakers

For example, the file 028-IUDC-2017-MFMFMF-AA is debate number 028, from the 2017 International Universities Debating Championship (IUDC). It features three male speakers on the proposition team and three female speakers on the opposition team, which suggests that the gender code MFMFMF lists the six speakers in speaking order, alternating between proposition and opposition. The suffix AA indicates that all of them are Arabic speakers.
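The naming convention can be split into its fields programmatically. The sketch below is inferred from the single example file name above, so it makes several assumptions: fields are separated by hyphens, the tournament name itself contains no hyphen, the serial number has three digits, and the gender code uses only the letters M and F. The names `DebateFile` and `parse_debate_filename` are hypothetical, and files that deviate from the example pattern are rejected with an error.

```python
import re
from typing import NamedTuple


class DebateFile(NamedTuple):
    serial: str        # e.g. "028" (kept as text to preserve leading zeros)
    tournament: str    # e.g. "IUDC"
    year: int          # e.g. 2017
    genders: str       # e.g. "MFMFMF"
    nativeness: str    # e.g. "AA"


# Pattern inferred from one example name; other files may not match it.
_NAME_PATTERN = re.compile(r"^(\d{3})-([^-]+)-(\d{4})-([MF]+)-([A-Za-z]+)$")


def parse_debate_filename(name: str) -> DebateFile:
    """Split a corpus file name into its descriptive fields."""
    # Drop an optional .txt extension before matching.
    stem = name[:-4] if name.lower().endswith(".txt") else name
    match = _NAME_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name format: {name!r}")
    serial, tournament, year, genders, nativeness = match.groups()
    return DebateFile(serial, tournament, int(year), genders, nativeness)


if __name__ == "__main__":
    print(parse_debate_filename("028-IUDC-2017-MFMFMF-AA.txt"))
```

Parsed fields like these can be used to group or filter transcripts by tournament or year without opening the metadata sheet.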

Metadata Information

Along with the debate transcript files, we provide a detailed Excel sheet with metadata for each debate. The metadata includes the tournament, university or school level, debate motion, proposition and opposition teams, the number of male and female debaters, word count, YouTube link, the winning team, and the debate topic genre (e.g., Politics, Economy, Human Rights, Law). Researchers can use this metadata to filter debates by specific criteria and for other analytical purposes.

Potential Research Applications

The Munazarat 1.0 Corpus can be utilized for various research applications, including:

  1. Annotation of argument schemes in speeches using tools like UBIAI.
  2. Sentiment analysis on the corpus using tools such as Repustate.
  3. Linguistic analysis through tools like AntConc.

Citation

If you use the Munazarat 1.0 Corpus in your research, please cite it using the following format:
