
Low-ResourceDialectology/DialectMapping




DialectMapping

Towards a Complete Interactive Mapping of Kurdish Dialectology
Explore the docs »

View Demo (TODO) · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project


This will be an interactive map of dialects of low-resource languages, starting with and focusing on Kurdish.

There is a plethora of Kurdish dialects, and native speakers of one often cannot understand speakers of another. Nevertheless, Kurdish is considered a dialect continuum, with sometimes sudden language shifts and sometimes smooth transitions from one dialect to the neighboring one. At first it may seem confusing and counterintuitive that Central Kurdish has been found to be very understandable to speakers of Northern Kurdish, and Southern Kurdish to be understandable to speakers of Central Kurdish, while speakers of Northern Kurdish can hardly understand Southern Kurdish. The dialect continuum mentioned above explains this: "Central Kurdish" is not one uniform variety, since it comprises many different subdialects.


Visualizing the different aspects and the geographical distribution of the individual dialects on an interactive map might enable new insights and, at the very least, make Kurdish linguistics easier to understand.

(back to top)

Poster presented at the ICKL-6 in 2023

The International Conference on Kurdish Linguistics (ICKL) is a biannual conference serving as a forum of scientific exchange for linguists working on any aspect of Kurdish, including the interactions with its neighboring languages.

The poster as PDF

Built With

List of major frameworks/libraries used to bootstrap this project.

  • R


Getting Started

Prerequisites

TODO

Installation

TODO: More Details

  • Install R
  • Set the library directory for R (the path below is an example; adjust it to your own checkout)
    echo 'export R_LIBS_USER="/media/CrazyProjects/LowResDialectology/DialectMapping/DialectMapping/R"' >> ~/.bashrc
    source ~/.bashrc
  • Install packages (via setup script)
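
The `echo ... >> ~/.bashrc` step above appends a new line every time it is run. As a minimal sketch of an alternative, the helper below creates the library directory and adds the `R_LIBS_USER` export only if it is not already present. The function name `set_r_libs_user` and the demo paths are hypothetical and are not part of this repository; only the `R_LIBS_USER` variable itself comes from R.

```shell
#!/usr/bin/env bash
# Sketch only: set_r_libs_user is a hypothetical helper, not part of this repo.
# Usage: set_r_libs_user LIB_DIR PROFILE_FILE
set_r_libs_user() {
  local lib_dir="$1" profile="$2"
  local line="export R_LIBS_USER=\"$lib_dir\""
  # Make sure the library directory and the profile file exist.
  mkdir -p "$lib_dir"
  touch "$profile"
  # grep -qxF: quiet, whole-line, fixed-string match; append only if absent.
  if ! grep -qxF "$line" "$profile"; then
    printf '%s\n' "$line" >> "$profile"
  fi
}

# Demo against a throwaway profile file instead of the real ~/.bashrc.
demo_dir="$(mktemp -d)"
set_r_libs_user "$demo_dir/R" "$demo_dir/bashrc"
set_r_libs_user "$demo_dir/R" "$demo_dir/bashrc"   # second call adds nothing
```

For real use, one would call it with the desired library directory and `~/.bashrc` as arguments, then run `source ~/.bashrc` as in the step above.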


Usage

Example: German

    1. We assume German is spoken in Germany and possibly in the surrounding countries, so we look them up on Google Maps (or, more old-school, in a real atlas) and write the country names into the DialectMapping/data/info/german_countries.txt file, one name per line.
    2. We run the bash script that converts our list of country names into country codes used by GADM and executable R code.
       bash country_names2codes.sh
  • Found countries are written into: ./data/inter/german_countries_codes.txt
  • Missing countries are written into: ./data/inter/german_countries_not_found.txt
  • (TODO: Link with DialectOntology to automate this part)
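
To illustrate the idea behind this step, here is a minimal sketch of a name-to-code conversion. It is not the actual country_names2codes.sh: the helper name `names2codes`, the tiny hand-written lookup table, and the demo file names are assumptions made for this example. The codes are ISO 3166-1 alpha-3 codes, which is the format GADM uses to identify countries.

```shell
#!/usr/bin/env bash
# Sketch only: names2codes is a hypothetical helper with a small sample
# lookup table; it is not the repository's country_names2codes.sh.
# Usage: names2codes INPUT_FILE FOUND_FILE MISSING_FILE
names2codes() {
  local input="$1" found="$2" missing="$3"
  local name code
  # Start with empty output files.
  : > "$found"
  : > "$missing"
  # Read one country name per line (also handles a last line without newline).
  while IFS= read -r name || [ -n "$name" ]; do
    # Trim surrounding whitespace and skip empty lines.
    name="$(printf '%s' "$name" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')"
    if [ -z "$name" ]; then
      continue
    fi
    # Sample mapping to ISO 3166-1 alpha-3 codes.
    case "$name" in
      Germany)       code="DEU" ;;
      Austria)       code="AUT" ;;
      Switzerland)   code="CHE" ;;
      Luxembourg)    code="LUX" ;;
      Liechtenstein) code="LIE" ;;
      Belgium)       code="BEL" ;;
      *)             code="" ;;
    esac
    if [ -n "$code" ]; then
      printf '%s\n' "$code" >> "$found"      # known country: write its code
    else
      printf '%s\n' "$name" >> "$missing"    # unknown country: keep the name
    fi
  done < "$input"
}

# Demo in a temporary directory with one name the table does not know.
tmp="$(mktemp -d)"
printf 'Germany\nAustria\nAtlantis\n' > "$tmp/german_countries.txt"
names2codes "$tmp/german_countries.txt" "$tmp/codes.txt" "$tmp/not_found.txt"
```

In the demo, the codes file ends up with DEU and AUT, and the not-found file with Atlantis, mirroring the split between found and missing countries described above.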

For more examples, please refer to the Documentation.


Roadmap

  • Set up this Repository
  • Section 1: Dialect Information
    • Classification/Typology
      • (partially) Language Family
      • (partially) (Sub-)Dialect(-Group)
      • Linguistic Information
    • (Meta) Data
      • (partially) Areas where spoken
      • Native Speaker
      • (partially) Link to Publications
  • Section 2: Geolocation Data
    • Data Source(s)
      • Comfortable Access
      • Up to date
      • (partially) Proper Format
    • Preprocess and Aggregate
      • Normalize Naming
      • Link GeoData to Dialect Data
  • Section 3: Interactive Map
    • Map Functionalities
      • (partially) R-Scripts and Packages
      • Accessible Visualisations
      • Stress-Testing
    • Online Hosting
      • Server Setup
      • Stress-Testing

See the open issues (TODO: add link) for a full list of proposed features and known issues.


Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request


License

Distributed under the Apache License. See LICENSE.txt for more information.


Contact

Christian Schuler - christianschuler8989(4T)gmail.com

Raman Ahmad - raman.ahmad2022(4T)gmail.com

Related project of the authors: Analysis of Phonology and Morphology in the Kobani Dialect


Acknowledgments

A list of helpful resources we would like to give credit to:

  • Best-README-Template
  • GADM, which provides maps and spatial data for all countries and their subdivisions. The maps can be browsed online, or the data can be downloaded to make your own maps.

