
# Visualizing Poetic Meter in South Asian Languages

## A. Sean Pue
### pue@msu.edu @seanpue (Michigan State University)
## Ahmad Atta
### Govt. Zamindar Post Graduate College
## Rajiv Ranjan
### Michigan State University

##  Overview


* Explaining poetic meter in the modern languages of South Asia is tough even for experienced poets.
* Less familiar readers and listeners have difficulty learning it.
* The trouble: *traditional prosodic systems do not align well with the phonological features of modern South Asian languages*. 
* Modern scholars have offered alternative ways to think of meter.
* We augment that work by presenting software that visualizes poetic meter using directed graphs.
* Working across multiple languages and scripts, it makes poetic knowledge accessible to readers, scholars, and poets.

## Background


### In General

* Poetic prosody in South Asian languages is ***durational*** (based on metrical unit length) rather than ***accentual*** (based on stress)

* Existing tools for graphical exploration of poems, which are mostly designed for English and confine their “prosodic domains” to stress, are insufficient for the task

* In South Asia, there are two competing theories of prosody, one derived from Arabic (*‘uruz* عروض) and one from Sanskrit (*chhanda* छंद)

## Theories of Prosody in South Asia
![South Asia Map](./images/southasia.png)

### Perso-Arabic System
* The Arabic system traces its origins to the *revelation* of the eighth-century Arab prosodist al-Khalil Ibn Ahmad of Basra.
    * Came to South Asia via Persian/Farsi, a previous lingua franca.
    * Basic unit is orthographic: the written Arabic letter.
    * Precise for classical Arabic, but combinatorially explosive in South Asian languages (units form across words).
    

### Sanskrit System
* Defines long or short units based on its abugida (segmental writing system), e.g. शाहरुख़ ख़ान -> शा, ह, रु, ख़, ख़ा, न
* Precise for Sanskrit, but short vowel endings are often dropped in modern languages.   
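
The abugida-based split above can be sketched in a few lines of Python. This is only an illustration, not the method used by the software presented here: it assumes that a unit starts at a letter, absorbs any following combining marks (vowel signs, nukta, virama), and that a letter directly after a virama joins the same unit as a conjunct. The function name `split_aksharas` is hypothetical.

```python
import unicodedata

VIRAMA = "\u094d"  # Devanagari sign virama (halant)


def split_aksharas(text):
    """Split Devanagari text into orthographic units (a rough sketch)."""
    # NFC leaves nukta letters in decomposed form (base letter + nukta)
    text = unicodedata.normalize("NFC", text)
    units = []
    current = ""
    for ch in text:
        category = unicodedata.category(ch)
        is_mark = category in ("Mn", "Mc")
        joins_conjunct = current.endswith(VIRAMA) and category == "Lo"
        if current and (is_mark or joins_conjunct):
            current += ch
        else:
            if current:
                units.append(current)
            # whitespace ends a unit and is not kept
            current = "" if ch.isspace() else ch
    if current:
        units.append(current)
    return units


# The example string from the slide, written with escapes so the
# code points are unambiguous.
EXAMPLE = "\u0936\u093e\u0939\u0930\u0941\u0916\u093c \u0916\u093c\u093e\u0928"
print(", ".join(split_aksharas(EXAMPLE)))
```

The sketch only segments the text; deciding which units count as long or short is a separate step.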

### Problems for Modern Languages
* Despite their aura of authenticity, both systems, especially in their nomenclatures, do not align well with modern languages and appear overly complex.

## Modern Prosody

Modern prosodists have attempted to make prosody more accessible by referring to patterns of long and short metrical units (Pybus 1924, Pritchett and Khaliq 1987, Faruqi 1968, Tabassum 1983, Nagasaki and Kim 2012). These are often represented using macrons, breves, or other symbols.
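
As a small illustration of such a notation, the sketch below turns a string of long and short units into macrons and breves. It assumes the `=` (long) and `-` (short) convention used in the scanner output later in this notebook; the function name and the choice of Unicode characters are assumptions made for the example.

```python
MACRON = "\u00af"  # spacing macron, used here for a long unit
BREVE = "\u02d8"   # spacing breve, used here for a short unit


def to_scansion_marks(scan):
    """Convert a scan string such as '=-=' into macron/breve marks."""
    table = {"=": MACRON, "-": BREVE}
    return " ".join(table[symbol] for symbol in scan)


print(to_scansion_marks("=-==-=-==="))
```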

## Representing Poetic Meter as a Directed Graph

* Representing poetic meter as a walk (sequence of vertices/nodes and edges) through a directed graph (digraph) offers a significant advancement over previous metrical representations.
* Allows South Asian language poetic texts to be represented visually in their poetic meter without the complications of traditional prosody

### Graphical Model
* Assume start and end nodes
* Nodes represent short and long metrical units
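
A minimal sketch of this model, assuming a made-up toy meter in which the second long unit may be replaced by two shorts. The node names, the meter itself, and `find_walk` are hypothetical and are not taken from urdubiometer.

```python
# Node types: '=' long, '-' short, None for the start and end nodes
NODES = {
    "start": None, "n1": "=", "n2": "=",
    "n3": "-", "n4": "-", "n5": "=", "end": None,
}

# Directed edges: after n1, the walk takes either one long (n2)
# or two shorts (n3 then n4) before rejoining at n5
EDGES = {
    "start": ["n1"],
    "n1": ["n2", "n3"],
    "n2": ["n5"],
    "n3": ["n4"],
    "n4": ["n5"],
    "n5": ["end"],
    "end": [],
}


def find_walk(scan, node="start", pos=0):
    """Return the node sequence from start to end that spells `scan`, or None."""
    if node == "end":
        return ["end"] if pos == len(scan) else None
    for nxt in EDGES[node]:
        kind = NODES[nxt]
        if kind is None:
            # the end node consumes no metrical unit
            rest = find_walk(scan, nxt, pos)
        elif pos < len(scan) and scan[pos] == kind:
            rest = find_walk(scan, nxt, pos + 1)
        else:
            rest = None
        if rest is not None:
            return [node] + rest
    return None


print(find_walk("==="))
print(find_walk("=--="))
print(find_walk("=-="))  # not a walk through this toy meter
```

A line scans in the meter exactly when such a walk exists, so metrical flexibility shows up as branching in the graph rather than as extra named categories.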

### Visual Model

![Overview Visualization](./images/Overview.png)



* Long metrical units as rectangles

* Short metrical units as circles

* Metrical feet as clusters (when applicable)

* Uncounted short metrical units as dashed circles (when applicable)

### Advantages

* Resolves issues of metrical flexibility and complexity that, in traditional prosody, lead to excessive categorization (to accommodate Arabic/Sanskrit metrical categories)

* Visualizes the patterns of durational sound that produce meaning for poets and their listeners

## Graphical Representation of Poetic Data

* A Python and JavaScript module (urdubiometer) converts poetic texts into poetic data

* Original text is transliterated, capturing orthography and pronunciation, both of which are necessary for the metrical analysis of South Asian language poetries.
    * Utilizes graphtransliterator, a rules-based approach to transliteration (https://graphtransliterator.readthedocs.org)

* Tokens are categorized and assigned to metrical units, long or short, and each unit is labeled with the specific metrical rule that was applied.

In [82]:
import urdubiometer
urdubiometer.GhazalScanner().scan("saarii mastii sharaab kii sii hai")[0]
# using = for long and - for short

ScanResult(scan='=-==-=-===', matches=[UnitMatch(type='=', rule_found='l_bcv', orig_tokens=[' ', 's', 'aa']), UnitMatch(type='-', rule_found='s_cv<b>', orig_tokens=['r', 'ii']), UnitMatch(type='=', rule_found='l_bcsc', orig_tokens=[' ', 'm', 'a', 's']), UnitMatch(type='=', rule_found='l_cv', orig_tokens=['t', 'ii']), UnitMatch(type='-', rule_found='s_bcs', orig_tokens=[' ', 'sh', 'a']), UnitMatch(type='=', rule_found='l_cv', orig_tokens=['r', 'aa']), UnitMatch(type='-', rule_found='s_c', orig_tokens=['b']), UnitMatch(type='=', rule_found='l_bcv', orig_tokens=[' ', 'k', 'ii']), UnitMatch(type='=', rule_found='l_bcv', orig_tokens=[' ', 's', 'ii']), UnitMatch(type='=', rule_found='l_bcv', orig_tokens=[' ', 'h', 'ai'])], meter_key=14)
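
To make the result above easier to read, the sketch below pairs each metrical unit with the transliterated text it covers. It does not call urdubiometer: `UnitMatch` here is a stand-in `namedtuple` with the same field names as in the output, and the data is copied from that output. The helper `align_units` is hypothetical.

```python
from collections import namedtuple

# Stand-in for the class shown in the output above (not urdubiometer's own)
UnitMatch = namedtuple("UnitMatch", ["type", "rule_found", "orig_tokens"])

# Copied from the scan result above
matches = [
    UnitMatch("=", "l_bcv", [" ", "s", "aa"]),
    UnitMatch("-", "s_cv<b>", ["r", "ii"]),
    UnitMatch("=", "l_bcsc", [" ", "m", "a", "s"]),
    UnitMatch("=", "l_cv", ["t", "ii"]),
    UnitMatch("-", "s_bcs", [" ", "sh", "a"]),
    UnitMatch("=", "l_cv", ["r", "aa"]),
    UnitMatch("-", "s_c", ["b"]),
    UnitMatch("=", "l_bcv", [" ", "k", "ii"]),
    UnitMatch("=", "l_bcv", [" ", "s", "ii"]),
    UnitMatch("=", "l_bcv", [" ", "h", "ai"]),
]


def align_units(unit_matches):
    """Pair each unit type with the text of its tokens (word breaks dropped)."""
    return [(m.type, "".join(m.orig_tokens).strip()) for m in unit_matches]


pairs = align_units(matches)
for unit_type, text in pairs:
    print(unit_type, text)
```

Each pair corresponds to one node on the walk through the meter's graph, which is what the visualization labels.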

## Implementation
Graph layout is done using the Graphviz library (Gansner and North 2000), via the graphviz package (Python), supplemented by Dagre-D3 (JavaScript).

## Discussion

### Advancements 

* Works across the multiple scripts of South Asian reading and listening publics.

* Advances earlier methods of visualizing meter by affording new sorts of interaction, particularly in web-based environments

#### For Scholars
* Directed graphs offer an elegant means of visualizing metrical complexity.
* All of the possible meters of a particular poet can be compared to those of another. 
* The flow through a network also opens new sorts of metrics for comparative analysis.
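
One possible way to make such comparisons concrete is sketched below, under simplifying assumptions: each scanned line is treated as a walk whose nodes are identified by position and unit type, edge traversals are counted as "flow", and two sets of lines are compared by the Jaccard overlap of their edges. The scan strings, function names, and the choice of overlap measure are all illustrative and are not part of the presented software.

```python
from collections import Counter


def walk_edges(scan):
    """Edges of the walk for one scanned line; nodes are position + type."""
    nodes = ["start"] + [f"{i}{sym}" for i, sym in enumerate(scan)] + ["end"]
    return list(zip(nodes, nodes[1:]))


def edge_flow(scans):
    """Count how often each edge is traversed across many scanned lines."""
    flow = Counter()
    for scan in scans:
        flow.update(walk_edges(scan))
    return flow


def edge_overlap(scans_a, scans_b):
    """Jaccard overlap between the edge sets of two collections of lines."""
    edges_a = set(edge_flow(scans_a))
    edges_b = set(edge_flow(scans_b))
    union = edges_a | edges_b
    if not union:
        return 0.0
    return len(edges_a & edges_b) / len(union)


# Hypothetical scan strings standing in for two poets' lines
poet_a = ["=-==", "=-==", "=--="]
poet_b = ["=-==", "==-="]
print(edge_flow(poet_a).most_common(3))
print(edge_overlap(poet_a, poet_b))
```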

#### For Listeners
* A walk through a directed graph can be colored in time with a particular audio or visual recording that has been marked up for phoneme timings as well as metrical units, allowing new sorts of insights that are not easily available in text alone.

* Interactive versions of directed graphs can have instructive qualities for those with various levels of knowledge or intuition about meter

* Listeners can learn the rules of prosodic systems by clicking on nodes that represent poetic data.
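
The time-synchronized coloring described above needs a lookup from playback time to the metrical unit being sounded. A minimal sketch, assuming each unit has been marked up with an onset time in seconds; the timing values and the function name `active_unit` are invented for the example.

```python
from bisect import bisect_right


def active_unit(onsets, t):
    """Index of the unit sounding at time t, or None before the first onset.

    `onsets` is a sorted list with the start time of each metrical unit.
    """
    index = bisect_right(onsets, t) - 1
    return index if index >= 0 else None


# Hypothetical onset times (seconds) for the first four units of a line
onsets = [0.00, 0.42, 0.61, 1.05]
for t in (0.10, 0.50, 1.20):
    print(t, active_unit(onsets, t))
```

An interactive view could call such a lookup on each playback tick and highlight the corresponding node on the walk.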

#### For Poets

* This method offers a visual means for composing verse.

## Conclusion

While based in Urdu and Hindi, the methodologies described can be easily adapted and applied to a large number of South Asian and other languages to provide renewed access to poetry, conceived as data, in the digital age.

## Acknowledgements


