# Rhythm

In [1]:
# !pip install -r ../requirements.txt
import sys
sys.path.append('../')
from generative_formalism import *

In [2]:
txt = """
From fairest creatures we desire increase,
That thereby beauty’s rose might never die,
But as the riper should by time decease,
His tender heir might bear his memory;
But thou, contracted to thine own bright eyes,
Feed’st thy light’s flame with self-substantial fuel,
Making a famine where abundance lies,
Thyself thy foe, to thy sweet self too cruel.
Thou that art now the world’s fresh ornament
And only herald to the gaudy spring,
Within thine own bud buriest thy content,
And, tender churl, mak’st waste in niggarding.
   Pity the world, or else this glutton be,
   To eat the world’s due, by the grave and thee.
"""

In [3]:
documentation(parse_text)

parses = parse_text(txt)
parses.to_html()

##### `parse_text`

```md
Parse poem text using Prosodic for meter and stress analysis.

    Uses the Prosodic library to analyze the metrical structure of the input text,
    parsing at the line level with single-threaded processing.

    Parameters
    ----------
    txt : str
        The poem text to parse for metrical analysis.

    Returns
    -------
    prosodic.Text
        Parsed Prosodic Text object containing meter and stress information.

    Calls
    -----
    - prosodic.Text(txt).parse(parse_unit='line', num_proc=1).best
    
```
----



In [4]:
documentation(get_parses_for_txt)
parses_df = get_parses_for_txt(txt)
parses_df

##### `get_parses_for_txt`

```md
Get prosodic parses for a text, with caching and optional postprocessing.

    Retrieves cached prosodic parse data if available and not forced to regenerate.
    If no cached data exists or force=True, parses the text using Prosodic and caches
    the result. Optionally postprocesses the parse data into rhythm measurements.

    Parameters
    ----------
    txt : str
        The poem text to parse.
    stash : HashStash, default=STASH_RHYTHM
        Cache storage for parsed data.
    force : bool, default=False
        If True, re-parse even if cached data exists.
    postprocess : bool, default=False
        If True, apply postprocessing to extract rhythm measurements.

    Returns
    -------
    pd.DataFrame or dict
        Raw parse DataFrame if postprocess=False, or processed rhythm measurements
        dict if postprocess=True. Returns empty DataFrame if parsing fails.

    Calls
    -----
    - parse_text(txt) [if no cached data or force=True]
    - postprocess_parses_data(odf) [if postprocess=True]
    
```
----
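The cache-then-parse behaviour described in the docstring can be sketched as follows. This is a minimal illustration, not the library's implementation: the in-memory dict stands in for `HashStash`, and `text_key`, `cached_parse`, and `toy_parse` are hypothetical names introduced only for this example.

```python
import hashlib

# Hypothetical stand-in for the HashStash cache: a plain in-memory dict.
_cache = {}

def text_key(txt):
    """Stable cache key for a text (an assumption; the real key scheme is not shown)."""
    return hashlib.sha256(txt.strip().encode("utf-8")).hexdigest()

def cached_parse(txt, parse_fn, force=False, postprocess_fn=None):
    """Return parse_fn(txt) from the cache, re-parsing when missing or forced."""
    key = text_key(txt)
    if force or key not in _cache:
        _cache[key] = parse_fn(txt)
    result = _cache[key]
    # Postprocessing is applied on the way out, so the raw parse stays cached.
    return postprocess_fn(result) if postprocess_fn is not None else result

calls = []

def toy_parse(txt):
    """Toy parser that records each call so cache hits are observable."""
    calls.append(txt)
    return txt.split()

cached_parse("From fairest creatures", toy_parse)              # parsed
cached_parse("From fairest creatures", toy_parse)              # served from cache
cached_parse("From fairest creatures", toy_parse, force=True)  # parsed again
print(len(calls))
```

Caching the raw parse rather than the postprocessed result means that a change to the postprocessing step would not require re-parsing, which is the expensive part.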


stanza_num,line_num,line_txt,linepart_num,parse_rank,parse_txt,parse_meter,parse_stress,sent_num,sentpart_num,parse_score,parse_num_viols,parse_ambig,parse_is_bounded,parse_num_sylls,parse_num_words,*w_peak,*w_stress,*s_unstress,*unres_across,...,*total_sylls,*total,*w_peak_norm,*w_stress_norm,*s_unstress_norm,*unres_across_norm,*unres_within_norm,*foot_size_norm,*total_sylls_norm,*total_norm
1,1,"From fairest creatures we desire increase,",1,1,from FAI rest CREA tures WE de SIRE in CREASE,-+-+-+-+-+,-+-+---+-+,1,1,1.0,1,1,0,10,6,0.0,0.0,1.0,0.0,...,1,1,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.1
1,2,"That thereby beauty's rose might never die,",2,1,that THE reby BEA uty's ROSE might NE ver DIE,-+-+-+-+-+,-+++-+-+-+,1,2,1.0,1,1,0,10,7,0.0,1.0,0.0,0.0,...,1,1,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.1
1,3,"But as the riper should by time decease,",3,1,but AS the RIPER should.by TIME de CEASE,-+-+--+-+,---+--+-+,1,3,1.0,1,2,0,9,8,0.0,0.0,1.0,0.0,...,1,1,0.0,0.0,0.111111,0.0,0.0,0.0,0.111111,0.111111
1,4,His tender heir might bear his memory;,4,1,his TEN der HEIR might BEAR his ME mo RY,-+-+-+-+-+,-+-+-+-+--,1,4,1.0,1,3,0,10,7,0.0,0.0,1.0,0.0,...,1,1,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.1
1,5,"But thou, contracted to thine own bright eyes,",5,1,but.thou CON trac TED to.thine OWN bright EYES,--+-+--+-+,--+----+++,1,5,2.0,2,5,0,10,8,0.0,1.0,1.0,0.0,...,2,2,0.0,0.1,0.1,0.0,0.0,0.0,0.2,0.2
1,6,"Feed'st thy light's flame with self-substantial fuel,",7,1,FEED'ST thy LIGHT'S.FLAME with SELF subs TAN tial FU el,+-++-+-+-+-,+-++-+-+-+-,1,7,1.0,1,5,0,11,8,0.0,0.0,0.0,1.0,...,1,1,0.0,0.0,0.0,0.090909,0.0,0.0,0.090909,0.090909
1,7,"Making a famine where abundance lies,",8,1,MA king.a FA mine WHERE a BUN dance LIES,+--+-+-+-+,+--+-+-+-+,1,8,1.0,1,4,0,10,6,0.0,0.0,0.0,1.0,...,1,1,0.0,0.0,0.0,0.1,0.0,0.0,0.1,0.1
1,8,"Thyself thy foe, to thy sweet self too cruel.",9,1,thy SELF thy FOE to.thy SWEET self TOO cruel,-+-+--+-+-,-+-+--++++,1,9,2.0,2,7,0,10,9,0.0,2.0,0.0,0.0,...,2,2,0.0,0.2,0.0,0.0,0.0,0.0,0.2,0.2
1,9,Thou that art now the world's fresh ornament,11,1,thou THAT art NOW the WORLD'S fresh OR na MENT,-+-+-+-+-+,-+-+-+++--,2,11,2.0,2,2,0,10,8,0.0,1.0,1.0,0.0,...,2,2,0.0,0.1,0.1,0.0,0.0,0.0,0.2,0.2
1,10,"And only herald to the gaudy spring,",12,1,and ON ly HE rald TO the GAU dy SPRING,-+-+-+-+-+,-+-+---+-+,2,11,1.0,1,2,0,10,7,0.0,0.0,1.0,0.0,...,1,1,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.1
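To make the `*w_stress` and `*s_unstress` columns concrete, the sketch below compares a `parse_meter` string with a `parse_stress` string position by position. It is a simplified approximation: Prosodic's actual constraint definitions may be more refined, and `count_violations` is a hypothetical helper, not part of either library. The naive comparison does agree with the rows shown above.

```python
def count_violations(meter, stress):
    """Naively count stress/meter mismatches for one line.

    meter  : metrical template, '+' = strong position, '-' = weak position
    stress : lexical stress,    '+' = stressed syllable, '-' = unstressed
    """
    if len(meter) != len(stress):
        raise ValueError("meter and stress must have the same length")
    pairs = list(zip(meter, stress))
    # Stressed syllable in a weak position.
    w_stress = sum(m == "-" and s == "+" for m, s in pairs)
    # Unstressed syllable in a strong position.
    s_unstress = sum(m == "+" and s == "-" for m, s in pairs)
    n_sylls = len(meter)
    return {
        "w_stress": w_stress,
        "s_unstress": s_unstress,
        # The *_norm columns appear to be counts divided by syllable count.
        "w_stress_norm": w_stress / n_sylls,
        "s_unstress_norm": s_unstress / n_sylls,
    }

# Line 5: "But thou, contracted to thine own bright eyes,"
line5 = count_violations("--+-+--+-+", "--+----+++")
# Line 3: "But as the riper should by time decease,"
line3 = count_violations("-+-+--+-+", "---+--+-+")
print(line5)
print(line3)
```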


In [5]:
documentation(get_rhythm_for_txt)

rhythm_data = get_rhythm_for_txt(txt)
assert rhythm_data == postprocess_parses_data(parses_df)
rhythm_data

##### `get_rhythm_for_txt`

```md
Get rhythm measurements for a single poem text.

    Convenience function that retrieves processed rhythm measurements for a text,
    equivalent to calling get_parses_for_txt with postprocess=True.

    Parameters
    ----------
    txt : str
        The poem text to analyze.
    **kwargs
        Additional keyword arguments passed to get_parses_for_txt.

    Returns
    -------
    dict
        Dictionary of rhythm measurements, or empty dict if parsing fails.

    Calls
    -----
    - get_parses_for_txt(txt, postprocess=True, **kwargs)
    
```
----


{'is_iambic_pentameter': 0.35714285714285715,
 'is_unambigously_iambic_pentameter': 0.14285714285714285,
 'syll01_stress': 0.21428571428571427,
 'syll02_stress': 0.6428571428571429,
 'syll03_stress': 0.21428571428571427,
 'syll04_stress': 0.9285714285714286,
 'syll05_stress': 0.21428571428571427,
 'syll06_stress': 0.5714285714285714,
 'syll07_stress': 0.21428571428571427,
 'syll08_stress': 0.8571428571428571,
 'syll09_stress': 0.21428571428571427,
 'syll10_stress': 0.75,
 'forth_syllable_stressed': 0.9285714285714286,
 'perc_ww_in_meter': 0.06153846153846154}
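One plausible reading of these measurements is that each `syllNN_stress` value is the share of lines stressed at position NN, and `is_iambic_pentameter` is the share of lines whose meter is `-+-+-+-+-+`. The sketch below computes both under that assumption; it is not the code of `postprocess_parses_data`, and `positional_stress_rates` and `share_matching` are hypothetical helpers. It uses only the ten lines visible in the parse table, so its numbers will not equal the fourteen-line values above.

```python
# parse_stress and parse_meter for the ten lines shown in the parse table.
stresses = [
    "-+-+---+-+", "-+++-+-+-+", "---+--+-+", "-+-+-+-+--", "--+----+++",
    "+-++-+-+-+-", "+--+-+-+-+", "-+-+--++++", "-+-+-+++--", "-+-+---+-+",
]
meters = [
    "-+-+-+-+-+", "-+-+-+-+-+", "-+-+--+-+", "-+-+-+-+-+", "--+-+--+-+",
    "+-++-+-+-+-", "+--+-+-+-+", "-+-+--+-+-", "-+-+-+-+-+", "-+-+-+-+-+",
]

def positional_stress_rates(patterns, n_positions=10):
    """Share of lines marked '+' at each syllable position.

    Lines shorter than a given position are left out of that position's
    denominator (an assumption about how short lines are handled).
    """
    rates = {}
    for i in range(n_positions):
        marks = [p[i] for p in patterns if len(p) > i]
        if marks:
            rates[f"syll{i + 1:02d}_stress"] = sum(m == "+" for m in marks) / len(marks)
    return rates

def share_matching(patterns, template="-+-+-+-+-+"):
    """Share of lines whose pattern equals the template exactly."""
    return sum(p == template for p in patterns) / len(patterns)

rates = positional_stress_rates(stresses)
iambic_share = share_matching(meters)
print(rates)
print(iambic_share)
```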

In [6]:
documentation(get_rhythm_for_shakespeare_sonnets)
df_shak_rhythm = get_rhythm_for_shakespeare_sonnets()
df_shak_rhythm

##### `get_rhythm_for_shakespeare_sonnets`

```md
Load and analyze rhythm in Shakespeare's sonnets.

    Reads the complete text of Shakespeare's sonnets, splits into individual poems,
    and computes rhythm measurements for each sonnet.

    Parameters
    ----------
    force : bool, default=False
        If True, re-parse sonnets even if cached data exists.

    Returns
    -------
    pd.DataFrame
        DataFrame with rhythm measurements for each of the 154 sonnets,
        indexed by sonnet ID (e.g., 'shakespeare_sonnet_001').

    Calls
    -----
    - get_id_hash(s) [for each sonnet text]
    - get_rhythm_for_txt(txt, force=force) [for each sonnet]
    
```
----


* Getting rhythm for shakespeare sonnets: 100%|██████████| 154/154 [00:00<00:00, 229.19it/s]


id,id_hash,txt,is_iambic_pentameter,is_unambigously_iambic_pentameter,syll01_stress,syll02_stress,syll03_stress,syll04_stress,syll05_stress,syll06_stress,syll07_stress,syll08_stress,syll09_stress,syll10_stress,forth_syllable_stressed,perc_ww_in_meter
shakespeare_sonnet_001,707167,"FROM fairest creatures we desire increase,\nTh...",0.357143,0.142857,0.214286,0.642857,0.214286,0.928571,0.214286,0.500000,0.285714,0.785714,0.285714,0.692308,0.928571,0.061538
shakespeare_sonnet_002,911590,"When forty winters shall beseige thy brow,\nAn...",0.714286,0.428571,0.071429,0.714286,0.214286,0.928571,0.214286,0.714286,0.285714,0.928571,0.071429,0.846154,0.928571,0.037879
shakespeare_sonnet_003,369626,"Look in thy glass, and tell the face thou view...",0.357143,0.071429,0.285714,0.642857,0.071429,0.785714,0.000000,0.785714,0.071429,0.642857,0.214286,0.692308,0.785714,0.060150
shakespeare_sonnet_004,891098,"Unthrifty loveliness, why dost thou spend\nUpo...",0.500000,0.214286,0.285714,0.571429,0.214286,0.857143,0.071429,0.785714,0.071429,0.500000,0.142857,0.833333,0.857143,0.070866
shakespeare_sonnet_005,644683,"Those hours, that with gentle work did frame\n...",0.500000,0.285714,0.357143,0.642857,0.071429,0.785714,0.142857,0.785714,0.071429,0.714286,0.285714,0.714286,0.785714,0.059701
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
shakespeare_sonnet_150,876499,"O, from what power hast thou this powerful mig...",0.642857,0.142857,0.000000,0.642857,0.000000,0.714286,0.071429,0.785714,0.142857,0.714286,0.214286,0.785714,0.714286,0.036765
shakespeare_sonnet_151,169495,Love is too young to know what conscience is;\...,0.428571,0.142857,0.214286,0.571429,0.214286,0.714286,0.214286,0.785714,0.357143,0.714286,0.142857,0.714286,0.714286,0.062500
shakespeare_sonnet_152,962496,"In loving thee thou know'st I am forsworn,\nBu...",0.357143,0.214286,0.071429,0.571429,0.071429,0.928571,0.357143,0.857143,0.071429,0.571429,0.357143,0.928571,0.928571,0.036232
shakespeare_sonnet_153,775847,"Cupid laid by his brand, and fell asleep:\nA m...",0.571429,0.285714,0.071429,0.785714,0.428571,0.785714,0.142857,0.785714,0.214286,0.785714,0.230769,1.000000,0.785714,0.030303
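A quick way to read a table like this is to compare average stress at odd and even syllable positions, since iambic verse should stress the even ones more. The sketch below does this with pandas for the first two sonnets, with values copied from the rows above. The `summary` frame and its column names are illustrative, not part of the library.

```python
import pandas as pd

syll_cols = [f"syll{i:02d}_stress" for i in range(1, 11)]

# Per-position stress rates copied from the first two rows of the table.
rows = {
    "shakespeare_sonnet_001": [0.214286, 0.642857, 0.214286, 0.928571, 0.214286,
                               0.500000, 0.285714, 0.785714, 0.285714, 0.692308],
    "shakespeare_sonnet_002": [0.071429, 0.714286, 0.214286, 0.928571, 0.214286,
                               0.714286, 0.285714, 0.928571, 0.071429, 0.846154],
}
df = pd.DataFrame.from_dict(rows, orient="index", columns=syll_cols)

# Positions 1, 3, 5, 7, 9 are weak in an iambic line; 2, 4, 6, 8, 10 are strong.
odd_mean = df[syll_cols[0::2]].mean(axis=1)
even_mean = df[syll_cols[1::2]].mean(axis=1)

summary = pd.DataFrame({
    "odd_mean": odd_mean,
    "even_mean": even_mean,
    "contrast": even_mean - odd_mean,  # larger = more strongly alternating
})
print(summary.round(3))
```

With the full `df_shak_rhythm` frame, the same column selection should apply to all 154 sonnets, assuming the column names match those shown in the table.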


In [7]:
documentation(get_rhythm_for_sample)

df_smpl = get_chadwyck_corpus_sampled_by('sonnet_period')
df_smpl_rhythm = get_rhythm_for_sample(df_smpl)

##### `get_rhythm_for_sample`

```md
Extract rhythm measurements for a sample of poems.

    Computes rhythm measurements (meter, stress patterns, etc.) for each poem
    in the sample, returning a DataFrame with one row per poem.

    Parameters
    ----------
    df_smpl : pd.DataFrame
        DataFrame containing poem texts in a 'txt' column, indexed by poem IDs.
    stash : HashStash, default=STASH_RHYTHM
        Cache storage for parsed data.
    force : bool, default=False
        If True, re-parse even if cached data exists.
    gen : bool, default=True
        If True, generate new parses; if False, only use cached data.
    verbose : bool, default=DEFAULT_VERBOSE
        If True, show progress information.
    **kwargs
        Additional keyword arguments (unused).

    Returns
    -------
    pd.DataFrame
        DataFrame with rhythm measurements, indexed by poem ID, or empty
        DataFrame if no valid measurements found.

    Calls
    -----
    - _clean_df(df_smpl)
    - get_rhythm_for_txt(txt, stash=stash, force=force, postprocess=True) [if gen=True]
    - postprocess_parses_data(stash.get(txt)) [if gen=False]
    
```
----


* Getting rhythm for sample: 100%|██████████| 999/999 [00:05<00:00, 175.67it/s]
