Metrics extraction
------------------

Overview
~~~~~~~~

This package extracts commonly used metrics from annotations produced by LENA or other pipelines.

child-project metrics --help

The list of supported metrics is shown below:

.. csv-table::
   :header: "Variable", "Description", "Pipelines"

   "voc_fem/mal/och_ph", "number of vocalizations by different talker types per hour", "ACLEW, LENA, Period"
   "voc_dur_fem/mal/och_ph", "total duration of vocalizations by different talker types in seconds per hour", "ACLEW, LENA, Period"
   "avg_voc_dur_fem/mal/och", "average vocalization length (conceptually akin to MLU) by different talker types", "ACLEW, LENA, Period"
   "wc_adu_ph", "adult word count (collapsing across males and females)", "ACLEW, LENA"
   "wc_fem/mal_ph", "adult word count by different talker types", "ACLEW, LENA"
   "sc_adu_ph", "adult syllable count (collapsing across males and females)", "ACLEW"
   "sc_fem/mal_ph", "adult syllable count by different talker types", "ACLEW"
   "pc_adu_ph", "adult phoneme count (collapsing across males and females)", "ACLEW"
   "pc_fem/mal_ph", "adult phoneme count by different talker types", "ACLEW"
   "freq_n", "frequency of child vocs out of all vocs, based on number of vocalizations", "ACLEW, LENA"
   "freq_dur", "frequency of child vocs out of all vocs, based on duration of vocalizations", "ACLEW, LENA"
   "cry_voc_chi_ph", "number of child vocs that are crying", "ACLEW, LENA"
   "can_voc_chi_ph", "number of child vocs that are canonical", "ACLEW"
   "non_can_voc_chi_ph", "number of child vocs that are non-canonical", "ACLEW"
   "sp_voc_chi_ph", "number of child vocs that are speech-like (can+noncan for ACLEW)", "ACLEW, LENA"
   "cry_voc_dur_chi_ph", "total duration of child vocs that are crying", "ACLEW, LENA"
   "can_voc_dur_chi_ph", "total duration of child vocs that are canonical", "ACLEW"
   "non_can_voc_dur_chi_ph", "total duration of child vocs that are non-canonical", "ACLEW"
   "sp_voc_dur_chi_ph", "total duration of child vocs that are speech-like (can+noncan for ACLEW)", "ACLEW, LENA"
   "avg_cry_voc_dur_chi", "average duration of child vocs that are crying", "ACLEW, LENA"
   "avg_can_voc_dur_chi", "average duration of child vocs that are canonical", "ACLEW"
   "avg_non_can_voc_dur_chi", "average duration of child vocs that are non-canonical", "ACLEW"
   "avg_sp_voc_dur_chi", "average duration of child vocs that are speech-like (can+noncan for ACLEW)", "ACLEW, LENA"
   "lp_n", "linguistic proportion = speech / (cry + speech), based on number of vocalizations", "ACLEW, LENA"
   "cp_n", "canonical proportion = canonical / (can + noncan), based on number of vocalizations", "ACLEW"
   "lp_dur", "linguistic proportion = speech / (cry + speech), based on duration of vocalizations", "ACLEW, LENA"
   "cp_dur", "canonical proportion = canonical / (can + noncan), based on duration of vocalizations", "ACLEW"
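
To illustrate how the proportion metrics relate to the underlying counts, here is a minimal sketch that computes ``lp_n`` and ``cp_n`` from made-up numbers. The function names, the input counts and the handling of an empty denominator are assumptions for illustration only; they are not taken from the package's code.

```python
def linguistic_proportion(speech, cry):
    """lp = speech / (cry + speech). Returns None if there are no vocalizations
    (assumed behaviour, chosen here only to avoid a division by zero)."""
    total = cry + speech
    return speech / total if total else None


def canonical_proportion(canonical, non_canonical):
    """cp = canonical / (can + noncan). Returns None on an empty denominator."""
    total = canonical + non_canonical
    return canonical / total if total else None


# Hypothetical counts of child vocalizations in one recording
can, non_can, cry = 30, 90, 40
speech = can + non_can  # speech-like = can + noncan (ACLEW case)

lp_n = linguistic_proportion(speech, cry)  # 120 / (40 + 120)
cp_n = canonical_proportion(can, non_can)  # 30 / (30 + 90)
print(lp_n, cp_n)
```

The duration-based variants (``lp_dur``, ``cp_dur``) follow the same formulas, with summed durations in place of counts.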

.. note::

   Average rates are expressed in counts/hour (for events) or in seconds/hour (for durations).
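
As a sketch of what a per-hour rate means in practice, the example below normalizes a raw count and a raw duration by the amount of annotated audio. It assumes that the denominator is the annotated duration in milliseconds; the helper ``per_hour`` and all values are hypothetical and not part of the package.

```python
MS_PER_HOUR = 3_600_000


def per_hour(total, annotated_ms):
    """Scale a raw total (count or seconds) to a rate per hour of annotated audio."""
    return total * MS_PER_HOUR / annotated_ms


# Hypothetical values for a single recording
voc_count = 45            # number of vocalizations (events)
voc_dur_s = 90.0          # total vocalization duration, in seconds
annotated_ms = 1_800_000  # 30 minutes of annotated audio

voc_ph = per_hour(voc_count, annotated_ms)      # counts/hour
voc_dur_ph = per_hour(voc_dur_s, annotated_ms)  # seconds/hour
print(voc_ph, voc_dur_ph)
```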

LENA Metrics
~~~~~~~~~~~~

child-project metrics /path/to/dataset output.csv lena --help

ACLEW Metrics
~~~~~~~~~~~~~

child-project metrics /path/to/dataset output.csv aclew --help

Period-aggregated metrics

The Period Metrics pipeline aggregates vocalizations within time-of-day units whose length is set by a user-specified period. For instance, if the period is set to 15Min (i.e. 15 minutes), vocalization rates are reported for each recording and each time unit (e.g. 09:00 to 09:15, 09:15 to 09:30, etc.).

The output dataframe has r × p rows, where r is the number of recordings (or children if the --by option is set to child_id), and p is the number of time bins per day (i.e. 24 × 4 = 96 for a 15-minute period).

The output dataframe includes a period column that contains the onset of each time bin in HH:MM:SS format. The duration column contains the total duration of the annotations covering each time bin, in milliseconds.
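
The bin arithmetic can be checked with a short standard-library sketch. It lists the onsets of the time bins for a given period, formatted as HH:MM:SS. The function ``period_onsets`` is a hypothetical helper written for this illustration, and it assumes that bins start at midnight and that the period divides 24 hours evenly.

```python
from datetime import timedelta


def period_onsets(period_minutes):
    """Return the HH:MM:SS onset of every time bin in a 24-hour day."""
    period = timedelta(minutes=period_minutes)
    n_bins = timedelta(hours=24) // period  # e.g. 96 for a 15-minute period
    onsets = []
    for i in range(n_bins):
        seconds = int((i * period).total_seconds())
        hours, rest = divmod(seconds, 3600)
        minutes, secs = divmod(rest, 60)
        onsets.append(f"{hours:02d}:{minutes:02d}:{secs:02d}")
    return onsets


onsets = period_onsets(15)
n_recordings = 3  # hypothetical number of recordings (r)
n_rows = n_recordings * len(onsets)  # r × p rows in the output
print(len(onsets), onsets[36], onsets[37], n_rows)
```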

If --by is set to, e.g., child_id, the values for each time bin are the average rates across all the recordings of each child.
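
The grouping described above can be pictured with the following sketch, which averages a per-recording rate over all recordings of the same child, separately for each time bin. It assumes a plain unweighted mean; the package may weight recordings differently. The recording names, child identifiers and rate values are made up.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-recording rows, one per (recording, time bin)
rows = [
    {"recording": "rec1.wav", "child_id": "A", "period": "09:00:00", "voc_fem_ph": 100.0},
    {"recording": "rec2.wav", "child_id": "A", "period": "09:00:00", "voc_fem_ph": 200.0},
    {"recording": "rec1.wav", "child_id": "A", "period": "09:15:00", "voc_fem_ph": 50.0},
    {"recording": "rec3.wav", "child_id": "B", "period": "09:00:00", "voc_fem_ph": 80.0},
]

# Collect the rates of every (child, time bin) pair across recordings
grouped = defaultdict(list)
for row in rows:
    grouped[(row["child_id"], row["period"])].append(row["voc_fem_ph"])

# Unweighted mean per child and time bin (assumption, see above)
by_child = {key: mean(values) for key, values in grouped.items()}
print(by_child)
```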

child-project metrics /path/to/dataset output.csv period --help

.. note::

   Average rates are expressed in seconds/hour regardless of the period.