# Tutorial on discovering repertoire traditions with PyCantus

In this tutorial we demonstrate possible ways to study the repertoire structure of Gregorian chant with the [PyCantus library](https://github.com/dact-chant/PyCantus/tree/main) on the [CantusCorpus v1.0](https://github.com/DvorakovaA/CantusCorpus/tree/main/cantuscorpus_1.0) dataset.

For that, we are going to use the Python `networkx` library.

In [1]:
import pycantus
from pycantus import data

## Load dataset

In [10]:
cantuscorpus = data.load_dataset("cantuscorpus_v1.0", load_editable=True, 
                                  create_missing_sources=True)
print(f'Number of chants in corpus before any processing: {len(cantuscorpus.chants)}')
print(f'Number of sources in corpus before any processing: {len(cantuscorpus.sources)}')

Loading chants and sources...
Creating missing sources...
0 missing sources created!
Data loaded!
Number of chants in corpus before any processing: 888010
Number of sources in corpus before any processing: 2278


## Filter chants
1. We do not want to work with fragments of sources. 

2. We want to work with antiphons and responsories.

3. We drop sources that are left empty by this filtering.

In [11]:
# 1. We do not want to work with fragments of sources.
# We approximate this by dropping sources with fewer than 150 chants.
cantuscorpus.drop_small_sources_data(min_chants=150)
print(f'Number of chants after dropping sources with less than 150 chants: {len(cantuscorpus.chants)}')
print(f'Number of sources after dropping sources with less than 150 chants: {len(cantuscorpus.sources)}')

Number of chants after dropping sources with less than 150 chants: 860792
Number of sources after dropping sources with less than 150 chants: 473


In [12]:
# 2. We want to work with antiphons and responsories.
# So we construct a Filter with these parameters.
from pycantus.filtration import Filter
a_r_filter = Filter('a_r_filter')
a_r_filter.add_value_include('genre', ['A', 'R']) # A = antiphon, R = responsory
cantuscorpus.apply_filter(a_r_filter)
print(f'Number of chants after filtering only antiphons and responsories: {len(cantuscorpus.chants)}')

Number of chants after filtering only antiphons and responsories: 382578


In [13]:
# 3. Drop sources that are empty after this filtration.
cantuscorpus.drop_empty_sources()
print(f'Number of sources after dropping empty sources: {len(cantuscorpus.sources)}')

Number of sources after dropping empty sources: 427
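
To make the logic of the three filtering steps explicit, here is a minimal plain-Python sketch on toy data. It is not how PyCantus implements these operations; the records, the `filter_chants` helper and the threshold of 3 are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy records: (source_id, genre) pairs standing in for chants.
toy_chants = [
    ("src_a", "A"), ("src_a", "R"), ("src_a", "H"),
    ("src_b", "H"), ("src_b", "H"), ("src_b", "H"),
    ("src_c", "A"),
]


def filter_chants(chants, min_chants, genres):
    """Apply the three steps in order and return (kept chants, kept sources)."""
    # 1. Keep only chants from sources with at least `min_chants` chants.
    sizes = Counter(source for source, _ in chants)
    kept = [(s, g) for s, g in chants if sizes[s] >= min_chants]
    # 2. Keep only chants of the requested genres.
    kept = [(s, g) for s, g in kept if g in genres]
    # 3. Keep only sources that still have at least one chant.
    sources = sorted({s for s, _ in kept})
    return kept, sources


# The tutorial uses min_chants=150; 3 is enough for the toy data.
kept, sources = filter_chants(toy_chants, min_chants=3, genres={"A", "R"})
print(kept)
print(sources)
```

In this toy example `src_c` is removed in step 1 as a fragment, while `src_b` passes step 1 but loses all its chants in step 2, which is why step 3 is still needed.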


## Feasts selection
For a more meaningful search, we want to look at only a subset of the repertoire in each search. A good way to divide the mass of chants is by the feasts they are prescribed for.  
  
As an example, we pick the tradition of Marian feast days. 
That includes 

In [None]:
feast_filter = Filter('feast_filter')
feast_filter.add_value_include('feast', ['Purificatio Mariae'])

## Community detection

In [None]:
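# louvain_communities lives in the community subpackage, not in the top-level namespace
from networkx.algorithms.community import louvain_communities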
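
One possible way to use Louvain community detection here is sketched below, under stated assumptions: each source is a graph node, and two sources are linked by an edge weighted with the Jaccard similarity of their sets of chant identifiers. The `repertoires` dictionary, its identifiers and the `jaccard` helper are hypothetical toy data, not values from CantusCorpus, and the choice of Jaccard similarity as the edge weight is an assumption rather than the method prescribed by this tutorial.

```python
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import louvain_communities

# Hypothetical toy data: source id -> set of chant identifiers it contains.
repertoires = {
    "src_1": {"001", "002", "003", "004"},
    "src_2": {"001", "002", "003", "005"},
    "src_3": {"001", "002", "004", "005"},
    "src_4": {"101", "102", "103", "104"},
    "src_5": {"101", "102", "103", "105"},
    "src_6": {"101", "102", "104", "105"},
}


def jaccard(a, b):
    """Jaccard similarity of two non-empty sets: |a & b| / |a | b|."""
    return len(a & b) / len(a | b)


# Build a weighted graph: one node per source, one edge per pair of
# sources that share at least one chant.
graph = nx.Graph()
graph.add_nodes_from(repertoires)
for u, v in combinations(repertoires, 2):
    weight = jaccard(repertoires[u], repertoires[v])
    if weight > 0:
        graph.add_edge(u, v, weight=weight)

# A fixed seed makes the randomized algorithm reproducible.
communities = louvain_communities(graph, weight="weight", seed=42)
for community in sorted(communities, key=min):
    print(sorted(community))
```

With real data, the repertoire sets would be built from the filtered chants of the selected feasts, and each detected community would be a candidate group of sources sharing a repertoire tradition.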