<a href="https://colab.research.google.com/github/EA-Digifolk/EA-Digifolk-Dataset/blob/main/EADigifolk.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# EA-Digifolk Explorer



Links:

* [EA-Digifolk dataset](https://github.com/EA-Digifolk/EA-Digifolk-Dataset.git)
* [Extract Features from MEI Parser](https://github.com/EA-Digifolk/MEIParser_features)
* [Presentation](https://)

## Setup

This section downloads the [EA-Digifolk dataset](https://github.com/EA-Digifolk/EA-Digifolk-Dataset.git) and the [Parser](https://github.com/EA-Digifolk/MEIParser_features) that extracts features from MEI files, installs the libraries the parser requires, and installs [MuseScore](https://musescore.org) for displaying the musical scores.

In [1]:
%%capture
#@title Download the EA-Digifolk Dataset from Github
%cd /content

import os
if os.path.exists('EA-Digifolk-Dataset'):
  !git -C EA-Digifolk-Dataset pull
else:
  !git clone https://github.com/EA-Digifolk/EA-Digifolk-Dataset.git
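
Once the clone finishes, a quick count of MEI files per collection folder can confirm that the download worked. This is a minimal sketch that assumes the `Spanish` and `Mexican` folder names used later in this notebook; `count_mei_files` is a hypothetical helper, not part of the dataset repository.

```python
from pathlib import Path


def count_mei_files(dataset_root, folders=("Spanish", "Mexican")):
    """Return {folder_name: number of .mei files} for each collection folder."""
    root = Path(dataset_root)
    counts = {}
    for folder in folders:
        folder_path = root / folder
        # A missing folder is reported as zero files instead of raising.
        if folder_path.is_dir():
            counts[folder] = len(list(folder_path.glob("*.mei")))
        else:
            counts[folder] = 0
    return counts


# Relative path as created by the clone cell above.
print(count_mei_files("EA-Digifolk-Dataset"))
```

A count of zero for every folder usually means the working directory is not the one the repository was cloned into.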

In [7]:
%%capture
%cd /content
#@title Download the MEI Parser

import os
if os.path.exists('MEIParser_features'):
  !git -C MEIParser_features pull
else:
  !git clone https://github.com/EA-Digifolk/MEIParser_features


!pip install -r MEIParser_features/requirements.txt -q

import sys
if '/content/MEIParser_features' not in sys.path:
  sys.path.append('/content/MEIParser_features')
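
To confirm that the parser can actually be imported after the path is registered, the two steps can be combined in one idempotent check. This is a sketch using only the standard library; `ensure_importable` is a hypothetical name introduced for illustration.

```python
import importlib
import importlib.util
import sys


def ensure_importable(module_name, directory):
    """Add `directory` to sys.path once and report whether the module can be found."""
    if directory not in sys.path:
        sys.path.append(directory)
    # Refresh the import system's caches in case the directory was just created.
    importlib.invalidate_caches()
    return importlib.util.find_spec(module_name) is not None


# Module name and path as used elsewhere in this notebook.
print(ensure_importable('parser_mei_features', '/content/MEIParser_features'))
```

Because the directory is only appended when missing, re-running the cell does not add duplicate entries to `sys.path`.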

In [3]:
%%capture
#@title Install Musescore
!apt-get update -q && apt-get install -y -q musescore lilypond
%env QT_QPA_PLATFORM=offscreen

In [4]:
#@title Install Music21 and setup Musescore in the Music21 Environment
!pip install music21 -q

import music21
env = music21.environment.Environment()
env['pdfPath'] = '/usr/bin/musescore'
env['graphicsPath'] = '/usr/bin/musescore'
env['musicxmlPath'] = '/usr/bin/musescore'
env['musescoreDirectPNGPath'] = '/usr/bin/musescore'
env['autoDownload'] = 'allow'
env['warnings'] = 0
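
The settings above assume the installed binary lives at `/usr/bin/musescore`. A small standard-library check can confirm that an executable is on the `PATH` before any score is rendered. This is a sketch: `find_musescore` and the list of candidate binary names are illustrative assumptions, not something the notebook or music21 defines.

```python
import shutil


def find_musescore(candidates=("musescore", "mscore", "musescore3")):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


musescore_path = find_musescore()
if musescore_path is None:
    print("MuseScore not found on PATH; score rendering will likely fail.")
else:
    print(f"MuseScore found at {musescore_path}")
```

If the reported path differs from `/usr/bin/musescore`, the music21 environment keys can be pointed at the reported path instead.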

## Extract features from MEI files

This section processes the dataset: it extracts the features from the MEI files and saves them as a pandas dataframe for easy exploration.

This section is optional, as the saved pandas dataframe is provided in the EA-Digifolk dataset folder by default.

In [None]:
# @title Process Dataset

# Import Libs from Python
import importlib
import glob
from fractions import Fraction
from tqdm import tqdm

# Import External Libs
import music21 as m21
import pandas as pd

# Import Parser
import parser_mei_features
from parser_mei_features import MeiParser

# Spanish files that are skipped during processing
excluded = {f'EA-Digifolk-Dataset/Spanish/{s}' for s in ['ES-1948-AS-FP-006.mei', 'ES-1948-CB-CO-376.mei', 'ES-1948-CB-CO-418.mei', 'ES-1991-CL-KS-147.mei']}

# Collect all MEI files in ascending order, minus the excluded ones
songs = sorted(glob.glob('EA-Digifolk-Dataset/Spanish/*.mei') + glob.glob('EA-Digifolk-Dataset/Mexican/*.mei'))
songs = [so for so in songs if so not in excluded]

errors = []
EADIGIFOLKNT = pd.DataFrame()

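for song in tqdm(songs):
    try:
        mei_parser = MeiParser()
        song_features = mei_parser.parse_mei(song, verbose=False)
        # Songs are appended side by side; the frame is transposed after the loop
        EADIGIFOLKNT = pd.concat([EADIGIFOLKNT, pd.DataFrame.from_dict(song_features)], axis=1)
    except Exception as e:
        # Record the failure and continue so one bad file does not stop the run
        errors.append((song, e))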

print('\n Files with errors:')
for err in errors:
  print(err)

# Transpose Dataframe so songs' IDs are now the index and create country column from ID
EADIGIFOLK = EADIGIFOLKNT.T
EADIGIFOLK.set_index('id', inplace=True)
EADIGIFOLK['country'] = EADIGIFOLK.index.to_series().apply(lambda x: x.split('-')[0])

# Save the dataframe as a gzip-compressed pickle
EADIGIFOLK.to_pickle('EADIGIFOLKT.gzip', compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1})
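
The concatenate-then-transpose step can be seen on toy data. The sketch below assumes each parsed song arrives as a one-column frame whose rows are feature names. That shape, the `n_notes` feature, and both song IDs are invented for illustration and are not taken from the parser's actual output.

```python
import pandas as pd

# Two made-up songs, one column each; rows are feature names (assumed shape).
song_a = pd.DataFrame({0: {'id': 'ES-0000-XX-XX-001', 'n_notes': 42}})
song_b = pd.DataFrame({0: {'id': 'MX-0000-XX-XX-002', 'n_notes': 17}})

# Side-by-side concatenation gives one column per song.
wide = pd.concat([song_a, song_b], axis=1)

# Transposing turns each song into a row, which can then be indexed by ID.
table = wide.T.set_index('id')

# The country code is the part of the ID before the first hyphen.
table['country'] = table.index.to_series().str.split('-').str[0]

print(table)
```

Splitting with the `.str` accessor is equivalent to the `apply(lambda ...)` form used in the cell above; either works for string IDs.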

## Exploring the EA-Digifolk Dataset

This section covers possible ways of exploring the dataset.

In [None]:
#@title Import the saved pandas dataframe

import pandas as pd

EADIGIFOLK = pd.read_pickle("EADIGIFOLKT.gzip", compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1})
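
As one possible starting point for exploration, the `country` column supports simple per-country summaries. The sketch below runs on a stand-in frame: the song IDs and the `n_notes` column are hypothetical, and only `country` is known to exist in the real dataframe.

```python
import pandas as pd

# Stand-in for EADIGIFOLK with invented IDs and one invented feature column.
toy = pd.DataFrame(
    {'country': ['ES', 'ES', 'MX'], 'n_notes': [42, 30, 17]},
    index=pd.Index(
        ['ES-0000-XX-XX-001', 'ES-0000-XX-XX-002', 'MX-0000-XX-XX-003'],
        name='id',
    ),
)

# Number of songs per country.
songs_per_country = toy['country'].value_counts()

# Mean of a numeric feature per country.
mean_notes = toy.groupby('country')['n_notes'].mean()

print(songs_per_country)
print(mean_notes)
```

The same two calls should apply to `EADIGIFOLK` once `n_notes` is replaced with a numeric feature column that the parser actually produces.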