# Tune sources

In [None]:
from pyabc2.sources import the_session, norbeck

## Norbeck

In [None]:
tunes = norbeck.load("jigs")
tunes[0]

## The Session

In [None]:
tune = the_session.load_url("https://thesession.org/tunes/21799#setting43712")
tune

In [None]:
tune.print_measures()

The Session data archive (<https://github.com/adactio/TheSession-data>) is a wealth of data,
which we can use in other ways besides parsing tunes into {class}`~pyabc2.Tune` objects.

In [None]:
%%time

df = the_session.load_meta("tunes", convert_dtypes=True)
df

In [None]:
df.info()

For example, we can look for the most common ABC notes in the corpus.

In [None]:
from pyabc2.note import _RE_NOTE as rx

rx
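When a pattern contains several capture groups, `re.findall` returns one tuple of group strings per match rather than the matched text, so the pieces have to be joined back together before counting. The stdlib-only sketch below illustrates this with `TOY_NOTE`, a simplified, hypothetical pattern written for this example (it is not pyabc2's `_RE_NOTE`), and a made-up ABC fragment.

```python
import re
from collections import Counter

# Hypothetical, simplified note pattern (NOT pyabc2's _RE_NOTE):
# accidentals, note letter, octave marks, duration.
TOY_NOTE = re.compile(r"([_=^]*)([A-Ga-g])([,']*)(\d*/*\d*)")

abc_fragment = "A2 BA dA ^F/A"  # made-up example input

# With multiple groups, findall yields one tuple of strings per match.
matches = TOY_NOTE.findall(abc_fragment)

# Joining each tuple reconstructs the full note spec, e.g. ('^', 'F', '', '/') -> '^F/'.
tokens = ["".join(groups) for groups in matches]

counts = Counter(tokens)
print(tokens)
print(counts.most_common(3))
```
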

In [None]:
%%time

note_counts = (
    df.abc
    .str.findall(rx)
    .explode()
    .str.join("")
    .value_counts()
)
note_counts

In [None]:
note_counts[:10]

👆 We can see that `A` is the leader, being a prominent scale degree in many of the common keys:
* 5th in Dmaj
* 2nd in Gmaj
* 1st (the tonic) in Ador, Amin, Amix, Amaj
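
The numbers in the list above are scale degrees: the position of `A` when counting letter names up from the tonic of each key. A minimal sketch of that arithmetic, using a hypothetical helper `scale_degree` that looks only at letter names (accidentals and mode do not change the degree number):

```python
LETTERS = "CDEFGAB"

def scale_degree(note: str, tonic: str) -> int:
    """Return the 1-based scale degree of `note` counted from `tonic` (letter names only)."""
    return (LETTERS.index(note.upper()) - LETTERS.index(tonic.upper())) % 7 + 1

# The key names from the list; the first character is the tonic letter.
keys = ["Dmaj", "Gmaj", "Ador", "Amin", "Amix", "Amaj"]
degrees = {key: scale_degree("A", key[0]) for key in keys}

for key, degree in degrees.items():
    print(f"A is degree {degree} in {key}")
```
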

In [None]:
from textwrap import wrap

print("\n".join(wrap("  ".join(note_counts[note_counts == 1].index))))

👆 A variety of ABC note specs appear only once. Many of these have unusual durations or accidentals.

What if we ignore everything except the natural note name?
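
As a stdlib-only illustration of that reduction, the sketch below collapses a handful of note specs onto their letter, keeping case so that the octave implied by upper versus lower case is preserved. The specs and counts in `spec_counts` are made up for the example, not taken from the corpus.

```python
import re
from collections import Counter

# Made-up note specs and counts, for illustration only.
spec_counts = {"A": 3, "A2": 2, "^a/": 1, "_B,": 1, "b'3": 1}

nat_counts = Counter()
for spec, n in spec_counts.items():
    # The first letter in a-g / A-G is the natural note name; case is kept.
    match = re.search(r"[a-gA-G]", spec)
    if match:
        nat_counts[match.group()] += n

print(nat_counts)
```
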

In [None]:
nat_cased_counts = (
    note_counts
    .rename_axis("note")
    .reset_index(name="count")  # name both columns explicitly; value_counts() naming differs across pandas versions
    .assign(nat=lambda df: df.note.str.extract(r"([a-gA-G])", expand=False))
    .groupby("nat")
    .aggregate({"count": "sum"})["count"]
    .sort_values(ascending=False)
)
nat_cased_counts

👆 `A` is still our leader, but otherwise things have shifted a bit.
The letter `C`, which generally implies a pitch below the range of most whistles and flutes,
has the lowest count.
Although `b` is inside that range, many tunes never reach it.
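
To make the range argument concrete, here is a small sketch that maps bare ABC letters to MIDI note numbers. In ABC, uppercase `C` is middle C and lowercase letters sit one octave higher. The helper `abc_letter_to_midi` is hypothetical, and treating `D` as the lowest playable note (as on a D whistle) is an assumption of this example.

```python
# Semitone offsets of the natural notes above C.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def abc_letter_to_midi(letter: str) -> int:
    """MIDI number for a bare ABC letter: uppercase starts at middle C (60), lowercase is an octave up."""
    midi = 60 + SEMITONES[letter.upper()]
    return midi + 12 if letter.islower() else midi

# Letters in arbitrary order, sorted from lowest to highest pitch.
by_pitch = sorted("AbCdEfGaBcDeFg", key=abc_letter_to_midi)

# Assumption: the instrument's lowest note is ABC `D`.
lowest = abc_letter_to_midi("D")
below_range = [x for x in by_pitch if abc_letter_to_midi(x) < lowest]

print("".join(by_pitch))
print(below_range)
```
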

In [None]:
from pyabc2 import Note

(
    nat_cased_counts
    .to_frame()
    .assign(value=lambda df: df.index.map(lambda x: Note.from_abc(x).value))
    .sort_values("value")["count"]
    .plot.bar(
        xlabel="ABC letters\n(accidentals, octave indicators, and context in key ignored)",
        rot=0,
        ylabel="Count",
        title="ABC prevalence in The Session",
    )
);