[CR] Indian classical support #1203

Merged
merged 26 commits into from Jul 21, 2020
@bmcfee bmcfee commented Jul 3, 2020

Reference Issue

Resolves #641

What does this implement/fix? Explain your changes.

This PR adds support for Hindustani and Carnatic music.

The core of this implementation is the unit converters:

  • midi_to_svara_h: convert (fractional) midi numbers to Hindustani svara
  • midi_to_svara_c: convert (fractional) midi numbers to Carnatic svara, conditional on a melakarta raga

Wrapper methods are provided to facilitate conversion from Hz (hz_to_svara_*) and western notes (note_to_svara_*).

The function mela_to_svara provides spelling of all 12 chromatic degrees under a given melakarta name or index (0 through 71). This is analogous to key_to_notes for resolving enharmonic equivalences conditional on a given key.

Degree locators are also provided, which provide Sa-relative scale degrees for svara included in the raga (Carnatic) or thaat (Hindustani). Since spelling is unambiguous in Hindustani, no thaat_to_svara method is implemented.
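
As a rough illustration of what the Hindustani converter does, here is a minimal pure-Python sketch. It is not the code from this PR: the function name `midi_to_svara_sketch`, the one-letter abbreviations (lowercase for komal, `M` for tivra Ma), and passing Sa as a MIDI number are all assumptions made for the example.

```python
# Hypothetical sketch, not the implementation in this PR.
# Assumed one-letter Hindustani abbreviations for the 12 chromatic degrees above Sa.
SVARA_H = ["S", "r", "R", "g", "G", "m", "M", "P", "d", "D", "n", "N"]


def midi_to_svara_sketch(midi, sa):
    """Name the svara of a (fractional) MIDI number relative to Sa (also a MIDI number)."""
    # Round to the nearest semitone, then fold into a single octave above Sa.
    degree = int(round(midi - sa)) % 12
    return SVARA_H[degree]


# With Sa at middle C (MIDI 60), a slightly sharp E (64.2) still maps to G.
print([midi_to_svara_sketch(m, sa=60) for m in (60, 62, 64.2, 67, 71)])
```

The Carnatic case needs more than this lookup, because the spelling of a degree depends on the melakarta raga.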

Any other comments?

TODOS:

  • core unit conversion
  • display helpers for cqt (easy) and chroma (less easy)
    • axis decorators (use svara names)
    • tick locators (use _degrees)
  • raga list utilities
  • full documentation
  • unit tests
  • review by an expert

EDIT: updates 2020-07-16:

  1. I suspect it would be helpful to provide a function to enumerate the names of melakarta ragas as we've encoded them here. Right now, we store this as a big dictionary of {name: index}. Should the helper for this pretty-print, or return the structure?

     I opted for the simple version: return a dict (melakarta) or list (thaat). These are copies of our internal data structures, just to be safe.

  2. I'd like some review by an expert on the Carnatic svara-spelling function. I wasn't sure how to handle out-of-gamut svara, and couldn't find any standards for how this is done in practice.

     Still waiting on some confirmation here, but I think the implementation I went with is correct?

  3. Does anyone have translation expertise for anglicizing terms in a consistent way? I'm mostly hacking this together from various sources, and don't speak any languages of the subcontinent, so almost surely some of this is incorrect/unusual.

     Still outstanding.

  4. How should we handle the movable Sa issue for chroma display? Our implementations are set up to make the first dimension correspond to C, which makes comparison across pieces easy. This obviously won't work in a solfege-like representation though, so the question is do we change the feature extractor (so that the first dimension is always Sa) or dynamically adapt the display decorations (so that Sa can move, but the data is always fixed)? I can see arguments for both sides, but don't have a strong preference.

     Resolved in favor of keeping the chroma features fixed in absolute position, and rotating the scale degrees around the placement of Sa.


Aside, in implementing the cqt_svara decorators, I found a small bug in the CQT note placements. This is now fixed, but required regenerating the baseline images for the image regression tests.

@bmcfee added the enhancement and functionality labels Jul 3, 2020
@bmcfee added this to the 0.8.0 milestone Jul 3, 2020
@bmcfee self-assigned this Jul 3, 2020
@bmcfee
Member Author

bmcfee commented Jul 3, 2020

@kaustuvkanti can you check this out when you have a moment, and let me know if there are any glaring errors?

@bmcfee
Member Author

bmcfee commented Jul 3, 2020

Axis decoration for CQT is working now. Here's a demo video using the trumpet demo track to display CQT with first Hindustani, then Carnatic note decorations:

https://www.screencast.com/t/6brFbxNk

The choice of Sa and mela are basically arbitrary here, but show how the functionality would work in general.

@bmcfee
Member Author

bmcfee commented Jul 4, 2020

One more potential issue that just came to mind: is it unreasonable to start counting melakarta at 0 instead of 1? Or are we okay doing it this way?

@kaustuvkanti

@kaustuvkanti can you check this out when you have a moment, and let me know if there are any glaring errors?

Absolutely, will be at it by tomorrow evening.

@bmcfee
Member Author

bmcfee commented Jul 4, 2020

To expand on point 4 above, here are a few ways that I could see chroma viz working.

Option 1: absolute chroma, absolute display

This is (kind of) what we currently do for western chroma. Features always(*) start at C and span the octave. Visualization can assume the first bin is C (octave relations of 261.62), and decorate axes accordingly.

For svara decorations, we could still provide a Sa= parameter, and data would be rendered directly in the plot. The Sa= would determine the starting position of the tick labels, and thaat= or mela= would determine which bins get tick marks, like how we presently support key=.
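
To make the tick placement concrete, here is a minimal sketch under the assumption of a 12-bin chroma whose first bin is C. The helper name `svara_ticks` and its arguments are hypothetical and only illustrate the arithmetic: Sa fixes where the labels start, and the thaat (or mela) degrees pick which bins get a tick.

```python
# Hypothetical helper; names are illustrative and not part of the PR's API.
def svara_ticks(sa_bin, degrees, labels):
    """Return (chroma_bin, label) pairs, sorted by bin, for a 12-bin C-based chroma."""
    # Each scale degree is counted in semitones above Sa and wrapped around the octave.
    ticks = [((sa_bin + d) % 12, lab) for d, lab in zip(degrees, labels)]
    return sorted(ticks)


# Bilaval thaat (the major-scale pattern), with Sa placed on G (bin 7 when bin 0 is C).
bilaval_degrees = [0, 2, 4, 5, 7, 9, 11]
bilaval_labels = ["S", "R", "G", "m", "P", "D", "N"]
print(svara_ticks(7, bilaval_degrees, bilaval_labels))
```

The data stay in their C-based layout; only the labels move.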

Pro:

  • easy to implement
  • data representation (chroma) is stable across tracks

Con:

  • might not be as useful as other options below
  • to work properly, you'd still want to adapt your chroma based on tuning to Sa

Option 2: absolute chroma, adaptive display

Option 2 is much like the above, except at display time, we roll the chroma matrix around so that Sa is at the bottom. This way, we have consistent axis labeling, but the display doesn't directly correspond to the data.

We have some precedent for this kind of thing in the tempogram plots, where we do various inversion and axis scaling tricks to make the plots behave as expected, even if the connection to data is a bit fictionalized. I think that's okay in the tempogram case, because I expect users to view tempograms as approximating continuous (high-dimensional) functions, rather than discrete (low-dimensional) functions, and are therefore less likely to try interpreting the plot literally.

Users will still have to handle intonation up front in the feature extraction step, but I think there's no avoiding that.

Pro:

  • easy to implement
  • data representation (chroma) is stable across tracks
  • visualization is probably what the user expects

Con:

  • visualization is less closely related to underlying data
  • still need to handle intonation

Option 3: adaptive chroma, absolute display

The last option is to extend the chroma feature extractors to support arbitrary base notes. This isn't technically challenging on the internals side, but getting the API correct and consistent across the package (eg, for chroma_cqt/stft/cens, tonnetz, related display and filter constructors, etc) could be difficult.

This would make it so that you can have chroma features in relative pitch, which might be independently useful.

Following that, the display code is easy, as it's the same as option 1.


A compromise solution would be to provide a helper function that takes a C-based chroma and rotates it around as described in option 2. This would keep the chroma features simple, allow for the display to be simple, but it adds a little complexity on the user side because they would have to apply the rotation prior to visualization.
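
A minimal sketch of what such a rotation helper could look like, assuming the chroma is a plain list of 12 rows (one per pitch class, starting at C). The name `rotate_to_sa` is hypothetical.

```python
# Hypothetical helper for the compromise option; not part of the PR.
def rotate_to_sa(chroma, sa_bin):
    """Rotate a C-based chroma (list of 12 rows) so that row 0 corresponds to Sa."""
    if len(chroma) != 12:
        raise ValueError("expected 12 chroma rows")
    # Rows from Sa upward come first; the rows below Sa wrap around to the top.
    return chroma[sa_bin:] + chroma[:sa_bin]


# Toy chroma with one frame per row, where each row stores its own pitch-class index.
chroma = [[pc] for pc in range(12)]
rotated = rotate_to_sa(chroma, sa_bin=7)
print(rotated[0], rotated[-1])
```

For a NumPy array the same rotation is `numpy.roll(chroma, -sa_bin, axis=0)`.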

If we go the compromise route, it might be worth adding western solfege notation to the display decoration as well, since it would basically come for free and amount to a basic string substitution of the existing note decorators. (I'll leave that as a totally separate issue though.)

@bmcfee
Member Author

bmcfee commented Jul 4, 2020

One more aside: the core.time_frequency module is getting a bit unwieldy.

Thinking about refactoring this into core.convert (for the basic unit stuff) and core.theory (for the notational stuff).

@hideodaikoku

3. Does anyone have translation expertise for anglicizing terms in a consistent way?  I'm mostly hacking this together from various sources, and don't speak any languages of the subcontinent, so almost surely some of this is incorrect/unusual.

I can help out with translation, if nobody has opted for that yet. I speak hindi and am familiar with most sanskrit. I know a few hindustani and carnatic musicians who can advise as well.

4. How should we handle the movable Sa issue for chroma display?  Our implementations are set up to make the first dimension correspond to C, which makes comparison across pieces easy.  This obviously won't work in a solfege-like representation though, so the question is do we change the feature extractor (so that the first dimension is always Sa) or dynamically adapt the display decorations (so that Sa can move, but the data is always fixed)?  I can see arguments for both sides, but don't have a strong preference.

I prefer the dynamically adapting display decorations, as an option to the function call, but with the first dimension as Sa as the default parameter.

@bmcfee
Member Author

bmcfee commented Jul 5, 2020

Thanks @hideodaikoku -- after thinking it over, my hunch is that keeping the chroma features fixed for display and only moving the tick decorations is the way to go.

API-wise, specifying sa as a bin offset is a little problematic because it overloads the parameter interpretation. An alternative would be to specify sa as frequency, and then calculate the bin offset relative to C when determining tick positions. We already do this in the tick locator for cqt_svara, so it wouldn't be hard to implement.
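
For reference, the frequency-to-offset calculation is short. This is a sketch under the assumptions of 12-tone equal temperament and A4 = 440 Hz; the function name `sa_to_chroma_bin` is made up for the example and is not the tick locator's actual code.

```python
import math


# Hypothetical helper: map a Sa frequency to its offset (in chroma bins) above C.
def sa_to_chroma_bin(sa_hz, a4_hz=440.0):
    """Return the pitch class (0 = C, ..., 11 = B) nearest to the given Sa frequency."""
    # Standard Hz-to-MIDI conversion; MIDI 69 is A4, and MIDI numbers divisible by 12 are Cs.
    midi = 69.0 + 12.0 * math.log2(sa_hz / a4_hz)
    return int(round(midi)) % 12


print(sa_to_chroma_bin(261.63), sa_to_chroma_bin(392.0))
```

Rounding discards any tuning deviation, which is why intonation still has to be handled in the feature extraction step.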

None of this would rule out having a separate helper function to implement relative pitch rotation, and that way, we could support both without too much work.

Related question: right now there is no default value for Sa, so it must always be provided. Would it make sense to set one (eg middle C), or is that inviting trouble?

@hideodaikoku

Thanks @hideodaikoku -- after thinking it over, my hunch is that keeping the chroma features fixed for display and only moving the tick decorations is the way to go.

Fair! Good call.

Related question: right now there is no default value for Sa, so it must always be provided. Would it make sense to set one (eg middle C), or is that inviting trouble?

So long as it says clearly in the documentation that you are setting this as the default value and allows for an option to override it I think it should be fine. In most texts and keyboards I've seen, Sa and middle C are usually assumed to be equivalent. That being said, I think one should test it out with known ragas to see if it gives strange results.

@bmcfee
Member Author

bmcfee commented Jul 9, 2020

This issue is now the last standing blocker on the 0.8 release (aside from the usual cleanup / documentation audit).

It'd be great if folks who expressed interest in helping out on this could provide feedback ASAP. Your assistance is greatly appreciated!

@hskaushik

hskaushik commented Jul 9, 2020

This issue is now the last standing blocker on the 0.8 release (aside from the usual cleanup / documentation audit).

It'd be great if folks who expressed interest in helping out on this could provide feedback ASAP. Your assistance is greatly appreciated!

Hi @bmcfee: I discovered librosa just a couple of days ago and am still exploring. This is an extraordinary piece of work.

Quick answer to your question "Related question: right now there is no default value for Sa, so it must always be provided. Would it make sense to set one (eg middle C), or is that inviting trouble?"
Is there a possibility to give the option of either "middle C" or "G"? In Carnatic (South Indian) music "middle C" is usually taken as Sa by male and "G" as Sa by female singers. Though other keys can be used as Sa (for example for playing Veena, a string instrument), "middle C" and "G" are most commonly used references for Sa.

Long answer:
Me and my wife have Carnatic musical training (and my wife actually teaches Carnatic music). Between us, we can read/write/speak 6 Indian languages (other than English). Please let me know if I could be of any help in the project.
Thanks for this great project,
Kaushik

@bmcfee
Member Author

bmcfee commented Jul 9, 2020

Is there a possibility to give the option of either "middle C" or "G"? In Carnatic (South Indian) music "middle C" is usually taken as Sa by male and "G" as Sa by female singers. Though other keys can be used as Sa (for example for playing Veena, a string instrument), "middle C" and "G" are most commonly used references for Sa.

I wasn't aware of that, but that's really interesting. Thanks!

We could support a C-or-G kind of setup (eg by a flag option), but I don't particularly like it. We'd need to support an explicit override anyway (eg to support tuning deviation if nothing else), so I'm inclined to leave it as the current "no default Sa" option to force users to be explicit about their intent.

Me and my wife have Carnatic musical training (and my wife actually teaches Carnatic music). Between us, we can read/write/speak 6 Indian languages (other than English). Please let me know if I could be of any help in the project.

That's great. I think the most useful things that you could provide right now would be opinions about 0-based vs 1-based indexing in melakarta, and a quick check of the raga name spellings. I see a lot of divergence on the latter point in various texts that I've consulted, so all input here is helpful.

Thanks for this great project,

Glad you enjoy it!

@hskaushik

That's great. I think the most useful things that you could provide right now would be opinions about 0-based vs 1-based indexing in melakarta, and a quick check of the raga name spellings. I see a lot of divergence on the latter point in various texts that I've consulted, so all input here is helpful.

Would you mind pointing me to the right place so I can start reviewing it right away? Sorry, I am still new to librosa and finding my way around.

@bmcfee
Member Author

bmcfee commented Jul 9, 2020

Would you mind pointing me to the right place so I can start reviewing it right away? Sorry, I am still new to librosa and finding my way around.

If you click the Files Changed tab on this page (the pull request) it should take you to a screen with the proposed changes to the code that implement everything.

@hskaushik hskaushik left a comment

Hi @bmcfee ,

I would prefer 1-based indexing in melakarta over 0-based. Please find my suggestion/edits of raga spellings.

Thank you,
Kaushik

'vachaspati', 'mechakalyani', 'chitrambhari',
'sucharitra', 'jyotiswarupini', 'dhatuvardhini',
'nasikabhushani', 'kasalam', 'rasikapriya'])}


Hi @bmcfee,
As you have already noted, there will be inconsistencies in how melakarta names (originally in Sanskrit) are written in English. I have suggested a few modifications to the English spellings which closely match their Sanskrit pronunciations. These suggestions are based on my being a Carnatic music student for ~15 years (and having learnt Sanskrit in school) and being married to a person who has been a Carnatic music teacher for the last decade.

I would prefer 1-based indexing in melakarta over 0-based. Also, I have generally seen "svara" written as "swara". Sorry for being pedantic, but please feel free to reject any/all suggestions.

['kanakaangi', 'ratnaangi', 'gaanamurti',
'vanaspati', 'maanavati', 'taanarupi',
'senaavati', 'hanumatodi', 'dhenuka',
'natakapriya', 'kokilapriya', 'rupavati',
'gayakapriya', 'vakulaabharanam', 'mayaamaalavagowla',
'chakravaakam', 'suryakantam', 'hatakaambhari',
'jhankaaradhwani', 'natabhairavi', 'keeravani',
'kharaharapriya', 'gowrimanohari', 'varunapriya',
'maararanjani', 'chaarukesi', 'sarasaangi',
'harikambhoji', 'dheerasankarabharanam', 'naganandini',
'yaagapriya', 'raagavardhini', 'gaangeyabhushani',
'vagadheeswari', 'shulini', 'chalanaatta',
'saalagam', 'jalaarnavam', 'jhaalavaraali',
'navaneetam', 'paavani', 'raghupriya',
'gavaambodhi', 'bhavapriya', 'subhapanthuvaraali',
'shadvigamargini', 'suvarnaangi', 'divyamani',
'dhavalambari', 'naamanarayani', 'kaamavardhini',
'raamapriya', 'gamanasrama', 'viswambhari',
'syaamalaangi', 'shanmukhapriya', 'simhendramadhyamam',
'hemavati', 'dharmavati', 'neetimati',
'kaantaamani', 'rishabhapriya', 'lataangi',
'vachaspati', 'mechakalyani', 'chitrambhari',
'sucharitra', 'jyotiswarupini', 'dhatuvardhini',
'nasikabhushani', 'kosalam', 'rasikapriya']

Member Author

Pedantic is the name of the game here, so thanks for this!

I would prefer 1-based indexing in melakarta over 0-based.

Gotcha. If I can push on that a little more, would it be common to refer by number like this? IE if I said melakarta #5, knowledgeable people would know what is meant?

Also, I have generally seen "svara" written as "swara".

I've been seeing both in relatively equal quantities. For example, the Wikipedia article uses Svara as the title, and swara and svara interchangeably throughout the body. (Not that Wikipedia is a canonical reference here, but I expect it is at least somewhat representative of the diversity of interested parties. I see similar divergence across different reference texts, so I'm really not sure here.)

I imagine this comes from deviations in regional dialect and maybe inconsistent conventions for translation orthography, and I don't want to unduly privilege one translation over another without good reason. This quora post links out to IAST, but this also doesn't seem to be of much help here.

I have suggested a few modifications to the English spellings which closely match their Sanskrit pronunciations.

It's difficult for me to eyeball this and quickly get a sense of the changes you're suggesting. Could you briefly explain the thinking behind them, or at least a couple of examples?

@hskaushik hskaushik Jul 10, 2020

Pedantic is the name of the game here, so thanks for this!

Thanks :-)

I would prefer 1-based indexing in melakarta over 0-based.

Gotcha. If I can push on that a little more, would it be common to refer by number like this? IE if I said melakarta #5, knowledgeable people would know what is meant?

Absolutely!
In fact, the melakarta number can be deduced from the name of each raga. Each of the first two syllables of the raga name has a specific digit associated with it under the ancient Katapayadi system. For example, in the raga dheerasankarabharanam, the first syllable dhee has the number 9 associated with it and the second syllable ra has the number 2. Read in order, the first two syllables give 92. However, under the Katapayadi sankhya convention the digits are read in reverse order, so you get 29, which is the number associated with dheerasankarabharanam in the melakarta system. Going further, given a melakarta number, it is possible to deduce all the svaras (or swaras) that should exist in that raga, along with the first two syllables of the name. Hence, 1-based indexing makes absolute sense to a trained musician while 0-based means nothing.
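
A minimal sketch of that decoding, for readers who want to try it. It assumes the standard Katapayadi consonant values and that the caller supplies the consonants of the first two syllables explicitly. The ASCII transliteration used for the keys (capital letters for retroflex consonants, `ca` for the unaspirated sound often written "cha" in raga names) and the function name are assumptions made for this example.

```python
# Katapayadi digit for each consonant (vowels are ignored).
# Keys use an ad-hoc ASCII transliteration chosen for this sketch.
KATAPAYADI = {
    "ka": 1, "kha": 2, "ga": 3, "gha": 4, "nga": 5,
    "ca": 6, "cha": 7, "ja": 8, "jha": 9, "nya": 0,
    "Ta": 1, "Tha": 2, "Da": 3, "Dha": 4, "Na": 5,
    "ta": 6, "tha": 7, "da": 8, "dha": 9, "na": 0,
    "pa": 1, "pha": 2, "ba": 3, "bha": 4, "ma": 5,
    "ya": 1, "ra": 2, "la": 3, "va": 4, "sha": 5, "Sha": 6, "sa": 7, "ha": 8,
}


def melakarta_number(first, second):
    """Decode a 1-based melakarta number from the consonants of the first two syllables."""
    # The digits are read in reverse: the second syllable gives the tens place.
    return 10 * KATAPAYADI[second] + KATAPAYADI[first]


print(melakarta_number("dha", "ra"))  # dhee-ra-sankarabharanam
print(melakarta_number("ma", "ya"))   # ma-ya-malavagowla
```

Because the romanized names are ambiguous between several Sanskrit consonants, the sketch does not try to parse a full raga name.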

Also, I have generally seen "svara" written as "swara".

I've been seeing both in relatively equal quantities. For example, the Wikipedia article uses Svara as the title, and swara and svara interchangeably throughout the body. (Not that Wikipedia is a canonical reference here, but I expect it is at least somewhat representative of the diversity of interested parties. I see similar divergence across different reference texts, so I'm really not sure here.)

I imagine this comes from deviations in regional dialect and maybe inconsistent conventions for translation orthography, and I don't want to unduly privilege one translation over another without good reason. This quora post links out to IAST, but this also doesn't seem to be of much help here.

Agreed! It is always a challenge to translate from Sanskrit, which has more letters (and different grammar) to English. I can live with 'svara'! 👍 :-)

I have suggested a few modifications to the English spellings which closely match their Sanskrit pronunciations.

It's difficult for me to eyeball this and quickly get a sense of the changes you're suggesting. Could you briefly explain the thinking behind them, or at least a couple of examples?

The changes are mostly to do with which syllables should have extended pronunciations. Sorry, I don't know how exactly to convey this in text (I wish I could record my voice and send it!). Here is an analogy: the word marquee has an extended pronunciation of the e. This wouldn't be captured if I were to write it as marki (but English is a funny language and we don't always pronounce words the way they are written).
Example: ganamurti -> gaanamurti. Taking the first two syllables, the a of the ga syllable is pronounced for slightly longer than the a of the na syllable. Hence, writing ganamurti as gaanamurti would indicate it. However, strictly speaking, it should be written as gānamūrti but, it is not ASCII and hence I didn't follow those conventions of writing these names.

Some of the names were wrong. kasalam -> kosalam

Sorry about the extended response. But I am so glad that you are putting such great effort into getting these things right, and I feel compelled to do my part right as well (the least I could do at this moment).

Please feel free to question / disagree and I will be happy to discuss these things further.

P.S.: I actually consulted my wife (who I consider a better musician than myself) about the spelling of some of my suggestions and we disagreed on a couple of them! That's mostly because we both have different native languages and our conventions of writing Sanskrit words in English differ! In short, it's impossible to get it "right" but, we can tend towards it!

Member Author

Hence, 1-based indexing makes absolute sense to a trained musician while 0-based means nothing.

Ok, that's an easy enough change. 0-based makes the math a little simpler, but it's easy to adapt. I'll push that change shortly.
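
To show where the indexing matters, here is a sketch of how the seven svara positions follow from a 1-based melakarta number. It relies on the standard structure of the 72-mela scheme (two halves by Ma, six chakras fixing Ri/Ga, six positions per chakra fixing Dha/Ni). The function and table names are hypothetical and this is not necessarily how the PR implements it.

```python
# Semitone offsets above Sa for the six allowed (Ri, Ga) and (Dha, Ni) pairings.
RI_GA = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
DHA_NI = [(8, 9), (8, 10), (8, 11), (9, 10), (9, 11), (10, 11)]


def mela_degrees_sketch(mela):
    """Return the 7 scale degrees (semitones above Sa) for melakarta number 1..72."""
    if not 1 <= mela <= 72:
        raise ValueError("melakarta number must be between 1 and 72")
    index = mela - 1                      # the 0-based index keeps the arithmetic simple
    ma = 5 if index < 36 else 6           # first 36 ragas use Ma1, the rest Ma2
    ri, ga = RI_GA[(index % 36) // 6]     # chakra (group of six) fixes Ri and Ga
    dha, ni = DHA_NI[index % 6]           # position within the chakra fixes Dha and Ni
    return [0, ri, ga, ma, 7, dha, ni]


print(mela_degrees_sketch(29))  # dheerasankarabharanam, the major-scale pattern
```

The only change needed for 1-based numbering is the single `mela - 1` at the top.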

I can live with 'svara'!

I guess I'm more worried about, eg, if one might be more commonly used for Carnatic and the other for Hindustani (similar to re/ri). I don't personally have a preference on this one.

However, strictly speaking, it should be written as gānamūrti but, it is not ASCII and hence I didn't follow those conventions of writing these names.

Well, we could go unicode with this (since we're now in python 3 territory exclusively, this is pretty easy). But I worry that it might not be so user-friendly that way.

The take-home message that I'm getting here is that your suggested corrections are there to emphasize stressed syllables, and that makes sense. Still, I'd feel better about this if we could point to a reference (ideally text/book, but other reliable resources could be appropriate) rather than cobble it together piece-meal.

Also, it sounds like whatever we do, a helper function to list out our string names will be necessary.

Some of the names were wrong. kasalam -> kosalam

Good catch.


The take-home message that I'm getting here is that your suggested corrections are there to emphasize stressed syllables, and that makes sense. Still, I'd feel better about this if we could point to a reference (ideally text/book, but other reliable resources could be appropriate) rather than cobble it together piece-meal.

Please give me a couple of days and I will identify a textbook/ reference book that we could follow for the spelling of ragas.

Also, it sounds like whatever we do, a helper function to list out our string names will be necessary.

It will be necessary, as some of the ragas are more commonly known in their shortened form (dheerasankarabharanam -> sankarabharanam; hanumatodi -> todi) or have an entirely different non-melakarta name (kamavardhini -> panthuvarali).

Would you be happy if I can put this information together? Please let me know.

Member Author

Would you be happy if I can put this information together? Please let me know.

That'd be great, thanks!

Member Author

I did a little more digging through textbooks, and (unsurprisingly) came up with yet more disagreements.

The original transcriptions that I implemented in this PR came from "A gentle introduction to Carnatic Music" by Mahadevan Ramesh (2009) (MR below).

I also dug up an older reference text: "Ragas in Carnatic Music" by S. Bhagyalekshmy (1990) (SB below).

Here's a quick listing of the two transcriptions, alongside @hskaushik 's (KH below) proposed corrections (only showing where they disagree):

#   MR                KH                  SB
1   kanakanki         kanakaangi          kanakangi
2   ratnangi          ratnaangi           ratnangi
3   ganamurti         gaanamurti          ganamurthi
4   vanaspati         vanaspati           vanaspathi
5   manavati          maanavati           manavathi
6   tanarupi          taanarupi           tanarupi
7   senavati          senaavati           senavathi
8   hanumatodi        hanumatodi          hanumathodi
12  rupavati          rupavati            rupavathi
14  vakulabharanam    vakulaabharanam     vakulabharanam
15  mayamalavagoulai  mayaamaalavagowla   mayamalavagaula
16  chakravaham       chakravaakam        chakravakom
17  suryakantam       suryakantam         suryakantham
18  hatakambhari      hatakaambhari       hatakambari
19  jhankaradhwani    jhankaaradhwani     jhankaradhwani
23  gowrimanohari     gowrimanohari       gaurimanohari
25  mararanjani       maararanjani        mararanjini
26  charukesi         chaarukesi          charukesi
27  sarasangi         sarasaangi          sarasangi
28  harikambhoji      harikambhoji        harukambhoji
31  yagapriya         yaagapriya          yagapriya
32  ragavardhini      raagavardhini       ragavardhini
33  gangeyabhusani    gaangeyabhushani    gangeyabhushani
35  sulini            shulini             sulini
36  chalanattai       chalanaatta         chalanatta
37  salagam           saalagam            salagam
38  jalarnavam        jalaarnavam         jalarnavam
39  jhalavarali       jhaalavaraali       jhalavarali
40  navaneetam        navaneetam          navaneetham
41  pavani            paavani             pavani
43  gavambodhi        gavaambodhi         gavambodhi
45  subhapantuvarali  subhapanthuvaraali  sudhapanthuvarali
46  shadvigamargini   shadvigamargini     shadvidhamargini
47  suvarnangi        suvarnaangi         suvarnangi
50  namanarayani      naamanarayani       namanarayani
51  kamavardhini      kaamavardhini       kamavardhini
52  ramapriya         raamapriya          ramapriya
55  syamalangi        syaamalaangi        syamalangi
58  hemavati          hemavati            hemavathi
59  dharmavati        dharmavati          dharmavathi
60  nitimati          neetimati           neethimathi
61  kantamani         kaantaamani         kanthamani
63  latangi           lataangi            latangi
64  vachaspati        vachaspati          vachaspathi
66  chitrambhari      chitrambhari        chitrambari
68  jyotiswarupini    jyotiswarupini      jyotisvarupini

Differences I can spot:

  1. KH's suggestions emphasize stress since we probably don't want to use diacritical marks and extended unicode here.
  2. MR and SB differ primarily in t vs th. 26 differences in total, about 12 that are more substantial than t/th.

I expect that we'll see a similar number of differences between every pair of texts we compare. I think our best option here is, if possible, to pick a suitably popular and widely available reference text and go with that, even if there will be disagreements. However, I am certainly not qualified to judge which (if any) text is appropriate for this, so input from the experts would be much appreciated.
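
For anyone who wants to repeat this comparison against another text, here is a small sketch of the check. It uses only the first five entries of the MR and SB columns above. The helper name and the t/th normalization are assumptions made for illustration.

```python
# First five melakarta names as transcribed from MR and SB in the listing above.
MR = {1: "kanakanki", 2: "ratnangi", 3: "ganamurti", 4: "vanaspati", 5: "manavati"}
SB = {1: "kanakangi", 2: "ratnangi", 3: "ganamurthi", 4: "vanaspathi", 5: "manavathi"}


def disagreements(a, b, normalize=lambda s: s):
    """Return the melakarta numbers whose spellings differ after normalization."""
    return [k for k in sorted(a) if normalize(a[k]) != normalize(b[k])]


# Every spelling difference, then only those that survive collapsing "th" to "t".
all_diffs = disagreements(MR, SB)
beyond_th = disagreements(MR, SB, normalize=lambda s: s.replace("th", "t"))
print(all_diffs, beyond_th)
```

On this small sample, three of the four disagreements disappear once t and th are treated as the same.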

Member Author

I gave this another look through, and found some typos (my fault) in my transcription of SB. (Going off a not-so-great scan of the text.)

Having corrected those, the disagreements to MR are reduced to the following:

#   MR                SB
1   kanakanki         kanakangi
3   ganamurti         ganamurthi
4   vanaspati         vanaspathi
5   manavati          manavathi
7   senavati          senavathi
8   hanumatodi        hanumathodi
12  rupavati          rupavathi
15  mayamalavagoulai  mayamalavagaula
16  chakravaham       chakravakom
17  suryakantam       suryakantham
18  hatakambhari      hatakambari
23  gowrimanohari     gaurimanohari
25  mararanjani       mararanjini
33  gangeyabhusani    gangeyabhushani
36  chalanattai       chalanatta
40  navaneetam        navaneetham
45  subhapantuvarali  sudhapanthuvarali
46  shadvigamargini   shadvidhamargini
58  hemavati          hemavathi
59  dharmavati        dharmavathi
60  nitimati          neethimathi
61  kantamani         kanthamani
64  vachaspati        vachaspathi
66  chitrambhari      chitrambari
68  jyotiswarupini    jyotisvarupini

Aside from vowel elongations, SB looks pretty close to KH's suggestions. The one big oddball is 45, where "bha" (the apparently more popular choice) is "dha" in SB. I'm hesitant to "correct" this, but it does seem unusual (if not erroneous).

Apart from that, I think it makes sense to adopt SB's translations across the board, and otherwise not worry about it.


Aside from vowel elongations, SB looks pretty close to KH's suggestions. The one big oddball is 45, where "bha" (the apparently more popular choice) is "dha" in SB. I'm hesitant to "correct" this, but it does seem unusual (if not erroneous).

A quick note: I am certain the "dha" in SB is a typo! It should be "bha". Apart from that, I too am happy with SB over MR's version.

P.S.: The Government of Karnataka (one of the southern states of India) conducts Carnatic music exams at several grades (equivalent to undergraduate and Masters levels). The government publishes textbooks for these music courses, and I am trying to source them through some contacts (unfortunately, the English version of this book is not available online). Hopefully, this could be our reference for Melakarta, when I manage to get it?

Member Author

A quick note: I am certain the "dha" in SB is a typo! It should be "bha". Apart from that, I too am happy with SB over MR's version.

Thanks for confirming! 😁 That was my sense as well.

I've now pushed up the changes to SB's versions, so I think this should be good to go.

Hopefully, this could be our reference for Melakarta, when I manage to get it?

That sounds like a great resource. Not to be hasty, but I do want to get this released ASAP (this week, ideally), so I don't want to wait on additional references, nor do I want to put pressure on you to dig those up for me.

That said, I think we have a pretty good setup at this point. One nice thing about the way that it's implemented is that we can support alternative spellings fairly easily in the future without breaking backward compatibility. So maybe if there's sufficient demand, we can revisit it in a future version.
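
One way alternative spellings could be layered on later is an alias table that resolves to the canonical names before lookup. This is only a sketch of the idea: the dictionary names and the `mela_index` function are hypothetical, and the entries are a three-raga excerpt using the spellings discussed in this thread.

```python
# Canonical spellings (SB) mapped to 1-based melakarta numbers; excerpt only.
CANONICAL = {"kanakangi": 1, "ratnangi": 2, "ganamurthi": 3}

# Alternative spellings that resolve to a canonical name.
ALIASES = {
    "kanakanki": "kanakangi",
    "kanakaangi": "kanakangi",
    "ratnaangi": "ratnangi",
    "ganamurti": "ganamurthi",
    "gaanamurti": "ganamurthi",
}


def mela_index(name):
    """Look up a melakarta number by canonical name or known alias."""
    key = name.lower()
    key = ALIASES.get(key, key)  # canonical names pass through unchanged
    try:
        return CANONICAL[key]
    except KeyError:
        raise ValueError(f"unknown melakarta name: {name!r}") from None


print(mela_index("gaanamurti"), mela_index("ganamurthi"))
```

Existing code that uses the canonical names keeps working, because aliases only add keys.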

@kaustuvkanti

kaustuvkanti commented Jul 11, 2020

Sincere apologies @bmcfee for joining the conversation late, let me summarise my observations and we can engage in detailed discussion if need be.

If I understand correctly, the purpose of using thaat for _h and melakarta for _c is just to display the right tick locations in the chroma representation. My argument is that the transcription algorithms are independent of the scale, so why do we limit ourselves to only a subset? Moreover, (i) the raga_to_thaat mapping in _h is not exhaustive unlike the melakarta_to_janyaRaga mapping in _c, (ii) only a small proportion of melakarta ragas are widely performed (refer to Dunya or Dunya CC collections), which makes this overkill. For _h, the pentatonic ragas will then have two redundant tick locations.

The challenge I see is the contourShape_to_svaraName mapping, as the name of the svara can depend on the raga for the same contour. These are challenging cases, and can of course be overlooked for the first release.

I second the opinion of keeping the Sa open as opposed to limiting it to C/G: the distribution of tonic frequencies in the Dunya _h/_c collections is flat rather than strictly bimodal, and the fractional midi numbers will be particularly useful for old recordings.

The last post from issue #641 looks all reasonable to me. I do not feel a need for octave decorations more than ±1 octave, especially for vocal concerts, and the inverse transform is not a requirement. As for the _to_degrees visualization, it is generally useful to have the tonics normalized for better visual comparison, but it is not a crucial factor. Again, I have reservations for thaat_to_degrees, as there are enough examples of ragas consisting of different solfege than that of the attributed thaat (e.g. raga Patdeep hailing from thaat Kafi has a N whereas Kafi consists of n; Gujri Todi having the solfege of thaat Todi but attributed to thaat Marwa because of its phraseology). Long story short, the thaat in _h is constructed as a compilation of ragas [Jairazbhoy, N. A. (1995). The rāgs of North Indian music: their structure and evolution. Popular Prakashan.] as opposed to the melakarta system in _c where the hierarchy is clearly defined.

@bmcfee
Member Author

bmcfee commented Jul 11, 2020

Thanks @kaustuvkanti for your comments!

If I understand correctly, the purpose of using thaat for _h and melakarta for _c is just to display the right tick locations in the chroma representation.

It's a bit more than that, but ticker positioning is the first obvious consumer.

The main use for melakarta (in carnatic) and key (in western) is for pitch spelling. For _h, and my understanding might be off here, pitch spelling is always fixed and independent of thaat, correct? In western notation, the pitch class C could also be called B# or Dbb, depending on key context. Similarly, in Carnatic notation, the raga determines which name we choose for certain pitch classes (ri, ga, etc), so we need the raga id information for context.
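To make the spelling point concrete, here is a minimal sketch (not the code in this PR) of how the standard melakarta numbering fixes the names of the in-raga svara. It assumes 1-based mela numbers (1 through 72) and the usual R1..N3 labels; the helper name `mela_spelling` is made up for illustration.

```python
# Ri/Ga and Dha/Ni combinations in standard melakarta order.
RI_GA = [("R1", "G1"), ("R1", "G2"), ("R1", "G3"),
         ("R2", "G2"), ("R2", "G3"), ("R3", "G3")]
DHA_NI = [("D1", "N1"), ("D1", "N2"), ("D1", "N3"),
          ("D2", "N2"), ("D2", "N3"), ("D3", "N3")]

# Semitone offsets above Sa. Note that R2/G1, R3/G2, D2/N1 and D3/N2
# share a pitch class, which is why the raga is needed to pick a name.
OFFSET = {"S": 0, "R1": 1, "R2": 2, "R3": 3, "G1": 2, "G2": 3, "G3": 4,
          "M1": 5, "M2": 6, "P": 7, "D1": 8, "D2": 9, "D3": 10,
          "N1": 9, "N2": 10, "N3": 11}


def mela_spelling(mela):
    """Hypothetical helper: map semitone degree -> svara name for one mela."""
    if not 1 <= mela <= 72:
        raise ValueError("mela must be in 1..72")
    idx = mela - 1
    ma = "M1" if idx < 36 else "M2"      # first 36 melas use M1, the rest M2
    ri, ga = RI_GA[(idx % 36) // 6]      # chakra selects the Ri/Ga pair
    dha, ni = DHA_NI[idx % 6]            # position in chakra selects Dha/Ni
    names = ["S", ri, ga, ma, "P", dha, ni]
    return {OFFSET[name]: name for name in names}


print(mela_spelling(22))                          # kharaharapriya
print(mela_spelling(1)[2], mela_spelling(22)[2])  # same pitch class, two names
```

Degree 2 comes out as G1 under mela 1 but R2 under mela 22, which is the context dependence that Hindustani notation does not have.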

In all three cases (carnatic, hindustani, and western), we could annotate all 12 pitch classes for, eg a chroma display. (The CQT display actually does this if you zoom in far enough).
However, we don't usually care about all 12, and thaat/mela/key give a way to specify which ones are important to visualize.

My argument is that the transcription algorithms are independent of the scale, so why do we limit ourselves to only a subset?

We're not only interested in transcription algorithms here, but also raw data visualization (eg chroma or cqt). These will definitely not be scale-invariant, so we should provide a way to let the user control the visualization to do the right thing.

More generally, display is the main consumer, but I also want to provide conversion utilities to make it easy to convert between systems and units.
The lack of an octave notation for _c and _h will make it hard to go back to midi/hz/western, but I still think there's value in automatic conversion into _c or _h.

Moreover, (i) the raga_to_thaat mapping in _h is not exhaustive unlike the melakarta_to_janyaRaga mapping in _c,

I'm happy to add more!

(ii) only a small proportion of melakarta ragas are widely performed (refer to Dunya or Dunya CC collections), which makes this overkill.

Absolutely. However, it seemed much simpler (programmatically) to implement the full system than to try leaving some out. If most of them are never used, that's fine.

For _h, the pentatonic ragas will then have two redundant tick locations.

I think I see what you mean here. I only included heptatonic thaat, but that was mainly because it was easy to find a list of them. Adding pentatonic options would be trivial.

The challenge I see is the contourShape_to_svaraName mapping, as the name of the svara can depend on the raga for the same contour. These are challenging cases, and can of course be overlooked for the first release.

I'm not sure what you mean? We're not doing anything with pitch tracking or transcription here. The utilities provided here are only for converting from some representation of frequency (Hz or midi) to svara name conditional on raga. Anything beyond that (eg estimating raga from an f0 curve) is out of scope.

I second the opinion of keeping the Sa open as opposed to limiting it to C/G: the distribution of tonic frequencies in the Dunya _h/_c collections is flat rather than strictly bimodal, and the fractional midi numbers will be particularly useful for old recordings.

Great!

The last post from issue #641 looks all reasonable to me. I do not feel a need for octave decorations more than ±1 octave, especially for vocal concerts, and the inverse transform is not a requirement.

I tend to agree, but I'm generally a completionist when it comes to this kind of thing. If an inverse transform is possible, it'd be great to have. If it's not, I won't force it.

As for the _to_degrees visualization, it is generally useful to have the tonics normalized for better visual comparison, but it is not a crucial factor.

I agree with that, but I'm really leaning toward making that a separate function. I could imagine not just viz, but also downstream analysis depending on either normalized or unnormalized representation, and we should give the user that flexibility.

Again, I have reservations for thaat_to_degrees, as there are enough examples of ragas consisting of different solfege than that of the attributed thaat (e.g. raga Patdeep hailing from thaat Kafi has a N whereas Kafi consists of n; Gujri Todi having the solfege of thaat Todi but attributed to thaat Marwa because of its phraseology). Long story short, the thaat in _h is constructed as a compilation of ragas [Jairazbhoy, N. A. (1995). The rāgs of North Indian music: their structure and evolution. Popular Prakashan.] as opposed to the melakarta system in _c where the hierarchy is clearly defined.

Sure, that all makes sense, but how much of a problem is this? A user can always select a different thaat, and all that would really change is the selection of pitch classes which are explicitly notated in the display. I'm not sure what else one could expect to happen here?

@kaustuvkanti

kaustuvkanti commented Jul 11, 2020

I am glad @bmcfee that this is useful; I will be available for prompt responses from now on.

In all three cases (carnatic, hindustani, and western), we could annotate all 12 pitch classes for, eg a chroma display. (The CQT display actually does this if you zoom in far enough).
However, we don't usually care about all 12, and thaat/mela/key give a way to specify which ones are important to visualize.

I believe you are aware of this website: https://autrimncpa.wordpress.com/. They offer a similar representation from the pitch contour extracted from mono voice in a multi-track audio recording.

We're not only interested in transcription algorithms here, but also raw data visualization (eg chroma or cqt). These will definitely not be scale-invariant, so we should provide a way to let the user control the visualization to do the right thing.

I understood the purpose now, fair enough!

More generally, display is the main consumer, but I also want to provide conversion utilities to make it easy to convert between systems and units.
The lack of an octave notation for _c and _h will make it hard to go back to midi/hz/western, but I still think there's value in automatic conversion into _c or _h.

The octave notation for midi mapping is easy though. E.g. for male artists, Sa at pitch-class D is always D3 = 146.83 Hz; similarly for a female artist with A3 = 220 Hz.
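For reference, the frequencies of D3 and A3 can be checked with the usual equal-temperament relation. This is a small sketch assuming A4 = 440 Hz at MIDI note 69; the two helpers are standalone stand-ins written for this comment, not librosa's own functions.

```python
import math

A4_HZ = 440.0   # assumed reference tuning
A4_MIDI = 69    # MIDI number of A4


def midi_to_hz(midi):
    """Equal-tempered frequency of a (possibly fractional) MIDI number."""
    return A4_HZ * 2.0 ** ((midi - A4_MIDI) / 12.0)


def hz_to_midi(hz):
    """Fractional MIDI number of a frequency in Hz."""
    return A4_MIDI + 12.0 * math.log2(hz / A4_HZ)


print(round(midi_to_hz(50), 2))     # D3
print(round(midi_to_hz(57), 2))     # A3
print(round(hz_to_midi(150.0), 2))  # a tonic slightly sharp of D3 gives a fractional value
```

A Sa that does not sit exactly on an equal-tempered pitch simply comes out as a fractional MIDI number, which is why fractional values matter for older recordings.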

I'm happy to add more!

Great!

Absolutely. However, it seemed much simpler (programmatically) to implement the full system than to try leaving some out. If most of them are never used, that's fine.

Fair enough!

I think I see what you mean here. I only included heptatonic thaat, but that was mainly because it was easy to find a list of them. Adding pentatonic options would be trivial.

Absolutely, this is a trivial extension.

I'm not sure what you mean? We're not doing anything with pitch tracking or transcription here. The utilities provided here are only for converting from some representation of frequency (Hz or midi) to svara name conditional on raga. Anything beyond that (eg estimating raga from an f0 curve) is out of scope.

Totally agree, my only concern is the melodic accompaniment, which often follows the main artist, so the chroma at note onsets generally shows leakage to the previous note from the accompanying instrument. This has nothing to do with the programmatic scheme of things; users will have to familiarize themselves with this new representation.

I tend to agree, but I'm generally a completionist when it comes to this kind of thing. If an inverse transform is possible, it'd be great to have. If it's not, I won't force it.

I am ready to give this some more thought; would you care to elaborate a bit?

I agree with that, but I'm really leaning toward making that a separate function. I could imagine not just viz, but also downstream analysis depending on either normalized or unnormalized representation, and we should give the user that flexibility.

I can see your point, I have no further questions on it.

Sure, that all makes sense, but how much of a problem is this? A user can always select a different thaat, and all that would really change is the selection of pitch classes which are explicitly notated in the display. I'm not sure what else one could expect to happen here?

There are a handful of exceptions still, but we can safely ignore them without loss of generality.

@bmcfee
Member Author

bmcfee commented Jul 14, 2020

The octave notation for midi mapping is easy though. E.g. for male artists, Sa at pitch-class D is always D3 = 146.83 Hz; similarly for a female artist with A3 = 220 Hz.

That's fine if we're in the same octave, but the notation itself has no way to specify any frequencies outside a 3 octave band. This means that the converter cannot be invertible in general. (Scientific pitch notation is invertible because we can have an arbitrary octave number tacked to the end of the note name.) In this case, I think it would be better to not provide an inverse function if it can't be complete.
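A toy illustration of why the inverse is not well defined. This is only a sketch: the function name, the ASCII octave marks (`'` and `,` standing in for dots above and below), and the choice to leave out-of-band notes unmarked are all assumptions made for the example.

```python
SVARA_H = ["S", "r", "R", "g", "G", "m", "M", "P", "d", "D", "n", "N"]


def svara_with_octave(midi, sa):
    """Hypothetical: name a MIDI note relative to Sa with at most one octave mark."""
    steps = int(round(midi - sa))
    name = SVARA_H[steps % 12]
    octave = steps // 12
    if octave == 1:
        return name + "'"   # upper octave (stand-in for a dot above)
    if octave == -1:
        return name + ","   # lower octave (stand-in for a dot below)
    return name             # middle octave, or anything outside the band


sa = 50  # D3
print(svara_with_octave(50, sa), svara_with_octave(62, sa), svara_with_octave(38, sa))
print(svara_with_octave(74, sa))  # two octaves up: same label as the middle octave
```

Because MIDI 50 and MIDI 74 receive the same label, no function can map the label back to a unique pitch.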

Absolutely, this is a trivial extension.

I did a bit more reading on this, and my understanding is that thaat are heptatonic by assumption? If that's the case, I don't think we should mix in pentatonic scales. If we really want/need to mix in pentatonic scales, we might want to call it something different.

As an aside, we don't currently support pentatonic scales (or even modes) in western notation either, so we'd still have a couple of extraneous note decorations in that case. Maybe eventually we can extend both systems to support pentatonic scales, but in the meantime, it's a limitation I can live with.

Totally agree, my only concern is the melodic accompaniment, which often follows the main artist, so the chroma at note onsets generally shows leakage to the previous note from the accompanying instrument. This has nothing to do with the programmatic scheme of things; users will have to familiarize themselves with this new representation.

I don't understand what the problem is here. Can you give an example?

@bmcfee
Member Author

bmcfee commented Jul 16, 2020

Chroma display seems to be working now. For chroma displays, I went with the convention that Sa is in chroma bin units rather than Hz. (We still use Hz in cqt; this is a little inconsistent, but I think it's the most natural way to do it.)

Here's our trumpet example track (F:dorian ~= Eb:maj):

librosa.display.specshow(chroma, y_axis='chroma', key='Eb:maj', x_axis='time');

image

Plotting the same data as Hindustani (Sa=5 gives us F, thaat=kafi ~= dorian):

librosa.display.specshow(chroma, y_axis='chroma_h', Sa=5, thaat='kafi', x_axis='time');

image

And in Carnatic (Sa=5 as before, now mela=22 = kharaharapriya ~= dorian):

librosa.display.specshow(chroma, y_axis='chroma_c', Sa=5, mela=22, x_axis='time');

image

Finally, we can skip the thaat note selection and use the full set of ticks:

librosa.display.specshow(chroma, y_axis='chroma_h', Sa=5, x_axis='time');

image

No such analogous method is possible with Carnatic because the spelling is dependent on raga.
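As a sanity check on the parameter choices in the examples above, the following sketch confirms that Eb major, F dorian, and Kafi on Sa=5 select the same pitch classes. The degree lists are written out by hand here and the helper name is made up; nothing in this snippet comes from the PR's code.

```python
MAJOR = [0, 2, 4, 5, 7, 9, 11]    # western major scale degrees
DORIAN = [0, 2, 3, 5, 7, 9, 10]   # dorian mode degrees
KAFI = [0, 2, 3, 5, 7, 9, 10]     # Kafi thaat: S R g m P D n


def pitch_classes(root, degrees):
    """Absolute pitch classes (0 = C) for scale degrees above a root."""
    return sorted((root + d) % 12 for d in degrees)


eb_major = pitch_classes(3, MAJOR)   # Eb is pitch class 3
f_dorian = pitch_classes(5, DORIAN)  # F is pitch class 5
f_kafi = pitch_classes(5, KAFI)      # Sa=5

print(eb_major)
print(eb_major == f_dorian == f_kafi)
```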


Some last todos on this:

  • Expose abbreviation support
  • Test out multiple-octave (multi-band chroma) support
  • Test out high-res chroma

The latter two already work for western chroma, so we should make it work here as well. EDIT: confirmed, these work out of the box.

Full svara names are also possible, but it's not exposed through the specshow API. I expect there won't be a huge demand for this feature, but if people want it, it's possible as follows:

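>>> import matplotlib.pyplot as plt
>>> import librosa.display
>>> fig, ax = plt.subplots()
>>> librosa.display.specshow(chroma, y_axis='chroma_h', ax=ax)  # draw on the axes we format below
>>> ax.yaxis.set_major_formatter(librosa.display.ChromaSvara(abbr=False))  # Sa, mela, etc. can also be passed here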

Example from above (Carnatic, full note names):
image

I'm ready to stick a fork in the display functionality and start writing tests now.

@bmcfee
Member Author

bmcfee commented Jul 16, 2020

Okay, display tests are in. Assuming this all passes CI, I think the only thing left on this is to finalize the standardization of melakarta (and thaat) name spellings.

Hopefully we can get that done by the end of this week and :shipit:

@bmcfee bmcfee changed the title [WIP] Indian classical support [CR] Indian classical support Jul 17, 2020
@bmcfee bmcfee merged commit bc52dc4 into main Jul 21, 2020
@bmcfee bmcfee deleted the indian-classical-support branch July 21, 2020 18:34
@kaustuvkanti

kaustuvkanti commented Jul 24, 2020

That's fine if we're in the same octave, but the notation itself has no way to specify any frequencies outside a 3 octave band. This means that the converter cannot be invertible in general. (Scientific pitch notation is invertible because we can have an arbitrary octave number tacked to the end of the note name.) In this case, I think it would be better to not provide an inverse function if it can't be complete.

Sure, I second you. By any chance, is there a way to unwrap the chroma into three octaves? There are melodic phrase shapes that span consecutive octaves, and the visualization would then preserve their continuity along the time axis.

Also, given that the perception of modes is relative in Indian music, that is to say it does not matter whether Sa=5 or 7, is it reasonable to rotate the chroma to bring the Sa to the 0-th bin? This is just a thought from my familiarity with the paradigm, and is open to challenge!
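The rotation I have in mind could look like the following sketch. It uses plain Python lists so that it stands alone, and the helper name `rotate_to_sa` is hypothetical rather than an existing function.

```python
def rotate_to_sa(chroma, sa_bin):
    """Hypothetical helper: reorder chroma rows so that row 0 is Sa."""
    n_bins = len(chroma)
    sa_bin %= n_bins
    return chroma[sa_bin:] + chroma[:sa_bin]


# Toy chroma: 12 rows (bins) x 2 frames, with energy only in bin 5 (F).
chroma = [[0.0, 0.0] for _ in range(12)]
chroma[5] = [1.0, 0.8]

rotated = rotate_to_sa(chroma, sa_bin=5)
print(rotated[0])  # the row that used to be bin 5
```

With a NumPy array, `np.roll(chroma, -sa_bin, axis=0)` performs the same rotation. After rotating, recordings with different tonics line up on the same rows.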

I did a bit more reading on this, and my understanding is that thaat are heptatonic by assumption? If that's the case, I don't think we should mix in pentatonic scales. If we really want/need to mix in pentatonic scales, we might want to call it something different.

As an aside, we don't currently support pentatonic scales (or even modes) in western notation either, so we'd still have a couple of extraneous note decorations in that case. Maybe eventually we can extend both systems to support pentatonic scales, but in the meantime, it's a limitation I can live with.

Affirmative, a thaat is heptatonic by definition. The latter sounds reasonable to me!

I don't understand what the problem is here. Can you give an example?

From your example of Hindustani: librosa.display.specshow(chroma, y_axis='chroma_h', Sa=5, thaat='kafi', x_axis='time');
We see salience in the top-most bin (#11 or 'M' (not shown) in solfege) around 1.2 sec, which is theoretically an unused note in Kafi thaat. This phenomenon of "touch note" usage is common but should not add confusion when we restrict the y-ticks to the thaat notes only. Given that the raga-to-thaat mapping is not exhaustive (and often disputed among schools), I would like to draw some attention to this issue.

As for the leakage phenomenon, it is observed around 2.0 sec, where the most salient bin is 'm' (#10) whereas 'S' (#5) also has activation. I suspect that this comes from a melodic accompaniment that continued the trail of the previous note 'S'. This is not an issue with the computation, but users need to familiarize themselves with the environment.
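For readers following the bin numbers, here is a small sketch of the bin-to-svara correspondence I am using with Sa=5. The svara abbreviations follow the usual Hindustani convention (lowercase for komal, 'M' for tivra Ma); the function name and the hand-written Kafi degree set are assumptions for illustration only.

```python
SVARA_H = ["S", "r", "R", "g", "G", "m", "M", "P", "d", "D", "n", "N"]
KAFI = {0, 2, 3, 5, 7, 9, 10}  # Kafi thaat degrees: S R g m P D n


def label_bins(sa_bin, thaat_degrees):
    """Return (bin, svara, in_thaat) for each of the 12 chroma bins."""
    rows = []
    for b in range(12):
        degree = (b - sa_bin) % 12  # semitones above Sa
        rows.append((b, SVARA_H[degree], degree in thaat_degrees))
    return rows


for b, svara, in_thaat in label_bins(5, KAFI):
    print(b, svara, "" if in_thaat else "(not in thaat)")
```

Bin 11 maps to 'M', which lies outside Kafi and therefore gets no tick, while bins 10 ('m') and 5 ('S') are both labelled.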

@bmcfee
Member Author

bmcfee commented Jul 27, 2020

Thanks for the feedback.

We see salience in the top-most bin (#11 or 'M' (not shown) in solfege) around 1.2 sec, which is theoretically an unused note in Kafi thaat. This phenomenon of "touch note" usage is common but should not add confusion when we restrict the y-ticks to the thaat notes only. Given that the raga-to-thaat mapping is not exhaustive (and often disputed among schools), I would like to draw some attention to this issue.

Well, the piece in question is decidedly not from any Indian tradition, so it should be taken with heavy skepticism. Everything you say also applies to the corresponding western notation (F:dorian/Eb:maj); the note is technically out of key, but it goes by quickly in context.
The purpose here is just to demonstrate how to set up the axis ticks, not to imply a correct and detailed analysis of the recording.

@kaustuvkanti

Absolutely, all your remarks are well taken, this was just to gently share an observation.

Labels
enhancement Does this improve existing functionality? functionality Does this add new functionality?
Development

Successfully merging this pull request may close these issues.

RFC: support for non-western systems
4 participants