Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variable resolution support #1173

Draft
wants to merge 3 commits into
base: development
Choose a base branch
from

Conversation

cme
Copy link
Contributor

@cme cme commented Feb 12, 2021

After doing a bit of digging into just how much work it would be to remove the fixed resolution of 48 ticks per quarter note that Hydrogen currently uses, I was quite surprised to find that the core already mostly supports this with no changes. Existing songs already have a resolution property that determines the mapping between "position" and time. It's just initialised to 48 by default and there's no way of changing it.

The assumption of 48TPQN is baked into the GUI more than it is the core, so this change takes care of some of that. There's a bit more to do, but not much.

This change:

  • adds a resolution field to individual patterns
    • This is also stored in the XML for importing and exporting patterns
  • changes most uses of "MAX_NOTES", the assumed ticks per whole note, to more appropriately use the resolution of the pattern or song.
  • addresses most of the GUI issues on scaling
  • Provides routines to get the minimum resolution to represent all the notes that currently exist in a pattern, and to retime a pattern to a given resolution (assumed to be a multiple of the minimum possible resolution)

The core does not support multiple resolutions in the same song. It's provided on a per-pattern basis mostly for import/export.

Things that still need done:

  • Importing patterns should retime if necessary to change the resolution of the song & new pattern to something compatible
  • Any sort of GUI support for actually changing it (ie. tuplets support GUI work! Working on True Tuplets support #1127 :) )
  • Pattern length / denominator GUI can be cleaned up as it does a lot of work to work around the 48TQPN limitation :)
  • Pattern editor zoom is still on the basis of a fixed-width tick rather than a fixed time.
  • MIDI export uses 4 times the resolution that it needs to, and always has done. Hydrogen's run at 48TPQN but the exported MIDI files are all 192TPQN. Because I didn't feel like changing the test reference files today, I've left it at 4x resolution for the moment.

  - store resolution in h2song and h2pattern files
  - correct most uses "MAX_NOTES" to be in terms of song or pattern resolution
  - MIDI export works
@cme cme marked this pull request as draft February 12, 2021 00:33
@oddtime
Copy link
Contributor

oddtime commented Feb 12, 2021

Ah! I am ok with this, and I think that MAX_NOTES = 192 can be easily be multiplied by further factors. (Still it whould be multiple of 192, unless you use some compensation like in #1127 which allows to work at any max-resolution to get any true resolution, up to sample rate freq)
What if I want to set in the same song triplet, then a quintuplet, then a septuplet somewhere, a 11tuplet and a 1/64?
The resolution would be 64 * 5 * 3 * 7 * 11= 73920 but I guess that we would limit the max resolution to some smaller value

@theGreatWhiteShark
Copy link
Contributor

Importing patterns should retime if necessary to change the resolution of the song & new pattern to something compatible

AFAIU the resolution of the Song must be a common multiple of the resolutions used in the patterns. Do we restrict the latter? Otherwise loading specific combinations of patterns might be forbidden.

Pattern editor zoom is still on the basis of a fixed-width tick rather than a fixed time.

You talked about fixed time and about rounding errors in the ticks in #1127. I want to plan to make the transport in Hydrogen based on frames rather than on ticks to make the application compatible with the JACK server (#983) (to get rid of the various synchronization bugs once and for all). I have not looked into this yet and have scheduled it after the release/update of the doc is done. But maybe (just thinking aloud with nothing to back it up) the time in here would be in frames and the ticks don't have to be integers anymore (with a function mapping between frames and ticks).

int nPos = it.first;
nDenominator = std::gcd( nDenominator, nPos );
}
return m_nResolution / nDenominator;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain this please?
If m_nResolution = 3 (3 ticks per quarter notes or whole notes?), and the pattern has a note at position = 7,
the ratio m_nResolution / nDenominator equals (int) 3/7 = 0, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's a bug here. nDenominator should be initialised to 1, not 0.

Copy link
Contributor

@oddtime oddtime Feb 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 1? then at the end of the loop nDenominator will be always 1

@@ -60,12 +60,16 @@ class Song : public H2Core::Object
SONG_MODE
};

static constexpr int nDefaultResolutionTPQN = 48;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my recent experience on H2 code, it seems better to have this expressed in Ticks per whole notes, because the grid resolutions, i.e. the inverses of quantum note values (1/4, 1/8, 1/16...) refer to that unit...
Otherwise should all the formulas like getColumn() have an additional 4 factor?
What is the advantage of having this in TPQN?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only because that's what MIDI uses, and that's what the core already uses (because it looks like it was heavily influenced by MIDI). Multiplying up by 4 when needed (seems to be a fairly small number of uses) is a fairly small price to pay for the consistency (if not, a macro can be defined).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.
a macro would be good enough

@oddtime
Copy link
Contributor

oddtime commented Feb 12, 2021

@theGreatWhiteShark

But maybe (just thinking aloud with nothing to back it up) the time in here would be in frames and the ticks don't have to be integers anymore (with a function mapping between frames and ticks).

Maybe with "time" here @cme meant a fixed music duration in whole notes and not in seconds, otherwise bpm would change the zoom

@cme
Copy link
Contributor Author

cme commented Feb 12, 2021

Ah! I am ok with this,

Thanks. I was honestly quite worried since I know this goes a different direction from work you've already done :)

and I think that MAX_NOTES = 192 can be easily be multiplied by further factors. (Still it whould be multiple of 192, unless you use some compensation like in #1127 which allows to work at any max-resolution to get any true resolution, up to sample rate freq)

I don't see why it would need to be limited to multiples of 192?

What if I want to set in the same song triplet, then a quintuplet, then a septuplet somewhere, a 11tuplet and a 1/64?
The resolution would be 64 * 5 * 3 * 7 * 11= 73920 but I guess that we would limit the max resolution to some smaller value

I don't see any need to limit it. So long as we make sure the time-keeping in the core is up to the job and can allow non-integer numbers of frames per tick (and make sure there's nothing that occupies time proportional to individual ticks if the frames per tick should be smaller than 1).

For exporting to MIDI things would have to be retimed slightly because you can't express a resolution of more than 32k in a MIDI file, but 32k is so fine grained as to not make an audible difference.

@cme
Copy link
Contributor Author

cme commented Feb 12, 2021

Importing patterns should retime if necessary to change the resolution of the song & new pattern to something compatible

AFAIU the resolution of the Song must be a common multiple of the resolutions used in the patterns. Do we restrict the latter? Otherwise loading specific combinations of patterns might be forbidden.

The core currently only has a single per-song resolution, so all patterns are assumed to be in the same resolution. It probably isn't trivial to support more than one, but it's easy to re-time a pattern (and the rest of the song) to a common resolution when importing.

@oddtime
Copy link
Contributor

oddtime commented Feb 12, 2021

I don't see why it would need to be limited to multiples of 192

to allow precision for triplets and 64th notes.

@cme
Copy link
Contributor Author

cme commented Feb 12, 2021

I don't see why it would need to be limited to multiples of 192

to allow precision for triplets and 64th notes.

If the song contains triplets and 64ths, yes. But otherwise there's no need to make that stipulation. Since the resolution is dynamically variable and patterns / songs can be retimed arbitrarily at run time, there's no need to set limits. :)

(Although setting some minimum resolution for the sake of having meaningful position indication in the position rulers is probably a good idea)

@oddtime
Copy link
Contributor

oddtime commented Feb 12, 2021

yes of course, when it comes dynamic

@oddtime
Copy link
Contributor

oddtime commented Feb 12, 2021

I think loud here: why do we need an integer position in ticks depending on TPQN??

I am thinking to something that is quite different but maybe the modification is so brief that it will be much preferable than implementing this pull request and will avoid all the work to find out the minimal resolution and rewriting the note positions.

What if the position of notes was expressed directly in whole notes durations? a float would be enough (since the max resolution is really limited by the sample rate), but not precise to mantain the information (think to GUI note representation issues). two int (num/den) values would be perfect to store any rational number.

@oddtime
Copy link
Contributor

oddtime commented Feb 13, 2021

some more reflections

the time in here would be in frames and the ticks don't have to be integers anymore (with a function mapping between frames and ticks)

@theGreatWhiteShark I initially didn't pay attention to what you wrote here about non integer ticks, but now I think that this something similar to what I was thinking in my previous comment.

If there are not major troubles involved, in these weeks I would like to try to implement such different way of storing the positions of notes, and probably this is the right moment to try it. Once we allow float ticks, is there need to deal with max_resolution? Probably not, the time unit would be ideally the whole note duration instead. I am still in doubt about the eventual necessity of having the note position store in num/den fashion, while thinking to tuplets visualization, but except for the fact that you have to calculate the position with a division at run time (not a big deal), basically the concept is similar:
for example the corresponding positions for tick = 48 (for MAX_NOTES = 192) is 1/4. Tuplet notes positioning is simply straight forward, e.g. the n-th grid line with a grid resolution of 1/10 (8th-quintuplets) is simply at position n/10 (that is n* 1/8 * 4/5 of course, in fact the tuplet ratio for standard quintuplets is 5/4).
Of course midi export has to choose a tick resolution because midi works like that (calculating the minimal tick resolution is even simpler with num/den position fashion) and convert positions into ticks, but to choose the resolution I think we should consider the humanization parameters finally (which are not taken into account now if I am not wrong).

@cme
Copy link
Contributor Author

cme commented Feb 13, 2021

I think loud here: why do we need an integer position in ticks depending on TPQN??

I am thinking to something that is quite different but maybe the modification is so brief that it will be much preferable than implementing this pull request and will avoid all the work to find out the minimal resolution and rewriting the note positions.

What if the position of notes was expressed directly in whole notes durations? a float would be enough (since the max resolution is really limited by the sample rate), but not precise to mantain the information (think to GUI note representation issues). two int (num/den) values would be perfect to store any rational number.

I initially thought float note positions representations would break a lot, but the more I think on it, the more I think it's a reasonable idea.

I don't think there's any need for an explicitly rational representation of the note positions; particularly if it's just for the sake of the GUI, it shouldn't be in the fundamental representation. If the issue is just accuracy for telling what's exactly on a particular fraction and what isn't, for the sake of manipulation in the GUI, the GUI should accept a range of times rather than an absolute time.

It would be good to encapsulate that functionality into its own class to represent the position, that would largely remove the need for changes to the GUI code I think.

@oddtime
Copy link
Contributor

oddtime commented Feb 13, 2021

My thought about the explicit representation was not for the GUI only but for a precise representation of how rhythms are written in scores (which the GUI has to represent). It is very ideal maybe.
Have you already thought how to deal with midi tick resolution in midi export, in case of float positions?

Always in case of float positions,

the GUI should accept a range of times rather than an absolute time

how small should the range be? Should it depend on the architecture?
And how to get the tuplet numerator (which act as denominator, if you remember/read the comments in #1127...) of a note from its float position (if we want to show it in the Gui)? I don't know if there exist smart methods to do it... I mean getting the explicit rational representation from the float ratio.

@cme
Copy link
Contributor Author

cme commented Feb 13, 2021

(ETA: assume double every time I say float here!)

My thought about the explicit representation was not for the GUI only but for a precise representation of how rhythms are written in scores (which the GUI has to represent). It is very ideal maybe.
Have you already thought how to deal with midi tick resolution in midi export, in case of float positions?

Haven't thought about it yet specifically much, but it doesn't seem too hard a problem. Even a very naive optimisation approach doesn't take much compute time. For example, doing a sweep of possible resolution values up to a given maximum, computing overall timing error for each resolution and using branch-and-bound to quickly prune out higher error resolutions, and then pick the resolution that gives the minimum error. That would be O(notes * max_resolution). Making a pessimistic guess of about 100 cycles per note*resolution on a 1GHz machine, and the maximum possible resolution MIDI can represent that would allow searching about 300 notes per second before considering branch-and-bound pruning. Not great, but not bad.

I'm sure there are much better ways, that's just the first thing I thought of :)

Always in case of float positions,

the GUI should accept a range of times rather than an absolute time

how small should the range be? Should it depend on the architecture?

It should depend on the use. Mostly I think that use tends to be mapping click positions to notes (that isn't as well isolated as it should be) in which case the range ought to be proportional to the pixel size. Are there other use cases that I'm just being too dim to remember?

And how to get the tuplet numerator (which act as denominator, if you remember/read the comments in #1127...) of a note from its float position (if we want to show it in the Gui)? I don't know if there exist smart methods to do it... I mean getting the explicit rational representation from the float ratio.

I don't remember seeing that being done in the PR and don't remember reading anything, but there are a few different algorithms for approximating a rational representation of a float in a very short amount of time if it should be necessary.

@oddtime
Copy link
Contributor

oddtime commented Feb 13, 2021

It should depend on the use. Mostly I think that use tends to be mapping click positions to notes (that isn't as well isolated as it should be) in which case the range ought to be proportional to the pixel size. Are there other use cases that I'm just being too dim to remember?

Not sure to follow, but it seems to me that you are thinking to a GUI behaviour that doesn't exist yet...
Since now the grid is "magnetic", the range I was talking about doesn't depend on pixels but on the possible approximation errors made by floats when searching if there is or not a note in the nearest grid line to the clicked pixel (to decide to add or remove the note on click).
For example (in decimal digits) triplets are at n * 0.33333... positions (in whole note units) but something may happen (in copying and paste for example) altering some last digits.

@cme cme changed the base branch from master to development February 13, 2021 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants