Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refractoring HAAQI #225

Closed
groadabike opened this issue Mar 15, 2023 · 8 comments
Closed

Refractoring HAAQI #225

groadabike opened this issue Mar 15, 2023 · 8 comments
Assignees
Labels
Cadenza enhancement New feature or request

Comments

@groadabike
Copy link
Contributor

Running HAAQI in full-length song is very slow.
This affects both tasks of this first round of Cadenza

@groadabike groadabike added enhancement New feature or request Cadenza labels Mar 15, 2023
@groadabike groadabike self-assigned this Mar 15, 2023
@groadabike
Copy link
Contributor Author

Hi @ns-rse, @jonbarker68

I've been trying to speed up the HAAQI code.
It is pretty slow because they have several (around 8) full autocorrelations of very long signals (about a 3-min song).

In the code, we are using scipy which is not supported by numba but suppose to be faster than np.correlate for long signals.
But, I did some profiling using both and couln't find a significant difference.

We can't just add @jit(nopython=True) on top of this function because Numba only supports the default of numpy.correlate, i.e., correlation with mode = 'valid' and not 'full'

https://numpy.org/doc/stable/reference/generated/numpy.correlate.html
https://github.com/numba/numba/blob/07f824e49e5e2f158b79e7ba787674272baf1da0/numba/np/arraymath.py#L4199

I tried to overwrite the numba's numpy.correlate overload but I wasn't able to do it successfully.
Have you ever tried something like that?
Is it worth trying to define our own overload to try to accelerate the code?

https://numba.pydata.org/numba-doc/dev/extending/overloading-guide.html?highlight=overload

Thank you for your help.

@groadabike
Copy link
Contributor Author

groadabike commented Mar 24, 2023

@ns-rse, @jonbarker68

After the release of pyclarity 0.3.0 I want to start working on improving HAAQI.
There are many changes we can do to make it more pythonic and improve its readability and performance.

For example:

  1. We can cache most of the functions. This would help in all cases when the references need to be processed more than one time. And/or,
  2. We can refactor the code to an abstract class. Then haaqi, haspi and hasqi can implement or overwrite the methods. Some functions in haspi.eb are common like ear_model that should be a method under the abstract class, and others functions are metric specific like eb.melcor9 which is only used by haaqi and only need to be a method in haaqi class.
  3. We need to add type hints.
  4. Vectorise the for loops when possible.

What are your thoughts? What else would you do to improve the metrics?

@ns-rse
Copy link
Contributor

ns-rse commented Mar 24, 2023

These all seem like eminently sensible ideas @groadabike

Obviously vectorising loops are going to offer good speed ups and restructuring into abstract classes are really useful.

Has your profiling covered more than just numpy vs scipy for correlation?

If not it might be worth using profiling against the code base to determine which areas might benefit most from careful refactoring.

@groadabike
Copy link
Contributor Author

The profiling I am running evaluates the time of each function when running haaqi.
35% - 40% of the time is in function eb.bm_covary which has 6 correlations in it and nested for loops. (eb.bm_covary)(

def bm_covary(
).

@groadabike
Copy link
Contributor Author

@ns-rse , @jonbarker68

In a quick review, I can see 5 possible classes

              HA_METRICS
                   |
               __ _|_______
              |           |
      HA_QUALITY        HASPI 
HAAQI         HASQI

we could have a HA_PERCEPTUAL as the father of haspi but I can't see a benefit of doing that.
I have a class ha_quality as parent for haaqi and haspi in order to have the common methods in an abstract class.
obviously, we could have the common methods as part of the parent ha_metrics

@ns-rse, how would you organise these classes?
Any advise would be great.
Thank you

Below, I have 2 tables,

  1. The functions each metric is calling
  2. The constructor parameters

These have to be reimplemented in a way that doesn't affect the previous use.

Table one

Functions each metric is using

Haaqi Hasqi Haspi
eb.ear_model eb.ear_model eb.ear_model
eb.env_smooth eb.env_smooth  
eb.melcor9    
eb.spectrum_diff eb.spectrum_diff  
eb.bm_covary eb.bm_covary  
eb.ave_covary2 eb.ave_covary2  
  eb.mel_cepstrum_correlation  
    ebm.env_filter
    ebm.cepstral_correlation_coef
    ebm.fir_modulation_filter
    ebm.modulation_cross_correlation
    ip.get_neural_net
    ip.nn_feed_forward_ensemble

Table Two

Constructor Parameters

reference: np.ndarray, reference, reference: np.ndarray,
reference_freq: int, reference_freq, reference_freq: int,
processed: np.ndarray, processed, processed: np.ndarray,
processed_freq: int, processed_freq, processed_freq: int,
hearing_loss: np.ndarray, hearing_loss, hearing_loss,
equalisation: int, equalisation=1,  
level1: int = 65, level1=65, level1: int = 65,
silence_threshold: float = 2.5, silence_threshold=2.5,  
add_noise: float = 0.0, add_noise=0.0,  
segment_covariance: int = 16, segment_covariance=16,  
    f_lp: int = 320,
    itype: int = 0,

@ns-rse
Copy link
Contributor

ns-rse commented Mar 27, 2023

Hi @groadabike ,

I've not masses of experience with design patterns (I've no formal training in computer science so pick things up as I go along) and usually refer to Refactoring Guru when encountering them/needing to implement them.

It sounds like you're proposing using Abstract Factory and so ha_metrics would decide which constuctor to return and in turn, ha_quality would be deciding which quality metric (haaqi or hasqi) constructor to return.

Another possible design pattern to use is Builder which allows construction of complex interfaces but they don't always have to the same construction processes, so maybe that would be preferable here given there are some differences in what each is required.

Not sure if any of that helps at all I'm afraid.

@groadabike groadabike changed the title Reduce HAAQI times Refractoring HAAQI Apr 18, 2023
@jonbarker68
Copy link
Contributor

For reference, there is also a python HASPI translation here
https://github.com/nii-yamagishilab/NELE-GAN/tree/master/pyHASPI

@groadabike groadabike mentioned this issue Apr 24, 2023
3 tasks
@groadabike groadabike linked a pull request Apr 24, 2023 that will close this issue
3 tasks
@groadabike
Copy link
Contributor Author

HAAQI refactoring was done in #364
Closing this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cadenza enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants