Improve InterpolateAndExtrapolate performance for array valued maps #1347

Merged 12 commits on Apr 4, 2024

l-althueser
Contributor

This is a first attempt at improving the performance of InterpolateAndExtrapolate. The main issue at the moment is that we perform many unnecessary array operations for the sake of flexibility. For special use cases, however, small tests show a speedup of at least a factor of 2 if we do not store the intermediate results and instead use the Einstein summation convention.

This makes the code more complex, but it saves a lot of time for any pattern map operation. I kept the behavior identical to the previous implementation in all cases except the array-valued maps used for S1 and S2 patterns.
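As a rough illustration of the idea (not the code in this PR), the sketch below compares a weighted average that stores the intermediate product with one that contracts the neighbour axis directly via np.einsum. The inverse-distance weighting, the array names, and the shapes are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_query, n_neighbours, n_channels = 4, 3, 5

# Hypothetical inputs: for every query point, the arrays stored at its
# nearest neighbours and the distances to those neighbours.
neighbour_values = rng.random((n_query, n_neighbours, n_channels))
distances = rng.random((n_query, n_neighbours)) + 0.1
weights = 1.0 / distances  # assumed inverse-distance weighting

# Explicit version: materialise the weighted values, then sum over neighbours.
weighted = neighbour_values * weights[:, :, np.newaxis]  # intermediate array
result_explicit = weighted.sum(axis=1) / weights.sum(axis=1)[:, np.newaxis]

# Einstein-summation version: contract the neighbour axis j in one call,
# without storing the (n_query, n_neighbours, n_channels) product.
result_einsum = np.einsum("ij,ijk->ik", weights, neighbour_values)
result_einsum /= weights.sum(axis=1)[:, np.newaxis]

assert np.allclose(result_explicit, result_einsum)
```

Whether the einsum form is actually faster depends on the array sizes and the NumPy build, so it is worth timing both variants on realistic map dimensions.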

I tested locally with several use cases.

@coveralls

coveralls commented Mar 20, 2024

Coverage Status

coverage: 91.341% (+0.02%) from 91.317%
when pulling bc59ce2 on itp_map_speedup
into d6d96da on master.

@l-althueser
Contributor Author

Ready for review!

@HenningSE HenningSE requested a review from dachengx March 25, 2024 12:57
@yuema137 yuema137 self-requested a review March 25, 2024 21:33
@yuema137
Collaborator

yuema137 commented Apr 2, 2024

Hi @l-althueser, thanks for this PR!
I tested this function, and for single-dimension values the results look consistent between the old and new methods:
[Two comparison plots; images not preserved]

However, for multi-dimensional values there is a problem with the dimension of the weights. A minimal example to reproduce it:

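import numpy as np
from straxen import InterpolateAndExtrapolate  # assumed import path; not shown in the original comment

def generate_multidimensional_dataset(n_points, value_functions):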
    """
    Generate a 3D dataset where each point has multiple values (e.g., temperature, pressure, humidity).
    
    :param n_points: Number of points along each axis.
    :param value_functions: A list of functions to generate the values for each dimension.
    :return: points, multidimensional values
    """
    axes = np.linspace(-5, 5, n_points)
    xx, yy, zz = np.meshgrid(axes, axes, axes, indexing='ij')
    points = np.vstack([xx.ravel(), yy.ravel(), zz.ravel()]).T
    
    values = np.stack([func(xx, yy, zz) for func in value_functions], axis=-1).reshape(-1, len(value_functions))
    
    return points, values

# Example value functions for each dimension
def d1_func(x, y, z):
    return x**2 + y**2 + z**2

def d2_func(x, y, z):
    return np.sin(x) + np.cos(y) + np.tanh(z)

def d3_func(x, y, z):
    return np.exp(-(x**2 + y**2 + z**2))

# Generate the dataset
n_points = 10  # Adjust based on your computational capacity
value_functions = [d1_func, d2_func, d3_func]
points, multidimensional_values = generate_multidimensional_dataset(n_points, value_functions)
new_points = points
interpolator = InterpolateAndExtrapolate(points, multidimensional_values)
interpolated_df = interpolator(new_points)

In this example I got an error:

File ~/.local/lib/python3.9/site-packages/numpy/lib/function_base.py:540
    540 if wgt.shape[0] != a.shape[axis]:
    541     raise ValueError(
    542         "Length of weights not compatible with specified axis.")

TypeError: 1D weights expected when shapes of a and weights differ.
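For context, here is a minimal sketch of one way np.average can reject weights like this, independent of InterpolateAndExtrapolate. The shapes and variable names are assumptions chosen for illustration. The exact exception type and message depend on the NumPy version, so the example catches both TypeError and ValueError.

```python
import numpy as np

rng = np.random.default_rng(1)
values = rng.random((6, 4, 3))  # assumed layout: (n_query, n_neighbours, n_channels)
weights = rng.random((6, 4))    # one weight per (query, neighbour) pair

# Weights that are neither 1D nor the same shape as `values` are rejected.
try:
    np.average(values, weights=weights, axis=1)
    raised = False
except (TypeError, ValueError) as err:
    raised = True
    print(type(err).__name__, err)

# Broadcasting the weights to the full shape of `values` makes the shapes
# match, so the average over the neighbour axis is accepted.
full_weights = np.broadcast_to(weights[:, :, np.newaxis], values.shape)
averaged = np.average(values, weights=full_weights, axis=1)
print(averaged.shape)
```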

Also, I'm not sure what array_valued accounts for. Could you kindly add a docstring for it? Thanks a lot!

@dachengx
Collaborator

dachengx commented Apr 3, 2024

Also, I'm not sure what array_valued accounts for. Could you kindly add a docstring for it? Thanks a lot!

Hey @yuema137. array_valued means that the result for a position is not a single value but an array. So in your example, you should set array_valued=True.
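To make the distinction concrete, here is a toy sketch of how such a flag can change the shapes involved. The function toy_interpolate, its brute-force neighbour search, and its inverse-distance weighting are hypothetical stand-ins written for this example; they are not the library's implementation.

```python
import numpy as np

def toy_interpolate(points, values, query, n_neighbours=4, array_valued=False):
    """Hypothetical inverse-distance interpolator used only to show shapes."""
    # Distances between every query point and every map point: (n_query, n_points)
    dist = np.linalg.norm(query[:, None, :] - points[None, :, :], axis=-1)
    # Indices of the nearest map points per query: (n_query, n_neighbours)
    idx = np.argsort(dist, axis=1)[:, :n_neighbours]
    near_dist = np.take_along_axis(dist, idx, axis=1)
    weights = 1.0 / np.maximum(near_dist, 1e-9)  # guard against zero distance
    near_values = values[idx]  # (n_query, n_neighbours[, n_channels])
    if array_valued:
        # Each neighbour weight applies to every entry of that neighbour's array.
        weights = weights[:, :, None]
    return (weights * near_values).sum(axis=1) / weights.sum(axis=1)

axes = np.linspace(-1, 1, 3)
xx, yy, zz = np.meshgrid(axes, axes, axes, indexing="ij")
points = np.vstack([xx.ravel(), yy.ravel(), zz.ravel()]).T  # (27, 3)

scalar_values = (points**2).sum(axis=1)  # one number per position
array_values = np.stack(
    [scalar_values, np.sin(points[:, 0]), np.exp(-scalar_values)], axis=1
)  # one length-3 array per position

query = points[:5]
scalar_result = toy_interpolate(points, scalar_values, query)
array_result = toy_interpolate(points, array_values, query, array_valued=True)
print(scalar_result.shape, array_result.shape)
```

With scalar values the result has one entry per query point; with array values it has one row per query point and one column per entry of the stored array.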

Collaborator

@dachengx dachengx left a comment

Thanks @l-althueser. I tested the PR with a run, and it does not seem to change any results.

@dachengx dachengx merged commit d6b5e00 into master Apr 4, 2024
8 checks passed
@dachengx dachengx deleted the itp_map_speedup branch April 4, 2024 11:39