Evaluate and improve the WRITE performance of On Demand Feature View #4207

Closed
shuchu opened this issue May 17, 2024 · 3 comments

shuchu commented May 17, 2024

Expected Behavior

The On Demand Feature View has implementations that use either Python native objects or Pandas objects as the input/output.
We want to evaluate their performance in terms of run time. Moreover, we want to identify the performance bottlenecks at the code-execution level.

Current Behavior

Steps to reproduce

Specifications

  • Version: 0.37.1
  • Platform: Linux/Ubuntu
  • Subsystem:

Possible Solution

Use cProfile together with the existing unit test functions.
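
A minimal sketch of that idea, using only the standard library. `TransformTest` is a hypothetical stand-in for an existing On Demand Feature View unit test, and `profile_test_case` is an illustrative helper, not part of Feast:

```python
import cProfile
import io
import pstats
import unittest


class TransformTest(unittest.TestCase):
    """Hypothetical stand-in for an existing On Demand Feature View unit test."""

    def test_transform(self):
        rows = {"a": list(range(1000)), "b": list(range(1000))}
        out = {"sum": [x + y for x, y in zip(rows["a"], rows["b"])]}
        self.assertEqual(out["sum"][10], 20)


def profile_test_case(test_case_cls, top=10):
    """Run a TestCase under cProfile and return (test result, report text)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_cls)
    # Send the runner's own output to a buffer so it does not mix with the report.
    runner = unittest.TextTestRunner(stream=io.StringIO())

    profiler = cProfile.Profile()
    profiler.enable()
    result = runner.run(suite)
    profiler.disable()

    # Render the `top` entries, ranked by cumulative time, into a string.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_test_case(TransformTest)
    print(report)
```

Sorting by cumulative time surfaces the call chain that dominates the test, which is usually the quickest way to spot where the Pandas and Python native code paths diverge.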


shuchu commented May 19, 2024

A quick run shows that the Pandas implementation performs much worse than the Python native one.

Screenshot 2024-05-18 231742
Screenshot 2024-05-18 231751

franciscojavierarceo commented

Example code for profiling with cProfile and viewing the result with snakeviz:

import cProfile

from scipy.optimize import minimize


def main():
    # c.f. https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
    func = lambda x: (x[0] - 1) ** 2 + (x[1] - 2.5) ** 2
    x0 = (2, 0)

    constraints = (
        {"type": "ineq", "fun": lambda x: x[0] - 2 * x[1] + 2},
        {"type": "ineq", "fun": lambda x: -x[0] - 2 * x[1] + 6},
        {"type": "ineq", "fun": lambda x: -x[0] + 2 * x[1] + 2},
    )

    bounds = ((0, None), (0, None))

    result = minimize(func, x0, method="SLSQP", bounds=bounds, constraints=constraints)
    print(f"result:\n{result}")
    print(f"best fit parameters: {result.x}")


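if __name__ == "__main__":
    # Profile the whole run of main().
    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()
    # Write the stats to disk; view them afterwards with: snakeviz example.prof
    profiler.dump_stats("example.prof")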
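
If snakeviz is not available, the dumped file can also be inspected with the standard-library `pstats` module. The sketch below is an illustration only: `busy_work` is a hypothetical workload standing in for `main()`, and the stats are written to a temporary directory rather than to `example.prof`.

```python
import cProfile
import os
import pstats
import tempfile


def busy_work(n=10_000):
    """Hypothetical workload standing in for main() above."""
    return sum(i * i for i in range(n))


def dump_and_summarize(path, top=5):
    """Profile busy_work, dump the stats to `path`, then reload and rank them."""
    profiler = cProfile.Profile()
    profiler.enable()
    busy_work()
    profiler.disable()
    profiler.dump_stats(path)

    # Reload the dumped file from disk and print the `top` entries by own time.
    stats = pstats.Stats(path)
    stats.strip_dirs().sort_stats("tottime").print_stats(top)
    return stats


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dump_and_summarize(os.path.join(tmp, "example.prof"))
```

Sorting by `tottime` ranks functions by the time spent in their own body, which complements the cumulative view when looking for a single hot spot.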


franciscojavierarceo commented Jun 8, 2024

Done here: franciscojavierarceo/Python#23

Pandas

image

Python

image

Comparison

Local performance:
Pandas runtime = 8.65 ms
Python runtime = 0.907 ms

The Python native version is about 9.5x faster (8.65 ms / 0.907 ms), a nearly 10x reduction in processing time.
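
For reference, a comparison like this can be reproduced with a small `timeit` harness. This is a sketch under assumptions: `slow_transform` and `fast_transform` are hypothetical placeholders for the Pandas-mode and Python-mode transformations, and the helper names are illustrative.

```python
import timeit


def best_runtime_ms(func, repeat=5, number=100):
    """Best per-call runtime of func() in milliseconds over several repeats."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number * 1000.0


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


def slow_transform():
    """Hypothetical placeholder for the Pandas-mode transformation."""
    return sorted(str(i) for i in range(2000))


def fast_transform():
    """Hypothetical placeholder for the Python-mode transformation."""
    return [i + 1 for i in range(2000)]


if __name__ == "__main__":
    slow_ms = best_runtime_ms(slow_transform)
    fast_ms = best_runtime_ms(fast_transform)
    print(f"measured speedup: {speedup(slow_ms, fast_ms):.2f}x")
    # Applying the same ratio to the figures reported in this comment.
    print(f"reported speedup: {speedup(8.65, 0.907):.2f}x")
```

Taking the minimum over several repeats reduces the effect of background noise on the machine, so the ratio is more stable than one computed from a single run.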
