Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce size by removing scipy #451

Open
ojnas opened this issue Nov 7, 2022 · 4 comments
Open

Reduce size by removing scipy #451

ojnas opened this issue Nov 7, 2022 · 4 comments

Comments

@ojnas
Copy link
Contributor

ojnas commented Nov 7, 2022

Adding gnpy (using pip install) when e.g. building an application as docker containers increases the size of the application by a few 100's of MBs, which can sometimes be a bit problematic. The main reason is the requirements, which includes some very large packages, the largest one being scipy. Searching through the code, I can only find two uses of scipy: 1) constants (which could easily be replaced by internal definitions) and 2) interp1d from scipy.interpolate (for which there is a practically equivalent function in numpy). Could removing the dependency on scipy be an option?

Another large dependency is pandas, which only seems to be used in a couple of tests.

jktjkt added a commit that referenced this issue Nov 15, 2022
Pandas is only used from the test suite.

Bug: #451
Change-Id: Iafd02c800e5b7772e180979d19b81a2eda0e588f
@AndreaDAmico
Copy link
Contributor

Unfortunately, numpy.interp is not equivalent to scipy.interpolate.interp1d and the latter provides more flexible usage.
In GNPy scipy.interpolate.interp1d cannot be replaced by numpy.interp seamlessly, because of the usage in gnpy.core.science_utils, where scipy.interpolate.interp1d allows the interpolation along one selected axis (lines 171, 172, 412, 435).
This obstacle can be overcome introducing an inline for loop but this solution won't scale properly with the integration resolution (I tested this solution).

@ojnas
Copy link
Contributor Author

ojnas commented Nov 29, 2022

Hi @AndreaDAmico,

with "inline for loop", do you mean something like:

power_profile = asarray([interp(z_final, z, p) for p in power_profile])

instead of the current:

power_profile = interp1d(z, power_profile, axis=1)(z_final)

I tested this and it seems to work. In what way does it not scale properly?

@ojnas
Copy link
Contributor Author

ojnas commented Nov 29, 2022

@AndreaDAmico
Copy link
Contributor

Hi @ojnas,

Exactly, that is the solution I was talking about. I ran some simulations changing the sim_params.raman_params.result_spatial_resolution (which determines the size of z_final) and the number of channels.
In the figure that I attached to this comment, you can see that the numpy solution is always at least the double slower than the scipy solution.
That is why I am saying that this change cannot be done seamlessly.

time_scaling

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants