Getting the flows #2

Closed
rlouf opened this issue May 23, 2015 · 23 comments

Comments

@rlouf
Contributor

rlouf commented May 23, 2015

Thanks for the wrapper, it has been very useful to me! For my application, I would like to get the flows as well, and I was wondering whether it would be possible?

I read over the original C++ code, and if I understood correctly, you can pass a pointer to a std::vector<std::vector<NUM_T>> initialised with 0, and it will modify the container in place, giving you the flows. Am I right?
I am a complete beginner with Cython. Would you have any pointers to useful tutorials (besides the docs) to get me started? Since Python functions do not modify objects in place, I am a bit confused.

I'll see what I can do, and file a PR once/if I ever manage to include this function!

@rlouf
Contributor Author

rlouf commented May 28, 2015

So it seems that this is a bit tricky to do with the current C++ implementation, since we'd have to pass a pointer to a std::vector<std::vector<NUM_T>>, and I haven't found anything encouraging on the web that would imply Cython can do that.

I suggest writing a wrapping function in the C++ code that creates a nested vector, passes it to the original function and returns it. I think Cython can deal with simple vector outputs. Unless you have any better idea?
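
The wrapper idea can be sketched in plain Python, where a function does mutate a list it receives because it gets a reference to the same object. This is only an illustration of the pattern: `fill_flows_in_place` and `compute_flows` are hypothetical names, and the greedy filling rule is a stand-in, not the EMD solver from the C++ code.

```python
def fill_flows_in_place(flows, first, second):
    """Hypothetical stand-in for the C++ routine: writes into `flows` in place.

    It greedily moves mass from bin i of `first` to bin j of `second`.
    This is NOT the real EMD solver, only something that mutates its argument.
    """
    remaining = list(second)
    for i, mass in enumerate(first):
        for j in range(len(remaining)):
            moved = min(mass, remaining[j])
            flows[i][j] += moved
            mass -= moved
            remaining[j] -= moved


def compute_flows(first, second):
    """Wrapper: allocate a zeroed nested container, let the routine fill it, return it."""
    flows = [[0.0] * len(second) for _ in first]
    fill_flows_in_place(flows, first, second)
    return flows


flows = compute_flows([2.0, 1.0], [1.0, 2.0])
print(flows)  # [[1.0, 1.0], [0.0, 1.0]]
```

The caller never sees the zero-initialised container; it only receives the filled result, which is the shape of interface that is easy to expose from a wrapper.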

@wmayner
Owner

wmayner commented May 28, 2015

I'm fine with that approach—I'm afraid that since our application doesn't require the flows, I can't really spend too much time on this feature. Happy to look over a PR though!

@rlouf
Contributor Author

rlouf commented May 28, 2015

NP - I need it, and I think it might be useful to someone else, so I might as well push it to the repo!

@gojomo

gojomo commented Oct 18, 2015

@rlouf – any luck with the flows? If not, I may be able to help. (And, am I correct in understanding that once one has the flows, you can also deduce where any 'remainder' unmoved/unmatched mass remains?)
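
On the remainder question, a minimal sketch, assuming `flow[i][j]` holds the mass moved from bin `i` of the first signature to bin `j` of the second (that convention is an assumption here, not something stated in this thread). Under that assumption the unmoved mass is what is left after subtracting row sums from the first signature and column sums from the second. The function name `unmoved_mass` and the numbers are made up for illustration.

```python
def unmoved_mass(first, second, flow):
    """Return the mass left over in each bin of both signatures.

    Assumes flow[i][j] is the mass moved from bin i of `first`
    to bin j of `second`.
    """
    left_in_first = [first[i] - sum(flow[i]) for i in range(len(first))]
    left_in_second = [
        second[j] - sum(row[j] for row in flow) for j in range(len(second))
    ]
    return left_in_first, left_in_second


first = [0.0, 1.0]
second = [5.0, 3.0]
flow = [[0.0, 0.0], [0.0, 1.0]]  # hand-written flow, not library output
print(unmoved_mass(first, second, flow))  # ([0.0, 0.0], [5.0, 2.0])
```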

@rlouf
Contributor Author

rlouf commented Dec 15, 2015

@gojomo Not so far, I had to focus on my thesis writing for a while. But I will need this feature soon for my research. My C++ is a bit rusty, so I would definitely appreciate some help with this!

@rlouf
Contributor Author

rlouf commented Jan 19, 2016

@gojomo I will start working on this relatively soon, actually (maybe a couple of weeks).

The C++ function takes a pointer as an input for the flows, and I am not sure how/if you can deal with this in Cython. The other idea I had was to wrap the original function with another function that initialises the flow matrix, passes a pointer to the original function, and returns the matrix. Any thoughts?

@wmayner
Owner

wmayner commented Jan 19, 2016

I had to deal with a similar problem in one of my other projects; I managed to get Cython to pass a reference to an array to a C++ function that filled the referenced array with values, and then to turn that into a NumPy array (without copying any data) to expose to Python.

Perhaps some of that code might be useful. I got the idea from a helpful person in this Google groups discussion, and I used it in this file.

@rlouf
Contributor Author

rlouf commented Jan 20, 2016

Thanks, that's really helpful! If I understand correctly, that is for simple vectors, right? Any idea for a vector of vectors (this is out of my league too :) )?
Maybe it can also be adapted to output a matrix by playing with the indices anyway...

@wmayner
Owner

wmayner commented Jan 21, 2016

Yes, I believe it is much simpler to interpret the vector as multidimensional in Python rather than treating it as such in C++. In fact, that is exactly what's happening in the code I linked: the C++ has two indices which are used to fill the linear vector, and then the corresponding NumPy array is reshaped to the proper dimensions afterwards.
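
The flat-buffer idea can be illustrated with the standard library alone. This is a sketch, not the linked code: it uses `array` and `memoryview` as stand-ins for the C++ vector and the NumPy array, and the fill values are arbitrary. With NumPy, the last step would be a `reshape` on the array that wraps the buffer.

```python
from array import array

rows, cols = 2, 3

# One contiguous, zero-initialised buffer of doubles (what a C++ routine would see).
flat = array("d", [0.0] * (rows * cols))

# Fill it with two indices, row-major: element (i, j) lives at i * cols + j.
for i in range(rows):
    for j in range(cols):
        flat[i * cols + j] = 10.0 * i + j

# Reinterpret the same memory as a 2-D view; no data is copied.
# A 1-D to N-D cast has to go through a byte view first.
matrix = memoryview(flat).cast("B").cast("d", shape=[rows, cols])

print(matrix.tolist())  # [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]

# Writing through the flat buffer is visible through the 2-D view.
flat[5] = -1.0
print(matrix[1, 2])  # -1.0
```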

@walteracevedo

Hello, I've been using the Fast-EMD code (the C++ routine and the Matlab wrapper) for a data assimilation application (ETPF: ensemble transform particle filter) where the important output is precisely the flow. I am now extending a Python data assimilation code to ETPF, so I wonder whether there is currently a simple way to obtain the flow via pyemd?
Best regards!

@wmayner
Owner

wmayner commented Jun 2, 2016

@walteracevedo, glad to hear you're finding the wrapper useful.

@rlouf, did you have any luck with getting the flows?

@rlouf
Contributor Author

rlouf commented Jun 3, 2016 via email

@walteracevedo

That's a pity. I have no idea about Cython, otherwise I would gladly help in the development.
Any idea on how to proceed?

@rlouf
Contributor Author

rlouf commented Jun 3, 2016

The C++ code can output the flows if you pass a parameter, but there is some template-y thing going on there that I don't understand. The idea would be to instantiate the class with the parameter, and then to find a way to make Cython pass the vector of pointers to Python, which is not easy either.

@wmayner
Owner

wmayner commented Jun 3, 2016

@walteracevedo, I can't do much on this at the moment, but this may be useful in getting started:

The interface to the underlying C++ implementation is described here. We would need to provide a pointer to an array that “has enough space and is initialized to zeros.”

Such an array can be declared and initialized in Cython, i.e. in emd.pyx. I used a similar strategy in another one of my projects, on this line. There, I'm making a call to this function, supplying it with references to the first elements of the various Int32Wrappers, which it then fills with data.

Note that the EMD code requires a pointer, rather than a reference, but you can probably adapt the above approach to work that way.
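
As a rough illustration of the "zeroed buffer plus pointer" contract, here is a sketch that uses `ctypes` instead of Cython, so it is a swapped-in technique rather than the approach described above. The signature `void fill(double *out, size_t n)` and the callback `_fill` are hypothetical; a Python callback stands in for the native routine so the example runs without compiling anything.

```python
import ctypes

# Hypothetical native signature: void fill(double *out, size_t n),
# expected to write n * n values into `out`.
FILL_PROTOTYPE = ctypes.CFUNCTYPE(
    None, ctypes.POINTER(ctypes.c_double), ctypes.c_size_t
)


def _fill(out, n):
    """Python stand-in for the native routine: adds to the caller's buffer."""
    for i in range(n):
        out[i * n + i] += 1.0  # put some mass on the diagonal


fill = FILL_PROTOTYPE(_fill)

n = 3
# ctypes arrays start out zero-filled, so this buffer has enough space
# and is initialised to zeros.
buffer = (ctypes.c_double * (n * n))()
fill(buffer, n)  # the array is passed as a pointer to its first element

# Read the flat buffer back as n rows of n values.
flow = [list(buffer[i * n:(i + 1) * n]) for i in range(n)]
print(flow)  # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

The routine accumulates with `+=` rather than assigning, which is one reason a caller would need to hand over a buffer that is already zeroed.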

@walteracevedo

walteracevedo commented Jun 8, 2016

Hey again, I just got a bit of time from my boss to work on getting the flows via pyemd :D

One question:
The program from which I am calling pyemd is written in Python 2.7, but I have already noticed that some parts of the pyemd code assume Python 3.

Could that be a problem?

@wmayner
Owner

wmayner commented Jun 8, 2016

Yes, it may be a problem—it's written assuming Python 3. There are no plans to implement Python 2 support at present, since the wrapper was developed to be used as a dependency of another project, PyPhi, which is a Python 3 library.

However, if you need Python 2.7 support, it shouldn't be too difficult to develop a fork that's compatible, since there's not that much code.

Also, this shouldn't affect the strategy for getting the flows.

@wmayner
Owner

wmayner commented Aug 17, 2016

@walteracevedo, just FYI, I recently tried installing pyemd to a Python 2.7 virtual environment, and it seems to be compatible. I can't guarantee support in the future, but for now it should work.

@rlouf
Contributor Author

rlouf commented Nov 29, 2016

I finally found a way to get the flows. I am now working on a way to return both the flows and the EMD value without breaking the current API. I will open a PR when I'm done.

@walteracevedo

That's very good to hear! In the meantime, I came up with another solution: I took an interface to C/Fortran that I had developed before and added an extra layer of ctypes code. It is a rather ad hoc solution, but so far it has worked well for me. In case anyone is interested, the code can be found here: https://github.com/walteracevedo/fastEMD_python_fortran_wrapper

@rlouf
Contributor Author

rlouf commented Dec 14, 2016

Great! I'll try your solution at some point. I have issues with my implementation: I developed it on top of an old fork of pyemd, where it worked well, and now it won't compile (see PR #14)... Any help on the PR is appreciated!

@rlouf
Contributor Author

rlouf commented Jan 5, 2017

Solved by PR #15

@rlouf rlouf closed this as completed Jan 5, 2017
@wmayner
Owner

wmayner commented Jan 6, 2017

@walteracevedo, @gojomo, thanks to @rlouf's efforts the flow can now be obtained in the latest version (v0.4.0):

from pyemd import emd_with_flow
emd, flow = emd_with_flow(first_signature, second_signature, distance_matrix)

Enjoy!
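
One way to sanity-check a flow is to recompute the transport cost from it. The sketch below assumes `flow[i][j]` is the mass moved from bin `i` of the first signature to bin `j` of the second, and that for two signatures of equal total mass the distance equals the sum of flow times distance. Neither assumption is confirmed in this thread. The helper `transport_cost` is hypothetical and the numbers are hand-written, not pyemd output.

```python
def transport_cost(flow, distance_matrix):
    """Sum of flow[i][j] * distance_matrix[i][j] over all pairs of bins."""
    return sum(
        f * d
        for flow_row, dist_row in zip(flow, distance_matrix)
        for f, d in zip(flow_row, dist_row)
    )


# Moving [0.5, 0.5] onto [0.0, 1.0]: half the mass travels a distance of 1.
flow = [[0.0, 0.5], [0.0, 0.5]]
distance_matrix = [[0.0, 1.0], [1.0, 0.0]]
print(transport_cost(flow, distance_matrix))  # 0.5
```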
