Apply NumPy's random functions using awkward inputs? #489

Closed
Duchstf opened this issue Oct 17, 2020 · 11 comments
Labels
duplicate This issue or pull request already exists feature New feature or request

Comments

@Duchstf

Duchstf commented Oct 17, 2020

Hello,

Thanks for the excellent work!! I'm working on an application where I want to apply np.random.normal to each element of an awkward array. I'm trying to do the following:

import numpy as np
import awkward1 as ak

a = ak.from_iter([[8], [7], [9, 11], [5]])
f = lambda x: np.random.normal(x, x * 0.15)
# f(a) gives errors
f_arr = np.frompyfunc(f, 1, 1)  # try making a ufunc
# f_arr(a) also gives errors

I'm wondering if anyone here has suggestions as to what I should do in this case. 😄 Our solution right now is basically to use nested loops to change each element, which is not very efficient.

@jpivarski
Member

It's not working because there isn't an Awkward function overriding this NumPy function. That would be a good feature to add (I hadn't thought of it), and it would be a whole new category of overload (it doesn't quite belong in ak.operations.structure, though it would be implemented in a similar way).

For the time being, I guess you'd have to unwrap the a object manually with a.layout.content until you get the underlying NumpyArray, convert it to a NumPy array with np.asarray, compute the random numbers, wrap the result as a ListOffsetArray64 using the same offsets as the original, and then put that in a new ak.Array. That's essentially what a built-in function would do, but for general structures, and it's also why such a function would be a nice addition.

@Duchstf
Author

Duchstf commented Oct 17, 2020

Thank you! There is something I want to clarify:

and then wrap it up as a ListOffsetArray64 using the same offsets as the original, and then put that in a new ak.Array

How exactly can I do this? (Sorry I'm just starting to learn awkward). So suppose I did the following:

b = np.asarray(a.layout.content)
b_random = np.random.normal(b, b*0.15)

What should I do to convert b_random back to an awkward array with the same offsets as a.layout.offsets?

@jpivarski
Member

(I'm writing from a phone, so it's hard to give examples.)

a.layout.offsets is one of the two arguments to the ak.layout.ListOffsetArray64 constructor (the other is the content). It puts the same list structure around your random numbers as the original list had, and passing the result to ak.Array gives it the same high-level interface as a (as opposed to a.layout). Try this in a terminal to see what I mean. The repr of these objects should help show what's going on; it's particularly instructive to practice on small objects and then scale up once you understand the structure.
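
For readers following along, here is a minimal sketch of what "flat content plus offsets" means, written with plain NumPy and Python lists rather than the Awkward layout classes. The names smear_flat and rewrap are hypothetical helpers made up for this illustration, and the offsets are written out by hand for the array [[8], [7], [9, 11], [5]].

```python
import numpy as np

# Flat content and offsets describing [[8], [7], [9, 11], [5]]:
# list i covers content[offsets[i]:offsets[i + 1]].
content = np.array([8.0, 7.0, 9.0, 11.0, 5.0])
offsets = np.array([0, 1, 2, 4, 5])

def smear_flat(flat, rel_sigma=0.15):
    # Hypothetical helper: draw one normal sample per element,
    # centered on the element with a width proportional to it.
    return np.random.normal(flat, flat * rel_sigma)

def rewrap(flat, offsets):
    # Hypothetical helper: cut the flat array back into lists
    # using consecutive pairs of offsets.
    return [flat[start:stop].tolist()
            for start, stop in zip(offsets[:-1], offsets[1:])]

smeared = smear_flat(content)
nested = rewrap(smeared, offsets)
print(nested)  # same list lengths as the original, with randomized values
```

The random numbers are computed once on the flat array, and the offsets alone carry the list structure, which is why reusing the original offsets reproduces the original nesting.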

@Duchstf
Author

Duchstf commented Oct 17, 2020

Well then, I guess I would do this to make a new awkward array with the same offsets?

ak.Array(ak.layout.ListOffsetArray64(a.layout.offsets, ak.Array(b_random).layout))

@Duchstf
Author

Duchstf commented Oct 17, 2020

This is my whole function. Would it be similar to what you have in mind for a future implementation?

import numpy as np
from awkward1 import Array
from awkward1.layout import ListOffsetArray64

def smear(arr):
    # Unwrap to a flat 1D NumPy array and smear each value by 15%
    numpy_arr = np.asarray(arr.layout.content)
    smeared_arr = np.random.normal(numpy_arr, numpy_arr * 0.15)

    # Re-wrap with the original offsets to restore the list structure
    return Array(ListOffsetArray64(arr.layout.offsets, Array(smeared_arr).layout))

@jpivarski
Member

Your NumpyArray is unnecessarily wrapped (Array) and unwrapped (.layout), but otherwise, yes. Also, the general function would work for all data structures. (There's an internal ak._util.broadcast_and_apply that generalizes the process of unwrapping and re-wrapping. That's how all of the ufuncs work. But since it's an internal function and not a part of the stable API, you should use the technique you use here.)

Oh! I guess the reason you wrapped and unwrapped the NumpyArray is that you didn't know it was called that. :) You can use ak.layout.NumpyArray instead of Array followed by .layout. The effect is the same, but it's more direct.

@Duchstf
Author

Duchstf commented Oct 18, 2020

Thanks for your help!

Duchstf closed this as completed on Oct 18, 2020
jpivarski changed the title from "How to apply ufunc on awkward arrays?" to "Apply NumPy's random functions using awkward inputs?" on Oct 18, 2020
jpivarski added the feature (New feature or request) label on Oct 18, 2020
@jpivarski
Member

I'm reopening this as a reminder to add the feature.

jpivarski reopened this on Oct 18, 2020
@Duchstf
Author

Duchstf commented Oct 18, 2020

OK. Also, if you can point me to where to look, I'd be willing to make the PR for the feature too!

@jpivarski
Member

I'll want to start a new submodule for this, so it might be done before there's enough of a pattern to build on. However, I could start with the randomization functions and if there are any others you need, it should be clear how to build on that pattern.

@jpivarski
Member

Closing this one because it's a time-traveling duplicate.
