-
Notifications
You must be signed in to change notification settings - Fork 531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Shuffle #1072
Comments
I have to ask, are you using the entire array at any point or are you using columns only ? |
So this problem arose for me because I'm trying to shuffle prior to training to improve convergence. So I can't quite shuffle the entire array, but just shuffle the row/columns preserving the entire vector held within them. The |
@jramapuram for the first function, are you talking about a random shuffle ? |
Yup, random shuffle (something like knuth shuffle) but on rows (or columns) instead of fully random of all elements. This is needed for shuffling training data instances for example. |
Just thinking about it, it can be done using sort_index/sort_by_key. If you want to randomize the indices per column, this is something you can do.
Explanation: The shuffled_indices actually start as a range of [0, dim(0)] repeated for every column. Since the randomizer has different values for every column, the shuffled_indices come out to be shuffled based on the randomizer. If you dont want indices but want shuffled values, the you can use the sort_by_key function
This will shuffle the values based on the randomizer. You can do the same with small modifications for row-wise. |
@shehzan10 : Will this be faster than the method below? pub fn row_planes(input: &Array, first: u64, last: u64) -> Result<Array, AfError> {
af::index(input, &[Seq::new(first as f64, last as f64, 1.0)
, Seq::default()
, Seq::default()])
}
pub fn set_row_planes(input: &Array, new_planes: &Array
, first: u64, last: u64) -> Result<Array, AfError>
{
match input.dims().unwrap().ndims() {
4 => af::assign_seq(input, &[Seq::new(first as f64, last as f64, 1.0)
, Seq::default()
, Seq::default()
, Seq::default()]
, new_planes),
3 => af::assign_seq(input, &[Seq::new(first as f64, last as f64, 1.0)
, Seq::default()
, Seq::default()]
, new_planes),
2 => af::assign_seq(input, &[Seq::new(first as f64, last as f64, 1.0)
, Seq::default()]
, new_planes),
1 => af::assign_seq(input, &[Seq::new(first as f64, last as f64, 1.0)]
, new_planes),
_ => panic!("unknown dimensions provided to set_row_planes"),
}
}
pub fn shuffle_array(v: &mut[&mut Array], rows: u64) {
let mut rng = rand::thread_rng();
for row in 0..rows {
let rnd_row = rng.gen_range(0, rows - row);
for mat in v.iter_mut() { //swap all tensors similarly
let dims = mat.dims().unwrap();
let rnd_plane = row_plane(mat, rnd_row).unwrap();
let orig_plane = row_plane(mat, dims[0] - row - 1).unwrap();
**mat = set_row_plane(mat, &rnd_plane, dims[0] - row - 1).unwrap();
**mat = set_row_plane(mat, &orig_plane, rnd_row).unwrap();
}
}
} |
I'm not completely sure. If you can give me the benchmark time of the function for the sizes you are testing, I can probably give you a better answer. Also are you looking to to shuffle it in a particular order? Or is it a matter more random is good. You should be able to test it yourself once the v3.3.2 installers are out (probably on 4/26/2016). |
Hey @shehzan10 , so let me clarify this idea : Generate uniform random values (same dims as matrix). Then sort them (via sort_index) and use that to run extern crate arrayfire as af;
use af::{Array, Dim4};
pub fn shuffle_array(v: &mut[&mut Array]) {
let row_dims = v[0].dims();
// sort randomized rows
let mut randomizer = af::randu::<u32>(row_dims);
randomizer = af::sort_index(&randomizer, 0, false).1;
// also tried
// randomizer = af::sort_index(&randomizer, 0, false).1;
af::print(&randomizer);
// sort the array according to the randomized sorted rows
for mat in v.iter_mut() { //swap all tensors similarly
**mat = af::sort_by_key(&randomizer, mat, 0, false).1;
}
}
fn main() {
let idims = Dim4::new(&[3,3,1,1]);
let idata = vec![0,1,2,3,4,5,6,7,8];
let mut input_array = af::transpose(&Array::new(&idata, idims), false);
af::print(&input_array);
shuffle_array(&mut[&mut input_array]);
af::print(&input_array);
} Results [first is true array, second is randomizer and third is sorted]: No Name Array
[3 3 1 1]
0 1 2
3 4 5
6 7 8
No Name Array
[3 3 1 1]
2 2 1
0 1 2
1 0 0
No Name Array
[3 3 1 1]
0 1 5
6 4 2
3 7 8 So when I tried to sort the random matrix by the columns (and use that as the index), which makes the most sense to me it ends up breaking row symmetry. I.e. elements from one row go into another row. |
A random shuffle can be implemented like this: int len = original.dims(1); // use original.dims(0) for rows
af::array tmp = af::randu(len, 1);
af::array val, idx;
af::sort(val, idx, tmp);
af::array shuffled = original(span, idx); // use original(idx, span) for rows You don't need 2 sorts. |
It helps to think of the keys+values as a pair. For example, if you want top sort a complex number by just its real component. If the real component is being reordered, the imaginary component will follow the same rule. @pavanky wouldn't a single sort_by_key be better than sort+indexing? Here's what I have
As you can see, the sorted_in is not shuffled input. |
@shehzan10 shuffle needs to preserve rows when shuffling columns and vice versa. You can see that the row |
@pavanky is the method you proposed in this thread the preferred one or has there been an update since then about shuffling a firearray? |
@papamarkou that is still a valid approach. If you are referring to mixed batch support in sort_by_key that pavan mentioned, no that hasn't been added I think. |
Thank you for letting me know @9prady9 - I will use the proposed random shuffle in this case. |
example signature:
where
const int seq_dim
is the dim onto which shuffling takes place.Furthermore, it would make sense implementing something like this to resolve the above:
The text was updated successfully, but these errors were encountered: