Right now the indices are stored as global bit vectors so that the raw data can be indexed quickly. Compared with integer indices, bit vectors give good performance for logical operations like unions and intersections, but they cost more space. For instance, an FCS file of 250k events is about 15 MB.
With N events per gate, 170 gates end up with indices of N*170/8/10^6 = about 5.2 MB, which is roughly 1/3 of the raw data size.
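To make the arithmetic above easy to re-run for other files, here is a minimal sketch. The function name `bitvector_bytes` is made up for illustration, and N is assumed to be exactly 250,000, which gives about 5.3 MB; the slightly lower figure quoted above presumably comes from the real event count, since "250k" is rounded.

```python
def bitvector_bytes(n_events: int, n_gates: int) -> float:
    """Bytes needed when every gate stores one global bit vector (1 bit per event)."""
    return n_events * n_gates / 8


# Assumed numbers: 250,000 events and 170 gates.
size_mb = bitvector_bytes(250_000, 170) / 10**6
print(f"{size_mb:.2f} MB of indices, {size_mb / 15:.0%} of a 15 MB FCS file")
```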
We have the following solutions based on the discussions so far:
1. Compress the bit vectors (run-length encoding is one compression technique we could use).
2. Use global integer indices. If most of the gates are rare populations (like cytokine gates), this could be more efficient, even though each integer entry takes 32 times as much space as one bit.
3. Use local indices instead of global ones (suggested by Raphael ages ago). These can be either bit or integer vectors.
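For option 1, a rough sketch of run-length encoding on a gate's membership flags could look like the following. The helper names `rle_encode` and `rle_decode` are hypothetical, and the flags are held in a plain Python list only to keep the example short. How much space this saves depends entirely on how long the runs of identical bits are.

```python
from itertools import groupby


def rle_encode(bits):
    """Encode a sequence of 0/1 flags as (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(bits)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the flat list of flags."""
    bits = []
    for value, length in runs:
        bits.extend([value] * length)
    return bits


# Toy gate: 16 events, 3 of them inside the gate, stored as one contiguous run.
flags = [0] * 6 + [1] * 3 + [0] * 7
runs = rle_encode(flags)
print(runs)
```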
The 1st and 3rd options would certainly ease the space problem, at the cost of some extra computation time (either sorting or converting local indices to global ones).
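The local-to-global conversion mentioned for option 3 can be sketched as below, assuming integer indices and assuming that a child gate stores positions relative to its parent's event list. The function name `local_to_global` and the toy data are invented for illustration.

```python
def local_to_global(parent_global, child_local):
    """Map child indices (positions within the parent's events) to global event indices."""
    return [parent_global[i] for i in child_local]


# Toy example: the parent gate holds 5 events out of the whole file.
parent_global = [2, 5, 7, 11, 13]  # global event indices of the parent gate
child_local = [0, 3, 4]            # child gate, indexed relative to the parent

child_global = local_to_global(parent_global, child_local)
print(child_global)
```

For a deeper gating tree the same lookup would be repeated once per ancestor, which is where the extra computation time comes from.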
A modified version of the 2nd option could be promising too: we could mix the two types of indices in one gating tree, so that each gate stores its indices as either a bit vector or an integer vector, whichever is smaller.
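A minimal sketch of that per-gate choice, assuming 32-bit integers and ignoring any container overhead, might look like this. The name `choose_index_type` is hypothetical. Under these assumptions the integer vector wins whenever a gate contains fewer than 1/32 of all events.

```python
def choose_index_type(n_events: int, n_in_gate: int, int_bits: int = 32) -> str:
    """Return 'int' or 'bit', whichever representation needs fewer bits for this gate."""
    bit_vector_bits = n_events              # one bit per event in the file
    int_vector_bits = n_in_gate * int_bits  # one integer per event inside the gate
    return "int" if int_vector_bits < bit_vector_bits else "bit"


# Assumed file of 250,000 events: break-even is at 250,000 / 32, about 7,812 events.
print(choose_index_type(250_000, 1_000))    # rare population
print(choose_index_type(250_000, 100_000))  # large population
```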