-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add data structures for variably-sized fixed columns #1542
base: main
Are you sure you want to change the base?
Conversation
77c7792
to
d9061d2
Compare
serde_cbor::from_reader(file).unwrap() | ||
impl<T: DeserializeOwned + Serialize> ReadWrite for T { | ||
fn read(file: &mut impl Read) -> Self { | ||
serde_cbor::from_reader(file).unwrap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated to this PR but we should probably use BufRead
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean just changing to file: &mut impl BufRead
? What would be the difference? It is just a trait that adds more functions, which apparently serde_cbor::from_reader()
does not use?
number/src/lib.rs
Outdated
|
||
#[derive(Serialize, Deserialize)] | ||
/// Like Columns, but each column can exist in multiple sizes | ||
pub struct VariablySizedColumns<F> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should be inside number
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's much more than just "number". Either we create a new crate or rename this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not put it inside the executor? Isn't it one of its main outputs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, changed it!
number/src/lib.rs
Outdated
@@ -31,6 +33,60 @@ pub fn log2_exact(n: BigUint) -> Option<usize> { | |||
.filter(|zeros| n == (BigUint::from(1u32) << zeros)) | |||
} | |||
|
|||
pub type Columns<F> = Vec<(String, Vec<F>)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this type name over-simplifies (i.e. there should be no typedef ;) ). I would like to know that this is a vector and that it contains the name as a string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
number/src/lib.rs
Outdated
/// Like Columns, but each column can exist in multiple sizes | ||
pub struct VariablySizedColumns<F> { | ||
/// Maps each column name to a (size -> values) map | ||
columns: Vec<(String, BTreeMap<usize, Vec<F>>)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we gain much hiding variably sized columns in a collection of them.
Wouldn't it be better to use a type for ONE column that has a variable size? Then there could be free functions that reduce a collection of those to a unique size.
The len()
function is very misleading for example. Does it return the number of different sizes? If this is just a Vec<String, VariableLengthColumn<T>>
it's much clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Should we really store multiple copies of the values? Wouldn't we want to return a slice instead (plus manage metadata about which slice lengths are "legal")? |
The problem with that is that we sometimes have columns that are different in the last row, for example here. |
0c43fdf
to
1de8252
Compare
number/src/lib.rs
Outdated
} | ||
} | ||
|
||
pub fn get_only_size<F>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe get_uniquely_sized
? With this name, it could mean that it only returns the size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
31c1e54
to
a2a5968
Compare
a2a5968
to
b90a386
Compare
This PR adds
number::VariablySizedColumns
, which can store several sizes of the same column. Currently, we always just have one size, but as part of #1496, we can relax that.