
SourceModel should be a trait #4

Closed
danieleades opened this issue Feb 25, 2022 · 7 comments

@danieleades
Contributor

It strikes me that SourceModel should be a trait.

The SourceModel provided is somewhat opinionated. Consumers of this crate may want a more bespoke model for certain cases.

The way to have both would be to create a trait to represent how a model should behave, and then implement that trait for the concrete SourceModel already in this crate.

@cgbur
Owner

cgbur commented Feb 25, 2022

I agree. Ideally I want to make another crate that does Huffman coding, and a third crate that is a parent of both and exposes the traits. I'm not sure of the architecture or how to go about that without more thought. I think they should use the same API and be interchangeable, to give users the choice between the accuracy/compression of arithmetic coding and the speed of Huffman, or whatever other methods people want to add. Something as simple as RLE should be able to use the same API.
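
To make that concrete, here's a rough sketch of what a shared API might look like. All of these names are hypothetical and not part of arcode today; it's just an illustration of different coders being interchangeable behind one trait:

// Hypothetical shared API that an arithmetic coder, a Huffman coder,
// or an RLE coder could each implement. Names are illustrative only.
use std::io;

trait Coder {
    /// Compress `input`, writing the encoded bytes to `output`.
    fn encode(&mut self, input: &[u8], output: &mut dyn io::Write) -> io::Result<()>;

    /// Decompress `input`, writing the decoded bytes to `output`.
    fn decode(&mut self, input: &[u8], output: &mut dyn io::Write) -> io::Result<()>;
}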

Honestly, I'm not sold on the API of this crate in general and would be open to reworking it. I think it's clunky right now.

@danieleades
Contributor Author

I'll have a think about whether I have anything I can contribute here. I'm just reading up on arithmetic coding for a specific project, so I'm fairly new to this.

In my case, I probably also need to be generic over 'symbols' in the sense that they could be characters, numbers, or enums.

@cgbur
Owner

cgbur commented Feb 25, 2022

Well, in the end they're all just bits, and it only matters how you interpret them. So that's another decision point: whether a user should deal directly with bits, or get the user-friendliness of types.

I think there are a lot of convenience crates and tools to be made in the compression space.

@danieleades
Contributor Author

danieleades commented Feb 25, 2022

I guess I was thinking of something along the lines of:

trait Model {
    /// The symbol type this model operates on.
    type T;

    /// The probability range for `symbol`, given the preceding `context`.
    /// (`Interval` is a placeholder for whatever range type the encoder needs.)
    fn range(&self, context: &[Self::T], symbol: Self::T) -> Interval;
}

impl Model for arcode::SourceModel {
    type T = u32;
    // etc.
}

No idea if that would actually work...

Well in the end they’re all just bits and it just matters how you interpret them

That's not true in the general case though, right? For example, if you were to encode a vector of enums, you have a few choices about how you might do that, but it's not just 'bits'.
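
To make the enum point concrete, here's how the hypothetical Model trait sketched above might look for an enum symbol, with a toy order-0 model. The Interval struct is just a stub so the example stands alone:

// Toy order-0 model over an enum, continuing the hypothetical sketch above.
// `Interval` is a stand-in for whatever range type the encoder actually needs.
struct Interval {
    low: u32,
    high: u32,
    denominator: u32,
}

trait Model {
    type T;
    fn range(&self, context: &[Self::T], symbol: Self::T) -> Interval;
}

#[derive(Clone, Copy)]
enum Weather {
    Sunny,
    Cloudy,
    Rainy,
}

struct WeatherModel;

impl Model for WeatherModel {
    type T = Weather;

    // Fixed probabilities out of a denominator of 10:
    // Sunny 50%, Cloudy 30%, Rainy 20%. A real model might
    // condition on `context` instead of ignoring it.
    fn range(&self, _context: &[Weather], symbol: Weather) -> Interval {
        let (low, high) = match symbol {
            Weather::Sunny => (0, 5),
            Weather::Cloudy => (5, 8),
            Weather::Rainy => (8, 10),
        };
        Interval { low, high, denominator: 10 }
    }
}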

@danieleades
Contributor Author

For some context, I'm planning to do lossy compression of floats, integers, and strings, similar to this spec: https://libdccl.org/codecs.html
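
The general trick for the float case is quantisation: clamp the value to a known range, scale it by a fixed precision, and encode the resulting integer symbol. Here's a rough sketch of that technique (not the DCCL spec itself; the function names are just illustrative):

// Rough sketch of bounded-float quantisation (not the DCCL spec itself).
// A float known to lie in [min, max] with `precision` decimal places maps
// to a small integer, which is what would actually be fed to the encoder.
fn quantize(value: f64, min: f64, max: f64, precision: u32) -> u64 {
    let scale = 10f64.powi(precision as i32);
    let clamped = value.clamp(min, max);
    ((clamped - min) * scale).round() as u64
}

fn dequantize(q: u64, min: f64, precision: u32) -> f64 {
    let scale = 10f64.powi(precision as i32);
    min + (q as f64) / scale
}

fn main() {
    // e.g. a temperature known to lie in [-20.0, 50.0], kept to 2 decimal places
    let q = quantize(21.374, -20.0, 50.0, 2);
    println!("quantised symbol: {q}");                         // 4137
    println!("recovered value: {}", dequantize(q, -20.0, 2));  // ~21.37
}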

@danieleades
Contributor Author

I drafted an example of what this might look like: https://github.com/danieleades/arithmetic-coding.

It's a little rough, but you get the idea. I'm still very new to arithmetic coding (and compression in general).

Interested to hear your thoughts

@danieleades
Contributor Author

My implementation of this using a Model trait is pretty mature now (actually, it kind of grew arms and legs). I'll close this issue, since I don't think there's any action here.

I think the best way forward for this crate, if it were to use a Model trait, would be to import the trait from my crate and implement it for the Fenwick-tree-based models in this crate. You'd be more than welcome to re-export the encoder/decoder too, if that's useful. They're pretty quick, because they don't use any floating-point arithmetic.

Thank you for the inspiration!
