Milestones to v1.0 #46

Closed
nandoconde opened this issue Mar 16, 2023 · 10 comments

Comments

@nandoconde

Hello everyone

I am interested in native Julia packages for bit manipulation, and this seems to be the best one (along with the native BitArray, which I suspect is less performant than this package).

However, it seems to have been stuck in the pre-v1.0 phase for some time. Could we set the milestones needed to push the package toward its first semver-stable release?

Thanks!
@Roger-luo @GiggleLiu

@Roger-luo
Member

Hi, this package was not intended to be a general-purpose bit manipulation package. Still, it has a lot of overlap with general bit manipulation, e.g., this package also supports dits. It'd be nice if more users tried it and reported back on the API design instead of rushing into 1.0. Otherwise, we'd leave it as it is, since it's a quite stable component of Yao.jl.

@nandoconde
Author

Hi,
I understand the current situation, and it does seem that a separate package would be desirable.

I played a bit with the package and liked a lot of its ideas. However, the differences in memory layout and mutability between using BigInt and <: Integer for the internal representation of the bitstrings may not be suitable for a general-purpose bitstring, which may benefit from using Vector{UInt8} or Vector{UInt64}, as in the standard library's BitArray.

What do you think? How could we get to a general-purpose, performant bit manipulation package?

@Roger-luo
Member

If you use Vector{UInt8}, the data will be allocated on the heap, and the type will no longer be a bits type. Losing this property means you might not want to use it inside a loop, e.g. as a loop index. And you won't want a heap-allocated vector just for 64 bits of data. This is why we are using a primitive type.

The internal data structure of BigInt is actually very similar to Vector{UInt8}, the use of it is more for convenience.
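The point about bits types can be checked directly in Julia. The following is a minimal sketch using two hypothetical wrapper types (`InlineBits` and `HeapBits` are made up for illustration and are not BitBasis.jl's definitions); it only shows how `isbitstype` distinguishes an inline payload from a heap-backed one.

```julia
# Hypothetical wrapper types for illustration; not BitBasis.jl's definitions.
struct InlineBits          # payload stored inline as one machine word
    buf::UInt64
end

struct HeapBits            # payload lives in a separately allocated array
    buf::Vector{UInt8}
end

inline_is_bits = isbitstype(InlineBits)   # true: plain data, no references
heap_is_bits = isbitstype(HeapBits)       # false: holds a reference to an array
inline_size = sizeof(InlineBits)          # 8 bytes, same as a UInt64
```

A type for which `isbitstype` returns `true` can be stored inline in arrays and passed around without a separate allocation, which is the property referred to above.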

@nandoconde
Author

Yeah, my fault.

I was actually speaking about MVector and SVector.

The only problem I see is the multiplicity of representations. A bitstring backed by an MVector{2, UInt8} and one backed by an MVector{1, UInt16} would need to compile different methods, even though both could represent the same length.
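To make the concern concrete, here is a small sketch. It assumes a hypothetical `ChunkedBits` type and uses plain `NTuple`s as a stand-in for static vectors so that it needs no packages; none of these names come from an existing library.

```julia
# Hypothetical type: a fixed number N of unsigned chunks of type T.
struct ChunkedBits{N,T<:Unsigned}
    chunks::NTuple{N,T}
end

# Total number of bits held by a value, derived from its type parameters.
nbits(::ChunkedBits{N,T}) where {N,T} = 8 * sizeof(T) * N

a = ChunkedBits{2,UInt8}((0x01, 0x02))    # two 8-bit chunks
b = ChunkedBits{1,UInt16}((0x0201,))      # one 16-bit chunk

same_length = nbits(a) == nbits(b) == 16  # both hold 16 bits
same_type = typeof(a) == typeof(b)        # false: distinct concrete types
```

Because `a` and `b` have different concrete types, Julia specializes methods such as `nbits` separately for each, even though both describe a 16-bit string.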

@Roger-luo
Member

MVector is still different from a primitive type, though. It is a stack-allocated vector, but it is not lowered into exactly the same LLVM data structure as a primitive type. If you are going to use MVector, just using UInt8 is fine; that's the standard way of representing a byte stream. This has a size limit too, however. Or you could just use Ptr{UInt8} for the bits and call the libc allocator, like BigInt does.

@nandoconde
Author

I'd like to avoid messing with Ptr{UInt8} since, as far as I can gather, it'd be just like using BigInt and that would be way easier.

I have been reading (quickly) about LLVM's primitive types, and they include integers of arbitrary length. I guess Julia does not support them out of the box because they have no direct correspondence to native types on most architectures, is that right?

I have also peeked at the internals of your package and gathered how things are done there. If you'd rather keep this package separate, just for the bit basis in Yao, I think I can create a separate package specifically for bitstring manipulation. It'd include the possibility of having both long and short (< 128-bit) bitstrings, just like yours. For the long bitstrings I can use small SVectors of UInt64 and avoid mutability altogether.

I could go directly with BigInt, just like your package, but I'd like to avoid heap allocations. Am I right in guessing that this would improve performance and facilitate support in StaticCompiler.jl?

Thanks for your help!

@Roger-luo
Member

Please go ahead and create a new package. This package kinda handles the bit configuration and the basis at the same time, so it will likely be deprecated in a new rewrite we are working on with explicit basis support. It'd be nice to rethink bit configurations separately.

I think Julia lowers its own primitive type to LLVM primitive but with limitation on a maximum size.
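For reference, a user-defined primitive type can be declared in plain Julia. This is only a sketch with a hypothetical name (`RawBits64`); it does not reproduce how BitBasis.jl declares its types, and it relies on Julia requiring primitive type sizes to be a multiple of 8 bits.

```julia
# Hypothetical 64-bit primitive type; the size is given in bits.
primitive type RawBits64 64 end

# Conversions to and from the matching unsigned integer are plain bit casts.
RawBits64(x::UInt64) = reinterpret(RawBits64, x)
Base.UInt64(x::RawBits64) = reinterpret(UInt64, x)

r = RawBits64(0x00000000000000a5)   # wrap a 64-bit pattern
roundtrip = UInt64(r)               # recover the same pattern
raw_is_bits = isbitstype(RawBits64) # primitive types are bits types
```

Any arithmetic or bitwise operation on such a type still has to be defined by hand, typically by converting to the integer type, operating, and converting back.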

Yeah, I think SVector is good for long bits, but remember there's also a size limitation on SVector, so it cannot be arbitrarily large. For arbitrarily large bits you will still want Vector{UInt8} or BigInt.

Unless you have a very specific use case that must compile binaries in Julia, I wouldn't expect too much from StaticCompiler at this stage; it is highly experimental and will need a lot more work to reach a more usable state.

What's wrong with heap allocation tho? If you have a large bit stream there's no other way but heap allocate.

On the other hand, the best way of supporting such type is just having different types under the same abstract type since their behavior could be quite different.
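One way to read the suggestion of several concrete types under one abstract type is sketched below. All names (`AbstractBits`, `ShortBits`, `LongBits`, `nbits`, `readbit`, `popcount`) are hypothetical and chosen for illustration; the short variant is fixed at 64 bits to keep the example small.

```julia
abstract type AbstractBits end

struct ShortBits <: AbstractBits      # exactly 64 bits, stored inline
    buf::UInt64
end

struct LongBits <: AbstractBits       # any multiple of 64 bits, heap-allocated
    chunks::Vector{UInt64}
end

nbits(::ShortBits) = 64
nbits(b::LongBits) = 64 * length(b.chunks)

# Read bit i (1-based, least significant bit first).
readbit(b::ShortBits, i::Integer) = Int((b.buf >> (i - 1)) & 1)
function readbit(b::LongBits, i::Integer)
    chunk, offset = divrem(i - 1, 64)
    return Int((b.chunks[chunk + 1] >> offset) & 1)
end

# Generic code is written once against the abstract type.
popcount(b::AbstractBits) = sum(readbit(b, i) for i in 1:nbits(b))

s = ShortBits(0b1011)
l = LongBits([UInt64(1), UInt64(3)])
```

Each concrete type keeps its own storage and its own fast paths, while functions such as `popcount` only depend on the shared `nbits` and `readbit` methods.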

@nandoconde
Author

nandoconde commented Mar 21, 2023

Regarding heap allocation, besides the StaticCompiler caveat, I have nothing against it. However, the native bitarray type already uses a heap allocation of Vector{UInt64}, so I am a bit wary of the potential use case of yet-another-package for bit manipulation besides bitarray, which is already provided in Julia's stdlib.
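The storage behind `BitArray` can be inspected from the REPL. The sketch below reads the `chunks` field, which is an internal implementation detail of Base and not a public API, so it is shown only to illustrate the layout.

```julia
bv = falses(100)                 # BitVector with 100 bits, all zero
bv[1] = true
bv[65] = true

chunk_type = typeof(bv.chunks)   # Vector{UInt64}: the heap-allocated backing store
n_chunks = length(bv.chunks)     # 2: 100 bits need two 64-bit words
```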

On the one hand, Julia's bitarray does not have poor performance and it is used in several well-known packages and stdlib methods, so having another package that does exactly the same would be dividing efforts.

On the other hand, providing a new package with specialized functions and methods for specific bitstring manipulation could enhance the ecosystem, providing more compact representation of states, or be used for protocol transmission and saving/loading of data.

@Roger-luo
Member

Roger-luo commented Mar 22, 2023

> On the one hand, Julia's bitarray does not have poor performance and it is used in several well-known packages and stdlib methods, so having another package that does exactly the same would be dividing efforts.

But it lacks manipulation tools and kinda follows the Array interface, which does not make sense when you think of bit strings: the mental model is closer to a string plus some extra features than to an array. We kinda end up converting to BitArray using the bitarray function inside this package, which is not ideal.

But anyway, I think it doesn't hurt to start with stack-allocated ones. We currently don't have the bandwidth on expanding more features for generic usage, but please feel free to ping me if you have more questions. Ideally, bit string, dit string, etc. deserve a common interface (trait) instead of using an array interface.
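As a rough illustration of the string-like mental model on a stack-allocated type, the sketch below defines a hypothetical `BitStr{N}` (limited to 64 bits) with text rendering and string-style concatenation via `*`. The type and the `bits_text` helper are assumptions made for this example, not an existing API.

```julia
# Hypothetical N-bit string stored inline in a UInt64 (assumes 1 <= N <= 64).
struct BitStr{N}
    buf::UInt64
    function BitStr{N}(x::Integer) where {N}
        0 < N <= 64 || throw(ArgumentError("N must be in 1:64"))
        # Keep only the lowest N bits so stray high bits never leak into output.
        mask = N == 64 ? typemax(UInt64) : (UInt64(1) << N) - UInt64(1)
        return new{N}(UInt64(x) & mask)
    end
end

Base.length(::BitStr{N}) where {N} = N

# Render as text, most significant bit first, padded to N characters.
bits_text(b::BitStr{N}) where {N} = string(b.buf, base = 2, pad = N)

# Concatenate like strings: the left operand becomes the high bits.
Base.:*(a::BitStr{M}, b::BitStr{N}) where {M,N} = BitStr{M + N}((a.buf << N) | b.buf)

a = BitStr{3}(0b101)
b = BitStr{2}(0b01)
ab = a * b
```

Here `bits_text(ab)` gives `"10101"`, mirroring how `"101" * "01"` behaves for ordinary Julia strings; a dit string could expose the same operations behind a shared interface.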

@nandoconde
Author

Thanks! I'll keep you in the loop.

2 participants