-
Notifications
You must be signed in to change notification settings - Fork 362
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to be awesome to finally get this across the line 🥳
crates/ggml/src/format/loader.rs
Outdated
| ContainerType::Ggmf(1) | ||
| ContainerType::Ggjt(1 | 2) | ||
| ContainerType::Ggmf(1 | 100) | ||
| ContainerType::Ggjt(1 | 2 | 100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use an enum here that maps these values to something more meaningful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a change I made, but I realise that with the growing number of versions this is only more confusing. I'll reintroduce constants soon
Updated to latest main |
|
||
graph: RefCell<ggml::ComputationGraph>, | ||
token_index: ggml::Tensor, | ||
state: RefCell<ggml::Tensor>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be shared across inference sessions?
crates/models/rwkv/src/lib.rs
Outdated
self.state | ||
.borrow_mut() | ||
.write_data(std::slice::from_raw_parts( | ||
session.state.data() as *mut u8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never mind i see this now.
After #271 is in (in whatever form), I'll bring this back up to date |
This has diverged rather significantly from mainline, but not hopefully not unworkably so. Maybe it's simpler to cherry-pick the actual model definition in? Either way, @RedBoxing, what was the state of this when you last tried it? iirc the evaluation still had some issues; I'm thinking we should bring this up to date and merge it in, but as an optional disabled-by-default implementation like BLOOM was. That should help us keep it up to date while we figure out what's wrong. |
OK, I had a go at updating this, but the dual-context design is hard to adapt to our current session abstraction. Given that rwkv.cpp has moved on beyond this point and detecting RWKV models will get easier with GGUF, I'm going to close this for now and we can use it as a reference when we reimplement it again. I think we should wait some time, though. |
Based on @danforbes's work, I added the evaluation code and replaced the tokenizer with huggingface's tokenizer.
Close #75 and #35