
1.0.3: The Vision Update


@SearchSavior SearchSavior released this 17 Apr 02:25
· 242 commits to main since this release
1e92df9

OpenArc 1.0.3: The Vision Update

New features

  • Vision support (!)
    • OpenArc takes a dynamic approach to image processing
    • Received messages are checked for base64 payloads and passed to the appropriate tokenization method, enabling both text-to-text and image-to-text in the same chat/input
    • There are no normalization steps for images. We don't shrink to 100 dpi, apply a zoom, or anything like that; bring your own preprocessing logic.
    • stream=false is not supported yet
  • Load multiple models at once on different devices with the "Model Manager" tab; unload models from there as well.
  • Added model metadata. Loaded models now store data about how they were loaded; we use this throughout inference and to track models in memory across devices. You can now
    • Load both vision and text models into memory
    • Be careful though: we don't have any safety measures in place yet. Overcommitting memory usually causes a stalled load or a memory error
    • For those with multiple GPUs, you could run multiple models at once, one per device
  • Added model_type field. When loading a model you now specify either TEXT or VISION; this routes requests to the appropriate class and will be extended to other architectures/tasks in the future
  • Updated the model conversion tool to the latest version, which adds a ton of experimental datatypes/quant types.
  • Dashboard has been refactored and is less of a mess
  • And many more changes to the codebase that communicate project direction.
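The base64 routing described above can be sketched client-side. This is a minimal, illustrative helper (not OpenArc's actual internals) that scans OpenAI-style chat messages and splits them into text parts and decoded base64 image payloads, the same shape of check that lets text-to-text and image-to-text share one chat endpoint:

```python
import base64

def classify_content(messages):
    """Split chat message content into text parts and base64 image payloads.

    A sketch of the routing OpenArc describes: incoming messages are
    scanned for base64 image data so the server can pick the right
    tokenization path. The message shape follows the OpenAI chat format;
    the function name and structure here are illustrative only.
    """
    texts, images = [], []
    for message in messages:
        content = message.get("content")
        if isinstance(content, str):
            # Plain string content: pure text-to-text
            texts.append(content)
            continue
        for part in content or []:
            if part.get("type") == "image_url":
                url = part["image_url"]["url"]
                # A data URL carries the base64 payload after the comma
                if url.startswith("data:image"):
                    payload = url.split(",", 1)[1]
                    images.append(base64.b64decode(payload, validate=True))
            elif part.get("type") == "text":
                texts.append(part["text"])
    return texts, images
```

Note that, per the release notes, no resizing or other normalization happens after this point; whatever bytes you encode are what the model sees.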
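To illustrate the new model_type field and per-device loading, here is a hedged sketch of what a load request might look like. The field names other than model_type, and the request shape itself, are assumptions for illustration, not OpenArc's documented API; check the project docs for the real schema:

```python
import json

def build_load_request(model_path, device, model_type):
    """Build an illustrative load-request payload.

    model_type routes the request to the appropriate class: TEXT for
    text generation, VISION for image-to-text. The other field names
    are hypothetical placeholders, not OpenArc's actual schema.
    """
    if model_type not in ("TEXT", "VISION"):
        raise ValueError("model_type must be TEXT or VISION")
    return json.dumps({
        "model_path": model_path,
        "device": device,        # e.g. "CPU", "GPU.0", "GPU.1"
        "model_type": model_type,
    })
```

Issuing one such request per device is how you would load several models at once, e.g. a TEXT model on "GPU.0" and a VISION model on "GPU.1"; remember there are no safety measures yet if a device runs out of memory.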

Issues

  • Right now gemma3 has specific requirements for inference. We are working out the right set of parameters to load with, and this needs better documentation
  • Inference doesn't usually fail gracefully, i.e., threading needs better handling so the API doesn't become inaccessible or crash when a thread fails for whatever reason
  • Concurrent requests to multiple loaded models are not yet implemented, and there is no request queuing yet
  • There are probably other issues, so report what you encounter on the Discord or on GitHub