Releases: sigoden/aichat
v0.19.0-rc1
Support RAG
Seamlessly integrates document interactions into your chat experience.
Support AI Agent
Agent = Prompt (Role) + Tools (Function Calling) + Knowledge (RAG). It's similar to OpenAI's GPTs.
New Models
- claude:claude-3-5-sonnet-20240620
- vertexai:gemini-1.5-pro-001
- vertexai:gemini-1.5-flash-001
- vertexai-claude:claude-3-5-sonnet@20240620
- bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0
- zhipuai:glm-4-0520
- lingyiwanwu:yi-large*
- lingyiwanwu:yi-medium*
- lingyiwanwu:yi-spark
New Configuration
repl_prelude: null # Overrides the `prelude` setting specifically for conversations started in REPL
agent_prelude: null # Set a session to use when starting an agent (e.g. temp, default)
# Regex for selecting dangerous functions
# User confirmation is required when executing these functions
# e.g. 'execute_command|execute_js_code' 'execute_.*'
dangerously_functions_filter: null
# Per-Agent configuration
agents:
- name: todo-sh
model: null
temperature: null
top_p: null
dangerously_functions_filter: null
# ---- RAG ----
rag_embedding_model: null # Specifies the embedding model to use
rag_rerank_model: null # Specifies the rerank model to use
rag_top_k: 4 # Specifies the number of documents to retrieve
rag_chunk_size: null # Specifies the chunk size
rag_chunk_overlap: null # Specifies the chunk overlap
rag_min_score_vector_search: 0 # Specifies the minimum relevance score for vector-based searching
rag_min_score_keyword_search: 0 # Specifies the minimum relevance score for keyword-based searching
rag_min_score_rerank: 0 # Specifies the minimum relevance score for reranking
rag_template: ...
clients:
- name: localai
models:
- name: xxxx # Embedding model
type: embedding
max_input_tokens: 2048
default_chunk_size: 2000
max_batch_size: 100
- name: xxxx # Rerank model
type: rerank
max_input_tokens: 2048
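The `rag_chunk_size` and `rag_chunk_overlap` settings above control how documents are split before embedding. A minimal sketch of overlapping chunking, for illustration only (aichat's actual splitter may split on sentence or token boundaries rather than raw characters):

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    where consecutive chunks share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    # Each slice starts `step` characters after the previous one,
    # so adjacent chunks overlap by chunk_overlap characters.
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed and search.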
New REPL Commands
.edit session Edit the current session with an editor
.rag            Initialize or use a RAG
.info rag       View RAG info
.exit rag       Leave the RAG
.agent          Use an agent
.info agent View agent info
.starter Use the conversation starter
.exit agent Leave the agent
.continue Continue the response
.regenerate Regenerate the last response
New CLI Options
-a, --agent <AGENT> Start an agent
-R, --rag <RAG> Start a RAG
--list-agents List all agents
--list-rags List all RAGs
Breaking Changes
Some client fields have changed
clients:
- name: myclient
patches:
<regex>:
- request_body:
+ chat_completions_body:
models:
- name: mymodel
max_output_tokens: 4096
- pass_max_tokens: true
+ require_max_tokens: true
The way to identify dangerous functions has changed
Previously, any function whose name started with may_ was treated as an execute type (dangerous). That approach required modifying function names, which was inflexible.
Now it is configurable. In config.yaml, you can define which functions are considered dangerous and require user confirmation.
dangerously_functions_filter: 'may_execute_.*'
New Features
- support RAG (#560)
- customize more file/dir paths with environment variables (#565)
- support agent (#579)
- add config `dangerously_functions` (#582)
- add config `repl_prelude` and `agent_prelude` (#584)
- add `.starter` repl command (#594)
- add `.edit session` repl command (#606)
- abandon `auto_copy` (#607)
- add `.continue` repl command (#608)
- add `.regenerate` repl command (#610)
- support lingyiwanwu client (#613)
- qianwen support function calling (#616)
- support rerank (#620)
- cloudflare support embeddings (#623)
- serve embeddings api (#624)
- ernie support embeddings and rerank (#630)
- ernie support function calling (#631)
v0.18.0
Breaking Changes
Add custom request parameters via `patches`, instead of `extra_fields`
We used to add request parameters to models using `extra_fields`, but this approach lacked flexibility. We've now switched to a `patches` mechanism, which allows customizing request parameters for one or more models.
The following example enables web search functionality for all Cohere models:
- type: cohere
patches:
".*":
request_body:
connectors:
- id: web-search
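Conceptually, each patch key is a regex matched against the model name, and the patch value is merged into the request body. A minimal sketch of that idea (a shallow merge for illustration; aichat's Rust implementation and its exact matching/merging rules are not shown here):

```python
import re

def apply_patches(model_name: str, body: dict, patches: dict[str, dict]) -> dict:
    """Merge patch values into the request body for every patch
    whose regex key matches the model name (shallow merge, illustrative)."""
    merged = dict(body)
    for pattern, extra in patches.items():
        if re.fullmatch(pattern, model_name):
            merged.update(extra)
    return merged
```

With the `".*"` pattern from the example above, every Cohere model's request body would gain the `connectors` field.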
Remove all tokenizers
Different platforms may use different tokenizers for their models, even across versions of the same model. For example, gpt-4o uses o200k_base while gpt-4-turbo and gpt-3.5-turbo use cl100k_base.
AIChat supports over 100 models; it's impossible to support every tokenizer, so we're removing them entirely and switching to an estimation algorithm.
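A common estimation heuristic assumes roughly four characters of English text per token. A sketch of that idea (the exact algorithm aichat adopted may differ):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text.
    This is a heuristic, not a real tokenizer, so counts are approximate."""
    if not text:
        return 0
    return max(1, len(text) // 4)
```

Such estimates are good enough for budgeting context windows, which is all a chat client needs once exact per-model tokenizers are dropped.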
Function Calling
Function calling supercharges LLMs by connecting them to external tools and data sources. This unlocks a world of possibilities, enabling LLMs to go beyond their core capabilities and tackle a wider range of tasks.
We have created a new repository to help you make the most of this feature: https://github.com/sigoden/llm-functions
New Models
- gemini:gemini-1.5-flash-latest
- gemini-1.5-flash-preview-0514
- qianwen:qwen-long
Features
- Allow binding model to the role (#505)
- Remove tiktoken (#506)
- Support function calling (#514)
- Webui add toolbox(copy-bt/regenerate-btn) to message (#521)
- Webui operates independently from aichat (#527)
- Allow patching req body with client config (#534)
Bug Fixes
- No builtin roles when roles.yaml is missing (#509)
- Unexpectedly entering the REPL when there is pipe-in but no text args (#512)
- Panic when checking API errors (#520)
- Webui issue with images (#523)
- Webui message body sometimes does not autoscroll to the bottom (#525)
- JSON stream parser and refine client modules (#538)
- Bedrock issues (#544)
New Contributors
- @rolfwilms made their first contribution in #544
- @ProjectMoon made their first contribution in #549
Full Changelog: v0.17.0...v0.18.0
v0.17.0
Breaking Changes
- always use stream unless `--no-stream` is set explicitly (#415)
- vertexai config changed: replace `api_base` with `project_id`/`location`
Self-Hosted Server
AIChat comes with a built-in lightweight web server:
- Provide access to all LLMs using OpenAI format API
- Host LLM playground/arena web applications
$ aichat --serve
Chat Completions API: http://127.0.0.1:8000/v1/chat/completions
LLM Playground: http://127.0.0.1:8000/playground
LLM ARENA: http://127.0.0.1:8000/arena
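Since the server exposes the OpenAI chat-completions format, any OpenAI-compatible client can talk to it. A sketch of building such a request with only the standard library (the model name here is just an example):

```python
import json
import urllib.request

def chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-format chat completions request for aichat's server."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )

# Send with urllib.request.urlopen(req) while `aichat --serve` is running.
req = chat_request("http://127.0.0.1:8000", "openai:gpt-4o-mini", "Hello!")
```

The same request body works against any of the configured clients, since the server translates it to each provider's native API.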
New Clients
bedrock, vertex-claude, cloudflare, groq, perplexity, replicate, deepseek, zhipuai, anyscale, deepinfra, fireworks, openrouter, octoai, together
New REPL Commands
.prompt Create a temporary role using a prompt
.set max_output_tokens
> .prompt you are a js console
%%> Date.now()
1658333431437
.set max_output_tokens 4096
New CLI Options
--serve [<ADDRESS>] Serve the LLM API and WebAPP
--prompt <PROMPT> Use the system prompt
New Configuration Fields
# Set default top-p parameter
top_p: null
# Command that will be used to edit the current line buffer with ctrl+o
# if unset fallback to $EDITOR and $VISUAL
buffer_editor: null
New Features
- add completion scripts (#411)
- shell commands support revision
- add `.prompt` repl command (#420)
- customize model's max_output_tokens (#428)
- builtin models can be overwritten by models config (#429)
- serve all LLMs as OpenAI-compatible API (#431)
- support customizing `top_p` parameter (#434)
- run without config file by setting `AICHAT_CLIENT` (#452)
- add `--prompt` option (#454)
- non-streaming returns tokens usage (#458)
- `.model` repl completions show max tokens and price (#462)
v0.16.0
New Models
- openai:gpt-4-turbo
- gemini:gemini-1.0-pro-latest (replace gemini:gemini-pro)
- gemini:gemini-1.0-pro-vision-latest (replace gemini:gemini-pro-vision)
- gemini:gemini-1.5-pro-latest
- vertexai:gemini-1.5-pro-preview-0409
- cohere:command-r
- cohere:command-r-plus
New Config
ctrlc_exit: false # Whether to exit REPL when Ctrl+C is pressed
New Features
- use ctrl+enter to newline in REPL (#394)
- support cohere (#397)
- -f/--file take one value and do not enter REPL (#399)
Full Changelog: v0.15.0...v0.16.0
v0.15.0
Breaking Changes
Rename client localai to openai-compatible (#373)
clients:
-- type: localai
++ type: openai-compatible
++ name: localai
Gemini/VertexAI clients add `block_threshold` configuration (#375)
block_threshold: BLOCK_ONLY_HIGH # Optional field
New Models
- claude:claude-3-haiku-20240307
- ernie:ernie-4.0-8k
- ernie:ernie-3.5-8k
- ernie:ernie-3.5-4k
- ernie:ernie-speed-8k
- ernie:ernie-speed-128k
- ernie:ernie-lite-8k
- ernie:ernie-tiny-8k
- moonshot:moonshot-v1-8k
- moonshot:moonshot-v1-32k
- moonshot:moonshot-v1-128k
New Config
save_session: null # Whether to save the session; if null, you will be asked
CLI Changes
New REPL Commands
.save session [name]
.set save_session <null|true|false>
.role <name> <text...> # Works in session
New CLI Options
--save-session Whether to save the session
Bug Fixes
- erratic behaviour when using a temp role in a session (#347)
- colors on non-truecolor terminals (#363)
- session not marked dirty when updating properties (#379)
- incorrectly rendering text containing tabs (#384)
Full Changelog: v0.14.0...v0.15.0
v0.14.0
Breaking Changes
Compress sessions automatically (#333)
When the total number of tokens in the session messages exceeds `compress_threshold`, aichat will automatically compress the session. This means you can chat forever in the session.
The default `compress_threshold` is 2000; set this value to zero to disable automatic compression.
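The trigger condition can be sketched as follows (illustrative only; the token counting and the summarization step that actually compresses the messages are not shown):

```python
def should_compress(message_tokens: int, compress_threshold: int) -> bool:
    """Decide whether session messages should be compressed.
    A threshold of 0 disables automatic compression entirely."""
    return compress_threshold > 0 and message_tokens > compress_threshold
```

When the check fires, the session history is summarized so the conversation can continue within the model's context window.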
Rename `max_tokens` to `max_input_tokens` (#339)
This avoids misunderstanding: `max_input_tokens` is also referred to as the `context_window`.
models:
- name: mistral
-- max_tokens: 8192
++ max_input_tokens: 8192
New Models
claude
- claude:claude-3-opus-20240229
- claude:claude-3-sonnet-20240229
- claude:claude-2.1
- claude:claude-2.0
- claude:claude-instant-1.2

mistral
- mistral:mistral-small-latest
- mistral:mistral-medium-latest
- mistral:mistral-large-latest
- mistral:open-mistral-7b
- mistral:open-mixtral-8x7b

ernie
- ernie:ernie-3.5-4k-0205
- ernie:ernie-3.5-8k-0205
- ernie:ernie-speed
Command Changes
`-c/--code` generates code only (#327)
Chat-REPL Changes
`.clear messages` clears session messages (#332)
Miscellaneous
Full Changelog: v0.13.0...v0.14.0
v0.13.0
What's Changed
- fix: copy on linux wayland by @sigoden in #288
- fix: deprecation warning of .read command by @Nicoretti in #296
- feat: supports model capabilities by @sigoden in #297
- feat: add openai.api_base config by @sigoden in #302
- feat: add `extra_fields` to models of localai/ollama clients by @kelvie in #298
- fix: do not attempt to deserialize zero byte chunks in ollama stream by @JosephGoulden in #303
- feat: update openai/qianwen/gemini models by @sigoden in #306
- feat: support vertexai by @sigoden in #308
- refactor: update vertexai/gemini/ernie clients by @sigoden in #309
- feat: edit current prompt on $VISUAL/$EDITOR by @sigoden in #314
- refactor: change header of messages saved to markdown by @sigoden in #317
- feat: support `-e/--execute` to execute shell commands by @sigoden in #318
- refactor: improve prompt error handling by @sigoden in #319
- refactor: improve saving messages by @sigoden in #322
New Contributors
- @Nicoretti made their first contribution in #296
- @kelvie made their first contribution in #298
- @JosephGoulden made their first contribution in #303
Full Changelog: v0.12.0...v0.13.0
v0.12.0
What's Changed
- feat: change REPL indicators #263
- fix: pipe failed on macos #264
- fix: cannot read image with uppercase ext #270
- feat: support gemini #273
- feat: abandon PaLM2 #274
- feat: support qianwen:qwen-vl-plus #275
- feat: support ollama #276
- feat: qianwen vision models support embedded images #277
- refactor: remove path existence indicator from info #282
- feat: custom REPL prompt #283
Full Changelog: v0.11.0...v0.12.0
v0.11.0
What's Changed
- refactor: improve render #235
- feat: add a spinner to indicate waiting for response #236
- refactor: qianwen client use incremental_output #240
- fix: the last reply tokens was not highlighted #243
- refactor: ernie client system message #244
- refactor: palm client system message #245
- refactor: trim trailing spaces from the role prompt #246
- feat: support vision #249
- feat: state-aware completer #251
- feat: add ernie:ernie-bot-8k qianwen:qwen-max #252
- refactor: sort of some complete type #253
- feat: allow shift-tab to select prev in completion menu #254
Full Changelog: v0.10.0...v0.11.0
v0.10.0
New features
Use ::: for multi-line editing, deprecate .edit
〉::: This
is
a
multi-line
message
:::
Temporarily use a role to send a message.
coder〉.role shell how to unzip a file
unzip file.zip
coder〉
As shown above, while in the coder role you temporarily switched to the shell role to send a message. After sending, the current role is still coder.
Set default role/session with config.prelude
For those who want aichat to enter a session after startup, you can set it as follows:
prelude: session:mysession
For those who want aichat to use a role after startup, you can set it as follows:
prelude: role:myrole
Use a model not listed by `--list-models`
If OpenAI releases a new model in the future, it can be used without upgrading Aichat.
$ aichat --model openai:gpt-4-vision-preview
〉.model openai:gpt-4-vision-preview
Changelog
- refactor: improve error message for PaLM client by @sigoden in #213
- refactor: rename Model.llm_name to name by @sigoden in #216
- refactor: use &GlobalConfig to avoid clone by @sigoden in #217
- refactor: remove Model.client_index, match client by name by @sigoden in #218
- feat: allow the use of an unlisted model by @sigoden in #219
- fix: unable to build on android using termux by @sigoden in #222
- feat: add `config.prelude` to allow setting default role/session by @sigoden in #224
- feat: deprecate `.edit`, use """ instead by @sigoden in #225
- refactor: improve repl completer by @sigoden in #226
- feat: temporarily use a role to send a message by @sigoden in #227
- refactor: output info contains auto_copy and light_theme by @sigoden in #230
- fix: unexpected additional newline in REPL by @sigoden in #231
- refactor: use ::: as multiline input indicator, deprecate """ by @sigoden in #232
- feat: add openai:gpt-4-1106-preview by @sigoden in #233
Full Changelog: v0.9.0...v0.10.0