🌁 Bashi

Note: Bashi is still very early stage and under active development. If you are looking for a polished product you will be disappointed, but if you are looking to spend some time building your own personal assistant AI then you're in the right place!

Bashi is an extensible platform that bridges LLMs to tasks and actions.

It comes with an OSX personal assistant app, so you can try it out quickly, and mould the app to your own needs.

This repo has two components:

  • The Bashi API server.
  • The OSX app 'assist', which serves as an example client implementation as well as a usable product.

Examples of the assist OSX app in action:

LLM+REPL

Bashi takes a novel approach: the LLM is asked to write JavaScript, and the server effectively provides a REPL for the model. Below are some example prompt+completions. All of these are real examples that you can try in the OSX app 👍

Request: help me write a commit message please
Thought: I need to generate a commit message based on a diff
Action: returnText(writeCommitMessage(getInputText("diff")));

Request: there is a function I don't understand, can you help me summarize it?
Thought: I need to extract the information from the given string
Action: returnText(extractInformation("summarize the function", getInputText("what is the function?")))
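
The "REPL for the model" idea above can be sketched as follows. This is a hypothetical illustration, not the actual server code: the real server parses a restricted expression language rather than evaluating arbitrary JavaScript, and the command names here simply mirror the examples above.

```typescript
// Hypothetical sketch: the server exposes a set of commands and evaluates
// the model-emitted action string against them, so each nested call's
// result feeds the enclosing call.

type CommandFn = (...args: unknown[]) => unknown;

// Toy registry standing in for the real command set.
const commands: Record<string, CommandFn> = {
  // In the real app this would ask the client to prompt the user.
  getInputText: (prompt) => `<input for: ${String(prompt)}>`,
  // In the real app this would call back into the LLM.
  extractInformation: (instruction, text) => `summary of ${String(text)}`,
  returnText: (text) => text,
};

// Evaluate one action string. Using new Function here is only for
// brevity; a production server would parse and interpret the
// expression instead of executing it directly.
function runAction(action: string): unknown {
  const fn = new Function(...Object.keys(commands), `return ${action};`);
  return fn(...Object.values(commands));
}

console.log(
  runAction(
    `returnText(extractInformation("summarize the function", getInputText("what is the function?")))`
  )
);
```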

There seem to be some advantages to this approach:

  • Reduced completion sizes: code is a compact representation of information, and packing multiple steps into a single action saves on model round trips.
  • GPT-3.5 was probably trained on lots of JavaScript, so asking it to write JavaScript may encourage emergent reasoning/logic behaviour.

Extensible

Client-defined commands

Clients are able to extend the capabilities of the agent by providing their own commands/functions. For example, the OSX client provides the server with information about the createCalendarEvent command:

AnonymousCommand(
    name: "createCalendarEvent",
    cost: .Low,
    description: "make calendar event for the given name, datetime and duration",
    args: [
            .init(type: .string, name: "name"),
            .init(type: .string, name: "iso8601Date"),
            .init(type: .number, name: "event duration in hours")
    ],
    returnType: .void,
    triggerTokens: ["calendar", "event", "appointment", "meeting"],
    runFn: { (api, ctx, args) async throws -> BashiValue in
        // ... redacted guard code
        let event = EKEvent.init(eventStore: self.eventStore)
        event.startDate = date
        event.title = name
        event.endDate = date.addingTimeInterval(60 * 60 * hours.doubleValue)
        event.calendar = defaultCalendar
        try self.eventStore.save(event, span: .thisEvent, commit: true)
        await api.indicateCommandResult(message: "Calendar event created")
        return .init(.void)
    }),

Commands can be resolved on either the client or the server. The example above is a command resolved on the client. In contrast, there are commands like math that are resolved on the server.
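
For contrast, a server-resolved command could look something like the sketch below. This mirrors the shape of the Swift definition above but is purely illustrative; the actual server command interface in server/ may differ.

```typescript
// Hypothetical shape of a server-resolved command. A command like math
// needs no client round trip because evaluating it requires no device
// access, so the server can run it directly.

interface ServerCommand {
  name: string;
  description: string;
  args: { type: "string" | "number"; name: string }[];
  returnType: "string" | "number" | "void";
  run: (args: unknown[]) => Promise<unknown>;
}

const mathCommand: ServerCommand = {
  name: "math",
  description: "evaluate a simple arithmetic expression",
  args: [{ type: "string", name: "expression" }],
  returnType: "number",
  run: async ([expression]) => {
    const expr = String(expression);
    // Only allow arithmetic characters before evaluating.
    if (!/^[\d+\-*/(). ]+$/.test(expr)) throw new Error("unsupported expression");
    return new Function(`return (${expr});`)();
  },
};
```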

Write your own client

Clients just need to interface with the API defined in openapi.json. There are plenty of OpenAPI definition -> client library generation tools out there to help get things started if you wish to write a client in a new language.

Running it locally

Server

After cloning the repo, set up your API keys. At minimum you'll need an OpenAI API key.

cp server/.env.template server/.env
# edit server/.env

Run the entire server stack using docker:

make build
make up-all
open http://localhost:8003

The index page has some examples, and you can play around with text or audio prompts. Note that any commands that must be resolved client-side are fixtures/dummies (the OSX app includes 'real' client-side commands).

If you are working on the server, you should set up the live-reloading server. Not only does this pick up code changes, but it also regenerates the OpenAPI spec when the API interface changes.

make dev
open http://localhost:8080

OSX App

Currently pre-built binaries are not available so you'll need to build the OSX app yourself.

Open the xcworkspace in Xcode:

open assist/assist.xcworkspace

Build the 'assist' scheme. Note that by default the app points to http://localhost:8003/api, which corresponds to the server running in docker via make up-all. If you are running the server with make dev you'll want to update the API base URL to http://localhost:8080/api by going to the app settings.

Any changes to the API surface will require a new Swift client to be generated using make clients.

Documentation

I still need to work on some more comprehensive documentation for the codebase 🙇

API Spec

The API is described in openapi.json which can be plugged into https://editor.swagger.io/ for viewing.

Contributing

Let's build JARVIS together :)

There is no contribution guide for now, but you are welcome to make contributions to the OSX client (or introduce new clients if you are okay with an unstable API).

Issues and feature requests are accepted for the server/, but not code changes at this moment.

Testing

For the OSX client, run tests via Xcode as per usual. For the server, use make test to run tests, and make test-update to update test snapshots. Snapshot testing is used liberally, for better or for worse 🙈

Bug reports

Running the server with make dev results in more verbose logging; please copy and paste the error output into any bug reports (after redacting sensitive information, if any).

Known issues

(Need to migrate these to GH issues)

  • If the model does not have the commands to complete the request, it will tend to just repeat the question back to the user.

    • A general knowledge lookup command would be useful here; the existing search command is not sufficient since it does not provide a knowledge graph. Perhaps this command could have multiple layers: ask an LLM first, and if there is no answer, fall back to a search API.
  • The model gets confused when there are overlapping commands relevant to a request. For example, if you ask it to write example code that creates a reminder using Swift, it ends up calling the createReminder() command instead.

    • I'm exploring an approach to alleviate this: a pre-processing stage that filters the command set down, which could also result in token savings.
  • The model will sometimes output an un-parseable expression, leading to an error. A quick fix is usually to rephrase your request; in the long term the following approaches are being considered:

    • Progressively loosen the definition of the action language to accommodate common model errors.
    • Support configurable automatic retries.
    • Prompt engineering and fine tuning.
  • Apple APIs for audio recording and transcription are hard :( Push-to-talk is very flaky with AirPods on.
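
The pre-processing stage mentioned above could use the triggerTokens that commands already declare (see the createCalendarEvent definition earlier). The sketch below is a hypothetical illustration of that idea, not the server's actual implementation:

```typescript
// Sketch: before building the prompt, keep only commands whose trigger
// tokens appear in the request. Commands with no trigger tokens are
// always included. Field names mirror the Swift definition above but
// the real filtering logic may differ.

interface CommandInfo {
  name: string;
  triggerTokens: string[];
}

function filterCommands(request: string, all: CommandInfo[]): CommandInfo[] {
  const words = new Set(request.toLowerCase().split(/\W+/));
  return all.filter(
    (c) => c.triggerTokens.length === 0 || c.triggerTokens.some((t) => words.has(t))
  );
}

const available: CommandInfo[] = [
  { name: "createCalendarEvent", triggerTokens: ["calendar", "event", "appointment", "meeting"] },
  { name: "createReminder", triggerTokens: ["remind", "reminder"] },
  { name: "returnText", triggerTokens: [] }, // always included
];

console.log(
  filterCommands("make a calendar event for tomorrow", available).map((c) => c.name)
);
// → ["createCalendarEvent", "returnText"]
```

Fewer commands in the prompt means fewer tokens spent and fewer chances for the model to reach for an overlapping but wrong command.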
