Design Meeting Notes (2023-10-23) #134

Open
DanielRosenwasser opened this issue Oct 31, 2023 · 1 comment

Possible Topics

  • Issue tracker
  • Library integrations
  • OpenAI functions
  • Formal representations
  • TypeChat Programs
  • Other languages (e.g. Python and C#)
  • Other features

Issue Tracker Maintenance and Community Engagement

  • We had a pause - what happened?
    • Vacations, explorations with internal teams (e.g. Copilot implementations), etc.
    • Direct discussions with users, but took us away from GitHub for a bit.
  • Where are we now?
    • Still want more blog posts and a video explainer - seeing is believing.
    • Plan to do a sweep over issues and PRs.

TypeChat and Orchestrators

  • Things like Semantic Kernel, langchain, etc.
  • Currently exploring how these can be integrated - only loose ideas at this moment.
  • Want to be able to find where these can complement each other, integrate better, etc.
    • Planners based on TypeChat's JSON Programs

OpenAI Functions

#45

  • OpenAI functions are called one function at a time.
  • Each function is described via JSON schema.
  • There's a function role that fits within a conversation.
  • The models are fine-tuned for this, but are not guaranteed to produce schema-conforming data (nor even well-formed data!).
  • Is there a lot of usage?
    • There's a lot of excitement, but we haven't yet spoken with many users.
  • So why not just use the TypeChat approach here? Either TypeChat JSON validation or TypeChat JSON programs?
    • We believe one subsumes the other - TypeChat being cross-model with type-checked validation is more robust.
    • Could plug in your favorite schema validator to do this technically, right?
    • Anecdotally, TypeChat performs very very well. To be honest, a lot better in our experience.
      • We're missing evidence we can show to the outside world though.
  • Do we have any insight into long-term plans with OpenAI functions?
    • Not yet, we would love to discuss further with these teams.
  • Conclusion?
    • We don't yet think it makes sense to support directly - would love to better understand long-term plans from LLM providers like OpenAI.
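
To make the "plug in your favorite schema validator" point concrete, here is a minimal sketch of checking a model's function-call arguments by hand. Everything in it is an assumption for illustration: the `Result` type, the `GetWeatherArgs` shape, and the `validateGetWeatherArgs` helper are hypothetical and are not part of TypeChat or the OpenAI API. It covers both failure modes mentioned above: output that is not well-formed JSON, and JSON that does not conform to the expected shape.

```typescript
// Hypothetical result type; the names are illustrative, not TypeChat's API.
type Result<T> =
    | { success: true; data: T }
    | { success: false; message: string };

// Hypothetical argument shape for an imagined "getWeather" function.
interface GetWeatherArgs {
    city: string;
    unit: "celsius" | "fahrenheit";
}

// Parse the raw argument string returned by a model, then check its shape.
function validateGetWeatherArgs(raw: string): Result<GetWeatherArgs> {
    let value: unknown;
    try {
        value = JSON.parse(raw);
    } catch {
        // The output was not even well-formed JSON.
        return { success: false, message: "Response is not valid JSON." };
    }
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
        return { success: false, message: "Expected a JSON object." };
    }
    const obj = value as Record<string, unknown>;
    const city = obj["city"];
    const unit = obj["unit"];
    if (typeof city !== "string") {
        return { success: false, message: "Property 'city' must be a string." };
    }
    if (unit !== "celsius" && unit !== "fahrenheit") {
        return { success: false, message: "Property 'unit' must be 'celsius' or 'fahrenheit'." };
    }
    return { success: true, data: { city, unit } };
}
```

The failure message is a plain string so that it could be fed back to the model as a repair prompt, which is the kind of validate-then-repair loop the notes have in mind.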

Formal Representations for LLMs

  • What does that mean?
    • Verifiable and repairable syntactically/semantically
  • Areas of investigation
    • Best representations of...
      • specifications (e.g. TypeScript types, "JSON templates", JSON schema...)
      • return formats (e.g. JSON, YAML, code in specific languages)
    • Is there a compact schema form that we can adopt/invent with high accuracy? It'd be easier to verify if we had something more compact than JSON schema.
      • But new languages = new toolchains. Picking a well-defined subset of a known language like TypeScript might be more successful.
    • How do we make these work across languages?
  • What about a separate authoring format?
    • "SchemaLite"?
  • What about that subset of TypeScript?
  • What about TypeSpec?
  • TypeScript versus JSON Schema?
    • TypeScript really shines on discriminated unions.
    • What's the best way to describe a discriminated union to an LLM? For data interchange, that's fundamentally how you describe polymorphism.
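
As an illustration of the discriminated-union point, the sketch below shows how TypeScript expresses polymorphic data with a literal tag property. The `LineItem` schema and the `describe` function are hypothetical examples, not taken from the meeting.

```typescript
// Hypothetical schema: each variant carries a literal "type" tag.
interface DrinkItem {
    type: "drink";
    name: string;
    size: "small" | "large";
}

interface FoodItem {
    type: "food";
    name: string;
    warmed: boolean;
}

// Catch-all variant for input that could not be understood.
interface UnknownItem {
    type: "unknown";
    text: string;
}

type LineItem = DrinkItem | FoodItem | UnknownItem;

// Switching on the tag narrows `item` to the matching variant in each case.
function describe(item: LineItem): string {
    switch (item.type) {
        case "drink":
            return `${item.size} ${item.name}`;
        case "food":
            return item.warmed ? `${item.name} (warmed)` : item.name;
        case "unknown":
            return `not understood: ${item.text}`;
    }
}
```

The same tag that guides the type-checker also gives the model an explicit choice to make, which is one plausible reason this form is compact to describe compared with the equivalent `oneOf` construction in JSON schema.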

Further Evolution of JSON Programs/Planning/Scripting/Orchestration

  • Some feedback on programs is that they're cool, but too limited.
    • Clever ways to enable some stuff like branching and iteration, but they don't always scale.
  • Models are being asked to produce an IR that is turned into another language, then interpreted.
  • The feedback loop from a type-checker is pretty removed.
    • Hard problem with verification.
  • But we have concerns about sandboxing and guaranteed availability (i.e. keeping your host programs working in spite of the halting problem).
  • Plus, what if you have millions of functions, or methods on objects with thousands of types, etc.?
    • And if we want to deliver plans with no hallucinations, we want to be able to summarize plans for humans too. So we want that...
    • But how do you actually present this to a user?
    • Just be able to provide transactions/undo? Commit/unroll?
  • Maybe there's some inspiration to be taken from languages like PowerShell, Tcl? Bring your own language features, build it up.
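
For readers unfamiliar with the idea, the sketch below interprets a small, strictly linear JSON program. The `@steps`, `@func`, `@args`, and `@ref` property names are loosely modeled on TypeChat's JSON program format, but the types and the `runProgram` function are simplified assumptions written for this note, not the library's implementation. Because the host only ever calls functions it has registered, the sandboxing surface stays small, at the cost of the expressiveness limits discussed above.

```typescript
// Simplified, assumed shapes; nested calls and other expression forms are omitted.
type Expr = number | string | { "@ref": number };

interface Step {
    "@func": string;
    "@args": Expr[];
}

interface Program {
    "@steps": Step[];
}

// The host decides which functions a program may call.
type Api = Record<string, (...args: any[]) => unknown>;

// Runs each step in order; returns the last step's result (undefined if there are no steps).
function runProgram(program: Program, api: Api): unknown {
    const results: unknown[] = [];
    for (const step of program["@steps"]) {
        const name = step["@func"];
        // Only functions defined directly on `api` are callable.
        if (!Object.prototype.hasOwnProperty.call(api, name)) {
            throw new Error(`Unknown function: ${name}`);
        }
        const args = step["@args"].map(arg => {
            if (typeof arg === "object") {
                // A reference may only point at an earlier step's result.
                const index = arg["@ref"];
                if (Math.floor(index) !== index || index < 0 || index >= results.length) {
                    throw new Error(`Invalid @ref: ${index}`);
                }
                return results[index];
            }
            return arg;
        });
        results.push(api[name](...args));
    }
    return results[results.length - 1];
}
```

There is no branching or iteration here: every step runs exactly once, so the program always terminates as long as the host functions do.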

Multi-Agent/Multi-Schema/Routing Support

  • Dynamic Schema Generation from Data
  • Programmatic Schema Construction
    • Dynamically populating structure and entities
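
One way to picture programmatic schema construction is to generate the schema text from runtime data. The sketch below is a hypothetical helper (`buildSchema` and the product-catalog scenario are assumptions, not something described in the meeting) that turns a list of entity names into a TypeScript interface whose `product` property is a union of string literals.

```typescript
// Build TypeScript schema text from runtime data (here, an imagined product catalog).
function buildSchema(typeName: string, products: string[]): string {
    // JSON.stringify yields a quoted, escaped string literal for each entity name.
    const union = products.length > 0
        ? products.map(p => JSON.stringify(p)).join(" | ")
        : "never"; // no known products: nothing is a valid value
    return [
        `export interface ${typeName} {`,
        `    product: ${union};`,
        `    quantity: number;`,
        `}`,
    ].join("\n");
}
```

For example, `buildSchema("Order", ["latte", "mocha"])` produces an `Order` interface whose `product` is `"latte" | "mocha"`. The resulting string could then be used wherever schema text is expected, so the schema tracks the data instead of being written by hand.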

Long-Term Features We'd Like to Tackle

  • Embeddings
  • Vocabulary
  • Multi-Schema
  • Routing
  • Multi-Model Infrastructure

xumx commented Nov 7, 2023

OpenAI functions can now be called in parallel:
https://platform.openai.com/docs/guides/function-calling/parallel-function-calling
They can also output JSON more reliably.

I have also been implementing dynamic schema construction and generation from data, as well as a JSON program to generate schemas.

One current limitation is that the generated JSON Programs are linear: the LLM is not given a chance to reflect on intermediate results and readjust the plan.
