Create data/AI platform server #42

Closed
8 tasks
josephjclark opened this issue Apr 3, 2024 · 0 comments · Fixed by #44
josephjclark commented Apr 3, 2024

This issue is to restructure the gen repo so that it contains:

  • A web platform written in TypeScript and Bun
  • A bunch of standalone Python services

Structure

The structure is Bun-first, so the top level has a package.json and some scripts.

gen
├── services
│   ├── codegen
│   │   ├── README.md
│   │   ├── src
│   │   ├── main.py
│   │   └── test
│   └── ...other python services
├── platform
│   ├── README.md
│   ├── src
│   │   └── index.ts
│   └── test
└── package.json

Web platform

I want a really simple, data-driven layer to hook up web services. This needs experimenting with but it'll be something like:

const routes = {
  // endpoint: handler module (Python or JS)
  'template-gen': loadPython('code_generator'),
  'metadata': loadJS('metadata'),
}

The TypeScript framework will accept a POST request with a JSON payload, do any auth that's appropriate, and feed the JSON payload through to the Python service as a dictionary.

This object-dict binding will be the core of the API really. Python modules should be loaded from a predictable folder structure with a main handler function.

At the moment I don't plan to support resources in paths or anything - just static bindings of a path to a python module.
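A minimal sketch of the static path-to-handler binding described above, assuming each service is exposed as a function from a JSON object to a JSON object. `Handler`, `DispatchResult` and `dispatch` are hypothetical names for illustration, and the stub handlers only stand in for whatever `loadPython` and `loadJS` end up returning. A real handler would probably be async; this one is synchronous to keep the idea visible.

```typescript
// Hypothetical types: a JSON-like payload and a service handler.
type Json = Record<string, unknown>;
type Handler = (payload: Json) => Json;

// Result of a dispatch: an HTTP-style status plus a JSON body.
type DispatchResult = { status: number; body: Json };

// Static bindings of endpoint name to handler. These stubs stand in for
// loadPython('code_generator') and loadJS('metadata').
const routes: Record<string, Handler> = {
  "template-gen": (payload) => ({ service: "code_generator", received: payload }),
  metadata: (payload) => ({ service: "metadata", received: payload }),
};

// Look up the handler for a path like "/template-gen" and pass the payload through.
function dispatch(
  table: Record<string, Handler>,
  path: string,
  payload: Json
): DispatchResult {
  const name = path.replace(/^\/+/, "");
  // Own-property check so that names like "toString" are not treated as routes.
  const handler = Object.prototype.hasOwnProperty.call(table, name)
    ? table[name]
    : undefined;
  if (!handler) {
    return { status: 404, body: { error: `unknown service: ${name}` } };
  }
  try {
    return { status: 200, body: handler(payload) };
  } catch (err) {
    return { status: 500, body: { error: String(err) } };
  }
}
```

Because there are no path parameters, the whole router reduces to one table lookup, which is what keeps the server component trivial.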

Bun has some frameworks for web servers. Do we need anything fancy or can we just use basic http server APIs? I think the server component is going to be pretty trivial. But it's worth a look around. Elysia and native libraries both look fine.

Hosting & Branding

This is either a data-generation server which includes AI stuff, or an AI server which includes data stuff. I prefer the former.

But one way around the semantics is to give the server a name, like apollo

God of oracles, healing, archery, music and arts, light, knowledge, herds and flocks, and protection of the young

Apollo fits nicely into our product pantheon - and the idea of this being a server of prophecy, oracles and knowledge is just perfect.

Athena, as goddess of wisdom, would be another candidate.

Authentication

Should we adopt a security-first approach to this server?

Lightning and the CLI are both likely to be the major clients.

But neither will share any particularly sensitive information. Metadata is only as sensitive as the credential you use, adaptor docs should be considered public, and any generated code templates are not sensitive.

We may want Lightning and the CLI to include some kind of token so that we can monitor usage and rates, and so that we have some idea who is calling. CLIs are anonymous, of course, but maybe each can generate its own identifier.

This excludes any tokens needed for AI generation.

Data services

The existing python needs to be restructured, removing all the web server stuff and just leaving the core python modules (#30)

A data service can be python (like the current adaptor code generator) or typescript (like the metadata service).

In either case it is a function which accepts an object with arguments as input and returns a JSON object as output. The returned object may have some conventions.

All services are stateless. They may utilise caching, but that probably needs to happen inside the service handler, if appropriate.
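To make the service contract concrete, here is a rough sketch of a handler that takes an object of arguments and returns a JSON object, with caching kept inside the handler wrapper. `ServiceHandler`, `withCache` and `describeAdaptor` are all made-up names for illustration, and keying the cache on `JSON.stringify` is just one simple assumption.

```typescript
// Hypothetical service contract: object of arguments in, JSON object out.
type Json = Record<string, unknown>;
type ServiceHandler = (args: Json) => Json;

// Wrap a handler so repeated calls with identical arguments reuse the first result.
// Note: JSON.stringify is key-order sensitive, so {a, b} and {b, a} are cached separately.
function withCache(handler: ServiceHandler): ServiceHandler {
  const cache = new Map<string, Json>();
  return (args) => {
    const key = JSON.stringify(args);
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const result = handler(args);
    cache.set(key, result);
    return result;
  };
}

// Made-up service used only to show the shape of a handler.
// The counter exists so the effect of the cache can be observed.
let calls = 0;
const describeAdaptor: ServiceHandler = (args) => {
  calls += 1;
  return { adaptor: String(args["name"] ?? "unknown") };
};

const cachedDescribe = withCache(describeAdaptor);
```

The server itself stays unaware of the cache: it only ever sees a function from arguments to a JSON object.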

Python

The server will create a python runtime, in which all python code can be implemented.

I'll need to work out the best way to structure things so that poetry can install dependencies from the root.

CLI integration

I guess this is an issue for kit really, but I want the CLI to integrate well with this.

Something like `openfn apollo path/to/json.json -i standard/cli-style/output`

Basically it just takes a resource and a payload and it'll ensure the output goes to the right place.

For services which write to the adaptors monorepo, we'll either need a command override (most likely) or some special flag, like -a salesforce, which causes the results to be written to salesforce.

Maybe the return data from the services can use a convention, like if it returns a bunch of files, they come down like:

files: {
   'src/Adaptor.js': '<...>'
}
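As a rough illustration of how a client might consume that convention, the sketch below turns a `files` object into a list of planned writes under an output directory. `ServiceResult`, `PlannedWrite` and `planWrites` are hypothetical names, and rejecting absolute paths and `..` segments is an assumption about what the CLI would want, not something decided here.

```typescript
// Hypothetical shape of a service result that follows the "files" convention.
type ServiceResult = { files?: Record<string, string> };
type PlannedWrite = { path: string; content: string };

// Turn a result into a list of writes under outDir, without touching the disk.
// Relative paths only: absolute paths and ".." segments are rejected.
function planWrites(result: ServiceResult, outDir: string): PlannedWrite[] {
  const base = outDir.replace(/\/+$/, "");
  return Object.entries(result.files ?? {}).map(([relPath, content]) => {
    const segments = relPath.split("/");
    if (relPath.startsWith("/") || segments.includes("..")) {
      throw new Error(`refusing to write outside ${outDir}: ${relPath}`);
    }
    return { path: `${base}/${relPath}`, content };
  });
}
```

Keeping the planning step separate from the actual file writes would let the CLI point the same result at a different target, such as an adaptor folder in the monorepo.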

Checklist

  • Set up a basic Bun web server
  • Call out to the sig gen Python module
  • Dockerise
  • Formalize the web APIs
  • Set up a nice declarative router
  • Create one entrypoint for api gen
  • Refactor stuff out of utils
  • Remove Python server stuff