tobami edited this page Jan 10, 2011 · 13 revisions

We want to develop iteratively, in bite-sized steps, refactoring as needed.

First milestone [Done]

The goal is to have a unified front-end to public clouds, private clouds, and dedicated hardware.

  • Very basic Django Web UI [Done]
  • DB Schema for Providers and Nodes [Done]
  • Use libcloud drivers to interface with clouds [Done]
  • Provider plugins infrastructure [Done]
  • Dedicated server plugin [Done]
  • REST API [Done]
  • User authentication [Done]
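
As an illustration of the provider plugin idea, here is a minimal sketch of a plugin registry with a dedicated server plugin. All names (`ProviderPlugin`, `register`, `get_plugin`, `DedicatedHardware`, `Node`) are hypothetical and are not taken from the Overmind code base; a cloud plugin would presumably wrap a libcloud driver behind the same interface.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    provider: str
    ip: str = ""


class ProviderPlugin:
    """Interface every provider plugin is assumed to implement."""

    name = "base"

    def create_node(self, name, **options):
        raise NotImplementedError

    def list_nodes(self):
        raise NotImplementedError


# Registry of plugin classes, keyed by provider name (hypothetical).
PLUGINS = {}


def register(plugin_cls):
    """Class decorator that makes a plugin discoverable by name."""
    PLUGINS[plugin_cls.name] = plugin_cls
    return plugin_cls


@register
class DedicatedHardware(ProviderPlugin):
    """Dedicated servers already exist, so 'creating' one only records it."""

    name = "dedicated"

    def __init__(self):
        self._nodes = []

    def create_node(self, name, ip="", **options):
        node = Node(name=name, provider=self.name, ip=ip)
        self._nodes.append(node)
        return node

    def list_nodes(self):
        return list(self._nodes)


def get_plugin(name):
    """Instantiate a registered plugin, failing loudly on unknown names."""
    try:
        return PLUGINS[name]()
    except KeyError:
        raise ValueError(f"unknown provider: {name}") from None
```

With this shape, the UI and the REST API only need to call `get_plugin(name)` and never care whether the node lives in a public cloud, a private cloud, or on dedicated hardware.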

Milestone 1 has been released as Overmind 0.1.

Second milestone [In progress]

As a second step, implement configuration management and a job queue, and improve the UI.

  • Caching of images [Done]
  • Celery job queue [Done]
  • Configuration Management
    • Plugin architecture. Start with a Chef-solo plugin
    • Discuss the best way to implement it:
      • A server role could define both a machine type and a Chef role
      • We can begin with a simple push architecture using Little Chef
  • Begin to design and implement a good UI
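
To make the "server role" idea concrete, here is a rough sketch of a role that pairs a machine type with a Chef role and renders the JSON attributes file that chef-solo accepts through its `-j` option. The `ServerRole` class, its fields, and the `nginx` attribute are assumptions made up for this example, not a decided design.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ServerRole:
    """Hypothetical pairing of a machine type with a Chef role."""

    name: str
    machine_type: str  # e.g. a provider size identifier (assumed)
    chef_role: str
    attributes: dict = field(default_factory=dict)

    def node_json(self):
        """Build the JSON attributes document passed to chef-solo via -j."""
        doc = dict(self.attributes)
        # Chef run lists reference roles as "role[<name>]".
        doc["run_list"] = [f"role[{self.chef_role}]"]
        return json.dumps(doc, indent=2, sort_keys=True)


# Illustrative role; the attribute values are invented for the example.
web = ServerRole(
    "web",
    machine_type="small",
    chef_role="webserver",
    attributes={"nginx": {"worker_processes": 2}},
)
print(web.node_json())
```

In a push setup, the rendered document would be copied to the node together with the cookbooks before chef-solo is run there.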

Third milestone

The main goal should be to complement or replace the initial push architecture with a pull/queue architecture.

  • Pull/queue architecture
    • Messaging architecture
    • Agent. The agent will just:
      • Consume the queue
      • Execute Chef Solo when required, passing it the appropriate parameters
  • UI should now be great
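
The agent described above could look roughly like the following sketch. It assumes a job is a dict with an `action` and an `attributes` key (a made-up message format), uses an in-memory `queue.Queue` as a stand-in for the real message broker, and takes the command runner as a parameter so it can be replaced in tests.

```python
import json
import os
import queue
import subprocess
import tempfile


def build_command(node_json_path, config_path="/etc/chef/solo.rb"):
    """chef-solo reads its config from -c and node attributes from -j."""
    return ["chef-solo", "-c", config_path, "-j", node_json_path]


def handle_job(job, runner=subprocess.call):
    """Write the job's attributes to a temp file and run chef-solo on it."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fh:
        json.dump(job["attributes"], fh)
        path = fh.name
    try:
        return runner(build_command(path))
    finally:
        os.unlink(path)  # never leave attribute files behind


def consume(jobs, runner=subprocess.call):
    """Drain the queue, returning one exit code per executed job."""
    results = []
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return results
        # "converge" is a hypothetical action name; other jobs are ignored.
        if job.get("action") == "converge":
            results.append(handle_job(job, runner))
```

Keeping the agent this small means the messaging layer can change later without touching the part that actually invokes Chef Solo.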

Fourth milestone

This will be a stable release, maybe the first to get wider exposure.

  • Refine the CM pull/queue architecture
    • Cement messaging API
    • Test scalability and fine-tune accordingly
  • Add basic monitoring by extending the agent:
    • Execute client-side monitoring scripts (Python, bash, or anything else; the equivalent of Nagios plugins) through a simple API
    • As a first step, only report basic readings (CPU, memory, etc.) to display real-time status, but don’t record history
    • CPU/memory/disk data should be shown in the node list, and each node's status set accordingly
    • As a bonus, the agent could gather system information (on first run?), maybe using something like Ohai
    • We could use something like Graphite (or parts of it). It now sports AMQP Integration: “The ability for carbon-cache to receive datapoints via AMQP was added in 0.9.6”
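
A possible shape for the "simple API" between the agent and its monitoring scripts is the Nagios plugin convention, where the exit code carries the result (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the first line of output is a human-readable reading. The sketch below runs such scripts and derives a node status from them. The function names and the severity ordering are assumptions, and the checks are tiny Python one-liners standing in for real scripts.

```python
import subprocess
import sys

# Nagios plugin convention: the exit code carries the check result.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}
# Assumed severity order used to pick the node's overall status.
SEVERITY = ["OK", "WARNING", "UNKNOWN", "CRITICAL"]


def run_check(command, timeout=10):
    """Run one check script; return (state, first line of its output)."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True,
                              timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "UNKNOWN", str(exc)
    state = STATES.get(proc.returncode, "UNKNOWN")
    lines = proc.stdout.strip().splitlines()
    return state, lines[0] if lines else ""


def node_status(readings):
    """Overall status is the most severe state among all readings."""
    states = [state for state, _ in readings.values()]
    return max(states, key=SEVERITY.index, default="UNKNOWN")


# Stand-in checks: Python one-liners instead of real monitoring scripts.
checks = {
    "cpu": [sys.executable, "-c", "print('cpu 12%')"],
    "disk": [sys.executable, "-c",
             "import sys; print('disk 91%'); sys.exit(1)"],
}
readings = {name: run_check(cmd) for name, cmd in checks.items()}
print(readings, node_status(readings))
```

Only the latest readings are kept here, which matches the first step of showing real-time status without recording history.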

Fifth milestone onward

  • Make monitoring a serious feature
    • Study the possibility of reusing Nagios plugins (Perl scripts?)
    • Store monitoring data in the main DB
    • Graphs
  • Start incorporating advanced functionality:
    • Store monitoring data in a second DB (MongoDB)?
    • Cloning server constellations
    • Load balancing, etc.
    • Service guarantees, etc.