👋 I'm Simon.

Currently, I'm a PhD student at the Berkeley Sky Computing Lab, working on machine learning systems and cloud infrastructure. I am advised by Prof. Joseph Gonzalez.

My latest focus is building an end-to-end stack for LLM inference on your own infrastructure. This work includes:

  • vLLM runs LLM inference efficiently.
  • Conex builds, pushes, and pulls containers fast.
  • SkyATC orchestrates LLMs across multiple clouds and scales them to zero.

I previously worked on model serving systems @anyscale.

  • Ray takes your Python code and scales it to thousands of cores.
  • Ray Serve empowers data scientists to own their end-to-end inference APIs.

Before Anyscale, I was an undergraduate researcher @ucbrise.


Reach out to me:


Pinned

  1. conex — Container Express is a tool to accelerate Docker push and pull. (Rust)