Machine Learning Headquarters

An exploration in democratizing machine learning technology and the infrastructures that support it.

Goals and Ideals:

Creating scalable, affordable hardware for those interested in learning and contributing to machine learning.

Open Source education and designs.

Lower the barrier to entry for AI, ML, and supercomputing.

Document everything, release everything free of charge.

This is not a startup, this is a lifestyle. A mission.

What is the first hurdle that needs to be overcome? What's the smallest goal we can meet while still making a difference?
All of the goals below can be pursued independently in ways that will have an impact (I think). The easiest is to create a compendium of resources so that individuals can follow the same path. The next would be an open source design for a specific goal (this would be the super expandable super clusters). From previous conversation, it sounds like the current bottleneck is affordable high-speed network technology, so we can start by understanding that and then tackle other avenues.

What does it take to run popular machine learning models (Hugging Face) on RISC-V hardware?

What makes a supercomputer? Memory bandwidth? CPU clock? Instructions per clock? 200,000 nodes?
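One rough way to relate these factors is the usual peak-throughput estimate (cores x clock x floating-point operations per cycle) combined with a roofline-style bound, where memory bandwidth caps what a node can actually sustain. The sketch below is only an illustration: the `Node` fields and every number in the example are hypothetical placeholders, not measurements of any real hardware.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical description of one compute node (all values are assumptions)."""
    cores: int
    clock_ghz: float          # cycles per second, in GHz
    flops_per_cycle: int      # floating-point operations per core per cycle
    mem_bandwidth_gbs: float  # memory bandwidth in GB/s


def peak_gflops(node: Node) -> float:
    """Theoretical peak: cores * clock * FLOPs per cycle."""
    return node.cores * node.clock_ghz * node.flops_per_cycle


def attainable_gflops(node: Node, flops_per_byte: float) -> float:
    """Roofline bound: limited by compute or by memory bandwidth, whichever is lower."""
    return min(peak_gflops(node), node.mem_bandwidth_gbs * flops_per_byte)


def cluster_peak_gflops(node: Node, node_count: int) -> float:
    """Upper bound for a cluster, ignoring all network and scheduling overhead."""
    return node_count * peak_gflops(node)


if __name__ == "__main__":
    small = Node(cores=4, clock_ghz=2.0, flops_per_cycle=8, mem_bandwidth_gbs=25.0)
    print("peak per node (GFLOP/s):", peak_gflops(small))
    print("memory-bound workload (GFLOP/s):", attainable_gflops(small, 1.0))
    print("100-node upper bound (GFLOP/s):", cluster_peak_gflops(small, 100))
```

A workload with few operations per byte moved is limited by memory bandwidth no matter how fast the cores are, which is why node count or clock speed alone does not answer the question.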

Can we use the same hardware to scale a supercomputer from $100 to $100 million?

Can we produce a RISC-V processor that uses DDR5, PCIe 5.0, and 10GbE networking, and runs C/Python without paying license fees? Do we need NVMe?

What is the fastest way to connect nodes in a cluster that is "plug and play" (just buy more nodes, not more connectors or bigger nodes)?

Is it possible to run NVIDIA CUDA software on RISC-V? If not, what is a realistic alternative to CUDA that is actually competitive? Is there something close to a drop-in replacement? What does that look like? Does Vulkan + SPIR-V fit in here?

Can Kubernetes run on a RISC-V cluster?

Can FreeNAS run on a RISC-V cluster?

Minimum Viable Product:

A Kubernetes cluster of 150 Intel NUCs for benchmarking training and inference of statistical learning models and Transformer models for NLP.
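As a starting point for the benchmarking goal, a minimal timing harness could look like the sketch below. It assumes each training or inference step can be wrapped in a plain Python callable; the `benchmark` function, its parameters, and `fake_step` are hypothetical names used only for illustration, and a real run would substitute an actual model step.

```python
import time


def benchmark(step, batch_size, runs=10, warmup=2):
    """Time a callable that processes one batch and report simple throughput stats."""
    # Warm-up calls are executed but not timed (caches, lazy initialization).
    for _ in range(warmup):
        step()

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        step()
        timings.append(time.perf_counter() - start)

    total = sum(timings)
    return {
        "runs": runs,
        "mean_seconds": total / runs,
        "samples_per_second": (batch_size * runs) / total if total > 0 else float("inf"),
    }


def fake_step():
    """Placeholder workload standing in for one training or inference batch."""
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(benchmark(fake_step, batch_size=32))
```

Running the same harness on each node and on the whole cluster gives comparable samples-per-second figures, which is enough to see whether adding nodes actually helps.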

A high speed, low cost network interface.

Stretch goals:

Open source blade for expanding commonly available systems like the RPi.

Open source rack motherboard for connecting those blades to an Ethernet link (the blade is probably useless without this).

Open source CPUs that can slot into the blade. This requires new designs all the way down.

Rank on the supercomputer leaderboard with a RISC-V-based supercomputer.

Current Configuration:

Multiple CPUs are mounted on a "blade" that plugs into a rack-mounted motherboard, most likely over a PCIe bus. These motherboards can be stacked and expanded through high-speed Ethernet interfaces that link to high-speed network switches, allowing the supercomputer cluster to grow further.
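The hierarchy described above (CPUs on blades, blades on motherboards, motherboards linked over Ethernet) can be modeled as a small data structure for capacity planning. This is only a sketch: the class names, the `ethernet_gbps` field, and the example counts are assumptions made for illustration, not part of any finalized design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Blade:
    """A blade carrying some number of CPUs (count is a hypothetical value)."""
    cpu_count: int


@dataclass
class Motherboard:
    """A rack-mounted motherboard holding blades and one Ethernet uplink."""
    blades: List[Blade] = field(default_factory=list)
    ethernet_gbps: float = 10.0  # assumed uplink speed

    def cpu_count(self) -> int:
        return sum(blade.cpu_count for blade in self.blades)


@dataclass
class Cluster:
    """Motherboards connected through network switches."""
    motherboards: List[Motherboard] = field(default_factory=list)

    def cpu_count(self) -> int:
        return sum(board.cpu_count() for board in self.motherboards)

    def total_uplink_gbps(self) -> float:
        return sum(board.ethernet_gbps for board in self.motherboards)


if __name__ == "__main__":
    board = Motherboard(blades=[Blade(cpu_count=4) for _ in range(4)])
    cluster = Cluster(motherboards=[board, board])
    print("CPUs:", cluster.cpu_count())
    print("Aggregate uplink (Gb/s):", cluster.total_uplink_gbps())
```

Expanding the cluster means appending another `Motherboard`, which mirrors the "just buy more nodes" goal stated earlier.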
