🦾 War-Machine AI

A high-performance, local AI character integration using Node.js (ES Modules) and Ollama. This project is specifically tuned for the Intel i5-1235U with 16GB RAM, focusing on bypassing Windows DNS latency and providing real-time streaming.

🚀 Features

Custom Personality: A "War-Machine" persona defined via a dedicated Modelfile.
Zero-Lag Networking: Connects directly to IPv4 (127.0.0.1) instead of localhost, avoiding the delay of up to about 2 seconds that the localhost lookup can add on Windows.
Real-Time Streaming: Uses HTTP chunked transfer encoding to deliver tokens as soon as they are generated.
RAM Optimized: Sets keep_alive so the model stays loaded in RAM between requests instead of being reloaded each time.

🛠️ Hardware Context

Processor: 12th Gen Intel Core i5-1235U (2 P-Cores, 8 E-Cores).
RAM: 16GB DDR4/DDR5.
Performance Note: On this CPU, expect roughly 5-8 tokens per second. Streaming is enabled so that the first token typically arrives in under 1.5 s.
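
As a rough sanity check on these figures, total response time can be approximated as the first-token latency plus the token count divided by the generation rate. The sketch below is an illustration only: `estimateResponseSeconds` is a hypothetical helper, not part of the project, and the 50-token reply length is an assumed value.

```javascript
// Hypothetical helper (not part of this project): rough response-time estimate.
// Assumes a constant generation rate once the first token has arrived.
function estimateResponseSeconds(tokens, tokensPerSecond, firstTokenSeconds) {
  if (tokensPerSecond <= 0) {
    throw new RangeError('tokensPerSecond must be positive');
  }
  return firstTokenSeconds + tokens / tokensPerSecond;
}

// An assumed 50-token reply at the slow and fast ends of the 5-8 tokens/s range.
const slow = estimateResponseSeconds(50, 5, 1.5);
const fast = estimateResponseSeconds(50, 8, 1.5);
console.log(`50 tokens: between ${fast}s and ${slow}s`);
```

Longer replies scale linearly with token count, which is why streaming matters on this hardware: the reader sees text early even when the full answer takes several seconds.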

📥 Setup Instructions

1. Install Ollama

Download the engine at ollama.com. Ensure the Ollama icon is visible in your system tray.

2. Project Initialization

# Initialize and install dependencies
npm init -y
npm install express ollama
npm pkg set type="module"

3. Build the Character

Ensure you have a file named Modelfile in your root directory. Then, register the character:

# Run via terminal
ollama create war-machine -f Modelfile
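
The contents of the project's Modelfile are not shown here. As a sketch of what such a file might contain, the snippet below assembles one from `FROM`, `PARAMETER`, and `SYSTEM` instructions, which are standard Modelfile directives. The base model `llama3.2`, the temperature value, the persona text, and the `buildModelfile` helper are all assumptions for illustration.

```javascript
// Hypothetical helper: builds Modelfile text from a few options.
function buildModelfile({ base, system, parameters = {} }) {
  const lines = [`FROM ${base}`];
  for (const [name, value] of Object.entries(parameters)) {
    lines.push(`PARAMETER ${name} ${value}`);
  }
  // Triple quotes allow the system prompt to span multiple lines.
  lines.push(`SYSTEM """${system}"""`);
  return lines.join('\n') + '\n';
}

const modelfile = buildModelfile({
  base: 'llama3.2',                 // assumed base model
  parameters: { temperature: 0.7 }, // assumed sampling setting
  system: 'You are War-Machine. Answer in short, tactical status reports.',
});
console.log(modelfile);
```

Saving the printed text as Modelfile in the project root and re-running the `ollama create` command above would register a character with that persona.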

4. Ignite the Server

node server.js
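
Before starting the server, it can help to confirm that Ollama is reachable and the character is registered. The sketch below assumes Ollama's default port 11434 and its `/api/tags` endpoint, which lists local models. `hasModel` is a hypothetical helper, and the fetch function is passed in so the check can be exercised without a running Ollama instance.

```javascript
// Hypothetical pre-flight check: is the model registered with Ollama?
// fetchFn defaults to the global fetch available in Node.js 18+.
async function hasModel(model, { host = 'http://127.0.0.1:11434', fetchFn = fetch } = {}) {
  const res = await fetchFn(`${host}/api/tags`);
  if (!res.ok) {
    throw new Error(`Ollama responded with HTTP ${res.status}`);
  }
  const { models = [] } = await res.json();
  // Listed names carry a tag suffix, e.g. "war-machine:latest".
  return models.some((m) => m.name === model || m.name.startsWith(`${model}:`));
}
```

Calling `hasModel('war-machine')` before starting the server gives an early, readable error instead of a failed first request.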

📡 API Reference

POST `/ask`

The primary endpoint for interacting with War-Machine.

Headers: Content-Type: application/json

Request Body:

{
  "prompt": "War-Machine, what is your current status?"
}

Testing via PowerShell (cURL):

curl.exe -X POST http://127.0.0.1:3000/ask `
-H "Content-Type: application/json" `
-d '{"prompt": "Give me a status report on the CPU cores."}'
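
The same request can be made from Node.js. The sketch below assumes the server streams plain-text chunks (the response format is not specified above) and that Node.js 18+ is used for the global `fetch`. `collectText` and `ask` are hypothetical helper names.

```javascript
// Hypothetical helper: decodes streamed byte chunks and reports each piece.
async function collectText(chunks, onText = () => {}) {
  const decoder = new TextDecoder();
  let full = '';
  for await (const chunk of chunks) {
    // stream: true keeps multi-byte characters intact across chunk boundaries.
    const text = decoder.decode(chunk, { stream: true });
    full += text;
    onText(text);
  }
  const tail = decoder.decode(); // flush any bytes still buffered
  if (tail) {
    full += tail;
    onText(tail);
  }
  return full;
}

// Assumes the server from this README is listening on 127.0.0.1:3000.
async function ask(prompt) {
  const res = await fetch('http://127.0.0.1:3000/ask', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  // In Node.js, res.body is a web ReadableStream and is async-iterable.
  return collectText(res.body, (text) => process.stdout.write(text));
}
```

With the server running, `ask('Give me a status report on the CPU cores.')` prints the reply piece by piece and resolves with the full text.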

⚙️ Key Optimizations Applied

Direct IP: Changed localhost to 127.0.0.1 in the client to avoid the localhost lookup delay.
Streaming Loop: Implemented a for await (const part of stream) loop to pipe output directly to the response.
Keep-Alive: Added keep_alive: '30m' so the model stays in memory and does not have to be reloaded from the SSD.
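
Taken together, the three optimizations might look like the sketch below. server.js itself is not reproduced here, so this is an assumption about its shape rather than the project's actual code. `streamReply` is a hypothetical helper, and the client is passed in so the loop can be tried with a stub. The commented wiring at the end shows how it could connect to the `express` and `ollama` packages.

```javascript
// Hypothetical helper: streams a chat reply from an Ollama-style client into res.
async function streamReply(client, prompt, res) {
  const stream = await client.chat({
    model: 'war-machine',
    messages: [{ role: 'user', content: prompt }],
    stream: true,      // yield partial responses as they are generated
    keep_alive: '30m', // keep the model loaded between requests
  });
  for await (const part of stream) {
    // With no Content-Length set, Node sends writes using chunked transfer encoding.
    res.write(part.message.content);
  }
  res.end();
}

// Assumed wiring with the real packages (left as comments so the sketch runs standalone):
//   import express from 'express';
//   import { Ollama } from 'ollama';
//   const ollama = new Ollama({ host: 'http://127.0.0.1:11434' }); // direct IPv4, no localhost lookup
//   const app = express();
//   app.use(express.json());
//   app.post('/ask', (req, res, next) =>
//     streamReply(ollama, req.body.prompt, res).catch(next));
//   app.listen(3000, '127.0.0.1');
```

Passing the client as a parameter is a design choice for this sketch only. It keeps the streaming loop independent of the network, so it can be checked with a fake client before Ollama is involved.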

📝 Example Response

The screenshot response.png in the repository root shows War-Machine in action. Notice the low-latency streaming and the character-driven persona.

Note: On an i5-1235U, streaming delivers the first token in ~1.2s, and the full response completes in under 9s.

📜 License

MIT - Created for the War-Machine Project. 🤖🦾
