On-device LLM inference for React Native, powered by Alibaba's MNN.
As of now this library only supports Android.
- Fast on-device LLM inference - Powered by the MNN engine
- React Native native modules - TurboModule architecture
- Streaming text generation - Real-time token-by-token output
- Conversation support - Multi-turn chat with history
- Optimized for mobile - ARM64 optimized with quantization support
- Type-safe API - Full TypeScript support
- Flexible configuration - Runtime config updates
npm install mnn.rn
# or
yarn add mnn.rn

import { createMnnLlmSession } from 'mnn.rn';
const session = createMnnLlmSession();
// Initialize
await session.init({
modelDir: '/sdcard/models/llama-3-8b',
maxNewTokens: 2048,
systemPrompt: 'You are a helpful AI assistant.',
keepHistory: true
});
// Generate with streaming - now returns Promise!
const metrics = await session.submitPrompt(
'Write a haiku about React Native',
true,
(chunk) => console.log(chunk), // Each token
(metrics) => console.log('Done!'), // Completion callback
(error) => console.error(error) // Errors
);
console.log('Generated', metrics.decodeLen, 'tokens');
// Clean up
await session.release();

See QUICK_START.md for detailed usage examples.
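When the full response is needed as one string rather than token by token, the streamed chunks can be accumulated. The following is a minimal sketch, not part of the library: `collectText`, `PromptSession`, and the local `LlmMetrics` shape are hypothetical names introduced here, and the interface only mirrors the `submitPrompt` call shape used in the example above.

```typescript
// Hypothetical minimal metrics shape; `decodeLen` is the only field this README names.
interface LlmMetrics {
  decodeLen: number;
}

// Structural subset of a session, mirroring the submitPrompt call used in this README.
interface PromptSession {
  submitPrompt(
    prompt: string,
    keepHistory: boolean,
    onChunk?: (chunk: string) => void,
    onComplete?: (metrics: LlmMetrics) => void,
    onError?: (error: string) => void
  ): Promise<LlmMetrics>;
}

// Hypothetical helper: gathers every streamed chunk and resolves with the joined text.
async function collectText(
  session: PromptSession,
  prompt: string,
  keepHistory = true
): Promise<{ text: string; metrics: LlmMetrics }> {
  const parts: string[] = [];
  const metrics = await session.submitPrompt(prompt, keepHistory, (chunk) => {
    parts.push(chunk);
  });
  return { text: parts.join(''), metrics };
}
```

Assuming a session created with `createMnnLlmSession()` satisfies this shape, usage would look like `const { text } = await collectText(session, 'Write a haiku about React Native');`.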
// Create session
const session = createMnnLlmSession();
// Initialize with model
await session.init({
modelDir: string,
maxNewTokens?: number,
systemPrompt?: string,
keepHistory?: boolean
});
// Release resources
await session.release();

// Streaming with callbacks AND Promise (recommended)
const metrics = await session.submitPrompt(
prompt: string,
keepHistory: boolean,
onChunk?: (chunk: string) => void,
onComplete?: (metrics: LlmMetrics) => void,
onError?: (error: string) => void
): Promise<LlmMetrics>
// Conversation with history
const metrics = await session.submitWithHistory(
messages: LlmMessage[],
onChunk?: (chunk: string) => void,
onComplete?: (metrics: LlmMetrics) => void,
onError?: (error: string) => void
): Promise<LlmMetrics>
// Stop generation
await session.stop();

// Update settings at runtime
await session.updateMaxNewTokens(512);
await session.updateSystemPrompt('You are a helpful assistant.');
await session.updateConfig(JSON.stringify({ temperature: 0.7 }));
// Manage conversation
await session.clearHistory();
await session.reset();
// Stop ongoing generation
await session.stop();

See API.md for complete API reference.
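For `submitWithHistory`, the caller owns the message list, so it can help to cap its length before each call. The sketch below is an illustration only: the `ChatMessage` shape (`role` plus `content`) is an assumption, since the actual `LlmMessage` type is defined in API.md, and `trimHistory` is a hypothetical helper.

```typescript
// Assumed message shape; check API.md for the library's real LlmMessage type.
type Role = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical helper: keeps all system messages (moved to the front)
// plus the most recent `maxMessages` user/assistant messages.
function trimHistory(messages: ChatMessage[], maxMessages: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');
  // slice(-0) would return the whole array, so handle zero explicitly.
  const kept = maxMessages > 0 ? rest.slice(-maxMessages) : [];
  return [...system, ...kept];
}
```

The trimmed array could then be passed as the `messages` argument, which keeps prompt length bounded on memory-constrained devices.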
Run the included example:
cd example
npm install
npm run android

Features demonstrated:
- Model initialization
- Real-time streaming
- Token counter
- Performance metrics
- Conversation history
- Example prompts
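A token counter and a throughput figure like those listed above can be derived from `metrics.decodeLen` and a wall-clock measurement taken around the call. This is a rough sketch with a hypothetical helper name, not how the example app necessarily computes it; timing the whole call also includes prompt processing, so the result understates pure decode speed.

```typescript
// Hypothetical helper: tokens per second from a token count and elapsed milliseconds.
function tokensPerSecond(decodeLen: number, elapsedMs: number): number {
  if (elapsedMs <= 0) {
    return 0; // Avoid division by zero for instantaneous or invalid timings.
  }
  return (decodeLen * 1000) / elapsedMs;
}
```

In an app, `elapsedMs` could be the difference between two `Date.now()` readings taken before and after `await session.submitPrompt(...)`.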
┌─────────────────────────┐
│ React Native App        │  TypeScript API
├─────────────────────────┤
│ TurboModule Bridge      │  React Native Bridge
├─────────────────────────┤
│ Kotlin Module           │  Session Management
├─────────────────────────┤
│ JNI Layer               │  Callback Bridge
├─────────────────────────┤
│ C++ LlmSession          │  MNN Wrapper
├─────────────────────────┤
│ libMNN.so               │  Inference Engine
└─────────────────────────┘
- Convert your model to MNN format using MNN tools
- Place on device:
adb push /path/to/model /sdcard/models/your-model/
- Model structure:
/sdcard/models/your-model/
├── model.mnn
├── tokenizer.txt
└── config.json
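Before calling `init()`, it can be useful to check that the model directory contains the three files shown in the structure above. The sketch below is a pure function over a list of file names; `missingModelFiles` is a hypothetical helper, and how the directory listing is obtained (a file-system module, or the output of `adb shell ls`) is left to the caller.

```typescript
// File names taken from the model structure shown in this README.
const REQUIRED_MODEL_FILES = ['model.mnn', 'tokenizer.txt', 'config.json'] as const;

// Hypothetical helper: returns the required files that are absent from a directory listing.
function missingModelFiles(present: readonly string[]): string[] {
  const have = new Set(present);
  return REQUIRED_MODEL_FILES.filter((name) => !have.has(name));
}
```

An empty result means all expected files were found; anything else names what still needs to be pushed to the device.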
- React Native 0.71+
- Android:
- NDK r21+
- Gradle 8.0+
- ARM64 device (arm64-v8a)
- iOS: Coming soon
"Session is not initialized"
- Solution: Call init() before using the session
"Model not found"
- Solution: Verify the model path with adb shell ls /sdcard/models/your-model
Slow performance
- Solution: Use quantized models (4-bit or 8-bit)
- Solution: Reduce maxNewTokens
- Solution: Use a smaller model
Out of memory
- Solution: Use a smaller model
- Solution: Clear history more frequently
- Solution: Close other apps
See QUICK_START.md for more details.
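The first two entries above can be surfaced in an app by mapping error strings to hints inside an `onError` callback. This sketch assumes the error string passed to `onError` contains the quoted messages verbatim, which this README does not confirm; `hintForError` and `HINTS` are hypothetical names.

```typescript
// Error fragments and fixes copied from the troubleshooting list in this README.
// Assumption: the library's error strings contain these fragments verbatim.
const HINTS: Array<{ match: string; hint: string }> = [
  { match: 'Session is not initialized', hint: 'Call init() before using the session.' },
  { match: 'Model not found', hint: 'Verify the model path with adb shell ls /sdcard/models/your-model.' },
];

// Hypothetical helper: returns a hint for a known error, or undefined if none matches.
function hintForError(error: string): string | undefined {
  const entry = HINTS.find((h) => error.includes(h.match));
  return entry?.hint;
}
```

It could be wired in as `(error) => console.error(error, hintForError(error) ?? '')`.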
- Quick Start Guide - Get started quickly
- API Reference - Complete API documentation
- Architecture - System design
- Implementation Plan - Development guide
Contributions are welcome! Please read our Contributing Guide.
MIT License - see LICENSE file for details.
- MNN - Mobile Neural Network inference framework
- React Native team for TurboModule architecture
- GitHub Issues: Report bugs or request features
- Documentation: See files above
- Example App: Run cd example && npm run android
Built with ❤️ by Naved Merchant