GPU constrained? No more. Microsoft released Phi-3 Mini specifically for memory- and compute-constrained environments. The model supports the ONNX Runtime on CPU, which delivers fast inference even on mobile CPUs.

RAG System with Microsoft Phi-3 Mini

In this work we harness Microsoft Phi-3 Mini (3.8B) on the ONNX Runtime CPU backend. We build a PDF Q/A (RAG) system with nomic-embed-text-v1 as the embedding model and FAISS as the vector DB.
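To make the indexing stage concrete, here is a minimal sketch assuming pypdf, sentence-transformers, and faiss-cpu are installed; the file name and chunk size are placeholders, not the exact values used in this repository.

```python
# A minimal sketch of the indexing stage. Assumptions: pypdf, sentence-transformers and
# faiss-cpu are installed; the file name and chunk size are illustrative only.
import faiss
import numpy as np
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer

# 1. Parse the PDF into plain text and split it into fixed-size chunks.
reader = PdfReader("document.pdf")
text = " ".join(page.extract_text() or "" for page in reader.pages)
chunks = [text[i:i + 1000] for i in range(0, len(text), 1000)]

# 2. Embed every chunk with nomic-embed-text-v1 (requires trust_remote_code).
embedder = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)
embeddings = embedder.encode(chunks, normalize_embeddings=True)

# 3. Store the embeddings in a FAISS index (inner product on normalized vectors = cosine).
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))
faiss.write_index(index, "chunks.index")
```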

File Structure

  • pre_processing.py: Code for parsing the PDF, creating the embeddings, and building the vector DB (the indexing sketch above illustrates this step).
  • application.ipynb: Notebook that builds the PDF Q/A pipeline (a retrieval sketch follows this list).
  • app.py: Code for the Gradio application. The app is hosted on a Hugging Face Space.
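At question time the pipeline embeds the query with the same model, looks up the nearest chunks in FAISS, and passes them to the LLM as context. A minimal retrieval sketch, continuing the assumptions of the indexing sketch above (k and file names are illustrative, not taken from this repository):

```python
# A minimal sketch of the retrieval step at question time; continues the indexing sketch
# above. The value of k and the index file name are illustrative assumptions.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)
index = faiss.read_index("chunks.index")

def retrieve_context(question: str, chunks: list[str], k: int = 3) -> str:
    """Return the k chunks most similar to the question, joined into one context string."""
    query_vec = embedder.encode([question], normalize_embeddings=True)
    _, ids = index.search(np.asarray(query_vec, dtype="float32"), k)
    return "\n\n".join(chunks[i] for i in ids[0])
```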

Frameworks

  • LLM: Phi-3 Mini
  • Embedding Model: nomic-embed-text-v1
  • Vector DB: FAISS
  • Application: Gradio (a minimal front-end sketch follows this list)
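A hedged sketch of how the Gradio front end might wrap the pipeline; answer_question is a hypothetical stand-in for the actual retrieval + generation logic in app.py.

```python
# A minimal sketch of the Gradio front end; answer_question is a hypothetical stand-in
# for the actual retrieval + Phi-3 generation code in app.py.
import gradio as gr

def answer_question(question: str) -> str:
    # Placeholder: retrieve context from FAISS, prompt Phi-3 Mini, return the answer.
    return "..."

demo = gr.Interface(
    fn=answer_question,
    inputs=gr.Textbox(label="Ask a question about the PDF"),
    outputs=gr.Textbox(label="Answer"),
    title="Phi-3 Mini PDF Q/A",
)
demo.launch()
```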

How to Run

  • Install the libraries with make install
  • Prepare Phi-3 Mini with the ONNX CPU runtime on Linux with make phi3_dependency
  • To run the app, execute python app.py (a generation sketch follows this list)
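Under the hood, answering a question comes down to prompting the quantized Phi-3 Mini ONNX model on CPU. A minimal generation sketch, loosely following Microsoft's onnxruntime-genai examples; the model path, prompt template, and search options are illustrative, and the exact API differs between onnxruntime-genai versions:

```python
# A minimal sketch of CPU generation with the quantized Phi-3 Mini ONNX model, loosely
# following Microsoft's onnxruntime-genai examples; model path, prompt template and
# search options are illustrative, and API details vary between library versions.
import onnxruntime_genai as og

model = og.Model("phi3-mini-4k-instruct-onnx/cpu_and_mobile/cpu-int4-rtn-block-32")
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

context = "...retrieved chunks go here..."
question = "What is this document about?"
prompt = f"<|user|>\n{context}\n\n{question}<|end|>\n<|assistant|>"

params = og.GeneratorParams(model)
params.set_search_options(max_length=1024)
params.input_ids = tokenizer.encode(prompt)

# Stream tokens one at a time, which keeps latency low even on a laptop CPU.
generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    print(tokenizer_stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```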

Acknowledgement

  • Microsoft for open-sourcing the quantized Phi-3 Mini along with ONNX Runtime support.
  • Hugging Face for all the educational and open-source resources.

If you find the repo helpful, please drop a ⭐
