In the previous module, we used Gemini 1.5 Flash via the Google API. It's a very convenient way to use an LLM, but you have to pay for usage, and you don't have control over which model you get to use.
In this module, we'll look at using open-source LLMs instead.
YouTube Class: 2.1 - Introduction to Open-Source
- Open-Source LLMs
- Replacing the LLM box in the RAG flow
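Conceptually, the only part of the RAG flow that changes in this module is the LLM "box": retrieval and prompt building stay the same, and we swap the API-based model for an open-source one. A minimal sketch, where `search` and `build_prompt` are placeholders for the pieces built in the previous module:

```python
def rag(query, llm):
    # Retrieval and prompt building are unchanged from the previous module;
    # search() and build_prompt() stand in for the code built there.
    results = search(query)
    prompt = build_prompt(query, results)
    # Only the LLM box is swapped: llm can call the Google API,
    # a HuggingFace model, or a local Ollama model.
    return llm(prompt)
```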
YouTube Class: 2.2 - Using SaturnCloud for GPU Notebooks
- Registering in Saturn Cloud
- Configuring secrets and git
- Creating an instance with a GPU
Bonus: Using Google Colab for GPU Notebooks
This is my personal choice!
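Whichever GPU notebook environment you pick, it's worth verifying that the notebook actually sees the GPU before loading a model. A quick check with PyTorch:

```python
import torch

# Should print True and the GPU name (e.g. a T4 on the Colab free tier)
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```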
YouTube Class: 2.3 - HuggingFace and Google FLAN T5
- HuggingFace Model: google/flan-t5-xl
- Jupyter Notebook: Model_FlanT5.ipynb
- Reference: huggingface.co/google/flan-t5-xl
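Loading FLAN-T5 follows the standard seq2seq pattern from the model card. A minimal sketch, assuming `accelerate` is installed for `device_map="auto"` (the prompt is illustrative):

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xl")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl", device_map="auto")

# FLAN-T5 is an encoder-decoder model: encode a prompt, then generate
input_ids = tokenizer(
    "translate English to German: How old are you?", return_tensors="pt"
).input_ids.to(model.device)
outputs = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```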
YouTube Class: 2.4 - Phi 3 Mini
- HuggingFace Model: microsoft/Phi-3-mini-128k-instruct
- Reference: huggingface.co/microsoft/Phi-3-mini-128k-instruct
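Phi-3 Mini is an instruct model, so the model card uses the chat-style text-generation pipeline. Roughly, following the card (the question is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "microsoft/Phi-3-mini-128k-instruct"
# trust_remote_code is required for this model's custom code
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="cuda", torch_dtype="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
messages = [{"role": "user", "content": "What is retrieval-augmented generation?"}]
output = pipe(messages, max_new_tokens=256, return_full_text=False)
print(output[0]["generated_text"])
```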
YouTube Class: 2.5 - Mistral-7B and HuggingFace Hub Authentication
- HuggingFace Model: mistralai/Mistral-7B-v0.1
- Reference: huggingface.co/mistralai/Mistral-7B-v0.1
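Mistral-7B is a gated model on the Hub, so you need to accept the terms on the model page and authenticate before downloading. A sketch using a `huggingface_hub` login with a token read from an `HF_TOKEN` environment variable (the variable name is a common convention, not required):

```python
import os
from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer

# Authenticate with a HuggingFace access token (created in account settings)
login(token=os.environ["HF_TOKEN"])

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto", torch_dtype="auto"
)

# Mistral-7B-v0.1 is a base (non-instruct) model, so it completes text
inputs = tokenizer("The key idea of RAG is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```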
YouTube Class: 2.6 - Exploring Open-Source Models
YouTube Class: 2.7 - Running LLMs Locally without a GPU with Ollama
Install and start Ollama:

```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
```
Pull a model (only once) and run it locally:

```bash
ollama pull phi3
ollama run phi3
```
This opens an interactive chat with the model in the command-line interface.
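Ollama also exposes an OpenAI-compatible API on localhost:11434, so you can query the local model programmatically with the `openai` Python client instead of the interactive CLI. A minimal sketch:

```python
from openai import OpenAI

# Point the OpenAI client at the local Ollama server
client = OpenAI(
    base_url="http://localhost:11434/v1/",
    api_key="ollama",  # Ollama ignores the key, but the client requires one
)

response = client.chat.completions.create(
    model="phi3",
    messages=[{"role": "user", "content": "What is retrieval-augmented generation?"}],
)
print(response.choices[0].message.content)
```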
- Intro_RAG.ipynb: RAG system using the local phi3 model
- Pros: open-source models and code, completely free
- Cons: runs very slowly and is probably not scalable