# Getting Started with Open-Source LLMs Using Python

Large Language Models (LLMs) such as LLaMA, Mistral, and Gemma are now available as open-source projects. You can run them **locally** using tools such as **Ollama**, **Hugging Face Transformers**, or **LM Studio**, all directly from Python.

This guide explains how to start working with open-source LLMs using **Ollama** and **Hugging Face**.

## 1. What Are Open-Source LLMs?

Open-source LLMs are models whose **weights** and **inference code** are publicly available. Unlike proprietary models (e.g., GPT-4), you can:

* Run them **offline**
* Fine-tune them on your data
* Embed them into your own apps or research projects

Popular families include:

* **LLaMA / Meta**
* **Mistral / Mixtral**
* **Gemma / Google**
* **Falcon / TII**
* **Qwen / Alibaba**

## 2. Choosing the Right Model

Public leaderboards are a useful starting point for comparing models:

* **English**: https://www.vellum.ai/llm-leaderboard
* **Persian**: https://huggingface.co/spaces/MCINext/mizan-llm-leaderboard

<div style="text-align: center;">
    <img src="images/mizan.png" width="90%"></img>
</div>

**Note**: Leaderboard rankings do not always transfer to your use case. Evaluate candidate models on your own data, because a model's performance can vary considerably from one domain to another.
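
One way to act on this advice is a tiny evaluation harness, sketched below. It assumes you can wrap each candidate model in a function that maps a prompt to a string; `toy_model` and the two example pairs are hypothetical stand-ins, and exact-match scoring is only one of many possible metrics.

```python
from typing import Callable, Iterable, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(
    generate: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> float:
    """Return the fraction of prompts whose normalized answer equals the expected one."""
    examples = list(examples)
    if not examples:
        raise ValueError("examples must not be empty")
    correct = sum(
        normalize(generate(prompt)) == normalize(expected)
        for prompt, expected in examples
    )
    return correct / len(examples)


# Hypothetical stand-in for a real model call (replace with Ollama or Transformers).
def toy_model(prompt: str) -> str:
    answers = {"Capital of France?": "Paris", "2 + 2 = ?": "5"}
    return answers.get(prompt, "")


# Illustrative (prompt, expected answer) pairs; use samples from your own domain.
examples = [("Capital of France?", "paris"), ("2 + 2 = ?", "4")]
print(exact_match_accuracy(toy_model, examples))  # 0.5
```

Running the same `examples` through several candidate models gives a like-for-like comparison on the data you actually care about.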

## 3. Running LLMs Locally with Ollama

[Ollama](https://ollama.com) provides a simple way to **download and run** open models locally, on either a GPU or a CPU.

### Installation

```bash
# Linux / macOS
curl -fsSL https://ollama.com/install.sh | sh

# Windows (PowerShell)
winget install Ollama.Ollama
```

Verify installation:

```bash
ollama --version
```
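
The same check can be done from Python, which is handy at the top of a script that depends on Ollama. This is a minimal sketch using only the standard library; the helper name `ollama_version` is made up for illustration.

```python
import shutil
import subprocess
from typing import Optional


def ollama_version() -> Optional[str]:
    """Return the output of `ollama --version`, or None if the CLI is not on PATH."""
    exe = shutil.which("ollama")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, timeout=10
    )
    # Fall back to stderr in case the CLI writes its message there.
    return result.stdout.strip() or result.stderr.strip()


version = ollama_version()
if version is None:
    print("Ollama CLI not found; install it first.")
else:
    print(version)
```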

---

### Running a Model from the CLI

List the models already downloaded to your machine:

```bash
ollama list
```

Download and run a model (e.g., Gemma 2B):

```bash
ollama pull gemma:2b
ollama run gemma:2b
```

---

### Using Ollama in Python

Install the official Python client:

```bash
pip install ollama
```

Example code:

```python
from ollama import chat

response = chat(model='gemma:2b', messages=[
  {'role': 'system', 'content': 'You are a computer science expert.'},
  {'role': 'user', 'content': 'Explain the difference between AI and machine learning.'}
])

print(response['message']['content'])
```

Ollama automatically caches models locally. You can integrate it into chat UIs, retrieval systems (RAG), or custom agents.
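
As a starting point for a chat UI, the sketch below keeps the conversation history between turns so the model sees earlier messages. `ChatSession` and its `chat_fn` parameter are hypothetical names introduced here, not part of the Ollama client; the default `chat_fn` simply forwards to `ollama.chat` as in the example above.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]


def _ollama_chat(model: str, messages: List[Message]):
    # Imported lazily so the class can be exercised without the client installed.
    from ollama import chat
    return chat(model=model, messages=messages)


class ChatSession:
    """Keeps the running message list so each turn sees the earlier ones."""

    def __init__(self, model: str, system_prompt: Optional[str] = None,
                 chat_fn: Callable = _ollama_chat):
        self.model = model
        self.chat_fn = chat_fn
        self.messages: List[Message] = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def ask(self, user_text: str) -> str:
        """Send one user turn and record both the question and the reply."""
        self.messages.append({"role": "user", "content": user_text})
        response = self.chat_fn(self.model, self.messages)
        reply = response["message"]["content"]
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Usage (requires a running Ollama server and a pulled model):
# session = ChatSession("gemma:2b", system_prompt="You are a computer science expert.")
# print(session.ask("What is machine learning?"))
# print(session.ask("How does it differ from AI in general?"))
```

Passing `chat_fn` in makes the class easy to test with a fake backend, or to point at a different provider later.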

---

## 4. Using Hugging Face Transformers

[Hugging Face Transformers](https://huggingface.co/transformers) is one of the most widely used Python libraries for working with LLMs.
It provides APIs to load, run, and fine-tune models from the **Hugging Face Hub**.

### Installation

```bash
pip install transformers torch accelerate
```

---

### Example: Text Generation

```python
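# Use a pipeline as a high-level helper
from transformers import pipeline

# Gemma models on the Hub are typically gated: accept the license and log in first
pipe = pipeline("text-generation", model="google/gemma-3-270m-it")
messages = [
    {"role": "user", "content": "Who is the president of the United States?"},
]
outputs = pipe(messages, max_new_tokens=64)
# For chat-style input, generated_text holds the conversation; the last message is the reply
print(outputs[0]["generated_text"][-1]["content"])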
```
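
The text-generation pipeline returns a nested structure rather than a plain string, and its shape depends on whether the input was a string or a list of chat messages. The helper below is a small sketch that handles both cases, assuming the output layout of recent Transformers releases; `extract_reply` is a made-up name and `sample_outputs` is hand-written to mimic the shape, not real model output.

```python
from typing import Any, Dict, List


def extract_reply(outputs: List[Dict[str, Any]]) -> str:
    """Return the generated text from the first result of a text-generation pipeline."""
    generated = outputs[0]["generated_text"]
    if isinstance(generated, str):
        # Plain-string prompt: the pipeline returns the text directly.
        return generated
    # Chat-style prompt: a list of messages, the last one being the model's reply.
    return generated[-1]["content"]


# Hand-written stand-in that mimics the output shape of a chat-style call.
sample_outputs = [
    {
        "generated_text": [
            {"role": "user", "content": "Say hello."},
            {"role": "assistant", "content": "Hello!"},
        ]
    }
]
print(extract_reply(sample_outputs))  # Hello!
```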

---

## 5. Authentication and Model Access

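Some models on the Hugging Face Hub (Gemma and LLaMA, for example) are gated: you must accept their license on the model page and then authenticate with an access token.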

```bash
huggingface-cli login
```

Or, from Python:

```python
from huggingface_hub import login

login(token="YOUR_HF_TOKEN")  # placeholder: replace with your own access token
```

For **Ollama**, no authentication is needed by default.
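
For the Hugging Face login, hard-coding a token in source files makes it easy to leak. A common alternative, sketched here, is to read it from the `HF_TOKEN` environment variable; the helper names `read_hf_token` and `login_from_env` are made up for illustration.

```python
import os
from typing import Mapping, Optional


def read_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the token stored in HF_TOKEN, or raise if it is missing or empty."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "Set the HF_TOKEN environment variable to your Hugging Face access token."
        )
    return token


def login_from_env() -> None:
    """Authenticate with the Hub using the token from the environment."""
    # Imported lazily so read_hf_token can be used without huggingface_hub installed.
    from huggingface_hub import login
    login(token=read_hf_token())


# Usage: run `export HF_TOKEN=...` in your shell, then call login_from_env()
```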

---

## 6. Other Useful Tools

| Tool                                | Description                                      |
| ----------------------------------- | ------------------------------------------------ |
| **LM Studio**                       | GUI for running and testing open models          |
| **vLLM**                            | High-performance inference engine                |
| **LangChain**                       | Framework for building LLM apps                  |
| **llama.cpp**                       | C++ backend for running quantized models locally |

---

## References

* [Ollama Models](https://ollama.com/library)
* [Hugging Face Pipelines](https://huggingface.co/docs/transformers/v4.57.1/en/main_classes/pipelines)
* [LangChain Docs](https://docs.langchain.com/oss/python/langchain/overview)

## Appendix: Free API Providers for Quick Testing

+ OpenRouter
+ Google AI Studio (aistudio.google.com)
