### Project structure 

**controllers/**
Purpose: Implements the Controller part of the MVC architecture. These files handle user input and manage interactions between the Model and View.

**models/**
Purpose: Implements the Model part of the MVC architecture. These files define the data structure and interact with the database or other data sources.

**helpers/**
Purpose: Contains utility functions or classes that assist other components in the project.

**enums/**
Purpose: Defines enumerations (named constants) for managing states or types.

**routes/**
Purpose: Defines the View part of the application, routing user requests to the appropriate controllers and returning responses.

**schemes/**
Purpose: Contains schemas for validating and serializing API request/response data.


```text
Project Root/
├── docker/
│   ├── .gitignore
│   └── docker-compose.yml
├── src/
│   ├── assets/
│   │   ├── .gitignore
│   │   ├── .gitkeep
│   │   └── mini-rag-app.postman_collection.json
│   ├── controllers/
│   │   ├── BaseController.py
│   │   ├── DataController.py
│   │   ├── ProcessController.py
│   │   ├── ProjectController.py
│   │   └── __init__.py
│   ├── helpers/
│   │   ├── __init__.py
│   │   └── config.py
│   ├── models/
│   │   ├── db_schemes/
│   │   │   ├── __init__.py
│   │   │   ├── data_chunk.py
│   │   │   └── project.py
│   │   ├── enums/
│   │   │   ├── DataBaseEnum.py
│   │   │   ├── ProcessingEnum.py
│   │   │   ├── ResponseEnums.py
│   │   │   └── __init__.py
│   │   ├── BaseDataModel.py
│   │   ├── ChunkModel.py
│   │   ├── ProjectModel.py
│   │   └── __init__.py
│   ├── routes/
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── data.py
│   │   └── schemes/
│   │       ├── __init__.py
│   │       ├── base.py
│   │       └── data.py
│   ├── .env.example
│   ├── .gitignore
│   ├── main.py
│   └── requirements.txt
├── LICENSE
└── README.md
```

### Conda 


In [None]:
# Start Conda
source ~/miniconda3/bin/activate 

# Create new environment
conda create -n mini_rag python=3.8

# Start environment
conda activate mini_rag

# View environments
conda info --envs


### .gitignore template

In [None]:
https://github.com/github/gitignore/blob/main/Python.gitignore

### .env

**Make sure to add `.env` to `.gitignore`.**

**Create a `.env.example` file that has the same structure as `.env`.**

In [None]:
# requirements
python-dotenv

# Load the variables into the process environment
import os
from dotenv import load_dotenv
load_dotenv(".env")

# Use the variables
app_name = os.getenv('APP_NAME')
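
One way to keep `.env.example` in step with `.env` is to compare the variable names in both files. The following is a minimal sketch using only the standard library; `env_keys` and `missing_from_example` are hypothetical helper names, and the parser only handles simple `KEY=value` lines.

```python
from pathlib import Path


def env_keys(path):
    """Return the set of variable names defined in a dotenv-style file."""
    keys = set()
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        if "=" in line:
            keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_from_example(env_path=".env", example_path=".env.example"):
    """Names present in .env but absent from .env.example."""
    return env_keys(env_path) - env_keys(example_path)
```

Calling `missing_from_example()` from the directory that holds both files returns an empty set when the two files define the same names.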

### FastAPI
- Used to build the API.
- Create a main file whose only responsibility is to include the routes.
- Create a `routes` directory and add the route modules to it.
- Give each API router a prefix and a tag.
- Use `JSONResponse` together with `status`; a JSON response lets you set the status code and the response content.
- Use `on_event` handlers (startup, shutdown) to create the DB connections and load environment variables.
- Use `Request` to read the objects saved on the app in the `on_event` handlers.
- `Depends` lets you declare dependencies that run before your route function is processed; their results are passed to the function as arguments automatically.

Field

In [None]:
# Requirements
fastapi
uvicorn[standard]

In [None]:
# code example
from fastapi import FastAPI

# Create a FastAPI app instance
app = FastAPI()

# Define a simple route
@app.get("/")
async def read_root():
    return {"message": "Hello, FastAPI!"}

# Add another route for greeting a user
@app.get("/greet/{name}")
async def greet_user(name: str):
    return {"message": f"Hello, {name}! Welcome to FastAPI."}


In [None]:
# Check documentation
http://127.0.0.1:8000/docs

In [None]:
# Best practice : 
# - Create a main file , its responsibility is to only include the routes
# - Use on event  to create the db connections and load env variables

from fastapi import FastAPI
from routes import base, data
from motor.motor_asyncio import AsyncIOMotorClient
from helpers.config import get_settings

app = FastAPI()

@app.on_event("startup")
async def startup_db_client():
    settings = get_settings()
    app.mongo_conn = AsyncIOMotorClient(settings.MONGODB_URL)
    app.db_client = app.mongo_conn[settings.MONGODB_DATABASE]

@app.on_event("shutdown")
async def shutdown_db_client():
    app.mongo_conn.close()


app.include_router(base.base_router)
app.include_router(data.data_router)


In [None]:
# - Create a routes directory and add the route modules to it
# - Give the API router a prefix and a tag
# - Use JSONResponse and status

import os
from fastapi import APIRouter, status
from fastapi.responses import JSONResponse

base_router = APIRouter(
    prefix="/api/v1",
    tags=["api_v1"],
)

@base_router.get("/")
async def welcome():
    app_name = os.getenv('APP_NAME')
    app_version = os.getenv('APP_VERSION')

    # JSONResponse sets the status code and the body explicitly.
    # Error responses follow the same pattern, e.g. status.HTTP_400_BAD_REQUEST
    # with content={"signal": ResponseSignal.FILE_TYPE_NOT_SUPPORTED.value}
    return JSONResponse(
        status_code=status.HTTP_200_OK,
        content={
            "app_name": app_name,
            "app_version": app_version,
        }
    )





In [None]:
# - Use Request to read objects saved on the app in the startup event
# - Use Depends to declare dependencies that run before the route function is processed

from fastapi import APIRouter, Depends, UploadFile, Request
from helpers.config import get_settings, Settings
from models.ProjectModel import ProjectModel

# The prefix and tags here are illustrative
data_router = APIRouter(
    prefix="/api/v1/data",
    tags=["api_v1", "data"],
)

@data_router.post("/upload/{project_id}")
async def upload_data(request: Request, project_id: str, file: UploadFile,
                      app_settings: Settings = Depends(get_settings)):

    # request.app is the FastAPI instance, so the db_client
    # created in the startup event is available here
    project_model = ProjectModel(
        db_client=request.app.db_client
    )



### UVICORN

In [None]:
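# filename is the Python module (without .py) that defines `app`
uvicorn filename:app --reload

# uvicorn ignores --workers while --reload is enabled, so pick one:
# --reload for development, --workers for multi-process serving
uvicorn app:app --host 0.0.0.0 --port 8080 --workers 4 --log-level debug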


### Pydantic
- We can use pydantic-settings to validate environment variables, so a missing `.env` entry or a value of the wrong type raises an error when the settings are loaded.
- We can use pydantic to validate API request parameters.



In [None]:
# requirements
pydantic-settings


# Creation
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):

    APP_NAME: str
    APP_VERSION: str
    OPENAI_API_KEY: str

    FILE_ALLOWED_TYPES: list
    FILE_MAX_SIZE: int
    FILE_DEFAULT_CHUNK_SIZE: int

    MONGODB_URL: str
    MONGODB_DATABASE: str

    # Read values from .env (pydantic-settings v2 style configuration)
    model_config = SettingsConfigDict(env_file=".env")

def get_settings():
    return Settings()


# Usage 
app_settings = get_settings()

In [None]:
# requirements
pydantic


# Creation
from pydantic import BaseModel
from typing import Optional

class ProcessRequest(BaseModel):
    file_id: str
    chunk_size: Optional[int] = 100
    overlap_size: Optional[int] = 20
    do_reset: Optional[int] = 0


# Usage 
@data_router.post("/process/{project_id}")
async def process_endpoint(request: Request, project_id: str, process_request: ProcessRequest):

    file_id = process_request.file_id
    chunk_size = process_request.chunk_size
    overlap_size = process_request.overlap_size
    do_reset = process_request.do_reset
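
To see what this validation does, the model can be exercised directly without FastAPI. The following is a small sketch, assuming pydantic is installed; the sample values are made up. When a request body fails these checks, FastAPI responds with a validation error instead of calling the endpoint.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class ProcessRequest(BaseModel):
    file_id: str
    chunk_size: Optional[int] = 100
    overlap_size: Optional[int] = 20
    do_reset: Optional[int] = 0


# Omitted fields fall back to their defaults
ok = ProcessRequest(file_id="abc123", chunk_size=50)
print(ok.chunk_size, ok.overlap_size, ok.do_reset)

# A missing required field or a wrong type raises ValidationError
try:
    ProcessRequest(chunk_size="not a number")
except ValidationError as exc:
    # Each error records the name of the field that failed
    failed_fields = sorted(err["loc"][0] for err in exc.errors())
    print(failed_fields)
```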


### Controller
- These files handle user input and manage interactions between the Model and the View.
- We create a base controller that holds the functions most other controllers will use.


In [None]:
import os
from helpers.config import get_settings

class BaseController:
    
    def __init__(self):

        self.app_settings = get_settings()  # We loaded the env variables here
        
        self.base_dir = os.path.dirname( os.path.dirname(__file__) )
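
Since `BaseController.py` lives in `controllers/`, the two nested `os.path.dirname` calls make `base_dir` point at the parent of that folder (`src/` in the tree above), so other paths can be built from it. Here is a sketch of how a controller might derive a per-project upload folder; the `assets/files` location and the `project_dir` helper are assumptions for illustration.

```python
import os


def project_dir(base_dir, project_id):
    """Return the upload folder for a project, creating it if needed."""
    path = os.path.join(base_dir, "assets", "files", project_id)
    # exist_ok avoids an error when the folder is already there
    os.makedirs(path, exist_ok=True)
    return path
```

Inside a controller this could be called as `project_dir(self.base_dir, project_id)`.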

In [None]:
from .BaseController import BaseController
import os

class DataController(BaseController):
    
    def __init__(self):
        super().__init__()

### Models
- These files define the data structure and interact with the database or other data sources.
- We create a base model that holds the functions most other models will use.
- We create enums that hold the constants used across the system, such as error messages and database collection names.
- Because models deal with the database, we also define database schemas so documents can be validated and keep a consistent structure.

In [None]:
from helpers.config import get_settings, Settings

class BaseDataModel:

    def __init__(self, db_client: object):
        self.db_client = db_client
        self.app_settings = get_settings()

In [None]:
from .BaseDataModel import BaseDataModel

class ProjectModel(BaseDataModel):

    def __init__(self, db_client: object):
        super().__init__(db_client=db_client)


In [None]:
from enum import Enum

# Creation
class DataBaseEnum(Enum):

    COLLECTION_PROJECT_NAME = "projects"
    COLLECTION_CHUNK_NAME = "chunks"

class ResponseSignal(Enum):

    FILE_VALIDATED_SUCCESS = "file_validate_successfully"
    FILE_TYPE_NOT_SUPPORTED = "file_type_not_supported"
    FILE_SIZE_EXCEEDED = "file_size_exceeded"


# Usage
ResponseSignal.FILE_TYPE_NOT_SUPPORTED.value
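
Enum members like these let a controller return a named signal instead of a free-form string. Below is a minimal sketch of an upload check; `validate_file`, its parameters, and the example limits are hypothetical, and the size limit is assumed to be in megabytes.

```python
from enum import Enum


class ResponseSignal(Enum):
    FILE_VALIDATED_SUCCESS = "file_validate_successfully"
    FILE_TYPE_NOT_SUPPORTED = "file_type_not_supported"
    FILE_SIZE_EXCEEDED = "file_size_exceeded"


def validate_file(content_type, size_bytes, allowed_types, max_size_mb):
    """Return (is_valid, signal value) for an uploaded file."""
    if content_type not in allowed_types:
        return False, ResponseSignal.FILE_TYPE_NOT_SUPPORTED.value
    if size_bytes > max_size_mb * 1024 * 1024:
        return False, ResponseSignal.FILE_SIZE_EXCEEDED.value
    return True, ResponseSignal.FILE_VALIDATED_SUCCESS.value


print(validate_file("text/plain", 2048, ["text/plain", "application/pdf"], 10))
```

The returned signal value can go straight into the `content` of a `JSONResponse`, which keeps response strings defined in one place.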

In [None]:
from pydantic import BaseModel, ConfigDict, Field, field_validator
from typing import Optional
from bson.objectid import ObjectId

class DataChunk(BaseModel):
    # ObjectId is not a pydantic type, so arbitrary types must be allowed
    model_config = ConfigDict(arbitrary_types_allowed=True)

    id: Optional[ObjectId] = Field(None, alias="_id")  # Optional
    chunk_text: str = Field(..., min_length=1)   # Mandatory
    chunk_metadata: dict  # Mandatory
    chunk_order: int = Field(..., gt=0)  # Mandatory, must be > 0
    chunk_project_id: ObjectId  # Mandatory


class Project(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)

    id: Optional[ObjectId] = Field(None, alias="_id")
    project_id: str = Field(..., min_length=1)

    # pydantic v2 spelling; in v1 this was @validator('project_id')
    @field_validator('project_id')
    @classmethod
    def validate_project_id(cls, value):
        if not value.isalnum():
            raise ValueError('project_id must be alphanumeric')

        return value

# Notes 
mvc  #
controller  #
base controller and how to call it in a child controller #

models #
enums & database enums #
responseenums and usage #
db schemas #

helper
pydantic settings , class and validation and usage  #
pydantic validating request parameters #
validator
_id and how to handle it

fast api json response & status code #
fast api on event #
fast api request  #
depends #
Field

os and paths

Logging vs sending error to user

Docker 
mongo db image 
Studio 3T
motor asyncio
paginate 
bulk (batch) write

async
