### Question 1
    Install uv
    What's the version of uv you installed?
    Use --version to find out

    Answer: I installed uv with pip install uv, then checked the version:
        uv --version
        uv 0.9.5


### Question 2
    Use uv to install Scikit-Learn version 1.6.1
    What's the first hash for Scikit-Learn you get in the lock file?
    Include the entire string starting with sha256:, don't include quotes

    [[package]]
    name = "scikit-learn"
    version = "1.6.1"
    source = { registry = "https://pypi.org/simple" }
    dependencies = [
        { name = "joblib" },
        { name = "numpy" },
        { name = "scipy" },
        { name = "threadpoolctl" },
    ]
    sdist = { url = "https://files.pythonhosted.org/packages/9e/a5/4ae3b3a0755f7b35a280ac90b28817d1f380318973cff14075ab41ef50d9/scikit_learn-1.6.1.tar.gz", hash = "sha256:b4fc2525eca2c69a59260f583c56a7557c6ccdf8deafdba6e060f94c1c59738e", size = 7068312, upload-time = "2025-01-10T08:07:55.348Z" }

    Answer: sha256:b4fc2525eca2c69a59260f583c56a7557c6ccdf8deafdba6e060f94c1c59738e

### Question 3
    Let's use the model!

    Write a script for loading the pipeline with pickle
    Score this record:
    {
        "lead_source": "paid_ads",
        "number_of_courses_viewed": 2,
        "annual_income": 79276.0
    }
    What's the probability that this lead will convert?

    0.333
    0.533
    0.733
    0.933

    import pickle
    from pathlib import Path

    # ---------- config ----------
    MODEL_PATH = Path("pipeline_v1.bin")

    record = {
        "lead_source": "paid_ads",
        "number_of_courses_viewed": 2,
        "annual_income": 79276.0,
    }
    choices = [0.333, 0.533, 0.733, 0.933]
    # ----------------------------

    def pick_option(prob, options):
        # nearest; if exactly in-between, choose the higher option
        # (equivalent to rounding to nearest, ties go up)
        best = options[0]
        best_dist = abs(prob - best)
        for opt in options[1:]:
            dist = abs(prob - opt)
            if dist < best_dist or (abs(dist - best_dist) < 1e-12 and opt > best):
                best = opt
                best_dist = dist
        return best

    def main():
        print("Loading model from:", MODEL_PATH.resolve())
        if not MODEL_PATH.exists():
            raise FileNotFoundError("pipeline_v1.bin not found in current folder")

        with open(MODEL_PATH, "rb") as f:
            model = pickle.load(f)

        # DictVectorizer inside the pipeline expects a list of dicts
        proba = model.predict_proba([record])[0, 1]
        print(f"Probability: {proba:.6f}")

        chosen = pick_option(proba, choices)
        print("Closest MCQ option:", chosen)

    if __name__ == "__main__":
        main()


    uv add scikit-learn==1.6.1
    uv run python -u main.py

    Probability: 0.533607
    Closest MCQ option: 0.533

### Question 4
    Now let's serve this model as a web service

    Install FastAPI
    Write FastAPI code for serving the model
    Now score this client using requests:
    url = "YOUR_URL"
    client = {
        "lead_source": "organic_search",
        "number_of_courses_viewed": 4,
        "annual_income": 80304.0
    }
    requests.post(url, json=client).json()
    What's the probability that this client will get a subscription?

        0.334
        0.534
        0.734
        0.934

### Answer: {'probability': 0.5340417283801275}

    Activated the virtual environment.
    Installed dependencies: pip install fastapi "uvicorn[standard]" scikit-learn==1.6.1 requests
    Started the FastAPI server: python -m uvicorn app:app --reload --host 0.0.0.0 --port 8000
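The `app.py` behind `app:app` is not shown above, so here is a minimal sketch of what it could look like. Everything in it is an assumption rather than the actual file: the `/predict` route, the helper names `load_pipeline`, `score_lead`, and `create_app`, and the choice to accept a plain dict as the request body. The response shape `{"probability": ...}` matches the answer above. FastAPI is imported inside `create_app` so the scoring helper can be exercised without the web dependency.

```python
import pickle
from pathlib import Path
from typing import Any

MODEL_PATH = Path("pipeline_v1.bin")  # same file name as in Question 3


def load_pipeline(path=MODEL_PATH):
    """Unpickle the scikit-learn pipeline (only unpickle files you trust)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def score_lead(model, lead):
    """Score one lead and wrap the result in the response shape shown above."""
    # The pipeline expects a list of dicts; column 1 is the positive class.
    proba = model.predict_proba([lead])[0][1]
    return {"probability": float(proba)}


def create_app(model):
    """Build the FastAPI app around an already loaded model."""
    from fastapi import FastAPI  # lazy import: only needed when serving

    app = FastAPI(title="lead-scoring")

    @app.post("/predict")  # hypothetical route name
    def predict(lead: dict[str, Any]):
        return score_lead(model, lead)

    return app


# In app.py, expose a module-level object so `uvicorn app:app` can find it:
# app = create_app(load_pipeline())
```

With the server running as above, the client from the question would be posted to `http://localhost:8000/predict` (again assuming that route name).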



### Docker
#### Install Docker. We will use it for the next two questions.

    For these questions, we prepared a base image: agrigorev/zoomcamp-model:2025. You'll need to use it (see Question 5 for an example).

    This image is based on python:3.13.5-slim-bookworm and has a pipeline with logistic regression (a different one) as well as a dictionary vectorizer inside.

    This is what the Dockerfile for this image looks like:

    FROM python:3.13.5-slim-bookworm
    WORKDIR /code
    COPY pipeline_v2.bin .

    We already built it and then pushed it to agrigorev/zoomcamp-model:2025.

    Note: You don't need to build this docker image, it's just for your reference.

### Answer
    1. I started Docker Desktop on Windows and verified from PowerShell that the daemon was running:

           docker info

       This returned information about the Docker Engine, confirming the daemon was running.

    2. I pulled the prepared base image from Docker Hub:

           docker pull agrigorev/zoomcamp-model:2025

    3. After the pull completed, I checked the local images:

           docker images agrigorev/zoomcamp-model:2025

       Example output I saw:

           REPOSITORY                 TAG    IMAGE ID       CREATED      SIZE
           agrigorev/zoomcamp-model   2025   14d79fde0bbf   9 days ago   181MB

    4. I started a container with an interactive bash shell:

           docker run --rm -it agrigorev/zoomcamp-model:2025 bash

    5. Inside the container I listed the /code directory:

           root@...:/code# ls -la /code
           total 12
           drwxr-xr-x 1 root root 4096 ...
           -rwxr-xr-x 1 root root 1296 Oct 21 07:50 pipeline_v2.bin

       This confirmed pipeline_v2.bin was present at /code inside the image.

    6. To get a repeatable image with FastAPI and scikit-learn installed and the model baked in, I created a Dockerfile like this (placed in the same directory as pipeline_v1.bin and my app.py):

           FROM python:3.13.5-slim-bookworm
           WORKDIR /code
           COPY pipeline_v1.bin .
           RUN pip install --no-cache-dir fastapi "uvicorn[standard]" scikit-learn==1.6.1
           COPY app.py .
           EXPOSE 8000
           CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

    7. Then I built and ran the image:

           docker build -t my-lead-scorer:v1 .
           docker run --rm -p 8000:8000 --name lead-scorer my-lead-scorer:v1







### Question 6
    Let's run your docker container!

    After running it, score this client once again:

    url = "YOUR_URL"
    client = {
        "lead_source": "organic_search",
        "number_of_courses_viewed": 4,
        "annual_income": 80304.0
    }
    requests.post(url, json=client).json()
    What's the probability that this lead will convert?

        0.39
        0.59
        0.79
        0.99
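Once the container is up, the scoring call can be scripted. The sketch below swaps `requests` for the standard library's `urllib.request` so it runs without extra installs; it is not the code the question asks for verbatim. The URL is an assumption: port 8000 comes from the `-p 8000:8000` mapping above, while the `/predict` route and the helper names `build_request` and `score` are hypothetical.

```python
import json
import urllib.request

# Assumed URL: port from `docker run -p 8000:8000`, route name is hypothetical.
URL = "http://localhost:8000/predict"

client = {
    "lead_source": "organic_search",
    "number_of_courses_viewed": 4,
    "annual_income": 80304.0,
}


def build_request(url, payload):
    """Build a JSON POST request, similar to requests.post(url, json=payload)."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score(url, payload, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(url, payload), timeout=timeout) as resp:
        return json.load(resp)


# Run this once the container is listening:
# print(score(URL, client))
```

The returned probability can then be compared against the four options in the question.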