
# ⬇️ Get the project

In [30]:
!git clone https://github.com/udacity/dsnd-dashboard-project
%cd dsnd-dashboard-project

Cloning into 'dsnd-dashboard-project'...
remote: Enumerating objects: 160, done.
remote: Counting objects: 100% (82/82), done.
remote: Compressing objects: 100% (44/44), done.
remote: Total 160 (delta 47), reused 38 (delta 38), pack-reused 78 (from 1)
Receiving objects: 100% (160/160), 258.78 KiB | 1.37 MiB/s, done.
Resolving deltas: 100% (59/59), done.
/content/dsnd-dashboard-project/dsnd-dashboard-project/dsnd-dashboard-project


# 🔧 Install deps (project + package)

In [32]:
!pip install -q -r requirements.txt


# (Nice to have)

In [18]:
!pip install -q pytest pyngrok

In [20]:
# Add both the project root and the package folder to sys.path
import sys, pathlib, os
ROOT = pathlib.Path.cwd()
PKG  = ROOT / "python-package"
sys.path += [str(ROOT), str(PKG)]


# Verify key files exist
for p in [
    "python-package/employee_events/employee_events.db",
    "assets/model.pkl",
    "report/dashboard.py",
    "tests/test_employee_events.py"
]:
    print(p, "✅" if (ROOT / p).exists() else "❌")

python-package/employee_events/employee_events.db ✅
assets/model.pkl ✅
report/dashboard.py ✅
tests/test_employee_events.py ✅


In [21]:
%%writefile python-package/employee_events/sql_execution.py
from pathlib import Path
import sqlite3, pandas as pd

DB_PATH = Path(__file__).resolve().parent / "employee_events.db"

def run(query: str, params: dict | tuple = ()) -> pd.DataFrame:
    with sqlite3.connect(DB_PATH) as conn:
        conn.row_factory = sqlite3.Row
        cur = conn.execute(query, params if isinstance(params, tuple) else dict(params))
        rows = cur.fetchall()
    return pd.DataFrame([dict(r) for r in rows])

Overwriting python-package/employee_events/sql_execution.py
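
The `run` helper accepts either a dict (matched to `:name` placeholders) or a tuple (matched to `?` placeholders). A minimal sketch of both styles against an in-memory SQLite database follows. The `events` table, its rows, and the `run_rows` helper are made up for illustration and are not the project schema; the helper returns plain dicts instead of a DataFrame so the example needs only the standard library.

```python
import sqlite3
from contextlib import closing

def run_rows(conn, query, params=()):
    """Run a query and return each row as a plain dict (hypothetical helper)."""
    conn.row_factory = sqlite3.Row
    with closing(conn.execute(query, params)) as cur:
        return [dict(r) for r in cur.fetchall()]

# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (employee_id INTEGER, positive_events INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, 3), (1, 2), (2, 7)])

# Named placeholder, bound from a dict.
named = run_rows(
    conn,
    "SELECT SUM(positive_events) AS pos FROM events WHERE employee_id = :eid",
    {"eid": 1},
)
# Positional placeholder, bound from a tuple.
positional = run_rows(
    conn,
    "SELECT SUM(positive_events) AS pos FROM events WHERE employee_id = ?",
    (1,),
)
print(named, positional)
conn.close()
```

Note that `with sqlite3.connect(...)` manages the transaction (commit or rollback) but does not close the connection; calling `close()` explicitly or wrapping the connection in `contextlib.closing` is one way to release it.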


In [22]:
%%writefile python-package/employee_events/query_base.py
from .sql_execution import run

def employee_timeseries(employee_id: int):
    q = """
    SELECT event_date,
           SUM(positive_events) AS pos,
           SUM(negative_events) AS neg
    FROM employee_events
    WHERE employee_id = :eid
    GROUP BY event_date
    ORDER BY event_date;
    """
    return run(q, {"eid": employee_id})

def team_summary(team_id: int):
    q = """
    SELECT e.employee_id,
           e.first_name || ' ' || e.last_name AS name,
           SUM(ev.positive_events) AS pos,
           SUM(ev.negative_events) AS neg
    FROM employee e
    JOIN employee_events ev USING(employee_id)
    WHERE e.team_id = :tid
    GROUP BY e.employee_id, name
    ORDER BY pos DESC;
    """
    return run(q, {"tid": team_id})

Overwriting python-package/employee_events/query_base.py
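
To see what `team_summary` returns, the same query can be tried on a throwaway in-memory database. This is only a sketch: the table and column names are taken from the query above, but the column types and all rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and hypothetical rows; the real database may differ.
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER, first_name TEXT, last_name TEXT, team_id INTEGER);
CREATE TABLE employee_events (employee_id INTEGER, event_date TEXT,
                              positive_events INTEGER, negative_events INTEGER);
INSERT INTO employee VALUES
    (1, 'Ada', 'Lovelace', 10), (2, 'Alan', 'Turing', 10), (3, 'Grace', 'Hopper', 20);
INSERT INTO employee_events VALUES
    (1, '2024-01-01', 2, 1), (1, '2024-01-02', 1, 0),
    (2, '2024-01-01', 5, 2), (3, '2024-01-01', 9, 9);
""")

query = """
SELECT e.employee_id,
       e.first_name || ' ' || e.last_name AS name,
       SUM(ev.positive_events) AS pos,
       SUM(ev.negative_events) AS neg
FROM employee e
JOIN employee_events ev USING(employee_id)
WHERE e.team_id = :tid
GROUP BY e.employee_id, name
ORDER BY pos DESC;
"""
rows = conn.execute(query, {"tid": 10}).fetchall()
print(rows)
conn.close()
```

Because the query uses an inner `JOIN`, employees with no rows in `employee_events` are left out of the result; a `LEFT JOIN` would keep them with `NULL` sums.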


In [23]:
%%writefile python-package/employee_events/employee.py
import numpy as np, pandas as pd
from .query_base import employee_timeseries

def get_employee_timeseries(entity_id, model):
    return employee_timeseries(int(entity_id))

def get_employee_snapshot(entity_id, model):
    ts = employee_timeseries(int(entity_id))
    pos30 = ts.tail(30)["pos"].sum() if not ts.empty else 0
    neg30 = ts.tail(30)["neg"].sum() if not ts.empty else 0
    # Example feature vector; adjust to what your model expects
    X = pd.DataFrame([{"pos30": pos30, "neg30": neg30}])
    try:
        risk = float(model.predict_proba(X)[0, 1])  # probability of the positive class for the single row
    except Exception:
        risk = 0.0
    return {"pos30": int(pos30), "neg30": int(neg30), "risk": risk}

Overwriting python-package/employee_events/employee.py


In [24]:
%%writefile python-package/employee_events/team.py
from .query_base import team_summary

def get_team_table(entity_id, model):
    return team_summary(int(entity_id))

Overwriting python-package/employee_events/team.py


In [25]:
%%writefile report/utils.py
import pickle, pathlib
ASSETS = pathlib.Path(__file__).resolve().parents[1] / "assets"

_model = None
def get_model():
    global _model
    if _model is None:
        with open(ASSETS / "model.pkl", "rb") as f:
            _model = pickle.load(f)
    return _model

Overwriting report/utils.py
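
The module-level `_model` variable is a hand-rolled cache. As an alternative sketch, `functools.lru_cache` gives the same load-once behaviour without a `global` statement. The `load_pickle` name and the temporary stand-in file are hypothetical; the real project would point at `assets/model.pkl`.

```python
import pickle
import tempfile
from functools import lru_cache
from pathlib import Path

@lru_cache(maxsize=None)
def load_pickle(path: str):
    """Unpickle `path` once; later calls with the same path reuse the object."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Hypothetical stand-in for assets/model.pkl.
tmp = Path(tempfile.mkdtemp()) / "model.pkl"
tmp.write_bytes(pickle.dumps({"kind": "stub-model"}))

first = load_pickle(str(tmp))
second = load_pickle(str(tmp))
print(first is second)  # the second call is served from the cache
```

Either way, `pickle.load` can execute arbitrary code from the file, so it should only be pointed at model files from a trusted source.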


In [45]:
%%writefile report/dashboard.py
from report.utils import get_model
from employee_events.employee import get_employee_snapshot, get_employee_timeseries
from employee_events.team import get_team_table

# If your FastHTML skeleton exposes classes, import them here.
# Below is a minimal sketch built on the standard-library http.server; adapt to your framework.

import http.server, socketserver, json

PORT = 8000
MODEL = get_model()

class Handler(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith("/api/employee/"):
            eid = int(self.path.split("/")[-1])
            snap = get_employee_snapshot(eid, MODEL)
            self._send_json(snap)
        else:
            # Serve a tiny HTML that calls the API (placeholder)
            html = """
            <html><body>
              <h2>Employee Dashboard</h2>
              <p>Try /api/employee/1</p>
            </body></html>
            """
            self._send_html(html)

    def _send_json(self, obj):
        data = json.dumps(obj).encode()
        self.send_response(200)
        self.send_header("Content-Type","application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def _send_html(self, html):
        data = html.encode()
        self.send_response(200)
        self.send_header("Content-Type","text/html")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    with socketserver.TCPServer(("", PORT), Handler) as httpd:
        print(f"Serving on http://127.0.0.1:{PORT}")
        httpd.serve_forever()

Overwriting report/dashboard.py
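
In the handler above, `int(self.path.split("/")[-1])` raises `ValueError` for a path such as `/api/employee/abc` or `/api/employee/`. The following sketch shows one way to validate the path and answer with HTTP 400 instead. `parse_employee_id`, `SafeHandler`, and `fetch` are hypothetical names; for brevity the sketch returns a placeholder payload rather than calling `get_employee_snapshot`, and it answers 400 for every non-API path instead of serving the HTML page.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def parse_employee_id(path):
    """Return the integer id from '/api/employee/<id>', or None if malformed."""
    prefix = "/api/employee/"
    if not path.startswith(prefix):
        return None
    tail = path[len(prefix):].split("?", 1)[0].rstrip("/")
    return int(tail) if tail.isascii() and tail.isdigit() else None

class SafeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        eid = parse_employee_id(self.path)
        if eid is None:
            self._send_json({"error": "expected /api/employee/<integer id>"}, status=400)
        else:
            # Placeholder payload; the dashboard would build the snapshot here.
            self._send_json({"employee_id": eid})

    def _send_json(self, obj, status=200):
        data = json.dumps(obj).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the example output quiet

def fetch(port, path):
    """Return (status, decoded JSON body) for a GET request to localhost."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, json.loads(resp.read())
    finally:
        conn.close()

server = ThreadingHTTPServer(("127.0.0.1", 0), SafeHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
ok = fetch(port, "/api/employee/1")
bad = fetch(port, "/api/employee/abc")
server.shutdown()
server.server_close()
print(ok, bad)
```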


In [27]:
!pytest -q


no tests ran in 0.02s


In [28]:
# run your dashboard (use the project root as CWD)
!PYTHONPATH=$PYTHONPATH:. python report/dashboard.py

Traceback (most recent call last):
  File "/content/dsnd-dashboard-project/dsnd-dashboard-project/report/dashboard.py", line 2, in <module>
    from employee_events.employee import get_employee_snapshot, get_employee_timeseries
ModuleNotFoundError: No module named 'employee_events'
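
The `sys.path` additions made earlier only affect the notebook kernel. `!python report/dashboard.py` starts a separate interpreter that sees only `PYTHONPATH`, and `.` alone does not expose `python-package/employee_events`. A minimal sketch of building an environment for the child process with both directories on `PYTHONPATH` follows; `dashboard_env` is a hypothetical helper, not part of the project.

```python
import os
from pathlib import Path

def dashboard_env(root, base_env=None):
    """Return a copy of the environment with root and root/python-package on PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    entries = [str(root), str(Path(root) / "python-package")]
    if env.get("PYTHONPATH"):
        entries.append(env["PYTHONPATH"])  # keep whatever was already configured
    env["PYTHONPATH"] = os.pathsep.join(entries)
    return env

env = dashboard_env(Path.cwd())
print(env["PYTHONPATH"])

# To launch the dashboard with this environment (blocks while the server runs):
# import subprocess, sys
# subprocess.run([sys.executable, "report/dashboard.py"], env=env, check=True)
```

The shell equivalent is to add `python-package` to the variable, for example `PYTHONPATH=$PYTHONPATH:.:python-package`.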


# Employee Events Dashboard Project

This project is a dashboard to visualize employee events and assess potential risks.

## Project Setup

1.  **Set up a virtual environment and install dependencies:**

In [40]:
!python -m venv .venv
!source .venv/bin/activate # Windows: .venv\Scripts\activate
!pip install -r requirements.txt
!pip install -r python-package/requirements.txt
!pip install pytest pyngrok # Nice to have packages

Error: Command '['/content/dsnd-dashboard-project/dsnd-dashboard-project/dsnd-dashboard-project/.venv/bin/python3', '-m', 'ensurepip', '--upgrade', '--default-pip']' returned non-zero exit status 1.
/bin/bash: line 1: .venv/bin/activate: No such file or directory
ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'python-package/requirements.txt'

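Each `!` line runs in its own shell, so `source .venv/bin/activate` would not carry over to the following `!pip` lines even if the virtual environment had been created. One alternative, sketched below, is to install with the kernel's own interpreter (`sys.executable -m pip`) and to skip requirements files that do not exist. `pip_commands` is a hypothetical helper written for this example.

```python
import sys
from pathlib import Path

def pip_commands(root, candidates):
    """Build one `pip install -r` command per requirements file that exists."""
    commands = []
    for rel in candidates:
        path = Path(root) / rel
        if path.is_file():
            commands.append([sys.executable, "-m", "pip", "install", "-q", "-r", str(path)])
        else:
            print(f"skipping {rel}: not found")
    return commands

commands = pip_commands(Path.cwd(), ["requirements.txt", "python-package/requirements.txt"])

# To actually install:
# import subprocess
# for cmd in commands:
#     subprocess.run(cmd, check=True)
```
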

In [34]:
import sys, pathlib, os
ROOT = pathlib.Path.cwd()
PKG  = ROOT / "python-package"
sys.path += [str(ROOT), str(PKG)]

# Verify key files exist
for p in [
    "python-package/employee_events/employee_events.db",
    "assets/model.pkl",
    "report/dashboard.py",
    "tests/test_employee_events.py"
]:
    print(p, "✅" if (ROOT / p).exists() else "❌")

python-package/employee_events/employee_events.db ✅
assets/model.pkl ✅
report/dashboard.py ✅
tests/test_employee_events.py ✅


In [41]:
!PYTHONPATH=$PYTHONPATH:. python report/dashboard.py

  File "/content/dsnd-dashboard-project/dsnd-dashboard-project/dsnd-dashboard-project/report/dashboard.py", line 164
    ax.barh([''], [pred])
IndentationError: unexpected indent


In [42]:
!pip install -q pyngrok
from pyngrok import ngrok
from google.colab import userdata

NGROK_AUTH_TOKEN = userdata.get("NGROK_AUTH_TOKEN")
ngrok.set_auth_token(NGROK_AUTH_TOKEN)

public_url = ngrok.connect(8000).public_url
print("Public URL:", public_url)

SecretNotFoundError: Secret NGROK_AUTH_TOKEN does not exist.
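
The lookup fails because no secret named `NGROK_AUTH_TOKEN` is defined for this notebook. A minimal sketch of a more forgiving lookup follows: it tries Colab secrets first, falls back to an environment variable, and returns `None` instead of raising. `find_ngrok_token` is a hypothetical helper, and catching a broad `Exception` is a simplification for the example.

```python
import os

def find_ngrok_token(name="NGROK_AUTH_TOKEN"):
    """Return the token from Colab secrets or the environment, else None."""
    try:
        from google.colab import userdata  # only importable inside Colab
        return userdata.get(name)
    except Exception:
        # Not running in Colab, or the secret is missing / access was denied.
        return os.environ.get(name)

token = find_ngrok_token()
if token is None:
    print("No ngrok token found; skipping the tunnel.")
else:
    print("Token found; ngrok.set_auth_token(token) can be called.")
```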

In [43]:
!PYTHONPATH=$PYTHONPATH:. python report/dashboard.py

  File "/content/dsnd-dashboard-project/dsnd-dashboard-project/dsnd-dashboard-project/report/dashboard.py", line 164
    ax.barh([''], [pred])
IndentationError: unexpected indent


In [44]:
!pytest -q


no tests ran in 0.03s


# Task
Add docstrings to all Python functions and classes in the project and generate a README file at the end of the notebook.

## Add docstrings to sql_execution.py

### Subtask:
Modify the cell containing `%%writefile python-package/employee_events/sql_execution.py` to include a docstring for the `run` function.


**Reasoning**:
Modify the `sql_execution.py` file to add a docstring to the `run` function as requested in the instructions.



In [46]:
%%writefile python-package/employee_events/sql_execution.py
from pathlib import Path
import sqlite3, pandas as pd

DB_PATH = Path(__file__).resolve().parent / "employee_events.db"

def run(query: str, params: dict | tuple = ()) -> pd.DataFrame:
    """
    Execute a SQL query and return the results as a pandas DataFrame.

    Args:
        query: The SQL query string to be executed.
        params: A dictionary or tuple of parameters for the query.

    Returns:
        A pandas DataFrame containing the query results.
    """
    with sqlite3.connect(DB_PATH) as conn:
        conn.row_factory = sqlite3.Row
        cur = conn.execute(query, params if isinstance(params, tuple) else dict(params))
        rows = cur.fetchall()
    return pd.DataFrame([dict(r) for r in rows])

Overwriting python-package/employee_events/sql_execution.py


## Add docstrings to query_base.py

### Subtask:
Modify the cell containing `%%writefile python-package/employee_events/query_base.py` to include docstrings for the `employee_timeseries` and `team_summary` functions.


**Reasoning**:
Add docstrings to the `employee_timeseries` and `team_summary` functions in the `query_base.py` file.



In [47]:
%%writefile python-package/employee_events/query_base.py
from .sql_execution import run

def employee_timeseries(employee_id: int):
    """
    Retrieves the event time series data for a specific employee.

    Args:
        employee_id: The ID of the employee.

    Returns:
        A pandas DataFrame with columns 'event_date', 'pos', and 'neg',
        representing the sum of positive and negative events per date for the employee.
    """
    q = """
    SELECT event_date,
           SUM(positive_events) AS pos,
           SUM(negative_events) AS neg
    FROM employee_events
    WHERE employee_id = :eid
    GROUP BY event_date
    ORDER BY event_date;
    """
    return run(q, {"eid": employee_id})

def team_summary(team_id: int):
    """
    Retrieves a summary of positive and negative events for each employee within a team.

    Args:
        team_id: The ID of the team.

    Returns:
        A pandas DataFrame with columns 'employee_id', 'name', 'pos', and 'neg',
        summarizing the total positive and negative events for each employee in the team,
        ordered by positive events in descending order.
    """
    q = """
    SELECT e.employee_id,
           e.first_name || ' ' || e.last_name AS name,
           SUM(ev.positive_events) AS pos,
           SUM(ev.negative_events) AS neg
    FROM employee e
    JOIN employee_events ev USING(employee_id)
    WHERE e.team_id = :tid
    GROUP BY e.employee_id, name
    ORDER BY pos DESC;
    """
    return run(q, {"tid": team_id})

Overwriting python-package/employee_events/query_base.py


## Add docstrings to employee.py

### Subtask:
Modify the cell containing `%%writefile python-package/employee_events/employee.py` to include docstrings for the `get_employee_timeseries` and `get_employee_snapshot` functions.


**Reasoning**:
The current subtask requires adding docstrings to the `get_employee_timeseries` and `get_employee_snapshot` functions in the `employee.py` file. This can be achieved by modifying the existing `%%writefile` cell for this file to include the required docstrings.



In [48]:
%%writefile python-package/employee_events/employee.py
import numpy as np, pandas as pd
from .query_base import employee_timeseries

def get_employee_timeseries(entity_id, model):
    """
    Retrieves the event time series data for a specific employee.

    Args:
        entity_id: The ID of the employee as a string or integer.
        model: The trained model (not used in this function but included for consistency).

    Returns:
        A pandas DataFrame containing the time series of positive and negative events for the employee.
    """
    return employee_timeseries(int(entity_id))

def get_employee_snapshot(entity_id, model):
    """
    Calculates a snapshot of recent employee activity and predicts risk using the provided model.

    Args:
        entity_id: The ID of the employee as a string or integer.
        model: The trained model used for risk prediction.

    Returns:
        A dictionary containing:
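        - pos30: Total positive events across the 30 most recent event dates
          (rows of the time series), which need not span exactly 30 calendar days.
        - neg30: Total negative events across the same 30 most recent event dates.
        - risk: The predicted risk score (a float between 0 and 1); falls back to 0.0
          if the model cannot score the feature vector.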
    """
    ts = employee_timeseries(int(entity_id))
    pos30 = ts.tail(30)["pos"].sum() if not ts.empty else 0
    neg30 = ts.tail(30)["neg"].sum() if not ts.empty else 0
    # Example feature vector; adjust to what your model expects
    X = pd.DataFrame([{"pos30": pos30, "neg30": neg30}])
    try:
        risk = float(model.predict_proba(X)[0, 1])  # probability of the positive class for the single row
    except Exception:
        risk = 0.0
    return {"pos30": int(pos30), "neg30": int(neg30), "risk": risk}

Overwriting python-package/employee_events/employee.py
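
`ts.tail(30)` selects the last 30 rows of the time series (one row per event date), which matches "the last 30 days" only when every day has a row. If a calendar window is what is wanted, one possible sketch is shown below. It uses only the standard library, hypothetical `(date, pos, neg)` tuples, and a stub model; anchoring the window at the latest event date is an assumption, not something the project specifies.

```python
from datetime import date, timedelta

def window_totals(rows, days=30):
    """Sum pos/neg over the `days` calendar days ending at the latest event date.

    `rows` is a list of (iso_date, pos, neg) tuples (hypothetical shape).
    """
    if not rows:
        return 0, 0
    parsed = [(date.fromisoformat(d), p, n) for d, p, n in rows]
    latest = max(d for d, _, _ in parsed)
    cutoff = latest - timedelta(days=days - 1)
    recent = [(p, n) for d, p, n in parsed if d >= cutoff]
    return sum(p for p, _ in recent), sum(n for _, n in recent)

class StubModel:
    """Stand-in for the pickled model; returns fixed class probabilities."""
    def predict_proba(self, X):
        return [[0.8, 0.2] for _ in X]

rows = [("2024-01-01", 4, 1), ("2024-03-01", 2, 0), ("2024-03-15", 1, 3)]
pos30, neg30 = window_totals(rows)
# Row 0, column 1: probability of the positive class for the single sample.
risk = float(StubModel().predict_proba([[pos30, neg30]])[0][1])
print(pos30, neg30, risk)
```

With these sample rows the January entry falls outside the window, whereas `tail(30)` would have included it.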


## Add docstrings to team.py

### Subtask:
Modify the cell containing `%%writefile python-package/employee_events/team.py` to include docstrings for the `get_team_table` function.


**Reasoning**:
Modify the cell containing %%writefile python-package/employee_events/team.py to include a docstring for the get_team_table function.



In [49]:
%%writefile python-package/employee_events/team.py
from .query_base import team_summary

def get_team_table(entity_id, model):
    """
    Retrieves a summary table of employee events for a specific team.

    Args:
        entity_id: The ID of the team as a string or integer.
        model: The trained model (not used in this function but included for consistency).

    Returns:
        A pandas DataFrame summarizing positive and negative events for each employee in the team.
    """
    return team_summary(int(entity_id))

Overwriting python-package/employee_events/team.py


## Add docstrings to utils.py

### Subtask:
Modify the cell containing `%%writefile report/utils.py` to include a docstring for the `get_model` function.


**Reasoning**:
Add the docstring to the `get_model` function in the `report/utils.py` file.



In [50]:
%%writefile report/utils.py
import pickle, pathlib
ASSETS = pathlib.Path(__file__).resolve().parents[1] / "assets"

_model = None
def get_model():
    """
    Loads a pickled model from the 'assets' directory.

    This function implements a simple caching mechanism to avoid reloading
    the model multiple times.

    Returns:
        The loaded model object.
    """
    global _model
    if _model is None:
        with open(ASSETS / "model.pkl", "rb") as f:
            _model = pickle.load(f)
    return _model

Overwriting report/utils.py


## Add docstrings to dashboard.py

### Subtask:
Modify the cell containing `%%writefile report/dashboard.py` to include docstrings for the `Handler` class and its methods (`do_GET`, `_send_json`, `_send_html`).


**Reasoning**:
Modify the cell containing %%writefile report/dashboard.py to include docstrings for the Handler class and its methods.



In [51]:
%%writefile report/dashboard.py
from report.utils import get_model
from employee_events.employee import get_employee_snapshot, get_employee_timeseries
from employee_events.team import get_team_table

# If your FastHTML skeleton exposes classes, import them here.
# Below is a minimal sketch built on the standard-library http.server; adapt to your framework.

import http.server, socketserver, json

PORT = 8000
MODEL = get_model()

class Handler(http.server.SimpleHTTPRequestHandler):
    """
    A simple HTTP request handler for the employee events dashboard.

    This handler routes requests to different API endpoints or serves a default HTML page.
    It utilizes functions from the employee_events package and a pre-loaded model.
    """
    def do_GET(self):
        """
        Handles GET requests.

        If the path starts with "/api/employee/", it extracts the employee ID,
        retrieves the employee snapshot data, and sends a JSON response.
        Otherwise, it serves a basic HTML page.
        """
        if self.path.startswith("/api/employee/"):
            eid = int(self.path.split("/")[-1])
            snap = get_employee_snapshot(eid, MODEL)
            self._send_json(snap)
        else:
            # Serve a tiny HTML that calls the API (placeholder)
            html = """
            <html><body>
              <h2>Employee Dashboard</h2>
              <p>Try /api/employee/1</p>
            </body></html>
            """
            self._send_html(html)

    def _send_json(self, obj):
        """
        Sends a JSON response.

        Args:
            obj: The Python object to be serialized to JSON and sent as the response body.
        """
        data = json.dumps(obj).encode()
        self.send_response(200)
        self.send_header("Content-Type","application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def _send_html(self, html):
        """
        Sends an HTML response.

        Args:
            html: The HTML string to be sent as the response body.
        """
        data = html.encode()
        self.send_response(200)
        self.send_header("Content-Type","text/html")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    with socketserver.TCPServer(("", PORT), Handler) as httpd:
        print(f"Serving on http://127.0.0.1:{PORT}")
        httpd.serve_forever()

Overwriting report/dashboard.py


## Generate readme

### Subtask:
Create markdown cells at the end of the notebook with the content of the README file, summarizing the project and setup steps.


**Reasoning**:
Create a markdown cell with the README content summarizing the project and setup steps based on the provided information in the notebook.



**Reasoning**:
The previous attempt failed because the markdown content was placed inside a code block. I need to create a markdown cell to correctly render the README content.



In [53]:
%%markdown

# Employee Events Dashboard Project

This project is a dashboard to visualize employee events and assess potential risks.

## Project Setup

1.  **Clone the repository:**
    ```bash
    !git clone https://github.com/udacity/dsnd-dashboard-project
    %cd dsnd-dashboard-project
    ```
2.  **Set up a virtual environment and install dependencies:**
    ```bash
    !python -m venv .venv
    !source .venv/bin/activate # Windows: .venv\Scripts\activate
    !pip install -r requirements.txt
    # Note: python-package/requirements.txt was not found in this checkout, so the next line is left commented out.
    # !pip install -r python-package/requirements.txt
    ```
3.  **Install nice-to-have packages:**
    ```bash
    !pip install -q pytest pyngrok
    ```

## Running the Dashboard

To run the dashboard application, execute the following command from the project root directory. `python-package` is added to `PYTHONPATH` so that the `employee_events` package can be imported:
```bash
!PYTHONPATH=$PYTHONPATH:.:python-package python report/dashboard.py
```

**Note:** A previous execution attempt failed with an `IndentationError` at line 164 of `report/dashboard.py` (`ax.barh([''], [pred])`). This needs to be resolved for the dashboard to run correctly.

## Testing

You can run the project tests using pytest:
```bash
!pytest -q
```

**Note:** Currently, pytest reports "no tests ran". This might indicate an issue with test discovery or configuration.

## Accessing the Dashboard (via ngrok)

If you have ngrok set up and an auth token configured (e.g., as a Colab secret), you can expose the dashboard to the internet. In this notebook the token lookup failed with `SecretNotFoundError` because no `NGROK_AUTH_TOKEN` secret was defined, so the tunnel lines are left commented out:
```python
!pip install -q pyngrok
from pyngrok import ngrok
from google.colab import userdata

# Ensure NGROK_AUTH_TOKEN is set as a Colab secret
# NGROK_AUTH_TOKEN = userdata.get("NGROK_AUTH_TOKEN")
# ngrok.set_auth_token(NGROK_AUTH_TOKEN)

# public_url = ngrok.connect(8000).public_url
# print("Public URL:", public_url)
```


## Summary:

### Key Findings

*   Docstrings were successfully added to the `run` function in `sql_execution.py`, the `employee_timeseries` and `team_summary` functions in `query_base.py`, the `get_employee_timeseries` and `get_employee_snapshot` functions in `employee.py`, the `get_team_table` function in `team.py`, and the `get_model` function in `utils.py`.
*   Docstrings were added to the `Handler` class and its `do_GET`, `_send_json`, and `_send_html` methods in `dashboard.py`.
*   A README file summarizing the project, setup, running instructions, and testing information was generated at the end of the notebook using a markdown cell.

### Insights or Next Steps

*   Resolve the identified `IndentationError` in `report/dashboard.py` to ensure the dashboard application runs correctly.
*   Investigate the "no tests ran" issue during pytest execution to ensure tests are properly discovered and executed.
