
# 🔎 POLARIS: Policy-Aligned Local Review Inference System
Online reviews shape how people perceive local businesses and locations. Yet, irrelevant, misleading or low-quality reviews can distort reputation and mislead users.

This project, developed for a hackathon challenge, runs an Ollama-based review evaluator directly in Jupyter. You can evaluate **single reviews** or **batch-process a CSV** to assess the quality and relevance of online location reviews against platform policies.

**What it does**
- Classifies each review as **Valid** or **Flagged** (sentiment-neutral).
- If flagged, selects a **single strongest** policy violation.
- Adds a short explanation.

**Policies enforced**
1. **No Advertisement** — no promotional content, discounts, links.
2. **No Irrelevant Content** — must focus on the experience at the specified location.
3. **No Rant Without Visit** — must be based on a real visit/experience (not hearsay).


---

### Prereqs (run outside Jupyter or in a separate terminal)
1) Install **Ollama**: https://ollama.ai/download  
2) Pull a model (this notebook assumes `llama3.2`):  
```bash
ollama pull llama3.2
```
3) Start the Ollama server (in a terminal, not in this notebook if possible):
```bash
ollama serve
```
(Default endpoint: `http://localhost:11434`)

> You *can* try starting it here with `!ollama serve &`, but it's more reliable to run it in a separate terminal so it keeps running.
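
Before running the cells below, it can help to confirm that the server is reachable and the model is pulled. The following is a minimal sketch, assuming the default endpoint above and Ollama's `/api/tags` route for listing local models; the helper name `list_ollama_models` is hypothetical and not part of this notebook.

```python
import http.client
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the names of locally pulled models, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, or a non-JSON reply all count as "not reachable".
        return None
    return [m.get("name", "") for m in payload.get("models", [])]


models = list_ollama_models()
if models is None:
    print("Ollama server not reachable; start it with `ollama serve`.")
elif not any(name.startswith("llama3.2") for name in models):
    print("Server is up, but llama3.2 is not pulled yet.")
else:
    print("Server is up and llama3.2 is available.")
```

The check only reads the model list, so it is safe to rerun whenever a later cell reports a connection error.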


In [1]:
# If needed: install Python packages used in this notebook
# (Run once, then restart the kernel if prompted)
!pip -q install --upgrade pip
!pip -q install langchain-ollama langchain-core requests pandas ipywidgets



## Imports & Model Setup
This uses the same logic as your script, adapted for notebook use.


In [2]:
import os
import csv
import json
from typing import Tuple, Optional, List

import pandas as pd
from langchain_ollama.llms import OllamaLLM
from langchain_core.prompts import ChatPromptTemplate

# -------------------------
# Model & Prompt
# -------------------------
model = OllamaLLM(model="llama3.2")

template = """
You are an expert in identifying trustworthy and policy-compliant location reviews.

Location Being Reviewed:
{location}

Review to Evaluate:
{review}

Policies to Enforce (sentiment-neutral):
    1. No Advertisement: Reviews should not contain promotional content, discount offers, or links.
    2. No Irrelevant Content: Reviews should focus on the experience at the specified location, not unrelated matters.
    3. No Rant Without Visit: Complaints or praise must be based on an actual visit/experience; pure speculation, hearsay, or second-hand rants should be flagged.

Important: Negative or strongly critical reviews are allowed and should not be flagged solely due to sentiment. Only flag if a policy is violated.

Instructions:
    • Determine whether the review is Valid or Flagged.
    • If flagged, choose the single strongest policy violated (Primary Violation).
    • Provide a brief explanation (1 to 2 sentences).

Output Format:
• Decision: Valid / Flagged
• Primary Violation (if flagged): [Policy Name]
• Explanation: [Short reasoning]
"""

prompt = ChatPromptTemplate.from_template(template)
chain = prompt | model

# -------------------------
# Helpers
# -------------------------

def _normalize(name: str) -> str:
    return name.strip().lower().replace(" ", "").replace("_", "")

def _guess_columns(headers: List[str]) -> Tuple[Optional[str], Optional[str]]:
    """
    Try to auto-detect location/review columns using common names.
    Returns (location_col, review_col) or (None, None) if not found.
    """
    norm_map = {_normalize(h): h for h in headers}

    # Common variants
    loc_candidates = [
        "location", "place", "venue", "restaurant", "store", "hotel", "site", "spot"
    ]
    rev_candidates = [
        "review", "text", "comment", "feedback", "content", "body", "ratingtext"
    ]

    found_loc = next((norm_map[n] for n in loc_candidates if n in norm_map), None)
    found_rev = next((norm_map[n] for n in rev_candidates if n in norm_map), None)
    return found_loc, found_rev

def _parse_model_output(output_text: str):
    """
    Best-effort parser for the model's three-line output format.
    Returns dict with keys: decision, violation, explanation, raw
    """
    lines = [l.strip() for l in output_text.splitlines() if l.strip()]
    decision, violation, explanation = "", "", ""

    for ln in lines:
        low = ln.lower()
        if "decision:" in low and not decision:
            decision = ln.split(":", 1)[1].strip()
        elif "primary violation" in low and not violation:
            violation = ln.split(":", 1)[1].strip()
        elif "explanation:" in low and not explanation:
            explanation = ln.split(":", 1)[1].strip()

    return {
        "Decision": decision,
        "Primary Violation": violation,
        "Explanation": explanation,
        "RawOutput": output_text,
    }

def evaluate(location: str, review: str):
    """Evaluate one review (returns a dict)"""
    res = chain.invoke({"location": location.strip(), "review": review.strip()})
    text = getattr(res, "content", res) if not isinstance(res, str) else res
    return _parse_model_output(text)

def evaluate_dataframe(df: pd.DataFrame, location_col: Optional[str] = None, review_col: Optional[str] = None, include_raw: bool = False) -> pd.DataFrame:
    """
    Notebook-friendly batch evaluation.
    - Auto-detects columns if not provided.
    - Returns a copy of the DataFrame with appended columns.
    """
    if df is None or df.empty:
        raise ValueError("Input DataFrame is empty.")

    headers = list(df.columns)
    loc_col, rev_col = location_col, review_col

    if not loc_col or not rev_col:
        guessed_loc, guessed_rev = _guess_columns(headers)
        loc_col = loc_col or guessed_loc
        rev_col = rev_col or guessed_rev

    if not loc_col or not rev_col or loc_col not in headers or rev_col not in headers:
        raise ValueError(f"Could not find location/review columns. Available headers: {headers}\n"
                         f"Pass explicit column names via location_col=... and review_col=..." )

    out_df = df.copy()
    decisions, violations, explanations, raws = [], [], [], []

    for _, row in out_df.iterrows():
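        # Treat missing cells (None/NaN) as empty so they are not sent to the model as "nan"
        loc_val, rev_val = row.get(loc_col, ""), row.get(rev_col, "")
        location = "" if pd.isna(loc_val) else str(loc_val).strip()
        review = "" if pd.isna(rev_val) else str(rev_val).strip()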
        if not review:
            decisions.append("")
            violations.append("")
            explanations.append("")
            raws.append("")
            continue
        result = evaluate(location, review)
        decisions.append(result["Decision"])
        violations.append(result["Primary Violation"])
        explanations.append(result["Explanation"])
        raws.append(result["RawOutput"])

    out_df["Decision"] = decisions
    out_df["Primary Violation"] = violations
    out_df["Explanation"] = explanations
    if include_raw:
        out_df["RawOutput"] = raws

    return out_df

def evaluate_csv_to_csv(input_path: str, output_path: Optional[str] = None, location_col: Optional[str] = None, review_col: Optional[str] = None, include_raw: bool = False) -> str:
    """
    File-based helper for batch processing (no interactive input()).
    Returns path to the written CSV.
    """
    if not os.path.isfile(input_path):
        raise FileNotFoundError(f"File not found: {input_path}")

    df = pd.read_csv(input_path, encoding="utf-8-sig")
    out_df = evaluate_dataframe(df, location_col=location_col, review_col=review_col, include_raw=include_raw)

    if not output_path:
        base, ext = os.path.splitext(input_path)
        output_path = f"{base}_evaluated{ext}"

    out_df.to_csv(output_path, index=False, encoding="utf-8")
    return output_path

print("✅ Setup complete. If your Ollama server is running and the 'llama3.2' model is available, you're ready!")

✅ Setup complete. If your Ollama server is running and the 'llama3.2' model is available, you're ready!



---

### Notes & Tips
- If you see connection errors, make sure the **Ollama server is running** and the **model is pulled**.
- You can swap models by changing `OllamaLLM(model="llama3.2")` to any local model you've pulled.
- To include the raw LLM output for debugging, set `include_raw=True` in the batch helpers.
- Large CSVs take a while to process because each row triggers a separate model call.
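
Since every row costs one model call, repeated reviews are an easy saving. The sketch below wraps an evaluator with an in-memory cache keyed on the stripped `(location, review)` pair. `make_cached` and `fake_evaluate` are hypothetical names used for illustration; in the notebook you would pass `evaluate` instead of the stand-in. Note that caching also pins the first answer for a given input, which may or may not be what you want from a non-deterministic model.

```python
from typing import Callable, Dict, Tuple


def make_cached(evaluate_fn: Callable[[str, str], dict]) -> Callable[[str, str], dict]:
    """Wrap an evaluator so identical (location, review) pairs call it only once."""
    cache: Dict[Tuple[str, str], dict] = {}

    def cached(location: str, review: str) -> dict:
        key = (location.strip(), review.strip())
        if key not in cache:
            cache[key] = evaluate_fn(*key)
        return cache[key]

    return cached


# Stand-in evaluator so the sketch runs without a model (hypothetical).
calls = []


def fake_evaluate(location: str, review: str) -> dict:
    calls.append((location, review))
    return {"Decision": "Valid", "Primary Violation": "None", "Explanation": "stub"}


cached_evaluate = make_cached(fake_evaluate)
cached_evaluate("Cafe Aurora", "Great cappuccino!")
cached_evaluate("Cafe Aurora", "Great cappuccino!  ")  # same key after stripping
print(len(calls))  # 1
```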


## 1) Evaluate a Single Review
Run the next cell, edit the `sample_location` and `sample_review` strings, then run it again to test different inputs.


In [3]:
# Example: Single review evaluation. Feel free to enter your review and try!
sample_location = "Cafe Aurora"
sample_review = "Great cappuccino and relaxing atmosphere!"
result = evaluate(sample_location, sample_review)
result  # dict of Decision / Primary Violation / Explanation / RawOutput


{'Decision': 'Valid',
 'Primary Violation': 'None',
 'Explanation': "The review is focused on the experience at Cafe Aurora, mentioning a great cappuccino and relaxing atmosphere, without any promotional content or unrelated matters. There's no indication of speculation or hearsay.",
 'RawOutput': "Decision: Valid\nPrimary Violation: None\nExplanation: The review is focused on the experience at Cafe Aurora, mentioning a great cappuccino and relaxing atmosphere, without any promotional content or unrelated matters. There's no indication of speculation or hearsay."}
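
The parser is best-effort, so a model reply that drifts from the three-line format can come back with an empty `Decision`. A small check like the one below can route such results to a human. This is a sketch, and `needs_manual_review` is a hypothetical helper that is not defined elsewhere in the notebook.

```python
def needs_manual_review(result: dict) -> bool:
    """True when the parsed output does not look like a clean Valid/Flagged verdict."""
    decision = result.get("Decision", "").strip().lower()
    violation = result.get("Primary Violation", "").strip().lower()
    if decision.startswith("valid"):
        return False
    if decision.startswith("flag"):
        # A flagged review is expected to name one of the three policies.
        return violation in ("", "none", "n/a")
    return True  # empty or unexpected decision text


clean = {"Decision": "Valid", "Primary Violation": "None", "Explanation": "On topic."}
broken = {"Decision": "", "Primary Violation": "", "Explanation": ""}
print(needs_manual_review(clean))   # False
print(needs_manual_review(broken))  # True
```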


## 2) Batch Process a DataFrame (in-notebook)
If you already have a `pandas.DataFrame`, use `evaluate_dataframe` and display the results.


In [4]:
# Example: Build a small DataFrame and evaluate
df_example = pd.DataFrame([
    {"location": "Pizza House", "review": "Best pizza! Visit www.pizzapromo.com for discounts!"},
    {"location": "Furama Hotel", "review": "I had a weird dream yesterday"},
    {"location": "Sakura Sushi", "review": "Never been here, but I heard it’s terrible!"},
])

evaluated = evaluate_dataframe(df_example, include_raw=False)
evaluated

| | location | review | Decision | Primary Violation | Explanation |
|---|---|---|---|---|---|
| 0 | Pizza House | Best pizza! Visit www.pizzapromo.com for disco... | Flagged | No Advertisement | The review contains a link to the website www.... |
| 1 | Furama Hotel | I had a weird dream yesterday | Flagged | No Irrelevant Content | The review "I had a weird dream yesterday" is ... |
| 2 | Sakura Sushi | Never been here, but I heard it’s terrible! | Flagged | No Rant Without Visit | Although the review expresses a strongly criti... |
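
Once a batch is evaluated, grouping the flagged rows by policy makes the result easier to scan. The sketch below works on plain dict records so it runs without a model; with a real result you could obtain the same shape via `evaluated.to_dict("records")`. The helper name `group_flagged` and the hard-coded records are illustrative assumptions that mirror the three example rows.

```python
from collections import defaultdict

# Records shaped like the rows of the evaluated DataFrame (illustrative values).
records = [
    {"location": "Pizza House", "Decision": "Flagged", "Primary Violation": "No Advertisement"},
    {"location": "Furama Hotel", "Decision": "Flagged", "Primary Violation": "No Irrelevant Content"},
    {"location": "Sakura Sushi", "Decision": "Flagged", "Primary Violation": "No Rant Without Visit"},
]


def group_flagged(rows):
    """Map each violation name to the locations of the reviews flagged for it."""
    grouped = defaultdict(list)
    for row in rows:
        if row["Decision"].strip().lower().startswith("flag"):
            grouped[row["Primary Violation"]].append(row["location"])
    return dict(grouped)


for violation, locations in group_flagged(records).items():
    print(f"{violation}: {', '.join(locations)}")
```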



## 3) Batch Process a CSV (Path-based)
- Your CSV should contain columns for location and review (auto-detects common names like `location`, `place`, `review`, `text`, etc.).  
- If auto-detection fails, pass explicit column names.
- You can download the sample csv file we used [here](https://github.com/edwin-ljx/Filtering-the-Noise-ML-for-Trustworthy-Location-Reviews/blob/main/location_reviews.csv).
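
If you do not have a CSV at hand, a tiny file in the expected shape is enough to try the batch helper. The sketch below writes two rows with `location` and `review` headers to the system temp directory. The file name `polaris_sample_reviews.csv` and the helper `write_sample_csv` are assumptions made for illustration.

```python
import csv
import os
import tempfile

sample_rows = [
    {"location": "Cafe Aurora", "review": "Great cappuccino and relaxing atmosphere!"},
    {"location": "Pizza House", "review": "Best pizza! Visit www.pizzapromo.com for discounts!"},
]


def write_sample_csv(path: str, rows) -> str:
    """Write rows with the two headers the auto-detection looks for; return the path."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["location", "review"])
        writer.writeheader()
        writer.writerows(rows)
    return path


sample_path = write_sample_csv(
    os.path.join(tempfile.gettempdir(), "polaris_sample_reviews.csv"), sample_rows
)
print("Wrote:", sample_path)
```

The resulting path can then be passed as `input_path` to `evaluate_csv_to_csv`.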


In [5]:
# Example: Batch process from a CSV path
# 1) Put your CSV on disk and set the path below.
# 2) Optionally set location_col and review_col if auto-detect doesn't work.

input_csv_path = "/Users/scormon/Downloads/location_reviews.csv" # Input your file path 
output_csv_path = None  # or set a custom path
out_path = evaluate_csv_to_csv(input_csv_path, output_path=output_csv_path, location_col="location", review_col="review", include_raw=False) # change accordingly
print("Wrote:", out_path)


Wrote: /Users/scormon/Downloads/location_reviews_evaluated.csv



## 4) **Evaluate Predictions vs. Ground Truth**  
This section adds your evaluation script (decision/violation canonicalization, accuracy, and mismatch export) adapted for Jupyter.  
Use it to compare model outputs (e.g., from the batch step) against ground-truth labels.
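
To make the accuracy figure concrete, here is a toy calculation on four hand-written label pairs: accuracy is the share of rows where prediction and ground truth agree, and a counter over the pairs shows which confusions occur. This is an illustrative sketch, and `accuracy_and_confusion` is a hypothetical helper separate from the evaluation code in the next cell.

```python
from collections import Counter


def accuracy_and_confusion(pairs):
    """pairs: iterable of (ground_truth, prediction) labels."""
    pairs = list(pairs)
    correct = sum(1 for gt, pred in pairs if gt == pred)
    accuracy = correct / len(pairs) if pairs else 0.0
    return accuracy, Counter(pairs)


toy = [
    ("Valid", "Valid"),
    ("Flagged", "Flagged"),
    ("Flagged", "Valid"),    # a missed violation
    ("Flagged", "Flagged"),
]
acc, confusion = accuracy_and_confusion(toy)
print(f"{acc:.2%}")                     # 75.00%
print(confusion[("Flagged", "Valid")])  # 1
```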


In [14]:
import csv
from typing import Dict, List, Optional
import pandas as pd
from IPython.display import display

# --- Canonical maps (kept as data) ---
_CANONICAL_DECISION_MAP = {"valid": "Valid", "flagged": "Flagged"}
_CANONICAL_VIOLATION_MAP = {
    "no advertisement": "No Advertisement",
    "no irrelevant content": "No Irrelevant Content",
    "no rant without visit": "No Rant Without Visit",
    "none": "None",
}
# Optional: common aliases
_VIOLATION_ALIASES = {
    "advertising": "No Advertisement",
    "advertisement": "No Advertisement",
    "ads": "No Advertisement",
    "promo": "No Advertisement",
    "promotion": "No Advertisement",
    "irrelevant": "No Irrelevant Content",
    "off-topic": "No Irrelevant Content",
    "rant without visit": "No Rant Without Visit",
    "speculation": "No Rant Without Visit",
    "hearsay": "No Rant Without Visit",
    "": "None",
}

def _safe_strip(x) -> str:
    return "" if x is None or (isinstance(x, float) and pd.isna(x)) else str(x).strip()

def _norm(s: Optional[str]) -> str:
    s = _safe_strip(s)
    return " ".join(s.lower().split())

def canonicalize_decision(s: Optional[str]) -> str:
    key = _norm(s)
    if key.startswith("valid"):
        return "Valid"
    if key.startswith("flag"):
        return "Flagged"
    return _CANONICAL_DECISION_MAP.get(key, _safe_strip(s))

def canonicalize_violation(s: Optional[str]) -> str:
    key = _norm(s)
    if key in _CANONICAL_VIOLATION_MAP:
        return _CANONICAL_VIOLATION_MAP[key]
    if key in _VIOLATION_ALIASES:
        return _VIOLATION_ALIASES[key]
    # fuzzy contains
    if "advert" in key or "promo" in key:
        return "No Advertisement"
    if "irrelev" in key or "off-topic" in key or "off topic" in key or "offtopic" in key:
        return "No Irrelevant Content"
    if "visit" in key or "hearsay" in key or "speculat" in key or "rant" in key:
        return "No Rant Without Visit"
    return "None" if key == "" else _safe_strip(s)

def load_csv_rows(path: str) -> List[Dict]:
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        return list(csv.DictReader(f))

def evaluate_predictions(
    rows: List[Dict],
    pred_dec_col: str,
    pred_vio_col: str,
    gt_dec_col: str,
    gt_vio_col: str,
    id_col: Optional[str] = None,
    mismatches_out: str = "mismatches.csv",
    review_col_candidates: Optional[List[str]] = None,
    preview_rows: int = 50,
    show_preview: bool = True,
):
    """
    Evaluate predictions vs ground truth and (optionally) show/save mismatches.
    """
    review_col_candidates = review_col_candidates or [
        "review", "Review", "text", "Text", "content", "Content"
    ]

    def _pick_review(r: Dict) -> str:
        for c in review_col_candidates:
            if c in r and _safe_strip(r.get(c)):
                return _safe_strip(r.get(c))
        return ""

    total = len(rows)
    correct_dec, correct_vio = 0, 0
    mismatches: List[Dict[str, str]] = []

    for r in rows:
        pid = r.get(id_col, "") if id_col else ""

        pred_dec = canonicalize_decision(r.get(pred_dec_col, ""))
        gt_dec   = canonicalize_decision(r.get(gt_dec_col, ""))

        pred_vio = canonicalize_violation(r.get(pred_vio_col, ""))
        gt_vio   = canonicalize_violation(r.get(gt_vio_col, ""))

        # If decision is Valid, violation must be None
        if pred_dec == "Valid":
            pred_vio = "None"
        if gt_dec == "Valid":
            gt_vio = "None"

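        if pred_dec == gt_dec:
            correct_dec += 1
        if pred_vio == gt_vio:
            correct_vio += 1
        # Record the row when either the decision or the violation disagrees,
        # so a decision-only mismatch is not silently dropped.
        if pred_dec != gt_dec or pred_vio != gt_vio:
            mismatches.append({
                "id": _safe_strip(pid),
                "pred_decision": pred_dec,
                "gt_decision": gt_dec,
                "pred_violation": pred_vio,
                "gt_violation": gt_vio,
                "review": _pick_review(r),
            })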

    # Metrics
    decision_acc  = (correct_dec / total) if total else 0.0
    violation_acc = (correct_vio / total) if total else 0.0

    # Report
    print("===== Evaluation Summary =====")
    print(f"Total rows:           {total}")
    print(f"Decision accuracy:    {correct_dec}/{total} = {decision_acc:.2%}")
    print(f"Violation accuracy:   {correct_vio}/{total} = {violation_acc:.2%}")

    # Save & preview mismatches
    mismatches_df = pd.DataFrame(mismatches)
    if not mismatches_df.empty:
        mismatches_df.to_csv(mismatches_out, index=False, encoding="utf-8")
        print(f"\n⚠️  Found {len(mismatches)} mismatches. Saved to: {mismatches_out}")
        if show_preview:
            print(f"📋 Preview (first {preview_rows} rows):")
            display(mismatches_df.head(preview_rows))
    else:
        print("\n✅ No mismatches found.")

    return {
        "total": total,
        "decision_accuracy": (correct_dec, total, decision_acc),
        "violation_accuracy": (correct_vio, total, violation_acc),
        "mismatches_path": mismatches_out if not mismatches_df.empty else None,
        "mismatches_df": mismatches_df if not mismatches_df.empty else pd.DataFrame(
            columns=["id","pred_decision","gt_decision","pred_violation","gt_violation","review"]
        ),
    }



### Run an evaluation on a CSV
1. Set `eval_csv_path` to your CSV file that contains predictions and ground truth.  
2. Adjust the column names in the call to match your file. The example cell below uses:
   - Predictions: `Decision`, `Primary Violation`
   - Ground truth: `correct_decision`, `correct_labels`
3. (Optional) Provide an ID column name if available.
4. Run the cell.


You can download the sample csv file we used [here](https://github.com/edwin-ljx/Filtering-the-Noise-ML-for-Trustworthy-Location-Reviews/blob/main/location_reviews_evaluated.csv).
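
Because `evaluate_predictions` reads each column with `r.get(col, "")`, a misspelled column name yields empty values rather than an error, and the reported accuracy becomes misleading. A quick header check before the run avoids that. The sketch below uses an in-memory CSV so it runs on its own; `missing_columns` is a hypothetical helper, and the sample header deliberately omits `correct_labels`.

```python
import csv
import io


def missing_columns(fieldnames, required):
    """Return the required column names that are absent from the CSV header."""
    present = set(fieldnames or [])
    return [c for c in required if c not in present]


# In-memory stand-in for an evaluation CSV (illustrative content).
sample = io.StringIO(
    "review_id,review,Decision,Primary Violation,correct_decision\n"
    "R001,Nice place,Valid,None,Valid\n"
)
reader = csv.DictReader(sample)
required = ["Decision", "Primary Violation", "correct_decision", "correct_labels"]
print(missing_columns(reader.fieldnames, required))  # ['correct_labels']
```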


In [15]:
# Example (edit and run):
eval_csv_path = "/Users/scormon/Downloads/location_reviews_evaluated.csv" # Input file path
rows = load_csv_rows(eval_csv_path)
evaluate_predictions(
    rows,
    pred_dec_col="Decision",
    pred_vio_col="Primary Violation",
    gt_dec_col="correct_decision",
    gt_vio_col="correct_labels",
    id_col="review_id",  # or e.g. "id"
    mismatches_out="mismatches.csv",
)


===== Evaluation Summary =====
Total rows:           20
Decision accuracy:    18/20 = 90.00%
Violation accuracy:   14/20 = 70.00%

⚠️  Found 6 mismatches. Saved to: mismatches.csv
📋 Preview (first 50 rows):


| | id | pred_decision | gt_decision | pred_violation | gt_violation | review |
|---|---|---|---|---|---|---|
| 0 | R006 | Flagged | Flagged | No Rant Without Visit | No Irrelevant Content | I love my new phone’s camera—portrait mode is ... |
| 1 | XIC1 | Valid | Flagged | None | No Irrelevant Content | This place reminds me of my favorite video gam... |
| 2 | R018 | Flagged | Flagged | No Advertisement | No Rant Without Visit | I’ve never brought my pet here, but prices loo... |
| 3 | R011 | Valid | Flagged | None | No Rant Without Visit | Never visited this place, but I’m giving 1 sta... |
| 4 | XIC2 | Flagged | Flagged | No Rant Without Visit | No Irrelevant Content | I just bought new running shoes, can’t wait to... |
| 5 | XIC4 | Flagged | Flagged | No Rant Without Visit | No Irrelevant Content | The stock market is crazy these days, thought ... |


{'total': 20,
 'decision_accuracy': (18, 20, 0.9),
 'violation_accuracy': (14, 20, 0.7),
 'mismatches_path': 'mismatches.csv',
 'mismatches_df':      id pred_decision gt_decision         pred_violation  \
 0  R006       Flagged     Flagged  No Rant Without Visit   
 1  XIC1         Valid     Flagged                   None   
 2  R018       Flagged     Flagged       No Advertisement   
 3  R011         Valid     Flagged                   None   
 4  XIC2       Flagged     Flagged  No Rant Without Visit   
 5  XIC4       Flagged     Flagged  No Rant Without Visit   
 
             gt_violation                                             review  
 0  No Irrelevant Content  I love my new phone’s camera—portrait mode is ...  
 1  No Irrelevant Content  This place reminds me of my favorite video gam...  
 2  No Rant Without Visit  I’ve never brought my pet here, but prices loo...  
 3  No Rant Without Visit  Never visited this place, but I’m giving 1 sta...  
 4  No Irrelevant Content  I jus