# Bias Recognition

Bias = a prejudice or inclination for or against something, often an unfair one, or a systematic distortion of results or perceptions.

#### Advanced Methods for Institutional Bias Detection

1) Source & Provenance Analysis (who and how)
- What analysts do: check the outlet’s ownership, funders, editorial line, staff expertise, and historical bias; verify authorship and publication metadata; track a story’s chain of custody.
- Why: bias often flows from ownership incentives, sponsorship, and editorial policy. Fact-checkers and media monitors systematically record source metadata and compare it across outlets.
> outlet ownership / funding disclosures  
author history and beat expertise  
use of unnamed/anonymous sources (frequency & pattern)  
links / citations used (source diversity)




2) Claim Extraction → Fact-Check Pipeline
- Goal: Extract verifiable factual claims and auto-match them to fact-check databases.
- Libraries: spaCy or Stanza (for NER and dependency parsing), transformers (for claim detection), Elasticsearch (for fuzzy search over the fact-check KB).
> NER + dependency parsing to extract claims → normalize claims → search fact-check KB (fuzzy match) → present candidate matches.
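
The extract → normalize → fuzzy-match steps can be sketched with the standard library alone. This is a minimal illustration under loud assumptions: a "contains a digit" heuristic stands in for NER and dependency parsing, `difflib.SequenceMatcher` stands in for an Elasticsearch fuzzy query, and `FACT_CHECK_KB` is a hypothetical in-memory knowledge base with made-up entries.

```python
import re
from difflib import SequenceMatcher

# Hypothetical stand-in for a fact-check knowledge base (entries are invented).
FACT_CHECK_KB = [
    {"claim": "unemployment fell to 4 percent in 2023", "verdict": "true"},
    {"claim": "the city budget doubled in five years", "verdict": "false"},
]


def extract_claims(text):
    """Keep sentences containing a digit: a crude proxy for check-worthy claims."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if re.search(r"\d", s)]


def normalize(claim):
    """Lowercase, spell out '%', strip punctuation, collapse whitespace."""
    claim = claim.lower().replace("%", " percent")
    claim = re.sub(r"[^a-z0-9 ]", " ", claim)
    return " ".join(claim.split())


def candidate_matches(claim, kb=FACT_CHECK_KB, threshold=0.6):
    """Return (score, entry) pairs at or above the threshold, best match first."""
    norm = normalize(claim)
    scored = [(SequenceMatcher(None, norm, entry["claim"]).ratio(), entry) for entry in kb]
    kept = [(round(score, 2), entry) for score, entry in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


article = "Officials were upbeat. Unemployment fell to 4% in 2023."
for claim in extract_claims(article):
    print(claim, candidate_matches(claim))
```

The matches are only candidates: the threshold of 0.6 is arbitrary, and a human reviewer still decides whether a retrieved fact-check actually addresses the claim.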

3) Cross-Outlet Framing Comparator (event-level)
- Goal: For a single event, compare topic distribution, named actors mentioned, sentiment by actor, and omitted facts across outlets.
- Libraries: transformers (BERT / XLM-RoBERTa), NLTK, gensim (topic modeling), networkx, matplotlib/plotly.
> align articles by event (using dates + keywords) → extract topics & entities → compute per-outlet distributions and divergence metrics (KL divergence) → visualize.
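
The divergence step can be illustrated on entity mentions alone. The sketch below assumes the articles are already aligned to one event and that entity extraction has produced a list of mentions per outlet; the outlet data and function names are hypothetical. Add-alpha smoothing is an added assumption that keeps KL divergence finite when one outlet never mentions an actor.

```python
import math
from collections import Counter


def distribution(mentions, vocabulary, alpha=1.0):
    """Add-alpha smoothed probability of each vocabulary item among the mentions."""
    counts = Counter(mentions)
    total = sum(counts[v] for v in vocabulary) + alpha * len(vocabulary)
    return {v: (counts[v] + alpha) / total for v in vocabulary}


def kl_divergence(p, q):
    """KL(P || Q) in bits; P and Q must share keys and contain no zeros."""
    return sum(p[k] * math.log2(p[k] / q[k]) for k in p)


# Hypothetical entity mentions extracted from two outlets covering one event.
outlet_a = ["minister", "minister", "union"]
outlet_b = ["union", "union", "minister"]
vocabulary = sorted(set(outlet_a) | set(outlet_b))

p = distribution(outlet_a, vocabulary)
q = distribution(outlet_b, vocabulary)
print(kl_divergence(p, q), kl_divergence(q, p))
```

KL divergence is asymmetric, so KL(P || Q) and KL(Q || P) generally differ; reporting both directions (or a symmetric variant) avoids treating one outlet as the implicit baseline.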

4) Stance Detection Classifier
- Goal: Train a model to label sentences as pro/anti/neutral toward a target (policy/person).
- Libraries: transformers (fine-tune RoBERTa/BERT), scikit-learn, datasets (Hugging Face).
> fine-tune transformer for stance → evaluate cross-target generalization → test on news sentences.
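
Fine-tuning depends on model and data choices these notes do not specify, but the cross-target evaluation step can be sketched independently. The version below assumes each example is a dict with hypothetical `target` and `label` keys, holds out one target at a time, and scores predictions with macro-F1. `majority_baseline` is a placeholder where the fine-tuned model would go, not a real stance classifier.

```python
from collections import Counter, defaultdict

LABELS = ("pro", "anti", "neutral")


def leave_one_target_out(examples):
    """Yield (held_out_target, train, test) so each target is unseen in training once."""
    by_target = defaultdict(list)
    for example in examples:
        by_target[example["target"]].append(example)
    for target in by_target:
        train = [ex for t, group in by_target.items() if t != target for ex in group]
        yield target, train, by_target[target]


def macro_f1(gold, predicted, labels=LABELS):
    """Unweighted mean of per-label F1; a label never seen or predicted scores 0."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, predicted))
        fp = sum(g != label and p == label for g, p in zip(gold, predicted))
        fn = sum(g == label and p != label for g, p in zip(gold, predicted))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


def majority_baseline(train):
    """Placeholder model: always predicts the most common training label."""
    label = Counter(ex["label"] for ex in train).most_common(1)[0][0]
    return lambda sentence: label
```

A large gap between in-target and held-out-target macro-F1 suggests the model has learned target-specific vocabulary rather than stance.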

5) Lexical Bias Detector (hedging & loaded language)

- Goal: Count and score rhetorical devices (hedges, intensifiers, opinion verbs, passive voice) indicating slant.
- Libraries: spaCy (tagging, dependency), textstat, lexicon libs.
> compute a per-article feature vector (hedges per 1k tokens, passive ratio, emotive adjectives) → train a classifier to predict left/right/centrist labels, or compare feature distributions across outlets.
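
A per-article feature vector of this kind can be approximated without a parser. In the sketch below the lexicons are tiny illustrative samples, not curated resources, and the passive-voice check is a regex heuristic (a form of "be" followed by a word ending in -ed or -en) standing in for spaCy's dependency labels; it will miss irregular participles and produce some false positives.

```python
import re

# Tiny illustrative lexicons; a real detector would load curated lists.
HEDGES = {"may", "might", "could", "reportedly", "allegedly", "apparently", "possibly"}
INTENSIFIERS = {"very", "extremely", "utterly", "totally", "deeply"}

# Crude passive cue: a form of "be" followed by a word ending in -ed or -en.
PASSIVE = re.compile(r"\b(?:is|are|was|were|been|being|be)\s+\w+(?:ed|en)\b", re.IGNORECASE)


def lexical_features(text):
    """Return hedge and intensifier rates per 1,000 tokens and the share of passive sentences."""
    tokens = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    per_1k = 1000 / max(len(tokens), 1)
    passive_sentences = sum(bool(PASSIVE.search(s)) for s in sentences)
    return {
        "hedges_per_1k": sum(t in HEDGES for t in tokens) * per_1k,
        "intensifiers_per_1k": sum(t in INTENSIFIERS for t in tokens) * per_1k,
        "passive_ratio": passive_sentences / max(len(sentences), 1),
    }


sample = "The plan was rejected by the council. Critics say it might be extremely costly."
print(lexical_features(sample))
```

Rates are normalized per 1,000 tokens so that long and short articles are comparable before any classifier or cross-outlet comparison is applied.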



6) Source-Network / Amplification Graph

- Goal: Build and visualize the graph of which sources an article cites and which social accounts amplify it. Detect echo chambers.
- Libraries: NetworkX, SNAP/igraph, Tweepy (X/Twitter API), BeautifulSoup. CrowdTangle formerly covered FB/IG but was shut down by Meta in August 2024; the Meta Content Library is its successor (if access).
> extract outgoing links and social shares → build graph → compute modularity, centrality, clustering → detect insular clusters.
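
The graph metrics can be sketched without a graph library to show what is being computed. The example assumes links have already been extracted as undirected `(source, target)` pairs and that a community assignment is given (in practice a community-detection routine from NetworkX or igraph would produce it). The node names are hypothetical. Modularity follows the standard formula Q = Σ_c [L_c / m − (d_c / 2m)²], where L_c is the number of edges inside community c, d_c the total degree of its nodes, and m the number of edges.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other nodes each node is directly linked to (undirected)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    others = max(len(neighbours) - 1, 1)
    return {node: len(linked) / others for node, linked in neighbours.items()}


def modularity(edges, community_of):
    """Modularity Q of a given partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = defaultdict(int)  # L_c: edges with both ends inside community c
    degree = defaultdict(int)    # d_c: summed degree of the nodes in community c
    for a, b in edges:
        degree[community_of[a]] += 1
        degree[community_of[b]] += 1
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical graph: two tightly linked groups joined by a single bridge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
community_of = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}

print(modularity(edges, community_of))
print(degree_centrality(edges))
```

A modularity well above zero means links stay mostly inside the assigned groups, which is one signal of an insular cluster; the bridge nodes (`c` and `d` here) are where cross-group exposure happens.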


7) Multimodal Forensics (image/video)

- Goal: Run quick forensic checks on images embedded in news (reverse image search, error level analysis (ELA)).
- Libraries: OpenCV, Pillow, imagehash, requests, TinEye / Google Reverse Image Search (scripted), ffmpeg for video frames.
> extract images → check EXIF → compute image hash and run a reverse image search → run ELA to detect manipulation → flag suspicious items.
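
The image-hash step is easiest to understand from a bare-bones average hash. The sketch below is a pure-Python illustration that assumes the image has already been decoded, converted to grayscale, and downscaled to a small grid (Pillow and imagehash would normally handle that); the pixel grids are made-up values. It covers hashing and comparison only, not EXIF inspection, reverse search, or ELA.

```python
def average_hash(gray):
    """Bit string with 1 where a pixel is brighter than the grid's mean, else 0."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return "".join("1" if p > mean else "0" for p in pixels)


def hamming(hash_a, hash_b):
    """Number of differing bits between two equal-length hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Hypothetical 2x2 grayscale grids (real hashes typically use 8x8).
original = [[10, 200], [220, 30]]
brightened = [[p + 20 for p in row] for row in original]  # same picture, re-exposed
unrelated = [[200, 10], [30, 220]]

print(hamming(average_hash(original), average_hash(brightened)))  # 0
print(hamming(average_hash(original), average_hash(unrelated)))   # 4
```

A small Hamming distance suggests the same underlying picture despite recompression or brightness changes, which is what makes perceptual hashes useful for spotting recycled images.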

    


8) LLM-Assisted Bias Explainability Tool

- Goal: Use LLMs (with careful prompt engineering) to produce an explainable summary of potential bias, highlighting omitted facts, framing choices, loaded words, and source gaps. Then verify the LLM's suggestions with rule-based checks.
- Libraries: transformers (for local LLMs) or a hosted LLM API, plus prompt-tooling libraries.
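
One simple rule-based check is to confirm that every phrase the LLM flags actually occurs in the article, which filters out hallucinated quotations. The sketch below assumes the LLM output has already been parsed into a list of phrases; the function name and the sample text are hypothetical, and no LLM call is shown.

```python
def verify_flagged_phrases(article_text, flagged_phrases):
    """Split phrases into those found in the article and those that are not.

    Matching is case-insensitive and ignores differences in whitespace.
    """
    def squash(text):
        return " ".join(text.lower().split())

    haystack = squash(article_text)
    verified, rejected = [], []
    for phrase in flagged_phrases:
        needle = squash(phrase)
        if needle and needle in haystack:
            verified.append(phrase)
        else:
            rejected.append(phrase)
    return verified, rejected


article = "The radical scheme was slammed by critics."
llm_flags = ["radical scheme", "disastrous policy"]  # second phrase is not in the text
print(verify_flagged_phrases(article, llm_flags))
```

Only the verified phrases should be highlighted for the reader; rejected ones indicate the LLM paraphrased or invented wording.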

9) Example
> Input: article URL.  
Actions: scrape text → compute sentiment + hedging score + entity frequency → compare entity sentiment to the national average (across 10 outlets).  
Output: simple bias score + highlighted biased sentences.
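
The comparison step can be sketched as a z-score: how far one outlet's sentiment toward an entity sits from the cross-outlet average, in units of the spread across outlets. This assumes per-outlet sentiment scores in [-1, 1] have already been computed by some sentiment model; the outlet names and scores are invented, and using a z-score as the "simple bias score" is an illustrative choice rather than something the notes prescribe.

```python
from statistics import mean, pstdev


def sentiment_deviation(outlet_scores, outlet):
    """z-score of one outlet's sentiment toward an entity relative to all outlets."""
    values = list(outlet_scores.values())
    spread = pstdev(values)
    if spread == 0:
        return 0.0  # every outlet agrees, so there is no deviation to report
    return (outlet_scores[outlet] - mean(values)) / spread


# Hypothetical mean sentiment toward one entity, per outlet.
scores = {"Outlet A": -0.5, "Outlet B": 0.0, "Outlet C": 0.5}
for name in scores:
    print(name, round(sentiment_deviation(scores, name), 2))
```

A deviation near zero means the outlet is in line with its peers on that entity. Large positive or negative values mark coverage worth a closer manual read, not proof of bias.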