
VolkanSah/Anti-Dump-Algorithm


🧠 Anti-Dump-Algorithm

The ADI Framework: A Lesson in Resource Orchestration

Weeding out the nonsense and fostering clarity. We measure "Dumpiness" by quantifying Noise vs. Effort, Context, and Details. 😅

Project's Core: ADI – The Anti-Dump Index

This repository is an educational simulation tool (EDU) and a lesson in API economics.

The ADI idea provides the mathematical solution to a critical, costly problem in modern AI development: resource waste and service latency caused by vague, low-effort inputs. Our goal is to maximize the return on investment (ROI) of expensive Large Language Models (LLMs) by quantifying the quality of each request and intelligently controlling routing.


The Core Problem: Why My Wallet Started Crying (The Developer's Pain)

When you're building an app with expensive AI, you quickly learn a hard truth: users send you all kinds of "dumpy" inputs. Vague, low-effort requests that cost you money because your premium AI models still have to process them. It's the digital equivalent of someone shouting "ASAP!!!" at the bouncer. For me, this "Dummheit" (stupidity) started hitting my wallet directly, and I had to build a solution.

The Solution: My Digital 🇹🇷-Bouncer from Germany 😄

Inspired by the concept of a strict gatekeeper, I created a mathematical framework to act as a Quality Gate for API calls. This is the Anti-Dump Index (ADI). Its job is simple: check the quality of every single request at the door. If it's "dump" – a waste of time and money – my bouncer has one simple rule:

"Ej, du kommst hier net rein!"

This phrase (roughly: "Hey, you're not getting in here!") is a nostalgic joke and the perfect metaphor for the system's role: Protecting your financial and computational resources.

This project isn't a full app; it's a Showcase for Resource Orchestration Logic. It's the technical manifestation of a developer's frustration, turned into a powerful, cost-saving solution.


Purpose: The Mathematics Behind the Judgement

The Anti-Dump Algorithm calculates the ADI (Anti-Dump Index) to evaluate input quality by measuring the trade-off between Noise and Actionable Detail.

$$ \text{ADI} = \frac{w_N \cdot \text{Noise} - (w_E \cdot \text{Effort} + w_B \cdot \text{Bonus})}{w_C \cdot \text{Context} + w_D \cdot \text{Details} + w_P \cdot \text{Penalty}} $$

Key Parameters: Your Thinking Process Quantified

| Parameter | Description | Example |
|---|---|---|
| Noise | Irrelevant words/phrases (cost drivers) | "ASAP", "???" |
| Effort | Clarity/structure (your time saved) | Complete sentences, formatting |
| Context | Background information (OS, framework) | "Python 3.9 on Windows" |
| Details | Technical depth (solution prerequisites) | Error logs, code snippets |
| Bonus | Positive elements (value accelerators) | Code blocks, precise terms |
| Penalty | Negative elements (frustration drivers) | ALL CAPS, excessive "!!!" |

Table of Contents (Lesson Plan)
  1. Core Concepts
  2. Formula Explained
  3. Quality Zones
  4. Advanced Metrics
  5. Real-World Examples
  6. Practical Implementation: API Routing
  7. Integration Guide
  8. Full Code
  9. Extended Logic
  10. FAQs
  11. License

1. Core Concepts

Why ADI Matters (The Economic Impact)

  • Vague requests waste resources: Inputs like "Help plz urgent!!!" cost money and reduce throughput.
  • Missing details delay solutions: Without error messages or code, the model needs costly follow-up iterations.
  • AI costs accumulate: Every low-value request still costs money; the ADI gate stops that spend before the request reaches the model.

How ADI Works (The Simulation Logic)

  1. Quantify: Measure input components (Noise, Effort, etc.).
  2. Calculate: Compute the ADI score using the weighted formula.
  3. Classify: Route the request based on the quality zones:
    • 🟥 Dump Zone (ADI > 1): Reject (High resource risk).
    • 🟨 Gray Area (0 ≤ ADI ≤ 1): Review (Medium priority).
    • 🟩 Genius Zone (ADI < 0): Prioritize (High quality; Route to premium model).
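The classification step above can be sketched as a small function. This is a minimal illustration, not code from adi.py; the function name `classify_zone` and the zone labels are assumptions chosen for this example.

```python
def classify_zone(adi: float) -> str:
    """Map an ADI score to one of the three quality zones.

    Hypothetical helper: the name and return labels are illustrative.
    """
    if adi > 1:
        return "dump"      # Dump Zone: reject
    if adi < 0:
        return "genius"    # Genius Zone: prioritize
    return "gray"          # Gray Area (0 <= ADI <= 1): review


for score in (1.8, 0.4, -0.58):
    print(score, classify_zone(score))
```

The boundaries 0 and 1 themselves fall into the Gray Area, matching the inclusive range given in the list.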

2. Formula Explained

Base Formula (Simplified)

$$\text{ADI} = \frac{\text{Noise} - \text{Effort}}{\text{Context} + \text{Details}}$$

Full Formula (Weighted)

$$\text{ADI} = \frac{w_N \cdot \text{Noise} - (w_E \cdot \text{Effort} + w_B \cdot \text{Bonus})}{w_C \cdot \text{Context} + w_D \cdot \text{Details} + w_P \cdot \text{Penalty}}$$

Weights can be customized for different use cases:

weights = {
    "noise": 1.0, 
    "effort": 2.0, 
    "context": 1.5,
    "details": 1.5,
    "bonus": 0.5,
    "penalty": 1.0
}
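To make the weighted formula concrete, here is a minimal sketch that evaluates it for a dictionary of already-measured metrics. The function name `compute_adi` is hypothetical, and treating a zero denominator as infinite dumpiness is an assumption made for this example; the source does not say how adi.py handles that case.

```python
import math
from typing import Dict, Optional

DEFAULT_WEIGHTS = {
    "noise": 1.0, "effort": 2.0, "context": 1.5,
    "details": 1.5, "bonus": 0.5, "penalty": 1.0,
}


def compute_adi(metrics: Dict[str, float],
                weights: Optional[Dict[str, float]] = None) -> float:
    """Evaluate the weighted ADI formula; metrics that are missing count as 0."""
    w = DEFAULT_WEIGHTS if weights is None else weights

    def term(key: str) -> float:
        return w[key] * metrics.get(key, 0.0)

    numerator = term("noise") - (term("effort") + term("bonus"))
    denominator = term("context") + term("details") + term("penalty")
    if denominator == 0:
        # Assumption: no context, details, or penalty at all -> treat as a dump.
        return math.inf
    return numerator / denominator


# Illustrative metric values, not taken from the examples in this README.
print(compute_adi({"noise": 0.2, "effort": 0.6, "context": 0.5, "details": 0.5}))
```

Returning `math.inf` keeps the result comparable with the zone thresholds without raising an exception on empty inputs.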

3. Quality Zones

Interpretation Guide

| Zone | ADI Range | Action | Characteristics |
|---|---|---|---|
| Dump Zone | > 1 | Reject | High noise, low effort, missing details |
| Gray Area | 0 to 1 | Review | Partial context, some effort needed |
| Genius Zone | < 0 | Prioritize | Clear, contextualized, detailed |

4. Advanced Metrics

4.1 Typo-Adjusted Noise

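$$\text{Noise}_{\text{adj}} = \text{Noise} \cdot \left(1 - \frac{\text{Details}}{\text{Total Words}}\right)$$

A simple regex heuristic estimates the share of typo-like tokens:

import re

def calculate_typos(text):
    # Heuristic: 1-2 letter words, or tokens containing non-alphanumeric characters
    typo_pattern = r'\b[a-zA-Z]{1,2}\b|\b[^\s]+[^a-zA-Z0-9\s]+\b'
    typos = len(re.findall(typo_pattern, text))
    # Normalize by word count; max(..., 1) avoids division by zero on empty input
    return typos / max(len(text.split()), 1)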

4.2 Substance Score

Detect "fancy but empty" inputs:

$$\text{Substance} = \frac{\text{Effort} + \text{Details}}{\text{Noise} + \text{PseudoTerms} + 1}$$

4.3 Gradient Analysis

Measure sensitivity to improvements:

$$\nabla\text{ADI} = \frac{\partial \text{ADI}}{\partial (\text{Effort}, \text{Details})}$$

5. Real-World Examples

5.1 Disaster Input

"Help plssss! My code doesn't work. Fix it! ASAP!!!"

noise = 0.75   # 6/8 words irrelevant
effort = 0.1    # No structure
context = 0     # No environment info
details = 0     # No technical details

ADI = (0.75 - 0.1) / (0 + 0) = ∞  # 🟥 Instant rejection

5.2 Medium Quality

"Python script throws KeyError when accessing dictionary"

noise = 0.1    # Minimal filler
effort = 0.8    # Clear statement
context = 0.7   # Language specified
details = 0.5   # Error type identified

ADI = (0.1 - 0.8) / (0.7 + 0.5) = -0.58  # 🟩 Good candidate

5.3 Perfect Input

"Getting KeyError in Python 3.9 when accessing missing dictionary keys. Code example: print(my_dict['missing'])"

noise = 0.0     # No irrelevant words
effort = 1.0    # Well-structured
context = 1.0   # Python version specified
details = 1.0   # Code example provided
bonus = 0.5     # Code formatting

ADI = (0 - (2.0*1.0 + 0.5*0.5)) / (1.5*1.0 + 1.5*1.0) = -0.75  # 🟩 Prioritize

6. Practical Implementation: Intelligent API Routing

This section is critical. The included example_app.py demonstrates how the ADI translates into a resource management strategy. The core value is not just the score, but its use as a Quality Gate for your most valuable LLMs.

6.1 The ADI Routing Workflow (The Bouncer's Decision Tree)

The ADI score informs the internal routing decision:

  1. Rejection: If the ADI score is in the Dump Zone (ADI > 1.0), the request is rejected immediately.
  2. Prioritization: If the score is in the Genius Zone (ADI < 0), the request is sent to a Premium, Deep Analysis Model (e.g., DeepSeek) to maximize value.
  3. Specialization: For inputs in the Gray Area, the ADI works with simple Content Filtering to send the request to the most specialized, cost-effective model (e.g., a programming-focused model like Claude).

6.2 Code Snippet: The Core Routing Logic

The following code from example_app.py is the heart of the ADI's resource management simulation:

# The core ADI routing logic.
if adi_value > 1.0:
    # High dumpiness: Rejection (Saves processing resources)
    response_text = reject_processing(input_text)
    api_used = "Rejection"
elif adi_value < 0:
    # High-quality input: Route to a premium model.
    response_text = deepseek_processing(input_text)
    api_used = "DeepSeek (Deep Analysis)"
# ... (further content-based logic)
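The snippet above elides the Gray Area branch. A self-contained sketch of the whole decision tree might look like the following. The handler bodies are stand-ins that make no API calls, and `code_model_processing`, `default_processing`, and the keyword list `CODE_HINTS` are assumptions invented for this illustration; they are not taken from example_app.py.

```python
from typing import Tuple


def reject_processing(text: str) -> str:
    return "Rejected: please add context and technical details."


def deepseek_processing(text: str) -> str:
    return f"[premium model] {text}"        # stand-in, no network call


def code_model_processing(text: str) -> str:
    return f"[code-focused model] {text}"   # hypothetical handler


def default_processing(text: str) -> str:
    return f"[default model] {text}"        # hypothetical handler


# Hypothetical keyword filter for the Gray Area.
CODE_HINTS = ("error", "exception", "traceback", "def ", "class ")


def route(adi_value: float, input_text: str) -> Tuple[str, str]:
    """Return (response_text, api_used) for a scored request."""
    if adi_value > 1.0:
        return reject_processing(input_text), "Rejection"
    if adi_value < 0:
        return deepseek_processing(input_text), "DeepSeek (Deep Analysis)"
    # Gray Area: pick a specialized model using a simple content filter.
    if any(hint in input_text.lower() for hint in CODE_HINTS):
        return code_model_processing(input_text), "Code model"
    return default_processing(input_text), "Default model"


print(route(0.5, "I get a KeyError exception in my script")[1])
```

Returning the pair instead of assigning to outer variables keeps the routing decision easy to test in isolation.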

Use Cases

| Domain | Application |
|---|---|
| Support Systems | Auto-filter low-quality tickets |
| Education | Grade essay substance vs. fluff |
| Recruitment | Screen application quality |
| Forums | Reward high-quality contributions |

7. Integration Guide

API Quality Gating

from adi import DumpindexAnalyzer

def route_request(input_text):
    analyzer = DumpindexAnalyzer()
    result = analyzer.analyze(input_text)
    
    if result['adi'] > 1:
        # Dump Zone: this variant downgrades to a cheap model instead of rejecting outright
        return gpt3_process(input_text)
    elif result['adi'] < 0:
        # Use high-quality model for valuable input
        return gpt4_process(input_text)
    else:
        # Standard processing
        return default_process(input_text)

Expected Output

{
  "adi": -0.92,
  "metrics": {
    "noise": 0.05,
    "effort": 0.91,
    "context": 0.85,
    "details": 0.78,
    "bonus": 0.4,
    "penalty": 0.1
  },
  "diagnosis": "High-quality input: Contains code example and version details",
  "suggestions": [
    "Add error log for even better analysis"
  ]
}

8. Full Code

adi.py

This file contains the implementation of the Anti-Dump Algorithm. It includes functions to calculate noise, effort, context, details, bonus factors, and penalty factors, as well as to compute the ADI. You can use the ADI as follows:

from adi import DumpindexAnalyzer

# Initialize the ADI analyzer
analyzer = DumpindexAnalyzer()

View adi.py Source Code

example_app.py

This file demonstrates how to use the adi.py implementation in a simple Flask application. It includes endpoints to analyze input text and return the ADI and recommendations.

View example_app.py Source Code


9. Extended Logic

9.1 Typo Tolerance System

Adjusts for error-proneness without penalizing non-native speakers:

def calculate_typos(self, text: str) -> float:
    """Calculate typo percentage in text"""
    words = text.split()
    total_words = len(words)
    typo_pattern = r'\b[a-zA-Z]{1,2}\b|\b[^\s]+[^a-zA-Z0-9\s]+\b'
    typos = len(re.findall(typo_pattern, text))
    return typos / max(total_words, 1)

9.2 Substance Profiler

Detects "pseudo-competent" inputs that sound sophisticated but lack substance:

def calculate_substance_score(self, text: str) -> float:
    """Detect fancy but empty inputs"""
    pseudo_terms = r'\b(optimal|synergy|innovative|disruptive|synergize)\b'
    pseudo_count = len(re.findall(pseudo_terms, text.lower()))
    
    return (self.calculate_effort(text) + self.calculate_details(text)) / \
           (self.calculate_noise(text) + pseudo_count + 1)

9.3 Adaptive Noise Calculation

Reduces noise impact when sufficient details are present:

def calculate_adjusted_noise(self, text: str) -> float:
    """Adjust noise based on detail density"""
    base_noise = self.calculate_noise(text)
    detail_score = self.calculate_details(text)
    total_words = len(text.split())
    
    return base_noise * (1 - detail_score / max(total_words, 1))

9.4 Anti-Dump Gradient

Measures sensitivity to input improvements:

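$$\nabla\text{ADI} = \begin{bmatrix} \frac{\partial \text{ADI}}{\partial \text{Effort}} \\ \frac{\partial \text{ADI}}{\partial \text{Details}} \end{bmatrix} = \begin{bmatrix} -\frac{w_E}{D} \\ -\frac{w_D \left(w_N N - w_E E - w_B B\right)}{D^2} \end{bmatrix}$$

where $D = w_C \cdot \text{Context} + w_D \cdot \text{Details} + w_P \cdot \text{Penalty}$ is the denominator of the ADI formula, and $N$, $E$, $B$ denote Noise, Effort, and Bonus.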
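These sensitivities can also be estimated numerically, which is a useful sanity check when weights change. The sketch below uses central finite differences on the weighted formula; the helper names `adi` and `sensitivity` and the metric values are illustrative assumptions, not part of adi.py.

```python
W = {"noise": 1.0, "effort": 2.0, "context": 1.5,
     "details": 1.5, "bonus": 0.5, "penalty": 1.0}


def adi(m: dict) -> float:
    """Weighted ADI for a full metrics dict (denominator assumed non-zero)."""
    num = W["noise"] * m["noise"] - (W["effort"] * m["effort"] + W["bonus"] * m["bonus"])
    den = W["context"] * m["context"] + W["details"] * m["details"] + W["penalty"] * m["penalty"]
    return num / den


def sensitivity(m: dict, key: str, h: float = 1e-6) -> float:
    """Central-difference estimate of d(ADI)/d(key)."""
    up = {**m, key: m[key] + h}
    down = {**m, key: m[key] - h}
    return (adi(up) - adi(down)) / (2 * h)


metrics = {"noise": 0.1, "effort": 0.8, "context": 0.7,
           "details": 0.5, "bonus": 0.0, "penalty": 0.0}
denominator = W["context"] * 0.7 + W["details"] * 0.5   # D for these metrics

d_effort = sensitivity(metrics, "effort")
d_details = sensitivity(metrics, "details")
print(d_effort, -W["effort"] / denominator)   # numeric vs. analytic -w_E / D
print(d_details)
```

Because ADI is linear in Effort, the numeric and analytic values for Effort agree up to floating-point rounding. The sign of the Details sensitivity depends on the sign of the numerator.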


10. FAQs

Q: How do I adjust weights for my use case?
A: Modify the weights dictionary:

custom_weights = {
    'noise': 1.2,   # Increase if noise is critical
    'details': 2.0,  # Prioritize technical depth
    'bonus': 0.3     # Reduce formatting importance
}
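Since `custom_weights` lists only some keys, one way to obtain a complete weights dictionary is to merge it over the defaults from section 2. This is a sketch under that assumption; `merge_weights` is a hypothetical helper, and how `DumpindexAnalyzer` actually accepts weights is not shown in this README.

```python
DEFAULT_WEIGHTS = {
    "noise": 1.0, "effort": 2.0, "context": 1.5,
    "details": 1.5, "bonus": 0.5, "penalty": 1.0,
}


def merge_weights(custom: dict) -> dict:
    """Overlay custom weights on the defaults, rejecting misspelled keys."""
    unknown = set(custom) - set(DEFAULT_WEIGHTS)
    if unknown:
        raise KeyError(f"Unknown weight(s): {sorted(unknown)}")
    return {**DEFAULT_WEIGHTS, **custom}


custom_weights = {"noise": 1.2, "details": 2.0, "bonus": 0.3}
weights = merge_weights(custom_weights)
print(weights)
```

Rejecting unknown keys catches typos such as `"nois"` that would otherwise be silently ignored.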

Q: Can I use ADI with non-English text?
A: Yes! Update the noise patterns and linguistic features in the calculation methods.

Q: What's the performance impact?
A: Minimal - analysis takes <100ms for typical inputs. Caching can optimize repeated requests.


11. License

Apache 2.0 License - Full Text

Acknowledgments: To all who've suffered through "URGENT!!!" requests - may your inputs always be clear! 😄

Contribute: Found this useful? Star the repo ⭐ or buy me a coffee ☕!

Stay Dump-Free! 🚀
