evaluation-metrics

Here are 175 public repositories matching this topic...

xinshuoweng / AB3DMOT

(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"

tracking machine-learning real-time computer-vision robotics evaluation evaluation-metrics multi-object-tracking kitti 3d-tracking 3d-multi-object-tracking 2d-mot-evaluation 3d-mot 3d-multi kitti-3d

Updated Apr 3, 2024
Python

confident-ai / deepeval

Star

The LLM Evaluation Framework

evaluation-metrics evaluation-framework llm-evaluation llm-evaluation-framework llm-evaluation-metrics

Updated Jul 5, 2024
Python

MIND-Lab / OCTIS

Star

OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)

nlp natural-language-processing hyperparameter-optimization topic-modeling nlp-library bayesian-optimization hyperparameter-tuning latent-dirichlet-allocation evaluation-metrics neural-topic-models latent-semantic-analysis topic-models hyperparameter-search non-negative-matrix-factorization nlproc

Updated Jun 15, 2024
Python

AgentOps-AI / agentops

Star

Python SDK for agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks like CrewAI, Langchain, and Autogen

agent ai openai evaluation-metrics mistral cost-estimation autogen groq agentops llm langchain anthropic evals ollama crewai

Updated Jul 6, 2024
Python

jitsi / jiwer

Star

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

python3 automatic-speech-recognition speech-to-text evaluation-metrics wer word-error-rate

Updated May 6, 2024
Python

Unbabel / COMET

Star

A Neural Framework for MT Evaluation

nlp machine-learning natural-language-processing machine-translation artificial-intelligence evaluation-metrics

Updated Jun 30, 2024
Python

bheinzerling / pyrouge

Star

A Python wrapper for the ROUGE summarization evaluation package

nlp summarization rouge evaluation-metrics

Updated Feb 10, 2021
Python

up42 / image-similarity-measures

Star

📈 Implementation of eight evaluation metrics to access the similarity between two images. The eight metrics are as follows: RMSE, PSNR, SSIM, ISSM, FSIM, SRE, SAM, and UIQ.

processing machine-learning image metrics evaluation-metrics p1

Updated Jul 6, 2024
Python

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language model. There are also more complex data types and algorithms. Mor…

python nlp machine-learning natural-language-processing library linguistics computational-linguistics text-processing nlp-library search-algorithms evaluation-metrics folia language-modelling

Updated Sep 14, 2023
Python

huggingface / lighteval

Star

LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.

evaluation evaluation-metrics evaluation-framework huggingface

Updated Jul 5, 2024
Python

davidsbatista / NER-Evaluation

Star

An implementation of a full named-entity evaluation metrics based on SemEval'13 Task 9 - not at tag/token level but considering all the tokens that are part of the named-entity

named-entity-recognition semeval ner crfsuite evaluation-metrics notebook-jupyter semeval-2013 ner-evaluation