Scaling Data-Constrained Language Models

Jupyter Notebook 335 19 Updated Sep 22, 2024

Multi-GPU CUDA stress test

C++ 1,618 317 Updated Aug 20, 2024

Where GPUs get cooked 👩‍🍳🔥

Rust 217 11 Updated Mar 4, 2025

Understanding R1-Zero-Like Training: A Critical Perspective

Python 689 29 Updated Mar 27, 2025

This package contains the original 2012 AlexNet code.

Cuda 2,166 267 Updated Mar 12, 2025

[ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 330 7 Updated Mar 20, 2025

A Conversational Speech Generation Model

Python 11,850 989 Updated Mar 27, 2025

MINT-1T: A one trillion token multimodal interleaved dataset.

805 20 Updated Jul 31, 2024

Densely Captioned Images (DCI) dataset repository.

Python 175 5 Updated Jul 1, 2024

⏰ AI conference deadline countdowns

TypeScript 245 33 Updated Mar 15, 2025

An open-source implementation of Scaling Laws for Neural Language Models using nanoGPT

Python 41 5 Updated Dec 8, 2023

A method for calculating scaling laws for LLMs from publicly available models

Python 9 Updated Apr 22, 2024

This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"

Python 1,288 112 Updated Mar 13, 2025

[ICCV'23 Oral] The introduction and toolkit for EqBen Benchmark

Python 127 1 Updated Dec 11, 2023

Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".

Python 44 5 Updated Feb 28, 2024

Situation With Groundings (SWiG) dataset and Joint Situation Localizer (JSL)

Python 65 13 Updated Mar 19, 2021

Toolkit for Visual7W visual question answering dataset

Python 76 18 Updated Oct 8, 2019

Data repository for the VALSE benchmark.

Python 37 3 Updated Feb 15, 2024

Scalable data preprocessing and curation toolkit for LLMs

Jupyter Notebook 851 116 Updated Mar 27, 2025

Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]

Python 54 2 Updated Dec 10, 2024

Benchmark for Basic Language Abilities of Multimodal Pretrained Transformers

Jupyter Notebook 3 1 Updated Jun 28, 2024

Python 11 Updated Feb 6, 2024

A Web Interface for chatting with your local LLMs via the ollama API

Vue 959 139 Updated Mar 3, 2025

A toolkit for scaling law research ⚖

Python 49 4 Updated Jan 27, 2025

Force DeepSeek r1 models to think for as long as you wish

Jupyter Notebook 357 22 Updated Feb 18, 2025

Inspired by Google C4, a series of Colossal Clean data cleaning scripts focused on CommonCrawl data processing, including the Chinese data processing and cleaning methods from MassiveText.

Python 122 14 Updated Jun 7, 2023

Process Common Crawl data with Python and Spark

Python 422 89 Updated Feb 11, 2025

A polite and user-friendly downloader for Common Crawl data

Rust 36 1 Updated Mar 22, 2025

Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unita…

Python 1,024 119 Updated Mar 7, 2025

Automatic evals for LLMs

HTML 344 36 Updated Mar 27, 2025