- Juelich Supercomputing Center (JSC), Forschungszentrum Jülich GmbH, LAION
- Germany
- https://mehdidc.github.io
- @mehdidc
Stars
Scaling Data-Constrained Language Models
Understanding R1-Zero-Like Training: A Critical Perspective
This package contains the original 2012 AlexNet code.
[ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
MINT-1T: A one trillion token multimodal interleaved dataset.
Densely Captioned Images (DCI) dataset repository.
⏰ AI conference deadline countdowns
shehper / scaling_laws
Forked from karpathy/nanoGPT. An open-source implementation of Scaling Laws for Neural Language Models using nanoGPT
A method for calculating scaling laws for LLMs from publicly available models
This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"
[ICCV'23 Oral] The introduction and toolkit for EqBen Benchmark
Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".
Situation With Groundings (SWiG) dataset and Joint Situation Localizer (JSL)
Toolkit for Visual7W visual question answering dataset
Scalable data preprocessing and curation toolkit for LLMs
Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]
Benchmark for Basic Language Abilities of Multimodal Pretrained Transformers
A web interface for chatting with your local LLMs via the Ollama API
Force DeepSeek-R1 models to think for as long as you wish
Inspired by Google's C4, a series of colossal clean data-cleaning scripts focused on Common Crawl processing, including Chinese data processing and the cleaning methods from MassiveText.
Process Common Crawl data with Python and Spark
A polite and user-friendly downloader for Common Crawl data
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ Pytorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unita…