# ConvNets

ConvNets is a PyTorch-based library that provides a Transformers-like interface for working with convolutional neural networks. It offers easy-to-use APIs for training, fine-tuning, and running inference with popular CNN architectures.
## Features

- **Transformers-like API**: Familiar interface modeled on Hugging Face Transformers.
- **Multiple Architectures**: ResNet, EfficientNet, Vision Transformer, and custom ConvNets.
- **Timm Integration**: Seamlessly load pretrained models from the timm library.
- **Easy Training**: Built-in trainer with logging, evaluation, and checkpointing.
- **Flexible Configuration**: JSON-based configuration system.
- **Image Processing**: Built-in image preprocessing and data utilities.
- **Comprehensive Metrics**: Built-in evaluation metrics and logging.
- **CLI Tools**: Command-line interface for training and inference.
## Installation

```bash
pip install convnets
```

Or install from source:

```bash
git clone https://github.com/naman466/convnets
cd convnets
pip install -e .
```

## Quick Start

### Load a Model

```python
from convnets import ResNetForImageClassification

# Load pretrained weights from timm
model = ResNetForImageClassification.from_pretrained("resnet18", from_timm=True)

# Or load from a local checkpoint
model = ResNetForImageClassification.from_pretrained("./my_model")
```

### Run Inference

```python
from PIL import Image

from convnets.image_processing import ConvNetImageProcessor

# Load and preprocess the image
image_processor = ConvNetImageProcessor()
image = Image.open("image.jpg")
inputs = image_processor.preprocess(image)

# Run inference
outputs = model(**inputs)
predictions = outputs.logits.softmax(dim=-1)
```

### Train a Model

```python
from convnets.trainer import ConvNetTrainer
from convnets.training_args import ConvNetTrainingArguments
from convnets.utils import ImageClassificationDataset

# Set up training arguments
training_args = ConvNetTrainingArguments(
    output_dir="./results",
    num_train_epochs=10,
    per_device_train_batch_size=32,
    learning_rate=1e-4,
    evaluation_strategy="epoch",
)

# Create datasets
train_dataset = ImageClassificationDataset.from_folder("./train_data")
eval_dataset = ImageClassificationDataset.from_folder("./eval_data")

# Initialize the trainer
trainer = ConvNetTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)

# Start training
trainer.train()
```

## Supported Architectures

| Architecture | Variants | Description |
|---|---|---|
| ResNet | ResNet18, 34, 50, 101, 152 | Deep residual networks with skip connections |
| EfficientNet | EfficientNet-B0 through B7 | Efficient convolutional networks with compound scaling |
| Vision Transformer | ViT-Base, ViT-Large | Transformer architecture adapted for computer vision |
| Custom ConvNet | Flexible architecture | Customizable CNN with configurable layers |
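The EfficientNet row above mentions compound scaling. As a quick illustration of the idea (a sketch based on the coefficients published in the EfficientNet paper, not on this library's internals), depth, width, and input resolution are scaled jointly by a single coefficient:

```python
# Compound scaling as described by Tan & Le (2019): a single coefficient
# phi scales network depth, width, and input resolution together.
# The base multipliers below are the paper's values, not values read
# from this library.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution multipliers

def compound_scale(phi: int) -> tuple[float, float, float]:
    """Return (depth, width, resolution) multipliers for scaling factor phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

# The paper's constraint alpha * beta^2 * gamma^2 ~= 2 means FLOPs roughly
# double with each unit increase of phi.
flops_growth = ALPHA * BETA ** 2 * GAMMA ** 2
print(f"FLOPs growth per phi step: {flops_growth:.3f}")  # close to 2

depth, width, res = compound_scale(3)  # roughly the B3 scale
print(f"depth x{depth:.2f}, width x{width:.2f}, resolution x{res:.2f}")
```

This is why the B0–B7 variants in the table form a family: each step up applies the same fixed ratios rather than tuning depth, width, and resolution independently.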
## Configuration

Models can be configured using Python classes or JSON files:

```python
from convnets import ResNetConfig, ResNetForImageClassification

# Python configuration
config = ResNetConfig(
    num_labels=1000,
    layers=[3, 4, 6, 3],  # ResNet-50 layer layout
    block_type="bottleneck",
)
model = ResNetForImageClassification(config)

# Or load from a JSON file
config = ResNetConfig.from_pretrained("./config.json")
```

## Command-Line Interface

The library includes a CLI for common tasks:

```bash
# Train a model
convnets-cli train \
    --train_dir ./data/train \
    --eval_dir ./data/val \
    --output_dir ./results \
    --num_epochs 10 \
    --batch_size 32

# Run inference
convnets-cli infer \
    --model_name_or_path ./results \
    --images image1.jpg image2.jpg

# Convert a timm model
convnets-cli convert \
    --timm_model resnet50 \
    --output_dir ./converted_model
```

## Mixed Precision Training

```python
training_args = ConvNetTrainingArguments(
    output_dir="./results",
    fp16=True,                       # Enable mixed precision
    per_device_train_batch_size=64,  # fp16 leaves room for a larger batch size
)
```

## Custom Datasets

```python
from torchvision import transforms

# From a folder structure
dataset = ImageClassificationDataset.from_folder("./data")

# From JSON annotations
dataset = ImageClassificationDataset.from_json("./annotations.json")

# With custom transforms
transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
dataset = ImageClassificationDataset(image_paths, labels, transform=transform)
```

## Custom Metrics

```python
from sklearn.metrics import accuracy_score, top_k_accuracy_score

from convnets.utils.metrics import compute_classification_metrics

def custom_metrics(eval_pred):
    predictions = eval_pred.predictions
    labels = eval_pred.label_ids
    # sklearn expects ground-truth labels first, predictions second
    accuracy = accuracy_score(labels, predictions.argmax(-1))
    top5_acc = top_k_accuracy_score(labels, predictions, k=5)
    return {
        "accuracy": accuracy,
        "top5_accuracy": top5_acc,
    }

trainer = ConvNetTrainer(
    model=model,
    compute_metrics=custom_metrics,
    # ... other arguments
)
```

## Project Structure

The library is organized for modularity and extensibility:
```
convnets/
├── __init__.py                 # Main API exports
├── configuration_utils.py      # Base configuration classes
├── modeling_utils.py           # Base model classes
├── feature_extraction.py       # Feature extraction utilities
├── image_processing.py         # Image preprocessing
├── trainer.py                  # Training utilities
├── training_args.py            # Training configuration
├── cli.py                      # Command-line interface
├── models/                     # Model implementations
│   ├── __init__.py
│   ├── convnet.py              # Generic ConvNet
│   ├── resnet.py               # ResNet models
│   ├── efficientnet.py         # EfficientNet models
│   └── vision_transformer.py   # Vision Transformer
├── utils/                      # Utility functions
│   ├── __init__.py
│   ├── data_utils.py           # Data loading utilities
│   └── metrics.py              # Evaluation metrics
└── tests/                      # Test suite
    ├── __init__.py
    ├── test_models.py          # Model tests
    ├── test_trainer.py         # Trainer tests
    └── test_utils.py           # Utility tests
```
## Testing

Run the test suite:

```bash
# Install test dependencies
pip install -e ".[dev]"

# Run the tests
pytest tests/

# Run with coverage
pytest tests/ --cov=convnets --cov-report=html
```

## Contributing

```bash
git clone https://github.com/convnets/convnets
cd convnets
pip install -e ".[dev]"
pre-commit install
```

## License

This project is licensed under the MIT License. See the LICENSE file for details.
## Citation

If you use ConvNets in your research, please cite:

```bibtex
@software{convnets2024,
  title={ConvNets: A Transformers-like Library for Convolutional Neural Networks},
  author={Naman Tyagi},
  year={2024},
  url={https://github.com/convnets/convnets},
  version={0.1.0}
}
```

## Acknowledgements

- Hugging Face Transformers for the API design inspiration
- timm for pretrained model weights
- PyTorch for the deep learning framework