This project is a comprehensive Django-based academic database management system for researchers to track publications, funding, teaching, talks, and conferences with seamless integration to academic APIs.
This project has two main use cases:
- For researchers with hundreds of publications and other outputs, it can help streamline the generation of your CV.
- For researchers submitting NSF grants, which require a spreadsheet listing all recent collaborators and their affiliations, it can automate the generation of that list. This is not yet fully implemented: the collaboration database has been developed, but the export still needs to be implemented. If you need this immediately, post an issue and I'll get on it ASAP.
NOTE: This project is meant for local deployment only, not for production deployment on the web, due to the limitations on Scopus keys.
This is a complete rewrite of the original project, using Claude Code, based on the original codebase. If you wish to access the original project, you can find it here.
Before installing the application, you need to register for ORCID API credentials:
- Create an ORCID account (if you don't have one):
  - Visit https://orcid.org/register
  - Complete the registration process
  - Enter your details into the ORCID database
- Register for ORCID developer tools:
  - Go to https://orcid.org/developer-tools
  - Sign in with your ORCID account
  - Click "Register for the free ORCID public API"
- Create a new application:
  - Fill out the application form:
    - Application name: Your application name (e.g., "My Academic Database")
    - Application website: http://127.0.0.1:8000 (for local development)
    - Application description: Brief description of your use case
    - Important: Set the Redirect URI to: http://127.0.0.1:8000/accounts/orcid/login/callback/
  - Submit the application
- Get your credentials:
  - After approval (usually immediate), you'll receive:
    - Client ID: Use this for ORCID_CLIENT_ID in the .env file below
    - Client Secret: Use this for ORCID_CLIENT_SECRET in the .env file below
The easiest way to get started is using Docker, which handles all dependencies and setup automatically.
- Docker and Docker Compose installed on your system
- ORCID API credentials (for authentication)
- Scopus API key (optional in principle, but without it you lose much of the project's functionality)
- Clone the repository:
git clone https://github.com/poldrack/academicdb.git
cd academicdb
- Create environment configuration:
# Copy the example environment file
cp .env.example .env
# Edit the .env file with your configuration:
nano .env
- Configure your .env file with the following variables:
# Django Configuration
DEBUG=True
ALLOWED_HOSTS=localhost,127.0.0.1,0.0.0.0
# ORCID API Configuration (Required)
# See the ORCID setup instructions above
ORCID_CLIENT_ID=your-orcid-client-id
ORCID_CLIENT_SECRET=your-orcid-client-secret
# Optional API Keys
SCOPUS_API_KEY=your-scopus-api-key # Get from https://dev.elsevier.com/
SCOPUS_INST_TOKEN=your-scopus-inst-token # Optional - only required for Scopus access outside institutional network
# Docker Configuration
USE_LOCAL_DOCKER_IMAGE=false # Set to 'true' to use locally built image instead of Docker Hub image
- Run the application:
By default, the application will use the Docker Hub image (poldrack/academicdb2):
# Start the application using the Docker Hub image
make docker-run-orcid
To use a locally built image instead:
# First, set USE_LOCAL_DOCKER_IMAGE=true in your .env file
# Then build the Docker image locally
make docker-build
# Start the application with the local image
make docker-run-orcid
The make docker-run-orcid command will:
- Validate your ORCID credentials from the .env file
- Set up proper data persistence with volume mounts
- Start the container with all necessary environment variables
- Access the application:
  - Web interface: http://127.0.0.1:8000
  - Admin interface: http://127.0.0.1:8000/admin/
- Log in using your ORCID credentials.
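The credential check that make docker-run-orcid performs is not shown in this README. As a rough illustration only, a pre-flight check of the .env file could look like the sketch below. The function names `parse_env` and `missing_credentials` are hypothetical, and the placeholder values are simply the ones from the example configuration above; the project's actual validation may work differently.

```python
from pathlib import Path

# Keys this README marks as required.
REQUIRED_KEYS = ("ORCID_CLIENT_ID", "ORCID_CLIENT_SECRET")
# Placeholder values from the example configuration (treated as "not set").
PLACEHOLDERS = {"your-orcid-client-id", "your-orcid-client-secret"}


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and full-line comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "value # note".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_credentials(env: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    return [k for k in REQUIRED_KEYS if not env.get(k) or env[k] in PLACEHOLDERS]


if __name__ == "__main__":
    env_file = Path(".env")
    env = parse_env(env_file.read_text()) if env_file.exists() else {}
    problems = missing_credentials(env)
    print("OK" if not problems else f"Missing or placeholder values: {problems}")
```

Running this from the repository directory before starting the container gives a quick hint when the ORCID values were never filled in.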
# Stop the container
make docker-stop
# Remove the container
make docker-remove
# View logs
docker logs academicdb
# Follow logs in real-time
docker logs -f academicdb
# Access Django shell
docker exec -it academicdb python manage.py shell
# Run tests
docker exec academicdb python manage.py test
# Backup database
docker exec academicdb python manage.py backup_db
# Complete restart (clean build)
make docker-full-restart
For development or if you prefer not to use Docker:
- Python 3.9+
- ORCID API credentials (for authentication)
- Scopus API key (optional, for Scopus integration)
- Clone the repository:
git clone https://github.com/poldrack/academicdb.git
cd academicdb
- Install dependencies using uv:
uv sync
- Set up environment variables:
# Create .env file with:
ORCID_CLIENT_ID=your-orcid-client-id
ORCID_CLIENT_SECRET=your-orcid-client-secret
SCOPUS_API_KEY=your-scopus-api-key # Optional
SCOPUS_INST_TOKEN=your-scopus-inst-token # Required for Scopus access outside institutional network
USE_LOCAL_DOCKER_IMAGE=true # Since you're doing local development
- Run database migrations:
uv run python manage.py migrate
- Create a superuser (optional):
uv run python manage.py createsuperuser
- Run the development server:
uv run python manage.py runserver
Visit http://127.0.0.1:8000 to access the application.
- Login: Authenticate using your ORCID account
- Dashboard: View sync status and statistics
- Publications: Manage your publication list with search and filtering
- Funding: Track grants and funding sources
- Teaching/Talks/Conferences: Use spreadsheet interfaces for bulk editing
- Sync: Import data from external sources with real-time progress tracking
The application provides RESTful APIs for programmatic access:
- /api/v1/publications/ - Publication CRUD operations
- /api/v1/teaching/ - Teaching record management
- /api/v1/talks/ - Talk record management
- /api/v1/conferences/ - Conference presentation management
All endpoints require authentication and return user-scoped data.
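As a sketch of programmatic access, the snippet below builds a request for one of the endpoints listed above using only the Python standard library. The README does not state which authentication scheme the API uses, so the `Authorization: Token ...` header is an assumption, and `build_request` and `fetch_json` are hypothetical helper names. Adjust the header to match however your deployment authenticates.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # local deployment address used in this README


def build_request(resource: str, token: str) -> urllib.request.Request:
    """Build a GET request for one of the /api/v1/ list endpoints."""
    url = f"{BASE_URL}/api/v1/{resource.strip('/')}/"
    headers = {
        "Accept": "application/json",
        # ASSUMPTION: token authentication; the README only says that
        # endpoints require authentication, not which scheme is enabled.
        "Authorization": f"Token {token}",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_json(request: urllib.request.Request):
    """Send the request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, `fetch_json(build_request("publications", "my-token"))` would request the publication list for the authenticated user, assuming the server is running locally and accepts that token.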
If you want to publish your own version of the Docker image to Docker Hub:
- Docker Hub account (https://hub.docker.com/)
- Docker with buildx support for multi-platform builds
# Login to Docker Hub
docker login
# Create and use a new builder instance for multi-platform builds
docker buildx create --use
# Build and push for both AMD64 and ARM64 architectures
docker buildx build --platform linux/amd64,linux/arm64 \
-t yourusername/academicdb2:latest \
-t yourusername/academicdb2:v1.0.0 \
--push .
# For local testing or single platform
docker build -t yourusername/academicdb2:latest .
# Test locally
docker run -p 8000:8000 yourusername/academicdb2:latest
# Push to Docker Hub
docker push yourusername/academicdb2:latest
# Test pulling and running from Docker Hub
docker pull yourusername/academicdb2:latest
# This must be run from within the repository directory, since it needs several files from there
docker run -p 8000:8000 yourusername/academicdb2:latest
Note: The multi-platform build ensures compatibility with both Intel/AMD processors and Apple Silicon (M1/M2) Macs.
The system includes comprehensive Django management commands for data operations:
# Comprehensive sync from all sources
uv run python manage.py comprehensive_sync [--user-id ID]
# Sync from specific sources
uv run python manage.py sync_orcid [--user-id ID]
uv run python manage.py sync_pubmed --user-id ID [--query "search query"]
uv run python manage.py sync_scopus --user-id ID [--scopus-id SCOPUS_ID]
# Enhanced Scopus sync with author ID capture
uv run python manage.py sync_scopus_enhanced --user-id ID
# Enrich with CrossRef metadata
uv run python manage.py enrich_crossref [--user-id ID]
# Enrich with PubMed data
uv run python manage.py enrich_pubmed --email your@email.com [--user-id ID]
# Add Scopus author IDs
uv run python manage.py enrich_author_scopus_ids [--user-id ID]
# Lookup PMC IDs
uv run python manage.py lookup_pmc_ids [--user-id ID]
# Clear data (with confirmation)
uv run python manage.py clear_publications --user-id ID --confirm
uv run python manage.py clear_funding --user-id ID --confirm
# Import CSV data
uv run python manage.py import_csv --user-id ID --teaching-file teaching.csv
# Create JSON backup
uv run python manage.py backup_data [--output-dir backups/] [--user-id ID]
# Restore from JSON backup
uv run python manage.py restore_data <backup_dir> [--user-id ID] [--merge]
# Deduplicate DOIs
uv run python manage.py deduplicate_doi_case [--auto-merge]
# Detect preprints
uv run python manage.py detect_preprints
# Consolidate author cache
uv run python manage.py consolidate_author_cache
# Extract coauthors
uv run python manage.py extract_coauthors --user-id ID
# Test sync performance
uv run python manage.py test_sync_diagnostic --user-id ID
# Populate author cache
uv run python manage.py populate_author_cache --user-id ID
Most commands support these options:
- --dry-run: Preview changes without modifying data
- --user-id ID: Target specific user
- --force: Override safety checks
- --rate-limit N: API call throttling (seconds)
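To illustrate how these shared options typically behave, here is a minimal sketch using argparse, which is also the parser type Django passes to a management command's `add_arguments` method. The helper names `add_common_options` and `process`, the integer user ID, and the one-second default delay are assumptions made for the example, not details taken from the project's code.

```python
import argparse
import time


def add_common_options(parser: argparse.ArgumentParser) -> None:
    """Register the four options shared by most commands."""
    parser.add_argument("--dry-run", action="store_true",
                        help="Preview changes without modifying data")
    # ASSUMPTION: user IDs are integers.
    parser.add_argument("--user-id", type=int, default=None,
                        help="Target a specific user")
    parser.add_argument("--force", action="store_true",
                        help="Override safety checks")
    # ASSUMPTION: a default of 1 second between API calls.
    parser.add_argument("--rate-limit", type=float, default=1.0,
                        help="Seconds to wait between API calls")


def process(items, options, apply_change, sleep=time.sleep):
    """Apply apply_change to each item, honoring --dry-run and --rate-limit."""
    changed = []
    for index, item in enumerate(items):
        if index and options.rate_limit > 0:
            sleep(options.rate_limit)  # throttle every call after the first
        if options.dry_run:
            print(f"[dry-run] would update {item}")
            continue
        apply_change(item)
        changed.append(item)
    return changed
```

Passing `sleep` as a parameter keeps the throttling testable without real delays; in normal use the default `time.sleep` applies.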
- AcademicUser: Extended user model with ORCID integration and academic profile
- Publication: Flexible publication tracking with JSONB metadata fields
- Funding: Grant and funding source management
- Teaching: Course and teaching activity records
- Talk: Invited talks and speaking engagements
- Conference: Conference presentations and papers
- AuthorCache: Intelligent author name normalization and matching
- User Data Isolation: All data scoped to authenticated users
- Edit Protection: Manual edits preserved during API synchronization
- Flexible Metadata: JSONB fields for varying API response structures
- Full-Text Search: PostgreSQL text search capabilities
- Audit Trails: Comprehensive edit history tracking
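The edit-protection idea can be sketched as a merge that skips any field the user has edited by hand. This is only an illustration of the principle: the function `merge_synced_record` and the `protected_fields` set are hypothetical, and the README does not describe how the project actually records which fields were manually edited.

```python
def merge_synced_record(existing: dict, incoming: dict, protected_fields: set) -> dict:
    """Merge API data into an existing record without touching manual edits.

    protected_fields is a hypothetical set of field names the user has
    edited by hand; those keep their existing values.
    """
    merged = dict(existing)  # copy so the caller's record is not mutated
    for field, value in incoming.items():
        if field in protected_fields:
            continue  # preserve the manual edit
        merged[field] = value
    return merged


record = {"title": "My corrected title", "year": 2020}
from_api = {"title": "my corrected TITLE", "year": 2021, "doi": "10.1/x"}
merged = merge_synced_record(record, from_api, protected_fields={"title"})
```

Here the hand-corrected title survives the sync, while the unprotected `year` is updated and the new `doi` field is added.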
- Framework: Django 4.2+ with PostgreSQL
- Authentication: django-allauth with ORCID OAuth
- Frontend: Bootstrap 5 with minimal JavaScript
- API: Django REST Framework
- Database: PostgreSQL with JSONB for flexible schemas
- Background Tasks: Threading for long-running operations
- Real-time Updates: Server-Sent Events for progress tracking
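The combination of a background thread and Server-Sent Events can be sketched with the standard library alone. The message layout (`event:` and `data:` lines followed by a blank line) follows the SSE format; everything else, including the names `format_sse`, `run_sync`, and `stream_progress` and the shape of the progress payload, is an assumption for illustration rather than the project's implementation.

```python
import json
import queue
import threading


def format_sse(data: dict, event=None) -> str:
    """Serialize one Server-Sent Events message."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"  # a blank line terminates the message


def run_sync(total_steps: int, events: queue.Queue) -> None:
    """Stand-in for a long-running sync that reports progress per step."""
    for step in range(1, total_steps + 1):
        events.put(format_sse({"step": step, "total": total_steps}, event="progress"))
    events.put(None)  # sentinel: no more messages


def stream_progress(total_steps: int):
    """Run the sync in a background thread and yield messages as they arrive."""
    events = queue.Queue()
    threading.Thread(target=run_sync, args=(total_steps, events), daemon=True).start()
    while (message := events.get()) is not None:
        yield message
```

In a Django view, a generator like `stream_progress` could be wrapped in a `StreamingHttpResponse` with the content type `text/event-stream` so the browser receives each progress message as it is produced.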
academicdb/
├── academic/ # Main Django app
│ ├── models.py # Database models
│ ├── views.py # Web and API views
│ ├── serializers.py # DRF serializers
│ ├── management/ # Management commands
│ │ └── commands/ # Individual command files
│ └── templates/ # Django templates
├── academicdb_web/ # Django project settings
├── src/academicdb/ # Legacy CLI tools (preserved)
└── manage.py # Django management script
Run tests with:
uv run python manage.py test
- Create a feature branch
- Write failing tests first (TDD)
- Implement minimal code to pass tests
- Ensure user data isolation
- Submit pull request
- Dashboard load: <2 seconds
- Publication search: <500ms
- API responses: <200ms median
- Support 50+ concurrent users
- Handle 10,000+ publications per user
- ORCID OAuth for authentication
- User data isolation at database level
- CSRF protection for all forms
- Input validation for external API data
- Comprehensive audit trails
For issues and feature requests, please use the GitHub issue tracker.