# Open Source Lifecycle and Sustainability 🌍📦

## 🧠 Overview
This notebook walks through the natural lifecycle of an Open Source Software (OSS) project and provides practical guidance, tools, and an interactive utility that evaluates a project's sustainability from its GitHub repository metadata.


## 🔄 1. Natural Open Source Lifecycle
### 📌 Open Source Project Lifecycle: Project → Ecosystem → Archive

#### 🚀 Project Phase
- Initial development, usually by one or a few individuals.
- Focused scope, solving a specific problem.
- Examples: first commit, README, LICENSE.

#### 🌐 Ecosystem Phase
- Community forms, contributions increase.
- Features grow, integrations appear.
- Tools like GitHub Issues, Discussions, Actions, and Projects are used.

#### 🗂️ Archive Phase
- Maintainers shift priorities, activity drops.
- Project is deprecated, forked, or archived.
- A good exit strategy includes archive notices and project transfer guidance.

---

## 🌱 2. Sustainability Practices for Each Lifecycle Phase

### 🚀 Project Phase
- **Actionable Steps**:
  - Define a clear project scope and goals.
  - Create essential files: `README.md`, `LICENSE`, and `CODE_OF_CONDUCT.md`.
  - Use GitHub Issues to track bugs and features.

### 🌐 Ecosystem Phase
- **Actionable Steps**:
  - Foster community engagement through GitHub Discussions.
  - Automate workflows with GitHub Actions.
  - Provide detailed documentation using tools like MkDocs or ReadTheDocs.

### 🗂️ Archive Phase
- **Actionable Steps**:
  - Announce deprecation with clear archive notices.
  - Provide guidance for forks and project transfers.
  - Maintain transparency with contributors and users.
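An archive notice can be as simple as a short block at the top of the README. The sketch below generates one; the function name, the wording of the notice, and the example URL are all illustrative assumptions rather than a GitHub convention.

```python
from datetime import date
from typing import Optional


def build_archive_notice(project: str, archived_on: date,
                         successor_url: Optional[str] = None) -> str:
    """Return a Markdown blockquote to place at the top of a README before archiving."""
    lines = [
        f"> **{project} is no longer maintained.**",
        f"> This repository was archived on {archived_on.isoformat()} and is now read-only.",
    ]
    if successor_url:
        # Point users at the fork or project that continues the work
        lines.append(f"> Development continues at {successor_url}.")
    else:
        # No successor: remind users that forking is still possible
        lines.append("> You are welcome to fork it under the terms of its license.")
    return "\n".join(lines)


# Hypothetical project name and successor URL, for illustration only
print(build_archive_notice("example-project", date(2025, 1, 15),
                           "https://example.org/successor"))
```

Adding the notice before archiving matters because an archived repository is read-only, so the README can no longer be edited afterwards without unarchiving.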

---

## 🛠️ 3. Techniques and Tools to Increase OSS Sustainability

### ✅ Key Sustainability Practices:
- ✅ Clear README, LICENSE, and CODE_OF_CONDUCT.md
- ✅ CONTRIBUTING.md to guide newcomers
- ✅ GitHub Actions for CI/CD
- ✅ Good Issue and PR templates
- ✅ Documentation hosted with tools like MkDocs or ReadTheDocs
- ✅ Package managers (PyPI, npm) for easy installation
- ✅ OpenSSF Scorecard and Dependabot for active repo health monitoring

### 🧰 GitHub Features to Use:
- **GitHub Actions**: Automate testing and deploys
- **Discussions**: Encourage community engagement
- **Projects/Boards**: Track roadmaps
- **Security Policy**: Improve trust with a `SECURITY.md` file
- **Insights/Contributors Graph**: Visualize community health

## 🏗️ Repository Scaffolding for US Federal Projects

### 📋 DSACMS Repo-Scaffolder
The [DSACMS repo-scaffolder](https://github.com/DSACMS/repo-scaffolder) provides templates and command-line tools specifically designed for creating repositories for US Federal open source projects.

#### 🎯 When to Use DSACMS Repo-Scaffolder:
- **Federal Agency Projects**: When working on open source projects within US Federal agencies
- **Compliance Requirements**: When you need to meet federal standards for open source development
- **Standardized Structure**: When you want a consistent, government-approved repository structure
- **Security & Governance**: When federal security and governance requirements are mandatory

#### 🛠️ How to Use:
1. **Installation**: Clone the repository and follow setup instructions
2. **Template Selection**: Choose from pre-configured templates that meet federal standards
3. **Customization**: Adapt templates to your specific agency or project needs
4. **Deployment**: Use command-line tools to scaffold new repositories with proper structure

#### ✅ Key Features:
- Pre-configured LICENSE files compliant with federal guidelines
- Security policy templates (SECURITY.md)
- Standardized README structures for government projects
- CI/CD workflows tailored for federal requirements
- Contribution guidelines that meet federal open source policies

---

## 📋 6 Things You Can Do Right Now to Improve Your Open Source Project
*[Prepared by the OpenSource@Stanford team (Zach Chandler, Francesca Vera)](https://zenodo.org/records/15678444)*

### 1. 📜 Choose a License
- **Critical First Step**: Make sure all contributors on the team agree before you commit
- **Stanford OTL Recommended Licenses**:
  - **Permissive**: MIT, BSD 1-, 2-, and 3-clause (often best for academic use)
  - **Copyleft**: GPL v2 or LGPL v3
  - **Non-code projects**: Creative Commons
- **Academic Consideration**: At Stanford, no permission needed to publish to public domain (varies by institution)
- **Commercialization**: If there's a chance you might commercialize, consult your tech transfer office

### 2. 🔍 Be Clear & Discoverable
Your repository should have:
- ✅ **LICENSE.md*** 
- ✅ **README.md*** (a good one!)
- ✅ **CONTRIBUTING.md***
- ✅ **Contributor recognition**
- ✅ **Code of conduct**

*\*doesn't have to be a markdown file*

### 3. 🏷️ Use Versioning
- **Reproducibility**: Discrete software versions help with reproducibility (your paper was published at a particular moment, but software keeps evolving)
- **Version Schema**: Use semantic versioning (MAJOR.MINOR.PATCH, e.g. v1.1.1) to help track progress and provide clarity to users
- **GitHub Integration**: Use GitHub "Releases" feature for easy version management
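One practical reason to stick to MAJOR.MINOR.PATCH is that tags can then be compared numerically. The following is a minimal sketch, assuming tags with an optional leading `v` and no pre-release or build suffixes; `parse_version` and `latest` are illustrative helper names, not part of any library.

```python
import re
from typing import Tuple

# MAJOR.MINOR.PATCH with an optional leading "v", e.g. "v1.4.2" or "1.4.2"
_SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag: str) -> Tuple[int, int, int]:
    """Split a version tag into integers so it can be compared numerically."""
    match = _SEMVER.match(tag.strip())
    if match is None:
        raise ValueError(f"Not a MAJOR.MINOR.PATCH version: {tag!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def latest(tags):
    """Return the newest tag, comparing numerically rather than as text."""
    return max(tags, key=parse_version)


tags = ["v1.2.0", "v1.10.0", "v1.9.3"]
print(latest(tags))  # v1.10.0
print(max(tags))     # v1.9.3, because plain string comparison sorts "9" after "10"
```

Pre-release identifiers such as `-rc.1` are deliberately out of scope here; a project that uses them would need a fuller parser.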

### 4. 🎯 Get a DOI
- **Persistence & Credit**: Secure a Digital Object Identifier for your project using Zenodo/GitHub integration
- **Per-Version DOIs**: Assign a unique DOI per version for precise citation
- **Easy Setup**: GitHub→Zenodo is the simplest approach (though other institutional services may be preferred)

**Learn more**: [docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content](https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content)

### 5. 📚 Make It Easy to Cite
- **CITATION.cff File**: Create a CITATION.cff file at the top level of your repository
- **GitHub Integration**: GitHub recognizes a `CITATION.cff` file on the default branch and adds a "Cite this repository" link to the repository page

**Learn more**: [docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files)
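For a sense of what goes into the file, here is a minimal sketch that builds the text of a `CITATION.cff` using a handful of Citation File Format 1.2.0 keys. The helper name `build_citation_cff`, the author dictionary layout, and every value in the example (title, name, ORCID iD, DOI) are placeholders assumed for illustration.

```python
def build_citation_cff(title, version, date_released, authors, doi=None):
    """Return the text of a minimal CITATION.cff file.

    Values are inserted as-is, so they must not contain double quotes.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        f"date-released: {date_released}",  # ISO date, YYYY-MM-DD
    ]
    if doi:
        lines.append(f'doi: "{doi}"')
    lines.append("authors:")
    for author in authors:
        lines.append(f'  - family-names: "{author["family"]}"')
        lines.append(f'    given-names: "{author["given"]}"')
        if author.get("orcid"):
            # CFF expects the ORCID iD as a full URL
            lines.append(f'    orcid: "https://orcid.org/{author["orcid"]}"')
    return "\n".join(lines) + "\n"


# Placeholder metadata, not a real project, person, or DOI
print(build_citation_cff(
    title="Example Project",
    version="1.1.1",
    date_released="2025-01-15",
    authors=[{"family": "Doe", "given": "Jane", "orcid": "0000-0000-0000-0000"}],
    doi="10.5281/zenodo.0000000",
))
```

Writing the returned text to a file named `CITATION.cff` at the top level of the repository is all that is needed for GitHub to pick it up.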

### 6. 🔗 Link to Your ORCID
- **Two Key Identity Systems**: ORCID and GitHub are the most important in the open science ecosystem
- **Reinforce Value**: Link these accounts to reinforce their value and claim your work
- **Professional Recognition**: Helps establish your professional identity in open science

**Learn more**: [info.orcid.org/orcid-and-github-sign-memorandum-of-understanding/](https://info.orcid.org/orcid-and-github-sign-memorandum-of-understanding/)

---

## 📚 Additional Resources & Examples

### 🌟 Examples of Great READMEs:
- [scikit-learn](https://github.com/scikit-learn/scikit-learn)
- [HELM](https://github.com/stanford-crfm/helm)
- [SunPy](https://github.com/sunpy/sunpy)
- [Levanter](https://github.com/stanford-hai/levanter)

### 📖 Documentation Best Practices:
Many projects maintain a separate documentation site (for example, on the free tier of Read the Docs) and keep their README slimmer:
- [Jupyter](https://github.com/jupyter/notebook)
- [PyPSA-USA](https://github.com/PyPSA/pypsa-usa)
- [NatCap](https://github.com/natcap/invest)
- [scikit-image](https://github.com/scikit-image/scikit-image)

---

## 🔎 4. GitHub Sustainability Analyzer Tool

In [23]:
# Let's Check Your Open Source Project Sustainability!
# This script analyzes a GitHub repository for sustainability indicators.
# Use the cell below to test it with a specific repository URL.

import requests
from urllib.parse import urlparse

GITHUB_API = "https://api.github.com"

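def extract_repo_info(url):
    """Extract user and repo name from GitHub URL"""
    path = urlparse(url).path.strip("/").split("/")
    if len(path) >= 2 and path[0] and path[1]:
        # Accept clone URLs by dropping a trailing ".git" from the repo name
        repo = path[1][:-4] if path[1].endswith(".git") else path[1]
        return path[0], repo
    raise ValueError("Invalid GitHub repository URL")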

def fetch_github_repo_metadata(user, repo):
    """Fetch repository metadata from GitHub API"""
    repo_api = f"{GITHUB_API}/repos/{user}/{repo}"
    # Unauthenticated requests are rate limited; the timeout keeps the notebook from hanging
    response = requests.get(repo_api, timeout=10)
    if response.status_code == 200:
        return response.json()
    raise RuntimeError(f"Failed to fetch metadata: {response.status_code}")

def check_file_exists(user, repo, file_path):
    """Check if a specific file or directory exists in the repository"""
    url = f"{GITHUB_API}/repos/{user}/{repo}/contents/{file_path}"
    r = requests.get(url, timeout=10)
    # Any non-200 status (404, or 403 when rate limited) is reported as "missing"
    return r.status_code == 200

def analyze_repo(url):
    """Analyze a GitHub repository for sustainability indicators"""
    user, repo = extract_repo_info(url)
    metadata = fetch_github_repo_metadata(user, repo)
    
    indicators = {
        # File checks look for these exact names at the repository root
        "Has License": check_file_exists(user, repo, "LICENSE"),
        "Has Readme": check_file_exists(user, repo, "README.md"),
        "Has Contributing Guide": check_file_exists(user, repo, "CONTRIBUTING.md"),
        "Has Code of Conduct": check_file_exists(user, repo, "CODE_OF_CONDUCT.md"),
        "Has GitHub Actions": check_file_exists(user, repo, ".github/workflows"),
        # "open_issues_count" includes open pull requests as well as issues
        "Open Issues": metadata.get("open_issues_count", 0),
        "Forks": metadata.get("forks_count", 0),
        "Stars": metadata.get("stargazers_count", 0),
        "Archived": metadata.get("archived", False),
        # "updated_at" changes on any repository update; "pushed_at" tracks the last push
        "Last Updated": metadata.get("updated_at"),
        # "subscribers_count" is the number of watchers, a rough proxy for engagement
        "Community Engagement": metadata.get("subscribers_count", 0)
    }

    return indicators

def test_analyze_repo(repo_url):
    """Test the analyze_repo function with a provided GitHub URL"""
    indicators = analyze_repo(repo_url)
    print("--- Sustainability Indicators ---")
    for key, value in indicators.items():
        if key == "Archived":
            emoji = "⚠️" if value else "✅"  # not archived is the healthy state
        elif isinstance(value, bool):
            emoji = "✅" if value else "❌"
        else:
            emoji = "ℹ️"  # counts and timestamps are informational, not pass/fail
        print(f"{emoji} {key}: {value}")

In [24]:
repo_url = "https://github.com/gt-ospo/oss-training"
test_analyze_repo(repo_url)

--- Sustainability Indicators ---
✅ Has License: True
✅ Has Readme: True
✅ Has Contributing Guide: True
❌ Has Code of Conduct: False
❌ Has GitHub Actions: False
ℹ️ Open Issues: 11
ℹ️ Forks: 5
ℹ️ Stars: 6
✅ Archived: False
ℹ️ Last Updated: 2025-06-26T15:29:08Z
ℹ️ Community Engagement: 3
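The analyzer above looks for exact file names at the repository root, so a `LICENSE.md` or a contributing guide kept under `.github/` would be reported as missing. GitHub's REST API also offers a community profile endpoint (`GET /repos/{owner}/{repo}/community/profile`) that reports the community health files GitHub itself detects. The sketch below only interprets an already-parsed response; the helper name `summarize_community_profile` and the sample payload are illustrative assumptions, and the payload is trimmed, made-up data rather than output from a real repository.

```python
def summarize_community_profile(profile):
    """Map a parsed community profile response to yes/no indicators."""
    files = profile.get("files") or {}
    labels = {
        "Has License": "license",
        "Has Readme": "readme",
        "Has Contributing Guide": "contributing",
        "Has Code of Conduct": "code_of_conduct",
        "Has Issue Template": "issue_template",
        "Has PR Template": "pull_request_template",
    }
    # GitHub reports a missing file as null, which JSON parsing turns into None
    summary = {label: files.get(key) is not None for label, key in labels.items()}
    summary["Health Percentage"] = profile.get("health_percentage")
    return summary


# Illustrative, trimmed payload; not real data from any repository.
# In the notebook it could come from something like:
#   requests.get(f"{GITHUB_API}/repos/{user}/{repo}/community/profile", timeout=10).json()
sample = {
    "health_percentage": 57,
    "files": {
        "license": {"spdx_id": "MIT"},
        "readme": {"html_url": "https://example.org/readme"},
        "contributing": None,
        "code_of_conduct": None,
    },
}

for label, value in summarize_community_profile(sample).items():
    print(f"{label}: {value}")
```

This replaces several per-file requests with a single call, which also helps when working within the unauthenticated rate limit.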


---

## 🧭 5. Practical Sustainability Improvements

### 📄 Sample Files to Add to Your Repo
- `README.md`: Project purpose, setup, contribution guide
- `LICENSE`: Open source license (e.g. MIT, Apache-2.0)
- `CONTRIBUTING.md`: Steps for new contributors
- `CODE_OF_CONDUCT.md`: Respectful behavior guidelines
- `.github/workflows/ci.yml`: GitHub Actions for automation

You can use GitHub templates for these files and others based on your project needs: https://github.com/topics/template
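Before pushing, the same checklist can be verified against a local clone without any API calls. This is a minimal sketch that assumes the exact file names listed above; `EXPECTED_FILES` and `missing_files` are illustrative names.

```python
from pathlib import Path

# File names taken from the sample file list above
EXPECTED_FILES = [
    "README.md",
    "LICENSE",
    "CONTRIBUTING.md",
    "CODE_OF_CONDUCT.md",
    ".github/workflows/ci.yml",
]


def missing_files(repo_dir, expected=EXPECTED_FILES):
    """Return the expected files that do not exist under repo_dir."""
    root = Path(repo_dir)
    return [name for name in expected if not (root / name).is_file()]


# Check the current working directory; an empty list means nothing is missing
print(missing_files("."))
```

Projects that use other names (for example `LICENSE.txt` or a differently named workflow file) would need to adjust the list.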

---

## ✅ Conclusion

### 🧠 Key Takeaways
- Open source projects evolve naturally; sustainability should be built into every phase.
- GitHub provides powerful tools for automation, community, and transparency.
- A few GitHub API calls give you a first-pass view of your project's sustainability.
- Archive with dignity: good open source is remembered even after it's inactive.
- Community engagement is critical for long-term sustainability.
- **Federal projects** can benefit from specialized tools like DSACMS repo-scaffolder for compliance.
- **Academic projects** should follow the 6 key practices from OpenSource@Stanford for immediate improvement.

### 🚀 Next Steps
- Run the analysis tool on your project.
- Address the gaps the tool reports, starting with any ❌ indicators.
- Add missing sustainability features (LICENSE, README, CONTRIBUTING.md, etc.).
- Encourage contributions through community practices.
- Plan for the archive phase with clear documentation and guidance.
- **For federal projects**: Consider using DSACMS repo-scaffolder templates.
- **For academic projects**: Implement the Stanford OpenSource team's 6 recommendations.

### 📜 Choosing a License

Selecting the right license is crucial for the success of your open source project.

- **For Corporate Use**: Apache 2.0 license is often preferred due to its permissive nature and explicit patent grant, fostering trust and legal clarity.
- **For Copyleft Projects**: GNU GPL enforces strong copyleft (anyone who distributes a modified version must release it under the same license), while LGPL offers more flexibility (allows linking with proprietary software).
- **For Academic Use**: Permissive licenses (MIT, BSD) are often the best choice.
- **For Non-code Projects**: Creative Commons licenses are recommended.

Choose a license that aligns with your project's goals and community values. If you're unsure, [ChooseALicense.com](https://choosealicense.com/) can help you select the right license for your project, and you can reach out to [GT's OSPO for guidance](https://ospo.cc.gatech.edu/).