TheRemote/365SharePointCleanup

SharePoint Storage Cleanup Scripts

PowerShell scripts for reducing SharePoint Online storage usage across Microsoft 365 tenants.

Author: James A. Chambers

This project focuses on the three biggest hidden storage cost drivers in SharePoint:

  1. Duplicate files copied across libraries and sites
  2. Version history bloat from excessive file revisions
  3. Missing versioning policies that allow unlimited growth over time

These scripts help identify wasted storage, remove unnecessary historical versions, and enforce preventative controls to keep tenant storage predictable and cost-efficient.

Full blog post available here: https://jamesachambers.com/cleaning-sharepoint-find-duplicates-trim-versions-retention/


Why This Matters

Microsoft 365 includes:

  • 1 TB of base SharePoint storage per tenant
  • +10 GB per eligible licensed user

Once you exceed that allocation, additional SharePoint storage is billed separately.

In many tenants, a significant percentage of that paid storage is unnecessary:

  • the same files stored multiple times
  • old file versions nobody will ever restore
  • unlimited version history growing silently for years

Many environments can reclaim roughly 20–60% of used storage from version cleanup alone, depending on how often files are edited.


Included Scripts


1. Find Duplicate Files Across Drives

Find-DriveItemDuplicates

Uses Microsoft Graph and QuickXorHash to identify files with identical content, even when:

  • filenames are different
  • files exist in different libraries
  • files exist across different SharePoint locations

What it does

  • recursively scans a document library
  • collects file hashes
  • groups matching files by content hash
  • calculates potential wasted storage
  • exports results to CSV

Output

  • duplicate file groups
  • duplicate count
  • file sizes
  • estimated reclaimable space
  • file paths / URLs
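
The grouping and wasted-space steps above can be sketched roughly as follows. This is an illustration, not the code of Find-DriveItemDuplicates itself: Get-DuplicateGroups is a hypothetical helper, and the input objects are assumed to carry Path, Size (bytes) and QuickXorHash properties already collected from Graph.

```powershell
# Hypothetical helper: groups file records by content hash and estimates waste.
# Assumption: each input object has Path, Size (bytes) and QuickXorHash properties.
function Get-DuplicateGroups {
    param(
        [Parameter(Mandatory)] [object[]] $Files
    )

    $Files |
        Where-Object { $_.QuickXorHash } |                     # skip items without a hash
        Group-Object -Property QuickXorHash -CaseSensitive |   # base64 hashes are case-sensitive
        Where-Object { $_.Count -gt 1 } |                      # keep only real duplicates
        ForEach-Object {
            $size = $_.Group[0].Size
            [pscustomobject]@{
                Hash             = $_.Name
                DuplicateCount   = $_.Count
                FileSizeBytes    = $size
                # Keeping one copy frees the space used by all the others.
                ReclaimableBytes = ($_.Count - 1) * $size
                Paths            = ($_.Group.Path -join '; ')
            }
        }
}

# Made-up sample data: three copies of one file under different names, plus one unique file.
$sample = @(
    [pscustomobject]@{ Path = '/sites/hr/Docs/policy.pdf';        Size = 2MB; QuickXorHash = 'AAA=' }
    [pscustomobject]@{ Path = '/sites/hr/Archive/policy-old.pdf'; Size = 2MB; QuickXorHash = 'AAA=' }
    [pscustomobject]@{ Path = '/sites/legal/Docs/copy.pdf';       Size = 2MB; QuickXorHash = 'AAA=' }
    [pscustomobject]@{ Path = '/sites/legal/Docs/contract.docx';  Size = 1MB; QuickXorHash = 'BBB=' }
)

$groups = @(Get-DuplicateGroups -Files $sample)
```

Piping $groups to Export-Csv would produce a report along the lines of the output listed above.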

2. Trim Version History

Two supported approaches:


Approach A: SharePoint Online Batch Jobs (Recommended)

New-SPOListFileVersionBatchDeleteJob

Best for large-scale cleanup.

What it does

  • deletes versions older than X days
  • limits retained major versions
  • limits retained minor versions
  • runs server-side asynchronously
  • avoids Graph API throttling

Recommended for

  • production tenants
  • large environments
  • enterprise-wide cleanup
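
As a rough illustration of Approach A, the sketch below builds the arguments for an age-based trim job and shows, in a comment, how they would be splatted into the cmdlet. Get-VersionTrimJobArgs is a hypothetical helper, and the site URL and library name are placeholders.

```powershell
# Hypothetical helper: builds the arguments for an age-based version trim job.
# Not part of this repository; the site URL and library name below are placeholders.
function Get-VersionTrimJobArgs {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)] [string] $SiteUrl,
        [Parameter(Mandatory)] [string] $Library,
        [ValidateRange(1, 36500)] [int] $DeleteBeforeDays = 365
    )

    # Keys mirror the -Site, -List and -DeleteBeforeDays parameters of the cmdlet.
    return @{
        Site             = $SiteUrl
        List             = $Library
        DeleteBeforeDays = $DeleteBeforeDays
    }
}

$jobArgs = Get-VersionTrimJobArgs -SiteUrl 'https://contoso.sharepoint.com/sites/finance' `
                                  -Library 'Documents' -DeleteBeforeDays 180

# After Connect-SPOService, the job would be queued with:
#   New-SPOListFileVersionBatchDeleteJob @jobArgs
```

Keeping the arguments in a hashtable makes it easy to log or review exactly what will be deleted before the job is submitted.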

Approach B: Microsoft Graph Direct Version Deletion

Provides granular control when you need precision.

What it does

  • deletes versions older than a specific date
  • preserves current versions
  • works across all libraries in a site
  • supports custom logic per file

Recommended for

  • highly customized cleanup
  • selective retention policies
  • advanced reporting workflows
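
The per-file selection logic for Approach B might look like the minimal sketch below. Select-VersionsToDelete is a hypothetical helper, and the version objects are assumed to expose Id, LastModifiedDateTime and IsCurrent properties; the Graph call that actually deletes each version is deliberately left out.

```powershell
# Hypothetical helper: picks which versions of one file are old enough to delete.
# Assumption: version objects expose Id, LastModifiedDateTime and IsCurrent.
function Select-VersionsToDelete {
    param(
        [Parameter(Mandatory)] [object[]] $Versions,
        [Parameter(Mandatory)] [datetime] $Cutoff
    )

    $Versions | Where-Object {
        # Never touch the current version, however old it is.
        -not $_.IsCurrent -and $_.LastModifiedDateTime -lt $Cutoff
    }
}

# Made-up version history for a single file.
$versions = @(
    [pscustomobject]@{ Id = '3.0'; LastModifiedDateTime = [datetime]'2025-06-01'; IsCurrent = $true }
    [pscustomobject]@{ Id = '2.0'; LastModifiedDateTime = [datetime]'2024-03-15'; IsCurrent = $false }
    [pscustomobject]@{ Id = '1.0'; LastModifiedDateTime = [datetime]'2022-01-10'; IsCurrent = $false }
)

$toDelete = @(Select-VersionsToDelete -Versions $versions -Cutoff ([datetime]'2023-01-01'))
```

Separating selection from deletion lets you export and review the candidate list before anything is removed.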

3. Measure Version Bloat

Get-SharePointVersionSize

Calculates total storage consumed by non-current file versions.

What it does

  • scans document libraries
  • retrieves historical versions
  • totals previous-version storage usage
  • provides a before/after cleanup baseline
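
The totalling step can be sketched as below. This is not the implementation of Get-SharePointVersionSize: Measure-VersionBloat is a hypothetical helper, and each file object is assumed to have a Versions collection whose entries carry Size (bytes) and IsCurrent.

```powershell
# Hypothetical helper: totals the storage held by non-current versions.
# Assumption: each file has a Versions collection with Size (bytes) and IsCurrent.
function Measure-VersionBloat {
    param(
        [Parameter(Mandatory)] [object[]] $Files
    )

    [long] $bytes = 0
    foreach ($file in $Files) {
        foreach ($version in $file.Versions) {
            if (-not $version.IsCurrent) { $bytes += $version.Size }
        }
    }

    [pscustomobject]@{
        FileCount            = $Files.Count
        PreviousVersionBytes = $bytes
        PreviousVersionGB    = [math]::Round($bytes / 1GB, 2)
    }
}

# Made-up sample: one heavily edited file and one file with no history.
$files = @(
    [pscustomobject]@{ Name = 'budget.xlsx'; Versions = @(
        [pscustomobject]@{ Size = 5MB; IsCurrent = $true }
        [pscustomobject]@{ Size = 5MB; IsCurrent = $false }
        [pscustomobject]@{ Size = 4MB; IsCurrent = $false }
    ) }
    [pscustomobject]@{ Name = 'readme.txt'; Versions = @(
        [pscustomobject]@{ Size = 1MB; IsCurrent = $true }
    ) }
)

$baseline = Measure-VersionBloat -Files $files
```

Running the same measurement after cleanup and subtracting the two PreviousVersionBytes values gives the reclaimed space.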

4. Set Preventative Version Limits

Applies tenant-wide controls to stop future storage sprawl.

What it does

  • sets MajorVersionLimit
  • sets MajorWithMinorVersionsLimit
  • standardizes retention across sites

Example policy

  • 50 major versions (MajorVersionLimit)
  • minor versions kept for the 10 most recent major versions (MajorWithMinorVersionsLimit)

This prevents files from accumulating hundreds of unnecessary revisions.
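
One way to keep such a policy consistent across sites is to define it once as a hashtable and splat it into whichever cmdlet applies the limits. The sketch below only defines the values; the key names are the two settings listed above, and the cmdlet you splat them into is left as an assumption to verify with Get-Help in your environment.

```powershell
# Example policy values from this README, kept in one place for reuse.
$versionPolicy = @{
    # Maximum number of major versions kept per file.
    MajorVersionLimit           = 50
    # Number of major versions that keep their minor (draft) versions.
    MajorWithMinorVersionsLimit = 10
}

# The policy would then be splatted into the cmdlet that applies it, e.g.:
#   <Your-Set-Cmdlet> -Identity $siteUrl @versionPolicy
```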


Recommended Cleanup Runbook

Run scripts in this order:

Step 1 — Measure

Run:

Get-SharePointVersionSize

Establish baseline version storage usage.


Step 2 — Find Duplicates

Run:

Find-DriveItemDuplicates

Identify duplicate file waste.


Step 3 — Delete Old Versions

Run:

  • SPO batch job (preferred)
  • or Graph direct deletion

Remove unnecessary historical versions.


Step 4 — Set Version Limits

Apply retention caps across all sites.

Prevent future version sprawl.


Step 5 — Re-Measure

Confirm actual reclaimed storage.

Track measurable savings.


Requirements

Depending on the script used:

Microsoft Graph PowerShell

Install-Module Microsoft.Graph

SharePoint Online Management Shell

Install-Module Microsoft.Online.SharePoint.PowerShell

Required Permissions

Typical required permissions include:

Microsoft Graph

  • Files.Read.All
  • Sites.Read.All
  • Sites.ReadWrite.All (for deletion workflows)

SharePoint Admin

  • SharePoint Administrator
  • Global Administrator (depending on tenant policy)

Rate Limit Warning

Important: Graph API deletion can hit 429 Too Many Requests very quickly in large tenants.

For enterprise-scale cleanup, always prefer SPO batch jobs because they:

  • run server-side
  • avoid API quota limits
  • are Microsoft-supported for this exact workload
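
When Graph deletion is unavoidable, wrapping each call in a retry with exponential backoff softens the 429 responses mentioned above. The sketch below is a minimal illustration: Invoke-WithBackoff is a hypothetical helper, and it assumes throttling errors mention 429 in the exception message. Throttled Graph responses usually carry a Retry-After header, and honoring that value is preferable where your client exposes it.

```powershell
# Hypothetical helper: retries an action with exponential backoff when throttled.
function Invoke-WithBackoff {
    param(
        [Parameter(Mandatory)] [scriptblock] $Action,
        [int] $MaxAttempts = 5,
        [double] $BaseDelaySeconds = 2
    )

    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            return & $Action
        }
        catch {
            # Assumption: throttling errors mention 429 in the message text.
            $throttled = $_.Exception.Message -match '429'
            # Give up on non-throttling errors or when attempts are exhausted.
            if (-not $throttled -or $attempt -eq $MaxAttempts) { throw }

            # Delay doubles on every retry: base, 2x base, 4x base, ...
            $delay = $BaseDelaySeconds * [math]::Pow(2, $attempt - 1)
            Start-Sleep -Milliseconds ([int]($delay * 1000))
        }
    }
}

# Simulated call that is throttled twice before it succeeds.
$state = @{ Calls = 0 }
$result = Invoke-WithBackoff -BaseDelaySeconds 0.01 -Action {
    $state.Calls++
    if ($state.Calls -lt 3) { throw '429 Too Many Requests' }
    'deleted'
}
```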

Safety Notes

Before running deletion scripts, always:

  • test in a non-production site first
  • export reports before deleting
  • validate retention requirements
  • confirm legal/compliance policies
  • verify backup expectations

These scripts can permanently remove historical versions.

Use carefully.


Use Cases

Ideal for:

  • Microsoft 365 consultants
  • SharePoint administrators
  • MSPs managing multiple tenants
  • IT teams reducing licensing costs
  • tenant remediation projects
  • pre-migration cleanup
  • post-acquisition environment consolidation
