halfmoonai/cleanfile

@halfmoonai/cleanfile

Live site: https://cleanfile.app

Lossless file metadata parser & cleaner for the browser.

Strip EXIF, GPS, author info and other privacy-leaking metadata from images, PDFs, audio, video, Office documents (DOCX/XLSX/PPTX), and ZIP archives — entirely client-side, no server upload.

Supported Formats

Category   Parse   Clean   Formats
Image                      JPEG, PNG, WebP, HEIC, AVIF, SVG, GIF, TIFF, BMP, ICO
PDF                        PDF
Audio                      MP3 (ID3v1/ID3v2), WAV, FLAC, OGG Vorbis/Opus, M4A
Video                      MP4, MOV (ISO BMFF)
Office                     DOCX, XLSX, PPTX
ZIP                        ZIP

Install

npm install @halfmoonai/cleanfile
# or
yarn add @halfmoonai/cleanfile
# or
pnpm add @halfmoonai/cleanfile

Usage

import {
  detectFile,
  parseImageMetadata,
  cleanImage,
  downloadBlob,
  cleanFilename,
} from '@halfmoonai/cleanfile'

// File from an <input type="file"> or a drag & drop event
declare const file: File

// Detect file type
const { category } = detectFile(file) // 'image' | 'pdf' | 'audio' | ...

// Parse metadata
const meta = await parseImageMetadata(file)
console.log(meta.hasGps, meta.latitude, meta.longitude)
console.log(meta.make, meta.model, meta.camera)

// Clean (strip all metadata, lossless)
const cleanedBlob = await cleanImage(file)
downloadBlob(cleanedBlob, cleanFilename(file.name))

API

File Detection

  • detectFile(file: File) → { file, category, mimeType, extension }
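
The README does not say how detectFile decides on a category. As a rough illustration only, one plausible approach is a lookup on the file extension, using the formats listed under Supported Formats. In the sketch below, the function name categoryFromFilename, the category strings 'video', 'office', 'zip' and 'unknown', and the extension aliases (jpg, tif, opus) are all assumptions, not part of the library's documented API.

```typescript
// Hypothetical sketch: NOT the library's implementation of detectFile.
// Only 'image' | 'pdf' | 'audio' appear in the README; the rest are assumed.
type Category = "image" | "pdf" | "audio" | "video" | "office" | "zip" | "unknown";

// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const EXTENSION_CATEGORIES = new Map<string, Category>([
  ["jpg", "image"], ["jpeg", "image"], ["png", "image"], ["webp", "image"],
  ["heic", "image"], ["avif", "image"], ["svg", "image"], ["gif", "image"],
  ["tif", "image"], ["tiff", "image"], ["bmp", "image"], ["ico", "image"],
  ["pdf", "pdf"],
  ["mp3", "audio"], ["wav", "audio"], ["flac", "audio"], ["ogg", "audio"],
  ["opus", "audio"], ["m4a", "audio"],
  ["mp4", "video"], ["mov", "video"],
  ["docx", "office"], ["xlsx", "office"], ["pptx", "office"],
  ["zip", "zip"],
]);

function categoryFromFilename(name: string): { extension: string; category: Category } {
  // A leading dot (".gitignore") is treated as "no extension".
  const dot = name.lastIndexOf(".");
  const extension = dot > 0 ? name.slice(dot + 1).toLowerCase() : "";
  const category = EXTENSION_CATEGORIES.get(extension) ?? "unknown";
  return { extension, category };
}
```

An extension lookup is cheap but easy to fool with a renamed file, so a real detector might also check the MIME type or the file's leading bytes.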

Image

  • parseImageMetadata(file: File) → Promise<ImageMetadata>
  • cleanImage(file: File) → Promise<Blob>

PDF

  • parsePdfMetadata(file: File) → Promise<PdfMetadata>
  • cleanPdf(file: File) → Promise<Blob>

Audio

  • parseAudioMetadata(file: File) → Promise<AudioMetadata>
  • cleanAudio(file: File) → Promise<Blob>

Video

  • parseVideoMetadata(file: File) → Promise<VideoMetadata>
  • cleanVideo(file: File) → Promise<Blob>

Office (DOCX / XLSX / PPTX)

  • parseWordMetadata(file: File) → Promise<WordMetadata>
  • cleanWord(file: File) → Promise<Blob>

ZIP

  • parseZipMetadata(file: File) → Promise<ZipMetadata>
  • cleanZip(file: File) → Promise<Blob>

Utilities

  • downloadBlob(blob: Blob, filename: string) - triggers a browser download
  • cleanFilename(name: string) → string - prefixes the name with clean_

How It Works

All cleaning is lossless — no re-encoding, no quality loss:

  • JPEG: strips APP1/APP2/APP13 marker segments (EXIF, XMP, ICC, IPTC)
  • PNG: removes tEXt, iTXt, zTXt, eXIf, tIME chunks
  • WebP: removes EXIF/XMP RIFF chunks
  • HEIC/AVIF: neutralizes Exif items in ISO BMFF container
  • SVG: removes <metadata> elements and XML comments
  • PDF: clears Info dictionary (title, author, creator, producer, dates)
  • MP3: strips ID3v2 header and ID3v1 tail
  • WAV: removes LIST/INFO RIFF chunks
  • FLAC: replaces Vorbis Comment with empty block
  • OGG: replaces comment packet with empty Vorbis Comment
  • M4A: removes udta/meta atoms, zeroes mvhd timestamps
  • MP4/MOV: removes udta/meta atoms, zeroes mvhd/tkhd/mdhd timestamps
  • DOCX/XLSX/PPTX: clears core.xml and app.xml metadata, removes comments
  • ZIP: re-archives without comments and with normalized timestamps
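
To make the JPEG item above concrete, here is a minimal sketch of segment-level stripping. It walks the marker segments before the start-of-scan (SOS) marker, drops APP1 (0xFFE1, EXIF/XMP), APP2 (0xFFE2, ICC) and APP13 (0xFFED, IPTC), and copies everything else byte for byte, so the compressed image data is never re-encoded. The function name stripJpegAppSegments is hypothetical and this is not the library's actual code. As a simplification, it assumes every marker before SOS carries a length field and that no fill bytes sit between segments.

```typescript
// Hypothetical sketch of lossless JPEG metadata stripping; not the library's code.
// Second marker byte of the segments to drop: APP1, APP2, APP13.
const DROPPED_MARKERS = new Set<number>([0xe1, 0xe2, 0xed]);

function stripJpegAppSegments(input: Uint8Array): Uint8Array {
  if (input.length < 4 || input[0] !== 0xff || input[1] !== 0xd8) {
    throw new Error("not a JPEG: missing SOI marker");
  }
  const kept: Uint8Array[] = [input.subarray(0, 2)]; // keep SOI (FF D8)
  let pos = 2;

  while (pos + 4 <= input.length) {
    if (input[pos] !== 0xff) {
      throw new Error(`expected a marker at offset ${pos}`);
    }
    const marker = input[pos + 1];
    if (marker === 0xda) break; // SOS: entropy-coded data follows, stop parsing

    // The 16-bit big-endian length counts its own two bytes but not the marker.
    const length = (input[pos + 2] << 8) | input[pos + 3];
    const end = pos + 2 + length;
    if (length < 2 || end > input.length) {
      throw new Error(`truncated segment at offset ${pos}`);
    }
    if (!DROPPED_MARKERS.has(marker)) {
      kept.push(input.subarray(pos, end));
    }
    pos = end;
  }

  // Copy SOS, the compressed scan data and EOI untouched.
  kept.push(input.subarray(pos));

  const total = kept.reduce((sum, part) => sum + part.length, 0);
  const output = new Uint8Array(total);
  let offset = 0;
  for (const part of kept) {
    output.set(part, offset);
    offset += part.length;
  }
  return output;
}
```

In a browser, the input could come from new Uint8Array(await file.arrayBuffer()) and the result could be wrapped in a Blob of type image/jpeg. The other formats in the list follow the same idea with different containers: walk the chunks, atoms or blocks, and skip or blank the ones that carry metadata.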

Development

# Install dependencies
yarn install

# Run tests
yarn test

# Build
yarn build
