
Notes for my talk about Web Archiving to Jane Zhang's Digital Curation class.


edsu/zhang-webarchiving


A Brief Look at Web Archiving

Notes for Jane Zhang's Digital Curation class at Catholic University. March 5, 2014.

Hi

Who Cares?

How much of the Web is archived?

  • Not a solved problem.
  • IA: 366 billion
  • IIPC: 75 billion
  • Google: 1T URLs
  • generous guesstimate: 44%
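The notes do not say how the 44% was reached, but the figures above reproduce it if the two archive counts are simply added and divided by Google's URL count. A minimal sketch of that arithmetic, assuming no overlap between the IA and IIPC holdings (which is what would make the guess generous):

```python
# Counts as quoted in the list above, in billions.
ia_archived = 366
iipc_archived = 75
google_known_urls = 1000  # "1T URLs"

# Assumption: IA and IIPC holdings do not overlap, so they can be summed.
archived_share = (ia_archived + iipc_archived) / google_known_urls

print(f"{archived_share:.0%}")  # prints 44%
```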

Despair?

Even if archivists in a particular country were to preserve every record generated throughout the land, they would still have only a sliver of a window into that country’s experience. But of course in practice, this record universum is substantially reduced through deliberate and inadvertent destruction by records creators and managers, leaving a sliver of a sliver from which archivists select what they will preserve. And they do not preserve much.

The archival record is best understood as a sliver of a sliver of a sliver of a window into process. It is a fragile thing, an enchanted thing, defined not by its connection to “reality”, but by its open-ended layerings of construction and reconstruction.

-- Verne Harris - The Archival Sliver

lower case "p" politics

Library of Congress

  • team of 6 + Internet Archive
  • selection
  • notification
  • seed lists, scoping
  • quality control
  • embargo period
  • access!
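To make "seed lists, scoping" concrete: a seed list is the set of starting URLs for a crawl, and scoping decides which discovered links belong to the collection. The sketch below is only an illustration, not how the Library of Congress or the Internet Archive actually scope crawls. The `in_scope` helper and the `example.org` URLs are hypothetical, and the rule (same host, same path prefix as a seed) is one simple choice among many.

```python
from urllib.parse import urlsplit


def in_scope(url, seeds):
    """Return True if url shares a host and path prefix with any seed URL."""
    target = urlsplit(url)
    for seed in seeds:
        s = urlsplit(seed)
        if target.hostname == s.hostname and target.path.startswith(s.path):
            return True
    return False


# Hypothetical seed list with a single entry.
seeds = ["https://example.org/blog/"]

inside = in_scope("https://example.org/blog/2014/03/post.html", seeds)
other_path = in_scope("https://example.org/shop/", seeds)
other_host = in_scope("https://other.example.com/blog/", seeds)
print(inside, other_path, other_host)  # prints True False False
```

A rule this strict misses embedded resources hosted elsewhere (images, stylesheets, video), which is part of why scoping and quality control sit next to each other in the list above.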

The Wider Web

Nuts & Bolts

Interlude: Web Packages

Challenges

  • scoping (backlinks)
  • streaming video / audio
  • dynamic content / ghost
  • funding (sustainable)
  • copyright
  • storage space
  • format migration
  • digital preservation significant characteristics?
  • collection development: seed lists, inventory
  • single point of failure (IA)

What can you do?

  • Big data is great, but start with small data:
    • your organization's web presence
    • local blogs
    • local government
    • local arts scene / businesses
  • Website owners:
    • Permalinks/Cool URIs
    • robots.txt
    • sitemaps
  • Personal Digital Archiving
    • outreach with your community
    • best practices / guidance
  • Keep an open mind.
  • Have a whole class about web archiving!
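For the website-owner items in the list above, here is a minimal sketch of how a crawler reads `robots.txt`, using Python's standard `urllib.robotparser`. The file contents, the `archive-bot` user agent, and the `example.org` URLs are made up for illustration. `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one directory, advertise a sitemap.
robots_txt = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# "archive-bot" is a made-up crawler name; it falls under the "*" rules.
can_crawl_post = parser.can_fetch("archive-bot", "https://example.org/blog/post.html")
can_crawl_private = parser.can_fetch("archive-bot", "https://example.org/private/notes.html")
sitemaps = parser.site_maps()

print(can_crawl_post, can_crawl_private, sitemaps)
```

Anything disallowed here may be skipped by crawlers that honor the file, so an overly broad `Disallow` rule can keep a site out of an archive.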

Learn More
