Recommendation to strengthen public access and preservation of executive agency publications #65

Open
freegovinfo opened this Issue Nov 20, 2015 · 8 comments

freegovinfo commented Nov 20, 2015

OMB’s Circular A-130, "Managing Information as a Strategic Resource," and the 1980 Paperwork Reduction Act (PRA), 94 Stat. 2812, which “establishes a broad mandate for agencies to perform their information resources management activities in an efficient, effective, and economical manner,” do not directly address long-term access to and preservation of Federal government information, and have therefore had unintended negative consequences for both.

These policies, along with agency practices of using the web for distribution without attending to consistent standards or to preservation, have resulted in many incompatible, inconsistent, often badly indexed, and difficult-to-use agency web sites and a proliferation of “deep Web” .gov databases. This not only makes it more difficult for individuals to find and use the government information they need, it also makes it difficult for institutions to identify, acquire, describe, and preserve agency “publication information” (defined in draft A-130 line 1066) for the long term. Such institutions include the Government Publishing Office (GPO), libraries in the Federal Depository Library Program (FDLP), the Internet Archive, and organizations (like Sunlight Foundation, the Government Accountability Project, and Open The Government) that promote open government and government transparency.

Draft A-130 mentions “threats” on page 1, but does not mention the threat of loss of information because of insufficient preservation actions. We therefore recommend that A-130 be updated to require that all government agencies facilitate preservation of and long-term free public access to their publications. This would place on information produced at government expense by government agencies the same requirement that the National Science Foundation (NSF) and other government funding agencies place on data produced by government-funded research.

Requirement:

Every government agency should be required to have an “Information Management Plan” for the public information it acquires, assembles, creates, and disseminates. The Information Management Plan should specify how the agency’s public information will be preserved for long-term, free public access and use, including its deposit in a reputable, trusted government or non-government digital repository (including, but not limited to, GPO's FDsys). All executive agencies should deposit their publications in FDsys.gov and their data in data.gov.

As one aspect of implementation of the Information Management Plan, we recommend that every government agency be required to make its own website compatible with a few basic, consistent requirements to make it easier for the public to discover, acquire and use its information. Each agency’s site – including subdomains – should be required to:

  1. Follow Web standards and provide site maps. All agency sites should be Archive ready (http://archiveready.com);

  2. Use a standardized directory structure that identifies major types of information (e.g., ../publications ../data ../video ../blog ../podcast ../pressreleases ../rss etc);

  3. Have permanent URLs, in the form of DOIs or some other standard, for all agency publications and other information products.
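
As a rough illustration of how requirements 1 and 2 could be checked mechanically, here is a minimal Python sketch. It assumes the directory names given as examples in item 2; the function names, the `agency.example.gov` host, and the sample URLs are hypothetical and are not part of draft A-130 or of any agency's actual site.

```python
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Assumed top-level directories, taken from the examples in item 2.
STANDARD_DIRS = {"publications", "data", "video", "blog",
                 "podcast", "pressreleases", "rss"}

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def top_level_dir(url: str) -> str:
    """Return the first path segment of a URL, or "" if there is none."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[0] if segments else ""


def nonconforming(urls):
    """List the URLs whose first path segment is not a standard directory."""
    return [u for u in urls if top_level_dir(u) not in STANDARD_DIRS]


def build_sitemap(urls) -> str:
    """Serialize the URLs as a minimal sitemaps.org <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for u in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = u
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical sample URLs for illustration only.
sample_urls = [
    "https://agency.example.gov/publications/report-2015.pdf",
    "https://agency.example.gov/about/staff.html",
]
```

Calling `nonconforming(sample_urls)` would flag the second URL, since `about` is not among the assumed directory names, and `build_sitemap(sample_urls)` produces a site map that a crawler or an archiving institution could use to discover both pages.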

Updating A-130 for the 21st century to take full advantage of the Internet will bring executive agencies in line with the White House’s Open Government Initiative, will facilitate public access, will require the preservation of agency publications, and will facilitate economic efficiency by encouraging centralized digital preservation while allowing for the use and expansion of non-government digital repositories.

Respectfully submitted,

James A. Jacobs and James R. Jacobs
Free Government Information
http://freegovinfo.info

cldavids commented Nov 20, 2015

This change would help GPO get more information and give it authority that Title 44 of the U.S. Code (T44) does not. It would allow GPO to add digital deposit to FDLP libraries to its workflow without a change to T44. It would make it easier for FDLP libraries and other non-government organizations to work with GPO and agencies to preserve and provide long-term access to government information. A rule-based infrastructure would give libraries the ability to prevent loss of information and loss of access to information even during a government shutdown or defunding, or if a change in an agency's mission or policies takes information offline.

shinjoung commented Nov 20, 2015

I could also see a robots.txt requirement that makes agency sites open to the Internet Archive and other public Web crawlers. I note that many agency Websites have robots.txt files that were set in the late '90s or early 2000s and are still extremely restrictive. Even GPO, which is supposed to distribute documents to libraries and the public, has its robots.txt set to exclude the Internet Archive. See for example https://web.archive.org/web/*/http://purl.fdlp.gov/GPO/gpo56804 and https://web.archive.org/web/http://purl.fdlp.gov/GPO/gpo56917
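
To make the robots.txt point concrete, here is a minimal Python sketch of how one might test whether a given crawler is allowed to fetch a URL. It uses the standard library's `urllib.robotparser`. The two robots.txt bodies are made-up illustrations, not GPO's actual file, and `ia_archiver` is assumed here as the user-agent token for the Internet Archive's crawler; `crawler_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative restrictive policy: one crawler is shut out of the whole site.
RESTRICTIVE = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""

# Illustrative open policy: every crawler may fetch everything.
OPEN = """\
User-agent: *
Disallow:
"""

url = "http://purl.fdlp.gov/GPO/gpo56804"
blocked = crawler_allowed(RESTRICTIVE, "ia_archiver", url)   # False
allowed = crawler_allowed(OPEN, "ia_archiver", url)          # True
```

A requirement along these lines could be audited by fetching each agency's robots.txt and running the same check for the crawlers that preservation institutions rely on.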

Even though the U.S. federal court system is not affected by A-130, the Supreme Court still offers a good example of the consequences of government agencies not taking data permanence seriously enough: apparently 49% of the web links in Supreme Court decisions are stale. It's crucial that digital content put out by the federal government be permanently findable and independently archivable, just as it is with court decisions.

kfogel commented Nov 20, 2015

+1

Even though the U.S. federal court system is not affected by A-130, the Supreme Court still offers a good example of the consequences of government agencies not taking data permanence seriously enough: apparently 49% of the web links in Supreme Court decisions are stale. It's crucial that digital content put out by the federal government be permanently findable and independently archivable, just as it is with court decisions.

freegovinfo commented Nov 20, 2015

Great point, Karl.

vreich commented Nov 21, 2015

Strongly Support: A-130 be updated to require that all government agencies facilitate preservation of and long-term free public access to their publications.

sinaiwood commented Nov 23, 2015

I support any revisions to A-130 that address long-term preservation and long-term access to government information.

sharilaster commented Nov 29, 2015

I completely agree with these comments. The trustworthiness of Federal information systems referenced throughout includes the public's confidence that content, records, and publications are appropriately managed and will remain freely available for the long term in a transparent manner. Therefore, the goal of information management with respect to these systems should explicitly include enabling the long-term access and preservation of disseminated information. The development of Information Management Plans would provide public roadmaps for how Federal agencies are enabling the Government Publishing Office (GPO) and public organizations to capture and manage government publications, and would increase public trust in the complete Federal information dissemination lifecycle.

mhucka commented Jan 28, 2017

Sorry to come in late. I was wondering whether it might also be important to say something about time: (1) how quickly after publication materials should be deposited in a digital repository, and (2) for how long the materials should be guaranteed to be available.
