InHerOwnRight/InHOR-Dataset

In Her Own Right Dataset

About the data

The In Her Own Right dataset consists of 13,295 item-level records aggregated from over twenty different institutions in the Philadelphia area and beyond. A full list of contributing institutions, as well as more information about the project, can be found on the In Her Own Right website.

Each contributing institution hosts its own records and digital objects, but the In Her Own Right project makes the aggregated metadata available through a custom search interface, through API access, and as a download in this repository and on the project website.

Records are Dublin Core-compliant and adhere to the In Her Own Right metadata guidelines, which were designed for maximum flexibility and inclusivity. There is a minimum standard of metadata via required fields, but institutions were encouraged to enhance their records by adding recommended fields. Records are ingested into the In Her Own Right database via OAI-PMH harvest or, in cases where that is not possible, manual ingest via CSV. Visit the project documentation page for more information.

About the dataset

This repository includes the "full" dataset in XML format as returned from the project's API endpoint, and in tabular format as a CSV.

It also includes a "curated" version of the dataset in CSV format that has been lightly re-organized and edited for ease of use. Alterations include re-ordering some fields, combining redundant or overlapping fields, and adding certain fields:

  • The dc:identifier field was sorted, cleaned, and split into two distinct fields: a "Local Identifier" field and a "Record URL" field. The dc:identifier field was then deleted.
  • A "Contributing Institution" field was added, because in the InHOR database this information is assigned during the ingest process and is not located in the record itself.
  • The "dc:relation" field was merged into the "dcterms:IsPartOf" field and then deleted. The "dc:relation" field appears in records harvested via OAI, but is redundant because it maps to the "IsPartOf" field that appears in the metadata guidelines.
  • The "dc:coverage" field was merged into the "dcterms:spatial" field and then deleted. The "dc:coverage" field appears in records harvested via OAI, but is redundant because it maps to the "spatial" field that appears in the metadata guidelines.
  • Fields were re-ordered for clarity and ease of use.
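The two kinds of alteration described above (splitting one field and merging two) can be sketched in Python as follows. This is an illustrative sketch only, not the script used to produce the "curated" CSV: the function names are hypothetical, the example URL is a placeholder, and it assumes that record URLs are the entries beginning with "http://" or "https://" and that multiple entries are separated by a pipe.

```python
def split_identifiers(dc_identifier):
    """Split a pipe-separated dc:identifier value into local IDs and URLs.

    Assumption: any entry starting with http:// or https:// is a record URL;
    everything else is treated as a local identifier.
    """
    entries = [e.strip() for e in dc_identifier.split("|") if e.strip()]
    urls = [e for e in entries if e.lower().startswith(("http://", "https://"))]
    local_ids = [e for e in entries if e not in urls]
    return "|".join(local_ids), "|".join(urls)


def merge_fields(primary, redundant):
    """Merge a redundant field into a primary one, dropping duplicate entries.

    Order is preserved: entries from the primary field come first.
    """
    merged = []
    for value in (primary, redundant):
        for entry in value.split("|"):
            entry = entry.strip()
            if entry and entry not in merged:
                merged.append(entry)
    return "|".join(merged)


# Hypothetical record values, for illustration only.
local_id, record_url = split_identifiers("a289_001|https://example.org/items/a289_001")
print(local_id)    # local identifier part
print(record_url)  # URL part
print(merge_fields("Ann Preston papers, Acc-029", "Ann Preston papers, Acc-029"))
```

Deduplicating while merging keeps a collection title from appearing twice when both the source field and the target field carried the same value.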

Metadata schema

Below is a guide to each field in In Her Own Right metadata records. It follows the order of the fields in the "curated" CSV for clarity and ease of use.

For fields that appear in the project metadata guidelines, the explanation below is drawn from those guidelines. Fields marked with * were added to the "curated" CSV. Fields marked with ** were removed from the "curated" CSV but appear in the original XML.

Note that for all repeatable fields, multiple entries are concatenated and separated by a pipe, “|”, and appear as “entry1|entry2.”
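As a minimal sketch of working with this convention, the Python helpers below split a concatenated value into a list and join a list back into a single cell. The function names are illustrative and not part of the dataset or its tooling; stripping surrounding whitespace is an added assumption.

```python
def split_repeatable(value, separator="|"):
    """Split a concatenated repeatable field into a list of clean entries."""
    if not value:
        return []
    return [entry.strip() for entry in value.split(separator) if entry.strip()]


def join_repeatable(entries, separator="|"):
    """Join a list of entries back into a single pipe-separated value."""
    return separator.join(entries)


subjects = split_repeatable("Women medical students|Women--Education (Higher)")
for subject in subjects:
    print(subject)
```

Empty cells yield an empty list, so callers can loop over the result without checking for missing values first.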

Table of Contents

header - identifier

dc:identifier**

Local Identifier*

Contributing Institution*

Record URL*

dc:title

dc:creator

dc:subject

dc:date

dc:type

dc:language

dc:description

dcterms:spatial

dcterms:extent

dc:contributor

dc:publisher

dcterms:isPartOf

dc:rights

header - datestamp

full text

header - identifier

Not Dublin Core-compliant

This is the In Her Own Right identifier assigned when the record is ingested into the database. It is drawn from the "local" identifier assigned by the contributing institution and found in the dc:identifier field.

It is also used in the URL for the record in the In Her Own Right database. To determine the InHOR database URL for any particular record, take the number or string that appears after "oai:pacscl:" in the "header - identifier" field and add it to the end of http://inherownright.org/records/. So, for instance, "oai:pacscl:31194" in the "header - identifier" field becomes http://inherownright.org/records/31194.
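The rule above can be expressed as a short Python function. This is a sketch under the assumption that every "header - identifier" value begins with the "oai:pacscl:" prefix; the function name is illustrative.

```python
PREFIX = "oai:pacscl:"
BASE_URL = "http://inherownright.org/records/"


def record_url(header_identifier):
    """Build the InHOR database URL from a "header - identifier" value."""
    header_identifier = header_identifier.strip()
    if not header_identifier.startswith(PREFIX):
        raise ValueError(f"unexpected identifier format: {header_identifier!r}")
    suffix = header_identifier[len(PREFIX):]
    if not suffix:
        raise ValueError("identifier has no number or string after the prefix")
    return BASE_URL + suffix


print(record_url("oai:pacscl:31194"))
# http://inherownright.org/records/31194
```

Raising an error on an unexpected prefix, rather than guessing, makes malformed rows easy to spot when processing the whole CSV.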

dc:identifier

Definition: An unambiguous reference to the resource within a given context.

Repeatable: Yes

Required: Yes

Application: Identifier must be unique within your repository.

Example:

a289_001

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:identifier | Identifier | <identifier> |

Local Identifier *

Not Dublin Core-compliant

While the "dc:identifier" field was originally conceived as a non-repeatable field that would contain only the contributing institution's unique local identifier for the record, in practice it expanded to hold the multiple identifiers present in these records. It therefore concatenates several different kinds of identifiers separated by pipes ("|"). For clarity and ease of use, a dedicated "Local Identifier" field was added to the "curated" CSV and the local identifiers were extracted from the main "dc:identifier" field.

Contributing Institution *

Not Dublin Core-compliant

The institution contributing the record to In Her Own Right. While this field displays on the front end of each record in the InHOR search interface, it is assigned during the ingest process and is not stored in the metadata record itself. It has been added to the "curated" CSV for ease of use.

Record URL *

Not Dublin Core-compliant

While the "dc:identifier" field was originally conceived as a non-repeatable field that would contain only the contributing institution's unique local identifier for the record, in practice it expanded to hold the multiple identifiers present in these records, including permanent URLs for the record in the home repository. It therefore concatenates several different kinds of identifiers separated by pipes ("|"). For clarity and ease of use, a dedicated "Record URL" field was added to the "curated" CSV and the record URLs were extracted from the main "dc:identifier" field.

dc:title

Definition: A name given to the resource.

Repeatable: Yes

Required: Yes

Application: InHOR will accept titles in any form. Follow DACS as much as possible.

Examples:

Letter to Hannah Darlington from Ann Preston
Annual Meeting of the Germantown Y.W.C.A.
Industrial Committee meeting minutes
New Century Journal of Women's Interests, 1899

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:title | Title | <titleInfo><title> |

dc:creator

Definition: An entity primarily responsible for making the resource. Could be a person, family, or corporate entity.

Repeatable: Yes

Required: Yes, if known

Application: Before creating a local form of the creator’s name, check the following sources listed in order of preference:

  • LCNAF
  • VIAF
  • http://inherownright.org/ - another contributing repository might have already created a local form of the creator’s name.

If the creator is not found in the above sources, create a local form by following the guidelines in Appendix A - Creating local names for “Subject” or “Creator” fields.

Examples:

Preston, Ann, 1813-1872
Tyng, Anita Elizabeth, -1913
Scarlett, M. J. (Mary J.)
Woman Suffrage Society of the County of Philadelphia
Northern Association of the City and County of Philadelphia for the Relief and Employment of Poor Women

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:creator | Creator | <name><namePart> |

dc:subject

Definition: The topic(s) of the resource.

Repeatable: Yes

Required: Yes. At least one subject is required. InHOR recommends three subjects; use more as needed.

Application: Typically, the subject will be represented using keywords, names, or key phrases. Generally, in considering level of description, adopt this approach: if search results brought a user to the record, would they be disappointed? InHOR recommends the following sources for subjects:

  • http://inherownright.org - another contributing repository might have already used a subject(s) that applies to the resource.
  • FAST subject headings - a simplified version of LCSH that is easier to understand, control, apply, and use.
  • Other controlled vocabularies - LCSH, AAT, LCNAF, VIAF, etc.
  • Pre-selected authorized subject terms and names, compiled and recommended for use in the pilot project’s metadata enhancement event: InHOR subject terms and names (note existence of multiple tabs).

If the subject is a person, family, or corporate entity and the subject is not found in the above sources, create a local form by following the guidelines in Appendix A - Creating local names for “Creator” or "Subject" fields.

For guidance on how to use LCSH, see https://www.loc.gov/catworkshop/lcsh/.

Examples:

Names

Elder, William
Moseley, Nathaniel R.

Topics

Women medical students
Women--Education (Higher)
Female Medical College of Pennsylvania--Students

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:subject | Subject | <subject>, <topic>, <name>, <occupation>, <geographic>, <temporal>, <titleInfo> |

dc:date

Definition: Date of creation of the physical resource.

Repeatable: Yes

Required: Yes

Application: Only dates formatted according to W3CDTF (ISO 8601) (YYYY-MM-DD single date or YYYY-YYYY date range) will be indexed, although anything entered in this field will be displayed.

  • If you only have an approximate date, InHOR recommends entering the date twice: once in machine-readable standard W3CDTF format (e.g. “1850-1860”), and a second time in a human-readable format (e.g. “approximately 1855”).
  • If you do not have a date for the resource, instead of “undated,” use a very broad range in machine-readable W3CDTF, such as “1850-1900”.
  • For uncertain dates, the term “approximately” is preferred.

Examples (using the recommended ISO 8601 (W3CDTF) format YYYY-MM-DD):

For a known date:

1851 OR 1851-01 OR 1851-01-04

For a known date range:

1851-03-03 - 1855-04-17 OR 1851-1855

For an approximate date:

1801-1855 AND approximately 1851 (note that the latter form will not be indexed by the system in its current state)

For more examples see https://www.w3.org/TR/NOTE-datetime
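When auditing date values in the dataset, it can help to separate machine-readable entries from display-only ones. The Python sketch below classifies a single entry using regular expressions modeled on the example forms given in this section (YYYY, YYYY-MM, YYYY-MM-DD, and ranges between two such dates). It is an assumption that these patterns match what the InHOR system actually indexes; the names are illustrative, and the check does not verify calendar validity (for example, it accepts day 31 in any month).

```python
import re

# One date at year, year-month, or year-month-day precision.
_DATE = r"\d{4}(?:-(?:0[1-9]|1[0-2])(?:-(?:0[1-9]|[12]\d|3[01]))?)?"
SINGLE = re.compile(rf"^{_DATE}$")
# Two dates joined by a hyphen, with optional surrounding spaces.
RANGE = re.compile(rf"^({_DATE})\s*-\s*({_DATE})$")


def classify_date(value):
    """Return "single", "range", or "display-only" for one dc:date entry."""
    value = value.strip()
    if SINGLE.match(value):
        return "single"
    if RANGE.match(value):
        return "range"
    return "display-only"


for example in ["1851-01-04", "1851-1855", "1851-03-03 - 1855-04-17", "approximately 1851"]:
    print(example, "->", classify_date(example))
```

Checking the single-date pattern first resolves the ambiguity between a year-month such as 1851-01 and a year range such as 1851-1855: the second part is read as a month only when it falls between 01 and 12.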

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:date | Date-Created | <originInfo><dateCreated> |

dc:type

Definition: The nature or genre of the resource.

Repeatable: Yes

Required: Strongly recommended

Application: In order to map this field to InHOR, the value(s) for this element must be taken from the following list of high-level types:

  • text
  • image
  • physical object
  • sound
  • moving image

This is a requirement from DPLA, which we are following to keep our data DPLA-ready. You may choose to use other values in your system, but if so, you must un-map the DC fields in the OAI export. InHOR recommends putting more specific genres (e.g. “scrapbooks” or “journals”) in the subject field.
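A check against the list of high-level types might look like the following Python sketch. The function name is illustrative; treating the comparison as case-insensitive and assuming pipe-separated entries are choices made for this example, not documented InHOR behavior.

```python
ALLOWED_TYPES = {"text", "image", "physical object", "sound", "moving image"}


def invalid_types(dc_type):
    """Return the entries of a dc:type value that are not high-level types."""
    entries = [e.strip() for e in dc_type.split("|") if e.strip()]
    # Assumption: compare case-insensitively, so "Text" counts as "text".
    return [e for e in entries if e.lower() not in ALLOWED_TYPES]


print(invalid_types("text|image"))        # nothing to report
print(invalid_types("Text|scrapbooks"))   # reports the genre term
```

Entries reported by this check are candidates for moving to the subject field, as recommended above.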

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:type | Type | <typeOfResource> |

dc:language

Definition: The language of the resource.

Repeatable: Yes

Required: Yes, for textual resources

Application: InHOR strongly recommends that the value for this element be the 3-letter code from ISO 639-3. If using ISO 639-2, given the choice between a terminology (T) code, and a bibliographic code (B), InHOR strongly recommends using the terminology code.

Examples:

eng; spa; deu
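To illustrate the preference for terminology codes, the Python sketch below normalizes a language code and swaps a handful of ISO 639-2 bibliographic (B) codes for their terminology (T) equivalents. The mapping is deliberately partial (ISO 639-2 defines more B/T pairs than are listed here), the function name is illustrative, and the three-letter check validates only the shape of a code, not whether it is actually assigned.

```python
import re

# Partial mapping of ISO 639-2 bibliographic (B) codes to terminology (T) codes.
B_TO_T = {
    "ger": "deu",  # German
    "fre": "fra",  # French
    "dut": "nld",  # Dutch
    "chi": "zho",  # Chinese
    "gre": "ell",  # Modern Greek
    "cze": "ces",  # Czech
}


def normalize_language(code):
    """Lowercase a three-letter code and prefer the terminology form."""
    code = code.strip().lower()
    if not re.fullmatch(r"[a-z]{3}", code):
        raise ValueError(f"not a three-letter language code: {code!r}")
    return B_TO_T.get(code, code)


print(normalize_language("ger"))
print(normalize_language("eng"))
```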

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:language | Language | <language><languageTerm> |

dc:description

Definition: A free text account of the resource.

Repeatable: Yes

Required: Recommended

Application: Description may include but is not limited to: an abstract, a table of contents, or a free-text account of the resource.

Note that if you are contributing via OAI-PMH, transcriptions will appear on the aggregator site as a secondary “Description”. See the full text field below for further details.

Example:

Copy of a letter to Hannah Darlington from Ann Preston, 1851. Hannah Darlington was a fellow Quaker from Chester County, who, like Preston, was involved in the abolition movement. In her letter to Darlington, Preston discusses her health, her enthusiasm for her studies, lectures she has attended, and mutual friends.

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:description | Description | <abstract>, <note>, <table of contents> |

dcterms:spatial

Definition: Spatial characteristics of the resource.

Repeatable: Yes

Required: Recommended

Application: Geographic location relevant to the original item.

  • InHOR requires the use of an authority list when applying geographic terms, provided an applicable term can be found. Recommended, in order of preference:
    • LCNAF
    • TGN
  • In addition to an authoritative term, you can add more local or granular terms/locations not in the authority list, using a locally constructed term:
    • Describe from smallest to largest unit using commas between the levels, determining the level of granularity based on information available and the value of the detail - use any/all of the following hierarchy, retaining smallest to largest sequence
      • Street address or intersection
        • Intersection - Numbered street and cross st. e.g. 15th and Market
      • Named structure/landmark
      • Neighborhood
      • City or township
      • County
      • State/Province
      • Country, if not United States
  • Always use the current street name, rather than the contemporaneous name. Consult the PhillyHistory index (https://www.phillyhistory.org/historicstreets/) of historical, defunct street names for reference.
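The smallest-to-largest ordering for locally constructed terms can be sketched as a small Python helper. The function and parameter names are illustrative; the set of spellings recognized as "United States" is an assumption made for this example.

```python
def build_local_place(*, street=None, landmark=None, neighborhood=None,
                      city=None, county=None, state=None, country=None):
    """Join the supplied levels from smallest to largest, separated by commas."""
    # Assumption: these spellings all mean the United States, which is omitted.
    if country and country.strip().lower() in {"united states", "usa", "u.s.", "us"}:
        country = None
    levels = [street, landmark, neighborhood, city, county, state, country]
    return ", ".join(level.strip() for level in levels if level and level.strip())


print(build_local_place(landmark="Ben Franklin Bridge",
                        city="Philadelphia", state="Pennsylvania"))
# Ben Franklin Bridge, Philadelphia, Pennsylvania
```

Keyword-only arguments keep the hierarchy explicit at the call site, so levels cannot be passed in the wrong order by accident.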

Examples:

Using authority list

Philadelphia (Pa.)
Chester (Pa. : Township)

Locally constructed term with more granularity

Sharswood, North Philadelphia, Philadelphia, Pennsylvania
Ben Franklin Bridge, Philadelphia, Pennsylvania

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:coverage | Coverage-Spatial | <subject>, <geographic>, <hierarchicalGeographic>, <geographicCode>, <cartographics> |

dcterms:extent

Definition: The size or duration of the resource.

Repeatable: Yes

Required: Optional

Application: Always include the unit of measure (linear feet, pages, inches, etc.). Write out all units rather than abbreviating them, with one exception: write centimeters as cm (without a period).

Example:

4 pages; 28 x 22 cm

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:format | Format-Extent | <physicalDescription><extent> |

dc:contributor

Definition: An entity responsible for making contributions to the resource.

Repeatable: Yes

Required: No

Application: The guidelines for using names of persons or organizations as creators apply to contributors.

dc:publisher

Definition: An entity responsible for making the original resource available (not the institution publishing the digital resource).

Repeatable: Yes

Required: Optional

Application: Transcribe the publisher’s name from the resource exactly as written. If no publisher is given, leave this field blank; don’t write “no publisher” or “SN.”

Examples:

Philadelphia: Henry B. Ashmead Book and Job Printer
Woman's Medical College of Pennsylvania
Octavia Hill Association

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:publisher | Publisher | <originInfo><publisher> |

dcterms:isPartOf

Definition: Formal title of the full physical collection to which the resource belongs. If you wish to provide additional information (e.g. finding aid URL or series or box information), you may choose to use other values in your system, but if so, you must un-map the DC fields in the OAI export.

Repeatable: Yes

Required: Yes

Application: The database maps this field so that the user can click on a collection and see all the content from that collection that is included in the InHOR site.

Examples:

Ann Preston papers, Acc-029
Cope Evans papers

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:source or dc:relation | Source | <relatedItem> |

dc:rights

Definition: Information about copyright or other restrictions on the use of the resource.

Repeatable: Yes

Required: Yes

Application: State clearly and concisely what uses are allowed and not allowed for the items in your collection. InHOR recommends selecting one of the standardized statements at RightsStatements.org. Choose the most precise and most permissive statement allowable for the materials in question. (Creative Commons licenses are another good choice, although they are technically intended for licensing by creators, not by cultural heritage institutions.)

If your institution doesn’t have a specific practice and RightsStatements.org doesn’t suit, consider adopting the following blanket statement, using language from CLIR’s Intellectual Property (IP) Agreement.

    Rights assessment is your responsibility. This material is made available for noncommercial educational, scholarly, and/or charitable purposes. For other uses or for more information, please contact [the repository, + contact info].

Examples: InHOR recommends using the following statements, listed in order of preference, as applicable.

    This work is believed to be in the Public Domain under the laws of the United States. For more information, see http://rightsstatements.org/page/NoC-US/1.0/
    
    This work is not in copyright, but commercial uses of this digital representation are limited. For more information, contact [the repository + contact info] and see http://rightsstatements.org/page/NoC-NC/1.0/
    
    Rights assessment is your responsibility. This material is made available for noncommercial educational, scholarly, and/or charitable purposes. For other uses or for more information, please contact [the repository, + contact info].

Mappings:

| Simple Dublin Core | CONTENTDM | MODS |
| --- | --- | --- |
| dc:rights | Rights-License | <accessCondition> |

header - datestamp

Not Dublin Core-compliant

This datestamp indicates when the record was first ingested into the In Her Own Right database. Note that subsequent re-harvests and overwrites are not reflected in the datestamp, so it does not provide an accurate snapshot of when the record was last altered or updated.

full text

Not Dublin Core-compliant

Definition: Transcription or OCR text.

Repeatable: No

Required: Optional

Application: OAI contributors: The InHOR system harvests OAI-PMH records at the object-level only. Therefore, for more complete ingest of your records, InHOR recommends placing the full document transcript at the object level -- either instead of or in addition to entering page-level transcripts. Alternatively, if you do not wish to place full document transcripts at the object level, you may submit the transcripts to InHOR via a supplemental CSV spreadsheet. If you would like to do this, contact inhor@pacscl.org for complete instructions.

CSV contributors: The full document transcript must be entered as plain text into the single cell of the Transcription field for each object-level resource.

All contributors: Object-level transcripts will be labeled on the aggregator site as a secondary "Description".

All transcripts will be full text searchable. For transcription, InHOR recommends following the guidelines in Appendix B.

Acronyms and abbreviations used in these guidelines

AAT - Art & Architecture Thesaurus, http://www.getty.edu/research/tools/vocabularies/aat/

CLIR - Council on Library and Information Resources, https://www.clir.org/

DACS - Describing Archives: A Content Standard, https://www2.archivists.org/standards/DACS

DC - Dublin Core, http://www.dublincore.org/documents/dces/

DPLA - Digital Public Library of America, https://dp.la/

FAST - Faceted Application of Subject Terminology, https://fast.oclc.org/searchfast/

InHOR - In Her Own Right, http://herownright.pacscl.org/

ISO - International Organization for Standardization, https://www.iso.org/

LCNAF - Library of Congress Name Authority File, https://authorities.loc.gov/

LCSH - Library of Congress Subject Headings, https://authorities.loc.gov/

MODS - Metadata Object Descriptive Schema, http://www.loc.gov/standards/mods/

OAI/OAI-PMH - Open Archives Initiative-Protocol for Metadata Harvesting, https://www.openarchives.org/pmh/

PACSCL - Philadelphia Area Consortium of Special Collections Libraries, http://pacscl.org/

RDA - Resource Description and Access, https://www.oclc.org/en/rda/about.html

TGN - Getty Thesaurus of Geographic Names, http://www.getty.edu/research/tools/vocabularies/tgn

VIAF - Virtual International Authority File, https://viaf.org/

W3CDTF - World Wide Web Consortium Date and Time Format, https://www.w3.org/TR/NOTE-datetime
