Skip to content

Latest commit

 

History

History
279 lines (174 loc) · 11.9 KB

options.adoc

File metadata and controls

279 lines (174 loc) · 11.9 KB

Emendate options

ambiguous_month_day

What is this?

2/3 could mean February 3 or March 2

Default value

:as_month_day, resulting in February 3

Alternate value

:as_day_month, resulting in March 2

Does not apply when there is no ambiguity. For example, 29/2 will always be interpreted as February 29, as there is no month 29.

ambiguous_month_day_year

What is this?

10-02-06 could reasonably be interpreted in four different ways

Default value

:month_day_year, resulting in 2006-10-02

Alternate values

  • :day_month_year, resulting in 2006-02-10

  • :year_month_day, resulting in 2010-02-06

  • :year_day_month, resulting in 2010-06-02

Does not apply when there is no ambiguity. For example, 87-04-13 will always be interpreted as 1987-04-13, because:

  • There is no month or day 87 (so it must be year)

  • There is no month 13 (so it must be day)

  • That leaves 04 as month

ambiguous_month_year

What is this?

Does 2010-12 mean 2010-2012, or December 2010?

Default value

:as_year, resulting in 2010-2012

Alternate value

as_month, resulting in December 2010

ℹ️

This also applies to ambiguity between year and numerically expressed season, such as 2020-21, which could mean 2020-2021 or Spring 2020.

Does not apply when there is no ambiguity. For example, 2020-10 will always be interpreted as October 2020, since a range of 2020-2010 would be nonsense.

ambiguous_year_rollback_threshold

What is this?

If a two-digit year is to be coerced to four digits, what should the first two digits be? Is '20' 1920 or 2020?

Default value

the current year’s last two digits. So, in 2021, it is 21

Alternate value

any 1 or 2 digit Integer

Numbers less than this value are treated as current century.

Numbers greater than or equal to this are treated as the previous century.

If we are using the default and the current year is 2021…​

  • a value of 21 will become 1921

  • a value of 20 will become 2020

💡

This option is irrelevant if you are using two_digit_year_handling: :literal

and_or_date_handling

What is this?

Controls how date strings like "1970, 1972 - 1999, and 2002" and "1923 or 1925" are parsed

Default value

:multi ("1970, 1972 - 1999, and 2002" will be parsed into 3 dates (a year, a range, and a year), and "1923 or 1925" will be parsed into 2 dates (both years))

Alternate values

  • :single_range ("1970, 1972 - 1999, and 2002" will be parsed into 1 date (a range with earliest value 1970-01-01 and latest value 2002-12-31), and "1923 or 1925" will be parsed into 1 date (a range with earliest value 1923-01-01 and latest value 1925-12-31))

When dialect: :collectionspace, this is automatically set to :single_range, because CollectionSpace’s structured date fields do not provide a graceful way to express multiple dates.

bce_handling

What is this?

Controls how Emendate::DateType::Year sets its attributes when :era == :bce.

Default value

:precise ("1223 BCE" will be parsed to -1222)

Alternate values

  • :naive ("1223 BCE" will parsed to "1223")

When dialect: :collectionspace, this is automatically set to :naive, so that translated values match what CollectionSpace internal date parser produces.

before_date_treatment

What is this?

Whether to treat a date like "before 1950" or "pre-1950" as a range

Default value

:point ("before 1950" will be treated as a single date point, with earliest and latest date 1949-12-31)

Alternate values

  • :range ("before 1950" will be treated as a range, with the value of :open_unknown_start_date as the earliest date, and 1949-12-31 as the latest date)

beginning_hyphen

What is this?

How to interpret a hyphen at the beginning of a date string (e.g. -2002)

Default value

:unknown ("beginning of range is unknown but was some point before 2002")

Alternate values

  • :edtf (2003 BCE)

  • :open (known to have occurred from the beginning of time until 2002)

Default is set to :unknown because I cannot actually imagine a case where the literal meaning of :open would be needed in a GLAM context, and needing to record BCE dates is comparatively rare. It is much more common to use this to mean "unknown beginning of range"

dialect

What is this?

date expression to return when you translate a date string

Default value

:none

Alternate value

:lyrasis_pseudo_edtf, :edtf, collectionspace

Not fully implemented at all!

By default parse will return an Emendate::Result that another script can use to do whatever is needed.

By calling translate, you can get a simpler, pre-processed Emendate::Translation of your original string into another date format. See output documentation for details.

edtf

What is this?

A shorthand option to indicate incoming date values should be interpreted using options for EDTF format

Default value

false

Alternative value

true

If set to true, the following will be set:

  • beginning_hyphen: :edtf

  • ending_slash: :unknown

  • square_bracket_interpretation: :edtf_set

  • max_month_number_handling: :edtf_level_2

These options support the full Level 2 EDTF specification. Set the relevant options manually if incoming date values conform to EDTF Level 0 or 1.

ending_hyphen

What is this?

How to interpret a hyphen at the end of a date string (e.g. 2002-)

Default value

:open ("known to have occurred from 2002 until now, and occurrence is ongoing")

Alternate values

  • :unknown ("occurrence ended some time after 2002 and now, but exact end date is unknown")

Default value is :open because this form is frequently used to record the ongoing (still currenting happening, and expected to continue happening) publication of continuing resources.

ending_slash

What is this?

How to interpret a slash at the end of a date string (e.g. 2002/)

Default value

:open ("known to have occurred from 2002 until now, and occurrence is ongoing"); set to :unknown if edtf: true

Alternate values * :unknown ("occurrence ended some time after 2002 and now, but exact end date is unknown")

Default value is :open because this form is frequently used to record the ongoing (still currenting happening, and expected to continue happening) publication of continuing resources.

hemisphere

What is this?

Used to map location-independent EDTF season values (21-24) to hemisphere-appropriate date ranges

Default value

:northern

Alternate values * :southern

Default is northern hemisphere because that is where the original code author and most clients they work for are located.

max_output_dates

What is this?

Some strings will get parsed into multiple dates (2002, 2004). By default each individual date found will be returned. Some applications can only handle a single date, so you may want to limit the number of dates included in the output.

Default value

:all ("known to have occurred from 2002 until now, and occurrence is ongoing")

Alternate value

any Integer

max_month_number_handling

What is this?

Tells the application what to consider the largest number that might be treated as a month (or season or EDTF Level 2 sub-year grouping, both of which get treated as month internally)

Default value

:months - largest number that can be a month is 12

Alternate values

  • :edtf_level_1 - use if input is known to include values like 2021-22 (Summer 2021, independent of location) - largest number that may be a month is 24

  • :edtf_level_2 - use if input is known to include values like 2021-30 (Summer 2021, Southern Hemisphere) - largest number that may be a month is 41

ℹ️
Numbers 13-20 are never treated as months

mismatched_bracket_handling

What is this?

How the application should handle missing open or close brackets

Default value

:absorb - merge the mismatched bracket segment into the nearest meaningful segment. This keeps the bracket in generated string date values derived from orig_string, but otherwise ignores the bracket for processing.

Alternate values

  • :failure - Date processing will fail with a mismatched bracket error

open_unknown_end_date

Date ranges may have open or unknown end dates.

To display such dates, we don’t need to make up a end date.

However, depending on your application, meaningfully indexing or faceting on this value may require some made-up end date.

This setting controls what will be output as the date_end_full value of your Emendate.result in the case of an open or unknown end date. The date_end value will be derived from this value, but possibly truncated to match the level of granularity of the known/closed end date.

Default value

2999-12-31

Alternate value

Any year/month/day expressed as YYYY-MM-DD

open_unknown_start_date

Date ranges may have open or unknown start dates.

To display such dates, we don’t need to make up a start date.

However, depending on your application, meaningfully indexing or faceting on this value may require some made-up start date.

This setting controls what will be output as the date_start_full value of your Emendate.result in the case of an open or unknown start date. The date_start value will be derived from this value, but possibly truncated to match the level of granularity of the known/closed end date.

Default value

1583-01-01

Alternate value

Any year/month/day expressed as YYYY-MM-DD

For dates like "before 1950" or "pre-1950", this option will have no effect if :before_date_treatment is :point. You must set that option to :range for this value to be used as the earliest date.
ℹ️
See note on ISO8601 and BCE for rationale for default value.

pluralized_date_interpretation

What is this?

Should 1900s be interpreted as 1900-1909, or 1900-1999? Should 2000s be interpreted as 2000-2009, or 2000-2999?

Default value

:decade, resulting in 1900-1909 and 2000-2009, respectively

Alternate value

:broad, resulting in 1900-1999 and 2000-2999, respectively

1990s will always be interpreted as 1990-1999.

square_bracket_interpretation

What is this?

Should square brackets around a date string be interpreted as an inferred date, or as an EDTF "one of" set?

Default value

:inferred_date

Alternate value

:edtf_set

two_digit_year_handling

What is this?

Should 80 be treated as 1980 or literally as the year 80?

Default value

:coerce, resulting in 1980

Alternate value

:literal, resulting in 80

💡

If you are using the default option (:coerce), also pay attention to the ambiguous_year_rollback_threshold option to ensure desired results.

unknown_date_output

What is this?

When a parsed string is determined to represent a known-to-be unknown date, what string should be output?

Default value

:orig, the original date string will be returned

Alternate value

:custom, indicates that you are providing a string to be used in outputting all KnownUnknownDateTypes

If you set this to :custom, and do not provide a custom value for unknown_date_output_string, a blank string will be output.

unknown_date_output_string

What is this?

The string used for outputting KnownUnknownDateTypes if you have set unknown_date_output: :custom

Default value

'', blank String

Alternate value

any String value, which may be useful if you are trying to standardize "n.d.", "undated", and "no date" all to be output as "not dated"

💡

This setting is not used if unknown_date_output: :orig