Why does normalized_date exist? #1336

jcoyne · 2022-11-18T18:37:41Z

I see we copy normalized_date_ssm to the date_sort field, but aside from that it doesn't appear to be used. Do we need it?

marlo-longley · 2022-12-01T22:51:37Z

@jcoyne Recently we've seen that normalized_date_ssm is appended to the normalized_title and guessed that this was desired by archivists/people working with historical content. We allow it to be customized now via #1344. Not sure if that answers your question!

jcoyne · 2022-12-02T02:08:42Z

I see that, but does it need to be stored in its own solr field?

seanaery · 2022-12-02T14:22:25Z

I think the normalized date is useful to store in its own Solr field for downstream applications to use if desired even if it is smushed into the normalized title by default in core. A couple examples I know of...

UAlbany has a nice collection list on a "repository" show page that splits the date out from the title: e.g., https://archives.albany.edu/description/repositories/apap and https://github.com/UAlbanyArchives/arclight-UAlbany/blob/main/app/views/arclight/repositories/show.html.erb#L45

Duke has a CSV exporter from bookmarks that we use to make digitization work orders and isolating the date helps us map date metadata from archival components to digitized object metadata for our digital repository.

jcoyne · 2022-12-02T15:41:20Z

@seanaery I think it would be better if we could either put those full features in Arclight or completely remove them from Arclight. Currently these features are sort of hanging between two worlds.

seanaery · 2022-12-02T18:38:43Z

@jcoyne That's a fair assessment and I appreciate that you're thinking about only storing in Solr the data that the core application expects to use. We are flexible at Duke and would just extend the traject rules locally to capture the normalized date atomically/additionally in its own Solr field if that disappears from core.

Caveat: I'm not an archivist. But the way I understand it is, this was developed as it is because proper titles of archival components (collections, too, but especially components) are often generic and repeated within a collection ("Letters," "Correspondence," "Newspaper Clippings"). Appending the dates in the places where the title typically appears gives the described entity some valuable distinction and context, e.g., in the html page <title>, in a list of sibling components, etc.

gwiedeman · 2023-07-20T13:58:25Z

There is duplication here, and I think normalized_title_ssm is actually the field that is unnecessary. Yeah, there'll be lots of components like "Minutes, 1990" and "Minutes, 1991" that are distinguished by date, and Arclight currently uses normalized_title_ssm to display this, but I feel like the title and date could just be joined in the template, no? That way, both the title and date are still stored in a structured way in Solr. This would more easily facilitate #292, which is the dream.

I believe the reason for normalized_title_ssm is if you're searching "minutes 1991" in this case, but I'm not sure that including it in the index as a distinct field actually aids relevancy. If it does, then it probably shouldn't have to be stored.

The downside to this is that more logic would have to be in the template to handle date types like inclusive and bulk dates. But to me it makes sense to have well-structured data in Solr and have that logic be in the template rather than the data harvesting pipeline.
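
As a rough sketch of what that template-side logic might look like: the helper name `display_title` and the `{ type:, value: }` hash shape are hypothetical, made up for illustration, and are not Arclight's API or its Solr schema. It only assumes the title and the dates are available as separate structured values.

```ruby
# Hypothetical helper, not part of Arclight: builds a display title from a
# separately stored title and a list of structured dates.
# Each date is assumed to be a hash like { type: 'inclusive', value: '1990-1995' }.
def display_title(title, dates)
  inclusive = dates.select { |d| d[:type] == 'inclusive' }.map { |d| d[:value] }
  bulk      = dates.select { |d| d[:type] == 'bulk' }.map { |d| "bulk #{d[:value]}" }
  other     = dates.reject { |d| %w[inclusive bulk].include?(d[:type]) }.map { |d| d[:value] }

  # Inclusive and untyped dates come first, bulk dates last.
  date_text = (inclusive + other + bulk).join(', ')

  # Drop whichever part is missing so there is no dangling comma.
  [title, date_text].reject { |part| part.nil? || part.empty? }.join(', ')
end

puts display_title('Minutes', [{ type: 'inclusive', value: '1990-1995' },
                               { type: 'bulk', value: '1991-1992' }])
```

A view or presenter could call something like this wherever the combined title is shown today, while Solr keeps the title and the dates as separate fields.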

I'm also not sure that "normalized" is the best descriptor here for what Arclight is doing. Typically archivists use two date fields, a required well-structured date and what ASpace calls a date "expression" that is optional. That way you can have a publication that has a displayed date expression of "Fall 2002" that also has a date like "2002-09" for sorting. Prior to Arclight 1.0 at least, Arclight used unitdate_ssm as a list of well-structured dates and normalized_date_ssm for the date expression.
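
To illustrate the two-field idea, here is a minimal sketch. The `DateEntry` struct and its field names are hypothetical and do not correspond to ArchivesSpace or Arclight field names. It sorts on the structured value and displays the expression, and it assumes the structured values are zero-padded, ISO 8601 style strings so that plain string comparison matches chronological order.

```ruby
# Hypothetical record shape: `expression` is what readers see,
# `structured` is a machine-sortable, ISO 8601 style string.
DateEntry = Struct.new(:expression, :structured)

ISSUES = [
  DateEntry.new('Fall 2002', '2002-09'),
  DateEntry.new('Spring 2002', '2002-03'),
  DateEntry.new('Winter 2001', '2001-12')
].freeze

# Sort on the structured value, then return the expressions for display.
def sorted_expressions(entries)
  entries.sort_by(&:structured).map(&:expression)
end

puts sorted_expressions(ISSUES)
```

Sorting the expressions themselves would put "Fall 2002" before "Spring 2002", which is why the structured date is the one to sort on.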

jcoyne changed the title ~~Why does normalied_date exist?~~ Why does normalized_date exist? Nov 18, 2022

cbeer mentioned this issue Nov 18, 2022

Add date field to the displayed metadata sul-dlss/vt-arclight#248

Merged

gwiedeman added the data model label Jul 20, 2023

gwiedeman mentioned this issue Jul 20, 2023

Single dates exported from ArchivesSpace do not display #1028

Open
