Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Date accessors #26

Merged
merged 5 commits into from
Dec 5, 2017
Merged

Date accessors #26

merged 5 commits into from
Dec 5, 2017

Conversation

cbeer
Copy link
Member

@cbeer cbeer commented Dec 1, 2017

No description provided.

@coveralls
Copy link

coveralls commented Dec 1, 2017

Coverage Status

Coverage increased (+0.003%) to 99.545% when pulling f9a9bb5 on date-accessors into 48b6a8c on master.

@coveralls
Copy link

coveralls commented Dec 1, 2017

Coverage Status

Coverage increased (+0.006%) to 99.548% when pulling 61f2d06 on date-accessors into 48b6a8c on master.

@coveralls
Copy link

coveralls commented Dec 1, 2017

Coverage Status

Coverage increased (+0.007%) to 99.549% when pulling 3c90bb6 on date-accessors into 48b6a8c on master.

@coveralls
Copy link

coveralls commented Dec 4, 2017

Coverage Status

Coverage decreased (-0.04%) to 99.499% when pulling 69944a5 on date-accessors into 48b6a8c on master.

lib/mods/date.rb Outdated

text.gsub(/^[\[]+/, '').gsub(/[\.\]]+$/, '')
Date.cleanup(text)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you call super here?

lib/mods/date.rb Outdated
@@ -95,11 +102,11 @@ def self.supports?(text)

# Full text extractor for MM/DD/YYYY-formatted dates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this comment need to change? Are we accepting 3 digit years now?

lib/mods/date.rb Outdated

def self.cleanup(text)
matches = text.match(REGEX)
roman_to_int(matches[:year].upcase).to_s
end

def self.roman_to_int(value)
value = value.dup
value = value.dup.tr('.', '')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to dup if we're running a tr won't that create a copy for you?

@@ -145,9 +152,10 @@ def self.roman_to_int(value)
end
end

# Full-text extractor for centuries encoded as Roman numerals
# Full-text extractor for centuries encoded as Roman numerals; sometimes centuries
# are given as e.g. xvith, hence the funny negative look-ahead assertion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

lib/mods/date.rb Outdated
@@ -181,16 +189,77 @@ def self.cleanup(text)
end
end

# Full-text extractor for data formatted as YYYY-YYYY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or YYY-YYY?

lib/mods/date.rb Outdated
class YearRangeFormat < ExtractorDateFormat
REGEX = /(?<start>\d{3,4})-(?<end>\d{3,4})/

def self.cleanup(text)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment that this normalizes to YYYY/YYYY

lib/mods/date.rb Outdated

def self.cleanup(text)
matches = text.match(REGEX)
"#{matches[:year]}X"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean? Can you leave a comment?

lib/mods/date.rb Outdated
class EmbeddedBCYearFormat < ExtractorDateFormat
REGEX = /(?<year>\d{3,4})\s?B\.?C\.?/i

def self.cleanup(text)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comment that this normalizes to -YYYY

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these methods normalize to EDTF. I'm not sure a comment on each of them is particularly compelling?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rename cleanup to normalize_to_edtf for clarity.

# Full-text extractor that tries hard to pick any year present in the data
class EmbeddedYearFormat < ExtractorDateFormat
REGEX = /(?<prefix>-)?(?<year>\d{3,4})/
REGEX = /(?<prefix>-)?(?<!\d)(?<year>\d{4})(?!\d)/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of what a prefix might be?

Copy link
Member Author

@cbeer cbeer Dec 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex is matching just a literal -.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


# Full-text extractor that tries hard to pick any year present in the data
class EmbeddedYearWithBracketsFormat < ExtractorDateFormat
# [YYY]Y Y[YYY] [YY]YY Y[YY]Y YY[YY] YYY[Y] YY[Y]Y Y[Y]YY [Y]YYY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woah. 😱 Can you add a comment about what this stuff means? Is there a part of a spec I can look at?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I have no idea where it comes from, I just know we have lots of it.

@coveralls
Copy link

coveralls commented Dec 4, 2017

Coverage Status

Coverage decreased (-0.04%) to 99.499% when pulling 7885fc5 on date-accessors into 48b6a8c on master.

@cbeer cbeer force-pushed the date-accessors branch 3 times, most recently from 7d6c472 to fcb7620 Compare December 4, 2017 19:11
@coveralls
Copy link

coveralls commented Dec 4, 2017

Coverage Status

Coverage decreased (-0.04%) to 99.499% when pulling fcb7620 on date-accessors into 48b6a8c on master.

@coveralls
Copy link

coveralls commented Dec 4, 2017

Coverage Status

Coverage decreased (-0.04%) to 99.499% when pulling 73be57e on date-accessors into 48b6a8c on master.

@drh-stanford drh-stanford merged commit 4be5b88 into master Dec 5, 2017
@drh-stanford drh-stanford deleted the date-accessors branch December 5, 2017 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants