Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added handling for simple-article #93

Merged
merged 7 commits into from
Mar 6, 2024
Merged

Conversation

mugdhapolimera
Copy link
Contributor

P.S. : Also added some clean up statements for titles.

@codecov-commenter
Copy link

codecov-commenter commented Feb 23, 2024

Codecov Report

Attention: Patch coverage is 86.66667% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 90.29%. Comparing base (6473c3f) to head (61d5a70).

Files Patch % Lines
adsingestp/parsers/elsevier.py 86.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #93      +/-   ##
==========================================
+ Coverage   89.99%   90.29%   +0.30%     
==========================================
  Files          25       25              
  Lines        2618     2627       +9     
==========================================
+ Hits         2356     2372      +16     
+ Misses        262      255       -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@seasidesparrow seasidesparrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but I have one suggestion: in def parse(), would you please raise an exception immediately after the check for d.find("ja:simple-article") (L370) if the result is still self.record_meta=None ? If self.record_meta is nonetype, the remainder of the code is going to generate an exception at some point, so I think it makes sense to stop execution there before it starts executing the self._parse_* statements.

Copy link
Member

@seasidesparrow seasidesparrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the "elif" in L370 to if -- this will guarantee the line is executed regardless of what happens in L368.

@seasidesparrow
Copy link
Member

I am considering revising slightly how we do this in light of the number of document types that Elsevier may provide in addition to ja:article and 'ja:simple-article`. Their schema document (https://supportcontent.elsevier.com/Support%20Hub/DaaS/36178_ConSyn_Schemas_Document.pdf) lists the following potential article types:

cja:converted-article
ja:article
ja:simple-article
ja:book-review
ja:exam
bk:book
bk:chapter
bk:simple-chapter
bk:examination
bk:fb-non-chapter
bk:glossary
bk:index
bk:introduction
bk:bibliography

We don't necessarily want all of these, but certainly a subset of them.

@mugdhapolimera mugdhapolimera merged commit 1188a5e into adsabs:main Mar 6, 2024
4 checks passed
@mugdhapolimera mugdhapolimera deleted the els branch March 15, 2024 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants