Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/3367 missing oaipmh records #2191

Merged
merged 12 commits into from Mar 17, 2023

Conversation

amdomanska
Copy link
Contributor

@amdomanska amdomanska commented Jan 16, 2023

"Missing OAIPMH records

Please don't delete any sections when completing this PR template; instead enter N/A for checkboxes or sections which are not applicable, unless otherwise stated below

See 3367

Describe the scope/purpose of the PR here in as much detail as you like

Categorisation

This PR...

  • has scripts to run
  • has migrations to run
  • adds new infrastructure
  • changes the CI pipeline
  • affects the public site (oaipmh)
  • affects the editorial area
  • affects the publisher area
  • affects the monitoring

Basic PR Checklist

Testing

Worth re-running the user's original request to check that they get the results they expect

Harvest the Architecture set: https://doaj.org/oai?verb=ListRecords&set=TENDOkFyY2hpdGVjdHVyZQ~~&metadataPrefix=oai_dc

Harvest the Fine Art set: https://doaj.org/oai?verb=ListRecords&set=TENDOkZpbmUgQXJ0cw~~&metadataPrefix=oai_dc

Look for "Urbanism. Arhitectura. Constructii".

@amdomanska amdomanska marked this pull request as ready for review January 17, 2023 09:48
s["term"]["index.schema_subject.exact"] = oai_set
q["query"]["bool"]["must"].append(s)
a = oai_set.replace(constants.SUBJECTS_SCHEMA,"")
q["query"]["bool"]["should"] = {"match":{"index.classification": a}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be on index.classification.exact ? Or does changing it from a term query to a match fix that?

Also, are we sure it should be a should query? Seems like is should be a must query like the one it is replacing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm confident that this should be a term query and a must.

Can we please lay it out the way way as it was before, so use self.set_limit and just modify it to use index.classification.exact instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. The should and match is the only way to make it work without the exact filter. When exact is applied, then it is must and term that works.

s["term"]["index.schema_subject.exact"] = oai_set
q["query"]["bool"]["must"].append(s)
a = oai_set.replace(constants.SUBJECTS_SCHEMA,"")
q["query"]["bool"]["should"] = {"match":{"index.classification": a}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm confident that this should be a term query and a must.

Can we please lay it out the way way as it was before, so use self.set_limit and just modify it to use index.classification.exact instead.

@@ -71,8 +72,9 @@ def list_records(self, from_date=None, until_date=None, oai_set=None, list_size=
if start_after is not None or from_date is not None or until_date is not None or oai_set is not None:

if oai_set is not None:
a = oai_set.replace(constants.SUBJECTS_SCHEMA,"")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

variable a never used. I think is forgot to remove this line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amdomanska can you confirm if this line should be used somewhere?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bug, that line is required, but the result of it was not used. I have fixed and pushed.

Copy link
Contributor

@richard-jones richard-jones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amdomanska there was one bug which I have fixed for the sake of getting this live sooner

@@ -71,8 +72,9 @@ def list_records(self, from_date=None, until_date=None, oai_set=None, list_size=
if start_after is not None or from_date is not None or until_date is not None or oai_set is not None:

if oai_set is not None:
a = oai_set.replace(constants.SUBJECTS_SCHEMA,"")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amdomanska can you confirm if this line should be used somewhere?

@@ -71,8 +72,9 @@ def list_records(self, from_date=None, until_date=None, oai_set=None, list_size=
if start_after is not None or from_date is not None or until_date is not None or oai_set is not None:

if oai_set is not None:
a = oai_set.replace(constants.SUBJECTS_SCHEMA,"")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bug, that line is required, but the result of it was not used. I have fixed and pushed.

@Steven-Eardley Steven-Eardley merged commit 588d0af into develop Mar 17, 2023
@RK206 RK206 deleted the feature/3367_missing-oaipmh-records branch March 22, 2023 05:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants