Unable to ingest funders #222

Closed
4 tasks
jeremyf opened this issue Dec 15, 2022 · 7 comments
Assignees
Labels
bug Something isn't working In Progress SL-CSI Service Label: Current service incident
Milestone

Comments

@jeremyf
Contributor

jeremyf commented Dec 15, 2022

British Library uses the Crossref Funder Registry. Each variation of the funder we've tried results in a failed import with a NoMethodError in Bulkrax (Bx).

The following method raises the exception because each yielded element is a String, but the logic expects a Hash.

def clean_incomplete_data_for_funder(data_hash)
  return if data_hash.blank?
  data_hash.each do |hash|
    hash.transform_values! { |_v| nil } if hash["funder_name"].blank?
  end
end
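
To illustrate the failure mode, here is a minimal sketch of a more defensive version of the method. It is not the project's actual fix. The names `blank_value?` and `clean_incomplete_data_for_funder_safely` are hypothetical, and `blank_value?` is a plain-Ruby stand-in for ActiveSupport's `blank?` so that the example runs outside Rails. When an element is a String, `element["funder_name"]` is a substring lookup that returns nil, so the original code goes on to call `transform_values!` on the String, which does not define it.

```ruby
# Hypothetical stand-in for ActiveSupport's `blank?` (assumption: nil, empty
# collections, and whitespace-only strings count as blank).
def blank_value?(value)
  return true if value.nil?
  return value.strip.empty? if value.is_a?(String)

  value.respond_to?(:empty?) && value.empty?
end

# Hypothetical defensive variant of clean_incomplete_data_for_funder.
# Non-Hash elements (for example Strings) are skipped instead of raising.
def clean_incomplete_data_for_funder_safely(data)
  return data if blank_value?(data)

  data.each do |entry|
    next unless entry.is_a?(Hash)

    # Same rule as the original: a funder without a name is nulled out.
    entry.transform_values! { |_v| nil } if blank_value?(entry["funder_name"])
  end
end
```

Skipping non-Hash elements only avoids the exception. It does not recover the funder data, so the input would still need to arrive in the shape the importer expects.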

The raw metadata regarding Funder is as follows:

funder_1: Arts and Humanities Research Council (AHRC)
funder_project_reference_1: AH/S01179X/1

The parsed metadata for Funder is as follows:

funder: ["[\"Arts and Humanities Research Council (AHRC)\"]"]

In the parsed metadata there is no instance of project reference.
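
The parsed value above reads as an array whose only element is itself a JSON-encoded array. The sketch below, which assumes that reading is correct, shows how the inner string could be decoded with Ruby's standard `json` library. The variable names are illustrative, and this is not a description of how Bulkrax handles the field.

```ruby
require "json"

# The parsed funder value as reported in this issue.
parsed_funder = ["[\"Arts and Humanities Research Council (AHRC)\"]"]

# Each element is a String holding a JSON array, so decode and flatten.
funder_names = parsed_funder.flat_map { |element| JSON.parse(element) }
```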

Testing Instructions

altered sample file (Google Drive link)

  • download the altered sample and create a Bulkrax csv importer with it
  • check the funder of the resulting import
  • observe that the funder appears on the show page
  • observe that all the funder information is on the edit page
@jeremyf jeremyf added this to the AHRC 2 milestone Dec 15, 2022
@jeremyf jeremyf added the bug Something isn't working label Dec 15, 2022
@j-basford
Collaborator

See the examples:
https://bl-demo.iro.bl.uk/importers/97?locale=en
https://bl-demo.iro.bl.uk/importers/98?locale=en
Tried two variants of AHRC. The latter was "Arts and Humanities Research Council", which is what we see on the public view. Perhaps we should have tried "Arts and Humanities Research Council, United Kingdom", which is what you see in the lookup when selecting AHRC.

jeremyf added a commit that referenced this issue Dec 15, 2022
Reviewing the logs I found:

```shell
ERROR -- : [c94e8538d4e27e31ce8995e3cf6af992] RSolr::Error::Http - 400 Bad Request
Error: {
  "responseHeader":{
    "zkConnected":true,
    "status":400,
    "QTime":0,
    "params":{
      "facet.field":["human_readable_type_sim",
        "resource_type_label_ssim",
        "creator_search_sim",
        "keyword_sim",
        "subject_sim",
```

The following line showed a string that was 2284 characters long and
ended in `f.member_of_collection_ids_ssim.facet.matches=%5E%24`; those
trailing characters decode to `^$`, which could be the complete end of a
query (e.g. a regexp that matches only an empty string).

Related to:

- samvera/hyrax#4728
- #222
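
As a quick check of the decoding described in the commit message above, the following sketch uses Ruby's standard `uri` library to decode the `%5E%24` suffix. Solr evaluates `facet.matches` with its own regular expression engine, so the Ruby `Regexp` here only illustrates what the pattern means.

```ruby
require "uri"

encoded_tail = "%5E%24"

# Percent-decoding yields the two characters "^" and "$".
decoded_tail = URI.decode_www_form_component(encoded_tail)

# As a pattern, "^$" matches an empty string but not ordinary text.
empty_only = Regexp.new(decoded_tail)
matches_empty = empty_only.match?("")
matches_text = empty_only.match?("abc")
```
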
jeremyf added a commit that referenced this issue Dec 21, 2022
Reviewing the logs I found:

```shell
ERROR -- : [c94e8538d4e27e31ce8995e3cf6af992] RSolr::Error::Http - 400 Bad Request
Error: {
  "responseHeader":{
    "zkConnected":true,
    "status":400,
    "QTime":0,
    "params":{
      "facet.field":["human_readable_type_sim",
      "resource_type_label_ssim",
      "creator_search_sim",
      "keyword_sim",
      "subject_sim",
```

The following line showed a string that was 2284 characters long and
ended in `f.member_of_collection_ids_ssim.facet.matches=%5E%24`; those
trailing characters decode to `^$`, which could be the complete end of a
query (e.g. a regexp that matches only an empty string).

Related to:

- samvera/hyrax#4728
- #222
- #223
- @9253f1a0c3cee81941c779765eb6a360309ea77b
@cziaarm cziaarm added the SL-CSI Service Label: Current service incident label Jan 10, 2023
jeremyf added a commit that referenced this issue Jan 11, 2023
@ShanaLMoore ShanaLMoore modified the milestone: AHRC 2 Mar 27, 2023
@ShanaLMoore
Contributor

related to: #355

@ShanaLMoore
Contributor

ShanaLMoore commented Mar 27, 2023

@j-basford Would you happen to have the two files used to test this? The weird data issue on staging is making this hard to look into. Is this still a problem?

@ShanaLMoore
Contributor

@jeremyf Could you update this ticket with testing instructions or steps to reproduce?

@kirkkwang
Contributor

kirkkwang commented Mar 29, 2023

Looks like funder_project_reference_1 should really be fndr_project_ref_1 which is its own property and not a part of the funder hash object.

image

Here's what it looks like when I use fndr_project_ref_1 in the CSV:

image

The Funder Project Reference field does not show anywhere in the UI, but the value is stored and shows in the JSON feed.

Also, since Bulkrax does not have an API lookup for funder, the fields all need to be explicitly included in the CSV. The CSV would look like:

| funder_name_1 | funder_doi_1 | funder_isni_1 | funder_ror_1 | funder_award_1 |
| --- | --- | --- | --- | --- |
| Arts and Humanities Research Council (AHRC) | http://dx.doi.org/10.13039/501100000267 | 0000 0004 3497 6001 | https://ror.org/0505m1554 | |
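
For anyone assembling such a file by script, here is a minimal sketch using Ruby's standard `csv` library. It covers only the funder columns named in this comment plus `fndr_project_ref_1`, with the project reference value taken from the issue description. A real import file would also need the other columns the importer requires, which are not shown here.

```ruby
require "csv"

headers = %w[
  funder_name_1 funder_doi_1 funder_isni_1 funder_ror_1 funder_award_1
  fndr_project_ref_1
]

row = [
  "Arts and Humanities Research Council (AHRC)",
  "http://dx.doi.org/10.13039/501100000267",
  "0000 0004 3497 6001",
  "https://ror.org/0505m1554",
  nil,            # funder_award_1 is left empty, as in the example above
  "AH/S01179X/1"  # project reference from the raw metadata in this issue
]

csv_text = CSV.generate do |csv|
  csv << headers
  csv << row
end

# Read the text back to confirm the columns line up with the headers.
parsed_row = CSV.parse(csv_text, headers: true).first
```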

Here's the altered CSV for the failed example:
BLNewspapers_NuneatonTimes_test.csv

@laritakr
Contributor

Passed QA.

Show page:
Screenshot 2023-04-17 at 11 55 59 AM

Edit page:
Screenshot 2023-04-17 at 11 56 10 AM

@grahamjevon
Collaborator

Passes BL QA - funder details successfully imported. Just need to be mindful that there is no API validation. So maybe we need to include a dropdown list in the bx template (to maintain consistency).

@jillpe jillpe closed this as completed Jun 9, 2023
Projects
Status: Done
Development

No branches or pull requests

8 participants