@sheriefvt
Contributor

Ticket

https://openscience.atlassian.net/browse/SHARE-633

Problem

An issue in the datacite transformer potentially affects every record we harvest from Datacite: descriptions and tags are parsed incorrectly.

Solution

  • Update the datacite source.yaml file.
  • Update the datacite transformer.

* Update datacite source.yaml
* Fix description and tags parsing
description = tools.RunPython(
force_text,
tools.Try(ctx.record.metadata['oai_datacite'].payload.resource.descriptions.description[0])
tools.Try(ctx.record.metadata['oai_datacite'].payload.resource.descriptions)
Member

You might want to use tools.Join here. If there are multiple descriptions it will join them all into a single string.
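The joining behavior described above can be sketched in plain Python. This is only an illustration of the idea, not the SHARE `tools.Join` implementation: `join_descriptions` is a hypothetical helper, and the newline separator and the dropping of empty entries are assumptions.

```python
def join_descriptions(descriptions, joiner="\n"):
    """Hypothetical stand-in for joining several descriptions into one string.

    Not the SHARE API; the separator and empty-entry handling are assumptions.
    """
    if descriptions is None:
        return ""
    # A record with a single description may yield a bare string, not a list.
    if isinstance(descriptions, str):
        descriptions = [descriptions]
    # Strip whitespace and drop empty entries so no stray separators remain.
    return joiner.join(d.strip() for d in descriptions if d and d.strip())


if __name__ == "__main__":
    print(join_descriptions(["First abstract.", "  Methods summary. ", ""]))
```

Handling both the single-string and the list case in one place avoids indexing into `[0]` and silently discarding the remaining descriptions.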

* Merge branch 'develop' of https://github.com/CenterForOpenScience/SHARE into bug/SHARE-633
* Use tools.Join for joining multiple descriptions
@chrisseto requested a review from laurenbarker May 9, 2017 13:33
Contributor

@laurenbarker left a comment

This looks like it fixes the tags and description issue :) However, the issue might be more widespread than it originally appeared. It might be best to move the text_list and force_text functions to share/transform/chain/utils.py and write a test for them, so the same functions can be used in multiple transformers. Right now, the transformers for datacite, oai, mods, and clinicaltrial all use some variation of force_text, and that seems to be where the issue originated. What do you think, @chrisseto? It doesn't have to be part of this ticket, but it would probably be a good idea to do it before we re-transform the datacite records.
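As a rough sketch of what shared helpers and their test could look like: the bodies below are assumptions, not the code in this PR. They assume the parsed XML represents an element with attributes as a dict whose text lives under a `#text` key (the xmltodict convention); only the names `force_text` and `text_list` come from the discussion.

```python
def force_text(data):
    """Return the text of a parsed XML value, or '' if there is none.

    Assumed behavior, not the SHARE implementation.
    """
    if isinstance(data, str):
        return data
    if isinstance(data, dict):
        # Assumption: elements with attributes carry their text under '#text'.
        text = data.get("#text", "")
        return text if isinstance(text, str) else ""
    return ""


def text_list(data):
    """Normalize one value or a list of values into a list of non-empty strings."""
    if data is None:
        return []
    if not isinstance(data, (list, tuple)):
        data = [data]
    return [text for text in (force_text(item) for item in data) if text]


def test_helpers():
    assert force_text("plain") == "plain"
    assert force_text({"#text": "abc", "@xml:lang": "en"}) == "abc"
    assert force_text(None) == ""
    assert text_list({"#text": "one"}) == ["one"]
    assert text_list(["a", {"#text": "b"}, {}, None]) == ["a", "b"]


if __name__ == "__main__":
    test_helpers()
```

Keeping one tested copy in a shared module means each transformer sees the same handling of strings, dicts, and lists instead of its own variation.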

tools.Maybe(tools.Maybe(ctx.record, 'metadata')['oai_datacite'], 'type'),
tools.RunPython(
'text_list',
(tools.Concat(tools.Try(ctx.record.metadata['oai_datacite'].payload.resource.subjects.subject)))
Contributor

If Concat was the problem for tags, it should be removed for the subjects as well (https://github.com/CenterForOpenScience/SHARE/pull/650/files#diff-bedec94698aaa5a42507c84db3ae44d1R549).

@laurenbarker
Contributor

https://openscience.atlassian.net/browse/SHARE-848 has been added to address the broader concerns. As soon as the subjects field is addressed this should be good to go 🎉

@sheriefvt
Contributor Author

@laurenbarker thanks for your feedback. I updated the subject field as requested. 🐼

@chrisseto merged commit 617b737 into CenterForOpenScience:develop May 23, 2017