Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added all PPR, MED and PMC retracted article identifiers I have collected so far #21

Merged
merged 4 commits into from May 2, 2023

Conversation

jmillanacosta
Copy link
Contributor

0ffd03e adds the data for the new sources.
Also, 08232bf fixes a mistake I made with my previous PR where I had confused PMC retraction notice identifiers with PMC article identifiers (they actually were MED articles, the right article identifiers are now in pubmed.tsv, sorry about that!)

@cthoyt
Copy link
Collaborator

cthoyt commented May 1, 2023

Can you please give a bit of background on the acronyms you’re using? Bioregistry does not have an entry for PPR so right now this isn’t super actionable and therefore fails CI. If you can give some background, I will add it to bioregistry.

@egonw
Copy link
Member

egonw commented May 2, 2023

Can you please give a bit of background on the acronyms you’re using?

I'll pick that up. Sorry, should have assigned it to me last night.

@egonw egonw self-assigned this May 2, 2023
@egonw
Copy link
Member

egonw commented May 2, 2023

@jmillanacosta, those PPR identifiers are preprint, correct? Then it's this resource: https://europepmc.org/Preprints

@jmillanacosta
Copy link
Contributor Author

@jmillanacosta, those PPR identifiers are preprint, correct? Then it's this resource: https://europepmc.org/Preprints

Yes: https://europepmc.org/Help in the Sources of content section

@egonw
Copy link
Member

egonw commented May 2, 2023

Yes: https://europepmc.org/Help in the Sources of content section

Okay, then it's this Bioregistry entry: https://bioregistry.io/registry/ppr

@egonw egonw merged commit 6810d5c into bridgedb:main May 2, 2023
1 check failed
@egonw
Copy link
Member

egonw commented May 2, 2023

@cthoyt, the integration test will still fail, but that's an upstream issue: biopragmatics/bioregistry#821

@egonw
Copy link
Member

egonw commented May 2, 2023

Tooted: https://fosstodon.org/@bridgedb/110297675535190330

@cthoyt
Copy link
Collaborator

cthoyt commented May 2, 2023

@cthoyt, the integration test will still fail, but that's an upstream issue: biopragmatics/bioregistry#821

I'm not super happy with how this PR was handled - more thoughts in #22 (comment).

Regardless, I'll take care of the PPR regex right now in the Bioregistry. Thanks for the suggestion!

@cthoyt
Copy link
Collaborator

cthoyt commented May 2, 2023

it also seems that the tests checking the names of the files were disregarded here, too. Are you all familiar with how to read the test logging? I think there is a note in it specifically for when there are files added whose names are not themsevles valid prefixes (e.g.., PMC is not a valid prefix, it should be pmc)

@egonw
Copy link
Member

egonw commented May 2, 2023

I think there is a note in it specifically for when there are files added whose names are not themsevles valid prefixes

I got this output:

======================================================================================================================================== FAILURES =========================================================================================================================================
__________________________________________________________________________________________________________________________ IntegrityTestCase.test_csv_integrity ___________________________________________________________________________________________________________________________

self = <tests.test_integrity.IntegrityTestCase testMethod=test_csv_integrity>

    def test_csv_integrity(self):
        """Test all files have the right columns."""
        for path in self.paths:
            prefix = path.stem
            pattern = bioregistry.get_pattern(prefix)
>           self.assertIsNotNone(pattern)
E           AssertionError: unexpectedly None

tests/test_integrity.py:38: AssertionError
____________________________________________________________________________________________________________________________ IntegrityTestCase.test_path_names ____________________________________________________________________________________________________________________________

self = <tests.test_integrity.IntegrityTestCase testMethod=test_path_names>

    def test_path_names(self):
        """Check all file names use standard prefixes."""
        for path in self.paths:
            with self.subTest(name=path.name):
                stem = path.stem
                norm_stem = bioregistry.normalize_prefix(stem)
>               self.assertEqual(norm_stem, stem, msg=f"unnormalized file name: {path.name}")
E               AssertionError: 'pmc' != 'PMC'
E               - pmc
E               + PMC
E                : unnormalized file name: PMC.tsv

tests/test_integrity.py:31: AssertionError
================================================================================================================================= short test summary info =================================================================================================================================

@egonw
Copy link
Member

egonw commented May 2, 2023

I'm not super happy with how this PR was handled

Understood and taken into account.

I stand by my point. PRs should not be held back because of an incorrect technicality. People before machines.

@egonw
Copy link
Member

egonw commented May 2, 2023

it also seems that the tests checking the names of the files were disregarded here, too.

For clarity, this test was correct. This was not disregarded. I committed a fix for that before I merged in the branch. It did not show up in the PR because I did not want to push to the main branch of a fork. I am sorry I did not comment that in the PR. The 'wrong test' was not that one, but the one about a regular expression, and only that one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants