RFC: Prefix normalised PubMed IDs with pmid: #25

The following PubMed ID is not correctly detected because it is also a valid EAN8 number:
https://www.ncbi.nlm.nih.gov/pubmed/?term=26037202

``` python
>>> import idutils
>>> idutils.is_pmid('26037202')
<_sre.SRE_Match at 0x10b774608>
>>> idutils.detect_identifier_schemes('26037202')
['ean8']
>>> idutils.detect_identifier_schemes('pmid:26037202')
['pmid']
```
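To show why the bare number collides, here is a minimal standalone sketch of the EAN-8 check-digit rule (weights 3, 1, 3, 1, ... over the first seven digits). The helper names `ean8_check_digit` and `looks_like_ean8` are made up for illustration and are not part of idutils, whose own validation may differ in detail.

``` python
def ean8_check_digit(first_seven):
    """Compute the EAN-8 check digit for a string of seven digits."""
    digits = [int(c) for c in first_seven]
    # Positions 1, 3, 5, 7 (index 0, 2, 4, 6) are weighted 3, the rest 1.
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(digits))
    return (10 - total % 10) % 10


def looks_like_ean8(value):
    """Hypothetical check: eight ASCII digits whose last digit is the checksum."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    return ean8_check_digit(value[:7]) == int(value[7])


# For 26037202 the weighted sum of 2,6,0,3,7,2,0 is 38, so the check digit is 2,
# which matches the last digit: the PubMed ID is also a well-formed EAN-8.
print(looks_like_ean8('26037202'))
```

Roughly one in ten eight-digit PubMed IDs will pass this checksum by chance, so the collision is not specific to this one ID.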

I think the main problem arises when scheme detection is used together with normalisation:

``` python
>>> idutils.normalize_pmid('pmid:26037202')
'26037202'
>>> idutils.detect_identifier_schemes(idutils.normalize_pmid('pmid:26037202'))
['ean8']
>>> idutils.detect_identifier_schemes('pmid:26037202')
['pmid']
```

I would propose that we change PubMed normalisation to include the `pmid:` prefix, so that the following holds true:

``` python
idutils.detect_identifier_schemes(idutils.normalize_pmid('pmid:26037202')) == idutils.detect_identifier_schemes('pmid:26037202')
```

This is not strictly correct, since a PubMed ID is just the bare integer, but having just integers as identifiers is a bad idea anyway because they are ambiguous across schemes.
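
As a rough sketch of the proposed behaviour, a normaliser could always emit the prefixed form. The function name `normalize_pmid_prefixed` and the regular expression below are assumptions for illustration only; they are not the current idutils code.

``` python
import re

# Hypothetical pattern: optional "pmid:" prefix followed by digits.
# This is an assumption, not the regex idutils actually uses.
_PMID_RE = re.compile(r"^\s*(?:pmid:\s*)?(\d+)\s*$", re.IGNORECASE)


def normalize_pmid_prefixed(value):
    """Return the PubMed ID in the proposed 'pmid:<digits>' form."""
    match = _PMID_RE.match(value)
    if match is None:
        raise ValueError(f"not a PubMed ID: {value!r}")
    return "pmid:" + match.group(1)


# Prefixed and bare inputs normalise to the same unambiguous string,
# and normalising twice changes nothing.
print(normalize_pmid_prefixed('26037202'))
print(normalize_pmid_prefixed('pmid:26037202'))
```

With output of this shape, running scheme detection on the normalised value would see the `pmid:` prefix rather than a bare integer.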

