## Trying pmc_id_converter

> "pmc_id_converter for ID converter between PMID, PMCID and DOI"

This demonstration runs in temporary sessions served via the MyBinder service, so if you make anything useful, save it back to your local machine ASAP.

The package is already installed in sessions that come up if you launched from [here](https://github.com/fomightez/pmc_id_converter_demo-binder). Otherwise, you'd want to run `%pip install pmc-id-converter` in your Jupyter Notebook file.

--------

# Use `pmc_id_converter` via Command Line interface

NOTE: 
In these demonstrations the exclamation point is used to send commands to the command line. You would not need this in the terminal/console. **In other words, in most cases you'll want to actually delete the exclamation point shown at the start of these commands to use on the REAL command line.**

Running `pmc_idconv --help` shows you several command examples and options for usage.

In [1]:
!pmc_idconv --help

Usage: pmc_idconv [OPTIONS] [IDS]...

  ID converter between PMID, PMCID and DOI

Options:
  --version           Show the version and exit.
  -o, --outfile TEXT  the output filename [stdout]
  -?, -h, --help      Show this message and exit.

  Examples:
      pmc_idconv --help
      pmc_idconv 30003000                     [PMID]
      pmc_idconv PMC6039336                   [PMCID]
      pmc_idconv 10.1007/s13205-018-1330-z    [DOI]
      pmc_idconv 30003000 30003001 30003002   [BATCH]
      pmc_idconv 30003000 30003001 -o out.jl  [FILE]

  Contact: suqingdong <suqingdong1114@gmail.com>


Let's step through those examples as a way to cover identifier conversion on the command line.

In [2]:
!pmc_idconv 30003000

{"doi": "10.1007/s13205-018-1330-z", "pmcid": "PMC6039336", "pmid": 30003000, "requested-id": "30003000"}


In [3]:
!pmc_idconv PMC6039336

{"doi": "10.1007/s13205-018-1330-z", "pmcid": "PMC6039336", "pmid": 30003000, "requested-id": "PMC6039336"}


In [4]:
!pmc_idconv 10.1007/s13205-018-1330-z 

{"doi": "10.1007/s13205-018-1330-z", "pmcid": "PMC6039336", "pmid": 30003000, "requested-id": "10.1007/s13205-018-1330-z"}


The fourth example involving identifiers demonstrates BATCH conversion of multiple identifiers.

In [5]:
!pmc_idconv 30003000 30003001 30003002

{"doi": "10.1007/s13205-018-1330-z", "pmcid": "PMC6039336", "pmid": 30003000, "requested-id": "30003000"}
{"doi": "10.1002/open.201800095", "pmcid": "PMC6031859", "pmid": 30003001, "requested-id": "30003001"}
{"doi": "10.1002/open.201800044", "pmcid": "PMC6031856", "pmid": 30003002, "requested-id": "30003002"}


The fifth example from the usage involving identifiers is meant to demonstrate sending the results to a file. The related example in the repo [there](https://github.com/suqingdong/pmc_id_converter#command-line) looks like a more fully realized version of this, so I suggest using that instead:

```shell
# Output to a file
pmc_idconv 30003000 30003001 30003002 -o out.json
```

In [6]:
!pmc_idconv 30003000 30003001 30003002 -o out.json

[2025-10-22 16:55:59 Main cli DEBUG MainThread:45] save file to: out.json


At this time I cannot explain why it shows `DEBUG MainThread:45`, but everything seems to work, so I suggest ignoring that for now.  
Run the next command to demonstrate that the last example worked:

In [7]:
!head out.json

{"doi": "10.1007/s13205-018-1330-z", "pmcid": "PMC6039336", "pmid": 30003000, "requested-id": "30003000"}
{"doi": "10.1002/open.201800095", "pmcid": "PMC6031859", "pmid": 30003001, "requested-id": "30003001"}
{"doi": "10.1002/open.201800044", "pmcid": "PMC6031856", "pmid": 30003002, "requested-id": "30003002"}


That shows the contents of the generated file `out.json` are the expected results, given the output from `pmc_idconv 30003000 30003001 30003002` earlier.
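
Although the file is named `out.json`, the `head` output above shows one JSON object per line (the JSON Lines layout) rather than a single JSON document, so it seems safest to parse it line by line in Python. Below is a minimal sketch using only the standard library. To keep it self-contained, it first writes a stand-in copy of the three lines shown above to a temporary directory; the helper name `read_json_lines` is my own and not part of the package.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for the `out.json` produced above, copied from the `head` output.
SAMPLE_OUTPUT = """\
{"doi": "10.1007/s13205-018-1330-z", "pmcid": "PMC6039336", "pmid": 30003000, "requested-id": "30003000"}
{"doi": "10.1002/open.201800095", "pmcid": "PMC6031859", "pmid": 30003001, "requested-id": "30003001"}
{"doi": "10.1002/open.201800044", "pmcid": "PMC6031856", "pmid": 30003002, "requested-id": "30003002"}
"""


def read_json_lines(path):
    """Parse a file holding one JSON object per line into a list of dicts."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip any blank lines
                records.append(json.loads(line))
    return records


# Write the stand-in file, then read it back the same way you would read
# the real `out.json`.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "out.json"
    sample_path.write_text(SAMPLE_OUTPUT, encoding="utf-8")
    records = read_json_lines(sample_path)

for record in records:
    print(record["pmid"], record["pmcid"], record["doi"])
```

In a real session you would skip the temporary file and call `read_json_lines('out.json')` directly.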

----------------

# Use `pmc_id_converter` via Python

In addition to the command line, the Usage notes featured in [suqingdong's GitHub repo for pmc_id_converter](https://github.com/suqingdong/pmc_id_converter#id-converter-between-pmid-pmcid-and-doi) show you can use it via Python as well.

I think this offers much more functionality than the command line if you are going to be using this with Python or Jupyter with a Python-based kernel. However, the offered demonstrations are lacking for novices not looking to dig through the code. I will expand greatly on these and give examples of how you can use this utility in your Python ecosystem. In particular, I'll include use with Pandas, a common package in data science work within the Python ecosystem.

These are the Python usage demonstrations offered by [suqingdong's GitHub repo for pmc_id_converter](https://github.com/suqingdong/pmc_id_converter#python):

```python
from pmc_id_converter import API

API.idconv('PMC3531190')
API.idconv('PMC3531190', 'PMC3531191123', 'PMC3531191')
API.idconv('23193287')
API.idconv('10.1093/nar/gks1195')
```

Let's start with the import statement and the first example. Try running this next cell to do that:

In [8]:
from pmc_id_converter import API
API.idconv('PMC3531190')

[PMC3531190]

When running that here, you'll simply get the following as output:

```text
[PMC3531190]
```

That is hardly informative. We got back the same thing we put in, namely the PubMed Central identifier.  
If you know Python, you may recognize that the brackets imply this is a list.

If we run the third offered example, we see something more along the lines of a conversion at least:

In [9]:
API.idconv('23193287')

[PMC3531190]

We didn't need to repeat the import again because it has already been imported into the current namespace.

This time we used the PMID and got the PubMed Central identifier, it seems. However, this doesn't seem as informative as the command line use, and that is because, while this is a list, the item in it is not simply the identifier. Keen Python folks may have realized that, though the result looks like a list, the item inside isn't a Python string, since it is displayed without quotes.  
So what is it?
Run the following cell to check the type of each of the code examples we ran so far:

In [10]:
print(type(API.idconv('PMC3531190')))
print(type(API.idconv('23193287')))

<class 'list'>
<class 'list'>


Indeed, each example from this section so far gave a Python list.

Each list has only one item so let's check what that is by specifying the first item, i.e., the one with index zero.

In [11]:
print(type(API.idconv('PMC3531190')[0]))
print(type(API.idconv('23193287')[0]))

<class 'pmc_id_converter.core.Record'>
<class 'pmc_id_converter.core.Record'>


Now we see each result was an item of the class `pmc_id_converter.core.Record`.

That's interesting. So what we first saw was a list with an instance of a specially defined record class as the only item.

To learn about this record item, let's try the `print()` function to see how Python has been told to display items of the record type.

In [12]:
records_of_query_results = API.idconv('PMC3531190')
for record in records_of_query_results:
    print(record)

{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': 'PMC3531190'}


Oh, that looks like what we got from the command line approach.  
Let's see what the other example yields when explored that way:

In [13]:
records_of_query_results = API.idconv('23193287')
for record in records_of_query_results:
    print(record)

{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': '23193287'}


A-ha, the same thing because that is the PMID.
Now we seem to be getting information that is much more useful.  
And since it is the same thing, we'll focus on just one example for now.

It turns out, if you dig in the code, that the developer of the package added a `data` attribute to this special record class, so we could have gotten much the same thing more easily by taking the first item in the list of results and getting its `data` attribute.

In [14]:
API.idconv('PMC3531190')[0].data

{'doi': '10.1093/nar/gks1195',
 'pmcid': 'PMC3531190',
 'pmid': 23193287,
 'requested-id': 'PMC3531190'}

What we get from the `data` attribute of the record looks like a Python dictionary. Is it?

In [15]:
type(API.idconv('PMC3531190')[0].data)

dict

Indeed, it is.  
Since the object we get from the `data` attribute of the record is a dictionary, we can now use the standard Python ways of handling dictionaries to access the data held in it. Like so:

In [16]:
API.idconv('PMC3531190')[0].data.get('doi')

'10.1093/nar/gks1195'

Or the simple way to access the same thing in a dictionary:

In [17]:
API.idconv('PMC3531190')[0].data['doi']

'10.1093/nar/gks1195'
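
The two access styles give the same value here, but they behave differently when a key is missing, which may matter once results are less uniform. The sketch below uses a plain dictionary copied from the output shown earlier, standing in for `record.data`. The key `'title'` is a hypothetical one chosen only because it is absent from these results.

```python
# Dictionary copied from the output shown earlier in this notebook; in a
# live session it would come from `API.idconv('PMC3531190')[0].data`.
found = {'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190',
         'pmid': 23193287, 'requested-id': 'PMC3531190'}

# .get() returns the value when the key exists ...
print(found.get('doi'))

# ... and None (or a default you supply) when it does not.
# 'title' is a hypothetical key that these results do not contain.
print(found.get('title'))
print(found.get('title', 'not provided'))

# Square brackets raise KeyError for a missing key instead.
try:
    found['title']
except KeyError as error:
    print(f"KeyError raised for missing key: {error}")
```

So `.get()` is the more forgiving choice when you are not sure a key will be present, while square brackets make a missing key fail loudly.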

If you have been paying close attention, you'll also realize the last example under the Python examples, this:

```python
API.idconv('10.1093/nar/gks1195')
```

turns out to be much the same thing as the first (`API.idconv('PMC3531190')`) and third (`API.idconv('23193287')`) examples.  
Let's do that query and see what I mean:

In [18]:
records_of_query_results = API.idconv('10.1093/nar/gks1195')
for record in records_of_query_results:
    print(record)
# Indeed, what we used earlier gives much the same dictionary of results:
print(API.idconv('PMC3531190')[0].data)
print(API.idconv('23193287')[0].data)

{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': '10.1093/nar/gks1195'}
{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': 'PMC3531190'}
{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': '23193287'}


Indeed, only the `'requested-id'` entry that comes from the query input is any different in each resulting dictionary.
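
One way to confirm that observation programmatically is to drop the `'requested-id'` key from each dictionary and compare what is left. This is only a sketch working on the three dictionaries printed above, copied verbatim; the helper name `drop_requested_id` is my own.

```python
# The three result dictionaries printed above, copied verbatim.
by_doi = {'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190',
          'pmid': 23193287, 'requested-id': '10.1093/nar/gks1195'}
by_pmcid = {'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190',
            'pmid': 23193287, 'requested-id': 'PMC3531190'}
by_pmid = {'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190',
           'pmid': 23193287, 'requested-id': '23193287'}


def drop_requested_id(result):
    """Return a copy of a result dictionary without the 'requested-id' key."""
    return {key: value for key, value in result.items()
            if key != 'requested-id'}


stripped = [drop_requested_id(result)
            for result in (by_doi, by_pmcid, by_pmid)]

# True when every stripped dictionary matches the first one.
all_same = all(item == stripped[0] for item in stripped)
print(all_same)
```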

So I'd suggest that a better Python example to offer at the repo, for all but the second example, might have been:

### Better Python Example Code for Novices

```python
from pmc_id_converter import API

query_id = 'PMC3531190'
records_of_query_results = API.idconv(query_id)
for record in records_of_query_results:
    print(record)
query_id = '23193287'
records_of_query_results = API.idconv(query_id)
for record in records_of_query_results:
    print(record)
query_id = '10.1093/nar/gks1195'
records_of_query_results = API.idconv(query_id)
for record in records_of_query_results:
    print(record)
print(record.data['pmcid'])
print(record.data['pmid'])
print(record.data['doi'])
```

Let's run that next:

In [12]:
from pmc_id_converter import API
query_id = 'PMC3531190'
records_of_query_results = API.idconv(query_id)
for record in records_of_query_results:
    print(record)
query_id = '23193287'
records_of_query_results = API.idconv(query_id)
for record in records_of_query_results:
    print(record)
query_id = '10.1093/nar/gks1195'
records_of_query_results = API.idconv(query_id)
for record in records_of_query_results:
    print(record)
print(record.data['pmcid'])
print(record.data['pmid'])
print(record.data['doi'])

{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': 'PMC3531190'}
{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': '23193287'}
{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': '10.1093/nar/gks1195'}
PMC3531190
23193287
10.1093/nar/gks1195


I think that more easily & fully illustrates how to do queries and handle the results.

The second example offered appears more complex because it is a 'batch' query of several identifiers, similar to the 'batch' example from the command line section. However, we can handle the results much the same way with some adjustments.

In [8]:
# version of `API.idconv('PMC3531190', 'PMC3531191123', 'PMC3531191')`
query_strings = ('PMC3531190', 'PMC3531191123', 'PMC3531191')
records_of_query_results = API.idconv(*query_strings)
for record in records_of_query_results:
    print(record)

[2025-10-22 17:42:12 ID_CONV_API idconv ERROR MainThread:58] RecordError: Identifier not found in PMC for "PMC3531191123"


{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': 'PMC3531190'}
{'errmsg': 'Identifier not found in PMC', '_id': 'PMC3531191123'}
{'doi': '10.1093/nar/gks1163', 'pmcid': 'PMC3531191', 'pmid': 23193288, 'requested-id': 'PMC3531191'}


In [11]:
# version of `API.idconv('PMC3531190', 'PMC3531191123', 'PMC3531191')`
query_ids = 'PMC3531190,PMC3531191123,PMC3531191'
records_of_query_results = API.idconv(query_ids)
for record in records_of_query_results:
    print(record)

[2025-10-22 17:44:49 ID_CONV_API idconv ERROR MainThread:58] RecordError: Identifier not found in PMC for "PMC3531191123"


{'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190', 'pmid': 23193287, 'requested-id': 'PMC3531190'}
{'errmsg': 'Identifier not found in PMC', '_id': 'PMC3531191123'}
{'doi': '10.1093/nar/gks1163', 'pmcid': 'PMC3531191', 'pmid': 23193288, 'requested-id': 'PMC3531191'}
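
Both forms of the batch query return a mix of successful conversions and an error entry for the identifier that was not found. One way to handle that, sketched below, is to separate the two kinds by checking for the `'errmsg'` key. The dictionaries are copied from the printed output above. I am assuming that in a live session `record.data` gives the same kind of dictionary for the error entry as it does for the successful ones, and `split_results` is a name of my own, not part of the package.

```python
# Dictionaries copied from the batch output printed above. Assumption: in a
# live session each would be available as `record.data`.
results = [
    {'doi': '10.1093/nar/gks1195', 'pmcid': 'PMC3531190',
     'pmid': 23193287, 'requested-id': 'PMC3531190'},
    {'errmsg': 'Identifier not found in PMC', '_id': 'PMC3531191123'},
    {'doi': '10.1093/nar/gks1163', 'pmcid': 'PMC3531191',
     'pmid': 23193288, 'requested-id': 'PMC3531191'},
]


def split_results(results):
    """Separate successful conversions from entries carrying an 'errmsg'."""
    converted, failed = [], []
    for result in results:
        if 'errmsg' in result:
            failed.append(result)
        else:
            converted.append(result)
    return converted, failed


converted, failed = split_results(results)

# Map each PMCID that resolved to its DOI.
pmcid_to_doi = {result['pmcid']: result['doi'] for result in converted}
print(pmcid_to_doi)

# Report the identifiers that could not be converted.
for result in failed:
    print(f"{result['_id']}: {result['errmsg']}")
```

Keeping the failures in their own list makes it easy to report or retry them without letting a missing `'doi'` key interrupt processing of the good results.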


------

Enjoy!