# Download citation data from NIH-OCC
NIH-OCC: National Institutes of Health's Open Citation Collection https://icite.od.nih.gov/

## 1) Load the NIH Downloader
Download the details about a paper's citations by setting the `NIHiCiteDownloader` argument `details_cites_refs` to `"citations"`.

In [1]:
from pmidcite.icite.downloader import get_downloader

dnldr = get_downloader(details_cites_refs="citations")

Values for the `NIHiCiteDownloader` argument `details_cites_refs` include:
* `"citations"`
* `"references"`
* `"all"` (downloads details for both citations and references)

## 2) Download NIH-OCC data for one PMID

The first paper, `TOP`, is the requested paper. It is followed by a list of citations (`CIT`), then references (`REF`).

Citations are stored in two data members, `cited_by` and `cited_by_clin`. In this example, no clinical papers cite the chosen paper, but `union` is still used to show how the two sets can be merged.

In [2]:
pmid = 22882545
pmids = [pmid]
pmid2paper = dnldr.get_pmid2paper(pmids)

paper = pmid2paper[pmid]

# set of NIHiCiteEntry
all_cites = paper.cited_by.union(paper.cited_by_clin)
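
The merge above relies only on standard Python set behavior. The following is a minimal sketch of that behavior, using plain integers as stand-ins for `NIHiCiteEntry` objects; the extra PMID `99999999` is made up for illustration.

```python
# Stand-in PMIDs; the notebook's sets hold NIHiCiteEntry objects, not ints
cited_by = {24383934, 25052415, 25244680}
cited_by_clin = set()  # no clinical papers cite the paper in this example

# union() returns a new set and leaves both inputs unchanged
all_cites = cited_by.union(cited_by_clin)
print(sorted(all_cites))

# An item present in both sets appears only once in the union
overlap = {25052415, 99999999}  # 99999999 is a made-up PMID
merged = cited_by.union(overlap)
print(len(merged))
```

Because the result is a set, a paper listed in both `cited_by` and `cited_by_clin` would not be counted twice.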

## 3) Default sort of NIHiCiteEntry objects is by PMID

In [3]:
for nih_entry in sorted(all_cites):
    print(nih_entry)

24383934 R. .AM..  60 2 2014    31  0  54 au[17](Marie Louis) Habitat-driven population structure of bottlenose dolphins, Tursiops truncatus, in the North-East Atlantic.
25052415 R. .AM..  59 2 2015    29  0  37 au[09](A E Moura) Phylogenomics of the killer whale indicates ecotype divergence in sympatry.
25244680 R. .A...  51 2 2014    27  0  59 au[10](Andre E Moura) Population genomics of the killer whale indicates ecotype evolution in sympatry involving both selection and drift.
25297864 R. .A...  53 2 2014    25  0  39 au[10](Marie Louis) Ecological opportunities and specializations shaped genetic divergence in a highly mobile marine top predator.
25738698 R. .A...  31 2 2015    10  0   5 au[06](Marta Söffker) The impact of predation by marine mammals on patagonian toothfish longline fisheries.
25883362 .. .A...  88 3 2015    76  0  85 au[02](Neil P Kelley) Vertebrate evolution. Evolutionary innovation and ecology in marine tetrapods from the Triassic to the Anthropocene.
26937049 R

## 4) Sort by NIH percentile
NIH entries that are too new to have been given an NIH percentile are set to 999 in *pmidcite*.

It is important to highlight new papers. The 999 value makes the newest papers sort next to the papers with the highest NIH percentiles, so the new papers stand out.
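
The effect of the 999 value can be sketched with plain dictionaries. The records below are hypothetical stand-ins (made-up PMIDs and percentiles) that carry only a `nih_perc` key; they are not real *pmidcite* objects.

```python
# Hypothetical records; only the 'pmid' and 'nih_perc' keys are used here
entries = [
    {"pmid": 1, "nih_perc": 54.0},
    {"pmid": 2, "nih_perc": 999},   # too new to have a percentile
    {"pmid": 3, "nih_perc": 88.0},
    {"pmid": 4, "nih_perc": 31.0},
]

# Descending sort: the 999 sentinel lands ahead of every real percentile
by_perc = sorted(entries, key=lambda e: e["nih_perc"], reverse=True)
ordered_pmids = [e["pmid"] for e in by_perc]
print(ordered_pmids)
```

In a descending sort, the record carrying 999 comes first, followed by the records from highest to lowest percentile.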

In [4]:
for nih_entry in sorted(all_cites, key=lambda o: o.get_dict()['nih_perc'], reverse=True):
    print(nih_entry)

35815600 R. .A...  -1 i 2023     3  0  34 au[06](James O Farlow) 'Dragons' on the landscape: Modeling the abundance of large carnivorous dinosaurs of the Upper Jurassic Morrison Formation (USA) and the Upper Cretaceous Dinosaur Park Formation (Canada).
37055915 R. .A...  -1 i 2023     3  0  25 au[16](Anaïs Remili) Quantitative fatty acid signature analysis reveals a high level of dietary specialization in killer whales across the North Atlantic.
37284666 R. HA...  -1 i 2023     1  0  22 au[04](Rowan K Jordaan) The effect of prey abundance and fisheries on the survival, reproduction, and social structure of killer whales (<i>Orcinus orca</i>) at subantarctic Marion Island.
37839906 .. .A...  -1 i 2024     0  0  53 au[07](Eamonn I F Wooster) Animal cognition and culture mediate predator-prey interactions.
38179079 R. .A...  -1 i 2024     0  0  12 au[06](Fannie W Shabangu) Killer whale acoustic patterns respond to prey abundance and environmental variability around the Prince Edward Islan

## 5) Sort by NIH group, then by year
This places the newest papers (NIH group `i`) first, followed by papers that perform well (NIH groups `2` and above). The lowest performing papers (NIH groups `0` and `1`) are last.
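
The two-level ordering described above can be sketched with a composite sort key. The records and the integer group codes below are assumptions made for illustration (here `5` stands in for the new-paper group); how *pmidcite* encodes `nih_group` internally is not shown in this notebook.

```python
# Hypothetical records: 'group' 5 stands in for the "new paper" group
records = [
    {"pmid": 1, "group": 2, "year": 2014},
    {"pmid": 2, "group": 5, "year": 2023},
    {"pmid": 3, "group": 2, "year": 2015},
    {"pmid": 4, "group": 5, "year": 2024},
    {"pmid": 5, "group": 1, "year": 2016},
]

# Tuples compare element by element: group first, year breaks ties
ranked = sorted(records, key=lambda r: (r["group"], r["year"]), reverse=True)
ranked_pmids = [r["pmid"] for r in ranked]
print(ranked_pmids)
```

A tuple key behaves the same as the list key used in this notebook; within each group, `reverse=True` also puts the most recent year first.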

In [5]:
nih_cites = sorted(all_cites, key=lambda o: [o.get_dict()['nih_group'], o.get_dict()['year']], reverse=True)
for nih_entry in nih_cites:
    print(nih_entry)

37839906 .. .A...  -1 i 2024     0  0  53 au[07](Eamonn I F Wooster) Animal cognition and culture mediate predator-prey interactions.
38179079 R. .A...  -1 i 2024     0  0  12 au[06](Fannie W Shabangu) Killer whale acoustic patterns respond to prey abundance and environmental variability around the Prince Edward Islands, Southern Ocean.
37591692 R. .A...  -1 i 2024     0  0  46 au[04](R F Bennion) Craniodental ecomorphology of the large Jurassic ichthyosaurian Temnodontosaurus.
35815600 R. .A...  -1 i 2023     3  0  34 au[06](James O Farlow) 'Dragons' on the landscape: Modeling the abundance of large carnivorous dinosaurs of the Upper Jurassic Morrison Formation (USA) and the Upper Cretaceous Dinosaur Park Formation (Canada).
36917944 .. HA...  -1 i 2023     0  0  11 au[01](Janet Mann) Animal behavior: Killer whale mamas' boys.
37055915 R. .A...  -1 i 2023     3  0  25 au[16](Anaïs Remili) Quantitative fatty acid signature analysis reveals a high level of dietary specialization in kill

## 6) Print the keys which can be used for sorting
Pick out one NIH entry (a `NIHiCiteEntry` object) and print its available keys.

In [6]:
nih_entry = next(iter(nih_cites))
print('\n{N} key-value pairs in an NIH entry:\n'.format(N=len(nih_entry.dct)))
for key, value in nih_entry.get_dict().items():
    print(f"{key:>27} {value}")


32 key-value pairs in an NIH entry:

                       pmid 37839906
                       year 2024
                      title Animal cognition and culture mediate predator-prey interactions.
                    authors ['Eamonn I F Wooster', 'Kaitlyn M Gaynor', 'Alexandra J R Carthey', 'Arian D Wallach', 'Lauren A Stanton', 'Daniel Ramp', 'Erick J Lundgren']
                    journal Trends Ecol Evol
        is_research_article False
    relative_citation_ratio None
             nih_percentile None
                      human 0.0
                     animal 1.0
         molecular_cellular 0.0
                        apt 0.05
                is_clinical False
             citation_count 0
         citations_per_year 0.0
expected_citations_per_year None
        field_citation_rate None
                provisional False
                    x_coord 0.8660254037844386
                    y_coord -0.5
              cited_by_clin []
                   cited_by []
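
Several values in the listing above are `None`, and in Python 3 comparing `None` with a number raises `TypeError`. The following is a minimal sketch of one way to sort on such a key; the records and the helper name `rcr_key` are hypothetical and not part of *pmidcite*.

```python
# Hypothetical records mimicking a field that can be None
records = [
    {"pmid": 1, "relative_citation_ratio": 1.7},
    {"pmid": 2, "relative_citation_ratio": None},
    {"pmid": 3, "relative_citation_ratio": 0.4},
]

def rcr_key(rec):
    """Sort key that places None before every real number."""
    value = rec["relative_citation_ratio"]
    # (False, 0.0) < (True, x) for any x, so None-valued records sort first
    return (value is not None, value if value is not None else 0.0)

ascending = [r["pmid"] for r in sorted(records, key=rcr_key)]
print(ascending)
```

Passing `reverse=True` to the same call would move the `None`-valued records to the end instead.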
                 

## 7) Expand NIH group `3` (well-performing papers) to include NIH percentiles of 50 or higher

In [7]:
from pmidcite.icite.nih_grouper import NihGrouper

grpr = NihGrouper(group3_min=50.0)

dnldr = get_downloader(details_cites_refs="citations", nih_grouper=grpr)
paper = dnldr.get_paper(22882545)

for nihentry in sorted(paper.cited_by, key=lambda o: [o.get_dict()['nih_group'], o.get_dict()['year']], reverse=True):
    print(nihentry)

37591692 R. .A...  -1 i 2024     0  0  46 au[04](R F Bennion) Craniodental ecomorphology of the large Jurassic ichthyosaurian Temnodontosaurus.
38179079 R. .A...  -1 i 2024     0  0  12 au[06](Fannie W Shabangu) Killer whale acoustic patterns respond to prey abundance and environmental variability around the Prince Edward Islands, Southern Ocean.
37839906 .. .A...  -1 i 2024     0  0  53 au[07](Eamonn I F Wooster) Animal cognition and culture mediate predator-prey interactions.
37284666 R. HA...  -1 i 2023     1  0  22 au[04](Rowan K Jordaan) The effect of prey abundance and fisheries on the survival, reproduction, and social structure of killer whales (<i>Orcinus orca</i>) at subantarctic Marion Island.
37339590 R. .A...  -1 i 2023     0  0   7 au[02](Michael N Weiss) Killer whales.
37055915 R. .A...  -1 i 2023     3  0  25 au[16](Anaïs Remili) Quantitative fatty acid signature analysis reveals a high level of dietary specialization in killer whales across the North Atlantic.
36917944
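
The idea of moving a group boundary can be sketched independently of *pmidcite*. The function `make_grouper` and every cut-off value below are hypothetical; this is not how `NihGrouper` is implemented, only an illustration of threshold-based grouping.

```python
from bisect import bisect_right

def make_grouper(group_mins):
    """Return a function mapping a percentile to a group index.

    group_mins holds the ascending minimum percentile for groups 1, 2, 3, ...
    These thresholds are illustrative, not pmidcite's defaults.
    """
    def grouper(percentile):
        # The count of thresholds at or below the percentile is the group index
        return bisect_right(group_mins, percentile)
    return grouper

strict = make_grouper([5.0, 25.0, 80.0])   # hypothetical cut-offs
relaxed = make_grouper([5.0, 25.0, 50.0])  # group 3 now starts at 50

print(strict(59.0), relaxed(59.0))
```

Lowering the group `3` minimum moves mid-percentile papers up one group without changing their percentile, which is why the sort order by group can change.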

Copyright (C) 2019-present, DV Klopfenstein, PhD. All rights reserved.