# Explore the reference network


- [Crossref](https://www.crossref.org/) provides an API to get the references of a publication ([rest-api-doc](https://github.com/CrossRef/rest-api-doc)).


- The idea is to walk down the reference network for a fixed number of steps (for example 3), keep only the articles that have been visited more than N times (for example 3 times), and then draw the upward graph starting from these articles.
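The exploration described above can be sketched on a toy network. This is only an illustration of the idea, not the `crossrefexp` implementation: the names `explore_down`, `upward_edges` and `TOY_REFS` are hypothetical, and the toy identifiers stand in for real DOIs.

```python
from collections import Counter, defaultdict

def explore_down(roots, get_refs, gen=3):
    """Follow references for `gen` generations, expanding each article once.

    Returns how many times each article was reached, and who cites it.
    """
    visits = Counter()
    cited_by = defaultdict(set)
    seen = set(roots)
    frontier = set(roots)
    for _ in range(gen):
        discovered = set()
        for doi in frontier:
            for ref in get_refs(doi):
                visits[ref] += 1          # one visit per citing article
                cited_by[ref].add(doi)
                if ref not in seen:
                    seen.add(ref)
                    discovered.add(ref)
        frontier = discovered
    return visits, cited_by

def upward_edges(starts, cited_by):
    """Collect the (citing, cited) edges met when climbing back up from `starts`."""
    edges, done, stack = set(), set(), list(starts)
    while stack:
        doi = stack.pop()
        if doi in done:
            continue
        done.add(doi)
        for citing in cited_by.get(doi, ()):
            edges.add((citing, doi))
            stack.append(citing)
    return edges

# Toy reference lists (hypothetical data): article -> articles it cites.
TOY_REFS = {
    "A": ["B", "C", "D"],
    "B": ["E", "F"],
    "C": ["E", "F"],
    "D": ["E"],
    "E": ["G"],
    "F": ["G"],
}

visits, cited_by = explore_down(["A"], lambda doi: TOY_REFS.get(doi, []), gen=3)
n = 2  # keep the articles visited more than n times
kept = sorted(doi for doi, count in visits.items() if count > n)
edges = upward_edges(kept, cited_by)
print(kept)        # ['E']
print(len(edges))  # 6
```

Here only `E` is reached more than twice, and the upward graph contains the three edges into `E` plus the three edges from `A` to its direct references.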


## To do:
- remove the MetaData class, and clean doi_list
- resolve identical labels
- deal with ghost refs, i.e. references without a DOI
- stats: number of references vs. number of citations, review vs. foundational
- coloring: disconnected components for the first-generation references, author-based clusters
- interactive app + flask
- multi doi query: https://github.com/CrossRef/rest-api-doc/issues/301

- look at: https://en.wikipedia.org/wiki/Bibliographic_coupling
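As a note on the last item, the bibliographic coupling strength of two articles is the number of references they share. A minimal sketch, using made-up identifiers rather than real DOIs (the function name `coupling_strength` is hypothetical):

```python
def coupling_strength(refs_a, refs_b):
    """Number of references shared by two articles."""
    return len(set(refs_a) & set(refs_b))

# Hypothetical reference lists for two articles.
refs_paper_1 = ["ref-1", "ref-2", "ref-3", "ref-4"]
refs_paper_2 = ["ref-3", "ref-4", "ref-5"]

shared = coupling_strength(refs_paper_1, refs_paper_2)
print(shared)  # 2
```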

### With more data:
- use 'store' from scopus manual export + [query ref resolver](https://www.crossref.org/labs/resolving-citations-we-dont-need-no-stinkin-parser/), https://search.crossref.org/references

- get country/city/university, then get flag/favicon


### With Scopus 'cited by':
- list of citing DOIs + cited-by count
- query each
- graph


In [1]:
import crossrefexp as exp

In [2]:
store = exp.MetaDataStore('data/cachemetadata_test000.pickle')

464 metadata loaded from `data/cachemetadata_test000.pickle`


In [3]:
doi1 = "10.1103/PhysRevA.62.012306"
doi2 = "10.1103/PhysRevA.97.022108"

In [4]:
# doi = "10.1063/1.337221"

In [5]:
# Query metadata on Crossref
store.query([doi1, doi2])
print( '\n', store.get_info(doi1) )
print( '\n', store.get_info(doi2) )

Requesing 2 metadata:
Query performed in 1.604986 s. (2 doi)
2 metadata returned for 2 asked
data/cachemetadata_test000.pickle saved.

 Electron-spin-resonance transistors for quantum computing in silicon-german...
(2000) Physical Review A
Rutger Vrijen, Eli Yablonovitch, Kang Wang, Hong Wen Jiang, Alex Balandin, Vwani Roychowdhury, Tal Mor, David DiVincenzo
35 references - 31 with doi


 Zeno effect of an open quantum system in the presence of 1/f noise
(2018) Physical Review A
Shu He, Chen Wang, Li-Wei Duan, Qing-Hu Chen
59 references - 57 with doi
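
A summary line such as `59 references - 57 with doi` could be derived from a Crossref work record roughly as follows. The Crossref REST API returns a work at `https://api.crossref.org/works/{doi}`, with the record under `message` and the cited works in its `reference` list; entries that Crossref could match carry a `DOI` key. How `crossrefexp` actually computes the counts is an assumption here, and `reference_summary` and the sample record are hypothetical (placeholder DOIs, no network call).

```python
def reference_summary(work):
    """Return the number of references and the list of those having a DOI."""
    refs = work.get("reference", [])
    dois = [ref["DOI"].lower() for ref in refs if ref.get("DOI")]
    return len(refs), dois

# Hand-written sample shaped like the `message` part of a Crossref response.
sample_work = {
    "DOI": "10.0000/example",
    "reference": [
        {"key": "ref1", "DOI": "10.0000/AAA"},
        {"key": "ref2", "unstructured": "Some book, 1999"},  # a ghost ref
        {"key": "ref3", "DOI": "10.0000/bbb"},
    ],
}

n_refs, ref_dois = reference_summary(sample_work)
print("{} references - {} with doi".format(n_refs, len(ref_dois)))
# 3 references - 2 with doi
```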



In [6]:
doi3 = '10.1038/ncomms2411'

In [7]:
print( '\n', store.get_info(doi3) )


 A polymer tandem solar cell with 10.6% power conversion efficiency
(2013) Nature Communications
Jingbi You, Letian Dou, Ken Yoshimura, Takehito Kato, Kenichiro Ohya, Tom Moriarty, Keith Emery, Chun-Chao Chen, Jing Gao, Gang Li, Yang Yang
55 references - 53 with doi



In [8]:
# Build the reference graph from the three DOIs and display it:
gr = store.get_refgraphviz( [doi1, doi2, doi3], gen=3, top=4, draw_secondary_links=False )
gr

growth achieved - 143 nodes in the graph. The last generation number is 1.
growth achieved - 2303 nodes in the graph. The last generation number is 2.
Requesing 1840 metadata:
Query performed in 2.571434 s. (92 doi)
Query performed in 1.535776 s. (92 doi)
Query performed in 1.143509 s. (92 doi)
Query performed in 0.978379 s. (92 doi)
Query performed in 0.999028 s. (92 doi)
Query performed in 1.061934 s. (92 doi)
Query performed in 1.306046 s. (92 doi)
Query performed in 1.035474 s. (92 doi)
Query performed in 1.497732 s. (92 doi)
Query performed in 1.073576 s. (92 doi)


KeyboardInterrupt: 
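
The log above shows the 1840 DOIs being requested in batches of 92. A minimal sketch of such batching is given below; it is not the `crossrefexp` code. The helper names `chunked` and `doi_filter` are hypothetical, the DOIs are placeholders, and the `filter=doi:...,doi:...` form is an assumption based on the Crossref filter syntax discussed in the multi-DOI issue linked in the to-do list.

```python
def chunked(items, size):
    """Yield successive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def doi_filter(batch):
    """Build a filter value asking for several DOIs at once (assumed syntax)."""
    return ",".join("doi:{}".format(doi) for doi in batch)

# Placeholder DOIs standing in for the 1840 missing metadata records.
pending = ["10.0000/placeholder-{}".format(i) for i in range(1840)]
batches = list(chunked(pending, 92))

print(len(batches))  # 20
print(doi_filter(batches[0][:2]))
# doi:10.0000/placeholder-0,doi:10.0000/placeholder-1
```

Saving the cache after each batch would keep the metadata already fetched if the run is interrupted, as happened here.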

In [None]:
help(store.get_refgraphviz)

In [None]:
# List of the top cited refs.
gr = store.build_a_refgraph( doi1, gen=2 )
print('-- Top cited: --')
for doi, citedby_count in gr.most_cited()[:20]:
    metadata = store.get(doi)
    print( '{}\t cited {} times [gen{}]  {}'.format(metadata.label(),
                                                   citedby_count, 
                                                   gr[doi]['gen'],
                                                   metadata.get('URL')) )