Scopus (Elsevier and ScienceDirect) should be another good source of metadata for our research references. I still need to work through the various API endpoints to find the right combination of queries and data structures, and some of these calls will have to be run from inside the USGS network: Scopus authorizes some of its APIs by a combination of originating IP address and a registered API key, based on the institution's subscription.
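
To make the API-key half of that authorization concrete, here is a minimal sketch of how a DOI lookup against the Scopus Search endpoint could be put together. The function name `build_scopus_doi_request` and the placeholder key are hypothetical, and this is not necessarily how `rrl.ResearchReferenceLibrary.lookup_scopus_by_doi` works internally. The IP-based check happens on Elsevier's side, so nothing in the client code controls it.

```python
from urllib.parse import urlencode

# Scopus Search API endpoint (Elsevier developer portal).
SCOPUS_SEARCH_URL = "https://api.elsevier.com/content/search/scopus"


def build_scopus_doi_request(doi, api_key):
    """Return the URL and headers for a Scopus search on a single DOI.

    Hypothetical helper for illustration; it only builds the request
    and does not send it.
    """
    # DOI(...) is the Scopus field code for an exact DOI match.
    params = {"query": f"DOI({doi})"}
    headers = {
        "X-ELS-APIKey": api_key,       # registered API key
        "Accept": "application/json",  # ask for JSON rather than XML
    }
    return f"{SCOPUS_SEARCH_URL}?{urlencode(params)}", headers


# Placeholder values, not real credentials or a real DOI.
url, headers = build_scopus_doi_request("10.1000/xyz123", "YOUR-API-KEY")
print(url)
```

The result could then be sent with something like `requests.get(url, headers=headers).json()`. Whether the call is authorized still depends on the originating network.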

Initially, this notebook just runs through the cases where we've already determined a DOI and grabs a basic Scopus record for future reference.

In [1]:
import requests
from IPython.display import display
from datetime import datetime
from bis import dd
from bis import rrl

In [2]:
bis = dd.getDB("bis")
collection_rrl = bis["RRL"]

In [4]:
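count = 0
# Records that have a DOI (from the link metadata or from CrossRef) but no cached Scopus record yet.
# The last clause has to use the same key that the update below writes ("Scopus Record").
query = {"$and":[
    {"$or":[{"CrossRef.Record.DOI":{"$exists":True}},{"Link Metadata.Link Response.DOI":{"$exists":True}}]},
    {"Scopus.Scopus Record":{"$exists":False}}
]}
for record in collection_rrl.find(query):
    # Prefer the DOI from the link metadata; fall back to the CrossRef record.
    try:
        recordDOI = record["Link Metadata"]["Link Response"]["DOI"]
    except (KeyError, TypeError):
        recordDOI = record["CrossRef"]["Record"]["DOI"]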
    
    scopusSearch = rrl.ResearchReferenceLibrary.lookup_scopus_by_doi(recordDOI)
    scopusRecord = {}
    scopusRecord["Date Retrieved"] = datetime.utcnow().isoformat()
    scopusRecord["Search Term"] = "doi"
    scopusRecord["Search Value"] = recordDOI

    if scopusSearch["search-results"]["opensearch:totalResults"] == "1":
        scopusRecord["Cached Result"] = scopusSearch["search-results"]["entry"][0]
        scopus = {"Success":True,"Scopus Record":scopusRecord}
    else:
        scopus = {"Success":False,"Scopus Record":scopusRecord}

    print (count, collection_rrl.update_one({"_id":record["_id"]},{"$set":{"Scopus":scopus}}))
    
    count = count + 1
    

0 <pymongo.results.UpdateResult object at 0x11ea701f8>
1 <pymongo.results.UpdateResult object at 0x114847e58>
2 <pymongo.results.UpdateResult object at 0x1132727e0>
...
882 <pymongo.results.UpdateResult object at 0x11559a5e8>

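The wrapping step in the loop above can also be pulled out into a small pure function, which makes it easy to check without touching MongoDB or the Scopus API. This is only a sketch: the name `build_scopus_document` is hypothetical, and it uses a timezone-aware timestamp, so the ISO string carries a `+00:00` offset that the `datetime.utcnow()` version above does not.

```python
from datetime import datetime, timezone


def build_scopus_document(search_response, doi, retrieved=None):
    """Wrap a Scopus search response in the structure stored under "Scopus".

    Hypothetical helper that mirrors the loop above. A lookup counts as
    a success only when exactly one result came back.
    """
    if retrieved is None:
        retrieved = datetime.now(timezone.utc)

    scopus_record = {
        "Date Retrieved": retrieved.isoformat(),
        "Search Term": "doi",
        "Search Value": doi,
    }

    # Use .get() so an error payload without "search-results" is
    # recorded as a failed lookup instead of raising KeyError.
    results = search_response.get("search-results", {})
    entries = results.get("entry", [])
    success = results.get("opensearch:totalResults") == "1" and len(entries) > 0

    if success:
        scopus_record["Cached Result"] = entries[0]

    return {"Success": success, "Scopus Record": scopus_record}


# Made-up response with the same shape the loop above expects.
example = {"search-results": {"opensearch:totalResults": "1",
                              "entry": [{"dc:title": "Example"}]}}
print(build_scopus_document(example, "10.1000/xyz123")["Success"])
```

With a helper like this, the body of the loop reduces to one lookup, one call to build the document, and one `update_one`.
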
# Next Steps
The next thing I'll try once I'm back in the office is to grab the citation summary for each of these records and work with the search APIs to see where I can fill in blanks. Scopus search will be a little harder than CrossRef search because I need a reasonable parse of the citation metadata so that I can query on its parts instead of just sending along the messy string. So I'll need to revisit the parsing mechanisms to see how good I can make them.
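
As a rough idea of what "using its parts" might look like, the sketch below turns already-parsed citation parts into a fielded Scopus query string. Everything here is an assumption to check against the Scopus search documentation: the helper name `build_scopus_query`, the dictionary keys, and the choice of the `TITLE`, `AUTHLASTNAME`, and `PUBYEAR` field codes. The hard part, parsing the messy citation string into those parts, is not covered.

```python
def build_scopus_query(parts):
    """Combine parsed citation parts into a fielded Scopus query string.

    `parts` is a hypothetical dict with optional keys "title",
    "author_last_name", and "year". Missing or empty parts are skipped.
    """
    clauses = []

    title = parts.get("title")
    if title:
        # Drop embedded double quotes so the phrase stays well-formed.
        cleaned = title.replace('"', " ").strip()
        clauses.append(f'TITLE("{cleaned}")')

    author = parts.get("author_last_name")
    if author:
        clauses.append(f"AUTHLASTNAME({author.strip()})")

    year = parts.get("year")
    if year:
        # int() raises ValueError on a badly parsed year, so it
        # never reaches the query.
        clauses.append(f"PUBYEAR = {int(year)}")

    return " AND ".join(clauses)


# Made-up citation parts for illustration.
print(build_scopus_query({"title": "Example title",
                          "author_last_name": "Smith",
                          "year": "2010"}))
```

A query built this way would go in the `query` parameter of the same search endpoint used for the DOI lookups. Which clauses to include, and how strict to be, would depend on how reliable each parsed part turns out to be.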