__PyScopus__: Quick Start

__PyScopus__ is a Python wrapper for the [Elsevier Scopus API](https://dev.elsevier.com/). More details about the package can be found in the [documentation](http://zhiyzuo.github.io/python-scopus/).

<hr>

Import the `Scopus` class and initialize it with your own __API Key__:

In [4]:
import pyscopus
pyscopus.__version__

'1.0.3a1'

In [32]:
from pyscopus import Scopus

In [33]:
key = 'YOUR_OWN_API'

In [34]:
scopus = Scopus(key)
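
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the key is stored in an environment variable; the variable name `SCOPUS_API_KEY` and the helper `load_api_key` are illustrative and not part of PyScopus:

```python
import os


def load_api_key(env_var="SCOPUS_API_KEY"):
    """Return the API key stored in `env_var`, or raise if it is missing.

    The variable name is an assumption for this sketch, not a PyScopus convention.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Scopus API key")
    return key
```

The returned string can then be passed to the constructor as before, for example `scopus = Scopus(load_api_key())`.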

<hr>

### General Search

In [8]:
search_df = scopus.search("KEY(topic modeling)", count=30)

In [9]:
print(search_df.head(10))

     scopus_id                                              title  \
0  85059802501  A hybrid IT framework for identifying high-qua...   
1  85059563587  Knowledge empowered prominent aspect extractio...   
2  85056216742  Subevents detection through topic modeling in ...   
3  85059799573  An overview of statistical methods for handlin...   
4  85056478111  Content features of tweets for effective commu...   
5  85059561429  Variational-based latent generalized Dirichlet...   
6  85055869272  Integrating Topic, Sentiment, and Syntax for M...   
7  85058006073  Identifying spatial interaction patterns of ve...   
8  85057796067  Topical Co-Attention Networks for hashtag reco...   
9  85055701263    High-level event identification in social media   

                                    publication_name      issn  isbn  \
0    International Journal of Information Management  02684012  None   
1              Information Processing and Management  03064573  None   
2                 Future

#### Full text link

In [10]:
full_text_link_arr = search_df.full_text.values
full_text_link_arr

array(['https://api.elsevier.com/content/article/eid/1-s2.0-S026840121830834X',
       'https://api.elsevier.com/content/article/eid/1-s2.0-S0306457318305193',
       'https://api.elsevier.com/content/article/eid/1-s2.0-S0167739X18307611',
       'https://api.elsevier.com/content/article/eid/1-s2.0-S0895435618304943',
       'https://api.elsevier.com/content/article/eid/1-s2.0-S0268401217308265',
       'https://api.elsevier.com/content/article/eid/1-s2.0-S0925231218315030',
       None,
       'https://api.elsevier.com/content/article/eid/1-s2.0-S0198971518302898',
       'https://api.elsevier.com/content/article/eid/1-s2.0-S0925231218314012',
       None, None,
       'https://api.elsevier.com/content/article/eid/1-s2.0-S1047320319300057',
       'https://api.elsevier.com/content/article/eid/1-s2.0-S1047320319300094',
       'https://api.elsevier.com/content/article/eid/1-s2.0-S0020025518307667',
       None,
       'https://api.elsevier.com/content/article/eid/1-s2.0-S09574174183061

For entries that have a full-text link, you can retrieve the whole text by calling `scopus.retrieve_full_text()`:

In [11]:
full_text = scopus.retrieve_full_text(full_text_link_arr[2])

In [12]:
start = 39500
full_text[start:start+10000]

'any times they are implicitly described in the text. Our goal is to address the subevent detection in an unsupervised way so it can find unknown subevents while also producing explanatory labels so a user could identify the main aspects of the subevent without having to know beforehand their details. The main challenge in subevents detection is the similarity of vocabulary since every subevent shares some of the key terms present in the main event. They are also usually in a larger quantity and occurs in small timeframes when comparing to main events. We are not resorting to classification and supervised methods since social networks events are unpredictable. Also, we are concerned about giving an understanding of the event to the user without requiring a field specialist via phrases that can capture the thematic of the social discussions. This way, many events occurring could be deeply investigated at the level of subevents and be used to help in a variety of situations such as emerg
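
Several entries in `full_text_link_arr` above are `None`, so an arbitrary index may not point to a retrievable article. A minimal sketch for keeping only the usable links before calling `scopus.retrieve_full_text()`; the helper name `usable_links` is hypothetical, and the sample list reuses a few values from the output above:

```python
def usable_links(links):
    """Return (position, link) pairs for entries that actually hold a link.

    Positions refer to the original sequence, so they can still be matched
    against the rows of the search result.
    """
    return [(i, link) for i, link in enumerate(links) if isinstance(link, str) and link]


sample_links = [
    "https://api.elsevier.com/content/article/eid/1-s2.0-S026840121830834X",
    None,
    "https://api.elsevier.com/content/article/eid/1-s2.0-S0198971518302898",
    None,
]

for position, link in usable_links(sample_links):
    print(position, link)
```

Checking for a non-empty string, rather than only for `None`, also skips `NaN` placeholders in case the column was loaded through pandas.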

<hr>

#### Search for a specific author

In [13]:
author_result_df = scopus.search_author("AUTHLASTNAME(Zuo) and AUTHFIRST(Zhiya) and AFFIL(Iowa)")

In [14]:
print(author_result_df)

     author_id       name  document_count         affiliation affiliation_id
0  57189222659  Zhiya Zuo               9  University of Iowa       60024324


We can then retrieve more detailed information about the author using the __author_id__:

In [15]:
zuo_info_dict = scopus.retrieve_author('57189222659')

In [16]:
zuo_info_dict.keys()

dict_keys(['author-id', 'eid', 'document-count', 'cited-by-count', 'citation-count', 'name', 'last', 'first', 'indexed-name', 'publication-range', 'affiliation-current', 'journal-history', 'affiliation-history'])

In [17]:
print('\n'.join(zuo_info_dict['affiliation-history'].name.values))

University of Iowa
University of Iowa, Interdisciplinary Graduate Program in Informatics
China Academy of Chinese Medical Sciences, Data Center of TCM
Tongji University, Department of Control Science and Engineering


#### Search for the author's publications

In [18]:
zuo_pub_df = scopus.search_author_publication('57189222659')

In [19]:
zuo_pub_df[['title', 'cover_date', 'publication_name', 'scopus_id']].sort_values('cover_date').reset_index(drop=True)

| | title | cover_date | publication_name | scopus_id |
|---|---|---|---|---|
| 0 | Subhealth state classification with AdaBoost l... | 2013-01-01 | International Journal of Functional Informatic... | 84892146718 |
| 1 | The evolution of user roles in online health c... | 2015-01-01 | Pacific Asia Conference on Information Systems... | 85011032472 |
| 2 | The evolution and diffusion of user roles in o... | 2015-12-08 | Proceedings - 2015 IEEE International Conferen... | 84966453564 |
| 3 | Investigating regional prejudice in China thro... | 2016-01-01 | Lecture Notes in Computer Science (including s... | 84995379893 |
| 4 | Systematic investigation of sex disparity in m... | 2017-01-01 | Proceedings of the Association for Information... | 85040762441 |
| 5 | Comparing of feature selection and classificat... | 2017-01-17 | Proceedings - 2016 IEEE International Conferen... | 85013231650 |
| 6 | The state and evolution of U.S. iSchools: From... | 2017-05-01 | Journal of the Association for Information Sci... | 85004154180 |
| 7 | A Graphical Model for Topical Impact over Time | 2018-05-23 | Proceedings of the ACM/IEEE Joint Conference o... | 85048881340 |
| 8 | The more multidisciplinary the better? - The p... | 2018-08-01 | Journal of Informetrics | 85049552190 |
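
One simple follow-up is counting publications per year. A minimal sketch using only the standard library on the `cover_date` values shown above; with the actual dataframe, the list could be taken from `zuo_pub_df.cover_date`, assuming that column holds `YYYY-MM-DD` strings as displayed:

```python
from collections import Counter

# cover_date values copied from the table above
cover_dates = [
    "2013-01-01", "2015-01-01", "2015-12-08", "2016-01-01", "2017-01-01",
    "2017-01-17", "2017-05-01", "2018-05-23", "2018-08-01",
]

# The first four characters of a YYYY-MM-DD date are the year.
per_year = Counter(date[:4] for date in cover_dates)
for year in sorted(per_year):
    print(year, per_year[year])
```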


---

### Abstract retrieval

If the second argument, `download_path`, is not given, the JSON response is not saved to disk.

In [20]:
pub_info = scopus.retrieve_abstract('85049552190', './')

In [21]:
pub_info

{'srctype': 'j',
 'eid': '2-s2.0-85049552190',
 'prism:coverDate': '2018-08-01',
 'prism:aggregationType': 'Journal',
 'prism:url': 'https://api.elsevier.com/content/abstract/scopus_id/85049552190',
 'source-id': '5100155103',
 'pii': 'S1751157718300452',
 'citedby-count': '1',
 'prism:volume': '12',
 'subtype': 'ar',
 'openaccess': '0',
 'prism:issn': '18755879 17511577',
 'prism:issueIdentifier': '3',
 'subtypeDescription': 'Article',
 'prism:publicationName': 'Journal of Informetrics',
 'prism:pageRange': '736-756',
 'prism:endingPage': '756',
 'openaccessFlag': 'false',
 'prism:doi': '10.1016/j.joi.2018.06.006',
 'prism:startingPage': '736',
 'dc:publisher': 'Elsevier Ltd',
 'scopus-id': '85049552190',
 'abstract': '© 2018 Elsevier Ltd. All rights reserved. Scientific research is increasingly relying on collaborations to address complex real-world problems. Many researchers, policymakers, and administrators consider a multidisciplinary environment an important factor for fostering 

In [22]:
cat 85049552190.json

{"abstracts-retrieval-response": {"item": {"ait:process-info": {"ait:status": {"@state": "update", "@type": "core", "@stage": "S300"}, "ait:date-delivered": {"@day": "29", "@timestamp": "2018-09-29T21:25:59.000059-04:00", "@year": "2018", "@month": "09"}, "ait:date-sort": {"@day": "01", "@year": "2018", "@month": "08"}}, "xocs:meta": {"xocs:funding-list": {"@pui-match": "primary", "@has-funding-info": "1", "xocs:funding-addon-generated-timestamp": "2018-07-12T21:30:32.702Z"}}, "bibrecord": {"head": {"author-group": {"affiliation": {"country": "United States", "@afid": "60024324", "@country": "usa", "organization": {"$": "University of Iowa"}, "affiliation-id": {"@afid": "60024324"}, "@affiliation-instance-id": "S1751157718300452-767136c5842968c97d8ce9c2ef3b8b2b"}, "author": [{"ce:given-name": "Zhiya", "preferred-name": {"ce:given-name": "Zhiya", "ce:initials": "Z.", "ce:surname": "Zuo", "ce:indexed-name": "Zuo Z."}, "@author-instance-id": "S1751157718300452-d600411a3d86bb2e0b1f62d25e5f
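
The saved file is plain JSON, so it can be reloaded with the standard `json` module. The sketch below navigates a trimmed-down copy of the response shown above to read the first author's preferred name and the affiliation. The nesting is taken from that excerpt only and may differ for other records (for example, `author-group` is assumed here to be a single object).

```python
import json

# Trimmed-down copy of the structure in the excerpt above (most keys omitted).
raw = """
{"abstracts-retrieval-response": {"item": {"bibrecord": {"head": {"author-group": {
    "affiliation": {"country": "United States", "organization": {"$": "University of Iowa"}},
    "author": [{"ce:given-name": "Zhiya",
                "preferred-name": {"ce:given-name": "Zhiya", "ce:initials": "Z.",
                                   "ce:surname": "Zuo", "ce:indexed-name": "Zuo Z."}}]
}}}}}}
"""

response = json.loads(raw)
author_group = response["abstracts-retrieval-response"]["item"]["bibrecord"]["head"]["author-group"]

first_author = author_group["author"][0]["preferred-name"]
affiliation = author_group["affiliation"]["organization"]["$"]
print(first_author["ce:indexed-name"], "-", affiliation)
```

To work with the downloaded file instead of the inline string, open `85049552190.json` and pass the file object to `json.load`.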

<hr>

Note that __searching for articles in specific journals (venues) is no longer supported, since this can easily be done with a general search__.

<hr>

### Citation count retrieval

__Note that the use of `citation overview API` needs to be approved by Elsevier.__

In [23]:
pub_citations_df = scopus.retrieve_citation(scopus_id_array=['85049552190', '85004154180'],
                                            year_range=[2016, 2018])

In [24]:
print(pub_citations_df)

     scopus_id previous_citation 2016 2017 2018 later_citation total_citation
0  85004154180                 0    0    1    2              0              3
1  85049552190                 0    0    0    1              0              1


---

### Serial Title Metadata

To get metadata and metrics at the publication venue level (e.g., a journal or conference), use `search_serial` or `retrieve_serial`.

#### Search by title

In [25]:
meta_df, citescore_df, sj_rank_df = scopus.search_serial('informetrics')
meta_df

| | dc:publisher | dc:title | oaAllowsAuthorPaid | openArchiveArticle | openaccess | openaccessArticle | openaccessStartDate | openaccessType | prism:aggregationType | prism:issn | source-id | subject-area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Elsevier BV | Journal of Informetrics | True | False | 0 | False | | | journal | 1751-1577 | 5100155103 | [1706, 3309] |


See more about [CiteScore](https://www.cwts.nl/blog?article=n-q2y254)

In [26]:
citescore_df

| | citationCount | citeScore | docType | percentCited | scholarlyOutput | status | year | source-id | prism:issn |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1010 | 3.44 | all | 77 | 294 | In-Progress | 2018 | 5100155103 | 1751-1577 |
| 1 | 985 | 3.52 | all | 80 | 280 | Complete | 2017 | 5100155103 | 1751-1577 |
| 2 | 835 | 2.99 | all | 73 | 279 | Complete | 2016 | 5100155103 | 1751-1577 |
| 3 | 706 | 2.6 | all | 74 | 272 | Complete | 2015 | 5100155103 | 1751-1577 |
| 4 | 713 | 2.89 | all | 82 | 247 | Complete | 2014 | 5100155103 | 1751-1577 |
| 5 | 938 | 4.38 | all | 87 | 214 | Complete | 2013 | 5100155103 | 1751-1577 |
| 6 | 782 | 4.55 | all | 90 | 172 | Complete | 2012 | 5100155103 | 1751-1577 |
| 7 | 550 | 3.96 | all | 79 | 139 | Complete | 2011 | 5100155103 | 1751-1577 |
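
In the rows above, `citeScore` matches `citationCount` divided by `scholarlyOutput`, rounded to two decimals. The sketch below checks this on a few values copied from the table. It is an observation about this output, not a statement of how Scopus defines or computes the metric.

```python
# (year, citationCount, scholarlyOutput, citeScore) copied from the table above
rows = [
    (2018, 1010, 294, 3.44),
    (2017, 985, 280, 3.52),
    (2016, 835, 279, 2.99),
    (2011, 550, 139, 3.96),
]

for year, citations, output, reported in rows:
    ratio = citations / output
    # Compare with a tolerance because the reported value is rounded.
    matches = abs(ratio - reported) < 0.005
    print(year, f"{ratio:.2f}", reported, matches)
```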


The last dataframe lists, for each year, the rank and percentile of this serial in every subject area it is assigned to.
- More about Scopus subject area codes: [link](https://service.elsevier.com/app/answers/detail/a_id/15181/supporthub/scopus/related/1/)

In [27]:
sj_rank_df.head(2)

| | percentile | rank | subjectCode | year | source-id | prism:issn |
|---|---|---|---|---|---|---|
| 0 | 82 | 102 | 1706 | 2018 | 5100155103 | 1751-1577 |
| 1 | 92 | 16 | 3309 | 2018 | 5100155103 | 1751-1577 |


#### Retrieve by ISSN

Given an ISSN, we can use `retrieve_serial`:

In [28]:
meta_df, citescore_df, sj_rank_df = scopus.retrieve_serial('2330-1643')
meta_df

| | dc:publisher | dc:title | oaAllowsAuthorPaid | openArchiveArticle | openaccess | openaccessArticle | openaccessStartDate | openaccessType | prism:aggregationType | prism:eIssn | prism:issn | source-id | subject-area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | John Wiley and Sons Ltd | Journal of the Association for Information Sci... | | | | | | | journal | 2330-1643 | 2330-1635 | 21100307484 | [1710, 1705, 1802, 3309] |


In [29]:
citescore_df

| | citationCount | citeScore | docType | percentCited | scholarlyOutput | status | year | source-id | prism:issn |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2225 | 3.48 | all | 76 | 640 | In-Progress | 2018 | 21100307484 | 2330-1635 |
| 1 | 2088 | 3.36 | all | 75 | 621 | Complete | 2017 | 21100307484 | 2330-1635 |
| 2 | 1657 | 2.74 | all | 71 | 605 | Complete | 2016 | 21100307484 | 2330-1635 |
| 3 | 364 | 2.25 | all | 73 | 162 | Complete | 2015 | 21100307484 | 2330-1635 |


In [30]:
sj_rank_df.head(2)

| | percentile | rank | subjectCode | year | source-id | prism:issn |
|---|---|---|---|---|---|---|
| 0 | 82 | 48 | 1710 | 2018 | 21100307484 | 2330-1635 |
| 1 | 83 | 47 | 1705 | 2018 | 21100307484 | 2330-1635 |
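
`retrieve_serial` takes an ISSN string, and a mistyped ISSN is easy to miss. A minimal sketch of a local sanity check based on the standard ISSN check-digit rule (weights 8 down to 2 over the first seven digits, modulo 11, with `X` standing for 10). The helper `is_valid_issn` is hypothetical and not part of PyScopus; it accepts both the hyphenated form and the hyphenless form that appears in the search results.

```python
DIGITS = "0123456789"


def is_valid_issn(issn):
    """Check the ISSN check digit; accepts '2330-1643' or '23301643'."""
    chars = issn.replace("-", "").upper()
    if len(chars) != 8 or any(c not in DIGITS for c in chars[:7]):
        return False
    if chars[7] not in DIGITS + "X":
        return False
    # Weighted sum of the first seven digits with weights 8, 7, ..., 2.
    total = sum(int(d) * w for d, w in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected


for candidate in ["2330-1643", "1751-1577", "02684012", "2330-1644"]:
    print(candidate, is_valid_issn(candidate))
```

A passing check only means the string is well formed; it does not guarantee that Scopus indexes a serial under that ISSN.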


---

### Affiliation

In [31]:
uiowa = scopus.retrieve_affiliation('60024324')
uiowa

{'eid': '10-s2.0-60024324',
 'affiliation-name': 'University of Iowa',
 'address': None,
 'city': 'Iowa City',
 'country': 'United States',
 'org-type': 'univ',
 'org-domain': 'uiowa.edu',
 'org-URL': 'http://www.uiowa.edu/',
 'date-created': '02/02/2008',
 'aff_id': '60024324'}