Network Analysis Interface for Literature Studies
by Juho Salminen, Antti Knutas and Arash Hajikhani
at Lappeenranta University of Technology
What Is It?
This site shares our experiments and tools for performing Social Network Analysis (SNA) on citation data. As the number of publications in any given field grows, automated tools for this kind of analysis become increasingly important before starting research in a new field.
SNA lets researchers map large datasets and look at them from new angles. The steps for downloading data from Web of Knowledge and processing it with our tools are detailed below. The required tools are free and need only minimal installation. A web-based analysis server, HAMMER, is also available, so you can process the data without installing anything or performing manual processing steps.
We are also working on an alternative R package version for R programmers. It is still a work in progress and not yet fully usable.
The basic design and bibliometric principles of the system have been published in a research article:
Antti Knutas, Arash Hajikhani, Juho Salminen, Jouni Ikonen, and Jari Porras. 2015. Cloud-Based Bibliometric Analysis Service for Systematic Mapping Studies. In Proceedings of the 16th International Conference on Computer Systems and Technologies (CompSysTech '15). DOI: 10.1145/2812428.2812442
If you use the software in your scientific work, please consider citing us.
How to Use
These scripts can be used to complete an exploratory literature review using data downloaded from Web of Knowledge. You can follow the steps below or view a brief video tutorial on how to get started.
- Go to the Web of Knowledge website and select Web of Science Core Collection from the dropdown menu.
- Search for literature.
- Download data. Select Save to Other File Formats from the dropdown menu, enter the range of records (max 500 records for one download), and download Full Record and Cited References. The file format should be Tab-delimited (Win) or Tab-delimited (Mac). If you need more than 500 records, repeat the download for the next range of records.
- Put the downloaded files into the input folder.
- Open exploration.Rmd with RStudio and press the Knit HTML button. The script will combine the downloaded data into a single file, process it, and create visualizations. The results are saved as an HTML file, exploration.html.
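To make the "combine the downloaded data into a single file" step more concrete, here is a minimal sketch of how several tab-delimited exports could be merged in R. It is not the code in exploration.Rmd: the helper name `combine_wos_files` is hypothetical, and the UTF-8 encoding and identical column layout across files are assumptions. Real exports may need a different `fileEncoding` or extra `read.delim` options.

```r
# Hypothetical helper, not part of the project's scripts: read several
# tab-delimited exports and stack them into one data frame.
combine_wos_files <- function(files, encoding = "UTF-8") {
  tables <- lapply(files, function(f) {
    read.delim(f,
               quote = "",                # do not treat quote characters in text as delimiters
               fileEncoding = encoding,   # assumption: adjust to match your export
               stringsAsFactors = FALSE,
               check.names = FALSE)
  })
  # Assumes every file has the same columns in the same order
  combined <- do.call(rbind, tables)
  # Drop rows that are exact duplicates, e.g. from overlapping record ranges
  unique(combined)
}

# Combine everything found in the input folder, if there is anything there
files <- list.files("input", full.names = TRUE)
if (length(files) > 0) {
  literature <- combine_wos_files(files)
  message("Combined ", nrow(literature), " records from ", length(files), " files")
}
```

Removing exact duplicate rows is a cheap safeguard when two downloads overlap by a few records.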
The script also creates node and edge tables for author and citation networks that can be loaded into Gephi for further exploration.
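As an illustration of what such tables can look like, the sketch below builds a co-authorship edge table and a matching node table from a semicolon-separated author field. This is one plausible construction, not necessarily how the project's scripts define the author network. The `AU` column name, the "; " separator, the sample records, the helper `coauthor_edges`, and the output file names are all assumptions. The column names Source, Target, Weight, Id, and Label follow the headers Gephi's spreadsheet import recognizes.

```r
# Illustrative records; the AU column and "; " separator are assumptions
literature <- data.frame(
  AU = c("Salminen, J; Knutas, A",
         "Knutas, A; Hajikhani, A; Salminen, J"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one weighted edge per pair of authors who share a paper
coauthor_edges <- function(author_field, sep = "; ") {
  pairs <- lapply(strsplit(author_field, sep, fixed = TRUE), function(authors) {
    authors <- sort(unique(trimws(authors)))
    if (length(authors) < 2) return(NULL)   # single-author papers add no edges
    p <- t(combn(authors, 2))               # every unordered pair of authors
    data.frame(Source = p[, 1], Target = p[, 2], stringsAsFactors = FALSE)
  })
  edges <- do.call(rbind, pairs)
  if (is.null(edges)) {
    return(data.frame(Source = character(0), Target = character(0),
                      Weight = numeric(0), stringsAsFactors = FALSE))
  }
  # Weight = number of papers the two authors share
  aggregate(list(Weight = rep(1, nrow(edges))),
            by = list(Source = edges$Source, Target = edges$Target),
            FUN = sum)
}

edges <- coauthor_edges(literature$AU)

# Node table: every author that appears in at least one edge
ids <- sort(unique(c(as.character(edges$Source), as.character(edges$Target))))
nodes <- data.frame(Id = ids, Label = ids, stringsAsFactors = FALSE)

# Write both tables as CSV (to a temporary folder in this sketch)
write.csv(edges, file.path(tempdir(), "author_edges.csv"), row.names = FALSE)
write.csv(nodes, file.path(tempdir(), "author_nodes.csv"), row.names = FALSE)
```

Sorting the authors within each paper keeps every pair in a consistent order, so the same two authors always collapse into a single undirected edge.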
See further instructions for manual usage at https://sites.google.com/site/bibliometricdatavisualization/instructions or follow the video tutorial.
For now, the project has been verified to work with R version 3.3.3 and RStudio 1.0.136.
You need the following R packages (tested 13.3.2017): splitstackshape, reshape, plyr, stringr, tm, SnowballC, lda, LDAvis, igraph, ggplot2. If you are not using RStudio, also install knitr separately.
For the LDA topic modeling feature you need the following packages: stm, tm, topicmodels, dplyr, stringi
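A quick way to see which of the packages listed above are still missing is sketched below. The variable names are illustrative, and the install call is left commented out because it needs a network connection and assumes the packages are available from your configured repository.

```r
# Packages named in this README (knitr is only needed outside RStudio)
required <- c("splitstackshape", "reshape", "plyr", "stringr", "tm",
              "SnowballC", "lda", "LDAvis", "igraph", "ggplot2", "knitr")
topic_modeling <- c("stm", "tm", "topicmodels", "dplyr", "stringi")

# tm appears in both lists, so remove the duplicate
wanted <- unique(c(required, topic_modeling))

# Compare against the packages already installed in this R library
not_installed <- setdiff(wanted, rownames(installed.packages()))

if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
  # Uncomment to install the missing ones:
  # install.packages(not_installed)
} else {
  message("All listed packages are installed.")
}
```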
We are open source and free software
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. See LICENSE file for more information.
What does it mean? We are free as in freedom. You may run the software as you wish, for any purpose; you are free to study how the program works and change it as you wish; you are free to redistribute copies; and you are free to distribute copies of modified versions to others. You may not distribute this software in a non-free manner or add additional restrictions. The only limitations are that you have to follow the free software license and retain the original copyright notices and acknowledgement texts in the program output (section 7b). See links above for more information. If you edit and improve the software, we would love to hear from you.