This repository contains two Python scripts that allow you to convert Web of Science bibliographic data between the following formats:
- TabDelimited.txt → WOS.xlsx and PlainText.txt
- Filtered WOS.xlsx → Filtered TabDelimited.txt and Filtered PlainText.txt

These tools are particularly useful when you:
- Need to extract cited references (which the default WOS Excel export omits).
- Apply document filtering using tools like PRISMA and need to export the cleaned dataset.
Purpose: WOS_Converter_TabDelimited_to_xlsx_PlainText.py converts the WOS TabDelimited.txt export file to:
- A full-featured Excel file (WOS.xlsx)
- A plain-text Web of Science format file (PlainText.txt) that includes full tagging (e.g., AU, CR, etc.)

Use this script right after downloading from Web of Science.
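
The tab-delimited to plain-text step can be sketched as follows. This is a minimal illustration using only the standard library, not the repository's actual implementation: the function names, the MULTI_VALUE_TAGS set, and the exact header lines are assumptions based on the usual layout of WOS plain-text exports (a two-character tag per field, three-space continuation lines, ER closing each record, EF closing the file).

```python
import csv
import io

# Assumption: these tags hold several items separated by "; " in the
# tab-delimited export. Extend the set for other multi-valued fields.
MULTI_VALUE_TAGS = {"AU", "AF", "CR"}


def row_to_tagged_record(row):
    """Render one row (tag -> value) as a tagged plain-text record."""
    lines = []
    for tag, value in row.items():
        if not tag or not isinstance(value, str) or not value.strip():
            continue  # skip unnamed columns and empty fields
        if tag in MULTI_VALUE_TAGS:
            items = [item.strip() for item in value.split("; ") if item.strip()]
        else:
            items = [value.strip()]
        if not items:
            continue
        lines.append(f"{tag} {items[0]}")
        # Further items go on continuation lines indented by three spaces.
        lines.extend(f"   {item}" for item in items[1:])
    lines.append("ER")  # end-of-record marker
    return "\n".join(lines)


def tab_delimited_to_plaintext(text):
    """Convert tab-delimited text (header row of tags) to tagged plain text."""
    text = text.lstrip("\ufeff")  # drop a byte-order mark if present
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    records = [row_to_tagged_record(row) for row in reader]
    parts = ["FN Clarivate Analytics Web of Science", "VR 1.0"]
    parts.extend(record + "\n" for record in records)  # blank line after ER
    parts.append("EF")  # end-of-file marker
    return "\n".join(parts) + "\n"
```

With a real export, pass the file contents read as text; the cited references in the CR column then appear one per line under the CR tag.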
Purpose: After filtering your Excel file (e.g., manually or via PRISMA), use WOS_Converter_Filtered_xlsx_to_TabDelimitedText_PlainText.py to:
- Reconstruct the tab-delimited format (TabDelimited.txt)
- Recreate the tagged plain-text format (PlainText.txt) for further processing

Use this after filtering your Excel file (WOS_Filtered.xlsx) to retain only relevant records.
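
Going back from filtered rows to tab-delimited text can be sketched like this. It is an illustration only, not the repository's code: it assumes the filtered workbook has already been loaded into a list of dictionaries keyed by WOS tag (for example via pandas.read_excel followed by DataFrame.to_dict("records")), and the helper names clean_cell and rows_to_tab_delimited are hypothetical.

```python
def clean_cell(value):
    """Turn one cell into text that is safe inside a tab-delimited line."""
    # Empty Excel cells usually arrive as None or NaN; NaN is the only
    # value that is not equal to itself.
    if value is None or value != value:
        return ""
    text = str(value)
    # Tabs and line breaks inside a cell would corrupt the row layout.
    for char in ("\t", "\r", "\n"):
        text = text.replace(char, " ")
    return text


def rows_to_tab_delimited(rows, columns):
    """Serialize rows (dicts keyed by tag) with a header row of tags."""
    lines = ["\t".join(columns)]
    for row in rows:
        lines.append("\t".join(clean_cell(row.get(col)) for col in columns))
    return "\n".join(lines) + "\n"
```

Passing the column list explicitly keeps the tag order of the original export even if columns were reordered in Excel during filtering.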
- Clone this repository:

      git clone https://github.com/yourusername/wos-format-converter.git
      cd wos-format-converter

- Install dependencies:

      pip install pandas

- Run the appropriate script based on your workflow:

      python WOS_Converter_TabDelimited_to_xlsx_PlainText.py
      # OR
      python WOS_Converter_Filtered_xlsx_to_TabDelimitedText_PlainText.py

Initial Conversion:
- Input: TabDelimited.txt
- Output: WOS.xlsx, PlainText.txt

After Filtering:
- Input: WOS_Filtered.xlsx
- Output: TabDelimited_Filtered.txt, PlainText_Filtered.txt

The TabDelimited.txt file generated by this tool is suitable for VOSviewer, a software tool for constructing and visualizing bibliometric networks.
Cite VOSviewer as:
Van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538.
https://www.vosviewer.com
The PlainText.txt output is compatible with the Bibliometrix R package and its web-based interface Biblioshiny for comprehensive science mapping analysis.
Cite Bibliometrix as:
Aria, M., & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959–975.
https://www.bibliometrix.org
The PlainText.txt output is also compatible with CiteSpace, a Java-based application for visualizing and analyzing trends and patterns in scientific literature.
To ensure compatibility with CiteSpace:
- Export records in Plain Text format with Full Record and Cited References
- Name files as download_*.txt (e.g., download_1.txt)
- Use data from supported sources like Web of Science or Scopus

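
The naming requirement above can be handled with a small helper. This is a sketch under assumptions: the function name copy_as_citespace_download and the idea of a dedicated data folder are illustrative, not part of this repository or of CiteSpace itself.

```python
import shutil
from pathlib import Path


def copy_as_citespace_download(source, data_dir):
    """Copy a plain-text export into data_dir as the next download_N.txt."""
    source = Path(source)
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    # Pick the first index that is not already used in the folder.
    index = 1
    while (data_dir / f"download_{index}.txt").exists():
        index += 1
    target = data_dir / f"download_{index}.txt"
    shutil.copyfile(source, target)
    return target
```

For example, copy_as_citespace_download("PlainText.txt", "citespace_data") would place a copy at citespace_data/download_1.txt when the folder is empty, leaving the original file untouched.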
Cite CiteSpace as:
Chen, C. (2006). CiteSpace II: Detecting and visualizing emerging trends and transient patterns in scientific literature. Journal of the American Society for Information Science and Technology, 57(3), 359–377.
https://doi.org/10.1002/asi.20317
- The conversion preserves essential Web of Science tags (AU, TI, CR, etc.)
- Cited references (CR) are correctly included in the plain-text output, unlike in the native WOS Excel export
- Column mapping is based on the official WOS format standard

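
One way to check these points on a generated file is to count the tags in the plain-text output. The sketch below is a hypothetical helper (count_tags is not part of the repository) and assumes the usual layout in which every tagged line starts with a two-character tag and continuation lines start with spaces.

```python
from collections import Counter


def count_tags(plain_text):
    """Count how often each two-character tag starts a line."""
    counts = Counter()
    for line in plain_text.splitlines():
        tag = line[:2]
        # A tag is two non-space characters, alone or followed by a space.
        is_tag = len(tag) == 2 and tag.strip() == tag
        if is_tag and (len(line) == 2 or line[2] == " "):
            counts[tag] += 1
    return counts
```

If the PT and ER counts differ, a record is probably truncated; a CR count of zero suggests the cited references did not make it into the output.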
Created by Nasser Khalili
If you use this tool in your research, feel free to give the repository a ⭐ or cite it.
This project is licensed under the MIT License - see the LICENSE file for details.