This repository contains the code, data, and Jupyter Notebooks for the case studies in the book "Text Mining for Information Professionals: An Uncharted Territory". The book presents a basic theoretical framework covering the problems, solutions, and applications of text mining and its various facets, in the practical form of case studies, use cases, and stories.
The book contains 11 chapters with 14 case studies showing 8 different text mining and visualization approaches, and 17 stories. A website and a GitHub account are also maintained for the book. They contain the code, data, and notebooks for the case studies; a summary of all the stories shared by the librarians/faculty; and hyperlinks to open an interactive virtual RStudio/Jupyter Notebook environment. The interactive virtual environment runs case studies based on the R programming language for hands-on practice in the cloud without installing any software.
From understanding different types and forms of data to case studies showing the application of each text mining approach on data retrieved from various resources, this book is a must-read for all library professionals interested in text mining and its application in libraries. It will also be helpful to archivists, digital curators, and other humanities and social science professionals who want to understand the basic theory behind text data, text mining, and the various tools and techniques available to solve and visualize their research problems.
- 🔭 Springer Website: https://www.springer.com/in/book/9783030850845
- 🔭 Authors' Book Website: https://textmining-infopros.github.io/
- 👯 Twitter: @lamba_manika
- 📫 How to reach me: lambamanika07@gmail.com
- 😄 Pronouns: She/her
- Chapter 1: The Computational Library
- Chapter 2: Text Data and Where to Find Them?
- Chapter 3: Text Pre-Processing
- Case Study: An Analysis of Tolkien's Books
- Chapter 4: Topic Modeling
- Chapter 5: Network Text Analysis
- Chapter 6: Burst Detection
- Chapter 7: Sentiment Analysis
- Chapter 8: Predictive Modeling
- Chapter 9: Information Visualization
- Chapter 10: Tools and Techniques for Text Mining and Visualizations
- Chapter 11: Text Data and Mining Ethics
- Appendix A: Online Repositories Available for Text Mining
- Appendix B: Language Corpora Available for Text Mining
- Appendix C: Text Data and Mining Licensing Conditions
!! BONUS -- Curated Datasets: This repository contains additional open-access datasets that can be used to practice or teach text mining. Its goal is to serve as a collection of textual datasets for training and practice in text mining/NLP.
Dr. Manika Lamba is a Postdoctoral Research Associate at the University of Illinois, Urbana-Champaign. Previously, she was a Lecturer in the Department of Library & Information Science, School of Open Learning (SOL), University of Delhi for the academic year 2022-2023. She earned her Ph.D. in Library & Information Science from the University of Delhi. She is currently serving as the Editor-in-Chief of the International Journal of Library and Information Services (IJLIS), and the Chair of Professional Development Sub-Committee & Elected Standing Committee Member at IFLA Science and Technology Libraries Section. She was the Newsletter Officer & Webmaster at ASIS&T South Asia Chapter, Secretary at ASIS&T SIG-DL, and Cabinet Representative at ASIS&T SIG-OIM from 2021 to 2022. She was Editor-at-large for dh+lib (an ACRL Digital Humanities Interest Group project) and was featured in the Information Professionals Share their Top Tips for 2019 blog by the Copyright Clearance Center (CCC). She is an active reviewer for more than 20 international journals, including IEEE Access, Scientometrics, Library Hi-Tech, and the Journal of Information Science. Her research focuses on information retrieval, digital libraries, social informatics, and scholarly communication using text mining, natural language processing, and machine learning techniques. She has worked extensively with textual data. Her work combines qualitative and quantitative methods, including focus groups, surveys and field experiments, and computational approaches.
Dr. Margam Madhusudhan is a Professor in the Department of Library and Information Science, University of Delhi, India. He has served as Deputy Dean Academics and Member of the Academic Council at the University of Delhi. He is a member of many academic bodies and of the editorial boards of national and international LIS journals. He is the recipient of the "Award for Excellence" (Highly Commended) in 2019, "Excellence in Research" in 2017, and the P.V. Verghese Award in 2013. He has 22 years of teaching, administration, and research experience at the university level.
I was invited by the Association for Information Science and Technology (ASIS&T), a preeminent professional association in information science, to give a talk on the book! The webinar will take place on 7th December 2023. Please register here: https://t.co/IKjjK1Yhl7
The International Federation of Library Associations & Institutions (IFLA), an international body of libraries and information professionals, invited the following faculty, librarians, and researchers, who shared their practical experience of using text mining in their libraries or research in the story section of the book:
- Leverage Undergraduate Student Researchers to Deliver Text Data Mining Services to Library Users
- TDM Working Group at the University Libraries at Virginia Tech
- Visualizing Topic Networks in Personal Correspondence: The Richards/Turner Letters
- Natural Language Processing (NLP) for Sensory Science
The book was listed in the University of Toronto's major Text and Data Mining (TDM) platforms and collections!
The book was included in the Big Book of R, a collection of almost 400 R programming books.