Text Mining with Machine Learning and Python [Video]
This is the code repository for Text Mining with Machine Learning and Python [Video], published by Packt. It contains all the supporting project files necessary to work through the video course from start to finish.
About the Video Course
Text is one of the most actively researched and widespread types of data in the data science field today. Advances in machine learning and deep learning now make it possible to build powerful data products on text sources, and new text data sources appear all the time. In this course, you'll build your own toolbox of know-how, packages, and working code snippets so you can perform your own text mining analyses.
You'll start with the fundamentals of modern text mining and move on to the processes involved in it. You'll learn how machine learning is used to extract meaningful information from text, and how to read and process text features. Then you'll learn how to extract information from text and work with pre-trained models, while also delving into text classification and entity extraction and classification. You will explore word embeddings by working with Skip-gram, CBOW, and X2Vec models, along with some additional text mining processes. By the end of the course, you will have learned the various aspects of text mining with ML and the important processes involved in it, and will have begun your journey as an effective text miner.
Starting from the basics of preprocessing text features, we'll take a look at how we can extract relevant features from text and classify documents through machine learning. Since word embeddings have become indispensable in today's NLP world, we'll dive deeper into their inner workings and have a go at training our own embedding models.
You will also come away with a high-level understanding of the various components involved in a current-day NLP pipeline, and a set of working code to build further upon.
What You Will Learn
- Refine and clean your text
- Extract important data from text
- Classify text into types
- Apply modern ML and DL techniques on the text
- Work on pre-trained models
- Explore important text mining processes
- Analyze text in the best and most effective way
Instructions and Navigation
To fully benefit from the coverage included in this course, you will need:
● Working experience with Python and Jupyter Notebooks
● Some initial experience with data analytics in Python
● A first exposure to machine learning (scikit-learn experience is a plus)
This course has the following software requirements:
● Anaconda distribution of latest Python 3
● Separate conda env with Python 3 installed
○ available to set up once Anaconda is installed
● Jupyter notebook
○ available to activate once Anaconda is installed
● Extra packages:
○ NLTK (pip install nltk==3.2.2)
○ spaCy (pip install spacy==2.0.3)
○ Gensim (pip install gensim==3.3.0)
○ Scikit-learn (pip install scikit-learn==0.19.1)
○ TensorFlow (for CPU) (pip install tensorflow==1.4.0)
○ Keras (pip install keras==2.1.3)
○ python-crfsuite (pip install python-crfsuite==0.9.5)
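After installing the extra packages, it can help to confirm that the versions in your conda env match the pinned ones listed above. The following is a minimal sketch and is not part of the course materials: the names `PINNED`, `installed_version`, and `check_environment` are hypothetical helpers introduced here for illustration, and the script assumes it is run with the Python interpreter of the env you set up for the course.

```python
# Pinned versions copied from the "Extra packages" list above.
PINNED = {
    "nltk": "3.2.2",
    "spacy": "2.0.3",
    "gensim": "3.3.0",
    "scikit-learn": "0.19.1",
    "tensorflow": "1.4.0",
    "keras": "2.1.3",
    "python-crfsuite": "0.9.5",
}


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        # Part of the standard library on Python 3.8 and later.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Older interpreters: fall back to setuptools' pkg_resources.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_environment(pinned=PINNED):
    """Return {package: (expected, found)} for every package that does not match."""
    mismatches = {}
    for package, expected in pinned.items():
        found = installed_version(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    problems = check_environment()
    for package, (expected, found) in sorted(problems.items()):
        print("{}: expected {}, found {}".format(
            package, expected, found or "not installed"))
    if not problems:
        print("All pinned packages match.")
```

Save the sketch under any file name and run it inside the activated env. Each line it prints names a package whose installed version differs from the pinned one, which you can then reinstall with the matching pip command from the list.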
This course has been tested on the following system configuration:
● OS: Windows 10
● Processor: Quad Core 2.8 GHz
● Memory: 16 GB
● Hard Disk Space: 3 GB