Created Date: 8 March 2019

Natural Language Processing (NLP) using spaCy

spaCy is an open-source library for advanced Natural Language Processing, written in Python and Cython and published under the MIT license. This guide shows how to get started with NLP using spaCy. Before starting, make sure that Python and spaCy are installed on your system.

To install spaCy and the English model:
sudo pip install spacy
python -m spacy download en

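In spaCy, the "nlp" object is used to create documents and to access linguistic annotations and other NLP properties. The default English model is en_core_web_sm, which spaCy 2.x exposes under the shortcut name "en". spaCy 3 and later no longer support these shortcuts, so on those versions use the full name en_core_web_sm in both the download command and spacy.load.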

import spacy
nlp = spacy.load("en")

WORD TOKENIZE
Tokenize the text to get its tokens, i.e. break each sentence into individual words and punctuation marks.
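
A minimal sketch of word tokenization. It assumes a blank English pipeline created with spacy.blank("en"), which contains only the tokenizer and needs no model download; the sample sentence is made up for illustration.

```python
import spacy

# A blank English pipeline contains only the tokenizer (assumption: no
# trained model is needed for plain tokenization).
nlp = spacy.blank("en")

text = "SpaCy splits text into tokens. Punctuation becomes its own token!"
doc = nlp(text)

# Each Token exposes its original string through the .text attribute.
tokens = [token.text for token in doc]
print(tokens)
```

Tokenization in spaCy is non-destructive: joining every token's text_with_ws attribute reproduces the original string.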
SENTENCE TOKENIZE
Tokenize sentences if there is more than one sentence, i.e. break the text into a list of sentences.
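
A possible way to split text into sentences. This sketch assumes spaCy 3.x, where the rule-based "sentencizer" component can be added by name; a loaded model with a parser would also populate doc.sents. The sample text is illustrative.

```python
import spacy

# Blank pipeline plus the rule-based sentencizer (spaCy 3.x add_pipe API,
# an assumption of this sketch).
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

text = "spaCy is fast. It is written in Python and Cython. Is it easy to use?"
doc = nlp(text)

# doc.sents yields one Span per detected sentence.
sentences = [sent.text for sent in doc.sents]
for sentence in sentences:
    print(sentence)
```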
STOP WORDS REMOVAL
Remove common words such as "is", "the", and "a" from the sentences using the NLTK stop word list, as they carry little information on their own.
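
A short sketch of stop word filtering. Note the swap: instead of the NLTK list mentioned above, it uses spaCy's own English stop list through the token.is_stop flag, which is available even on a blank pipeline. The sentence is a made-up example.

```python
import spacy

# is_stop and is_punct are lexical attributes, so a blank pipeline suffices.
nlp = spacy.blank("en")
doc = nlp("This is a short example of the stop word filter.")

# Keep only tokens that are neither stop words nor punctuation.
content_words = [
    token.text for token in doc
    if not token.is_stop and not token.is_punct
]
print(content_words)
```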
LEMMATIZATION
Lemmatize the text to reduce each word to its root (dictionary) form, e.g. "functions" becomes "function".
WORD FREQUENCY
Count word occurrences using NLTK's FreqDist class. Word frequency helps determine how important a word is in the document by showing how many times it is used.
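
Counting can be sketched as follows. As a stand-in for NLTK's FreqDist, this example uses collections.Counter from the standard library, which offers a similar most_common method; tokens come from a blank spaCy pipeline, and the text is illustrative.

```python
from collections import Counter

import spacy

nlp = spacy.blank("en")
doc = nlp("The cat sat on the mat. The dog sat too.")

# Lowercase the tokens so "The" and "the" are counted together,
# and skip punctuation.
words = [token.text.lower() for token in doc if not token.is_punct]
freq = Counter(words)

# The three most frequent words with their counts.
print(freq.most_common(3))
```

In practice, stop words are usually removed before counting so that words like "the" do not dominate the ranking.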
POS TAGS
Part-of-speech (POS) tagging labels each word with its grammatical category, such as noun, verb, or adjective.
NER
Named Entity Recognition (NER) is the process of identifying named entities in the text, such as people, organizations, and locations.

BLOG: https://medium.com/@pemagrg/nlp-for-beninners-using-spacy-6161cf48a229
