<a href="https://colab.research.google.com/github/shfarhaan/NLP/blob/main/spaCy/spaCy_Tutorial_for_Natural_Language_Processing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **spaCy Tutorial for Natural Language Processing**

## **Introduction**

Python has emerged as a popular language for Natural Language Processing (NLP) due to its simplicity and powerful libraries. One such library is spaCy, which provides easy-to-use and efficient tools for various NLP tasks. This tutorial aims to introduce beginners to spaCy and cover essential NLP tasks using this library.


### Introduction to spaCy

#### What is spaCy?
spaCy is an open-source library used for advanced NLP in Python. It is designed with the goal of being fast, streamlined, and simple to use. spaCy offers features for tokenization, named entity recognition (NER), part-of-speech tagging, dependency parsing, and more.


#### Installation
To install spaCy, use pip:

```bash
pip install spacy
```

In a notebook cell, run `!pip install spacy` instead. The following cell loads a pipeline and prints each token:

```python
import spacy

# Load the spaCy NLP object
nlp = spacy.load("en_core_web_sm")

# Process the text into a Doc
text = "This is a sample text."
doc = nlp(text)

# Print the tokens
for token in doc:
    print(token.text)
```

Output:

```
This
is
a
sample
text
.
```



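The example above loads `en_core_web_sm`, the small English pipeline. It is distributed separately from the library, so download it once with `python -m spacy download en_core_web_sm` before calling `spacy.load`. To work with another language or a larger model, replace `en_core_web_sm` with that pipeline's name in both places.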
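If the pipeline may not be installed on every machine, one option is to check for it before loading. The sketch below is a minimal illustration, not part of the original notebook: `load_pipeline` is a hypothetical helper name, and falling back to a blank English pipeline (tokenizer only) is an assumption about what is acceptable for your use case.

```python
import spacy
from spacy.util import is_package


def load_pipeline(name: str = "en_core_web_sm"):
    """Load an installed pipeline, or fall back to a blank English one."""
    if is_package(name):
        return spacy.load(name)
    # Fallback: tokenizer only, with no tagger, parser, or NER component
    print(f"'{name}' is not installed; run: python -m spacy download {name}")
    return spacy.blank("en")


nlp = load_pipeline()
print(nlp.pipe_names)  # an empty list means the blank fallback was used
```

With the blank fallback, tokenization still works, but attributes such as `token.pos_` and `doc.ents` stay empty.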


### Text Preprocessing with spaCy

#### Tokenization
Tokenization breaks text into individual words or tokens. Here's how to tokenize text using spaCy:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
text = "Tokenization breaks text into tokens."
doc = nlp(text)

for token in doc:
    print(token.text)
```
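Each token carries more than its text. As a small sketch, the example below uses a blank English pipeline (an assumption made so that no model download is needed) to print a few lexical attributes and to filter out punctuation.

```python
import spacy

# A blank pipeline contains only the tokenizer
nlp = spacy.blank("en")
doc = nlp("Tokenization breaks text into tokens.")

for token in doc:
    # i is the token index; idx is the character offset in the original string
    print(token.i, token.idx, token.text, token.is_alpha, token.is_punct)

# Keep only alphabetic tokens, lowercased
words = [token.lower_ for token in doc if token.is_alpha]
print(words)
```

Filtering on attributes like `is_alpha` or `is_punct` is a common first step before counting words or building features.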


#### Lemmatization
Lemmatization reduces each word to its dictionary form (lemma), for example "breaks" to "break". The loop below reuses the `doc` from the tokenization example:

```python
for token in doc:
    print(token.text, token.lemma_)
```

#### Part-of-Speech Tagging
Part-of-speech tagging labels each token with its grammatical category, such as noun or verb. The coarse-grained tag is available as `token.pos_`:

```python
for token in doc:
    print(token.text, token.pos_)
```
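Tag names such as `ADP` are not always self-explanatory. The short sketch below uses `spacy.explain`, which looks a label up in spaCy's built-in glossary, to print a description for a few tags; the list of tags is chosen only for illustration.

```python
import spacy

# spacy.explain reads from a built-in glossary, so no model is required
labels = ["NOUN", "VERB", "ADP", "PUNCT"]
descriptions = {label: spacy.explain(label) for label in labels}

for label, description in descriptions.items():
    print(f"{label:6} {description}")
```

The same function also describes entity labels and dependency labels, so it is useful in the sections that follow.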

### Named Entity Recognition (NER)

Named Entity Recognition identifies real-world entities mentioned in text, such as people, organizations, and locations, and assigns each a label. Example:

```python
text = "Apple is situated in California."
doc = nlp(text)

for ent in doc.ents:
    print(ent.text, ent.label_)
```
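Entities are spans, so they also expose their position in the text. The sketch below swaps the statistical NER model for spaCy's rule-based `entity_ruler` on a blank pipeline, purely so that it runs without a model download; the two patterns are assumptions written for this sentence, not what a trained model would learn.

```python
import spacy

nlp = spacy.blank("en")

# Rule-based entity matching: each pattern maps an exact phrase to a label
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Apple"},
    {"label": "GPE", "pattern": "California"},
])

doc = nlp("Apple is situated in California.")

# Collect the text, label, and character offsets of each entity
entities = [(ent.text, ent.label_, ent.start_char, ent.end_char) for ent in doc.ents]
print(entities)
```

The `doc.ents` interface is the same whether the entities come from rules or from a trained model, so code that consumes them does not need to change.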

### Dependency Parsing

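Dependency parsing reveals the grammatical structure of a sentence by linking each token to its head with a labeled relation. The loop below reuses the `doc` from the NER example: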

```python
for token in doc:
    print(token.text, token.dep_, token.head.text, token.head.pos_)
```
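Once a sentence is parsed, the tree can be navigated through `token.head`, `token.children`, and `token.subtree`. The sketch below builds a `Doc` with hand-written heads and labels (an assumption for illustration, not real parser output) so that it runs without a model; `subject_verb_pairs` is a hypothetical helper name.

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")

# Hand-written annotations, not parser output.
# heads holds the index of each token's head; the root points to itself.
words = ["She", "reads", "old", "books", "."]
heads = [1, 1, 3, 1, 1]
deps = ["nsubj", "ROOT", "amod", "dobj", "punct"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)


def subject_verb_pairs(doc):
    """Return (subject, verb) text pairs based on nsubj relations."""
    return [(token.text, token.head.text) for token in doc if token.dep_ == "nsubj"]


# The root is the token labeled ROOT; its children are its direct dependents
root = next(token for token in doc if token.dep_ == "ROOT")
root_children = [child.text for child in root.children]
print(root.text, root_children)

print(subject_verb_pairs(doc))

# A subtree is a token together with all of its descendants
object_phrase = [token.text for token in doc[3].subtree]
print(object_phrase)
```

The same helper works unchanged on a `Doc` produced by a trained pipeline such as `en_core_web_sm`.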



### Text Classification with spaCy

Text classification assigns text to predefined categories. The example below uses placeholder texts and labels to show the training workflow; a useful model needs far more data:

```python
import spacy
from spacy.training import Example

# Training data preparation
train_texts = ["Text 1", "Text 2", "Text 3"]
train_labels = ["Label 1", "Label 2", "Label 3"]

# Start from a blank English pipeline and add a text classifier (spaCy v3 API)
nlp = spacy.blank("en")
textcat = nlp.add_pipe("textcat")
for label in train_labels:
    textcat.add_label(label)

# Build one-hot "cats" annotations: 1.0 for the true label, 0.0 for the others
train_examples = [
    Example.from_dict(
        nlp.make_doc(text),
        {"cats": {label: float(label == true_label) for label in train_labels}},
    )
    for text, true_label in zip(train_texts, train_labels)
]

# Initialize the model weights, then run a few training passes
optimizer = nlp.initialize(lambda: train_examples)
for epoch in range(10):
    losses = {}
    nlp.update(train_examples, sgd=optimizer, losses=losses)

# Classify new text
new_text = "New text to classify"
doc = nlp(new_text)
print(doc.cats)
```

### Practical Examples and Projects

#### Project 1: Sentiment Analysis
Perform sentiment analysis on a dataset using spaCy for text classification.

#### Project 2: Information Extraction
Extract specific information, like dates or quantities, from a set of documents using spaCy.
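As a possible starting point for this project, the sketch below uses spaCy's rule-based `Matcher` to pull quantities out of a sentence. It is only one approach under stated assumptions: the pattern (a number-like token followed by a unit), the `UNITS` list, and the sample sentence are all made up for illustration, and a blank pipeline is used so no model download is needed.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# A number-like token followed by a unit from a small, made-up unit list
UNITS = ["kg", "km", "liters"]
matcher.add("QUANTITY", [[{"LIKE_NUM": True}, {"LOWER": {"IN": UNITS}}]])

doc = nlp("The parcel weighs 5 kg and travelled 120 km by road.")

# Each match is (match_id, start, end); slice the doc to get the matched span
quantities = [doc[start:end].text for _, start, end in matcher(doc)]
print(quantities)
```

For dates, a trained pipeline's `DATE` entities in `doc.ents` are often a better fit than hand-written patterns, since date formats vary widely.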









### Conclusion

In this tutorial, we covered the basics of spaCy for NLP tasks. We explored text preprocessing (tokenization, lemmatization, and part-of-speech tagging), named entity recognition, dependency parsing, and text classification, and outlined two practice projects. To go further, explore spaCy's documentation, practice on different datasets, and take on real-world NLP projects. With consistent practice, you'll become proficient in NLP using Python and spaCy.

Remember, NLP is a vast field, and this tutorial only scratches the surface. Continual learning and hands-on experience will enhance your skills and understanding.

I hope this tutorial serves as a solid foundation for your journey into NLP with spaCy and Python.