Apply named entity recognition (NER) to identify entities such as names, organizations, and locations in a given text. Perform sentence segmentation on a paragraph and explain its importance in NLP tasks.

In [1]:
import spacy
import pandas as pd
from collections import Counter

In [2]:
nlp = spacy.load("en_core_web_sm")

In [3]:
text = """
Natural language programming (NLP) is an ontology-assisted way of programming in terms of natural language sentences, e.g. English.[1] A structured document with Content, sections and subsections for explanations of sentences forms a NLP document, which is actually a computer program. Natural language programming is not to be mixed up with natural language interfacing or voice control where a program is first written and then communicated with through natural language using an interface added on. In NLP the functionality of a program is organised only for the definition of the meaning of sentences. For instance, NLP can be used to represent all the knowledge of an autonomous robot. Having done so, its tasks can be scripted by its users so that the robot can execute them autonomously while keeping to prescribed rules of behaviour as determined by the robot's user. Such robots are called transparent robots [2] as their reasoning is transparent to users and this develops trust in robots. Natural language use and natural language user interfaces include Inform 7, a natural programming language for making interactive fiction, Shakespeare, an esoteric natural programming language in the style of the plays of William Shakespeare, and Wolfram Alpha, a computational knowledge engine, using natural-language input.[citation needed] Some methods for program synthesis are based on natural-language programming.[3]"""

In [4]:
print("Total Words:", len(text.split()))

Total Words: 214


In [5]:
doc = nlp(text)

In [6]:
# Collect each entity's text and the label predicted by the model
entities = [(ent.text, ent.label_) for ent in doc.ents]

# Convert to DataFrame
df_ner = pd.DataFrame(entities, columns=["Entity", "Label"])
# Count how often each entity label occurs
entity_counts = Counter(df_ner["Label"])

print("\nEntity Type Counts:")
for label, count in entity_counts.items():
    print(f"{label}: {count}")
print("\nSample Named Entities:")
print(df_ner.head(25))


Entity Type Counts:
ORG: 5
ORDINAL: 1
CARDINAL: 1
PERSON: 4

Sample Named Entities:
                 Entity     Label
0                   NLP       ORG
1               Content       ORG
2                   NLP       ORG
3                 first   ORDINAL
4                   NLP       ORG
5                   NLP       ORG
6                     2  CARDINAL
7              Inform 7    PERSON
8           Shakespeare    PERSON
9   William Shakespeare    PERSON
10        Wolfram Alpha    PERSON
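
Several of the labels above look doubtful: "NLP" and "Content" are tagged ORG, while "Inform 7" and "Wolfram Alpha" are tagged PERSON. That is plausible behaviour for a small statistical model on text about software, so the counts are best read as predictions rather than ground truth. To check what each label is meant to cover, `spacy.explain` can look up the short glossary description. The sketch below assumes the four labels printed above; the names `labels` and `label_help` are introduced here for illustration only.

```python
import spacy

# Labels taken from the "Entity Type Counts" output above
labels = ["ORG", "ORDINAL", "CARDINAL", "PERSON"]

# spacy.explain looks each label up in spaCy's built-in glossary;
# no trained pipeline is needed for this call
label_help = {label: spacy.explain(label) for label in labels}

for label, description in label_help.items():
    print(f"{label}: {description}")
```

Comparing these descriptions against the table makes the mislabelled rows easier to spot.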


Visualization

In [7]:
from spacy import displacy

In [8]:
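# style="ent" highlights the named entities in the text;
# style="dep" would draw the dependency parse instead
displacy.render(doc, style="ent", jupyter=True)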
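
Sentence Segmentation

The task also asks for sentence segmentation, which the cells above do not cover. Segmentation matters because many NLP steps, such as parsing, translation, and summarization, work one sentence at a time, so a wrong boundary carries through to everything that follows. For the `doc` built earlier with `en_core_web_sm`, iterating over `doc.sents` should give the sentences, with boundaries set by the dependency parser. The sketch below is a self-contained alternative that needs no trained model: it assumes spaCy v3 and uses the rule-based `sentencizer` component on a shortened, illustrative sample (`sample` and `nlp_sents` are names introduced here, not part of the notebook).

```python
import spacy

# Blank English pipeline with only the rule-based sentencizer
# (assumes spaCy v3, where add_pipe accepts a component name)
nlp_sents = spacy.blank("en")
nlp_sents.add_pipe("sentencizer")

# Shortened, illustrative sample based on the paragraph above
sample = (
    "Natural language programming is not voice control. "
    "In NLP the functionality of a program is organised around sentences. "
    "Such robots are called transparent robots."
)

# doc.sents yields one Span per detected sentence
sentences = [sent.text.strip() for sent in nlp_sents(sample).sents]

print("Total Sentences:", len(sentences))
for i, sentence in enumerate(sentences, start=1):
    print(f"{i}. {sentence}")
```

The sentencizer splits on punctuation only, so abbreviations such as "e.g." or citation markers such as "[1]" in the full paragraph may produce spurious boundaries. The parser-based boundaries from `en_core_web_sm` are usually more robust on that kind of text.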