# spaCy NER

## Loading the Model

You need spaCy and the `en_core_web_sm` English model installed:

```
pip install spacy
python -m spacy download en_core_web_sm
```

In [1]:
import spacy

In [2]:
nlp = spacy.load("en_core_web_sm")


In [10]:
sample_text = """Operation Goodwood was a series of air raids launched from aircraft carriers of the British Home Fleet against the German battleship Tirpitz in Kaafjord, Norway. It was the Royal Navy's last attack on Tirpitz, which posed a significant threat to the Allied convoys travelling to the Soviet Union. The Fleet departed its base on 18 August 1944 and first launched air raids against Kaafjord on the morning and evening of 22 August. Further attacks were made on 24 and 29 August. All of these attacks failed, and only two bombs struck Tirpitz. German forces suffered the loss of 12 aircraft and damage to 7 other ships. The British lost 17 aircraft and a frigate. HMS Nabob, an escort carrier, was also badly damaged. Historians attribute Operation Goodwood's failure to shortcomings of the Fleet Air Arm's aircraft and armament. The mission to sink Tirpitz was subsequently transferred to the Royal Air Force."""

## Extracting Noun Chunks

In [5]:
doc = nlp(sample_text)

In [7]:
for chunk in doc.noun_chunks:
    print(chunk.text, chunk.root.text, chunk.root.dep_,
            chunk.root.head.text)

Operation Goodwood Goodwood nsubj was
a series series attr was
air raids raids pobj of
aircraft carriers carriers pobj from
the British Home Fleet Fleet pobj of
the German battleship Tirpitz Tirpitz pobj against
Kaafjord Kaafjord pobj in
Norway Norway appos Kaafjord
It It nsubj was
the Royal Navy's last attack attack attr was
Tirpitz Tirpitz pobj on
a significant threat threat dobj posed
the Allied convoys convoys pobj to
the Soviet Union Union pobj to
The Fleet Fleet nsubj departed
its base base dobj departed
18 August August pobj on
air raids raids dobj launched
Kaafjord Kaafjord pobj against
the morning morning pobj on
evening evening conj morning
22 August August pobj of
Further attacks attacks nsubjpass made
24 and 29 August August pobj on
these attacks attacks pobj of
only two bombs bombs nsubj struck
Tirpitz Tirpitz dobj struck
German forces forces nsubj suffered
the loss loss dobj suffered
12 aircraft aircraft pobj of
damage damage conj aircraft
7 other ships ships pobj to
The 
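
Each printed row holds four fields: the chunk text, the chunk's root token, the root's dependency label, and the root's head token. Because the last three are single tokens, a row can be split from the right to recover them. The snippet below is a minimal sketch that does this on a few rows copied from the output above (it does not call spaCy; the variable names `printed` and `subjects` are only illustrative) and keeps the chunks that act as subjects:

```python
# A few rows copied verbatim from the printed output above.
printed = """Operation Goodwood Goodwood nsubj was
a series series attr was
the Royal Navy's last attack attack attr was
The Fleet Fleet nsubj departed
only two bombs bombs nsubj struck
Tirpitz Tirpitz dobj struck"""

subjects = []
for line in printed.splitlines():
    # Root, dependency label and head are single tokens,
    # so splitting from the right isolates the chunk text.
    chunk_text, root, dep, head = line.rsplit(" ", 3)
    if dep == "nsubj":
        subjects.append((chunk_text, head))

print(subjects)
# [('Operation Goodwood', 'was'), ('The Fleet', 'departed'), ('only two bombs', 'struck')]
```

With a live `doc`, the same filter would normally be written against `chunk.root.dep_` directly rather than against printed text.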

## Extracting Named Entities

In [8]:
for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)

Operation Goodwood 0 18 ORG
the British Home Fleet 80 102 ORG
German 115 121 NORP
Tirpitz 133 140 GPE
Kaafjord 144 152 GPE
Norway 154 160 GPE
the Royal Navy's 169 185 ORG
Tirpitz 201 208 ORG
Allied 250 256 ORG
the Soviet Union 279 295 GPE
Fleet 301 306 ORG
18 August 1944 328 342 DATE
first 347 352 ORDINAL
Kaafjord 380 388 GPE
the morning and evening of 22 August 392 428 TIME
24 and 29 August 459 475 DATE
only two 510 518 CARDINAL
Tirpitz 532 539 GPE
German 541 547 NORP
12 576 578 CARDINAL
7 602 603 CARDINAL
British 621 628 NORP
17 634 636 CARDINAL
Historians 715 725 NORP
Operation Goodwood's 736 756 ORG
the Fleet Air Arm's 784 803 FAC
Tirpitz 847 854 PERSON
the Royal Air Force 887 906 ORG
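
Two things are worth checking in this output. First, `start_char` and `end_char` are character offsets into the original string, so slicing the text with them should return the entity text. Second, the same surface string can receive different labels in different places. The sketch below checks both using rows copied from the output above and the first sentence of `sample_text`. It is plain Python rather than a spaCy call, and the names `first_sentence`, `rows`, and `inconsistent` are illustrative:

```python
from collections import defaultdict

# First sentence of sample_text, copied from the cell above.
first_sentence = (
    "Operation Goodwood was a series of air raids launched from aircraft "
    "carriers of the British Home Fleet against the German battleship "
    "Tirpitz in Kaafjord, Norway."
)

# Rows copied from the output above: (text, start_char, end_char, label).
first_sentence_rows = [
    ("Operation Goodwood", 0, 18, "ORG"),
    ("the British Home Fleet", 80, 102, "ORG"),
    ("German", 115, 121, "NORP"),
    ("Tirpitz", 133, 140, "GPE"),
    ("Kaafjord", 144, 152, "GPE"),
    ("Norway", 154, 160, "GPE"),
]
later_rows = [
    ("Tirpitz", 201, 208, "ORG"),
    ("Tirpitz", 532, 539, "GPE"),
    ("Tirpitz", 847, 854, "PERSON"),
]

# The offsets index into the raw text, so a slice reproduces the entity.
offsets_ok = all(
    first_sentence[start:end] == text
    for text, start, end, _ in first_sentence_rows
)
print(offsets_ok)  # True

# Collect every label assigned to each surface string.
labels_by_text = defaultdict(set)
for text, _, _, label in first_sentence_rows + later_rows:
    labels_by_text[text].add(label)

inconsistent = {
    text: sorted(labels)
    for text, labels in labels_by_text.items()
    if len(labels) > 1
}
print(inconsistent)  # {'Tirpitz': ['GPE', 'ORG', 'PERSON']}
```

In this run, Tirpitz is tagged as `GPE`, `ORG`, and `PERSON` at different positions, although the text describes it as a battleship. A grouping like this is a quick way to spot such inconsistencies before relying on the labels.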
